Intrinsic androgen-dependent gene expression patterns revealed by comparison of genital fibroblasts from normal males and individuals with complete and partial androgen insensitivity syndrome

Background: To better understand the molecular programs of normal and abnormal genital development, clear-cut definition of androgen-dependent gene expression patterns, without the influence of genotype (46, XX vs. 46, XY), is warranted. Previously, we have identified global gene expression profiles in genital-derived fibroblasts that differ between 46, XY males and 46, XY females with complete androgen insensitivity syndrome (CAIS) due to inactivating mutations of the androgen receptor (AR). While these differences could be due to cell autonomous changes in gene expression induced by androgen programming, recent work suggests they could also be influenced by the location from which the fibroblasts were harvested (topology). To minimize the influence of topology, we compared gene expression patterns of fibroblasts derived from identical urogenital anlagen: the scrotum in normally virilized 46, XY males and the labia majora from completely feminized 46, XY individuals with CAIS.
Results: 612 transcripts representing 440 unique genes differed significantly in expression levels between scrotum and CAIS labia majora, suggesting the effects of androgen programming. While some genes coincided with those we had identified previously (TBX3, IGFBP5, EGFR, CSPG2), a significant number did not, implying that topology had influenced gene expression in our previous experiments. Supervised clustering of gene expression data derived from a large set of fibroblast cultures from individuals with partial AIS revealed that the new, topology-controlled data set better classified the specimens.
Conclusion: Inactivating mutations of the AR, in themselves, appear to induce lasting changes in gene expression in cultured fibroblasts, independent of topology and genotype. Genes identified are likely to be relevant candidates to decipher androgen-dependent normal and abnormal genital development.


Background
Androgen receptor (AR) signaling is the key determinant of virilization in male external genitalia development [1][2][3][4]. Its importance is highlighted in the androgen insensitivity syndrome (AIS), a virtual human AR "knock-out" due to inactivating mutations of the AR gene, and characterized by defects in virilization of 46, XY individuals despite normal or elevated serum testosterone levels. The phenotype can range from normal female in complete AIS (CAIS), to lesser degrees of genital ambiguity in partial AIS (PAIS) in which the degree of virilization is related to the degree of AR function [2,3]. Genital masculinization involves a comprehensive re-organization of genital anatomy in which androgens induce a permanent male developmental fate in the originally bipotent anlagen. For example, in response to embryonal androgen signaling, the labioscrotal swellings will develop into a scrotum and not into labia majora (reviewed in [5]). These observations indicate the existence of androgen-mediated gene expression programs that are responsible for implementation and persistence of male-specific genital morphology and function. In general, distinguishing X- and Y-chromosomal influences from hormonal influences on genital development has proven difficult [6,7]; however, CAIS, in which all individuals possess an XY genotype, represents the ideal situation for focusing on the role of androgens.
In a previous study, we reported evidence for androgen-mediated programming of gene expression by comparing genital skin fibroblasts from normal males to those derived from CAIS individuals [8]. However, the demonstration that cultured fibroblasts show stable transcription signatures that reflect their site of origin in mature adult tissues (transcriptional topographic memory) [9] raises the possibility that fibroblast transcript signatures might also reflect androgen-independent aspects of their developmental history. Were that the case, the transcript signatures we had identified might be influenced both by androgen signaling during genital development and by topology. Even though the normal male and CAIS fibroblasts were both derived from the genital skin, their precise sites of biopsy differed both topologically and with regard to embryological origin. The normal male genital fibroblasts were grown from biopsies of the foreskin (derived from the genital tubercle), while CAIS fibroblasts came from biopsies of the labia majora (derived from the labioscrotal swellings) [8]. To better define the gene expression programs dependent only on developmental androgen actions, we analyzed gene expression profiles of fibroblasts derived from homologous embryonic structures: the labioscrotal swellings in 46, XY males (scrotum) and 46, XY females (labia majora). We validated the biological relevance of this approach by comparing the ability of this new gene expression set and the previous set to classify a large set of individuals representing the whole clinical spectrum of AIS phenotypes based on gene expression profiles alone.

Identification of a topology-independent, AR-dependent gene expression program
We had previously identified an AR-dependent gene expression signature by comparing normal male foreskin fibroblasts to those cultured from diverse sites in the genitals of patients with documented CAIS. Subsequent reports that cultured fibroblasts retain topographic transcriptional memory (gene expression signatures that reflect their site of biopsy) led to concerns that the AR-dependent gene expression signature we identified could have been affected by topological differences in the fibroblast samples used. To test for this potential confounder, we repeated microarray experiments on seven independent strains of normal male scrotal fibroblasts (S1, S4, S5, S8, S9, S11, and S12) and duplicate samples of four labia majora fibroblast strains derived from 46, XY individuals with CAIS due to proven inactivating mutations of the AR gene. Both groups of fibroblasts were derived from the same anlagen: the labioscrotal swellings. The SAM (significance analysis of microarrays) procedure revealed 612 transcripts representing 440 unique genes that differed significantly in expression level between the groups at a false discovery rate of 0.038 (Figure 1).
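As a rough illustration of the kind of statistic behind such a comparison, the sketch below computes a SAM-style relative difference, d = (mean_a - mean_b) / (s + s0), for each transcript and estimates a false discovery rate by shuffling the array labels. It is a simplified stand-in under stated assumptions, not the SAM software used in the study: the function names, the fixed fudge factor `s0`, the use of the mean number of false calls, and the input layout (one list of log2 ratios per transcript, one 0/1 label per array) are all hypothetical.

```python
import math
import random
from statistics import mean


def sam_d(group_a, group_b, s0=0.1):
    """Relative difference d = (mean_a - mean_b) / (s + s0) for one transcript.

    Needs at least three values in total so the pooled variance is defined.
    """
    na, nb = len(group_a), len(group_b)
    ma, mb = mean(group_a), mean(group_b)
    # Pooled sum of squared deviations around each group mean.
    ss = sum((x - ma) ** 2 for x in group_a) + sum((x - mb) ** 2 for x in group_b)
    s = math.sqrt((1 / na + 1 / nb) * ss / (na + nb - 2))
    return (ma - mb) / (s + s0)


def scores(data, labels, s0=0.1):
    """d score per transcript; labels are 0 (one group) or 1 (the other group)."""
    result = []
    for row in data:
        a = [v for v, g in zip(row, labels) if g == 0]
        b = [v for v, g in zip(row, labels) if g == 1]
        result.append(sam_d(a, b, s0))
    return result


def permutation_fdr(data, labels, threshold, n_perm=200, s0=0.1, seed=0):
    """Return (number of transcripts called, estimated FDR) for |d| >= threshold."""
    called = sum(abs(d) >= threshold for d in scores(data, labels, s0))
    if called == 0:
        return 0, 0.0
    rng = random.Random(seed)
    shuffled = list(labels)
    false_calls = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between arrays and groups
        false_calls.append(
            sum(abs(d) >= threshold for d in scores(data, shuffled, s0))
        )
    return called, min(1.0, mean(false_calls) / called)
```

The full SAM procedure includes refinements (for example, a data-driven choice of the fudge factor) that this sketch omits, so its output should not be expected to reproduce the reported false discovery rate.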
The new topology-controlled data set showed some similarities to the AR-dependent gene set we had identified previously, with 42 unique transcripts found in both gene sets, including 34 transcripts that were up-regulated in normal male-derived fibroblasts and 8 that were up-regulated in the CAIS female-derived fibroblasts. Genes up-regulated in normal male fibroblasts in both data sets included TBX3 (T-box 3), CBX6 (Chromobox homologue 6), IGFBP5 (Insulin-like growth factor binding protein 5), and EGFR (epidermal growth factor receptor), while several genes from the previous set were no longer identified as significantly different in the new data set, such as TBX2 (T-box 2), TBX5 (T-box 5), BMP4 (bone morphogenetic protein 4), HOXA13 (Homeobox A13), WNT2 (Wingless-type MMTV integration site family, member 2), and FOXF2 (Forkhead box F2). The substantial change in the gene lists strongly suggested that topology influenced the gene expression signatures identified in our original series of experiments.

Topology independent AR gene expression program classifies diverse AIS samples
To evaluate the relevance of the topology-controlled, AR-dependent gene list, we tested its ability to classify 72 microarray experiments performed on fibroblast samples derived from 51 individuals that included normal males, normal females, and individuals with PAIS and CAIS (Fig. 2 and table 1). AIS samples were graded according to the system suggested by Sinnecker [10] wherein phenotypically male genitalia are scored AIS 1, while female external genitalia are scored AIS 5 (Fig. 2B). Stringent filtering conditions of the combined data sets reduced the number of transcripts from 612 to 259; however, altering the stringency of the filtering conditions and the number of transcripts used did not significantly change the clustering pattern of the individual samples (data not shown).
Figure 1 Transcripts with significant differences of expression levels between normal scrotum and CAIS labia majora. Transcript levels of the 612 transcripts identified by SAM analysis as differing between fibroblasts derived from normal male scrotum (green) and labia majora of 46, XY females with CAIS (pink). Individual transcripts are grouped by hierarchical cluster analysis and are displayed in rows while experiments are represented in columns. Expression values per gene are centered by the mean log2 red/green normalized ratio. Increasing red intensity corresponds to higher relative transcript levels compared to the mean expression level across all 15 array experiments. Increasing green intensity corresponds to relatively decreased transcript levels compared to the mean. On the right side, examples of individual genes (gene symbols according to S.O.U.R.C.E. [24]) discussed in the paper or falling into the biological processes and cellular pathways detected by PANTHER are displayed. Detailed data on figure 1 is available in additional files 1, 2, 3, 4.
Figure 2 Cluster analysis of normal male fibroblasts from scrotum and foreskin as well as 46, XY individuals with PAIS and CAIS. Hierarchical clustering analysis of 72 microarray experiments of cultured genital fibroblasts using the SAM-derived gene list. The heatmap on the left displays 259 genes that had at least 85% interpretable data across the experiments and whose expression levels were at least 2-fold different from the mean expression across all samples in at least 5 microarrays. (A) The cluster dendrogram demonstrating the degree of relatedness (Pearson correlation) between the expression patterns of the 259 genes in the cultured fibroblast samples. The length of the arms of the dendrogram reflects the degree of correlation between the samples. Samples are color coded to reflect the localization of the biopsy and the degree of external genital virilization according to a grading scheme developed by Sinnecker et al. [10]. The grey bar below indicates whether a sample was derived from dataset 1, 2 or 3. "L.n.d." signifies that the biopsy localization was not accurately documented. Italics indicate a sample with a 46, XX karyotype. (B) Schematic depiction of the external genitalia phenotype of the cases from which the fibroblast cultures were derived using color coding that corresponds to the degree of genital ambiguity and the location of the biopsy. Color coding corresponds to the bar below the dendrogram in (A). (C) Cluster of AR-dependent transcripts that are highly expressed in the left "male" major branch of the cluster and that are expressed at significantly lower levels in the "female" branch of the cluster on the right. TBX3, previously reported in ulnar mammary syndrome, and IGF2, previously reported as being down-regulated in CAIS [20], are shown in this cluster. (D) Cluster of AR-dependent transcripts that are expressed at significantly lower levels in the lefthand "male" branch of the cluster and at higher levels in the righthand "female" branch. These include many extracellular matrix genes such as proteoglycan testican, versican, and fibrillin 1. Detailed data on figure 2 is available in additional files 5, 6, 7, 8.
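The filtering criteria given for Figure 2 (at least 85% interpretable data, and at least a 2-fold difference from the mean in at least 5 microarrays) together with the per-gene mean centering mentioned for Figure 1 can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it assumes each gene is a list of log2 ratios with `None` marking uninterpretable spots, and that "2-fold" means an absolute deviation of at least 1 in log2 units. The function names are hypothetical.

```python
import math
from statistics import mean


def center_gene(log2_ratios):
    """Subtract the gene's mean log2 ratio; missing values (None) stay None."""
    present = [v for v in log2_ratios if v is not None]
    gene_mean = mean(present)
    return [None if v is None else v - gene_mean for v in log2_ratios]


def passes_filter(log2_ratios, min_present=0.85, min_fold=2.0, min_arrays=5):
    """True if the gene has enough data and deviates enough from its mean."""
    present = [v for v in log2_ratios if v is not None]
    if not present or len(present) / len(log2_ratios) < min_present:
        return False
    cutoff = math.log2(min_fold)  # 2-fold corresponds to 1 log2 unit
    centered = [v for v in center_gene(log2_ratios) if v is not None]
    return sum(abs(v) >= cutoff for v in centered) >= min_arrays
```

Loosening `min_fold` or `min_arrays` enlarges the retained gene list, which is the kind of change in stringency the text reports as having little effect on sample clustering.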

Clustering separated the 72 experiments into two major subgroups. The righthand major branch included predominantly patients with female external genitalia while most of the patients in the lefthand major branch had normal male or highly virilized external genitalia (Fig. 2A and 2B). All but one of the CAIS (AIS 5) samples, as well as a sample from a normal 46, XX female, clustered in the right ("female") branch (Fig. 2A). Interestingly, the one exception expressed a wild-type AR in a portion of cells due to somatic mosaicism (ARD465). Two skin fibroblast samples from normal males derived from regions without an obvious androgen-induced sexual dimorphism (abdomen, forearm) also clustered in the right ("female") branch. The left ("male") major branch contained all genital skin fibroblasts derived from normal male controls and 8 of 10 microarray experiments reflecting patients with higher degrees of virilization due to partial AIS (AIS 2). This cluster also included a fibroblast sample from an individual with 5α-reductase type II deficiency, a defect which results in ambiguous genitalia due to lack of conversion of testosterone to dihydrotestosterone. This individual presumably possesses a wild-type AR, meaning that androgen signaling pathways remained intact. Three of the four labioscrotal fibroblast samples from individuals with AIS 3 phenotypes with significant genital ambiguity clustered in the "female" major branch and the remaining one in the "male" branch (Fig. 2). Of note, clustering did not appear to be influenced by array type or RNA reference type, indicating that normalization procedures did not influence data quality.
Structure within the cluster dendrogram suggested that there was some residual influence of topology on gene expression in the samples. In the righthand "female" branch, fibroblast samples derived from AIS gonads clustered separately from all skin-derived samples (Fig. 2A). Similarly, the lefthand "male" branch showed a subcluster that contained all the foreskin-derived fibroblasts, including the AIS 2 fibroblasts originating from the foreskin. Interestingly, this branch also contained two strains of labia minora fibroblasts from two 46, XX individuals, one of whom had ambiguous genitalia (Prader stage 3) due to 21-hydroxylase deficiency (female pseudohermaphroditism), while the other individual was a normal female.
Since the labia minora are analogous to the urethral folds that participate in penile morphogenesis, this finding suggests that topographic origin influenced expression within the selected set of genes of these two samples more than AR signaling. In some cases, the anatomic origin of biopsy was not well documented (table 1). This might explain why some samples did not cluster as expected (e.g., ARD380 and ARD659, Fig. 2A) although other factors might have contributed to these findings.
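The hierarchical clustering behind such a dendrogram, with Pearson correlation as the similarity measure as stated for Figure 2, can be sketched in a few lines. The linkage rule is not given in the text, so average linkage is assumed here; the function names and the nested-tuple tree representation are likewise hypothetical, and the clustering software actually used is not named in this section.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cluster(profiles):
    """Average-linkage clustering of samples using distance 1 - Pearson r.

    profiles: dict mapping sample name -> list of expression values.
    Returns the merge tree as nested 2-tuples of sample names.
    """
    names = list(profiles)
    dist = {(a, b): 1.0 - pearson(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    nodes = [(name, [name]) for name in names]  # (subtree, member names)
    while len(nodes) > 1:
        best = None
        for i in range(len(nodes)):
            for j in range(i + 1, len(nodes)):
                mi, mj = nodes[i][1], nodes[j][1]
                avg = sum(dist[(a, b)] for a in mi for b in mj) / (len(mi) * len(mj))
                if best is None or avg < best[0]:
                    best = (avg, i, j)
        _, i, j = best
        merged = ((nodes[i][0], nodes[j][0]), nodes[i][1] + nodes[j][1])
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0][0]


def leaves(tree):
    """Flatten a merge tree into the list of sample names it contains."""
    if isinstance(tree, str):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])
```

Calling `leaves` on each of the two top-level subtrees gives the membership of the two major branches, which is how a "male" versus "female" split like the one described above would be read off.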
We also wanted to determine whether the new topology-independent AR gene list classified samples better than our previous gene list, which did not control for the locations from which samples were harvested. After removing the microarray experiments that were used to define either the previous [8] or the new gene set by SAM, the remaining samples were clustered. As expected, with both gene lists the samples sorted into two main branches that separated primarily male and female samples (Fig. 3A and 3B). Moreover, when we considered only topology-controlled samples originating from the labioscrotal swellings, excluding the mosaic samples and those of insufficiently described biopsy localization, the mean AIS grades of the two branches differed significantly using both gene lists (new gene list: AIS grades 2.5 ± 0.55 (male) and 4.17 ± 1.12 (female), p < 0.001 by t-test; previous gene list (Holterhus et al. 2003): AIS grades 1.6 ± 0.79 (male) and 3.3 ± 1.6 (female), p < 0.01 by t-test). However, in contrast to the new gene set, the previous gene set that did not account for topology misclassified many individuals. It incorrectly classified as female most of the highly virilized individuals with AIS 2 (3 of 4 individuals = 75%) and a large fraction of the normal male scrotal fibroblast controls (3 of 7 individuals = 43%) (Fig. 3A and 3B). The new gene set misclassified only one individual with AIS 2 (ARD306).
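The comparison of mean AIS grades between the two branches rests on a two-sample t statistic. The text does not say which variant of the t-test was used, so the sketch below assumes Welch's unequal-variance form; the function name is hypothetical, and converting the statistic to a p-value (which needs the t distribution) is left out.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    # Squared standard error contributed by each group.
    va = variance(group_a) / len(group_a)
    vb = variance(group_b) / len(group_b)
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (va + vb) ** 2 / (
        va ** 2 / (len(group_a) - 1) + vb ** 2 / (len(group_b) - 1)
    )
    return t, df
```

In this setting `group_a` and `group_b` would hold the AIS grades of the samples falling into the "male" and "female" branches, respectively.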

Biological processes in the AR-dependent gene expression program
We performed a systematic analysis for enrichment of genes belonging to defined biological processes and cellular pathways using the PANTHER classification system [11]. PANTHER classifies genes by their functions, based on published experimental evidence and on evolutionary relationships. The 612 significant transcripts corresponded to 527 named transcripts, of which PANTHER recognized 440 unique gene IDs. Several related biological processes were significantly over-represented in the AR-dependent gene list including "control of cell proliferation and differentiation" (p = 0.00001), "developmental processes" (p = 0.00013) and "cell cycle control" (p = 0.00041) (table 2). Analysis of cellular pathways also revealed several interesting signaling pathways including "angiogenesis" (p = 0.00001) and "WNT-signaling" (p = 0.00002) (table 3). These processes and pathways were reflected in the major branches of the cluster dendrogram revealing differential expression in the phenotypically male and female samples (Figures 2C and 2D). For instance, samples in the "male" branch showed high expression of CCNI (cyclin I), CCND1 (cyclin D1), IGF2 (insulin-like growth factor 2), IGFBP5 (Insulin-like growth factor binding protein 5), MYC (V-myc myelocytomatosis viral oncogene homolog), MAFF (V-maf musculoaponeurotic fibrosarcoma oncogene homolog F), EGFR (epidermal growth factor receptor), PTPN3 (protein tyrosine phosphatase), MET (hepatocyte growth factor receptor) and several other genes important in cell growth and proliferation (Fig. 1 and 2A). Transcripts expressed at high levels in the "female" branch included ANAPC7 (anaphase promoting complex subunit 7), FZD8 (frizzled homolog 8) and FZD6 (frizzled homolog 6) (Fig. 1).
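An over-representation p-value of this kind can be illustrated with a one-sided binomial test: given the fraction of reference genes annotated to a category, how likely is it to see at least the observed number of category members in a gene list of a given size? Whether this matches the exact computation PANTHER performed for tables 2 and 3 is an assumption, and the function names and arguments are hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed exactly term by term."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def overrepresentation_p(hits, list_size, category_size, reference_size):
    """One-sided p-value for seeing at least `hits` category genes in the list.

    category_size / reference_size is the chance that a randomly drawn
    reference gene belongs to the category.
    """
    p_category = category_size / reference_size
    return binomial_upper_tail(hits, list_size, p_category)
```

The exact summation is adequate for a list of a few hundred genes, such as the 440 recognized here; much larger lists would call for a numerically more careful implementation.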

Discussion
We demonstrate reproducible, lasting differences in gene expression in cultured fibroblasts harvested from analogous structures in the genital tissues of normal males and 46, XY CAIS females. This unique model system, in which samples were matched for genotype and topology, allows identification of gene expression programs predominantly influenced by AR signaling, likely during genital morphogenesis. By selecting fibroblasts derived from the labioscrotal swellings in normal and CAIS individuals, we were able to largely exclude androgen-independent mechanisms as a cause of systematic differences of baseline gene expression. While we had identified a related set of genes previously using fibroblast cells that had not been matched for site of origin, the significant differences between our previous and current gene sets imply that topology significantly influenced gene expression in our original set. This finding is instructive for future studies seeking to identify the effects of inactivating mutations of single genes in tissues or cells cultured from tissues. Gene expression profiles can be influenced by obvious genotypic differences, such as the presence of an X or Y chromosome [6], but also by subtle differences including differences in topology [9].
Figure 3 Experiment clustering with topography-controlled (A) versus previous (B) [8] gene set. Hierarchical clustering analyses of microarray experiments using the new topology-independent AR gene list (A) and our previous gene list [8] that did not control for topology of biopsy location (B). In each of the two clusters A and B, the microarray experiments used to generate the underlying list of significant genes by SAM were removed before clustering, resulting in 57 remaining experiments in (A) and 58 in (B), respectively. The color code is the same as in Fig. 2. Based on the previous gene set, several highly virilized AIS 2 patients and normal scrotal fibroblasts were incorrectly classified as female.
The transcripts we have identified appear to be highly relevant to male and female external genital morphogenesis. When used to classify an independent set of fibroblasts cultured from normal male, normal female, and PAIS-affected individuals, this gene set performed considerably better than our previous set in which we had not controlled for differences in topology. The observation that fibroblasts derived from the labia majora of normal female external genitalia clustered with AIS-derived samples from the same site demonstrates the primacy of developmental androgen actions in influencing gene expression in this set of genes, and possibly in external genital morphogenesis. Furthermore, our findings suggest that gene expression profiling of genital fibroblasts might be correlated with phenotype in vivo and could serve as a diagnostic marker of the extent of developmental androgenization, although many more cases will need to be analyzed, and other analytic methods will need to be used to identify an ideal classifier.
Chang and co-workers have demonstrated that fibroblasts cultured from different regions of the body retain position-specific expression signatures [9]. Our work confirms the presence of lasting, topology-influenced differences in gene expression in human fibroblasts. Controlling for the site from which genital fibroblasts were derived significantly altered the set of genes we identified as differing between normal and CAIS-derived fibroblasts. Despite that control, signatures of topology remained in our new gene set. For instance, normal male fibroblasts and CAIS-derived gonadal fibroblasts each clustered as distinct entities in the "male" and "female" arms, respectively. Thus, even in our stringently selected fibroblast samples and gene set, topology remains evident in the gene expression profile. Although Chang and co-workers have found that HOX gene signatures account for much of the topological gene classifier, our gene list lacked HOX genes, implying that traces of topology remain encoded in other sets of transcripts. Together with the results of Chang et al., our work demonstrates substantial heterogeneity in gene expression in cultured fibroblasts that can be influenced by topology and, in the case of our unique fibroblast samples, AR signaling. We suspect that other signaling pathways, possibly including other steroid hormones or growth factors, can program lasting changes in gene expression in fibroblasts. Whether these differences in gene expression can influence development of associated tissues in vivo remains to be elucidated. Regardless, our data suggest one possible model system for teasing out the effects of mutations in single genes on global gene expression patterns in cells cultured from in vivo models.
The biological processes and the cellular pathways enriched in the AR-dependent gene set could provide some insights into the role of AR-mediated programming of fibroblasts in male and female external genital differentiation. For instance, activation of several growth signaling pathways in male-derived fibroblasts and WNT signaling pathways in female genital fibroblasts, e.g., FZD8 (Fig. 1, 2 and 4), might signal their important roles in genital development. The same could be true for genes contributing to maintenance and modification of tissue shape and structural identity, e.g., versican, tenascin, and ADAM12 (Fig. 1, 2 and 4). It is unclear, however, whether expression of these genes in cultured fibroblasts reflects their expression in vivo, whether expression is necessary for maintenance of external genital structure, or whether these pathways are reactivated as a result of cell culture. Moreover, it cannot be excluded that potentially higher cumulative estrogen exposure during development and postnatal life in the AIS patients until the time of genital skin biopsy may have had additional influences on gene expression. It is possible that the gene expression programs have little to do with genital development; however, the finding that many of the genes are related to cell and tissue structure and integrity is tantalizing. We identified many genes of the cytoskeletal network (KRT5, 19, 25A, MYO10) and of the surrounding extracellular matrix (TNC, SPOCK, CSPG2, COL5A2) in the AR-dependent gene expression data set [12][13][14]. Additional in vivo work including strategies comprising early embryonal urogenital tissues will clarify whether genes identified using our approach are important in external genital morphogenesis.

Conclusion
Our data demonstrate the existence of large-scale AR-dependent gene expression programs in fibroblasts cultured from human external genitalia. These programs are influenced neither by differences in sex chromosomes nor by the topographic differences between the genital tubercle and the labioscrotal swellings. Improved classification of individuals with partial and complete androgen insensitivity syndrome with this gene list compared to a set we had identified previously suggests that androgen programming must have played a key role in establishing these programs. Therefore, the detected genes represent valuable targets for unraveling human external genital differentiation and the role of androgen therein.

Methods
The study was approved by the ethical committee of the University of Lübeck, Germany. Informed consent was obtained from the control subjects, affected individuals or their parents.

Cell strains
A total of 51 different primary fibroblast cultures were analyzed (table 1). Genital fibroblasts were established from scrotal skin biopsies from normal males undergoing orchidopexy or testis biopsy (scrotal fibroblasts) or from circumcisions (foreskin fibroblasts). Normal female genital skin fibroblasts were established from individuals undergoing genital plastic surgery (labia majora fibroblasts). Genital skin fibroblasts of patients with disorders of sex development (DSD, [15]), mostly AIS, were obtained either for diagnostic reasons or during reconstructive surgery (foreskin-derived or labioscrotal-derived or labial fibroblasts) or following gonadectomy (gonadal fibroblasts). Extra-genital skin fibroblasts were obtained from phenotypically normal male individuals. For gene expression profiling, fibroblasts of the 3rd to 7th passage were cultured to confluency (wherein they entered G0 arrest), as described previously [8].

RNA-isolation and cDNA-labeling
Protocols for mRNA and total RNA preparation, and cDNA labeling are available at [16]. Either 2 μg of mRNA (dataset 1) or 50 μg of total RNA (datasets 2 and 3) from genital fibroblasts were reverse transcribed and labeled with Cy5. Labeled cDNAs from genital fibroblasts were mixed with Cy3-labeled reference RNA from Stratagene (datasets 2 and 3) or a pooled reference of RNA from fibroblasts and 11 immortalized cell lines (dataset 1, see [8]).

Microarrays and hybridizations
Poly L-lysine coated (dataset 1 = 19 newly hybridized microarrays plus 22 previously published microarrays [8] and dataset 2 = 24 newly hybridized microarrays) or Corn- [...] Man Universal PCR mastermix (Applied Biosystems, Foster City, CA, USA) including the Taq polymerase were added to a final volume of 25 μl. Initial denaturation was performed at 95°C for 10 min followed by 1 min cycling intervals at 60°C using a RotorGene RG-3000 cycler (Corbett-Research, Sydney, Australia). Differential transcription levels between all different cell lines were calculated according to the ΔΔ-CT method [20].
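For reference, the ΔΔ-CT calculation can be written out as a short function. This is the standard form of the method, which assumes near-perfect doubling of product per cycle; the choice of reference (housekeeping) gene and calibrator sample is not stated in this section, so the argument names below are hypothetical placeholders.

```python
def ddct_fold_change(ct_target_sample, ct_ref_sample,
                     ct_target_calibrator, ct_ref_calibrator):
    """Relative expression of a target gene by the delta-delta-CT method.

    Each delta-CT normalizes the target gene to a reference gene within one
    sample; the difference of the two delta-CTs compares sample to calibrator.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2.0 ** -(delta_sample - delta_calibrator)
```

A target that crosses threshold two cycles earlier in the sample than in the calibrator, at equal reference-gene CT values, therefore corresponds to a 4-fold higher relative transcript level.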