Experimental design and overall microarray results
Trout spermatogenesis occurs seasonally in such a way that all morphological and cellular events tend to be synchronized [11, 12]. We thus used gonads at key stages in the male reproductive cycle (Figure 1A-E) to study the changes in gene expression underlying testis development. These included gonads at early stages containing slowly-dividing type A spermatogonia (Stage I, Figure 1A) or both type A and actively-dividing type B spermatogonia (Stages IIa and IIb, Figure 1B), maturing testes containing meiotic spermatocytes (Stage IIIb, Figure 1C) and post-meiotic spermatids (Stage V, Figure 1D) in addition to growing numbers of spermatogonia and, finally, gonads at a later stage, i.e. spawning testes, containing essentially fully developed spermatozoa (stage VIII, Figure 1E). Additionally, fractions of isolated germ cells enriched in spermatogonia (Figure 1F), spermatocytes (Figure 1G) or spermatids (Figure 1H) were used to identify the cellular origins of measured expression signals.
Unsupervised hierarchical classification of the samples (Figure 1I) clearly distinguished immature (Stages I, IIa and IIb) and spawning (Stage VIII) gonads from isolated germ cell fractions and maturing gonads undergoing meiosis and spermiogenesis (Stages IIIb and V, respectively). The correlation observed between isolated germ cell samples and total testes harbouring active spermatogenesis is not surprising given the large number of differentiating germ cells that accumulate during testis maturation. Conversely, the correlation observed between the early stages and stage VIII is more intriguing: gonads in the early stages contain mainly spermatogonia in various proliferation states together with somatic cells (i.e. Leydig, Sertoli and peritubular myoid cells), whereas stage VIII gonads are predominantly composed of mature spermatozoa and somatic cells (Figure 1H). Since spermatozoa contain only very small amounts of RNA [13], expression signals from stage VIII gonads are likely to represent the somatic complement of spawning testes. The overall expression profiles from entire gonads thus appeared consistent with the cellular events taking place throughout the male trout reproductive cycle.
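As an illustration only, the logic of such a sample classification can be sketched as an average-linkage clustering on a correlation distance (1 - r). The linkage rule, the distance and the four toy profiles below are assumptions made for this sketch; the text does not specify the algorithm or parameters actually used.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_linkage(profiles):
    """Agglomerative clustering with distance 1 - r; returns the merge order."""
    clusters = {name: [name] for name in profiles}
    merges = []
    while len(clusters) > 1:
        def dist(a, b):
            # Mean pairwise distance between the members of two clusters.
            pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
            total = sum(1 - pearson(profiles[i], profiles[j]) for i, j in pairs)
            return total / len(pairs)
        a, b = min(combinations(clusters, 2), key=lambda p: dist(*p))
        merges.append((a, b))
        clusters[a + "+" + b] = clusters.pop(a) + clusters.pop(b)
    return merges

# Hypothetical signals for four genes in four samples (not measured data).
profiles = {
    "stage_I": [9, 8, 1, 1],
    "stage_VIII": [8, 9, 2, 1],
    "stage_IIIb": [2, 1, 9, 8],
    "spermatocytes": [1, 1, 8, 9],
}
merges = average_linkage(profiles)
```

In this toy example the two somatic-dominated samples pair together, as do the maturing gonad and the isolated germ cells, before the two groups are finally joined.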
Annotation of the trout cDNA microarray
Genes/transcripts corresponding to the EST sequences available for the 9024 clones spotted on the microarray were annotated by searching for putative orthologous genes in the genomes of model fish species, namely Gasterosteus aculeatus, Danio rerio, Oryzias latipes and Takifugu rubripes. Among the 8665 clones mapped on at least one of these fish genomes (BLAT algorithm), 8647 were associated with Ensembl gene IDs. Overall, 8197 trout clones were thus confidently linked to a model fish gene - i.e. with at least 2 EST hits across all 4 genomes leading to the same Ensembl gene or to similar genes encoding proteins belonging to the same family - and corresponded to 6661 non-redundant (NR) genes. We next used information from the Ensembl database to associate clones with GeneOntology (GO) terms [14]. To compensate for the somewhat sparse annotation of fish genes, GO terms associated with the corresponding mammalian genes were also used. This inference was possible for 7065 clones for which associated fish genes also had rat, mouse and/or human orthologous genes, as predicted by the Ensembl database (Compara, version 52). This strategy enabled us to annotate 88% of the spotted clones in terms of "biological process", "molecular function" or "cellular component" annotations.
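The "at least 2 hits leading to the same gene" rule can be sketched as a simple vote across genomes. This is a minimal illustration, assuming that each genome hit has already been reduced to a gene or gene-family label; the labels, the dictionary layout and the handling of ties are hypothetical and are not taken from the annotation pipeline itself.

```python
from collections import Counter

def consensus_gene(hits, min_hits=2):
    """Return the label supported by at least `min_hits` genomes, else None.

    `hits` maps a genome name to a gene (or gene-family) label, or to None
    when the clone did not map on that genome.
    """
    counts = Counter(label for label in hits.values() if label is not None)
    supported = [label for label, n in counts.items() if n >= min_hits]
    # Assumption of this sketch: two equally supported labels are ambiguous.
    return supported[0] if len(supported) == 1 else None

# Hypothetical clone mapped on three of the four genomes.
clone_hits = {
    "Gasterosteus aculeatus": "family_X",
    "Danio rerio": "family_X",
    "Oryzias latipes": None,
    "Takifugu rubripes": "family_Y",
}
assigned = consensus_gene(clone_hits)
```

Here `assigned` is `"family_X"`, because two genomes agree on that label while `"family_Y"` is supported by a single hit.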
Identification of differentially-expressed genes during trout testis spermatogenesis
A statistical analysis, based on the comparison of 38 samples hybridized onto 38 microarrays, was performed to identify genes exhibiting significant changes in expression during trout spermatogenesis. Among 7821 well-measured clones, 3379 were thus found to be differentially-expressed and corresponded to 2771 NR genes (Additional file 1). Their classification into 9 expression clusters then allowed the identification of sequential events of gene activation or repression during testis development and/or germ cell differentiation. To further distinguish somatic cell from germ cell transcripts we focused on 1694 highly differentially-expressed clones (1441 NR genes; labelled "High differential" in additional file 1) for which cellular origins can be more confidently inferred (Figures 2 and 3). Note that the color-code used for heatmap representations reflects the signal intensity, i.e. the expression level, of a given gene, ranging from not detected (dark blue) to highly expressed (red). It therefore shows not only the changes in expression for a given gene, but also allows expression levels to be compared between genes.
At least 3 clusters corresponding to genes preferentially-expressed in somatic cells, as evidenced by their low or null expression in isolated germ cell samples, could thus be distinguished. Genes in cluster A (334 clones/284 NR genes) are expressed in early developmental stages, then decrease by a dilution effect as germ cells accumulate within the gonads (from stage IIb onwards) and finally increase in stage VIII testes when very few germ cells, apart from spermatozoa, are found within the testis (Figure 2). In contrast to this archetypical somatic expression pattern, genes in clusters B (160/133) and C (170/141) are expressed at lower and higher levels in stage VIII gonads, respectively (Figure 2). These genes are therefore likely to be regulated differently from those in cluster A during the reproductive cycle. Genes in cluster D (95/79) exhibit a somatic developmental expression pattern that is somewhat similar to cluster A but are, moreover, highly expressed in isolated spermatogonia (Figure 2).
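The dilution effect invoked for cluster A can be illustrated with a two-compartment mixture: if a transcript is made only by somatic cells at a constant level, its whole-gonad signal simply follows the somatic share of total RNA. The fractions and expression level below are hypothetical values chosen for illustration, not measurements.

```python
def bulk_signal(somatic_fraction, somatic_level, germ_level=0.0):
    """Expected whole-gonad signal for a somatic/germ-cell RNA mixture."""
    germ_fraction = 1.0 - somatic_fraction
    return somatic_fraction * somatic_level + germ_fraction * germ_level

# Hypothetical somatic share of total RNA at each stage (illustrative only).
somatic_fraction = {"I": 0.8, "IIb": 0.5, "IIIb": 0.2, "V": 0.1, "VIII": 0.9}

# A somatic-only transcript expressed at a constant per-cell level of 100.
profile = {stage: bulk_signal(f, somatic_level=100.0)
           for stage, f in somatic_fraction.items()}
```

The resulting profile falls from stage I to stage V and rises again at stage VIII without any change in per-cell expression, which is the pattern described for cluster A.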
Clusters E and F contain genes that are most highly expressed in isolated spermatogonia but that could be distinguished on the basis of their developmental expression pattern (Figure 3). Genes in cluster E (111/94) exhibit a progressive decrease in expression as spermatogenesis proceeds (from stage IIIb onwards) and might correspond to weakly differentiated spermatogonia. In contrast, genes in cluster F (202/154) transiently increase from stages IIb to V and could correspond to more differentiated, actively-dividing spermatogonia and early spermatocytes that accumulate during the course of the reproductive cycle (Figure 3).
Clusters G, H and I were found to contain genes expressed within the germline or with expression peaking in meiotic and post-meiotic germ cells. Cluster G (129/108) contains genes highly expressed in all germ cell types that are also highly expressed in stage IIIb and V gonads (Figure 3). Cluster H (365/302) corresponds to genes more specifically expressed in both spermatocytes and spermatids. Again, these transcripts accumulate at high levels in stage IIIb and V gonads (Figure 3). Finally, genes belonging to cluster I (128/116) are highly expressed within the germline, especially in both spermatocytes and spermatids, but differ from clusters G and H as they do not show a clear differential expression pattern during testis maturation (Figure 3).
Normalized expression data, expression cluster information and corresponding annotations are available for all differentially-expressed trout clones in a searchable Excel spreadsheet provided in additional file 1. Note that the expression profiles for 5 somatic genes (Amh, Dmrt1, Slc26a4, Sox9a and Tbx1) were verified and confirmed by qPCR experiments (Additional file 2). The comparison of the 2 methods revealed that microarray data tend to minimize the differences between experimental groups of samples.
Functional mining of expression clusters highlights relevant functions in trout testis development
We performed a GeneOntology term analysis to investigate the biological significance of the 9 expression clusters described above. We found enriched functions in each cluster that are consistent with known events occurring at different stages of spermatogenesis and in specific cellular compartments of the testis ("biological process" terms, Figure 4).
The somatic expression cluster A was thus found to be significantly enriched in genes coding for proteins involved in hormone biosynthetic or catabolic processes, which may originate from steroidogenic Leydig cells, including some genes more specifically involved in C21-steroid hormone or retinol metabolic processes (Akr1d1, Cyp17a1, Cyp11a1, Dhrs3 and Dhrs9, Ece1, Hsd11b2). Additionally, a number of genes involved in the regulation of muscle contraction (Anxa6, Atp1a1, Gucy1a3, Myl9, Tpm1 and Tpm3) were found in this cluster and are possibly expressed by contractile peritubular myoid cells. Cluster B, which might correspond to somatic genes undergoing down-regulation as the testis matures, was found to contain many genes involved in various developmental processes. It included genes encoding extracellular matrix constituents or organizers (Itga6, Col1a1 and Col1a2), but also transcription factors (Bapx1, Pcgf2, Sox10 and Tbx1) involved in skin and skeletal system morphogenesis or odontogenesis. Genes in cluster C are highly expressed in stage VIII testes and were notably found to be involved in water transport and homeostasis, cAMP metabolic processes (Aqp4, Itnp, Timp2 and Vsnp), lactation and secretion (Ank1, Agpat9, Canx, Cav2, Dgat1, Gars, Glrx5 and Slc39a1). These biological processes appear to be highly relevant functions for semen fluid constitution and sperm hydration during spawning. Cluster D, which corresponds to genes that might be expressed in both somatic cells and spermatogonia, contains genes involved in glycolysis (Aldoa, Eno1p, Eno3, Gapdh, Pgk1 and Tpi1a), antigen processing and presentation (Cd74, B2m, Mhc1uca and Mica) and the aldehyde metabolic process (Tpi1a, Aldh7a1 and Aldh9a1). Type A spermatogonia cluster E was found to be enriched in genes involved in the response to bacterium and ionizing radiation or terpenoid metabolic process (Adh5, C10orf33, Dhrs4, Gpx3, Hadha, Hamp1 and Tf).
More interestingly, genes in this cluster involved in double-strand break repair (Xrcc4, Xrcc6 and Msh2) may participate in specific checkpoints in these cells to ensure the integrity of the transmitted genome. Additionally, numerous genes were involved in ribosome biogenesis (Aatf, Exosc5, Mina, Nmd3, Rplp0, Rpsa and Wdr3), a function that was also found to be enriched in cluster F. Cluster F is likely to correspond to more differentiated spermatogonia and was found to contain genes involved in RNA metabolism, including mRNA splicing via the spliceosome, RNA transport (Ckap5, Ddx39, Dye, Nup43, Thoc4 and Xpo1), rRNA processing (Ddx56, Dkc1, Eif4a3, Exosc7, Exosc8, Fbl, Nol5 and Utp14c) or mRNA polyadenylation (Cpsf1, Cstf2 and Pabpn1), which could be important for fine-tuning gene expression in these cells. In agreement with the high proliferative activity of these cells, a number of genes were also involved in DNA metabolic processes, notably in the initiation of DNA replication (Mcm2-7, Orc1l and Rpa4) and nucleotide-excision repair, DNA gap filling or base excision repair (Msh6, Pcna, Pold1, Rfc3, Rfc4, Ung and Rpa4). Cluster G corresponds to genes widely expressed within the germline that are involved in DNA replication along with cell division, mitosis or mitotic cell cycle checkpoint (Cdc2, Mad2l1, Ttk and Zw10). Furthermore, genes involved in meiosis were found in cluster F (Fancd2, Kif2c, Msh6, Psmd13, Spin1 and Utp14c), cluster G (Psmc3ip, Rad50, Sycp3l, Ttk and Zw10) and cluster H as well (Ccna1, Ccnb3, Cdc25c, Dmc1, Kif2c, Meig1, Nek2, Topbp1 and Tsga2).
Finally, whereas the meiotic/post-meiotic cluster H shared common biological functions with cluster G (cell division, mitosis and microtubule-based movement), it was also consistently and specifically enriched in genes involved in microtubule cytoskeleton organization, ciliary or flagellar motility (Cetn3, C14orf143, Dnah1, Dnah7, Dnah8, Dnah12, Fam179b, Gabarap, Kif2c, LOC795513, Ndel1, Tekt3, Tekt4, Tubb1 and Ube2c) and spermatogenesis (Ccna1, Dmc1, Dnah9, Dnajb13, Ggnbp2, Hist1h1d, Klhl10, Nme5, Ropn1l, Sapg6, Spata4, Tegt, Tle3 and Txndc6). Only cluster I did not show any specifically enriched biological processes and its biological significance remains unclear.
Taken together, the statistical enrichments obtained in this GO term analysis validate our gene expression analysis and the annotation of trout genes. It can be noted that concordant results were obtained at molecular function or cellular component levels (Additional files 3 and 4) as well as for conserved genes correlated with mice and for testis-specific genes (see below and additional files 5 and 6).
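For readers unfamiliar with GO term enrichment, the underlying question is whether a term occurs in a cluster more often than expected by chance. The statistical test used is not named in this section; the sketch below assumes a one-sided hypergeometric test, which is a common choice, and uses hypothetical counts.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes in a cluster.

    N: annotated genes in the background, K: background genes carrying the
    GO term, n: genes in the cluster, k: cluster genes carrying the term.
    """
    upper = min(n, K)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)

# Hypothetical counts: 6 of 100 cluster genes carry a term found in
# 20 of 2000 background genes (about 1 would be expected by chance).
p_value = hypergeom_enrichment_p(k=6, n=100, K=20, N=2000)
```

In practice such p-values would also need correction for the large number of GO terms tested, a step that is not shown here.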
Correlating gene expression during spermatogenesis in fish and mammals
We next conducted a cross-species comparison to identify evolutionarily conserved genes that also exhibit conserved expression during trout and mouse testis development. We used expression data corresponding to postnatal testis development in the mouse (GEO repository: GSE12769), as the first wave of spermatogenesis occurs, and compared the expression of 1105 differentially-expressed NR trout genes with their mouse orthologs. Firstly, we selected mouse genes with significant expression (≥ overall median value) in at least one developmental stage and identified 1044 genes that were differentially-expressed in trout and also consistently detected in mouse testes, suggesting roles during testis development or spermatogenesis for these genes in both species. In addition, we performed a correlation test to compare expression profiles in trout developmental stages (I, IIb, IIIb and V) and mouse post-partum ontogeny (0, 8, 18 and 30 dpp). This enabled us to select 442 clones (403 NR genes) that exhibit very similar developmental expression patterns and cover most of the expression clusters (Figure 5). Indeed, only genes in cluster C, whose expression profiles depend mainly on stage VIII gonads (a stage specific to highly cyclic fish species with no equivalent in mice), and in cluster I, which do not show important changes in expression during testis development, were found to exhibit poorly correlated expression with their mouse orthologs. Importantly, the genes selected by this filtration on the basis of developmental expression profiles also exhibited a clear-cut expression pattern in testicular cells isolated from mice. These genes might therefore be expressed in the same cell types in both species where they are likely to exert similar functions. Furthermore, the GO term analysis performed on 3 broad expression clusters (i.e. somatic, spermatogonia and meiotic/post-meiotic) for these conserved genes not only confirmed previous results obtained for trout differentially-expressed genes, but also extended these findings (Additional files 5 and 6). A striking example is that of the transforming growth factor receptor signalling pathway, for which 8 out of 9 differentially-expressed trout somatic genes were found to have conserved expression during mouse spermatogenesis (Bambi, Col1a2, Id1, Nfic, Nfix, Sptbn1a and b and Smad7). Similarly, 5 of the 6 genes involved in sperm motility and/or fertilization that belong to the meiotic/post-meiotic cluster were also found conserved during mouse spermatogenesis (Gas8, HistH3, Klhl10, Pvrl3, Ropn1l and Sapg6). This confirms that identifying genes with evolutionarily conserved expression is an efficient way to highlight factors with important functions in a given process.
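The two-step selection described above (expression filter, then profile correlation across the four matched time points) can be sketched as follows. The use of Pearson correlation, the cut-off `r_min` and all numerical values are assumptions made for illustration; the correlation test and threshold actually applied are not detailed in this section.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation; returns 0.0 for a flat (constant) profile."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

def is_conserved(trout, mouse, mouse_median, r_min=0.9):
    """Expressed in mouse at one stage or more AND correlated with trout."""
    expressed = max(mouse) >= mouse_median
    return expressed and pearson(trout, mouse) >= r_min

# Hypothetical profiles: trout stages I, IIb, IIIb, V versus
# mouse 0, 8, 18 and 30 dpp, with a hypothetical overall median of 50.
trout_profile = [1.0, 2.0, 8.0, 9.0]
mouse_profile = [10.0, 20.0, 70.0, 95.0]
conserved = is_conserved(trout_profile, mouse_profile, mouse_median=50.0)
```

With only four paired points per gene, a correlation coefficient is a coarse measure, so any threshold of this kind should be read as a ranking aid rather than a strict significance criterion.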
Trout clones for which corresponding orthologous genes exhibit correlated expression or are significantly expressed during mouse spermatogenesis are labelled respectively "Correlated" and "Expressed" in additional file 1.
Tissue profiling analysis reveals potential testis-specific genes in trout
We used expression data from 5 non-testicular tissues (ovary, liver, muscle, gill and brain) hybridized on the same microarray to identify potential testis-specific genes and found 121 clones (111 genes) expressed predominantly in the testis with low or no expression in all the other tissues (Figure 6; labelled "Testis" in additional file 1). Only 23 of these potential testis-specific genes did not pass the standard deviation filtration step, thus demonstrating highly specific gene expression, not only at tissue level, but also at cellular level. Thirty-four of these genes belong to somatic expression clusters and include notably 2 genes that are crucial for male gonad development: Dmrt1, a well-known sexually-dimorphic gene involved in both Sertoli cell differentiation and spermatogenesis maintenance [15], and Lhcgr, which encodes the receptor for LH and chorionic gonadotropins [16, 17]. This also includes genes encoding receptors or extracellular proteins that may account for specific paracrine regulation within the testis (Ca6, Enpp6, Gpc3, Gpr175, Lgi1b, Lrp1, Podn, Tgfbr3 and Vmo1). The spermatogonia expression clusters also contain 14 genes with apparently preferential expression in the male gonad. Importantly, 6 of these genes - Cpsf1, Ddx19, Exosc8, Mcrs1, Msi2 and Piwil2, the latter gene encoding a stem-cell protein essential for germ cell differentiation [18] - were again found to bear functions related to RNA processing or transport. Finally, 73 genes with expression detected in the testis only were found within the germline and meiotic/post-meiotic expression clusters: they consistently contained a number of genes coding for proteins known to be important in meiosis and/or spermatid development (Dmc1, Dnah9, Dnajb13, Dnal1, Gmcl1, Nek2, Nme5, Spata4, Trip13 and Zw10).
They also contained genes encoding proteins localized on or involved in the motility of flagella and cilia (c10orf63, c14orf143, Dnah8, Efhc1, Rshl3 and Ttll9) that are relevant sperm cell components. In addition, several transcription factors and cell cycle regulators that could be important for progression through meiosis were found (Cdkn3, Lcmt1, Polr2i, Smc4, Styxl1 and Xccr3). It can be noted that when exploring enrichment for GO annotations linked to potential testis-specific genes, only terms related to meiosis or spermiogenesis were highlighted (Additional files 5 and 6), as previously observed in mammals [19].
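The notion of "expressed predominantly in the testis with low or no expression in all the other tissues" can be made concrete with a simple two-threshold rule. The thresholds and signal values below are hypothetical and serve only to illustrate the selection principle, not the exact criteria applied to the microarray data.

```python
OTHER_TISSUES = ("ovary", "liver", "muscle", "gill", "brain")

def is_testis_specific(signals, expressed_min=8.0, background_max=5.0):
    """True when the testis signal is high and every other tissue is low.

    `signals` maps tissue names to a log-scale signal; both thresholds are
    illustrative assumptions of this sketch.
    """
    testis_high = signals["testis"] >= expressed_min
    others_low = all(signals[t] <= background_max for t in OTHER_TISSUES)
    return testis_high and others_low

# Hypothetical clones: one restricted to testis, one shared with the ovary.
testis_only = {"testis": 10.2, "ovary": 3.1, "liver": 2.8,
               "muscle": 3.0, "gill": 2.5, "brain": 3.3}
gonad_shared = {"testis": 10.0, "ovary": 9.4, "liver": 2.9,
                "muscle": 3.1, "gill": 2.7, "brain": 3.0}
flags = (is_testis_specific(testis_only), is_testis_specific(gonad_shared))
```

A clone such as `gonad_shared` fails this rule because of its ovarian signal, which corresponds to the separate group of gonad-expressed genes rather than to the testis-specific set.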
Attention should be given to genes expressed in the female trout gonad as they are apparently also candidate genes for the male gonad function. Importantly, 67 additional genes highly expressed in the testis were also found to be expressed in the mature ovary (Additional file 7; labelled "Gonad" in additional file 1), including some genes already known to be important for meiosis or oocyte differentiation (Ccna1, Nmp2, Psmc3ip and Sycp3l). Also among this group of transcripts, a majority were annotated with functions related to cell proliferation or cell cycle regulation (Ada, Cdkn1c, Cks1b, Gsg2, Nasp and Pdcd6), spindle formation and chromosomal positioning or structure (Kntc2 and Smc2), DNA replication or repair (Cdt1, Mcm5, Nasp, Pold3, Recql5 and Rfc4) that are all of obvious interest for germ cell development.
Candidate genes of interest in spermatogenesis regulation
To gain insight into the regulatory mechanisms that drive spermatogenesis we focused on genes that are preferentially expressed in the somatic cell compartment and may be involved in cell fate commitment during early testis maturation or in driving germ cell differentiation throughout the male reproductive cycle. This was performed by searching for genes encoding transcriptional regulators (corresponding to the GO term "transcription factor activity"; GO:0003700), growth factors ("growth factor activity"; GO:0008083), receptors for growth factors ("growth factor binding"; GO:0019838 and "growth factor receptor activity" terms) or a more general class of extracellular proteins ("extracellular space"; GO:0005615) in all 4 somatic cell expression clusters. Expression profiles of these genes are presented in Figure 7.
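This candidate search amounts to intersecting cluster membership with a short list of GO identifiers, which can be sketched as below. The record layout and the clone names are hypothetical; only the four GO identifiers given in the text are used, and the "growth factor receptor activity" terms, for which no identifier is given, are left out of the sketch.

```python
REGULATOR_TERMS = {
    "GO:0003700": "transcription factor activity",
    "GO:0008083": "growth factor activity",
    "GO:0019838": "growth factor binding",
    "GO:0005615": "extracellular space",
}
SOMATIC_CLUSTERS = {"A", "B", "C", "D"}

def candidate_regulators(clones):
    """Map each somatic-cluster clone to its matching regulatory GO terms."""
    selected = {}
    for name, record in clones.items():
        if record["cluster"] not in SOMATIC_CLUSTERS:
            continue
        matches = sorted(REGULATOR_TERMS[t] for t in record["go"]
                         if t in REGULATOR_TERMS)
        if matches:
            selected[name] = matches
    return selected

# Hypothetical annotation records (clone names and GO sets are invented).
clones = {
    "clone_1": {"cluster": "A", "go": {"GO:0003700", "GO:0005634"}},
    "clone_2": {"cluster": "H", "go": {"GO:0003700"}},
    "clone_3": {"cluster": "C", "go": {"GO:0006096"}},
    "clone_4": {"cluster": "B", "go": {"GO:0005615", "GO:0008083"}},
}
candidates = candidate_regulators(clones)
```

In this toy input, only `clone_1` and `clone_4` are retained: `clone_2` carries a regulatory term but lies in a germ cell cluster, and `clone_3` is somatic but has no term of interest.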
We next performed in situ hybridization (ISH) to localize expression for some of the somatic genes selected above. To validate our ISH protocol and the quality of the gonad samples, we first investigated the expression of genes of known cellular origin. The gene encoding the anti-Müllerian hormone, Amh, was strongly detected in Sertoli cells in the early stages I-II (cluster C; Figure 8A) and, in agreement with its microarray expression profile, the staining progressively decreased in Sertoli cells in maturing testes and at the spawning stage (data not shown). The signal for Ddx4, the gene encoding the Drosophila VASA homolog, was strongest in spermatogonia and early spermatocytes (cluster F; Figure 8B) and decreased in subsequent germ cells in a similar way to the microarray experiment signals. Finally, the strong signal detected for Txndc6 in both spermatocytes and spermatids (Figure 8C) confirmed its meiotic/post-meiotic profile (cluster H). We then investigated the expression of Tcf23 (cluster A), a gene encoding the basic helix-loop-helix transcription regulator TCF23 (OUT), which we found to be preferentially-expressed in both the testis and ovary. Tcf23 staining was found at each developmental stage in thin cells very closely associated with the borders of spermatogenic tubules (Figure 8D), which are likely to be peritubular cells. ISH performed on Igfbp7, a gene encoding an insulin-like growth factor-binding protein (cluster A), revealed its expression in the interstitial Leydig cells (Figure 8E). In addition, Igfbp7 mRNA was also detected in endothelial cells from vessels (data not shown). Ghr2, a duplicated isoform of the gene encoding the growth-hormone receptor (cluster A), was found to be expressed in stage VIII testes in large, round cells present in the interstitial tissue that are likely to be Leydig cells (Figure 8F).
Finally, the expression of Tgfbr3 (cluster A), which encodes the type III TGF-beta receptor and was detected only in the testis, was evidenced in interstitial Leydig cells in stage II testes (Figure 8G). Overall these results further confirmed our prediction of "somatic", "spermatogonial" and "meiotic/post-meiotic" expression clusters.