Comparative transcriptome analysis of three gonadal development stages reveals potential genes involved in gametogenesis of the fluted giant clam (Tridacna squamosa)

Background Gonad development and differentiation is an essential function for all sexually reproducing species, and many aspects of these developmental processes are highly conserved among the metazoa. However, the mechanisms underlying gonad development and gametogenesis remain unclear in Tridacna squamosa, a large-size bivalve of great ecological value. They are protandrous simultaneous hermaphrodites, with the male gonad maturing first, eventually followed by the female gonads. In this study, nine gonad libraries representing resting, male and hermaphrodite stages in T. squamosa were performed to identify the molecular mechanisms. Results Sixteen thousand four hundred ninety-one unigenes were annotated in the NCBI non-redundant protein database. Among the annotated unigenes, 5091 and 7328 unigenes were assigned to Gene Ontology categories and the Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway database, respectively. A total of 4763 differentially expressed genes (DEGs) were identified by comparing male to resting gonads, consisting of 3499 which were comparatively upregulated in males and 1264 which were downregulated in males. Six hundred-ninteen DEGs between male and hermaphroditic gonads were identified, with 518 DEGs more strongly expressed in hermaphrodites and 101 more strongly expressed in males. GO (Gene Ontology) and KEGG pathway analyses revealed that various biological functions and processes, including functions related to the endocrine system, oocyte meiosis, carbon metabolism, and the cell cycle, were involved in regulating gonadal development and gametogenesis in T. squamosa. Testis-specific serine/threonine kinases 1 (TSSK1), TSSK4, TSSK5, Doublesex- and mab-3-related transcription factor 1 (DMRT1), SOX, Sperm surface protein 17 (SP17) and other genes were involved in male gonadal development in Tridacna squamosal. Both spermatogenesis- (TSSK4, spermatogenesis-associated protein 17, spermatogenesis-associated protein 8, sperm motility kinase X, SP17) and oogenesis-related genes (zona pellucida protein, Forkhead Box L2, Vitellogenin, Vitellogenin receptor, 5-hydroxytryptamine, 5-hydroxytryptamine receptor) were simultaneously highly expressed in the hermaphroditic gonad to maintain the hermaphroditism of T. squamosa. Conclusion All these results from our study will facilitate better understanding of the molecular mechanisms underlying giant clam gonad development and gametogenesis, which can provided a base on obtaining excellent gametes during the seed production process for giant clams. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-020-07276-5.

Conclusion: All these results from our study will facilitate better understanding of the molecular mechanisms underlying giant clam gonad development and gametogenesis, which can provided a base on obtaining excellent gametes during the seed production process for giant clams.
Keywords: Tridacna squamosa, Gonadal development and gametogenesis, Transcriptome, Reproduction, Differential expression genes Background Reproductive development and sex determination are widespread and significant processes which have long been of interest to biologists. The processes of sex determination and differentiation are tremendously diverse in mollusks, ranging from functional (simultaneous) hermaphroditism, alternative sexuality (sequential hermaphroditism), strict gonochorism or dioecy (species that exist as separate males and females), to species that are capable of sex changes [1]. Giant clams (subfamily Tridacninae), the largest living bivalves in the world, are native to coral reefs throughout much of the tropical Indo-Pacific [2]. These organisms play various roles in coral reef ecosystems, for example, their shells act as substrates for epibionts, and serve as nurseries to various organisms [2]. All giant clams are protandrous functional hermaphrodites, becoming simultaneous hermaphrodites in later years. The male phase of the gonad develops first and eventually matures the female gonads. The normal spawning sequence is for sperm to be produced first, followed by egg production after a short interval. Release of sperm is triggered in nature by the presence of a spawning inducer associated with ripe eggs [3]. Unfortunately, giant clams have suffered from widespread harvesting for food, shell collecting and the aquarium trade. The over-exploitation of giant clams has led to the decline of the population throughout its geographic range and ecological extinction [4]. Thus, a certain degree of difference was found between the genetic structures of giant clam species [5,6]. Consequently, all giant clam species are protected under the Convention of International Trade in Endangered Species of Wild Fauna and Flora (CITES) and are listed on the International Union for Conservation of Nature (IUCN) Red List of Threatened Species [7]. Therefore, better quality and higher seeds production are required to maintain the sustainable development of giant clams.
In order to control the quality and quantity of giant clams and their eggs in aquaculture, it is crucial to understand the molecular mechanisms of gonad development and gametogenesis, which may facilitate the production of high-quality clam seeds. The first step toward understanding molecular mechanisms of gonad development and gametogenesis is to identify and characterize reproduction-related genes and pathways. However, studies on gonad development and gametogenesis genes and pathways in mollusks are few and limited. In these previous studies, many efforts have been made to reveal genes homologous to sex-determining pathway genes in model species [8][9][10]. The vertebrate female-determining genes including β-catenin and forkhead box L2 (FOXL2), as well as male-determining genes including double-sex-and mab-3-related transcription factor (DMRT) and SOXE, have been identified in some mollusks. In Crassostrea gigas, CgFOXL2 expression increases during the adult gametogenetic cycle for both sexes, but with a significant increase occurring earlier in females than in males [11]. Cg-β-catenin is expressed in vitellogenic oocytes and may be involved in early oyster gonadic differentiation [12]. In Chlamys nobilis, CnDMRT2 is likely to be involved in playing a functional role in male gonadal development or maintenance of gonadal function, and CnDMRT5 may be involved in biological processes other than gonadal development in C. nobilis [13]. In Pinctada martensii, PmDMRT2 might play a functional role during spermatogenic cell differentiation from spermatocytes and spermatids into sperm [14]. However, unlike other families of bivalves, which have doubly uniparental inheritance (DUI) and sex reversal [15,16], T. squamosa is a functional hermaphroditic bivalve [17]. In T. squamosa, sex is more likely to be dominated by the interaction of multiple genes. Next-generation sequencing technology has been utilized to study the genes related to reproduction in various species [18][19][20][21][22][23][24], but no data is currently available on the gonad transcriptome of T. squamosa.
In the present study, to obtain a comprehensive transcriptome database of the various gonad developmental stages in T. squamosa, we used the Illumina sequencing technology to discover genes potentially involved in gonad development and gametogenesis for resting, male, and hermaphroditic gonadal developmental stages. To our knowledge, this work is the first report on transcriptome profile analysis of gonads in T. squamosa. Results from the transcriptome analysis would be particularly important for improving understanding of the molecular mechanisms underlying the regulation of gonadal development and providing novel insights into the aquaculture of T. squamosa.

Giant clam gonad development and histological observation
To gain a better understanding of gonad development, histological analysis using HE-stained sections was conducted to compare different development stages. Histology showed that resting gonads are filled with connective tissue and lack any gamete-producing tissue or other tissue which could be associated with a particular sex. In the male gonads, the tissues were comprised of spermatogonia, primary spermatocytes, secondary spermatocytes, and spermatids. In the hermaphrodite gonads, both oocytes and sperm were detected (Fig. 1).

Evaluation of biological replicates
Pearson's Correlation Coefficient (r) is an important index for the evaluation of the correlation of the samples. Based on the r2 values in Table S2, two comparisons were made (resting versus male, male versus hermaphrodite) to avoid comparing significantly different samples, improving data authenticity and repeatability between samples.

Sequencing and de novo assembly
In the present study, nine cDNA libraries were constructed for Illumina sequencing. The data processing results were summarized in Table 1. After eliminating primers, adapter sequences, and low-quality reads, a total of 43,251,171 clean reads were obtained from the resting gonads, 42,793,935 from the male gonads, and 38,375,061 from the hermaphroditic gonads. All clean data were assembled into 124,565 transcripts and 95,408 unigenes with a mean length of 872. 13 Table 2). The annotation results showed that more than half (72.72%) of the genes were not well annotated, due to lacked significant similarity with other sequences deposited in the aforementioned databases.

Functional annotation of transcriptome
Functional prediction and classification of the unigenes was conducted by searching the KOG and GO databases. For the KOG annotation, all the unigenes were annotated and classified into 26 functional categories (Fig.  S1). The top three terms were: general function prediction only (2448, 20.48%); signal transduction mechanisms (1947,16.29%); and posttranslational modification, protein turnover, chaperones (1015, 8.49%), respectively. However, a certain number of unigenes were assigned to unknown protein (843, 7.05%), due to the lack of available databases. GO is an international gene functional classification system that is utilized for functional categorization of DEGs [25]. Five thousand ninety-one unigenes were classified according to three major GO categories (Fig. S2). In the biological process category, "cellular process" was the most abundant GO term, while in the cellular component and molecular function categories, "cell part" and "catalytic activity" were the most enriched terms, respectively.

Differential expression and functional analysis of assembled giant clam transcripts
To better survey the biological mechanism of gonad development, it is important to identify the genes which are differentially expressed between stages. To increase the accuracy of the measured expression levels for further analyses, data from 3 libraries derived from the  (Table S3, S4). An overall view of the expression patterns between the two groups is shown in Fig. 2 (FDR ≤ 0.01 and log 2 -Ratio ≥ 1). Hierarchical cluster analysis showed that the clustering branch displayed the similarity of genes or samples, which conformed to the evaluation of biological replicates (Fig. 3).
Enrichment analysis in the molecular function, cellular component and biological process categories produced 613, 85 and 172 enriched GO-terms, respectively, for the Resting versus Male group, and 55, 12 and 14 for the Male versus Hermaphrodite group. (Table S5). The most-enriched GO-terms for the Resting versus Male group were "serine/threonine kinase activity" in the molecular function category, "chromosome" in the cellular component category, and "single-organism transport" in the biology process category. In the Male versus Hermaphrodite group, the most-enriched GO-terms were "lipid particle" and "membrane" in the cellular component category; "binding" and "signal transducer activity" in the molecular function category; and "oocyte maturation", "activation of MAPKK activity" and "protein peptidyl-prolyl isomerization" in the biological process category (Fig. 4).

Identification of genes involved in the regulation of gonad development
By analyzing the overall gene expression profiles of gonads, at least 31 genes involved in spermatogenesis were identified in the male group, including doublesex-and mab-3-related transcription factor, transcription factor Sox-8, sperm surface protein 17, sex determining protein Fem-1, TSSK4 and other potential candidates (Table 3). More than 40 genes, including both spermatogenesis (SPATA17, SOX8, SP17, SMKX, testis-specific serine/ threonine kinases 4 and Sperm-associated antigen 8) and oogenesis genes (Zona pellucida, vitellogenin, 5hydroxytryptamine receptor, Forkhead Box L2, vitellogenin receptor, and transcriptional regulator ATRX), were found to be responsible for the maintenance of hermaphrodite giant clams ( Table 4). Identification of these essential genes and their regulatory mechanisms provides new understanding about the complex processes of reproduction and development. The information gained about these genes can be used to improve giant clam aquaculture.

Validation of differentially expressed genes using qRT-PCR
To validate the expression levels of DEGs identified by RNA-Seq in gonads, we randomly selected 10 DEGs related to sex-differentiation (DMRT, SPAPA17, SOX8, TAAK1, SP17, ZP, FOXL2, 5HTR, VGR, ATRX) for qRT-PCR validation. Expression of DMRT, SPAPA17, SOX8, TSSK1, and SP17 was higher in testes, whereas ZP, FOXL2, 5HTR, VGR, ATRX were found to be elevated in ovaries. Comparison of the transcriptome data from RNA-Seq with the qRT-PCR results from seven selected differentially expressed genes revealed that they were consistent with each other at these gonad developmental stages (Fig. 6). These results reiterate the differential gene expression pattern observed in gonadal transcriptome analysis.

Discussion
Gonad development is a very complex and critical process which begins before sexual differentiation. During this process, many genes cause the gonad to  . The x-axis shows three terms and the y-axis shows the proportion of DEGs and unigenes corresponding to each subcategory. The red column represents annotation of all genes, while the blue column represents annotation of DEGs differentiate into either a testis or ovary and, subsequently, cause the development of a male, female, or hermaphroditic phenotype [26][27][28][29][30]. Giant clams are protandrous hermaphrodites [31]. Their sequential sexual development begins in the juvenile stage with no visible gonads and progresses to the development of testes, which is followed later by ovary development, resulting in hermaphroditic individuals [3]. Recent research on the sex determination mechanisms and sex-related genes of mollusks has made considerable progress with the advancement of next-generation sequencing technology. However, research efforts have mainly focused on dioecious mollsuks such as Haliotis rufescens (Myosho et al., 2012), Chlamys nobilis [32] Patinopecten yessoensis [33] Haliotis discus discus [34] Crassostrea hongkongensis [35] Mytilus edulis [36] and Crassostrea gigas [37]; studies on hermaphroditic mollusks such as giant clams are extremely scarce. Thus, it's vital to identify genes that are involved in the gonadal development of hermaphroditic animals. Here, we proposed to unravel some molecular mechanisms and genes involved in gonad development and gametogenesis of a tropical marine hermaphrodite mollusk, T. squamosa, using Illuminabased RNAseq.

Annotation of giant clam gonad transcriptome
To obtain a gonadal expression profile from the giant clam, 9 samples of gonads in different reproductive stages were sequenced using an Illumina HiSeq2500 highthroughput sequencing platform. From these, a total of 124,564 transcripts (N50 = 1488) and 95,408 unigenes (N50 = 1143) were identified. On average, the statistics for the de novo assemblies are similar to those for other transcriptomes of other species [38][39][40]. Because no reference genome exists for giant clams, the high-quality reads from the nine libraries were combined and assembled into a The abscissa is the enrichment factor, which increases the more significant the enrichment level of differentially expressed genes in the pathway. The ordinate is log10 (Q value), which increases with greater significance of differentially expressed genes in the pathway reference transcriptome using Trinity software with default parameter settings [41]. Annotation results showed that more than half (72.72%) of unigenes could not be assigned to known genes, making it difficult for transcriptomes annotation, which is consistent with previous studies in the Pacific oyster (24.42% annotation) [42], and Laternula elliptica (16.93% annotation rate) [43]. The poor annotation efficiency could be due to insufficient sequences in public databases for phylogenetically closely related species. Some of the unannotated sequences may be untranslated regions, non-coding RNAs, small RNAs, or short sequences which do not contain known protein domains [44]. In addition, 4763 and 619 DEGs were identified in Resting versus Male and Male versus Hermaphrodite group, respectively. Interestingly, the number of DEGs in Resting versus Male group was much greater than that in Male versus Hermaphrodite group. This may be due to the fact that many researchers are mainly concerned with the reproductive phenotype of male bivalves. Consequently, male bivalves have much more genetic information than females.
Although several bivalve species (e.g. oysters, scallops, mussels, etc.) have been the target of transcriptome studies, so far these have not been applied to giant clams. This study identifies the first transcriptome relating to gonadal expression profile in giant clams, which will provide abundant information for the future study of giant clams. Among the biological processes, it is noteworthy that "reproduction" (GO: 0000003) and "reproductive process" (GO: 0022414), which relate to reproduction and gonadal development, were also enriched. This was similar to results found in other mollusks, including Patinopecten yessoensis [45], Reishia clavigera [46], C. farreri [47]. For example, in P. yessoensis, the most-enriched terms were "cellular process" in biology process, "cell" and "cell part" in cellular component and "cellular component" in molecular function. This demonstrates the similarity of mRNA composition in mollusks.
The Kyoto Encyclopedia of Genes and Genomes (KEGG) contains functional information on metabolic pathways or regulatory networks of genes and interacting molecules in cells, which is very useful in the study of the complex biological behaviors of genes [48]. In model organisms, it has been reported that the oocyte meiosis pathway plays a vital role in ovary development [49,50]. The Ras proteins are GTPases that function as molecular switches for signaling pathways regulating cell proliferation, survival, growth, migration, differentiation or cytoskeletal dynamism [51]. Hydroxylation of phenylalanine into tyrosine is essential for oogenesis and oocyte maturation in the Anopheles gambiae [52]. These pathways were assigned to the KEGG categories of "signal transduction" and "endocrine system", indicating the significance of signal transduction systems and endocrine regulation in ovarian differentiation and maintenance in giant clams. The GO and KEGG annotations were helpful for identifying potential genes with specific functions from a large-scale transcriptome database, and provided a substantial resource for studying significant processes, functions and pathways during gonadal development in giant clam. The candidate genes related to male gonad development of giant clam The DEGs that were identified in the Resting versus Male comparison included several well established genes involved in gonadal development and gametogenesis, such as the testis-specific serine/threonine kinases 1 (TSSK1), TSSK4, TSSK5, Doublesex-and mab-3-related transcription factor 1 (DMRT1), SOX, Sperm surface protein 17 (SP17), Sperm-specific protein, and others. Below, several well documented and important candidate genes are listed in more detail. Doublesex-and mab-3-related transcription factor (DMRT) is a sex determining gene that is essential for the maintenance of male-specified germ cells and testis differentiation [53,54]. At present, eight members of the family (DMRT1-8) have been reported in vertebrates [55]. In mollusks, orthologs of DMRT have been characterized from the oysters C. gigas [56], P. martensii [14] and Pinctada fucata [57]. In our study, DMRT1 was indentified in giant clams, with higher expression in the male gonad than in resting and hermaphroditic gonads. The results of qRT-PCR were also consistent with the RNA-Seq results. These results reveal that DMRT may be an important factor for sex determination and differentiation of T. squamosa.
The Sox transcription factors exert essential functions that regulate male sex determination and testicular differentiation [58]. The Sox family has more than 20 homologues across different species. The transcription factor Sox9 is the direct target of Sry and plays a pivotal role in the male gonadal development of many vertebrates [59]. Recently, Sox9 as a critical factor in sex determination and differentiation was confirmed in a marine mollusk [12]. In Haliotis asinina, SOXB and SOXC were detected in the cerebral and pleuropedal ganglia and SOXB was found to participate in neural structure formation in the limpet Patella vulgata [60,61]. In the cephalopod Sepia officinalis, three members of SOX from group B and E were identified, which showed different expression patterns in early embryogenesis and vasculogenesis [62]. A gene closely related to Sox30 in vertebrates has been reported in C. gigas and was exclusively expressed in the testis [37]. In C. farreri, the expression levels of SoxB2 were similar in male and female gonads at different developmental stages of reproduction [63]. In the present study, we identified a SOX8 gene and the results showed that its expression levels were significantly different between males and restings. Although the detailed function of SOX8 is unknown, we speculate that Sox8 is likely to replace the function of Sry in testis determination in the giant clam.
The testis-specific serine/threonine kinase (TSSK) family may have an important role in sperm differentiation in the testis and/or fertilization. Six members of this family have been reported: TSSK1-TSSK6 [64]. These protein kinases belong to the 5′ adenosine Fig. 6 RT-PCR analysis of differential gene expression in different gonad development stages of T. squamosa. The relative mRNA expression of 10 genes was calculated using the2 -ΔΔCt method, with EF1α as a reference gene. The data are presented as the mean relative expression ± standard deviation (SD) of three replicate real-time reactions from pooled tissue samples (N = 5). Significant differences are indicated: **p < 0.01 monophosphate-activated protein kinase (AMPK) family. However, it is not yet known whether TSSK5 has all the domains needed to be an active kinase [65]. In the mouse, the male-biased genes TSSK1 and TSSK5 are essential for stages of spermatogenesis, and knockout of TSSK1 resulted in deficiencies in spermatogonial or spermatocytic apoptosis [66]. TSSK4 knockout male mice exhibited a subfertile phenotype due to seriously decreased sperm motility [67]. TSSK6 deletion resulted in an infertile male phenotype caused by certain morphological defects in the sperm [68]. In our trancriptome, TSSK1, TSSK4, and TSSK5 were identified and exhibited high levels of expression in male gonads. These results are similar to those found in P. yessonsis, suggesting an important role of TSSK in giant clam spermatogenesis [69]. In addition, the present study also found significantly higher expression of SP17 and Sperm Specific Protein in male than in resting and hermaphroditic clams, and the expression difference might be associated with spermatogenesis. In bovine sperm, SP17 had been found to be essential for spermatogenesis and played significant roles in fertilization events, such as membrane remodeling, transport, protection and function [70]. However, the role of SP17 in T. squamosa spermatogenesis remains further investigation.

The candidate genes facilitating to hermaphroditic gonad development of giant clam
The ovary is the female reproductive organ and is responsible for estrogen synthesis and oogenesis. Oogenesis consists of sequential and continuous changes in the cytoplasmic and nuclear content of maturing oocytes, which are regulated by a series of genes [71]. In this study, sevaral unigenes related to oogenesis were selected from the giant clam transcriptome data and verified by qPCR, including zona pellucida protein (ZP), Forkhead Box L2 (Foxl2), Vitellogenin, Vitellogenin receptor (VgR), 5-hydroxytryptamine (5-HT), and 5hydroxytryptamine receptor (5-HTR).
The zona pellucida protein (ZP), also known as the egg coat protein, is involved in oocyte and gamete development and mediates sperm-oocyte binding, as well as sperm penetration [72]. In fish, the expression of ZP gene mRNA is significantly increased during oogenesis, especially at the previtellogenic stage, when the expression level is higher than at undeveloped stages [73]. In Acipenser sinensis, two ZP proteins (AsZPAX and AsZPB) were identified in the developing oocytes, which will shed light on the regulatory mechanism of egg envelope formation and fertilization process in this fish [74]. Previous studies have shown that the ZP2 gene played a key role in the early formation of the oocyte envelope and ZP3 was a major class of female-specific reproduction-related molecules [75]. Recent evidence indicates that at least nine ZP domain proteins are prominent components of the egg coat of marine gastropod abalone. In this study, a large number of unigenes in hermaphroditic giant clam gonads were annotated with the various domains of the VEZP protein and showed higher expression levels in the ovary than in the testis, suggesting that these oocyte-specific genes may also play important roles in oocyte maintenance and reproduction.
FOXL2, originally identified in Drosophila, is an essential gene controlling the differentiation and development of the ovary in vertebrates [76,77]. In vertebrates, FOXL2 localizes to the granulosa cells and the early ovarian stroma, and knockout of this gene causes disordered ovarian follicular formation and partial ovary-totestis sex reversal [78]. In Crassostrea gigas, Cg-FOXL2 is significantly more expressed in mature females than in mature males, indicating its role in vitellogenesis or female sex determination [79]. In C. farreri, Cf-FOXL2 is specifically expressed in the ovary during gonadogenesis and adult gonadal development cycle, which suggests that it can be used as a female marker and a key gene in the study of ovary identification [80]. In this study, Foxl2 was predominantly expressed in the ovary, suggesting a role in ovarian differentiation and development in giant clams. A FOXN2 and FOXI1 gene, belonging to the FKH family, showed significant differences in expression between males and females, suggesting that they also may be involved in ovarian determination, although the precise mechanisms remain unclear.
Other genes involved in the formation and maintenance of oocytes were also found in the gonad transcriptome of the giant clams in this study. As observed in many model organisms, oocyte maturation depends on massive production of the egg yolk precursor protein, vitellogenin (VG). Vitellogenin receptor (VGR) is involved in Vg uptake by oocytes and plays a critical role in egg development [81]. In the current study, we found high occurrence of Vg and VgR in the ovaries, and RT-PCR demonstrated that the expression level in hermaphroditic gonads was significantly higher than in male gonads, providing some clues to further elucidate its function in ovary development in giant clams. 5-hydroxytryptamine (5-HT) has been found to play a variety of biological roles in gonad maturation and sequential spawning [82,83]. It acts through 5-HT receptors, a superfamily of G-proteincoupled receptors, to modulate oocyte development and maturation in vertebrates and invertebrates [84,85]. In the present study, both the 5-HT and 5-HT receptor were significantly high expression in hermaphroditic gonads, which indicated its poetical role in ovary development.
In addition to these female-related genes, our study observed significantly higher expression of TSSK4, spermatogenesis-associated protein 17 (SPAG17), SPAG8, sperm motility kinase X (SMKX) and SP17 in hermaphroditic gonads than in resting gonads, and the expression difference might be associated with the maintenance of spermatogenesis. The spermatogenesisassociated gene (SPATA), a testis-specific gene, played an important role in maintaining the physiological function of germ cells involved in the regulation of apoptosis during spermatogenesis [86]. SPATA4 has previously been confirmed to be specifically expressed in the testes of animals ranging from mammals to birds [87,88]. SPATA6 has been shown to be a critical protein in either the assembly or structural integrity of the sperm tail axoneme [89,90]. TSSK4, SPAG17, SPAG8, SMKX and SP17 have been shown to be essential for spermatogenesis [39]. All these results indicate that both spermatogenesis-and oogenesis-related genes were simultaneously highly expressed in the hermaphroditic gonad to maintain the hermaphroditism of these giant clams.

Conclusion
Taken together, this work represents the first attempt to use RNA-seq technology to identify gonad development and gametogenesis genes in resting, male and hermaphroditic giant clam gonads. Based on the comparative transcriptome analysis, we obtained 95,408 unigenes, which were assigned to biological processes and functions after annotation in GO, KOG and KEGG databases. A significant number of gonad-related functional genes and sex-related biological pathways were found and many of them were involved in gonad development. These results provide novel insights into the genetic mechanism of giant clam sex determination and also facilitate future research into aquaculture breeding of giant clams.

Ethics statement
All procedures during this study were conducted in accordance with the guidelines of the South China Sea Institute of Chinese Academy of Sciences Animal Care and Use Committee.

Experimental giant clams and samples collection
Twenty five Giant clams were collected from Xisha island China on December 10th 2016 for scientific purposes. Then, they were acclimated with aerated sand-filtered seawater in the hatchery in Sanya at 28 ± 2°C before the experiment. The resting gonad samples were collected on 12 December, 2017. The male and hermaphrodite gonad samples were collected on 78 days and 123 days, respectively, after the first sampling. Gonad tissues were sampled for RNA extraction and fixed for histology analysis. The gonad tissues were taken using a hypodermic needle through biopsy, immediately immersed in the RNAlater (Ambion, Austin, TX, USA), and stored at − 80°C until RNA preparation or fixation in Bouin's solution for histological analysis. Fixed gonad tissues were dehydrated with serial solutions of ethanol, and embedded in paraffin. The standard 6-μm sections were stained following the standard haematoxylin-eosin staining protocol for histological analysis. Resting-stage gonads were identified by the absence of sperm and oocytes. Male-stage gonads were identified by the presence of well-developed sperm. Hermaphrodite-stage gonads were identified by the presence of both sperm and oocytes.

RNA isolation, cDNA library construction and sequencing
Total RNA was extracted from gonad suspensions in RNA-free microcentrifuge tubes containing 1 mL TRIzol Reagent (Invitrogen, Carlsbad, CA) per 100 mg of tissue, according to the manufacturer's instructions. RNA concentrations were measured using a Nanodrop 2000c Spectrophotometer (Thermo Scientific, USA) and the quality of the RNA was analyzed with Bioanalyzer 2100.
RNA-Seq library preparation and sequencing were carried out for three resting, three male and three Hermaphrodite gonads using Illumina TrueSeq RNA Sample Preparation Kit according to the manufacturer's instructions. The mRNA was isolated from total RNA using Sera-mag magnetic oligo-(dT) beads (Illumina), and fragmented into short fragments with fragmentation buffer. The fragmented mRNA was then subjected to cDNA synthesis using random hexamer primer. After end repair and addition of 3'dA overhangs, the doublestranded cDNA was ligated into the Illumina TruSeq adaptors and enriched by polymerase chain reaction. The libraries were subjected to 150 bp paired-end sequencing on the Illumina HiSeq 2500. All the obtained data have been submitted to the NCBI Sequence Read Archive under under the BioProject PRJNA598738.

Transcriptome assembly and annotation
Before assembly, the raw Illumina reads were converted to FASTAQ format, and trimmed to remove the adaptor sequences, duplicated sequences, ambiguous reads and lowquality reads using the tools from FASTX-Toolkit. Due to the lack of genome information on T. squamosa, the clean reads from nine libraries were mixed together as a reference database, and de novo assembled using Trinity v.2.4.0 with default parameter settings [41]. CD-HIT (version 4.6.5) with the setting of "-c 0.95" was used to remove transcripts whose sequence similarity exceeded 95% [91]. The resulting sequences were called unigenes. Analysis of assembly completeness was performed using BUSCO v3 obtain the percentage of singlecopy orthologs represented in the metazoa_odb9 dataset [92].
Open reading frames (ORFs) of transcript and unigene sequences were predicted by the Trans-Decoder package (http://transdecoder.sourceforge.net/), with the minimum ORF length of 100 bp. For homology annotation, unigenes were subjected to analysis using public databases, including NCBI (http://www.ncbi.nlm.nih.gov/) non-redundant protein (NR), Swiss-Prot, PFAM, KOG, eggNOG, and COG databases with an E-value cutoff of 10 − 6 . Gene names and descriptions were assigned to each contig based on the BLASTX results. Afterwards, Blas-t2GO software was employed to obtain Gene Ontology (GO) annotations of transcripts with an E-value cutoff of 10 − 6 and WEGO software was used to perform GO functional classification of all unigenes in order to understand the distribution of gene functions. Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were used to assemble transcripts with the KEGG Automatic Annotation Server (http://www.genome.jp/kegg/kaas/).

Differential expressed genes and analysis of potential reproduction-related genes
The cleaned reads of each RNA-seq library were mapped to the assembled transcripts with Bowtie program [93]. Unique mapped reads, including paired-end reads for which only one part matched, were used to calculate the level of gene expression using the transcripts per kilobase million (TPM), which is able to eliminate the influence of different gene lengths and sequencing levels using RSEM [94]. The edgeR method was used to identify differentially expressed genes (DEGs) between two groups [95]. The threshold for the p-value was determined by the false-discovery rate (FDR). Unigenes with FDR ≤ 0.01 and fold change of the two samples ≥2 (genes for which FPKM < 1 were filtered) were considered to be differentially expressed genes in this study. Three biological replicates were designed to acquire more reliable DEGs. To further investigate these DEGs, GO and KEGG enrichment analyses were performed using the algorithm implemented in GO stats [96]. A p value ≤0.05 designated a significantly enriched gene set and pathway among transcriptomic profiles.

Quantitative real-time PCR validation
Quantitative real-time PCR (qRT-PCR) was used to verify the reliability of our sequencing and analysis obtained from the transcriptome data. Ten unigenes were randomly selected to design primers and EF1α was used as an internal control gene. The qRT-PCR analysis was performed on a Lightcycler480 (Roche, Basel, Switzerland) using a SYBR Green MasterMix (SYBR Premix EX Taq™, TaKaRa), according to the manufacturer's instructions. Primers used for qRT-PCR are listed in Table S1. Reactions were performed in triplicate, and the relative expression levels were calculated using the 2 -ΔΔCt method [97]. The Student's t-test was conducted and significant differences were determined at a p-value of < 0.05 or < 0.01.