Sensitive and robust gene expression changes in fish exposed to estrogen – a microarray approach

Background Vitellogenin is a well established biomarker for estrogenic exposure in fish. However, effects on gonadal differentiation at concentrations of estrogen not sufficient to give rise to a measurable vitellogenin response suggest that more sensitive biomarkers would be useful. Induction of zona pellucida genes may be more sensitive but their specificities are not as clear. The objective of this study was to find additional sensitive and robust candidate biomarkers of estrogenic exposure. Results Hepatic mRNA expression profiles were characterized in juvenile rainbow trout exposed to a measured concentration of 0.87 and 10 ng ethinylestradiol/L using a salmonid cDNA microarray. The higher concentration was used to guide the subsequent identification of generally more subtle responses at the low concentration not sufficient to induce vitellogenin. A meta-analysis was performed with data from the present study and three similar microarray studies using different fish species and platforms. Within the generated list of presumably robust responses, several well-known estrogen-regulated genes were identified. Two genes, confirmed by quantitative RT-PCR (qPCR), fulfilled both the criteria of high sensitivity and robustness; the induction of the genes encoding zona pellucida protein 3 and a nucleoside diphosphate kinase (nm23). Conclusion The cross-species, cross-platform meta-analysis correctly identified several robust responses. This adds confidence to our approach used for identifying candidate biomarkers. Specifically, we propose that analyses of an nm23 gene together with zona pellucida genes may increase the possibilities to detect an exposure to low levels of estrogenic compounds in fish.


Background
The contraceptive estrogen, ethinylestradiol (EE 2 ) is an important contributor to the feminization of fish downstream from sewage treatment works [1][2][3][4][5]. This discovery was greatly facilitated by the use of vitellogenin (VTG) as a biomarker. VTG is produced in the liver of sexually maturing female fish under the influence of endogenous estrogen. Normally, VTG is not expressed in males or juve-niles, unless they are exposed to estrogens via water or food. Both VTG mRNA and protein in male and juvenile fish have thus become established biomarkers for exposure to environmental estrogens [6]. However, estrogens can effect gonadal sex differentiation of fish at concentration not sufficient to give rise to a measurable VTG response [7]. It has also been shown that life cycle exposure of fathead minnow to an inordinately low concentration of EE 2 (0.32 ng/L) was sufficient to decrease the egg fertilisation and to skew the sex ratios towards female [8]. This suggests that more sensitive biomarkers would be useful. Zona pellucida (ZP) genes may be more sensitive than VTG [9] but their specificity for estrogens is not as clear [10][11][12]. Additional, sensitive biomarkers would thus increase our possibilities to identify exposure to low, but biologically important concentrations of estrogens.
Rapidly accumulating data on genomes and proteomes have increased the possibilities to use different types of discovery-driven methods in ecotoxicology [13,14]. The large number of potential responses that can be studied with microarrays renders the method suitable for identifying candidate biomarkers of exposure [15][16][17][18][19][20]. Such candidates may then be further evaluated to find if they are useful as biomarkers. In general, a good biomarker should be sensitive, specific and robust. A robust response implies for example that it should be measurable at complex exposure situations, at different exposure concentrations, at different temperatures, after different exposure times, by different analytical approaches, in different labs and preferably also in different species.
The main objective of the present study was to use microarrays to find novel, sensitive and robust biomarkers of estrogenic exposure in fish. We have used a salmonid cDNA microarray from cGRASP [21] to analyze hepatic expression profiles in juvenile rainbow trout (Oncorhynchus mykiss) exposed to EE 2 in vivo. The responses identified at a high concentration of EE 2 were used to guide the subsequent identification of generally more subtle responses at a low concentration of estrogen. We also identified estrogen-responses shared between fish species, experimental conditions and analytical platforms. This was achieved by a meta-analysis using our dataset together with results from three recently published articles describing hepatic gene expression profiles in fish exposed to estrogens [16,20,22].

Sensitive gene-expression changes
Both male and female juvenile fish exposed to 0.87 ng EE 2 /L were analyzed with microarray. The microarray analysis of female fish suggested that only three out of four females had an induced expression of the known estrogen-responsive gene ZP3. In contrast, an induction was present in all eight males. This observation suggested that some juvenile females may have sufficient endogenous estrogen to induce sensitive estrogen-responsive genes. Thus, in our search for genes responding to low concentrations of estrogens only the microarray results from male fish were used.
Thirty-six sets of cDNAs (presumably corresponding to 29 genes) were regulated in male fish both by the low and the high concentration of EE 2 ( Table 1). All of the cDNAs responded in a dose-dependent manner. ZP3 was the most differentially expressed gene in fish exposed to both high and low concentrations with a fold change of 84 and 3.5 respectively. VTG was not affected by the low concentration while it was up-regulated 537 times by 10 ng EE 2 / L as measured by quantitative RT-PCR (qPCR) (Figure 1).

Robust gene-expression changes
A meta-analysis was performed with the aim to identify robust estrogen-responsive genes. The microarray data from fish (both sexes) exposed to 10 ng/L from the present study and available microarray data from three other exposure studies with fish and estradiol (E 2 ) or EE 2 were used in the meta-analysis [16,20,22]. Information about the different studies is shown in Table 2. Transcripts (360) presumably corresponding to 55 genes or groups of paralog genes were identified as differentially expressed in at least two of the four different studies (see Additional file 1). VTG and ZP3 were differentially expressed in all four studies and nine genes had an altered expression in at least three studies ( Figure 2). It should be noted that ZP1 and the estrogen receptor-α, which are well-know estrogen-responsive genes in fish, have poor sequence representation on the cGRASP microarrays and are therefore not present in Figure 2.

Confirmation of microarray data with quantitative RT-PCR
Genes that were likely to be both sensitive (Table 1) and robust ( Figure 2) were chosen for subsequent qPCR analysis. Three genes fulfilled these criteria: ZP3, a nucleoside diphosphate kinase (nm23) and fatty acid binding protein 3 (fabp3 or H-FABP). In addition, VTG was subjected to the qPCR analysis as well as the reference gene ubiquitin. In accordance with the microarray results the expression of VTG, ZP3 and nm23 were significantly induced in fish exposed to the high concentration. Also, as suggested by the microarrays, ZP3 and nm23 were significantly induced by the low concentration as well, whereas VTG expression was not induced (Figure 1). In stark contrast to the microarray results fabp3 had no tendency to any regulation caused by the treatment but showed a large variation within each treatment group (data not shown). The fabp3 and nm23 qPCR products were sequenced in order to confirm the amplification of the right products and they were identical to fabp3 and nm23 [EMBL:U95296, AF350241] in rainbow trout (data not shown).

Discussion
Our meta-analysis correctly identified some of the most well known estrogen-responsive genes (VTG, ZP3, ZP2). This suggests that the approach has a good potential to identify other robust, less well known estrogen-regulated genes. We also showed that ZP3 and a hepatic nucleoside diphosphate kinase nm23 are more sensitive to estrogenic exposure than the widely used biomarker VTG. As far as we know, no other microarray study has identified the effects of as low concentrations of estrogen as used here.
The recognition of nm23 induction as a highly sensitive response is therefore a novel finding. Thus we propose that analyses of nm23 together with ZP genes may increase the possibilities to detect an exposure to low levels of estrogenic compounds in fish. However, more studies are required in order to fully assess the potential of nm23 as a biomarker.
Sensitive biomarkers can be used as early warning signals to indicate exposure and thus potential risk of adverse effects. It has been suggested that the induction of ZP mRNAs are more sensitive than induction of VTG [9,23]. However expression of ZP genes can, as most genes, be affected by other environmental factors, for example cortisol exposure [10][11][12]. The regulation of a single gene is rarely sufficient to conclusively demonstrate a specific exposure, but a combination of responses would together potentially increase the degree of evidence.
We identified nine genes (or groups of paralog genes) that were affected by estrogen in at least three out of the four studies included in the meta-analysis. The known estrogen-responsive genes VTG and ZP3 were up-regulated in all four studies. The robust gene expression changes of ZP3, nm23 and fabp3 were also tentatively identified to be sensitive. However, the induction of fabp3 was not confirmed by qPCR. The incorrect identification from microarray data might be explained by cross hybridization to related mRNAs, a known problem for cDNA microarrays. Nm23, on the other hand, was confirmed with qPCR to be significantly induced both by a low and a high water concentration of EE 2 . In addition, microarray results from the other studies of rainbow trout exposed to 50 ng EE 2 /L and dietary exposed to 5 µg/g of E 2 further supports an estrogen-induction of nm23 in rainbow trout during different exposure conditions [20,22]. The study of estrogen-exposed medaka did not report nm23 as an estrogen-responsive gene but it is unclear if nm23 was represented on the medaka microarray [16]. Whether nm23 is regulated by estrogen in other fish species is still an open question, although mammalian studies suggest a conserved induction mechanism [24].
Gene expression changes of VTG, ZP3 and nm23 measured by qPCR and microarray Figure 1 Gene expression changes of VTG, ZP3 and nm23 measured by qPCR and microarray. Hepatic gene expression in rainbow trout of vitellogenin (VTG), zona pellucida protein 3 (ZP3) and a nucleoside diphosphate kinase (nm23) after EE 2 exposure measured with qPCR (green bars, male fish) or microarray (blue bars, male fish: red bars, female fish). Values are expressed as fold change (log 2 ) compared to control fish. Paired student's ttests (single sided) were performed on the qPCR data to confirm/ test the putative regulation suggested from microarray data. VTG, ZP3 and nm23 were confirmed to be significantly up-regulated in fish exposed to 10 ng/L (p = 0.001, 0.001 and 0.007 respectively, four biological replicates in each group). ZP3 and nm23, but not VTG were up-regulated in fish exposed to 0.87 ng/L (p = 0.0004, 0.006 and 0.5 respectively, eight biological replicates in each group) in accordance with the microarray data. The salmon and zebrafish nm23 shows high similarity to the human nm23-H1 and H2 genes. A phyl-ogenetic analysis suggests that nm23-H1 and H2 have arisen by gene duplication after the speciation event that gave rise to modern teleost fish and tetrapods [25]. Therefore it is assumed that the salmonid genome would only have one gene homologue to the nm23 -H1 and -H2 genes. In mammals the nm23-H2 gene encodes the c-MYC transcription factor and the nm23-H1 gene has been shown to be metastasis associated [24]. Moreover, the nm23-H1 gene and protein is up-regulated by E 2 -treatment in human breast carcinoma cell lines. This induction seems to be mediated, at least in part, at the transcriptional level via the estrogen receptor α binding to an estrogen responsive element in the promoter region of the human nm23-H1 [24]. The physiological function of nm23 in fish remains to be determined as well as the possibilities of a regulation by other factors than estrogen (specificity) and the robustness of the response during more complex exposure scenarios.
To be useful as a biomarker, a response should ideally be as robust as possible. In the meta-analysis we tested gene responses for robustness between species, exposure conditions and analytical platforms. Combining microarray data from different species and platforms is a challenging task, particularly when sequence information and good annotations are limited. We have addressed the cross platform/cross species comparison by using the zebrafish transcriptome as a reference. In contrast to most other fish Robust estrogen-responsive genes Figure 2 Robust estrogen-responsive genes. Genes affected by estrogen in at least three out of four studies used in the meta-analysis. Red refers to an up-regulation, whereas green refers to a down-regulation. Only the zebrafish transcripts with the best TBLASTX hit to each of the probes from the different studies are presented in the figure. Genes that were selected for qPCR analysis are marked with bold text.  species zebrafish has the advantage of being both well sequenced and well annotated. However, using zebrafish as a reference also has limitations, e.g. a lack of identified homologes for some genes. The results in the meta-analysis were also influenced by the shortage of available microarray raw data and therefore we had to accept the different statistical approaches used for selecting estrogenresponsive genes in the different studies.
It is certainly possible that more than nine hepatic genes are robustly regulated by estrogen in the analyzed species/ conditions. We have only included comprehensive fish arrays in the meta-analysis. Nevertheless, many genes are still likely to be represented on only one or a few of the array platforms which limits the possibility of identifying robust responses. The choice of microarray platform also affects the possibilities to accurately identify differentially expressed genes. Amplicon arrays (cDNA arrays) show less concordance with other platforms, for example qPCR and commercially produced high density arrays with oligonucleotide probes or cDNA arrays with synthetic oligonucleotides [26]. Although, it has been shown that when two independent platforms give consistent results, the outcome of qPCR analysis will most often also be in agreement [27,28]. This adds confidence that several of the differentially expressed genes identified by the meta-analysis indeed are relatively robust responses.
By making the data on putative sensitive and/or robust gene responses public, it can be used as a base for further investigations on the effects of environmental estrogens in fish in order to develop biomarkers or to increase the understanding of the physiological impact of environmental estrogens.

Conclusion
We have used microarrays to identify a range of potentially sensitive and/or robust gene expression changes in fish exposed to estrogen. We have identified the induction of ZP3 and a hepatic nm23 mRNA as being both sensitive and presumably robust responses. After further evaluation, nm23 induction would therefore be a good candidate biomarker together with ZP genes to reveal exposure to low levels of estrogens not sufficient to induce VTG but still with potential to affect gonadal differentiation of fish.

Experimental animals, exposure and preparation of hepatic total RNA
Fish from a previously published study were analysed [29]. The experimental setup was in short: 15, 14 and 14 juvenile rainbow trout (weighing around 100 g) were divided into three aquaria and exposed for two weeks to measured concentrations of 0, 0.87 and 10 ng/L respectively of EE 2 in a flow-through system. Water samples were taken from the low and high EE 2 concentration aquaria, before the transfer of the fish, on day 8 and on day 13. One sample was collected from the control aquaria on day 8. Solid phase columns were used to extract and purify EE 2 from the water followed by derivatization (pentafluorobenzoylester) and further purification. EE 2 -concentrations were determined using GC/MS. The limit of detection (signal-to-noise set to 5) was 0.01 ng/L. Samuelsson et al showed that the fish exposed to 10 ng/L EE 2 had significantly increased plasma levels of VTG, increased hepatosomatic index and the plasma metabolite profile were affected by the treatment. However, in the fish exposed to 0.87 ng/L neither induction of plasma VTG protein nor an altered metabolite pattern could be demonstrated using a specific VTG-ELISA and NMR respectively [29]. Gene responses in liver are widely used as biomarkers for environmental pollutants, i.e. estrogens, and the hepatic responses to estrogens are not restricted to a short developmental period. A prerequisite for our meta-analysis was availability of additional arraydata from the same tissue in estrogen-exposed fish. Only hepatic microarray data was available in the literature, which therefore also contributed to our choice of tissue. Livers were collected and snap frozen in liquid nitrogen. Total hepatic RNA was isolated from individual trout liver using TRI reagent (Sigma chemicals Co, St Louis, MO, USA). RNA quality and quantity were assessed by agarose gel electrophoresis and spectrophotometric measurements (Nanodrop 1000, NanoDrop Technologies, USA and Spectra MAXplus, Molecular Devices, CA, USA).

Microarray chip, hybridisation and wash
Salmonid cDNA microarrays (GRASP16k v1) were purchased from cGRASP, Univerity of Victoria, BC, Canada [21]. Microarray fabrication and quality control have previously been described in von Schalburg et al. [30]. The array contains 13,421 Atlantic salmon (Salmon salar) cDNAs and 2,576 rainbow trout cDNAs that together with a few more expressed sequence tags (ESTs) from other salmonid fish results in 16 006 spotted cDNAs in total. It has previously been shown that the sequence similarity between the Atlantic salmon and rainbow trout is sufficiently high for cross species use of the array [31].
Several cDNAs on the array correspond to the same gene and to reduce redundancy, a sequence based clustering was made as follows. Each cDNA sequence was compared to all other sequences on the array using BLAST [32]. A stringent cut-off value of at least 98% sequence similarity over 250 base-pairs or more was used to define equality. Single-link clustering was then applied which resulted in 13853 sets of cDNAs.
Slide preparation have been described in detail in von Schalburg et al. [30]. Briefly, 8 µg of total RNA was reverse transcribed and labelled using SuperScript Indirect cDNA labelling System kit (Invitrogen, Carlsbad, CA, USA) and fluorescent dyes Cy5 and Cy3 (GE Healthcare, Buckinghamshire, UK). cDNA from one control fish and one exposed fish were labelled and hybridised to the same array. Every other pair was dye swapped to compensate for cyanine flour effects. Eight male control fish were paired with male fish exposed to 0.87 ng EE2/L matching individual weights and lengths as closely as possible. Four of the same male control fish were also paired to fish exposed to 10 ng EE2/L. In the same way, four female control fish were paired both to females exposed to the low dose and the high dose. Hybridisation and wash were performed as described before by von Schalburg et al. [30] with the exception of the prehybridization that was preformed for 1.5 h in 5xSSC, 0.1% SDS, 0.2% BSA at 49°C. In total 20 microarrays were analysed.

Microarray analyses
Fluorescent images of hybridized arrays were acquired using an Agilent MicroArray Scanner (Agilent Technologies, Palo Alto, CA, USA). Intensity data were extracted from TIFF images using Imagene version 6.0 (BioDiscovery, CA, USA). The statistical analysis was performed using the R package [33] LIMMA [34], which is available at the Bioconductor repository [35,36]. For each cDNA on the chip, M-values (log 2 fold change) and A-values (average log 2 intensity) were calculated. Loess normalization was applied to each array to remove intensity dependent trends [37]. For each set of cDNAs (defined above), an Mvalue was calculated by taking the average of the M-values of all the cDNAs in the set. Next, each set was annotated based on the cDNA with highest A-value (i.e. the spot with best hybridization). Finally, the sets of cDNAs were ranked by moderated t-statistic [34] to reduce the proportion of false positives. Data from the complete microarray experiment is available according to the MIAME guidelines at Array Express [38].

Meta-analysis
Microarray data from four different studies on estrogenexposed fish were subjected to a meta-analysis with the aim to identify robust estrogen responsive genes [16,20,22]. Another study with estrogen-exposed adult female zebrafish was excluded since the control fish presumable had high levels of endogenuous estrogen (neither VTG nor ZP3 was regulated in this study) [39]. To our knowledge, no other relevant microarray studies covering more than 3000 cDNAs/transcripts were publicly available (i.e. open access to transcript sequences) in October 2006 when we performed the meta-analyses. The studies included are summarized in Table 2.
The meta-analysis was done as follows. For each study, a list of the reported estrogen-regulated genes and the corre-sponding transcripts/cDNA-sequences was created. Note that the studies used different statistical methods to find the regulated genes (Table 2).
From the present study, the topmost 250 sets of cDNA from fish (both female and male) exposed to 10 ng EE2/L were chosen. To compare the platforms, zebrafish was used as a reference species. It was chosen since it is almost fully sequenced and well annotated compared to the other species involved. All transcripts/cDNAs were compared to the zebrafish transcriptome available through Ensembl release 40 (26,679 in total) using tblastx [32]. A cut-off E-value of 10 -25 was used to define a match. This resulted in a list of 360 zebrafish transcripts that had a match to transcripts/cDNAs from at least two studies (the transcripts should be regulated in the same direction). The list of zebrafish transcripts contained both multiple transcripts from the same gene (different splice variants) and paralogs and therefore the list was clustered into groups of transcripts. A similarity indicator matrix was created by comparing all transcripts in the list to each other using tblastx. Pairs of transcripts with an E-value of 10 -50 or less were defined to be equal. Otherwise the distance was set to zero. Single link clustering was then applied to create the groups of transcripts. Finally, all transcripts were annotated using Ensembl. The complete list of transcripts is available in Additional file 1.

Quantitative RT-PCR
To confirm regulation of four selected genes, the abundances of the mRNAs were analysed with qPCR. The qPCR was performed on isolated total RNA from the same fish used in the microarray analysis. Total RNA (0.5 µg) was reverse transcribed in duplicate with a mixture of random hexamers and oligo(dT) primers, using the iScript™ cDNA Synthesis Kit (Bio-Rad, Hercules, CA, USA). The cDNA synthesis was performed according to the manufacturer's instructions, except for a scale-down of the reaction volume to 10 µl. Pooled RNA samples were used as no reverse transcriptase controls to control for genomic contamination. It was discovered that three samples might have been contaminated with DNA. . Primer sequences were as follows: 5'-ccctgcgtatctttgtgga-3' and 5'-gtgggaacctgtcattttgg-3' for ZP3; 5'-ccttcttccctggtctcgt-3' and 5'-gatgatgttcctgcccactt-3' for nm23 ; 5'ctttccctgtttcccctcct-3' and 5'-tgctgtgtgcttcttgctactc-3' for VTG; 5'-ggggcagtatggcttgtatg-3' and 5'-ctggcaccctaatcacctct-3' for beta actin; 5'-cgatagacggtggtaagatgg-3' and 5'aggtgtggcaaagggtagtg-3' for fabp3 ; 5'-atgtcaaggccaagatccag -3' and 5'-ataatgcctccacgaagacg -3' for ubiquitin. For all genes except the reference gene, ubiquitin (for which the qPCR was performed according to a previously published protocol [41]), the qPCR reactions contained 1× Real Time PCR Buffer, 3 mM MgCl 2 , 400 µM dNTP, 300 nM of each primer, 1 U TaKaRa Ex Taq™ R-PCR Version 2.1 (TaKaRa Bio Inc., Shiga, Japan), 0.25× SYBR Green I (Molecular Probes Eugene, OR, USA) and cDNA corresponding to 20 ng total RNA, in a final reaction volume of 20 µl. Real-time qPCR was performed on a Stratagene Mx3005p with 30 sec initial denaturation at 95°C, followed by 45 cycles of 95°C for 20s, 60°C for 20s and 72°C for 20s. A melting curve analysis was performed after each run to verify specific amplification. In addition the qPCR products were subjected to an agarose gel electrophoresis to confirm the expected size of the product. Both beta actin and ubiquitin were chosen as potential reference genes. Beta actin had a high variance and also a tendency to be regulated in the high dose group and therefore only ubiquitin was used. All signals were normalized against ubiquitin and ratios were calculated for exposed fish compared to control fish paired in the same manner as in the microarray analysis. Paired single-sided student's t-test were performed to test for significantly regulated genes. Since all samples could not be run at the same occasion, two standard samples were run at all occasions in order to enable a compensation for a possible run to run variation. Applying a run to run factor made little difference and the differently expressed genes VTG, ZP3 and nm23 were significant up-regulated both with and without applying the factor. The presented qPCR results are calculated without this factor.