Differences in TCDD-elicited gene expression profiles in human HepG2, mouse Hepa1c1c7 and rat H4IIE hepatoma cells

Background 2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD) is an environmental contaminant that elicits a broad spectrum of toxic effects in a species-specific manner. Current risk assessment practices routinely extrapolate results from in vivo and in vitro rodent models to assess human risk. In order to further investigate the species-specific responses elicited by TCDD, temporal gene expression responses in human HepG2, mouse Hepa1c1c7 and rat H4IIE cells were compared. Results Microarray analysis identified a core set of conserved gene expression responses across species consistent with the role of AhR in mediating adaptive metabolic responses. However, significant species-specific as well as species-divergent responses were identified. Computational analysis of the regulatory regions of species-specific and -divergent responses suggests that dioxin response elements (DREs) are involved. These results are consistent with in vivo rat vs. mouse species-specific differential gene expression, and more comprehensive comparative DRE searches. Conclusions Comparative analysis of human HepG2, mouse Hepa1c1c7 and rat H4IIE TCDD-elicited gene expression responses is consistent with in vivo rat-mouse comparative gene expression studies, and more comprehensive comparative DRE searches, suggesting that AhR-mediated gene expression is species-specific.

Background 2,3,7,8-Tetrachlorodibenzo-p-dioxin (TCDD) is a ubiquitous environmental contaminant that elicits a broad spectrum of biochemical and physiological effects in a species-specific manner [1]. These effects include lethality, cancer, developmental abnormalities, immunotoxicity, skin lesions, hepatotoxicity, and xenobiotic enzyme metabolism induction. They result from altered gene expression mediated by the aryl hydrocarbon receptor (AhR), a ligand activated transcription factor [1,2]. Briefly, TCDD binds to the cytoplasmic AhR causing nuclear translocation and heterodimerization with the AhR nuclear translocator (ARNT). The heterodimer then binds to specific DNA elements, termed dioxin response elements (DREs), within the regulatory regions of targeted genes to modulate expression, resulting in downstream physiological responses [3]. Although the structure and function of the AhR are highly conserved [4], the sensitivity to and the responses elicited by TCDD vary widely across species, suggesting TCDD and related compounds may activate species-specific AhRmediated gene expression networks.
Risk assessment assumes that there is a conserved mode of action and comparable toxic responses between species. However, there are inherent differences between species that compromise the extrapolation of rodent data to estimate potential human risks. Moreover, there is a discord between preclinical animal testing compared to human clinical trials regarding toxicity [5]. Similarly, species differ widely in response to TCDD exposure. For example, LD 50 values range from 1 μg/kg in the guinea pig [6] to >1000 μg/kg for the hamster, and several responses exhibit species-specific sensitivities and toxicities. These differential effects are not attributed to differences in binding affinity or AhR complex stability [7][8][9]. Collectively, these data suggest that although the AhR is well conserved, subsequent differential gene expression responses are species-specific.
To further investigate differences in TCDD elicited differential gene expression, global gene expression was assessed in human HepG2, mouse Hepa1c1c7 and rat H4IIE hepatoma cells following treatment with TCDD. Comparative analysis indicates there are significant differences in gene expression between species, suggesting AhR-mediated gene expression may not be conserved.

Temporally Conserved Gene Expression Responses Elicited by TCDD
Species-specific, cDNA microarrays were used to profile the temporal gene expression elicited by TCDD in human HepG2, mouse Hepa1c1c7 and rat H4IIE cells. The microarrays queried 6995, 8478 and 5169 unique human, mouse and rat genes, respectively (Table 1). Empirical Bayes analysis identified 691, 439 and 57 differentially expressed genes (P1(t) > 0.999 and |fold change| > 1.4) in HepG2, Hepa1c1c7 and H4IIE cells, respectively. Complete cDNA microarray data sets are provided in Additional files 1, 2, 3. HepG2 cells were the most responsive as indicated by both the overall number of differentially expressed genes as well as the number of responsive genes at each time point ( Figure  1). H4IIE cells exhibited significantly less differentially expressed genes, partially explained by the smaller microarray and the immaturity of rat genome annotation compared to the human and mouse. Differentially expressed genes were hierarchically clustered based on Euclidian distance and distinct clusters of temporal gene expression patterns were identified ( Figure 2). Pair-wise comparisons of differentially expressed genes were conducted using HomoloGene (build 35) defined orthologs ( Figure 3A). Human and mouse cDNA microarrays shared 4546 orthologous genes, however only 0.9% (41 orthologs; Additional file 4) were differentially regulated by TCDD in HepG2 and Hepa1c1c7 cells. Comparison of the rodent platforms identified 3850 orthologs with only 0.2% (8 orthologs; Additional file 5) responding in both Hepa1c1c7 and H4IIE cells. The lack of conserved ortholog differential expression in Hepa1c1c7 and H4IIE cells is consistent with the reported differences in differential expression observed in vivo between mice and rats [10][11][12]. Time dependent profiling of HepG2 and H4IIE gene expression identified only 5 conserved responses out of 2625 possible orthologs, representing only 0.2% (Additional file 6). Comparative analysis across all three species with 2252 shared orthologous probes identified only one ortholog that was differentially regulated in all three cell lines (immediate early response 3, IER3; HomoloGene ID 2894). Note that other members of the AhR gene battery, namely CYP1A1, ALDH3A1 and NQO1, were not present across all of the cDNA microarray platforms. However, their responses were also conserved across all three cell lines when measured using QRTPCR. The results for CYP1A1 are shown in Figure 4.

Identification of Putative Primary Gene Expression Responses
In order to further investigate AhR-mediated responses, TCDD-elicited differential gene expression was examined in the presence of cycloheximide (CHX), a protein translation inhibitor. Putative primary responses were defined in this study as those where CHX co-treatment either maintained or enhanced the response elicited by TCDD, while responses that were attenuated or blocked by CHX co-treatment were classified as secondary responses based on their assumed dependence on additional protein translation. cDNA microarray analysis confirmed the superinduction of CYP1A1 mRNA in CHX co-treated Hepa1c1c7 cells [13,14], consistent with the superinduction in human MCF10A cells treated with TCDD [15]. Additionally, Hepa1c1c7 ARNT-deficient c4 mutants treated with TCDD did not exhibit induction of the prototypical AhR battery genes including CYP1A1 [16]. Collectively, these results indicate that CYP1A1 is a primary gene expression response consistent with the direct interaction of the AhR with DREs within the promoter region. For each species, differentially expressed orthologs were classified as primary or secondary responses based on CHX co-treatment studies at both 4 and 12 hrs. Overall 61, 38 and 2 human, mouse and rat orthologs, respectively, were considered primary responses (Additional files 7,8,9). Furthermore, 45, 12 and 10 orthologous genes in the HepG2, Hepa1c1c7 and H4IIE cells were classified as secondary AhR responses (Additional files 10,11,12). Comparative examination of the CHX co-treatment data suggested that each cell line had its own unique set of primary responsive orthologs.

Whole-Genome Analysis of Conserved TCDD-Elicited Gene Expression Responses
The lack of whole genome coverage on the human, mouse and rat cDNA microarrays limited the number of orthologs that could be investigated. Therefore, whole genome expression analysis was performed at 24 hrs, one of the most active time points in terms of the number of differentially expressed genes (Figure 1), using 4 × 44 k Agilent oligonucleotide microarrays. Each microarray contained more than 41,000 individual probes, representing more than 18,000 unique genes ( Table 2). Despite the increased coverage, the number of TCDD elicited differentially expressed genes were surprisingly small relative to the cDNA microarray results. For example, the human Agilent microarray consisted of 19,406 known genes, representing a 2.8-fold increase in coverage compared to the human cDNA microarray. However, only 899 unique genes were differentially expressed, a modest increase from the 691 genes identified using cDNA microarrays. Similarly, only 519 and 121 genes were responsive in the Hepa1c1c7 and H4IIE cells, respectively. Complete Agilent microarray data sets are provided in Additional files 13, 14, 15. The use of whole genome microarrays also increased the number of orthologs that could be examined ( Figure  5A). As seen with the cDNA microarray dataset pairwise comparisons, HepG2 and Hepa1c1c7 cells shared the greatest number of TCDD responsive orthologs (Additional files 16,17,18). Ortholog coverage between all three species increased from 2252 on the cDNA microarrays to 12,388 across the Agilent platforms. Comparative analysis (P1(t) > 0.999 and |fold change| > 1.4) identified only 10 orthologs that were differentially expressed by TCDD at 24 hrs ( Figure 5B; Table 3).  Whole genome expression profiling identified the species-conserved induction of CYP1A1, TIPARP and UGT1A6 in all three cell lines. Despite this increased coverage, the number of differentially expressed orthologs across all three cell lines remained small, consistent with our cDNA microarray results.

Species-Specific & Species-Divergent Gene Expression Responses
Whole genome comparative analysis of HepG2, Hepa1c1c7 and H4IIE responses identified genes that were species-specific, i.e. differentially expressed in only a single species. For example, microarray analysis at 24 hrs found that fibromodulin (FMOD, HomoloGene ID 1530) was significantly up-regulated 17.2-fold in the HepG2 cells while no significant change in expression was detected in the Hepa1c1c7 and H4IIE cells. Other examples of mouse and rat specific responses include forkhead box Q1 (FOXQ1, HomoloGene ID 7359) and ectonucleoside triphosphate diphosphohydrolase 2 (ENTPD2, HomoloGene ID 20333). FOXQ1 was upregulated 5.9-fold in the Hepa1c1c7 cells while ENTPD2 was up-regulated 3.2-fold in the H4IIE cells. For both of these genes, the corresponding ortholog in the other two species did not exhibit significant differential expression. The species-specific responses of FMOD, FOXQ1 and ENTPD were verified using QRTPCR ( Figure 6). These responses are consistent with previous reports of species-specific TCDD elicited hepatic gene expression characterized in mice and rats [10,11].
Comparative analysis of the orthologous gene expression responses in HepG2, Hepa1c1c7 and H4IIE datasets identified 10 orthologs that were differentially expressed by TCDD across the three models. Further analysis indicated that not all of these responses were directionally conserved, i.e. pattern of expression was not the same in all species. For example, IER3 was induced 1.4-fold in Hepa1c1c7 cells at 4 hrs and 1.5-fold in H4IIE cells at 12 hrs, but repressed -1.9-fold in HepG2 cells at 24 hrs. Likewise, glutathione S-transferase alpha 5 (GSTA5, HomoloGene ID 74378) was induced 1.5-fold in Hepa1c1c7 cells and 4.9-fold in H4IIE cells, but downregulated 2.5-fold in HepG2 cells. QRTPCR confirmed the divergent expression of GSTA5 in the rat and human cell lines, but found Hepa1c1c7 cells were    relatively non-responsive ( Figure 7A). Two other orthologs, cyclin D1 (CCND, HomoloGene ID 1334) and inhibitor of DNA binding 3 (ID3, HomoloGene ID 1633), also exhibited divergent expression between species, where the orthologs were down-regulated in the rodent cell lines but induced in HepG2 cells (Table 3). Species-specific and species-divergent responses have also been reported in vivo for TCDD elicited hepatic differential gene expression in mice and rats [10,11].

Functional Enrichment and Network Analysis
Microarray analysis identified a small subset of TCDD responsive genes in the entire genome of each species. Differentially expressed genes elicited by TCDD in each cell line were clustered for functional enrichment based on GO terms (enrichment score ≥ 1.5). Enriched HepG2, Hepa1c1c7, and H4IIE responses clusters included GO terms related to xenobiotic exposure (GO:0009410) and metabolism (GO:0006805) (Tables 4, 5, 6), indicating that these processes are conserved in all three cell lines, and consistent with reported in vivo studies using rats and mice [10][11][12]. However, other enriched GO terms, such as those related to lipid metabolism and transport, were only enriched in HepG2 and H4IIE cell lines. Genes functionally related to lipid metabolism were further analyzed using Ingenuity Pathway Analysis to identify a network of TCDD-responsive orthologs, and species-conserved and -specific responses ( Figure 8). Although GO terms related to lipid metabolism were enriched HepG2 and H4IIE differential gene expression data sets, there were few orthologs that exhibited differential regulation in both cell lines based on the fold-change and statistical cutoffs used. In vivo rat studies also identified enrichment of genes related to lipid metabolism following TCDD treatment, suggesting that H4IIE cells may be predictive of TCDD-induced perturbations of this pathway rats.
Despite the nature of these continuous cell lines and the dysregulation of genes related to cell cycle control and regulation, GO terms associated with these functions were not enriched across all three cell lines. TCDD-treated Human HepG2 and mouse Hepa1c1c7 cell lines identified Functional clusters associated with cell cycle control (GO:0022402) were in enriched TCDD-treated HepG2 and Hepa1c1c7 gene expression data sets, but not in the H4IIE data set. Species-   Table 3. Additional files 16,17,18 lists the shared responses between pairs of species.
differences in functional clustering of TCDD-elicited differentially expressed gene data sets further corroborate species-specific responses to TCDD.

Dioxin Response Element Analysis
Putative functional DREs are not equally distributed between species with more DREs associated with known human genes [17,18].   searched for DRE cores. Human orthologs of CCND1 and ID3 contained 12 and 14 DRE cores, respectively, greater than the number found in either the mouse or rat genomes. In contrast, mouse and rat orthologs of GSTA5 each had 7 and 12 DRE cores, respectively, while the human only had 3 ( Figures 7B and 7C). In each of these divergently regulated orthologs, the species with the greatest number of DRE cores within the regulatory region had the highest fold induction suggesting that species-specific regulons have important roles in regulating gene expression.
Each DRE core in the GSTA5 orthologs was extended by 7 bp on either end and assessed for sequence similarity by measuring the Euclidean distance between sequence pairs. Only the mouse and rat GSTA5 contained highly similar DRE sequences (Euclidian distance of ≤ 3.0; shaded rows in Figure 7C), and no human DRE sequences with high sequence similarity. Furthermore, the position of the conserved DRE sequence 5 bp upstream of the TSS in the mouse appears to be positionally conserved with the DRE sequence in rat the ortholog located 22 bp upstream of the TSS. Collectively, the disproportionate number of DREs between species and lack of sequence and spatially conserved DREs may account for the divergent regulation of human, mouse and rat GSTA5 orthologs.

Discussion
This study comprehensively and systematically compares the gene expression responses elicited by TCDD across human, mouse and rat cells. Incorporating both custom cDNA microarrays to profile the temporal responses and more comprehensive commercial oligonucleotide microarrays, a limited number of conserved responses between species were identified. In addition, divergent and a large number of species-specific responses were identified that may contribute to species-specific differences in sensitivity and toxicity. These results are consistent with the poor response correlations of orthologous genes between C57BL/6 mice, Sprague Dawley and Long-Evans rats [10,11]. Collectively, reported in vivo rodent comparisons, and the in vitro data presented in this study suggest there are significant differences in TCDD elicited gene expression between species, despite the conservation of the AhR [4] and its signaling pathway.
In vivo and in vitro studies examining TCDD-elicited global gene expression have demonstrated that AhR targets a limited portion of the genome [10][11][12][19][20][21][22][23]. In addition, PWM-based computational searches identified a low percentage of orthologs with conserved putative functional DREs within their regulatory regions (10kb upstream of the TSS and the 5'UTR) [17,18]. Our comparative in vitro microarray results corroborate these findings. Temporal analysis using custom cDNA microarrays found that TCDD elicited a response in 9.9% of the represented genes in the HepG2 cells, 5.2% in the Hepa1c1c7 cells and only 1.1% in the H4IIE cells.   Similar results were obtained using whole-genome microarrays where 4.7%, 2.4% and 0.7% of the genes exhibiting differential expression in HepG2, Hepa1c1c7 and H4IIE cells, respectively, at 24 hrs. All three cell lines differentially expressed a core set of conserved gene responses that included the induction of CYP1A1, NQO1 and UGT1A6, members of the AhR gene battery [24]. However, a significant number of responses were specific to a single species (Figures 3B  and 3B), as reported in in vivo studies [10,11]. Moreover, these studies not only identified species-specific responses, but also orthologs with divergent responses (i.e. same gene up-regulated in one species and downregulated in other). Comparisons of C57BL/6 mouse and Sprague Dawley rat responses found that 29% of the commonly regulated orthologs exhibited divergent regulation [11]. Similarly, GSTA5, CCND1 and ID3, exhibited divergent regulation across HepG2, Hepa1c1c7 and H4IIE cells ( Table 3). Each of these genes exhibited the same pattern with Hepa1c1c7 and H4IIE cells having comparable profiles while HepG2 cells exhibited the divergent response. For example, GSTA5 was up-regulated in Hepa1c1c7 and H4IIE cells, but down-regulated in HepG2 cells (Table 3). In vivo gene expression responses of GSTA5 orthologs matched the in vitro responses in the Hepa1c1c7 and H4IIE cells; TCDD treated mice and rats exhibited 1.8-and 1.9-fold maximum induction, respectively [11,12]. QRTPCR verified the divergent response of GSTA5 in the H4IIE and HepG2 cells, but did not detect a significant induction in the Hepa1c1c7 cells ( Figure 7A). Computational analysis of GST5A found a disproportionate number of DRE cores within the regulatory region sequence of each species (Figure 7B and 7C). Sequence analysis also found DRE sequences with high similarity in the mouse and rat orthologs, but not in the human GSTA5. The identification of species-specific DREs is consistent with the divergent regulation of GSTA5 between these cell lines. Overall, differences in ortholog expression may contribute to differences in TCDD sensitivity and toxicity across species.
Although these continuous cells lines are similar in morphology and were derived from hepatomas, there are inherent differences that may bias the identification of species-conserved and -divergent responses. For example, HepG2 cells were originally derived from the liver biopsy of a 15-year old Caucasian male and therefore may not be representative of a mature adult liver [25]. Recent studies have compared basal gene expression of liver samples to primary human hepatocytes, HepG2 and HepaRG cells, another human hepatoma cell line. Although, HepaRG cells were most similar to primary hepatocytes and liver samples [26], toxicogenomic studies report that HepaRG and HepG2 gene expression responses retained common functional processes [27]. In addition, the HepaRG donor differed in age and sex compared to the HepG2 patient, and was also infected with hepatitis C, which may affect both basal and TCDD-elicited gene expression. HepG2 cells were more sensitive in terms of the magnitude of regulation, and also in the terms of the total number of differentially regulated genes, and may be a more sensitive model for assessing TCDD exposure [27].

Conclusion
Although a core set of conserved gene responses was identified, consistent with the role of AhR in mediating the adaptive metabolic responses, further evidence of differences in genome-wide gene expression profiles between species (i.e. species-specific regulons) is also presented. This is consistent with species-specific differences in TCDD sensitivity and toxicity [6,8,28,29], which are due to alterations in gene expression. Furthermore, there is a lack of conserved putative DREs within orthologous genes [18], and differences in genome-wide gene expression profiles has been reported between mice and rats in vivo [10,11]. Undoubtedly, the number of conserved responses and immaturity of the genome annotation, especially for the rat, limits the overall interspecies comparison [30]. Differences in AhR levels, co-activator availability, and protocols used in their isolation of these hepatoma cells may confound our comparisons. Nevertheless, the identification of numerous species-specific responses, evidence of divergent gene expression

Microarray Experimental Design and Analysis
Gene expression changes were analyzed using custom human, mouse and rat cDNA microarrays as previously described [16,31,32]. Responses to CHX and TCDD co-  treatment were also assayed with cDNA microarrays using a 2 × 2 factorial design (Additional file 20) [33]. Three replicates were performed with two independent labelings per sample (dye swap). In total, 42 cDNA microarrays were performed for each individual cell line. Additionally, 4 × 44 k Agilent Technologies whole-genome oligonucleotide microarrays (Santa Clara, CA), were used to profile the responses elicited by TCDD 24 hrs post-treatment according to the manufacturer's Two-Color Microarray-Based Gene Expression Analysis protocol Version 5.0.1, including dye swap labelings. Three replicates were performed for a total of 9 whole genome microarrays for each cell line. Microarray data quality was first assessed using a quality assurance protocol to ensure consistent high quality data throughout all studies prior to normalization and further analysis [34]. Microarray data were normalized using a semiparametric method [35], and statistically analyzed using an empirical Bayes methods [36]. The data were hierarchically clustered using the Euclidian distance in Cluster 3.0 [37] and visualized with Java Treeview [38]. Functional annotation clustering of Gene Ontology (GO) terms for differentially expressed genes was performed using DAVID (Database for Annotation, Visualization, and Integrated Discovery) [39]. Annotation clusters with an enrichment score ≥ 1.3 were considered significantly enriched. Networks of direct and indirect molecular interactions based on the whole-genome expression data were identified and visualized using the Ingenuity Pathway Analysis http://www.ingenuity.com.

Quantitative Real-Time PCR
The same total RNA samples isolated for microarray studies were used for QRTPCR as previously described [16]. The copy number of each unknown sample was standardized to the geometric mean of three housekeeping genes (β-actin, Gapd, Hprt or Rpl13). Official gene names and symbols, RefSeq and Entrez Gene IDs, forward and reverse primer sequences, and amplicon sizes are provided in Additional file 21. Data were analyzed by analysis of variance (ANOVA) followed by Tukey's post hoc test using SAS 9.1 (SAS Institute, Cary, NC). Differences between treatment groups were considered significant when p < 0.05.

Computational DREs Searches
The proximal promoter region (10 kb upstream and 1 kb downstream of a TSS) for a RefSeq corresponding to an individual gene were computationally searched for the substitute intolerant 5'-GCGTG-3' DRE core sequence and the position relative to the TSS were determined using the center of the 5 bp core (underlined). Each core was then extended by 7 bp upstream and downstream of the core to generate a 19 bp DRE core containing sequence, which were assigned a matrix similarity using a previously defined algorithm [18] and with an updated position weight matrix [17]. The 19 bp DRE core sequences from orthologous human, mouse and rat genes were hierarchically clustered in R http:// www.R-project.org by measuring the Euclidean distance between pairs of sequences. DRE sequences that clustered together with a distance value ≤ 3.0 were characterized as orthologous DREs.