In this study we identified the total CYP complement in zebrafish, and assessed patterns of expression of these genes during normal embryonic development. Zebrafish are an increasingly important model in developmental toxicology, pharmacology and studies of chemical effects on disease. Knowing the identity and regulation of CYP genes is essential for strong inference regarding chemical effects in this model, and for assessing pathways of metabolism of xenobiotics and endobiotics and their relationship to CYP roles in these processes in humans. The 94 zebrafish CYPs occur in the same 18 gene families that are found in humans and other mammals, but with differences in numbers of genes, and often with uncertain function.
CYP Families 5-51
Many of the CYP gene families in this group have single genes in zebrafish, as they do in humans, and show a high degree of conservation of sequence with their human counterparts. These genes also exhibit syntenic relationships similar to those found in humans. Together, the sequence data and gene location data indicate that in many of these families the genes are direct orthologs of their human (mammalian) counterparts. Where there is 1:1 correspondence in these gene families, i.e., with CYP5A1, 7A1, 7B1, 8A1, 20A1, 21A1, 24A1, 26A1, 26B1, 39A1, 46A1 and 51A1, it most likely indicates conservation of enzyme activities and physiological function. In other gene families in the "endogenous set", defining relationships between human CYPs with important endogenous functions and the zebrafish homologs is complicated by the presence of multiple closely related paralogs in zebrafish that are not found in humans. Thus, zebrafish have two CYP17As, two CYP19As, and four CYP46As, while humans have only half that number in each case. Such doubling of the numbers of genes in several CYP gene families could be the result of individual gene duplication, or could be a remnant of the third round of whole genome duplication (WGD-3 [24, 92]), with the retention of duplicated genes in zebrafish.
Zebrafish CYP paralogs that are co-orthologs of the human CYPs could have distinct functions, as a result of function partitioning (subfunctionalization) or differential regulation (temporal and/or organ differences, or differences in induction). This has been observed for CYP19, with CYP19A1 (ovarian) and CYP19A2 (brain) aromatases displaying distinct expression patterns and inducibility. The neural form (CYP19A2) is sensitive to induction by estrogen, while the ovarian form appears to be mostly recalcitrant to induction by estrogen receptor agonists [36, 93–95]. While functions have not been confirmed for zebrafish, the two CYP17A genes in other fish also appear to represent enzyme subfunctionalization following gene duplication. Tilapia and medaka CYP17A1 possess both steroid 17α-hydroxylase and 17,20-lyase activities, as does mammalian CYP17A1, but tilapia and medaka CYP17A2 possess only 17α-hydroxylase activity: they convert pregnenolone or progesterone to 17α-hydroxy products, but do not perform the subsequent conversion to androstenedione or DHEA [32, 33]. There also are significant differences in spatial expression patterns for the duplicated CYP17A genes during development, and during the spawning cycle, in other fish. We did not see any substantial temporal separation of CYP17A1 and CYP17A2 expression during development, although CYP17A1 is more strongly expressed (Figure 4 and Additional File 2, Table S2).
In some families of "endogenous" genes, zebrafish have more than twice the number of genes that occur in humans, and some have novel subfamilies as well. This is evident in the CYP27s and the CYP46s, where the numbers of genes are greater than would be expected to have resulted from WGD-3. It is likely that the ancestral condition was one of fewer genes, with expansion in zebrafish rather than loss in humans. That expansion in zebrafish could involve WGD, but is most evident as tandem duplication and translocation. This is clearly suggested in CYP27 by the four genes that share synteny with the single human CYP27A1, and by the presence of CYP27B1 and 27C1, which do not share synteny with human CYPs. The biological significance of some duplicated genes also could involve distinctions in temporal or organ- and cell-specific regulation, but determining this can be complicated by the strong possibility of substrate overlap.
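The copy-number reasoning above can be made explicit with a minimal sketch. The gene counts are those stated in the text (two CYP17As and two CYP19As versus one each in humans, four genes sharing synteny with the single human CYP27A1, and a single CYP51A1 in both species). The class name, function name and category labels are illustrative assumptions, not part of the analysis reported here, and the thresholds are a simplification: a 2:1 ratio is compatible with WGD-3 but, as noted above, could equally reflect an individual gene duplication.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyNumber:
    """Gene counts for one CYP group in zebrafish and in humans."""
    group: str
    zebrafish: int
    human: int


def classify(c: CopyNumber) -> str:
    """Label a group by its zebrafish:human copy-number ratio.

    The labels are illustrative only; they do not establish orthology,
    which also requires sequence and shared-synteny evidence.
    """
    if c.human == 0:
        return "no human counterpart"
    ratio = c.zebrafish / c.human
    if ratio < 1:
        return "fewer genes in zebrafish"
    if ratio == 1:
        return "1:1 correspondence"
    if ratio <= 2:
        return "doubling, compatible with WGD-3 or a single duplication"
    return "more than doubling, implies additional duplication"


# Counts taken from the text of this section.
groups = [
    CopyNumber("CYP51A", 1, 1),
    CopyNumber("CYP17A", 2, 1),
    CopyNumber("CYP19A", 2, 1),
    CopyNumber("CYP27A (syntenic with human CYP27A1)", 4, 1),
]

for g in groups:
    print(f"{g.group}: {g.zebrafish}:{g.human} -> {classify(g)}")
```

A ratio above two is the case that cannot be explained by a single round of whole genome duplication alone, which is why tandem duplication is invoked for the CYP27A cluster.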
CYP Families 1-4
Zebrafish CYP genes in families 1, 2, and 3 are more diverse than in humans, and sequence identities often are too low to discern orthology between zebrafish and mammalian genes in these families. However, analysis of the additional character of shared synteny clarifies evolutionary relationships between human and zebrafish genes in these families. CYP family 4 differs from families 1-3, as there are fewer CYP4 genes in zebrafish than there are in mammals. However, like the CYP1s, CYP2s and CYP3s, the CYP4s also are involved with (induced by or metabolize) xenobiotics, while this is seldom the case with CYP5-CYP51 genes.
Consistent with phylogeny, the fish CYP1 and CYP3 clades appear as sister groups to the mammalian clades for these genes. However, as discussed earlier, both mammals and fish share the CYP1 subfamilies CYP1A and CYP1B. Zebrafish also express CYP1Cs, which do not occur in humans, and CYP1D1, a pseudogene in humans. All mammalian CYP3s are in a single subfamily, CYP3A, which occurs in zebrafish as well. However, zebrafish CYP3A65 shares synteny with single exon pseudogene CYP3As in humans (CYP3A-se1 and -se2), while the novel CYP3C subfamily shares synteny with the functional human CYP3A3, 3A4 and 3A7. Fugu and possibly other fishes also have a second CYP3 subfamily, CYP3B, about which nothing is known [96, 97].
The functional similarities in different taxa suggest similar biological roles for the homologous CYP1s and CYP3s. Thus, CYP3As are the primary catalysts of testosterone 6β-hydroxylation in fish and mammals, and mammalian and fish CYP1As and likely CYP1Bs are prominent in the metabolism of some PAH pro-carcinogens. The roles of orthologous CYPs in metabolism of particular compounds can differ between taxa, however. Thus, the regio-specific oxidation of BaP and the rates of metabolism of planar HAH appear to differ in degree between fish and mammalian CYP1As [99, 100], apparently reflecting species differences in CYP1A structures. The functions of the novel CYP1s and CYP3s are less well defined, although zebrafish CYP1Cs and CYP1D1 have been expressed and functions have been determined with BaP, estradiol, and a number of other exogenous and endogenous substrates (Urban, Stegeman, et al. unpublished results). Little is known of the function of the CYP3Cs, but CYP3C1 appears not to be responsive to chemicals that induce CYP3A65.
Identifying zebrafish-human orthologs is most difficult in the CYP2 family, where the differences in CYP2 divergence between zebrafish and humans obscure many homologous relationships. Thus, as noted, of the 11 zebrafish and 11 human CYP2 subfamilies, only two (CYP2R and CYP2U) warrant the same designation in zebrafish as in humans based on sequence identity. While the disparity between zebrafish, or other fish, and mammals is exaggerated by evolutionary distance, our analysis of shared synteny indicates that members of distinct subfamilies in mammals and fish still may be co-orthologs. The CYP2J-related genes are a key example. The grouping of 11 genes in the zebrafish CYP2N, 2P, 2V, and 2AD subfamilies with human CYP2J2 in a single clade (Figure 2) implies a relationship between the fish and human genes. Our analysis shows that the zebrafish genes occur in tandem in a cluster that shares synteny with CYP2J2, indicating co-orthology (Figure 3). This suggests that there are catalytic functions among these zebrafish CYPs that are similar to those of human CYP2J2. A similar hypothesis was borne out in functional characterization of the previously identified killifish CYP2P3, which also clusters with the CYP2Js on phylogenetic analysis. Heterologously expressed CYP2P3 exhibited nearly the same regio- and stereoselectivity for oxidation of arachidonic acid as human CYP2J2, consistent with molecular phylogeny indicating a shared ancestral origin. Notably, at present little or nothing is known about the catalytic or biological functions or the chemical regulation of the majority of zebrafish CYP2s.
Zebrafish CYPs in development
In addition to annotating the full complement of CYP genes, we analyzed the expression of 88 CYP genes over the course of development in zebrafish (Figure 4). Several of these zebrafish CYPs exhibited markedly elevated expression levels at some time during development, and a much larger number, more than three-quarters of the total genomic complement, showed distinct temporal expression patterns (Figure 5, Additional File 1, Figure S6). Importantly, gene expression profiling performed on whole embryos often underestimates the importance of tissue- or cell-specific gene expression, because a signal confined to a few cells is diluted by RNA from the rest of the embryo.
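The dilution effect noted above can be illustrated with simple arithmetic. The sketch below is a toy model, not part of our analysis: it assumes that every cell contributes equally to the pooled RNA, that baseline expression is the same in all cells, and that expression changes only in a defined fraction of cells. The function name and the example values (1% of cells, 50-fold local change) are hypothetical.

```python
def whole_embryo_fold_change(cell_fraction: float, local_fold_change: float) -> float:
    """Fold change measured in pooled whole-embryo RNA (toy model).

    cell_fraction: fraction of cells in which expression changes (0 to 1).
    local_fold_change: fold change within those cells relative to baseline.

    With baseline expression b in every cell, the pooled signal is
    cell_fraction * local_fold_change * b + (1 - cell_fraction) * b,
    so the measured fold change is 1 + cell_fraction * (local_fold_change - 1).
    """
    if not 0.0 <= cell_fraction <= 1.0:
        raise ValueError("cell_fraction must lie between 0 and 1")
    if local_fold_change < 0.0:
        raise ValueError("local_fold_change must be non-negative")
    return 1.0 + cell_fraction * (local_fold_change - 1.0)


# Hypothetical example: a 50-fold change restricted to 1% of cells.
measured = whole_embryo_fold_change(cell_fraction=0.01, local_fold_change=50.0)
print(f"measured whole-embryo fold change: {measured:.2f}")
```

Under these assumptions, a large change confined to a small cell population appears as a modest change in the whole-embryo measurement, which is why weak array signals do not rule out strong tissue-specific expression.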
Our array results are similar to the developmental expression patterns previously determined for some individual CYP genes (e.g. CYP1s [59, 62], CYP2K6, CYP3C1, CYP19 [36, 102]). Developmental roles have been established principally for some of the "endogenous" CYPs. Such genes include CYP11A1, which is essential for the synthesis of pregnenolone, critical for cell migration; and the three CYP26s, which contribute to retinoic acid gradients that regulate hindbrain and neural crest patterning [13, 103, 104] and osteogenesis [2, 47]. We and others have seen complex CYP expression patterns in development. Developmental roles of CYPs in families 1-3 are unknown, although morpholino knockdown of CYP1Cs appears to protect from developmental toxicity of dioxin (Kubota et al., unpublished data) and over-expression of CYP2P6 caused developmental abnormalities, including cardiovascular abnormalities, suggesting developmental significance of these genes.
The roles of many CYPs, including roles in development, cannot necessarily be inferred from sequence identities. Thus, CYP20A1 is an 'orphan' CYP that does not have a defined function in zebrafish or in humans [37, 38], and some large subfamilies (CYP2X, CYP2AA) do not have any homologs in mammals. As well, where several zebrafish genes are co-orthologs of a single human CYP, a function cannot be assigned to any one of them by inference; this requires empirical determination. This is true for the multiple co-orthologs in the "endogenous" CYP families, such as the CYP11s, the CYP27s and the CYP46s, as well as for most of those in the "xenobiotic" CYP families. The issues in CYP11 exemplify the questions and approaches. Zebrafish CYP11A1 is expressed throughout development, as in the murine model, but the CYP11A1 knockdown is not lethal [5, 27]. As zebrafish have two CYP11A genes and a CYP11C gene, it is possible that overlapping substrate specificity and spatiotemporal expression patterns might allow one to substitute for another in loss-of-function studies.
There is an even greater dearth of information regarding CYP genes that may have maternally derived transcripts deposited in oocytes. Our analysis of transcripts of a few selected CYP genes, CYP1A, CYP2V1, CYP2AA4, and CYP20A1, showed that transcripts for all four were present in unfertilized zebrafish eggs, and that the levels of transcript could be substantial. CYP19 mRNA also has been reported in oocytes, and CYP1A transcript also was reported recently by others. The significance of maternal transcripts of these CYPs is not known. Whether other CYPs also have maternal transcripts deposited in oocytes, and what influences that deposition, is under investigation.