The Xenopus alcohol dehydrogenase gene family: characterization and comparative analysis incorporating amphibian and reptilian genomes

Background: The alcohol dehydrogenase (ADH) gene family uniquely illustrates the concept of enzymogenesis. In vertebrates, tandem duplications gave rise to a multiplicity of forms that have been classified in eight enzyme classes, according to primary structure and function. Some of these classes appear to be exclusive to particular organisms, such as the frog ADH8, a unique NADP+-dependent ADH enzyme. This work describes the ADH system of Xenopus, as a model organism, and explores the first amphibian and reptilian genomes released in order to contribute towards a better knowledge of the vertebrate ADH gene family.
Results: Xenopus cDNA and genomic sequences along with expressed sequence tags (ESTs) were used in phylogenetic analyses and structure-function correlations of amphibian ADHs. Novel ADH sequences identified in the genomes of Anolis carolinensis (anole lizard) and Pelodiscus sinensis (turtle) were also included in these studies. Tissue and stage-specific libraries provided expression data, which were supported by mRNA detection in Xenopus laevis tissues and by regulatory elements in promoter regions. Exon-intron boundaries, position and orientation of ADH genes were deduced from the amphibian and reptilian genome assemblies, thus revealing syntenic regions and gene rearrangements with respect to the human genome. Our results reveal the high complexity of the ADH system in amphibians, with eleven genes, coding for seven enzyme classes in Xenopus tropicalis. Frogs possess the amphibian-specific ADH8 and the novel ADH1-derived forms ADH9 and ADH10. In addition, they exhibit ADH1, ADH2, ADH3 and ADH7, also present in reptiles and birds. Class-specific signatures have been assigned to ADH7, and ancestral ADH2 is predicted to be a mixed-class enzyme like the ostrich enzyme, structurally close to mammalian ADH2 but with class-I kinetic properties. Remarkably, many ADH1 and ADH7 forms are observed in the lizard, probably due to lineage-specific duplications. ADH4 is not present in amphibians and reptiles.
Conclusions: The study of the ancient forms of ADH2 and ADH7 sheds new light on the evolution of the vertebrate ADH system, whereas the special features shown by the novel forms point to the acquisition of new functions following the ADH gene family expansion which occurred in amphibians.


Background
Vertebrate alcohol dehydrogenases (ADH, EC1.1.1.1) are dimeric zinc-containing enzymes with a 40-kDa subunit and 373-383 amino acid residues. Structurally, they belong to the medium-chain dehydrogenase/reductase (MDR) superfamily [1]. ADHs catalyze the reversible oxidation of a wide range of alcohol substrates to the corresponding aldehydes or ketones, and can be grouped in eight enzyme classes (ADH1-8, class I to VIII) [2], according to their primary structure and function. The human ADH gene nomenclature used throughout the text is the enzyme class-based nomenclature currently used for vertebrate ADH [2] and differs from that approved by the Human Genome Organization (HUGO) Gene Nomenclature Committee [3], as the former facilitates comparisons with ADHs from other mammals and lower vertebrate species.
Tandem gene duplications gave rise to the multiplicity of forms in the ADH family, including isoenzymes and allelic forms in particular lineages. ADH3 is the most ancient form and the only class present before chordates. It is a glutathione-dependent formaldehyde dehydrogenase (FDH), a highly conserved and ubiquitous detoxifying enzyme. Duplication of the ancestral ADH3 gene near the agnathan/gnathostome split originated ADH1, which evolved independently in the fish and tetrapod lines becoming the classical hepatic ethanol dehydrogenase [4,5]. In tetrapods, a second duplication of the gene coding for ADH3 generated ADH2, also hepatic but active at higher ethanol concentrations [6]. Close to the origin of mammals, ADH1 duplicated giving rise to ADH4, a highly retinoid-active enzyme [7,8] present in eye, skin and gastric tissues [9-11]. The most evolutionarily recent classes in mammals are ADH5 and ADH6 [12], the latter being absent in primates [13]. These two classes, identified at DNA level, are the most divergent within mammalian ADHs. On the other hand, ADH7, previously named ADH-F due to its fetal expression, is a steroid/retinoid dehydrogenase that was first described in chicken [14]. Finally, ADH8 is a unique NADP+-dependent ADH isolated from the stomach of the frog Rana perezi and its proposed function is the reduction of retinaldehyde to retinol [15].
Studies on amphibian ADH genetics have been scarce. Isozyme patterns of X. laevis liver ethanol dehydrogenase suggested the existence of two polymorphic genes encoding ADH subunits that did not form heterodimers and were located in different linkage groups [16,17]. The enzymes ADH1, ADH3 and ADH8 from the frog Rana perezi were purified and characterized by our group, and the ADH1 and ADH8 proteins were also sequenced [15,18]. The cloning of the cDNA of R. perezi ADH8 [15] made it possible to perform mutagenesis studies on coenzyme specificity [19] and to obtain the crystal structure of the enzyme [20,21]. Partial cDNAs of X. laevis ADH1 and an ADH4-like form were cloned and used for expression analysis in embryonic and adult tissues [22]. Later, two reviews on MDR-ADH evolution [4,23], which included genomic data, provided some partial information on the amphibian ADH system. Here the ADH system of the development model frog X. laevis has been further investigated, especially the retinaldehyde-active ADH8. Tetraploidy of X. laevis (2n = 36) was a handicap for genetic studies, thus the present work was restricted to expression patterns and extended with additional information from expressed sequence tag (EST) collections. On the other hand, its diploid relative X. tropicalis (2n = 20), the subject of the only amphibian genome project, was used for a genomic approach to the amphibian ADH family. Since the genomes of two reptiles, the anole lizard (Anolis carolinensis) and the turtle (Pelodiscus sinensis), had been sequenced at the time of the study, the ADH gene sequences from these organisms could be identified and used in phylogenetic analyses and genomic comparisons.
The joint analysis of Xenopus genome-wide data and the results of the expression analysis described herein provide an integrated view of the amphibian ADH system. Moreover, since this organism occupies a key phylogenetic position, this work provides insight into the molecular evolution of the ADH gene family in vertebrates.

Animal tissues
Tissues were obtained from adult X. laevis females (130 mm long) provided by Horst Kähler (Hamburg, Germany). The animals were kept in an ice bath for 15 min to diminish their metabolism prior to euthanasia. After decapitation, the head was immersed in liquid nitrogen to assure total unconsciousness, as recommended [24]. The organs were then removed, cleaned, rinsed in distilled water and stored at -80°C. Prior to analysis, frozen tissues were pulverized in liquid nitrogen and homogenized. This study was approved by the Ethical Committee of the Universitat Autònoma de Barcelona.
Isolation and cloning of X. laevis cDNAs
Stomach poly(A)+ RNA (2 μg) was isolated with the "QuickPrep Micro mRNA purification kit" (GE Healthcare) and a cDNA pool was synthesized using the "First Strand cDNA Synthesis kit for RT-PCR (AMV)" (Roche) with the oligo(dT)17-RI-RO primer adaptor [25]. Nested-PCRs combined degenerate primers based on R. perezi ADH8 (Table 1) and amplification products were cloned into pBluescript II SK(+) (Stratagene) and sequenced. The partial cloned sequences were later identified as ADH1B and ADH3, whereas the ADH8B and ADH9 partial cDNAs were obtained as described [21]. The 3′-ends were amplified by rapid amplification of cDNA ends [25] combining the adaptor-specific primers RO and RI with specific forward primers (Table 1), and then cloned and sequenced as described above.
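The degenerate-primer design mentioned above can be illustrated with a minimal sketch. It uses only the degenerate codes listed in the Table 1 footnote (R, W, H and inosine, I). The primer strings are hypothetical and are not the primers of Table 1, and the function names are illustrative. Inosine is treated here as a single synthesized base that is allowed to pair with any template base, which is an assumption of this sketch.

```python
from itertools import product

# Bases actually synthesized at each degenerate position (inosine stays "I").
SYNTHESIZED = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "W": "AT", "H": "ACT", "I": "I",
}
# Bases each primer code is allowed to match; inosine matches anything.
MATCHES = dict(SYNTHESIZED, I="ACGT")


def pool_size(primer):
    """Number of distinct oligonucleotides in the degenerate primer pool."""
    size = 1
    for code in primer.upper():
        size *= len(SYNTHESIZED[code])
    return size


def expand_primer(primer):
    """List every concrete oligonucleotide encoded by a degenerate primer."""
    choices = [SYNTHESIZED[code] for code in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


def primer_matches(primer, site):
    """True if a same-strand target site is compatible with the primer.

    Both strings are read 5'->3' on the same strand; no complementing is done.
    """
    primer, site = primer.upper(), site.upper()
    return len(primer) == len(site) and all(
        base in MATCHES[code] for code, base in zip(primer, site)
    )


if __name__ == "__main__":
    hypothetical_primer = "GGICARTGGWH"  # illustrative only
    print(pool_size(hypothetical_primer), "oligonucleotides in the pool")
```

Counting the pool size in this way gives a rough idea of how diluted each individual oligonucleotide is in a degenerate primer mix, which is one reason nested PCR is often combined with degenerate primers.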
Northern blot analysis of X. laevis ADH1B, ADH3 and ADH9
Total RNA from stomach, liver, kidney and intestine was isolated by the acid guanidinium thiocyanate method [26]. Samples (15 μg) were electrophoresed on a 1% agarose gel containing 2.6 M formaldehyde and transferred onto a Nylon filter. 18S rRNA (1.8 kb) and 28S rRNA (4.1 kb) were used to check the integrity and amount of loaded RNA and to estimate the size of the RNA hybrids. Probes included X. laevis ADH1B, ADH3 and ADH9 cDNAs (their 3′-end moieties of ~700 bp), labeled with [α-32P]dCTP (GE Healthcare) by a random hexamer-primed method using the "Prime-a-Gene Labeling System" (Promega) except for the Klenow enzyme (Invitrogen). After a 45-min prehybridization at 60°C in 0.2 M sodium phosphate, pH 7.2, 1 mM EDTA, 1% BSA and 1% SDS, filters were hybridized for 18-24 h at 60°C in the presence of 10^6 cpm/ml of radiolabeled probe. Final 30-min washes at 60°C were performed twice in 40 mM sodium phosphate, pH 7.2, 1 mM EDTA and 1% SDS. Autoradiography was carried out at -80°C for 2-5 days with Hyperfilm-MP (GE Healthcare) using an intensifying screen. Hybridization signals were then scanned in a Bio-Rad GS-700 imaging densitometer.
RT-PCR of X. laevis ADH8B
cDNA pools from esophagus, stomach, intestine and liver were prepared from 3-8 μg of total RNA using the ADH8B-specific reverse outer primer described in Table 1. The first PCR amplification combined this primer with the ADH8B forward outer primer, and a second round used the inner primer pair (Table 1), generating a 603-bp ADH8B cDNA fragment.
Starch gel electrophoresis and activity staining of X. laevis ADH1 and ADH3
Tissues were homogenized (1:1, w/v) in 30 mM Tris-HCl, pH 7.6, 0.5 mM dithiothreitol, centrifuged at 27,000 × g for 30 min and supernatants were used for analysis by starch gel electrophoresis [27]. Total protein was determined by the Bio-Rad protein assay method, using bovine serum albumin as standard. In order to discriminate between the NAD+-dependent classes ADH1 and ADH3, gel slices were stained for ADH activity with 0.1 M 2-buten-1-ol and 0.6 mM NAD+ (grade AA1, Sigma), to mainly detect ADH1, and for glutathione-dependent FDH activity with 4.8 mM formaldehyde and 1 mM glutathione, to specifically stain ADH3.

Identification of Xenopus ADH sequences in protein and expression databases
Several ADH protein sequences from both Xenopus species were gathered at UniProt [28], first by name search and then by exploring clusters with 90% or 50% identity. ESTs of X. laevis and X. tropicalis were obtained by BLAST (Basic Local Alignment Search Tool) [29] search, using X. laevis and other vertebrate ADHs as protein queries against translated nucleotide sequences (TBLASTN), at the Sanger Institute X. tropicalis EST Project [30] and TGI Gene Indices [31] websites. Additional information on the expression sites of Xenopus ADH clustered transcripts was obtained by name search at the NCBI UniGene data bank [32].
Identification and analysis of ADH genes in Xenopus tropicalis, Anolis carolinensis and Pelodiscus sinensis genomes
X. tropicalis genome assembly 4.2, A. carolinensis AnoCar2.0 and P. sinensis PelSin_1.0 were interrogated with BLAST-based search tools to identify possible ADH gene locations. For X. tropicalis, TBLASTN searches were undertaken at the Joint Genome Institute X. tropicalis Genome Assembly 4.1 website [33] using X. laevis ADHs as protein queries. For A. carolinensis and P. sinensis, TBLASTN searches were conducted using the Ensembl genome browser [34] to compare X. tropicalis ADHs against genomic databases, allowing some local mismatch. Protein BLAT (BLAST-Like Alignment Tool) analyses [35] using the UCSC web browser [36] were also performed and the same scaffolds producing significant alignments were obtained.
Table 1 footnotes: 1 Degenerate nucleotides are R (A or G), W (A or T), H (A or C or T) and I (inosine, able to base pair with any natural nucleotide). 2 Numbering refers to amino acid residues unless otherwise indicated.
These ADH-containing scaffolds were then exported with Ensembl, 3-frame translated in both orientations, and manually screened for the presence of ADH genes either using exon sequences derived from X. tropicalis ESTs or consensus ADH sequences. The Ensembl Genome Browser [37] was also used to align syntenic regions of the genomes studied. Exon-intron boundaries were determined according to the general GT/AG consensus, and intron lengths and discontinuities in the DNA sequence were annotated. The first 600 bp of the 5′-non-coding regions of X. tropicalis ADH genes were checked for potential transcription factor binding sites using MATCH [38], based on the TRANSFAC database of position weight matrices, using 100% coincidence for the core and 95% for the whole matrix. In addition, the first 650 bp of the 3′-non-coding regions were screened for polyadenylation signals.
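Three of the sequence-screening steps described above (translation in three frames on both strands, the GT/AG intron rule, and the search for polyadenylation signals within a fixed window) can be sketched as follows. This is an illustrative outline under stated assumptions, not the procedure actually used: the function names are hypothetical, the standard genetic code is assumed, and the canonical hexamer AATAAA is assumed as the polyadenylation signal because the text does not specify which motif was searched.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(dna):
    """Reverse complement of a DNA string (A, C, G, T, N)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate three frames on each strand, keyed '+1'..'+3', '-1'..'-3'."""
    frames = {}
    for strand, seq in (("+", dna.upper()), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def is_canonical_intron(intron):
    """True if the intron starts with GT and ends with AG (GT/AG consensus)."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def find_polya_signals(utr, window=650, signal="AATAAA"):
    """0-based start positions of the assumed signal within the first `window` bp."""
    region = utr.upper()[:window]
    positions = []
    start = region.find(signal)
    while start != -1:
        positions.append(start)
        start = region.find(signal, start + 1)
    return positions
```

In practice, the translated frames would be compared against known ADH exon peptides to locate candidate exons, after which the flanking nucleotides can be checked with `is_canonical_intron`.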

Sequence alignment and phylogenetic analyses
Sequence analysis and manipulation were carried out using BioEdit Sequence Alignment Editor, version 7.0.4.1 [39]. Multiple sequence alignments were performed with Clustal Omega [40,41]. Gaps and missing positions were not removed from the alignment and trees were constructed considering partial deletions for the pairwise comparisons. Phylogenetic analyses were conducted using MEGA version 5 [42]. Unrooted phylogenetic trees were constructed by Neighbour-joining (NJ) [43] using the JTT (Jones-Taylor-Thornton) matrix [44] for amino acid distance calculations. Evolutionary rates among sites were considered γ-distributed and the α parameter was calculated with TREE-PUZZLE 5.2 [45]. Bootstrap analysis [46] with 1000 replicates was performed to assess the relative confidence on the topology obtained. A second tree was constructed following the Maximum-likelihood (ML) method [47], by using the PHYML 2.4.5 program [48] included in the MacGDE software package. The tree parameters were the same as those used for the NJ tree; in this case, the reliability of the inferred phylogeny was assessed by 500 bootstrap repetitions.
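The bootstrap procedure used to assess tree topology can be illustrated with a small sketch. Note that this example deliberately swaps in a much simpler distance than the one used in the study: it uses the plain proportion of differing sites (p-distance) with pairwise deletion of gapped columns, not JTT distances with γ-distributed rates, and it scores only whether a chosen pair of sequences remains the closest pair, rather than building a full NJ or ML tree. All names and the toy alignment are hypothetical.

```python
import random
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites, skipping columns gapped in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0  # no comparable sites: treated as maximally distant (arbitrary)
    return sum(x != y for x, y in pairs) / len(pairs)


def bootstrap_columns(alignment, rng):
    """One bootstrap replicate: resample alignment columns with replacement."""
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}


def closest_pair(alignment):
    """Pair of sequence names with the smallest p-distance."""
    return min(
        combinations(sorted(alignment), 2),
        key=lambda pair: p_distance(alignment[pair[0]], alignment[pair[1]]),
    )


def bootstrap_support(alignment, pair, n_replicates=1000, seed=0):
    """Fraction of replicates in which `pair` is the closest pair."""
    rng = random.Random(seed)
    target = tuple(sorted(pair))
    hits = sum(
        closest_pair(bootstrap_columns(alignment, rng)) == target
        for _ in range(n_replicates)
    )
    return hits / n_replicates


if __name__ == "__main__":
    toy = {"seqA": "MKTAYIAK", "seqB": "MKTAYIAK", "seqC": "QRSWLVGH"}
    print(bootstrap_support(toy, ("seqA", "seqB"), n_replicates=100))
```

The same resampling idea underlies the 1000-replicate (NJ) and 500-replicate (ML) analyses: each replicate rebuilds the tree from resampled columns, and the support for a node is the fraction of replicates that recover it.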

Accession numbers of ADH sequences
The accession numbers of the vertebrate ADH sequences used in alignments and phylogenetic analyses are listed in Table 2, except for those of X. laevis, X. tropicalis, A. carolinensis and P. sinensis ADHs, which are provided in Tables 3 and 4.

Results
Isolation, cloning and identification of X. laevis cDNAs
With the initial aim of studying the amphibian NADP+-dependent ADH8 in the development model X. laevis, degenerate primers were designed, based on the R. perezi ADH8 sequence. By RT-PCR amplification from an X. laevis stomach cDNA pool, four cDNAs were cloned and sequenced, and, on the basis of amino acid sequence identity, they were assigned to ADH1, ADH3, ADH8 and a novel ADH9 class. ADH1, ADH3 and ADH8 were similar to their R. perezi orthologues and showed the typical residues of each class, while the low sequence identity values (<58%) between ADH9 and other classes indicated that it is a new class. The ADH1 cDNA was likely to be an allele of the same gene whose partial sequence had been reported by Hoffmann et al. [22], while ADH9 was identical to the alleged ADH4-like form reported in the same study. Our studies here indicate that Xenopus does not possess an ADH4 ortholog.
Since the X. tropicalis genome project provided reliable genomic data from this organism, a parallel study was conducted to identify X. tropicalis ADH genes, and the same nomenclature was used for the two Xenopus species. Some ADH genes appear to be closely related, probably encoding isozymes of the same enzymatic class. Therefore, our gene notation includes an Arabic number indicating the ADH class, followed by a capital letter corresponding to the encoded isozyme, assigned in ascending order by the gene location in the scaffold. For those genes that had been previously identified in X. tropicalis genomic studies [4], their names were conserved whenever it was possible. Moreover, putative duplicated genes in X. laevis relative to X. tropicalis were denoted with a "1" or "2" tag after the name field.
According to this nomenclature, subsequent screening of data banks retrieved the following X. laevis ADH sequences, as detailed in Table 3: four ADH1 forms (1A1, partial, and 1A2, corresponding to a single 1A form in X. tropicalis; 1B and 1C); one form each of ADH3, ADH8 and ADH9; two novel sequences that were named ADH10 (10A and 10B); and a partial sequence similar to that of chicken ADH7. The X. laevis ADH3 and ADH9 cDNAs cloned here were identical to the sequences retrieved, while the ADH1 cDNA corresponded to ADH1B. The only ADH8 transcript found in databases showed a nonsense mutation after codon 25 (a likely amplification or sequencing artifact) and notably differed from the cloned sequence from gastric tissue. Thus, this new sequence was considered to be a different gene, designated ADH8A, while that previously cloned was named ADH8B.
Class assignment was mainly based on amino acid identities in pairwise comparisons with other vertebrate ADH enzymes (see Additional file 1), considering an intraclass identity >65% for amphibian ADH sequences (see Additional file 2). The presence of class-specific residues and phylogenetic relationships were also taken into account for Xenopus ADH7 and ADH10, which were considered as separate classes despite their high amino acid identities with ADH1 enzymes.

Expression of X. laevis ADHs
Northern blot analysis performed on intestine, kidney, liver and stomach from X. laevis, with cDNA probes for cloned ADH1B, ADH3 and ADH9, showed a 1.6-kb mRNA in positive samples (Figure 1A). ADH3 transcripts were present in all the tissues analyzed at relatively low levels. Starch gel electrophoresis confirmed its generalized expression and revealed high glutathione-dependent formaldehyde dehydrogenase activity in the ovary as compared with esophagus, stomach and liver (Figure 2). ADH1B expression followed the typical pattern found for ADH1 genes in vertebrates, as it was differentially expressed in the tissues analyzed (liver > kidney > stomach > intestine), while ADH9 was only detected in stomach. Moreover, ADH8B transcripts were detected by RT-PCR in X. laevis esophagus, stomach and intestine but not in liver (Figure 1B), confirming the gastrointestinal location of this enzyme reported in R. perezi [15]. The diffuse band of low molecular weight observed in the liver corresponds to an unspecific amplification product, as confirmed by the lack of NADP+-dependent activity observed in electrophoresed liver extracts (not shown).
Additional data on the expression profile of the identified ADHs was obtained from Xenopus EST libraries. Although EST evidence is not quantitative, it can reflect very low transcript amounts, undetectable by less sensitive methods, but which may be physiologically relevant. Table 3 combines the results of localization studies with data obtained from expression libraries of adult tissues and embryonic stages.
Sequence data of X. laevis, X. tropicalis, A. carolinensis and P. sinensis ADHs were deposited in the European Molecular Biology Laboratory (EMBL) nucleotide sequence database [49], and accession numbers are provided in Tables 3 and 4.
Synteny of amphibian, reptile and human regions containing ADH genes is shown in Figure 3. Scaffolds GL172747.1 of X. tropicalis, GL343323.1 of A. carolinensis, and JH209104.1 and JH210661.1 of P. sinensis are syntenic to human chromosome 4, where all seven human ADH genes are closely linked. On the other hand, X. tropicalis scaffold GL172865.1 may be syntenic to human chromosome 9. Arrangement of several orthologous genes flanking the ADH loci in X. tropicalis scaffold GL172747.1 (the whole human syntenic region spans 2.7 kb) indicates a past inversion that isolated ADH10A from the main ADH cluster, presumably followed by other rearrangements in the same region. In contrast, reptiles show a single cluster although ADH genes are found in both orientations in the A. carolinensis genome. All X. tropicalis genes contain nine exons and eight introns and the insertion points are those conserved in animal ADHs [50]. Comparison of the gene structures shows a wide range of intron sizes; therefore this parameter has not been conserved, even among close genes. The fact that no stop codons or reading frame alterations were detected in the coding sequences, together with evidence from X. tropicalis EST collections, supports the idea that all these genes are transcriptionally active. Moreover, all the X. tropicalis ESTs found could be assigned to an ADH genomic sequence. The eleven genes were phylogenetically classified in seven classes: ADH1 (1A, 1B, 1C), 2, 3, 7, 8 (8A, 8B), 9, and 10 (10A and 10B). Furthermore, sequences from UniProt [28] provided supporting evidence and sometimes complementary information, like the first exon of ADH7, missing in the assembly but present in sequence Q5M7K9 (Table 3).
Table 3 legend: All Xenopus ADHs supported by evidence at the protein, transcript or gene level are included, together with their accession numbers and expression sites. A single ADH1A gene exists in X. tropicalis while two genes are found in X. laevis. In contrast, no X. laevis ADH2 sequence has been found to date. Accession numbers are taken from UniProt or EMBL (all the EMBL sequences, in bold, are from this study). Expression sites were mostly obtained from UniGene (cluster numbers are provided), but also from GenBank (where indicated). For X. laevis, results of expression experiments from our group (in bold) and other sources are also included. 1 Partial sequence, probably another allele of this gene; 2 Hoffmann et al. [22]; 3 Partial sequence, containing frameshift mutations; n.a.: not available.
Promoter analysis of X. tropicalis ADH genes
Regions including 600 bp upstream of the translation start codon were screened for proximal regulatory elements (see Additional files 3-13 for X. tropicalis ADH cDNAs including predicted regulatory elements). Putative TATA boxes were established for all genes except for ADH3. An upstream TATA box has been described for most ADH genes, such as those encoding human and mouse class I [51-54] and human class II [55], whereas the promoters of human and mouse class III contain GC boxes clustered near the start site [56-58].
Putative transcription factor binding sites were also predicted, although they should be functionally validated in vivo since competition, chromatin structure and other influences are as important as binding affinity. Promoters of ADH1A and ADH1B contain putative sites for HNF3beta, GATA-1, overlapping half-sites for estrogen and retinoid receptors (repeated twice in ADH1B), three AP1 sites in ADH1A and one site for Sp1 in ADH1B. Positive regulation of the human ADH1A gene was reported to be influenced by GATA-2, while differences in HNF3beta binding could be related to tissue specificity of ADH1 [59]. In addition, AP1-responsive genes can be negatively regulated by retinoic acid [60]. The ADH1C promoter has a single site for Oct1 and reverse sites for HNF3beta and Sp1. The ADH2 promoter contains one NF1-binding site in forward orientation and two reverse sites for GATA-1, NF1 and AP1, and one for Gfi1. The ADH3 TATA-less promoter shows single sites for Oct1, c-Myb and HNF3beta, a reverse CCAAT box and reverse sites for USF, GATA-1 and AP1. Both ADH8 promoters exhibit a CCAAT box, and GATA-1 (two in ADH8A), XFD and HNF3beta sites. Their differential traits are CRE-BP1 and AP1 sites in ADH8A, and overlapping half-sites for estrogen and retinoid receptors in ADH8B. The ADH9 promoter also has a CCAAT box, and GATA-1 and XFD sites, in addition to a site for Oct1. Finally, the ADH10A promoter has one USF, one CHOP:C/EBPalpha and three HNF3beta sites, while ADH10B shows a putative site for estrogen receptor, and GATA-1 and AP1-binding sites. Furthermore, single polyadenylation signals were found within the first 650 bp of the 3′-non-coding region of the three ADH1 genes, ADH3, ADH7 and ADH8B; two signals were observed in the case of ADH2, ADH8A, ADH9 and ADH10B, and up to five signals were located at the 3′-end of ADH10A.

Sequence analysis and evolutionary relationships
Available amino acid sequences of R. perezi, X. laevis and X. tropicalis were aligned (see Additional file 14) and key residues for substrate and coenzyme binding are summarized in Table 5.
Xenopus orthologs are closely related phylogenetically, with an intraclass identity of 85-97% (see Additional file 2). Evidence of X. laevis genome duplication was exclusively found for X. tropicalis ADH1A, which corresponds to X. laevis ADH1A1 and ADH1A2 genes (Table 3). Although only partial sequences were found for X. laevis ADH1A1, identical key residues in X. tropicalis ADH1A (Arg47, His51 and Val141), which are not found in X. laevis ADH1A2, suggest that the two former sequences would be functionally closer. The percentage of identity between X. laevis ADH1A1 and ADH1A2 is 89.3% in 244 residues. When Rana and Xenopus class I sequences were compared, R. perezi ADH1 showed the highest identity with Xenopus ADH1B forms (~76%) and identical residues at most important positions (Table 5). For ADH8, the percent identity between the R. perezi and Xenopus forms is 71-73% and all possess Ser48, Ser51, Phe93, Gly223 and Leu309.
Table 5 legend: All available sequences, full-length and partial, are considered. Sequence annotation consists of source organism abbreviation (Rp: Rana perezi, Xl: Xenopus laevis, Xt: Xenopus tropicalis) followed by the assigned class number. Position numbering is based on horse ADH1E. Residues at positions 223-225 (in bold) are involved in the interaction with the extra phosphate group of NADP+ in ADH8 and determine preference for this coenzyme [19].
Amphibian sequences were also compared to other vertebrate ADHs, including those identified in A. carolinensis (anole lizard) and P. sinensis (turtle), in unrooted phylogenetic trees (Figure 4). Molecular phylogenetic analysis on the deduced protein sequences using NJ and ML estimations produced similar topologies. For each tree construction, among-site rate heterogeneity was taken into account and confidence in each node was assessed by 1000 and 500 bootstrap replicates, respectively. In the tree of Figure 4, Xenopus ADH1, ADH2, ADH3, ADH7 and ADH8 sequences cluster with other extant members of their classes, whereas ADH9 branches separately. The tree topology pictures the constant nature of class III, in contrast to the other ADH classes. Related to class I, amphibian ADH1, ADH8, ADH9 and ADH10 form a protein cluster presumably derived from a common ancestor. An important radiation would have occurred in amphibians from a primitive ADH1 gene, originating the four classes mentioned above, although the order and genes involved in each duplicatory event cannot be ascertained. Likewise, the presence of eight ADH1 and three ADH7 forms in A. carolinensis suggests that specific duplications could have occurred in lizards but not in turtles, as these organisms belong to different reptilian lineages. Percent identity within anole class I sequences ranges from 70.6% to 86.4%, and ADH1A and ADH1B
Figure 4 legend (fragment): Tables 3 and 4. Accession numbers of other ADH sequences are compiled in Table 2. Alignment of all vertebrate ADHs included in the phylogenetic tree is presented in Additional file 15. Scale-bar represents substitutions per nucleotide.
Interestingly, multiple alignments reveal that all class I enzymes from reptiles to humans, as well as other classes derived from amniote ADH1, such as ADH4 and ADH6, show a deletion at position 60 with respect to amphibian ADH1 proteins and the remaining ADH classes (see Additional file 15).

Discussion
A total of eleven ADH genes with a conserved structure have been identified in the X. tropicalis genome, and grouped in seven enzyme classes: ADH1 (1A, 1B, 1C), 2, 3, 7, 8 (8A, 8B), 9, and 10 (10A and 10B). These loci are distributed in two scaffolds, one containing the main discontinuous ADH cluster, syntenic to human 4q21-23, but broken by several rearrangements, and the other showing the single ADH1A locus. The amphibian ADH system represents a unique organization among tetrapods since sequencing data and comparative analysis of genomes describe single clusters in human, rat, mouse and chicken [4,61]. Genes similar to those of X. tropicalis have been identified in X. laevis, indicating that the multiplicity of ADH forms was present prior to the divergence of the two species. Duplication of the X. laevis genome (30 Mya) [62] affected a great number of gene families, such as those of globin or α-actin [63,64], but the only ADH duplicates found to date correspond to the X. tropicalis ADH1A gene, which were named ADH1A1 and ADH1A2 in X. laevis. This suggests that many ADH duplicated loci could have been lost. Nevertheless, the future identification of other gene duplicates in X. laevis cannot be ruled out. In this regard, activity staining of hepatic ADHs revealed the existence of two polymorphic genes coding for ethanol dehydrogenase subunits that did not heterodimerize and were placed in separate genetic linkage groups [17].
In the following description of the amphibian ADH properties, we include functional features of forms not yet characterized, predicted from the wide information of the structure/function relationships available for the ADH family. However, the proposed functions have to be confirmed by the expression and kinetic characterization of the novel enzymes, especially ADH9 and ADH10.

Ancient forms of vertebrate ADH classes in amphibians
ADH1
Amphibian class ADH1 clusters with the novel amphibian classes ADH8, ADH9 and ADH10, since all of them derive from a primitive ADH1 gene ancestor, also common to the amniote class I line. Later, ADH1 duplications generated ADH1A, 1B and 1C (these duplications were independent from mammalian ADH1 duplications that generated human ADH1A, 1B and 1C after rodent/primate divergence); ADH8A and 8B; and ADH10A and 10B from their corresponding ancestors.
Xenopus ADH1A is the most similar to other vertebrate ADH1 enzymes, whereas Xenopus ADH1B shows the highest identity with R. perezi ADH1. Xenopus ADH1A and ADH1B show Arg47, His51, Phe93 and Phe140, typical class I residues that are associated with ethanol dehydrogenase activity. The substrate-binding pocket of R. perezi ADH1B is extremely hydrophobic and space-restricted, resulting in low Km values for aliphatic alcohols; it has wide substrate specificity and is moderately active with retinoids [18]. In ADH1A, smaller substrate-binding residues Val141 and Val294 anticipate higher Km values for this isozyme, and substitution by His363 (Arg in many class I enzymes) suggests an increased rate of NAD+ dissociation and higher kcat values. X. laevis ADH1A2 has many atypical residues, such as Gly47 or Thr51, which suggest an alternative proton-relay pathway in comparison with all the other class I enzymes, which show His51. Moreover, the bulky residues Phe57, Met110 and Met141 would increase hydrophobicity and would narrow the substrate cleft even more than in ADH1B. These residue exchanges predict different substrate specificity and suggest that ADH1A2 may have acquired a new function after gene duplication, while ADH1A1 would have maintained the original one.
ADH1C has unique features among substrate-binding residues. At position 93, the lack of an aromatic ring expands the substrate cleft and permits the accommodation of large substrates, as in human ADH1A and chicken ADH7 [14,65,66]. In contrast, an unusual Phe116 would narrow the entrance, although it may still allow productive binding of retinoids as occurs in X. laevis ADH8B [67]. These features predict that ADH1C binds large alcohol substrates better than ethanol. Substitutions in the coenzyme-binding site, in relation to amniote class I, are His47 instead of Arg (ADH1C has His residues at both positions 47 and 51), Asp271, and Asn/Thr363, which could weaken the coenzyme binding and increase kcat values.
The expression pattern of amphibian class I in adult and embryonic tissues resembles that of other vertebrates, and transcripts of ADH1B are abundant in the developing tadpole (Table 3 and [22]).
Regarding reptiles, specific duplications of the ADH1 gene occurred in lizards. Anole ADH1A is the ortholog of uromastyx ADH1A and clusters with other known reptilian and avian class I enzymes, while anole ADH1B is the ortholog of uromastyx ADH1B and clusters with the rest of the anole class I forms. Thus, A. carolinensis ADH1C-1H genes may have arisen from further tandem duplications of ADH1B in this organism, although the existence of additional ADH1 genes in uromastyx cannot be ruled out.
Interestingly, A. carolinensis ADH1D, ADH1E and ADH1G share the residues Gln-Arg-Ser instead of the typical class I triad Asp-Ile-Gln at positions 223-225, which interact with the adenosine moiety of the coenzyme. Similar residues are found in NADP+-dependent MDR enzymes such as Sulfolobus solfataricus glucose dehydrogenase (Gln-Arg-Arg), Xylella fastidiosa cinnamyl alcohol dehydrogenase (Thr-Arg-Ser) or Saccharomyces cerevisiae ADH6 (Ser-Arg-Ser), among others, suggesting a higher preference for this coenzyme. Given the phylogenetic distance between these ADH1 forms and amphibian class ADH8, NADP+ specificity could have arisen at least twice during vertebrate ADH evolution.
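The reasoning above can be illustrated with a minimal sketch that reads the residue at position 223 of the 223-225 triad. It is a deliberate simplification and not the method used in this work: it assumes that Asp at position 223 marks NAD+ preference and that any other residue flags a candidate NADP+-preferring form. The function name and the dictionary are hypothetical; the triads are those quoted in the text, written in one-letter code.

```python
# Triads at positions 223-225 as quoted in the text (one-letter amino acid code).
TRIADS = {
    "typical class I": "DIQ",                               # Asp-Ile-Gln
    "A. carolinensis ADH1D/ADH1E/ADH1G": "QRS",             # Gln-Arg-Ser
    "S. solfataricus glucose dehydrogenase": "QRR",         # Gln-Arg-Arg
    "X. fastidiosa cinnamyl alcohol dehydrogenase": "TRS",  # Thr-Arg-Ser
    "S. cerevisiae ADH6": "SRS",                            # Ser-Arg-Ser
}


def likely_coenzyme(triad: str) -> str:
    """Guess coenzyme preference from the 223-225 triad.

    Assumption (simplified heuristic): Asp at position 223 indicates NAD+;
    any other residue is only flagged as a candidate for NADP+ preference.
    """
    if len(triad) != 3:
        raise ValueError("expected exactly three residues (positions 223-225)")
    return "NAD+" if triad[0].upper() == "D" else "NADP+ (candidate)"


if __name__ == "__main__":
    for name, triad in TRIADS.items():
        print(f"{name}: {triad} -> {likely_coenzyme(triad)}")
```

Such a rule can only prioritize candidates; coenzyme specificity would still need to be confirmed kinetically.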

ADH2
Amphibians are the most ancient organisms that possess a class II enzyme. Therefore, the duplication event that generated ADH2 from ADH3 can be placed between the fish/tetrapod and amphibian/amniote splits, 450 to 360 Mya [68]. Moreover, X. tropicalis ADH2 already has the four-residue insertion considered the most distinctive trait of class II enzymes. The phylogenetic proximity of amphibian and avian ADH2 enzymes, in spite of the overall variability of this class, predicts similar structure and kinetic behaviour. The ostrich enzyme has been described as a mixed-class enzyme, structurally similar to mammalian class II but with class-I kinetic properties, since it is notably active with short-chain alcohol substrates such as ethanol.
Ostrich ADH2 shares 81.6%, 77.3% and 68.8% identity with turtle, frog and human ADH2, respectively. These four sequences show Ser/Thr48, Phe57, Tyr93, Leu110, Phe140, Val294, Ile/Leu309 and Phe318 in the substrate-binding site (Table 6), and also Ser115 and Ser128, which distinguish human class II from class I [69]. The residues involved in substrate interaction are almost identical in frog, turtle and ostrich ADH2. Moreover, the three enzymes show Arg47, Ser/Thr48, His51 and Ile269 at the coenzyme-binding site, the same residues found in human ADH1B1, suggesting that the primitive class II forms may share common kinetic properties with class I enzymes.
Class II can be divided into two structurally and functionally distinct subgroups [70]. The first one exhibits low activity with ethanol and comprises mouse and rat ADH2, both showing Pro47, and rabbit ADH2B, which lacks a His residue at both positions 47 and 51 (Table 6). In contrast, the second group comprises rabbit ADH2A and amphibian, reptilian and avian ADH2, all of them possessing His51; and human, marmoset and bovine ADH2, which show His47 (bovine ADH2 has His residues at both positions 47 and 51). These forms may share not only the ethanol dehydrogenase activity but also the ability to metabolize retinoids, as reported for human ADH2 [71].
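The subgrouping described above reduces to a simple test on positions 47 and 51, sketched here under stated assumptions. The function name and group labels are hypothetical, and the rule (a His residue at either position places an enzyme in the second subgroup) is only a reading of the paragraph, not a validated classifier. Residues are in one-letter code; "X" is a placeholder for a residue that is not His and is not specified in this section.

```python
def class_ii_subgroup(res47: str, res51: str) -> str:
    """Assign a class II ADH to a subgroup from residues 47 and 51.

    Assumption: His at position 47 or 51 marks the subgroup that is
    active with ethanol; absence of His at both positions marks the
    subgroup with low ethanol activity.
    """
    has_his = "H" in (res47.upper(), res51.upper())
    return "second subgroup (His47/His51)" if has_his else "first subgroup (low ethanol activity)"


if __name__ == "__main__":
    examples = {
        "frog/turtle/ostrich ADH2 (Arg47, His51)": ("R", "H"),
        "bovine ADH2 (His47, His51)": ("H", "H"),
        "rabbit ADH2B (no His at 47 or 51; placeholders)": ("X", "X"),
    }
    for name, (r47, r51) in examples.items():
        print(f"{name}: {class_ii_subgroup(r47, r51)}")
```

Mouse and rat ADH2 are not included in the example because only their residue at position 47 (Pro) is given here.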

ADH3
X. laevis and X. tropicalis ADH3 sequences show the 22 functionally important residues strictly conserved in class III enzymes from reptiles to mammals [72,73]. ADH3 is a glutathione-dependent formaldehyde dehydrogenase, as seen spectrophotometrically for the purified enzyme of R. perezi [18] and by activity staining of electrophoresed tissue homogenates from R. perezi and X. laevis. Expression of amphibian ADH3 is detected in every stage and tissue studied, although it is more abundant in some organs such as the ovary, suggesting that oocytes may store large amounts of maternal ADH3 for later use during embryonic development. The maternal origin of ADH3 mRNA has been previously described in Drosophila [74] and zebrafish [75] embryos.

ADH7
The novel reptilian genomic sequences supported the class assignment of X. tropicalis ADH7, which shares identity percentages of 71.2% and 67.4% with turtle ADH7 and anole ADH7B, respectively. As listed in Table 7, all ADH7 enzymes show Thr48, involved in the stereospecificity for secondary alcohols; a small residue such as Cys at position 93 (Pro in chicken), indicative of high Km values for ethanol and correct positioning of steroid substrates [14]; Phe140 and Leu141 (in most sequences); and similar coenzyme-binding residues. Positions 112 to 126 are almost identical, and His115 and Trp142 are common to all ADH7 enzymes. These two positions were reported to affect the conformation of the loop 112-120, widening the entrance of the substrate-binding site and conferring on ADH7 the ability to oxidize large hydrophobic alcohols [14]. Among the three forms identified in A. carolinensis, ADH7B is the most similar to turtle, chicken and frog ADH7, with identity percentages of 79.4%, 73.6% and 67.4%, respectively. Already present in amphibians, ADH7 appeared between the tetrapod/fish and the amniote/amphibian splits, 450-360 Mya [61]. On the other hand, the common position of turtle and chicken ADH7 and vertebrate ADH5-ADH6 loci within the ADH cluster [4,13,61], together with the fact that they do not coexist in any organism, suggests that the loss of ADH7 might have occurred close to the origin of ADH5 and ADH6, or even in the same genetic event.
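The identity percentages quoted for ADH7 and the other classes are pairwise comparisons of aligned protein sequences. The following minimal sketch shows one common way to compute such a value from two rows of an alignment. It assumes that columns containing a gap in either sequence are excluded from the denominator; the convention actually used in this work is not stated here, so values may differ slightly. The function name and the toy sequences are hypothetical and are not real ADH sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity between two rows of a pairwise alignment.

    Assumption: columns with a gap in either row are skipped, so the
    denominator is the number of ungapped aligned columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical): one gapped column, one mismatch.
    seq_a = "MST-GK"
    seq_b = "MSTAGR"
    print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

In the toy example, four of the five ungapped columns match, giving 80.0% identity.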

Novel ADH classes in amphibians

ADH8
A total of five members are now known from the NADP+-dependent class VIII: ADH8 from R. perezi, and ADH8A and ADH8B from both X. laevis and X. tropicalis. They show the conserved triad Gly223-Ser/Thr224-Gln/His225 that interacts with the 2′-phosphate of the adenosine moiety of NADP+ [19]. Despite their different substrate-binding sites, especially concerning large hydrophobic residues, both X. laevis ADH8B and R. perezi ADH8 reduce retinaldehyde and medium-chain aldehydes [15,67]. One major difference is the substitution of Phe93 by Cys, which correlates with the poor ethanol-oxidizing activity of X. laevis ADH8B (Borràs et al., unpublished results). Residues Gly47, Ser48 and Ser51 would determine the proposed catalytic mechanism and proton-relay pathway of R. perezi ADH8 [21]. While Gly47 is conserved in all ADH8 members, Ser48 and Ser51, also found in ADH8A, are substituted by Thr48 and Ala51 in ADH8B enzymes. Remarkably, several deletions exist in ADH8 with respect to amphibian class I. The first one is found at position 57 of R. perezi ADH8 and both Xenopus ADH8B enzymes, and it may account to some extent for the wide substrate pocket observed in R. perezi ADH8 [21]. The second one, at position 167, is common to all ADH8 sequences, and the third, located at position 186, is only found in Xenopus ADH8B enzymes. Regarding coenzyme-interacting positions, the presence of Gly47 and the lack of a typical hydrophobic residue at position 269, usually Val or Ile, suggest a lower affinity for the coenzyme and increased kcat values for ADH8 enzymes. The adult expression pattern of ADH8 includes the gastrointestinal tract and skin, where it may participate in cell differentiation through regulation of retinoic acid levels, acting as a retinaldehyde reductase [15]. In embryos, ADH8 expression is also observed after neurulation, although EST evidence is too scarce to suggest a possible function (Table 3).
ADH8A and ADH8B promoters are closely related, both exhibiting a putative CCAAT box, and common HNF-3beta, XFD and GATA-binding sites.

ADH9
X. laevis and X. tropicalis ADH9 are the only members known from this novel class, and neither has been characterized at the protein level. These enzymes share less than 60% amino acid identity with any other ADH, and show many special sequence features, such as Cys93, Met57, Phe110, Val116, Met141, and Phe318, together with His residues at both positions 47 and 51. His51 indicates the same proton-relay pathway as in class I enzymes, while His47 suggests rapid coenzyme dissociation. Asp223 at the cofactor-binding site indicates NAD+ dependence. Cys93, Phe110, Met141, and Phe318 predict a substrate pocket enlarged at its inner part, but narrow and hydrophobic at the middle and the entrance, similar to that of ADH8. Substrate preferences may be for large substrates rather than ethanol, but probably not steroids. Northern blot studies detect ADH9 transcripts in adult stomach, esophagus and skin, while ruling out expression of the enzyme at embryonic stages (Figure 1 and [22]). Colocalization with ADH8, in spite of their sequence divergence and different cofactor specificity, may be due to the common regulatory elements found in their gene promoters, and it is consistent with the adjacent chromosomal location of their genes. ADH9 was initially described as an ADH4-like form in X. laevis, likely because of its tissue localization [22]. Based on phylogenetic analyses and class-specific sequence signatures, it is now clear that ADH9 constitutes a separate class and thus ADH4 is not present in amphibians. The absence of ADH4 forms in reptiles and birds, and its presence in marsupials [13], support its emergence at the origin of mammals (310 Mya).

Figure 5. Hypothetical evolutionary pathways leading to tetrapod ADH multiplicity. At the base of vertebrate radiation, an initial tandem duplication of the ancestral ADH3 led to a two-gene cluster. Actinopterygia (ray-finned fish) and sarcopterygia (lobe-finned fish and tetrapods) acquired ADH1 activity by the most 5′ member of the cluster [4]. Before the amniota/amphibian split (360 Mya), ADH2 and ADH7 would have arisen in tetrapods as a consequence of gene duplication events. In reptiles and birds, no additional ADH classes have been found. In contrast, ADH1 tandem duplications led to further class multiplicity in the amphibian lineage; thus, ADH8, ADH9, and more recently ADH10 forms would derive from the ancestral ADH1. Close to the origin of mammals, ADH7 was lost while gene duplications generated ADH5 and ADH6, and tandem duplication of ADH1 gave rise to ADH4. Only in primates, ADH6 was lost simultaneously or close to ADH1 duplications generating ADH1A-C isozymes [13]. Likewise, additional duplications occurred in other vertebrate lineages, and those ADH genes leading to isoenzyme multiplicity in at least one member of that lineage are underlined (in reptiles, multiple ADH1 and ADH7 are found in lizards, but not in turtles). In some organisms, ADH pseudogenes are also observed.

ADH10
Two isoenzymatic forms occur in this class, ADH10A and ADH10B, which are closely related to ADH1. Common residues of ADH10 and ADH1 enzymes are Ser48, Phe140, Met306 and Leu309, for substrate interaction; and Arg47, His51 and Leu363, for coenzyme binding (Table 5). Val269 and His271 are particular to ADH10. Substitution of the typical Ile269 by the smaller Val can affect the strength of coenzyme interaction, and substitution of Arg271 by His was suggested to increase kcat values in human ADH4 [76]. Three residue exchanges characterize ADH10B: Val93, Phe57 and Arg110. Val93 results in a wider bottom of the substrate pocket and has not been found in any other ADH. Phe57, present in most class II enzymes, narrows the middle region and increases its hydrophobicity, but hydrophilic Arg110 should compensate for this. ADH10 is the only class with a charged residue at position 110. Also interesting is the basic residue found at position 115 which, similarly to Arg115 in class III enzymes [73], could contribute to a substrate-binding site with a wider entrance and higher volume. A deletion at position 114 of X. tropicalis ADH10B could also participate in this rearrangement. The substrate pocket of ADH10B, with a widened entrance and inner part (Ser48 instead of Thr), could accommodate large substrates, such as steroids, provided that Phe57 is not a major steric constraint. Ser48 can also be found in horse ADH1S and human ADH1C, both able to oxidize steroids. Adrenal and gonadal steroids are therefore proposed as ADH10 substrates, in agreement with its predominant expression in Xenopus mesonephric kidney and testis, and the presence of a putative estrogen receptor site in the ADH10B promoter.

Conclusions
In conclusion, the complex Xenopus ADH system is composed of the vertebrate classes ADH1, ADH2, ADH3 and ADH7, together with the novel class I-derived enzymes ADH8, ADH9 and ADH10, exclusively found in amphibians. ADH4 is not present in amphibians and reptiles. The study of the ancient forms of ADH2 and ADH7, also found in reptiles, led to significant conclusions about the evolutionary history of the ADH family (Figure 5), whereas the special features shown by the novel forms described herein point to the acquisition of new functions following the expansion of the ADH gene family that occurred in amphibians.