Comparative phylogenomic analyses of teleost fish Hox gene clusters: lessons from the cichlid fish Astatotilapia burtoni: comment
BMC Genomics volume 9, Article number: 35 (2008)
A reanalysis of the sequences reported by Hoegg et al has highlighted the presence of a putative HoxC1a gene in Astatotilapia burtoni. We discuss the evolutionary history of the HoxC1a gene in the teleost fish lineages and suggest that the HoxC1a gene was lost twice independently in the Neoteleosts. This comment points out that combining several gene-finding methods and a Hox-dedicated program can improve the identification of Hox genes.
The identification of individual Hox genes is an essential basis for their study in evolutionary research fields. It is even more important in teleost fish, where the unravelling of zebrafish and pufferfish Hox clusters has contributed to the establishment of the fish-specific genome duplication hypothesis. In a recent study, Hoegg et al present the Hox gene content of the cichlid fish Astatotilapia burtoni, characterized from the complete sequence of the seven Hox clusters. Availability of these sequences is extremely valuable to help the community better understand the evolutionary plasticity of Hox genes and their regulatory elements in teleosts. A total of 46 Hox coding sequences have been identified, using common gene detection methods relying on sequence similarity. Identification of Hox multigenic family members is hampered by the high conservation of the homeodomain. We believe that a set of complementary approaches is thus required to correctly annotate all members of the Hox family.
Here, we complement the analysis of the Hox gene content of Astatotilapia burtoni reported by Hoegg et al, using a combination of sequence similarity methods, de-novo gene predictions and a program we have developed that specifically classifies Hox proteins in their homology groups. In addition, we collect a comprehensive set of HoxC1a sequences that allows us to re-investigate the presence of HoxC1a pseudogenes in various teleost species. In light of our findings, we discuss a revised version of HoxC1a gene loss events in teleost lineages.
Results and Discussion
HoxC1a gene detection
Astatotilapia burtoni Hox cluster genomic sequences were collected from GenBank and submitted to the de-novo gene prediction program GENSCAN. A total of 102 putative coding sequences were localized on the genomic sequences. We applied HoxPred to each putative peptide, and 37 sequences were predicted as Hox proteins. We have also applied HoxPred to the protein sequences detected by Hoegg et al, and the resulting classification in homology groups is concordant with their result in both cases. We observed that GENSCAN predictions sometimes encompass two or three genes in a single predicted peptide. Used alone, this method thus detects fewer Hox genes than reported by Hoegg et al.
With this method, we have located a paralogous group (PG) 1 prediction on the HoxCa cluster. A detailed analysis of the predicted gene shows that GENSCAN, and other de-novo gene prediction programs, fail to correctly predict the C-terminal portion of its homeodomain. Alignment of the zebrafish HoxC1a peptide to the HoxCa cluster with GeneWise (global mode) supports the prediction of the first exon and completes the homeodomain sequence. The resulting putative gene comprises two exons and one intron and encodes a peptide of 295 residues (Additional file 1). Together with its genomic position downstream of HoxC3a (Additional file 2), this strongly suggests a potential HoxC1a. Expression data would be needed to support this prediction.
HoxPred has previously been applied to the proteome of the teleost fish Gasterosteus aculeatus to characterise its Hox gene content. We have shown that this fish comprises a putative HoxC1a gene partially supported by EST evidence. We performed pairwise alignments between full-length HoxC1a protein sequences detected in teleosts. The HoxC1a proteins of the Neoteleosts A. burtoni and G. aculeatus are very similar, with 66% identity. In contrast, A. burtoni HoxC1a shares only 30% identity with the HoxC1a of the Ostariophysi zebrafish, for the most part in the homeodomain.
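The identity figures above come from pairwise protein alignments. As a minimal sketch of how such a percentage can be derived from an already aligned pair, the Python function below counts identical residues over the columns where both sequences carry a residue. The choice of denominator (ungapped columns only) is an assumption, since the comment does not state which convention was used, and the two toy fragments are invented for illustration and are not real HoxC1a sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # gap column: excluded from the denominator (assumed convention)
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy aligned fragments, invented for illustration only.
toy_a = "MKR-ARTAYT"
toy_b = "MKRQSRTSYT"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values when the alignment contains many gaps.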
Phylogenetic analyses in paralogous group 1
Phylogenetic reconstructions of PG1 homeodomains from teleost species were conducted as described previously, with the addition of A. burtoni and Fundulus heteroclitus sequences. The HoxC1a sequence recently reported in the Ostariophysi Megalobrama amblycephala was not included, as the PCR fragment does not comprise the homeodomain.
Phylogenetic analyses confirm that Astatotilapia burtoni comprises a putative HoxC1a gene (Figure 1). In light of the novel HoxC1a sequences, these phylogenetic tree reconstructions refine the analysis of the F. heteroclitus PG1 PCR fragments and provide additional evidence to confirm the classification proposed previously.
Fishing HoxC1a pseudogenes out of teleost genomes
The Neoteleost HoxC1a genes we have identified provide a more comprehensive set of sequences that can be used to investigate HoxC1a pseudogenes in teleost sequences. We have analysed the region downstream of HoxC3a in the medaka Oryzias latipes in search of a putative HoxC1a gene. The EnsEMBL GENSCAN prediction that spans this region did not return any potential Hox gene with HoxPred. GeneWise alignments of the O. latipes genomic sequence with the G. aculeatus and A. burtoni HoxC1a proteins nevertheless highlight a genomic region in O. latipes corresponding to both the N- and C-terminal portions of the proteins (see Additional file 2 for genomic positions). As the homeodomain is highly degenerate and contains a frameshift mutation, we argue in favor of a HoxC1a pseudogene in medaka.
We performed similar analyses on the genomic sequences of pufferfishes (Takifugu rubripes and Tetraodon nigroviridis) and observed imprints of HoxC1a pseudogenes in both cases. For T. rubripes, this finding is in agreement with the HoxC1a pseudogene described previously. For T. nigroviridis, no HoxC1a pseudogene has been reported yet, and a previous attempt to identify a functional HoxC1a gene was unsuccessful.
We have constructed mVista plots, as performed by Hoegg et al, to highlight conserved non-coding sequences downstream of HoxC3a with a comparative genomic approach (Additional file 3). As previously noted by Hoegg et al, we observe a high similarity between G. aculeatus and A. burtoni. This plot also shows a high similarity between the Neoteleost pseudogenes and putative genes, whereas the zebrafish HoxC1a sequence is clearly less similar.
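The frameshift argument can be illustrated on a toy example. The sketch below is a deliberate simplification and not the GeneWise protein-to-genome procedure used in this comment: it assumes a nucleotide alignment between a reference coding sequence and a candidate genomic region is already available, and it reports every gap run whose length is not a multiple of three, since such indels shift the reading frame. The function name, the event labels and the toy sequences are all hypothetical.

```python
import re


def frameshift_indels(aligned_ref: str, aligned_query: str):
    """List indels that break the reading frame in a pairwise nucleotide alignment.

    Returns (kind, start_column, length) tuples, where kind is "deletion" when
    the gap lies in the query and "insertion" when the gap lies in the reference.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have the same length")
    events = []
    for kind, seq in (("insertion", aligned_ref), ("deletion", aligned_query)):
        for match in re.finditer(r"-+", seq):
            length = match.end() - match.start()
            if length % 3 != 0:  # gaps of 3, 6, ... bases keep the frame intact
                events.append((kind, match.start(), length))
    return sorted(events, key=lambda event: event[1])


# Toy sequences, invented for illustration only.
toy_ref = "ATGAAACGCGCCCGTACC"
toy_shifted = "ATGAAA-GCGCCCGTACC"  # one base missing: frame is shifted
toy_inframe = "ATG---CGCGCCCGTACC"  # one whole codon missing: frame is kept

print(frameshift_indels(toy_ref, toy_shifted))
print(frameshift_indels(toy_ref, toy_inframe))
```

A frameshift alone does not prove pseudogenisation, which is why the text also relies on the degenerate state of the homeodomain.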
HoxC1a gene loss in the teleost Hox clusters
Based on the sole presence of the HoxC1a gene in zebrafish, Hoegg et al suggest that HoxC1a has been lost once in the lineage leading to Neoteleosts (as illustrated in figure 3 of their article). The presence of this gene in both G. aculeatus and A. burtoni rejects this hypothesis. Figure 2 is a comprehensive overview of the current HoxC1a set of orthologs in teleosts, according to our results based on publicly available data. We have mapped HoxC1a gene loss events on a previously reported phylogeny. This mapping indicates that HoxC1a has been lost independently among Neoteleosts, in both lineages leading to O. latipes and to the pufferfishes. Whether an additional HoxC1a gene loss has occurred in the lineage leading to the cichlid fish Oreochromis niloticus remains to be investigated.
A. burtoni was reported to contain 46 Hox genes. We have complemented the Hox gene content of this fish with a putative HoxC1a gene. Combined with the detection of HoxC1a orthologs in G. aculeatus and F. heteroclitus, we introduce here a more comprehensive set of HoxC1a genes in teleosts. These Neoteleost genes facilitate the investigation of pseudogenes in O. latipes and pufferfishes in comparison with the more distant zebrafish ortholog. We report two novel HoxC1a pseudogenes, in O. latipes and T. nigroviridis respectively. In addition, this case study illustrates the annotation challenge posed by the Hox multigenic family. We have shown that Hox identification can be improved by combining several gene-finding methods and a Hox-dedicated program.
This comment has hopefully given new insights into the gene loss events presented by Hoegg et al with regard to HoxC1a. Our results modify their conclusions and rule out the hypothesis of a unique HoxC1a gene loss event in the lineage leading to Neoteleosts. We propose that HoxC1a was independently lost in the lineage leading to O. latipes and in the lineage leading to pufferfishes. Our findings do not affect other aspects of the Hoegg et al study, especially the fact that each teleost species studied so far contains a different Hox gene set. Rather, we believe that this contribution reinforces their conclusions about non-essential Hox genes that can be easily and repeatedly lost, like HoxB7a or HoxC1a.
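The reasoning behind the number of loss events can be sketched as a small parsimony count. The Python example below assumes that the gene was present in the common ancestor and can be lost but never regained, and counts the minimum number of losses needed to explain a presence/absence pattern on a rooted tree. The tree topology is an assumption chosen for illustration only and is not taken from the phylogeny cited in this comment. The presence/absence pattern follows the text.

```python
def count_losses(tree, present):
    """Minimum number of gene losses on a rooted tree.

    Assumes the gene was present at the root and is never regained.
    `tree` is a nested tuple of leaf names; `present` is the set of leaves
    that still carry the gene.
    """
    def visit(node):
        # Returns (gene found somewhere in this clade, losses inside it).
        if isinstance(node, str):
            return node in present, 0
        results = [visit(child) for child in node]
        if not any(found for found, _ in results):
            return False, 0  # whole clade lacks the gene: counted one level up
        losses = sum(count for _, count in results)
        losses += sum(1 for found, _ in results if not found)
        return True, losses

    found, losses = visit(tree)
    return losses if found else 1  # gene absent everywhere: one loss at the root


# Hypothetical topology, for illustration only.
toy_tree = (
    "zebrafish",
    (
        ("A. burtoni", ("O. latipes", "F. heteroclitus")),
        ("G. aculeatus", "pufferfishes"),
    ),
)

# Pattern described in the text: two independent losses are required.
with_orthologs = {"zebrafish", "A. burtoni", "F. heteroclitus", "G. aculeatus"}
print(count_losses(toy_tree, with_orthologs))

# Pattern with zebrafish as the only carrier: a single loss is sufficient.
print(count_losses(toy_tree, {"zebrafish"}))
```

Under these assumptions, the result depends only on whether the species lacking the gene form a single clade, which is why finding HoxC1a in A. burtoni and G. aculeatus changes the inferred number of losses.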
Hoegg S, Boore J, Kuehl J, Meyer A: Comparative phylogenomic analyses of teleost fish Hox gene clusters: lessons from the cichlid fish Astatotilapia burtoni. BMC Genomics. 2007, 8: 317. 10.1186/1471-2164-8-317.
Thomas-Chollier M, Leyns L, Ledent V: HoxPred: automated classification of Hox proteins using combinations of generalised profiles. BMC Bioinformatics. 2007, 8: 247. 10.1186/1471-2105-8-247.
Burge C, Karlin S: Prediction of complete gene structures in human genomic DNA. J Mol Biol. 1997, 268: 78-94. 10.1006/jmbi.1997.0951.
Birney E, Clamp M, Durbin R: GeneWise and Genomewise. Genome Res. 2004, 14 (5): 988-995. 10.1101/gr.1865504.
Zou SM, Jiang XY, He ZZ, Yuan J, Yuan XN, Li SF: Hox gene clusters in blunt snout bream, Megalobrama amblycephala and comparison with those of zebrafish, fugu and medaka genomes. Gene. 2007, 400 (1–2): 60-70. 10.1016/j.gene.2007.05.021.
Misof BY, Wagner GP: Evidence for four Hox clusters in the killifish Fundulus heteroclitus (teleostei). Mol Phylogenet Evol. 1996, 5 (2): 309-322. 10.1006/mpev.1996.0026.
Prohaska SJ, Stadler PF: The duplication of the Hox gene clusters in teleost fishes. Theory in Biosciences. 2004, 123: 89-110. 10.1016/j.thbio.2004.03.004.
Hubbard TJP, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Coates G, Cunningham F, Cutts T, Down T, Dyer SC, Fitzgerald S, Fernandez-Banet J, Graf S, Haider S, Hammond M, Herrero J, Holland R, Howe K, Howe K, Johnson N, Kahari A, Keefe D, Kokocinski F, Kulesha E, Lawson D, Longden I, Melsopp C, Megy K, Meidl P, Ouverdin B, Parker A, Prlic A, Rice S, Rios D, Schuster M, Sealy I, Severin J, Slater G, Smedley D, Spudich G, Trevanion S, Vilella A, Vogel J, White S, Wood M, Cox T, Curwen V, Durbin R, Fernandez-Suarez XM, Flicek P, Kasprzyk A, Proctor G, Searle S, Smith J, Ureta-Vidal A, Birney E: Ensembl 2007. Nucleic Acids Res. 2007, 35 (Database issue): D610-7. 10.1093/nar/gkl996.
Aparicio S, Hawker K, Cottage A, Mikawa Y, Zuo L, Venkatesh B, Chen E, Krumlauf R, Brenner S: Organization of the Fugu rubripes Hox clusters: evidence for continuing evolution of vertebrate Hox complexes. Nat Genet. 1997, 16: 79-83. 10.1038/ng0597-79.
Mayor C, Brudno M, Schwartz JR, Poliakov A, Rubin EM, Frazer KA, Pachter LS, Dubchak I: VISTA: visualizing global DNA sequence alignments of arbitrary length. Bioinformatics. 2000, 16 (11): 1046-1047. 10.1093/bioinformatics/16.11.1046.
Huelsenbeck JP, Ronquist F: MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics. 2001, 17 (8): 754-755. 10.1093/bioinformatics/17.8.754.
Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003, 19 (12): 1572-1574. 10.1093/bioinformatics/btg180.
Guindon S, Gascuel O: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst Biol. 2003, 52 (5): 696-704. 10.1080/10635150390235520.
This work was supported by the Vrije Universiteit Brussel (Geconcerteerde Onderzoeksactie 29) (MT-C) and the Belgian Science Policy (VL). We thank Simone Hoegg for kindly inviting us to reanalyse Astatotilapia burtoni sequences and Axel Meyer for welcoming this comment. MT-C gratefully thanks Luc Leyns and Jacques van Helden for their support, as well as Michel Vervoort for fruitful discussions and Olivier Sand for critical reading of the manuscript.
MT-C conceived the study, performed the sequence analyses and drafted the manuscript. VL performed the phylogenetic analyses and participated in the editing of the manuscript. All the authors read and approved the final manuscript.
Electronic supplementary material
Additional file 1: Putative HoxC1a protein sequences. The data provided represent the HoxC1a protein sequences predicted in selected teleosts, in Fasta format. (FA 1 KB)
Additional file 2: Genomic coordinates of putative HoxC1a genes and pseudogenes. The data provided represent the genomic positions of the predicted HoxC1a genes and pseudogenes in selected teleosts. (XLS 39 KB)
Additional file 3: mVista plot downstream of the HoxC3a gene. Figure showing the evolutionary conserved regions downstream of the HoxC3a gene, in selected teleosts. (JPEG 90 KB)
Cite this article
Thomas-Chollier, M., Ledent, V. Comparative phylogenomic analyses of teleost fish Hox gene clusters: lessons from the cichlid fish Astatotilapia burtoni: comment. BMC Genomics 9, 35 (2008). https://doi.org/10.1186/1471-2164-9-35