Transcriptome sequencing based annotation and homologous evidence based scaffolding of Anguilla japonica draft genome
© Liu et al. 2015
Published: 11 January 2016
Anguilla japonica (Japanese eel) is currently one of the most important research subjects in eastern Asian aquaculture. The enigmatic life cycle of the organism has severely limited the study of its artificial reproduction. Genomic and transcriptomic resources for eels are therefore urgently needed to help solve the problems surrounding this organism across multiple fields. We hereby provide a reconstructed transcriptome from deep sequencing of whole-body samples of juveniles (glass eels). The resulting expressed sequence tags were used to annotate the currently available draft genome sequence. Homologous information derived from the annotation was then applied to improve the grouping of scaffolds into the available linkage groups.
With the transcriptome sequence data combined with publicly available expressed sequence tag evidence, 18,121 genes were structurally and functionally annotated on the draft genome. Among them, 3,921 genes were located in the 19 linkage groups. In addition to the original partial linkage groups, 137 scaffolds covering 13 million bases were grouped into the linkage groups, increasing the linkage group coverage from 13 to 14 %.
This annotation provides information on the coding regions of the genes, supported by transcriptome-based evidence. The derived homologous evidence paves the way for phylogenetic analysis of important genetic traits and for improvement of the genome assembly.
The abundance of the Japanese eel, as well as of other freshwater eels belonging to the genus Anguilla, has been shrinking radically over the past decade. The enigmatic life cycle of catadromous eels makes their reproduction deeply susceptible to anthropogenic impacts. The lack of an economical method of artificial reproduction makes this organism extremely vulnerable to overconsumption. Mature eels migrate thousands of kilometers into the open ocean to spawn. The exact spawning locations of Japanese eels were hard to pinpoint and remained unknown until recently, when they were found near the Western Mariana Ridge. Research is further hindered by the fact that the eggs and larvae of Anguilla japonica are spread by the Kuroshio Current, so that their habitat spans a wide area of eastern Asia. The cylindrical larvae develop into transparent leptocephalus larvae, which eventually metamorphose into glass eels. Glass eels migrate back into fresh water through estuaries, sometimes traveling through wet sand into the continental interior, where they spend years undergoing pigmentation into yellow eels and then silver eels. Such a wide habitat potentially allows the effects of pollution and disease to accumulate. The long life cycle and the habit of spawning through migration make it hard for wild Anguilla eels to recover from the damage caused by overfishing.
A genetic linkage map of the Japanese eel was constructed in 2011. High-throughput sequencing technology had rarely been applied to this problem before. However, as advances in sequencing technologies brought down the cost and time of DNA and RNA sequencing, and as freshwater eels approached extinction, the field began to change. In 2010, deep mRNA sequencing and de novo transcriptome reconstruction of the European glass eel were reported, as were the hox genes of the species; two years later, a draft genome sequence of the European eel was also published. Research integrating genomic and transcriptomic information from deep sequencing should have a major impact in multiple fields. Transcriptome expression profiling of European eels in response to environmental pollution was reported in 2012. The first draft genome of the Japanese eel was then assembled, showing that the hox genes and genomic distances of European and Japanese eels are conserved. By 2014, a ddRAD-based linkage map had been published, providing 13 % coverage of the draft genome. Such results leave considerable room for improvement.
Here, we provide a reconstructed transcriptome from whole-body samples of juveniles (glass eels). High-throughput sequencing provides an unprecedented amount of transcriptomic information. Instead of focusing only on certain tissues or organs, the full transcriptome of the entire organism was sequenced. Such a transcriptome can provide additional guidance for future experimental design in ecological, physiological, artificial breeding and even toxicity resistance studies of the Japanese eel. Moreover, the massive amount of evidence provided by the transcriptome helps to complete the structural annotation of the draft genome. Combining the transcriptome sequence data with publicly available expressed sequence tag evidence, 18,121 genes were structurally and functionally annotated on the draft genome. The structural annotation was performed with an established pipeline, MAKER. Functional annotation was based on sequence alignment. The acquired homologous evidence was further used to improve the scaffolding of the draft genome. Applying an improved version of the scaffolding algorithm developed by Aganezov et al., synteny of Anguilla japonica was compared with the genomes of fugu, stickleback, medaka, Tetraodon, coelacanth and zebrafish. The results were integrated with the previously published linkage map, placing 3,921 genes into the 19 linkage groups, which represent the chromosomes of Anguilla japonica. In addition to the original partial linkage groups, 137 scaffolds were grouped into the linkage groups. Phylogenetic analyses of the gene clusters related to thyroid hormone receptors and pigmentation were performed with MEGA 6.0.
Results and discussion
Sequencing on the Illumina HiSeq™ 2000 generated a total of 85,233,812 reads with a length of 101 nucleotides. After quality control, in which low-quality reads were trimmed, 77,939,562 reads remained, with an average length of 99.575 nucleotides (Additional file 1: Figure S1). Quality control of the sequence reads is summarized in Additional file 1: Figure S2 and Additional file 1: Table S3. The assembly was assessed through the average length of unigenes, as well as the quality scores N50 and N90. As shown in Additional file 1: Table S2, the average length, N50 and N90 of the clustered unigenes are considerably higher than those obtained from any single de novo assembly tool. The composition of the assembled unigenes shown in Additional file 1: Table S3 demonstrates that the clustered unigenes tend to contain a higher proportion of longer sequences. Hence, the assemblies generated through clustering were considered more accurate and were used for further annotation. In Additional file 1: Figure S2, we show that expression levels, measured as FPKM (fragments per kilobase of transcript per million mapped reads), are distributed across assemblies of all lengths.
A summary of the functional annotation is given in Additional file 1: Table S4. Of the 32,210 assembled unigenes in total, 16,106 aligned to known proteins in the NCBI non-redundant protein database. In addition, 10,848 of the unigenes were found to contain functional domains through RPSBLAST against the NCBI Conserved Domain Database, 5,641 transcripts were found to be involved in systems biology pathways in the Kyoto Encyclopedia of Genes and Genomes (KEGG), and 13,434 transcripts were annotated with Gene Ontology terms. Up to the top 5 BLAST hits per query were considered in the process. The distribution of homologs across other organisms is illustrated in Additional file 1: Figure S3.
The expression level of the transcripts was examined together with the frequency with which they were assigned to particular Gene Ontology terms. As demonstrated in Additional file 1: Figure S4, the expression level of transcripts related to enzyme regulator activity exceeded the overall average expression level, even though only four such transcripts were found. The distribution of assembled transcripts across KEGG pathway categories (Additional file 1: Figures S5 and S6) was also examined alongside the average expression level. As shown in Additional file 1: Figure S5, although only a few transcripts were found in some pathway categories, such as Circulatory system and Reaction module maps, the average expression level of transcripts within these categories suggests that these pathways are potentially highly active.
Protein functional domains found on the assembled transcripts were also examined in terms of their distribution alongside expression level, as shown in Additional file 1: Figures S7 and S8. Protein functional domains with extreme expression levels can provide guidance for future protein-protein interaction studies.
The reproduction of catadromous eels is limited by their long life cycle and migratory spawning. To produce a sufficient supply without consuming wild glass eels, it is essential to develop technology that shortens the time eels need to mature and that artificially induces the spawning of healthy larvae. Hence, revealing the mechanisms of metamorphosis from leptocephalus larvae into glass eels, as well as of maturation from glass eels into mature silver eels, is the key to successful artificial reproduction that can supply commercial demand and keep wild eels from extinction.
In the past, transcriptomic studies of eels relied mainly on classical molecular biology methods such as cloning and protein purification, which provide only a partial view of the transcripts. To fully capture all protein-coding transcripts, a combination of next-generation sequencing and new transcript assembly algorithms is necessary. In 2014, a study of the mRNA expression profiles of prolactin, growth hormone and somatolactin in the Japanese eel through RT-PCR was reported. However, research utilizing genome information of the Japanese eel, and the resources available for experimental design, are still limited. Meanwhile, hybrids of European and American eels have been found to occur naturally in Iceland, and the import of European glass eels into East Asia could trigger interspecific hybridization of Anguilla eels, adding further anthropogenic impacts on this species, which is already near extinction. The proven possibility of hybrid reproduction, as well as the successful artificial hybridization of European and Japanese eels, also brings new possibilities for artificial reproduction. However, transcriptomic information is still limited. In 2013, the first transcriptomic study through 454 deep sequencing was performed on the gill of Anguilla japonica. The use of proteomic approaches and transcriptomic sequencing gave insights into the osmoregulation mechanism, providing a transcriptomic view of the catadromous behavior of Anguilla japonica. However, that study was not correlated with the currently available draft genome.
Regarding the sexual differentiation mechanism of eels, several surveys have been conducted on ovarian steroidogenesis. The expression levels of several genes were also found to be related to ovarian development. Although artificial reproduction of glass eels has been attempted since the 1930s in Europe, it was not until 2003 that the first successful artificially induced spawning of the Japanese eel was achieved, through injection of salmon pituitary extracts into female eels and human chorionic gonadotropin into male eels. An unpublished successful F2 generation was claimed in 2011. However, current technology is not sufficient for large-scale reproduction, and the mortality of artificially cultured eels is still high.
Under current circumstances, mainstream studies of Japanese eels should focus on how to improve the life cycle of eels in an artificial environment to meet existing demand. Such studies need to take many kinds of mechanisms into consideration. The metamorphosis, pigmentation and sexual differentiation mechanisms of eels are all deeply correlated with their catadromous spawning activities, especially the metamorphosis mechanisms, which include reorganization of the entire body plan. A complete genome structure and transcriptome are essential for future study.
To serve this purpose, we provide the first complete transcriptome of glass eels. The application of deep sequencing provides information not only on homologs but also on potential novel genes of Japanese eels. Clustering the de novo assembled transcripts from different tools through their overlaps successfully increased the assembly quality. The distribution of the assembled transcripts across species, GO terms, systems biology pathways and protein functional domains was examined and presented. In addition, we provide expression levels alongside these distributions. This presentation reveals otherwise hidden information about pathways with few genes but extreme expression levels.
While such a full transcriptome from the whole body proved effective for functional annotation, the scope of this annotation is still limited by the sample. Since the messenger RNAs were isolated from glass eels before sexual differentiation, certain hormones from sex-specific tissues of silver eels, such as the ovaries and testes, cannot be found in such samples. Likewise, satellite sequences and mRNAs from eggs and larvae might not necessarily be expressed in our samples. Although a portion of this information is available from NCBI and was used in our annotation, these factors could still limit the completeness of the annotation.
Materials and methods
Total RNA isolated from five glass eels was prepared for RNA sequencing. Libraries for RNA-Seq were sequenced on the Illumina HiSeq™ 2000 following the manufacturer's manual. Paired-end libraries were sequenced as 2 × 101-nucleotide reads, with 120-nucleotide adaptors. The entire fragment length was 357 nucleotides. Base calling and image analysis were done following the Illumina standard pipeline. Raw reads went through a quality control procedure using FASTX-Toolkit (FASTQ/A short-reads pre-processing tools), in which only reads with Phred quality scores above 20 were retained, meaning that only reads with a per-base accuracy above 99 % were kept. In addition, low-quality reads were trimmed down to a length of 70 nucleotides. The transcriptome was first reconstructed through de novo assembly. To achieve maximum accuracy, we applied three different mainstream de novo assembly tools: Trinity, Oases and SOAPdenovo-Trans. The quality-controlled reads were assembled into three separate sets of contigs, one with each tool. Trinity was applied with default settings, while Oases and SOAPdenovo-Trans were applied with a multiple-k-mer strategy. To eliminate overlapping contigs, we clustered the three sets of contigs with CD-HIT-EST into three sets of unigenes. Finally, we clustered unigenes with high similarity together using CAP3. The quality of the assembly was estimated mainly through the average length of unigenes, as well as the quality scores N50 and N90. N50 is the length of the shortest unigene in the set of longest unigenes that together account for at least half of the total length of all unigenes, while N90 is the corresponding length for the set accounting for at least ninety percent of the total length. The maximum and minimum lengths of the assembled unigenes also serve as indices for the assessment.
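As an illustration of the N50 and N90 statistics used to assess the assembly, the following minimal Python sketch computes them from a list of unigene lengths. It follows the conventional definition (sort lengths from longest to shortest and report the length at which the running total first reaches the chosen fraction of all assembled bases). The function name `nx` and the toy lengths are hypothetical and are not taken from the study.

```python
def nx(lengths, fraction):
    """Return the Nx statistic for a collection of sequence lengths.

    The result is the length L such that sequences of length >= L
    together cover at least `fraction` of the total assembled bases
    (fraction=0.5 gives N50, fraction=0.9 gives N90).
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    return min(lengths)  # safeguard; not reached for valid input


# Toy unigene lengths (made up for illustration only).
toy_lengths = [100, 200, 300, 400, 500]
n50 = nx(toy_lengths, 0.5)
n90 = nx(toy_lengths, 0.9)
print("N50:", n50, "N90:", n90)
```

Because N90 requires more of the total length to be covered, it is never larger than N50 for the same set of unigenes, which is why the two values are usually reported together.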
The abundance of the assembled unigenes was estimated with the RSEM pipeline. Transcript quantities were expressed as FPKM values. FPKM (fragments per kilobase of transcript per million mapped reads) was calculated by aligning reads onto the assembled transcripts with Bowtie. A de novo repeat library of Anguilla japonica was built from the draft genome with RepeatScout. Then, the de novo assembly of the RNA-Seq data was pooled with the complete and partial CDS, ESTs and the previously published gill RNA-seq assembly of Anguilla japonica from NCBI as EST evidence. The EST evidence includes the publicly available RNA-seq data sets SRX482728, SRX247092 and SRX115953 and the whole-body transcriptome sequence data. Together with the known proteins from NCBI, genome structural annotation was performed with the MAKER pipeline. The pipeline first masked the repeat sequences according to the previously built library with RepeatMasker (http://repeatmasker.org), then performed ab initio prediction through repeated training of SNAP, and polished the results with Exonerate.
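For readers unfamiliar with the FPKM unit, the sketch below shows the basic normalization in Python: the fragment count of a transcript is divided by the transcript length in kilobases and by the total number of mapped fragments in millions. This is a simplified illustration only. RSEM derives its estimates from a statistical model, so the code is not a reimplementation of that pipeline, and the function name, unigene names and counts are hypothetical.

```python
def fpkm(fragment_count, transcript_length, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    fragment_count          fragments assigned to this transcript
    transcript_length       transcript length in bases
    total_mapped_fragments  all mapped fragments in the library
    """
    if transcript_length <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragment_count * 1e9 / (transcript_length * total_mapped_fragments)


# Hypothetical example: two unigenes with equal counts but different lengths.
library_size = 1_000_000
toy_unigenes = {
    "unigene_a": (500, 2000),  # (fragment count, length in bases)
    "unigene_b": (500, 1000),
}
toy_fpkm = {
    name: fpkm(count, length, library_size)
    for name, (count, length) in toy_unigenes.items()
}
print(toy_fpkm)
```

In this toy case the shorter unigene receives twice the FPKM of the longer one, which illustrates why length normalization matters when comparing expression across assemblies of different lengths.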
To identify the proteins encoded by the assembled transcripts, unigenes were searched against the NCBI non-redundant protein database, TrEMBL and Swiss-Prot with BLASTX. Hits with an E-value lower than 10⁻⁵, further filtered by a penalty estimated from the credibility of the protein, were considered homologs. Next, the available Gene Ontology terms were retrieved and listed. In addition, potential conserved protein domains were identified through RPSBLAST against Pfam and NCBI COG. To help future systems biology analysis, available KEGG pathways were also annotated.
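The E-value filtering step can be sketched as follows, assuming the BLASTX results were saved in the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start and end, subject start and end, E-value, bit score). The sketch keeps hits below the 10⁻⁵ cutoff and at most five hits per query, mirroring the limit of five hits per query mentioned in the Results. The credibility-based penalty filter is not reproduced because its details are not given, and the function name and toy records are hypothetical.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-5    # threshold stated in the text
MAX_HITS_PER_QUERY = 5   # up to five hits per query were considered


def filter_hits(tabular_lines, cutoff=E_VALUE_CUTOFF, max_hits=MAX_HITS_PER_QUERY):
    """Keep the best hits per query from 12-column tabular BLAST output."""
    hits = defaultdict(list)
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue < cutoff:
            hits[query].append((subject, evalue, bitscore))
    # Rank each query's hits by ascending E-value and truncate.
    return {
        query: sorted(records, key=lambda r: r[1])[:max_hits]
        for query, records in hits.items()
    }


# Made-up records for illustration only.
toy_blast = [
    "unigene_1\tprotA\t90.0\t100\t10\t0\t1\t300\t1\t100\t1e-30\t200",
    "unigene_1\tprotB\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.5\t30",
    "unigene_2\tprotC\t35.0\t40\t26\t0\t1\t120\t1\t40\t1e-3\t25",
]
kept = filter_hits(toy_blast)
print(kept)
```

Here only the first record passes the cutoff, so `unigene_2` would remain without a homolog assignment in this toy example.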
Conclusions and prospective works
We provide a reconstructed transcriptome from whole-body samples of juveniles (glass eels). High-throughput sequencing provides an unprecedented amount of transcriptomic information. The transcriptome provides guidance for future experimental design in ecological, physiological, artificial breeding and even toxicity resistance studies of the Japanese eel. For example, the expression of specific genes shows extreme patterns in glass eels and can be further compared with that in larvae and silver eels through qPCR to allow further reevaluation of the metamorphosis and pigmentation mechanisms.
Links to a GTF file of the annotation, FASTA files of the proteins and transcripts, and the genetic linkage map are available in Additional file 1.
The authors would like to thank the Ministry of Science and Technology for financially supporting this research. We also thank the UST-UCSD International Center of Excellence in Advanced Bioengineering and Veterans General Hospitals and University System of Taiwan. This work was funded by the Ministry of Science and Technology [MOST 101-2311-B-009-005-MY3, MOST 103-2628-B-009-001-MY3, MOST104-2319-B-400-002, MOST 103-2319-B-010-002 and MOST103-3111-Y-001-027], the UST-UCSD International Center of Excellence in Advanced Bioengineering sponsored by the Ministry of Science and Technology I-RiCE Program [MOST 103-2911-I-009-101], the Veterans General Hospitals and University System of Taiwan (VGHUST) Joint Research Program [VGHUST103-G5-11-2], MOE ATU, and the funding for the open access charge: Ministry of Science and Technology of the Republic of China, Taiwan [103-2628-B-009-001-MY3] and [MOHW104-TDU-B-212-124-005]. The work of S.A. and M.A.A. was supported by the National Science Foundation under grant No. IIS-1462107.
- Minegishi Y, Henkel CV, Dirks RP, van den Thillart GEEJM: Genomics in Eels - Towards Aquaculture and Biology. Marine Biotechnol. 2012, 14 (5): 583-590. 10.1007/s10126-012-9444-5.
- Tsukamoto K, Aoyama J, Miller MJ: Migration, speciation, and the evolution of diadromy in anguillid eels. Can J Fish Aquat Sci. 2002, 59 (12): 1989-1998. 10.1139/f02-165.
- Henkel CV, Burgerhout E, de Wijze DL, Dirks RP, Minegishi Y, Jansen HJ, et al: Primitive Duplicate Hox Clusters in the European Eel's Genome. PloS One. 2012, 7 (2): e32231. 10.1371/journal.pone.0032231.
- Nomura K, Ozaki A, Morishima K, Yoshikawa Y, Tanaka H, Unuma T, et al: A genetic linkage map of the Japanese eel (Anguilla japonica) based on AFLP and microsatellite markers. Aquaculture. 2011, 310 (3–4): 329-342. 10.1016/j.aquaculture.2010.11.006.
- Coppe A, Pujolar JM, Maes GE, Larsen PF, Hansen MM, Bernatchez L, et al: Sequencing, de novo annotation and analysis of the first Anguilla anguilla transcriptome: EeelBase opens new perspectives for the study of the critically endangered european eel. BMC Genomics. 2010, 11: 635. 10.1186/1471-2164-11-635.
- Pujolar JM, Marino IA, Milan M, Coppe A, Maes GE, Capoccioni F, et al: Surviving in a toxic world: transcriptomics and gene expression profiling in response to environmental pollution in the critically endangered European eel. BMC Genomics. 2012, 13: 507. 10.1186/1471-2164-13-507.
- Henkel CV, Dirks RP, de Wijze DL, Minegishi Y, Aoyama J, Jansen HJ, et al: First draft genome sequence of the Japanese eel, Anguilla japonica. Gene. 2012, 511 (2): 195-201. 10.1016/j.gene.2012.09.064.
- Kai W, Nomura K, Fujiwara A, Nakamura Y, Yasuike M, Ojima N, et al: A ddRAD-based genetic map and its integration with the genome assembly of Japanese eel (Anguilla japonica) provides insights into genome evolution after the teleost-specific genome duplication. BMC Genomics. 2014, 15 (1): 233. 10.1186/1471-2164-15-233.
- Cantarel BL, Korf I, Robb SM, Parra G, Ross E, Moore B, et al: MAKER: an easy-to-use annotation pipeline designed for emerging model organism genomes. Genome Res. 2008, 18 (1): 188-196. 10.1101/gr.6743907.
- Aganezov S, Sitdykova N, Alekseyev MA, Consortium A: Scaffold assembly based on genome rearrangement analysis. Comput Biol Chem. 2015, 57: 46-53. 10.1016/j.compbiolchem.2015.02.005.
- Tamura K, Stecher G, Peterson D, Filipski A, Kumar S: MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 2013, 30 (12): 2725-2729. 10.1093/molbev/mst197.
- Kanehisa M, Goto S: KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000, 28 (1): 27-30. 10.1093/nar/28.1.27.
- Martin JA, Wang Z: Next-generation transcriptome assembly. Nat Rev Genet. 2011, 12 (10): 671-682. 10.1038/nrg3068.
- Sudo R, Suetake H, Suzuki Y, Aoyama J, Tsukamoto K: Profiles of mRNA expression for prolactin, growth hormone, and somatolactin in Japanese eels, Anguilla japonica: The effect of salinity, silvering and seasonal change. Comp Biochem Physiol A Mol Integr Physiol. 2013, 164 (1): 10-16. 10.1016/j.cbpa.2012.09.019.
- Albert V, Jonsson B, Bernatchez L: Natural hybrids in Atlantic eels (Anguilla anguilla, A. rostrata): evidence for successful reproduction and fluctuating abundance in space and time. Mol Ecol. 2006, 15 (7): 1903-1916. 10.1111/j.1365-294X.2006.02917.x.
- Okamura A, Zhang H, Utoh T, Akazawa A, Yamada Y, Horie N, et al: Artificial hybrid between Anguilla anguilla and A. japonica. J Fish Biol. 2004, 64 (5): 1450-1454. 10.1111/j.0022-1112.2004.00409.x.
- Tse WK, Sun J, Zhang H, Law AY, Yeung BH, Chow SC, et al: Transcriptomic and iTRAQ proteomic approaches reveal novel short-term hyperosmotic stress responsive proteins in the gill of the Japanese eel (Anguilla japonica). J Proteomics. 2013, 89: 81-94. 10.1016/j.jprot.2013.05.026.
- Kazeto Y, Tosaka R, Matsubara H, Ijiri S, Adachi S: Ovarian steroidogenesis and the role of sex steroid hormones on ovarian growth and maturation of the Japanese eel. J Steroid Biochem Mol Biol. 2011, 127 (3–5): 149-154. 10.1016/j.jsbmb.2011.03.013.
- Tanaka H, Kagawa H, Ohta H, Unuma T, Nomura K: The first production of glass eel in captivity: fish reproductive physiology facilitates great progress in aquaculture. Fish Physiol Biochem. 2003, 28 (1–4): 493-497. 10.1023/B:FISH.0000030638.56031.ed.
- Yandell M, Ence D: A beginner's guide to eukaryotic genome annotation. Nat Rev Genet. 2012, 13 (5): 329-342. 10.1038/nrg3174.
- Price AL, Jones NC, Pevzner PA: De novo identification of repeat families in large genomes. Bioinformatics. 2005, 21 (Suppl 1): i351-358. 10.1093/bioinformatics/bti1018.
- Kapitonov VV, Jurka J: A universal classification of eukaryotic transposable elements implemented in Repbase. Nat Rev Genet. 2008, 9 (5): 411-412.
- Voorrips R: MapChart: software for the graphical presentation of linkage maps and QTLs. J Hered. 2002, 93 (1): 77-78. 10.1093/jhered/93.1.77.
- Laudet V: The origins and evolution of vertebrate metamorphosis. Curr Biol. 2011, 21 (18): R726-737. 10.1016/j.cub.2011.07.030.
- Power D, Llewellyn L, Faustino M, Nowell MA, Björnsson BT, Einarsdottir I, et al: Thyroid hormones in growth and development of fish. Comp Biochem Physiol C Toxicol Pharmacol. 2001, 130 (4): 447-459. 10.1016/S1532-0456(01)00271-X.
- Sturm RA: Molecular genetics of human pigmentation diversity. Hum Mol Genet. 2009, 18 (R1): R9-R17. 10.1093/hmg/ddp003.
- Schulz MH, Zerbino DR, Vingron M, Birney E: Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics. 2012, 28 (8): 1086-1092. 10.1093/bioinformatics/bts094.
- Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, et al: SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience. 2012, 1 (1): 18. 10.1186/2047-217X-1-18.
- Fu L, Niu B, Zhu Z, Wu S, Li W: CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012, 28 (23): 3150-3152. 10.1093/bioinformatics/bts565.
- Huang X, Madan A: CAP3: A DNA sequence assembly program. Genome Res. 1999, 9 (9): 868-877. 10.1101/gr.9.9.868.
- Li B, Dewey CN: RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics. 2011, 12: 323. 10.1186/1471-2105-12-323.
- Langmead B, Trapnell C, Pop M, Salzberg SL: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009, 10 (3): R25. 10.1186/gb-2009-10-3-r25.
- Korf I: Gene finding in novel genomes. BMC Bioinformatics. 2004, 5 (1): 59.
- Slater GS, Birney E: Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics. 2005, 6: 31. 10.1186/1471-2105-6-31.
- Bairoch A, Apweiler R: The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999. Nucleic Acids Res. 1999, 27 (1): 49-54. 10.1093/nar/27.1.49.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, et al: Gene Ontology: tool for the unification of biology. Nat Genet. 2000, 25 (1): 25-29. 10.1038/75556.
- Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, Boursnell C, et al: The Pfam protein families database. Nucleic Acids Res. 2012, 40 (Database issue): D290-301. 10.1093/nar/gkr1065.
- Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M: KEGG for integration and interpretation of large-scale molecular data sets. Nucleic Acids Res. 2012, 40 (Database issue): D109-114. 10.1093/nar/gkr988.
- Aparicio S, Chapman J, Stupka E, Putnam N, Chia J-M, Dehal P, et al: Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science. 2002, 297 (5585): 1301-1310. 10.1126/science.1072104.
- Jaillon O, Aury J-M, Brunet F, Petit J-L, Stange-Thomann N, Mauceli E, et al: Genome duplication in the teleost fish Tetraodon nigroviridis reveals the early vertebrate proto-karyotype. Nature. 2004, 431 (7011): 946-957. 10.1038/nature03025.
- Kasahara M, Naruse K, Sasaki S, Nakatani Y, Qu W, Ahsan B, et al: The medaka draft genome and insights into vertebrate genome evolution. Nature. 2007, 447 (7145): 714-719. 10.1038/nature05846.
- Amemiya CT, Alföldi J, Lee AP, Fan S, Philippe H, MacCallum I, et al: The African coelacanth genome provides insights into tetrapod evolution. Nature. 2013, 496 (7445): 311-316. 10.1038/nature12027.
- Postlethwait JH, Yan Y-L, Gates MA, Horne S, Amores A, Brownlie A, et al: Vertebrate genome evolution and the zebrafish gene map. Nat Genet. 1998, 18 (4): 345-349. 10.1038/ng0498-345.
This article is published under license to BioMed Central Ltd. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.