- Research article
- Open Access
A draft genome of the striped catfish, Pangasianodon hypophthalmus, for comparative analysis of genes relevant to development and a resource for aquaculture improvement
- Oanh T. P. Kim†1,
- Phuong T. Nguyen†1,
- Eiichi Shoguchi†2,
- Kanako Hisata2,
- Thuy T. B. Vo1,
- Jun Inoue2,
- Chuya Shinzato2, 4,
- Binh T. N. Le1,
- Koki Nishitsuji2,
- Miyuki Kanda3,
- Vu H. Nguyen1,
- Hai V. Nong1 and
- Noriyuki Satoh2
© The Author(s). 2018
- Received: 29 May 2018
- Accepted: 14 September 2018
- Published: 5 October 2018
The striped catfish, Pangasianodon hypophthalmus, is a freshwater and benthopelagic fish common in the Mekong River delta. Catfish constitute a valuable source of dietary protein. Therefore, they are cultured worldwide, and P. hypophthalmus is a food staple in the Mekong area. However, genetic information about the culture stock is unavailable for breeding improvement, although the genetics of the channel catfish, Ictalurus punctatus, has been reported. To acquire genome sequence data as a useful resource for marker-assisted breeding, we decoded a draft genome of P. hypophthalmus and performed comparative analyses.
Using the Illumina platform, we obtained both nuclear and mitochondrial DNA sequences. Molecular phylogeny using the mitochondrial genome confirmed that P. hypophthalmus is a member of the family Pangasiidae and is nested within a clade including the families Cranoglanididae and Ictaluridae. The nuclear genome was estimated at approximately 700 Mb, assembled into 568 scaffolds with an N50 of 14.29 Mbp, and was estimated to contain ~ 28,600 protein-coding genes, comparable to those of channel catfish and zebrafish. Interestingly, zebrafish produce gadusol, but genes for biosynthesis of this sunscreen compound have been lost from catfish genomes. Differences in gene content between the two catfishes were found in genes for vitamin D-binding protein and cytosolic phospholipase A2, which have been lost only in channel catfish. The Hox cluster in catfish genomes comprised seven paralogous groups, similar to that of zebrafish, and comparative analysis clarified catfish lineage-specific losses of A5a, B10a, and A11a. Genes for insulin-like growth factor (IGF) signaling were conserved between the two catfish genomes. In addition to identification of MHC class I and sex determination-related gene loci, hypothetical chromosomes constructed by comparison with the channel catfish genome demonstrated the usefulness of the striped catfish genome as a marker resource.
We developed genomic resources for the striped catfish. Possible conservation of genes for development and marker candidates were confirmed by comparing the assembled genome to that of a model fish, Danio rerio, and to channel catfish. Since the catfish genomic constituent resembles that of zebrafish, it is likely that zebrafish data for gene functions is applicable to striped catfish as well.
- Striped catfish
- Draft nuclear genome
- Gadusol biosynthetic genes
- Vitamin D-binding protein
- Hox cluster
- Sex-determination genes
- Hypothetical chromosome
Catfish comprise approximately 4000 species belonging to the teleost order Siluriformes. They are globally distributed in fresh, salty, and brackish water. Although catfish have lost their scales evolutionarily, they occupy a phylogenetic position close to cyprinid fishes including the model fish, Danio rerio [2, 3]. Catfish are also ostariophysians, closely related to zebrafish and carp. Catfish constitute a valuable source of dietary protein and are therefore cultured worldwide as a leading aquaculture species [5–7]. The striped catfish, Pangasianodon hypophthalmus Sauvage, 1878, is a freshwater and benthopelagic species that is common and widely cultured in the Mekong River delta [7, 8]. Vietnam is the world’s largest producer of P. hypophthalmus, with an estimated 1.1 million tons being cultured on a farming area of more than 5000 ha [9, 10]. However, due to environmental changes and other challenges, aquaculture methods and systems must be constantly examined to improve production. Catfish genomic information may be useful to develop marker-assisted breeding and associated genome-wide analyses for catfish aquaculture.
Genomic information greatly facilitates fundamental research and applications for genetic improvement programs in cultured species [11, 12]. The genomes of several economically important fish species have been sequenced, including Atlantic cod (Gadus morhua), rainbow trout (Oncorhynchus mykiss), Nile tilapia (Oreochromis niloticus), Atlantic salmon (Salmo salar), and channel catfish (Ictalurus punctatus). Using decoded genomes, researchers have analyzed polymorphic markers, linkage maps, and QTL/GWAS (Quantitative Trait Loci/Genome-Wide Association Study). Results of these analyses can be used in breeding programs, including marker-assisted selection (MAS), genomic selection (GS), and genome editing. For example, genomic resources for Atlantic salmon have been developed with whole-genome sequences and 9.7 million non-redundant SNPs. Moreover, a high-density genetic linkage map and a number of QTL studies have characterized the correlation between genetic and phenotypic variation, namely, QTLs affecting flesh color and growth-related traits [20–22], late sexual maturation, resistance to pancreatic disease (salmonid alphavirus), and resistance to infectious pancreatic necrosis (IPN) [25, 26]. Consequently, MAS has been successfully used in the selection of IPN resistance in Atlantic salmon, which can reduce the number of IPN outbreaks by 75% in salmon farming.
Significant efforts have also been devoted to enhancing genomic and genetic research in other economically important aquaculture species, including catfish. The channel catfish, I. punctatus, is cultured mostly in the U.S., and its genome has been decoded [11, 17]. Analysis of the channel catfish genome identified genes relevant to the evolutionary loss of scales in catfish, although developmentally relevant genes and genes potentially relevant to aquaculture have not been analyzed in detail. In contrast, less genetic and genomic information has been reported for the striped catfish, P. hypophthalmus, which is widely cultured in the Mekong River delta. For example, Sriphairoj et al. were unable to construct sex-specific markers for Pangasianodon. Therefore, genomic resources of P. hypophthalmus are necessary to develop genome-based technologies for Asian catfish aquaculture. Moreover, P. hypophthalmus is naturally distributed in only the Chao Phraya River of Thailand and the Mekong River, which runs through Cambodia, Laos, Thailand, and Vietnam. P. hypophthalmus migrates annually between spawning and feeding grounds. This species spawns in the upper reaches of the Cambodian Mekong River, then migrates back to the feeding grounds, which are located in the floodplain of Tonle Sap, the central and lower Mekong River, and the Vietnamese Mekong delta. Genetic diversity of P. hypophthalmus remains poorly understood. Only a few studies of population genetics have been done for this species, and their findings are contradictory because of the limited availability of genetic markers. Genomic information about P. hypophthalmus is needed for development of molecular markers that can be used in genetic diversity and evolutionary studies.
Here, we report the decoded genome of the striped catfish, P. hypophthalmus. We compare the striped catfish genome to the channel catfish and zebrafish reference genomes, because striped catfish are phylogenetically close to both. We also clarify the conservation of core developmental genes in each lineage. In addition, we try to construct hypothetical chromosomes by anchoring the striped catfish genome to channel catfish chromosomes as a genome sequence resource, although the chromosome number of the striped catfish has been reported as 2n = 60, which is similar to that of channel catfish (2n = 58).
Sequencing, assembly, and validation
The GC content of this catfish genome was 38.3%. RepeatMasker showed that interspersed repeats constituted ~ 242 Mbp (~ 33.83% of the draft genome), which was less than that of zebrafish (52%). Completeness of genome assembly and annotation was assessed using BUSCO. BUSCO found 89% complete, single-copy orthologs belonging to a ray-finned fish (Actinopterygii) lineage (Fig. 1b). In addition, 90% of RNA-seq data was mapped to the assembled genome (http://catfish.genome.ac.vn, http://marinegenomics.oist.jp/gallery/). Thus, we decoded a high-quality draft genome of P. hypophthalmus, which was designated assembly version 2018.
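As an illustration of how summary statistics such as GC content and scaffold N50 are typically derived from an assembly, a minimal Python sketch follows. It is not the pipeline used in this study; the function names and the toy scaffolds are hypothetical, and ambiguous bases (N) are assumed to be excluded from the GC denominator.

```python
def gc_content(sequences):
    """Fraction of G/C among unambiguous bases (A, C, G, T) across all sequences."""
    gc = at = 0
    for seq in sequences:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        at += s.count("A") + s.count("T")
    total = gc + at
    return gc / total if total else 0.0


def n50(lengths):
    """Length L such that scaffolds of length >= L contain at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


# Toy input (hypothetical scaffolds, for illustration only).
toy_scaffolds = ["ATGCGC", "ATAT", "GG", "NNAT"]
toy_gc = gc_content(toy_scaffolds)
toy_n50 = n50([len(s) for s in toy_scaffolds])
```

For a real assembly, the sequences would be streamed from the FASTA file rather than held in a list, but the two definitions remain the same.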
To validate the phylogenetic position of the specimen, we obtained mitochondrial genome sequence data. A BLAST search of mitochondrial genes and an analysis of gene order resulted in a single, circular mitochondrial genome that spanned approximately 16.5 kbp and contained 37 genes (Additional file 2: Figure S2). Since the present result was consistent with that of a previous study, we used the data for molecular phylogenomics of this fish. We selected 13 protein-coding genes of the mitochondrial genome, and data for the other 112 siluriforms and 14 non-siluriform otocephalans were retrieved from the NCBI database. Using codon-partitioned 10,665 bp data, we estimated a maximum-likelihood (ML) tree according to the analytical procedure shown by Inoue et al. (2010). We confirmed that our specimen is P. hypophthalmus because its sequence is almost identical to that of P. hypophthalmus (NC_021752), as shown by the short branch lengths between the two sequences (Fig. 1c). In addition, the clade belonging to P. hypophthalmus (Pangasiidae) was grouped with a clade comprising members of the families Cranoglanididae and Ictaluridae. The latter included the channel catfish, Ictalurus punctatus (Fig. 1c), which also has a decoded genome, demonstrating that catfishes are closer to cyprinid fishes.
Genome annotation and assessment of possible lost genes
Comparison of the Pangasianodon hypophthalmus genome annotation with those of four other fishes
Species compared: Oryzias latipes (a), Takifugu rubripes (a), Danio rerio (a), Ictalurus punctatus (b)
Metrics reported: number of genes; median gene length (bp); median transcript length (bp); median exon length (bp); median intron length (bp)
To further assess the usefulness of the striped catfish genome, we surveyed 169 genes that were lost from channel catfish, but were found in the armored catfish (the pleco, Pterygoplichthys pardalis, family Loricariidae and the southern striped Raphael, Platydoras armatulus, family Doradidae). Interestingly, differences in two of those genes were detected between striped catfish and channel catfish. These included the vitamin D-binding protein gene (dbp) (Fig. 2d) and the cytoplasmic phospholipase A2 gene (cPLA2) (Fig. 2e). Vitamin D-binding protein participates in transport of vitamin D metabolites, and cPLA2 is known to be involved in Golgi membrane tubule function. Thus, the striped catfish genome clarified genes recently lost in the channel catfish lineage, indicating its usefulness in comparative genomic analysis.
Comparative analysis of genes relevant to development
To survey conservation of genes relevant to development, numbers of genes for transcription factors (TF) and signaling molecules (SM) in the P. hypophthalmus genome were estimated based on Pfam domain searches (Additional file 1: Tables S2 and S3) and were compared with those of O. latipes, T. rubripes, D. rerio, and I. punctatus. TF genes for the SCAN (PF02023) and TBX (PF12598) families were more numerous in the two catfishes than in other fish, suggesting that these gene families have expanded in the catfish lineage. Among SM, only the gene family for the MCP signal (PF00015) appeared to have expanded. We confirmed by careful examination that the catfish lineage-specific expansion was not found in the other three fish.
The Hox cluster consists of ~ 13 homeodomain-containing transcription factor genes, which show collinearity of expression and function in establishing the antero-posterior body axis and subsequent tissue differentiation. Vertebrates experienced two rounds of whole genome duplication (2R-WGD) [45–47], although the timing of the first and second rounds is still under debate [48, 49]. Therefore, in contrast to most invertebrates that retain a single Hox cluster, vertebrates contain four paralogous clusters (HoxA, HoxB, HoxC, and HoxD) [46, 47]. In addition, teleost fish have experienced one additional round of WGD, known as the teleost-specific WGD (TS-WGD). Therefore, theoretically, teleost genomes have eight paralogous Hox clusters (HoxAa, HoxAb, HoxBa, HoxBb, HoxCa, HoxCb, HoxDa, and HoxDb). However, all teleosts examined to date have seven clusters [50, 51]. The lineage leading to medaka, fugu, and many other fish has lost one of the HoxC duplicates, and the lineage represented by zebrafish lost one HoxD duplicate. In genome-decoding projects involving metazoans, the presence or absence of Hox genes and their clustering have frequently been used to assess proper sequencing and the assembly of their nuclear genomes. Although the Hox gene clusters of zebrafish have been analyzed extensively, those for catfish have not yet been reported.
Insulin-like growth factor (IGF) and other molecules associated with this system play pivotal physiological roles in the growth and development of fish, and have been intensively studied. One of the aims of the present study was to identify genes involved in striped catfish growth so that SNPs in these genes that correlate with growth traits can be identified in the future to improve catfish aquaculture.
Genes related to the IGF system in the Pangasianodon hypophthalmus genome
Amino Acid length
insulin-like growth factor I
insulin-like growth factor I isoform X1
insulin-like growth factor II
insulin-like growth factor II
insulin-like growth factor-binding 1
insulin-like growth factor-binding 1
insulin-like growth factor-binding 2A
insulin-like growth factor-binding 2B
insulin-like growth factor-binding 3
insulin-like growth factor-binding 3
insulin-like growth factor-binding 5
insulin-like growth factor-binding 5
insulin-like growth factor-binding 6
insulin-like growth factor-binding 6
insulin-like growth factor-binding 7
insulin-like growth factor 1 receptor
insulin-like growth factor 1 receptor isoform X1
IGF-I and IGF-II transmit signals through IGF receptor (IGFR). The IGF-I receptor is a disulfide-linked, heterotetrameric transmembrane protein consisting of two alpha subunits and two beta subunits. Both the α and β subunits are encoded in a single precursor cDNA. In zebrafish, two igf1r genes (igf1ra and igf1rb) are reportedly located on chromosomes 2 and 22, respectively . We found three genes encoding IGFR in the P. hypophthalmus genome, all of which are transmembrane proteins (Table 2). Our result suggests one IGFR gene was lost in the zebrafish genome.
IGF-binding proteins (IGFBP) comprise a superfamily that includes six high-affinity IGFBP (core IGFBPs) and at least four additional low-affinity binding proteins, known as IGFBP-related proteins (IGFBP-rP). Recently, Macqueen et al. identified 20 IGFBP genes of salmonid fish and discussed their evolution in relation to the third and fourth rounds of WGD. We identified 11 IGFBPs in the P. hypophthalmus genome (two IGFBP-1s, IGFBP-2a and -2b, two IGFBP-3s, two IGFBP-5s, two IGFBP-6s, and an IGFBP-7; Table 2) and examined their molecular phylogenetic relationships (Fig. 3b). However, we found no IGFBP-4 genes in the catfish genomes, which is consistent with the zebrafish genome. This suggests that a common ancestor of catfish and zebrafish lost IGFBP-4. Zebrafish retains only nine core IGFBPs, and this lineage likely lost one of its IGFBP-3s after it split from the catfish lineage (Fig. 3b).
In the P. hypophthalmus genome, two sets of IGFBP-1 and IGFBP-3 were tandemly aligned in the same scaffolds. Similarly, two sets of IGFBP-2 (IGFBP-2a or IGFBP-2b) and IGFBP-5 were also tandemly arranged in the same scaffolds. This suggests that IGFBP-1 and -3, and IGFBP-2 and -5, share an ancestor. Scaffold 3, in which IGFBP-2b and -5 were located, also contained the HoxDa cluster. In addition, two IGFBP-6s were located close to the HoxCa and HoxCb clusters, respectively. This provides further support for a previous hypothesis about their relationships. Thus, the striped catfish genome was of sufficient quality to be useful for future syntenic analysis of teleost genomes.
Next, we surveyed genes potentially relevant to improvement of aquaculture and breeding. Major histocompatibility complex class I (MHCI) molecules initiate immune responses against invading foreign elements, such as viruses. In teleosts, there are five lineages of MHCI, namely U, Z, S, L and P, which have been classified based on phylogenetic clustering . The number of genes in each lineage differs widely among teleost species. Here, we identified MHCI genes in the P. hypophthalmus genome to provide additional data for understanding the complexity of the teleost MHCI and for future studies on genetic variation of genes that may be candidates for development of molecular markers related to disease resistance.
The number of MHC Class I lineage genes predicted in the Pangasianodon hypophthalmus genome
Predicted MHC Class I lineagea
Genes related to sex determination
In teleosts, sex determination mechanisms are extremely diverse, differing among closely related species and even within species. Two sex-determining systems, the XY system (i.e., male-heterogamety) and the ZW system (i.e., female-heterogamety), have been found in fish. For example, the XY sex determination system occurs in medaka (Oryzias latipes), zebrafish (D. rerio) and rainbow trout (Oncorhynchus mykiss), while the ZW sex determination system is found in turbot (Scophthalmus maximus) and California Yellowtail (Seriola dorsalis). However, sex determination mechanisms in most fish remain unknown. They have been clarified in only a few fish species. In medaka, a duplicated Dmrt gene on the Y-chromosome was found to be a sex determination gene. In rainbow trout, a Y-linked gene (sdY) was identified as a sex control gene. In fugu, sex determination is controlled by an SNP in the anti-Mullerian hormone receptor type II (Amhr2) gene. In zebrafish, four sex-associated regions (sar3, sar4, sar5 and sar16) have been identified and chromosome 4 is believed to be a sex chromosome. In aquaculture, sex ratio control is very important because in many economically important fish species, monosex cultures are developed to increase aquaculture production. Genetic information regarding sex determination will enable us to develop sex-linked markers.
Candidate genes for sex determination in catfish genomes
Striped catfish (Pangasianodon hypophthalmus)
Channel catfish (Ictalurus punctatus)
Zebrafish (Danio rerio)
Construction of hypothetical chromosomes
Comparative analysis of genes that are relevant to development indicated that (1) the draft genome of P. hypophthalmus is of comparable quality to other fish genomes, (2) the Hox cluster of the catfish is more comparable to that of zebrafish than to those of medaka and other fish, and (3) catfish and zebrafish have experienced common and lineage-specific losses of Hox genes, although the effect is larger in zebrafish than in catfish. Comparison of the Hox cluster suggested that the phylogenetic position of striped catfish is closer to zebrafish than to other model fish. Therefore, the Hox cluster of P. hypophthalmus provides evidence for further discussion of the evolutionary modification of fish Hox clusters and TS-WGD. For example, the catfish lineage lost two posterior hox genes after splitting from the zebrafish lineage. This might be related to the special morphology of catfish.
The construction of our hypothetical chromosomes suggested that catfish genomes have experienced more frequent inter-chromosomal rearrangements (Blue scaffolds in Fig. 5) than have invertebrate genomes . The chromosome numbers of channel and striped catfishes are n = 29 and n = 30, respectively [17, 31]. Therefore, if inter-chromosomal rearrangement is rare, many scaffolds of striped catfish should be anchored on one chromosome of channel catfish. Nonetheless, our comparative genomic analysis of the two catfishes suggests that catfish chromosomes have few inter-chromosomal rearrangement regions (Fig. 5), implying that the channel catfish genome is useful in constructing a physical map of the striped catfish genome. Although sex chromosomes and the sex-determination mechanisms of the catfish are unknown, our hypothetical chromosomes from a male will be useful for analyzing these genomic regions. In a future study, we will identify single nucleotide polymorphisms and polymorphic microsatellites using the striped catfish genome as a reference, and we will prepare a fine linkage or physical map of these data.
In this study, we developed a genome sequence resource for the striped catfish, Pangasianodon hypophthalmus. Possible conservation of genes for transcription factors and signaling molecules was confirmed by comparing the assembled genome to that of a model fish, Danio rerio. Seven Hox cluster regions in the catfish and zebrafish genomes contained 51 and 49 genes, respectively, suggesting the conservation of core developmental mechanisms. The striped catfish retained more IGF signaling genes than zebrafish, whereas the biosynthetic genes for vertebrate sunscreen molecules were found in the zebrafish genome but not in the catfish genome, documenting enzymatic gene loss in this catfish. Altogether, the present whole genome sequence of P. hypophthalmus might be useful as a reference to find SNPs for marker-assisted breeding and associated genome-wide analysis for further aquaculture development of the striped catfish.
This study was carried out using striped catfish (P. hypophthalmus) from the Research Institute of Aquaculture No.2, Vietnam. Genomic DNA was isolated from the testis of an adult male striped catfish. For RNA-seq analyses, fertilized eggs, embryos, and larvae at various developmental stages were collected. Various organs and tissues were also isolated from both female and male adult fish for RNA-seq analyses. To dissect the tissues, several incisions were made along the ventral side and lateral line of the specimen. The fresh tissues were submerged in RNAlater solution. Details of sampling for transcriptomic analyses are in the NCBI database (accession nos. SRX3887330-SRX3887334).
DNA extraction and purification
The testis was powdered in liquid nitrogen and homogenized in DNA extraction buffer (10 mM Tris HCl, pH 8.0; 150 mM EDTA; 1% SDS; 200 μg/mL Proteinase K). DNA was extracted using a phenol-chloroform extraction protocol and pelleted with 100% ethanol. DNA quality and quantity were evaluated by electrophoresis on a 1% agarose gel, and using a NanoDrop spectrophotometer and an Agilent 2100 Bioanalyzer with an Agilent High-Sensitivity DNA Kit.
DNA library construction and Illumina sequencing
Paired-end (PE) libraries were constructed using a TruSeq DNA PCR-Free Kit (Illumina) according to the manufacturer's protocols. Mate-pair libraries of 3-kb, 7-kb, 10-kb, and 15-kb fragments were prepared using a Nextera Mate-Pair (MP) Library Preparation Kit (Illumina) following the manufacturer's procedure. All paired-end and mate-pair libraries were sequenced using Illumina MiSeq and HiSeq 2500 sequencing platforms (Additional file 1: Table S1) with Illumina protocols for whole-genome shotgun sequencing (WGS). PE read length from MiSeq was ~ 2 × 310 bp. PE and MP reads from HiSeq 2500 were ~ 2 × 145 bp and ~ 2 × 295 bp, respectively (Additional file 1: Table S1).
Sequence data processing and genome assembly
Quality of raw sequencing reads was assessed using FastQC v.0.11.5. Adapter sequences and low-quality reads were trimmed using Trimmomatic v.0.35, PRINSEQ v.0.20.4 and NextClip v1.3, and k-mer analysis was performed using Jellyfish. GenomeScope was applied to estimate genome size. MiSeq and HiSeq paired-end reads were assembled de novo with Platanus. Using Illumina mate-pair information, subsequent scaffolding was also performed with Platanus. Gaps in scaffolds were closed using Illumina paired-end data and Platanus software. Completeness of the assembly was estimated with CEGMA v2.5 and Benchmarking Universal Single-Copy Orthologs (BUSCO) v3. For the post-assembly stage, HaploMerger2 was used to improve the continuity of the initial assembly generated by Platanus. The workflow of the assembly and gene prediction is shown in Fig. S1 (Additional file 2).
Simple repeat sequences were identified with RepeatScout v. 1.0.5 and RepeatModeler and masked with RepeatMasker. Masked genome sequences were used to produce gene models (Pangasianodon hypophthalmus Gene Model ver. 2018) with Augustus software and the BRAKER2 pipeline, using ab initio, homology-based, and EST-based approaches (Additional file 2: Figure S1). For the homology-based approach, protein sequences predicted for Danio rerio were aligned using Exonerate v.2.2. With TopHat2, high-quality RNA-seq reads of P. hypophthalmus were used to generate intron hints for EST-based prediction. Details of RNA-seq data are described elsewhere (Oanh T. P. Kim et al., in preparation).
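The idea behind k-mer based genome size estimation can be sketched in a few lines of Python. This is a deliberately simplified illustration and not a reimplementation of Jellyfish or GenomeScope: GenomeScope fits a statistical model to the k-mer profile, whereas the sketch below simply divides the total number of k-mers by the depth at the histogram peak. All function names, the `min_depth` error cutoff, and the toy histogram are assumptions made for the example.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer in the reads, skipping k-mers that contain N."""
    counts = Counter()
    for read in reads:
        r = read.upper()
        for i in range(len(r) - k + 1):
            kmer = r[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def kmer_histogram(counts):
    """Map depth -> number of distinct k-mers observed at that depth."""
    return Counter(counts.values())


def estimate_genome_size(histogram, min_depth=2):
    """Naive estimate: total k-mers at depth >= min_depth divided by peak depth.

    Depths below min_depth are assumed to be sequencing errors (a hypothetical
    cutoff; real data need a cutoff chosen from the histogram itself).
    """
    usable = {d: n for d, n in histogram.items() if d >= min_depth}
    if not usable:
        return 0
    peak_depth = max(usable, key=lambda d: usable[d])
    total_kmers = sum(d * n for d, n in usable.items())
    return total_kmers // peak_depth


# Toy histogram (hypothetical): an error spike at depth 1 and a peak at depth 20.
toy_hist = {1: 500, 19: 10, 20: 100, 21: 10}
toy_size = estimate_genome_size(toy_hist)
```

On real data the peak depth depends on heterozygosity and repeat content, which is why model-based tools are preferred over this naive ratio.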
Annotation and identification of genes
Protein-coding genes in the P. hypophthalmus genome were surveyed as follows. (i) Nucleotide and amino acid sequences of well-annotated genes of model organisms were used as queries for BLAST searches, including TBLASTN  of the P. hypophthalmus genome. (ii) Pfam domain searches were performed to identify protein domains included in the putative proteins from all gene models  (Pfam-A.hmm, release 24.0).
Hox gene clusters were surveyed based on previous reports of teleost Hox clusters [92, 93]. Hox cluster-containing scaffolds from Blast analyses using teleost Hox sequences were visualized using a genome browser of P. hypophthalmus. Gene model IDs (ver. 2017 and ver. 2018) and transcriptome contigs for Hox genes were assigned and confirmed manually (Additional file 1: Table S4).
Genes for the IGF system were screened using a BLAST search and annotated with the BLAST2GO pipeline . For the IGFBP family, the complete salmonid IGFBP gene system  was also used as a query for BLAST searches of IGFBP genes in the P. hypophthalmus genome.
MHCI genes in the striped catfish genome were identified based on previous reports [60, 95] and using BLAST searches. Newly identified MHCI genes were aligned with previously reported MHCI genes from different species using MUSCLE and were then classified into lineages based on phylogenetic clustering.
Sex-related genes from zebrafish  and channel catfish [70, 71] were used to survey sex-related genes in the striped catfish genome. Based on BLAST searches, candidate sex determination genes and gene-containing scaffolds were identified.
With BLAST searches, mitochondrial genome sequences in the draft genome (ver. 2018) of P. hypophthalmus were surveyed using mitochondrial genes (NC_021752) as a query. The resultant sequence was confirmed with NOVOPlasty. Maximum-likelihood (ML) analysis using RAxML v. 7.2.4 was performed and a tree was constructed as previously described.
Newly identified IGFBP genes from P. hypophthalmus and IGFBP genes from different taxa available in the NCBI Nucleotide database (Additional file 1: Table S5) were used for phylogenetic analysis. Multiple alignment of IGFBP sequences was performed using the MAFFT web-based tool  with default parameters. A phylogenetic tree for IGFBPs was constructed with MEGA7.0  using neighbor-joining methods . The tree topology was evaluated with a bootstrap probability calculated on 1000 resamplings. We applied the same method for phylogenetic tree construction of MHCI genes.
Anchoring the striped catfish scaffolds to channel catfish chromosomes
To anchor scaffolds on chromosomes of the channel catfish, 28,580 gene models of the striped catfish were used as BLASTN queries. If a scaffold had more than 50% of its gene matches on a single chromosome, it was hypothesized to have come from a common ancestral chromosome between channel catfish and striped catfish. If a scaffold had less than 50% of its hits on a chromosome, the scaffold was classified as a less conserved region.
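The 50% rule described above can be expressed as a short Python sketch. This is an illustration under stated assumptions rather than the script used in the study: the input is assumed to be one (scaffold, best-hit chromosome) pair per gene model, with None for gene models without a BLASTN hit; the fraction is assumed to be taken over all gene models on the scaffold; and a scaffold at exactly 50%, a case the text does not specify, is treated here as not anchored. The function name and identifiers are hypothetical.

```python
from collections import Counter, defaultdict


def anchor_scaffolds(best_hits, threshold=0.5):
    """Assign each scaffold to the chromosome holding most of its gene hits.

    best_hits: iterable of (scaffold_id, chromosome_or_None), one per gene model.
    Returns {scaffold_id: (chromosome or None, fraction on the top chromosome)}.
    A chromosome is reported only when the fraction exceeds the threshold;
    otherwise the scaffold is treated as a less conserved region (None).
    """
    per_scaffold = defaultdict(list)
    for scaffold, chrom in best_hits:
        per_scaffold[scaffold].append(chrom)

    result = {}
    for scaffold, chroms in per_scaffold.items():
        counts = Counter(c for c in chroms if c is not None)
        if not counts:
            result[scaffold] = (None, 0.0)
            continue
        top_chrom, top_count = counts.most_common(1)[0]
        fraction = top_count / len(chroms)
        result[scaffold] = (top_chrom if fraction > threshold else None, fraction)
    return result


# Toy input (hypothetical identifiers, for illustration only).
toy_hits = [
    ("s1", "chr1"), ("s1", "chr1"), ("s1", "chr2"),
    ("s2", "chr3"), ("s2", "chr4"), ("s2", None),
    ("s3", None),
]
toy_anchors = anchor_scaffolds(toy_hits)
```

In this toy input, scaffold s1 has two of its three gene models on chr1 and is anchored there, whereas s2 and s3 fall below the threshold and are left unanchored.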
We thank Dr. Sang V. Nguyen (Research Institute of Aquaculture No.2, Vietnam) for striped catfish sampling, the IT section at OIST for supercomputing support, and Dr. Steven D. Aird for technical editing and helpful comments about the manuscript.
This work was supported by “Development and Application of Biotechnology in Aquaculture Program” from the Ministry of Agriculture and Rural Development (MARD) of Vietnam to Oanh T. P. Kim. This work was partly funded by the Internal Research Fund of the Okinawa Institute of Science and Technology (OIST) to Noriyuki Satoh. The grant from MARD funded the sampling, salary support for Vietnamese researchers to enable molecular experiments and computational analyses of the data. The grant from OIST funded Illumina sequencing and data analysis.
Availability of data and materials
All sequence data from Pangasianodon hypophthalmus are accessible in the DDBJ/EMBL/NCBI database under BioProject ID PRJNA448819. All Illumina reads are available under accession nos. SRR6943546-SRR6943551 (DNA-seq) and SRR6943541-SRR6943545 (RNA-seq) in the NCBI database. The assembled genome has been deposited under accession no. QUXB00000000. Sequence datasets generated during the current study are also available at the genome browser site (http://marinegenomics.oist.jp/gallery/ or http://catfish.genome.ac.vn).
OK, HN, and NS designed the project. OK and TV extracted DNA and mRNA from samples. TV, KN, and MK performed library preparation and sequencing. OK, PN, ES, KH, JI, CS, VN and BL analyzed sequence data. OK, PN, ES, KH, JI, and NS prepared the manuscript. All authors edited and commented on the manuscript. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Pangasianodon hypophthalmus was sampled as approved by the Institutional Review Board, Institute of Genome Research, Vietnam Academy of Science and Technology (VAST) for the use of animals in research (No: 6–2015/NCHG/HĐĐĐ).
Consent for publication
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Sullivan JP, Lundberg JG, Hardman M. A phylogenetic analysis of the major groups of catfishes (Teleostei: Siluriformes) using rag1 and rag2 nuclear gene sequences. Mol Phylogenet Evol. 2006;41(3):636–62.
- Liu H, Jiang Y, Wang S, Ninwichian P, Somridhivej B, Xu P, Abernathy J, Kucuktas H, Liu Z. Comparative analysis of catfish BAC end sequences with the zebrafish genome. BMC Genomics. 2009;10:592.
- Howe K, Clark MD, Torroja CF, Torrance J, Berthelot C, Muffato M, Collins JE, Humphray S, McLaren K, Matthews L, et al. The zebrafish reference genome sequence and its relationship to the human genome. Nature. 2013;496(7446):498–503.
- Garling DL Jr, Wilson RP. Optimum dietary protein to energy ratio for channel catfish fingerlings, Ictalurus punctatus. J Nutr. 1976;106(9):1368–75.
- Liu Z. Development of genomic resources in support of sequencing, assembly, and annotation of the catfish genome. Comp Biochem Physiol Part D Genomics Proteomics. 2011;6(1):11–7.
- Hecht T, Oellermann L, Verheust L. Perspectives on clariid catfish culture in Africa. Aquat Living Resour. 1996;9:197–206.
- Phan LT, Bui TM, Nguyen TTT, Gooley GJ, Ingram BA, Nguyen HV, Nguyen PT, De Silva SS. Current status of farming practices of striped catfish, Pangasianodon hypophthalmus in the Mekong Delta, Vietnam. Aquaculture. 2009;296:227–36.
- Roberts TR, Vidthayanon C. Systematic revision of the Asian catfish family Pangasiidae, with biological observations and descriptions of three new species. Proc Acad Nat Sci Philad. 1991;143:97–143.
- Nguyen AL, Truong MH, Verreth JA, Leemans R, Bosma RH, De Silva SS. Exploring the climate change concerns of striped catfish producers in the Mekong Delta, Vietnam. Springerplus. 2015;4:46.
- Hoe TD, Thuy NTN, Ha TTV, Ngoc LTB, Thu PK. Report on Vietnam Seafood exports Q.III/2016. In: Hang L, editor. Vietnam Association of Seafood Exporters and Producers; 2016.
- Yue GH. Recent advances of genome mapping and marker-assisted selection in aquaculture. Fish Fish. 2014;15(3):376–96.
- Abdelrahman H, ElHady M, Alcivar-Warren A, Allen S, Al-Tobasei R, Bao L, Beck B, Blackburn H, Bosworth B, Buchanan J, et al. Aquaculture genomics, genetics and breeding in the United States: current status, challenges, and priorities for future research. BMC Genomics. 2017;18(1):191.
- Star B, Nederbragt AJ, Jentoft S, Grimholt U, Malmstrom M, Gregers TF, Rounge TB, Paulsen J, Solbakken MH, Sharma A, et al. The genome sequence of Atlantic cod reveals a unique immune system. Nature. 2011;477(7363):207–10.
- Berthelot C, Brunet F, Chalopin D, Juanchich A, Bernard M, Noel B, Bento P, Da Silva C, Labadie K, Alberti A, et al. The rainbow trout genome provides novel insights into evolution after whole-genome duplication in vertebrates. Nat Commun. 2014;5:3657.View ArticleGoogle Scholar
- Brawand D, Wagner CE, Li YI, Malinsky M, Keller I, Fan S, Simakov O, Ng AY, Lim ZW, Bezault E, et al. The genomic substrate for adaptive radiation in African cichlid fish. Nature. 2014;513(7518):375–81.View ArticleGoogle Scholar
- Lien S, Koop BF, Sandve SR, Miller JR, Kent MP, Nome T, Hvidsten TR, Leong JS, Minkley DR, Zimin A, et al. The Atlantic salmon genome provides insights into rediploidization. Nature. 2016;533(7602):200–5.View ArticleGoogle Scholar
- Liu Z, Liu S, Yao J, Bao L, Zhang J, Li Y, Jiang C, Sun L, Wang R, Zhang Y, et al. The channel catfish genome sequence provides insights into the evolution of scale formation in teleosts. Nat Commun. 2016;7:11757.View ArticleGoogle Scholar
- Yanez JM, Naswa S, Lopez ME, Bassini L, Correa K, Gilbey J, Bernatchez L, Norris A, Neira R, Lhorente JP, et al. Genomewide single nucleotide polymorphism discovery in Atlantic salmon (Salmo salar): validation in wild and farmed American and European populations. Mol Ecol Resour. 2016;16(4):1002–11.View ArticleGoogle Scholar
- Lien S, Gidskehaug L, Moen T, Hayes BJ, Berg PR, Davidson WS, Omholt SW, Kent MP. A dense SNP-based linkage map for Atlantic salmon (Salmo salar) reveals extended chromosome homeologies and striking differences in sex-specific recombination patterns. BMC Genomics. 2011;12:615.View ArticleGoogle Scholar
- Baranski M, Moen T, Vage DI. Mapping of quantitative trait loci for flesh colour and growth traits in Atlantic salmon (Salmo salar). Genet Sel Evol. 2010;42:17.View ArticleGoogle Scholar
- Tsai HY, Hamilton A, Guy DR, Houston RD. Single nucleotide polymorphisms in the insulin-like growth factor 1 (IGF1) gene are associated with growth-related traits in farmed Atlantic salmon. Anim Genet. 2014;45(5):709–15.View ArticleGoogle Scholar
- Tsai HY, Hamilton A, Guy DR, Tinch AE, Bishop SC, Houston RD. The genetic architecture of growth and fillet traits in farmed Atlantic salmon (Salmo salar). BMC Genet. 2015;16:51.View ArticleGoogle Scholar
- Gutierrez AP, Lubieniecki KP, Fukui S, Withler RE, Swift B, Davidson WS. Detection of quantitative trait loci (QTL) related to grilsing and late sexual maturation in Atlantic salmon (Salmo salar). Mar Biotechnol (NY). 2014;16(1):103–10.View ArticleGoogle Scholar
- Gonen S, Baranski M, Thorland I, Norris A, Grove H, Arnesen P, Bakke H, Lien S, Bishop SC, Houston RD. Mapping and validation of a major QTL affecting resistance to pancreas disease (salmonid alphavirus) in Atlantic salmon (Salmo salar). Heredity (Edinb). 2015;115(5):405–14.View ArticleGoogle Scholar
- Houston RD, Haley CS, Hamilton A, Guy DR, Mota-Velasco JC, Gheyas AA, Tinch AE, Taggart JB, Bron JE, Starkey WG, et al. The susceptibility of Atlantic salmon fry to freshwater infectious pancreatic necrosis is largely explained by a major QTL. Heredity (Edinb). 2010;105(3):318–27.View ArticleGoogle Scholar
- Moen T, Baranski M, Sonesson AK, Kjoglum S. Confirmation and fine-mapping of a major QTL for resistance to infectious pancreatic necrosis in Atlantic salmon (Salmo salar): population-level associations between markers and trait. BMC Genomics. 2009;10:368.View ArticleGoogle Scholar
- Moen T, Torgersen J, Santi N, Davidson WS, Baranski M, Odegard J, Kjoglum S, Velle B, Kent M, Lubieniecki KP, et al. Epithelial cadherin determines resistance to infectious pancreatic necrosis virus in Atlantic Salmon. Genetics. 2015;200(4):1313–26.View ArticleGoogle Scholar
- Sriphairoja K, Na-Nakorna U, Brunellib JP, Thorgaard GH. No AFLP sex-specific markers detected in Pangasianodon gigas and P. hypophthalmus. Aquaculture. 2007;273(4):739–43.View ArticleGoogle Scholar
- So N, Maes GE, Volckaert FA. High genetic diversity in cryptic populations of the migratory sutchi catfish Pangasianodon hypophthalmus in the Mekong River. Heredity (Edinb). 2006;96(2):166–74.View ArticleGoogle Scholar
- Nguyen TTT. Patterns of use and exchange of genetic resourses of the striped catfish Pangasianodon hypophthalmus (Sauvvage 1878). Rev Aquac. 2009;1:224–31.View ArticleGoogle Scholar
- Magtoon W, Donsakul T. Karyotypes of Pangasiid catfishes, Pangasius sutchi and P.larnaidii, from Thailand. Jpn J Ichthyol. 1897;34(3):396–8.View ArticleGoogle Scholar
- Simao FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015;31(19):3210–2.View ArticleGoogle Scholar
- Iwasaki W, Fukunaga T, Isagozawa R, Yamada K, Maeda Y, Satoh TP, Sado T, Mabuchi K, Takeshima H, Miya M, et al. MitoFish and MitoAnnotator: a mitochondrial genome database of fish with an accurate and automatic annotation pipeline. Mol Biol Evol. 2013;30(11):2531–40.View ArticleGoogle Scholar
- Zhao H, Kong X, Zhou C. The mitogenome of Pangasius sutchi (Teleostei, Siluriformes: Pangasiidae). Mitochondrial DNA. 2014;25(5):342–4.View ArticleGoogle Scholar
- Inoue JG, Miya M, Miller MJ, Sado T, Hanel R, Hatooka K, Aoyama J, Minegishi Y, Nishida M, Tsukamoto K. Deep-ocean origin of the freshwater eels. Biol Lett. 2010;6(3):363–6.View ArticleGoogle Scholar
- Shick JM, Dunlap WC. Mycosporine-like amino acids and related Gadusols: biosynthesis, acumulation, and UV-protective functions in aquatic organisms. Annu Rev Physiol. 2002;64:223–62.View ArticleGoogle Scholar
- Shinzato C, Shoguchi E, Kawashima T, Hamada M, Hisata K, Tanaka M, Fujie M, Fujiwara M, Koyanagi R, Ikuta T, et al. Using the Acropora digitifera genome to understand coral responses to environmental change. Nature. 2011;476(7360):320–3.View ArticleGoogle Scholar
- Miyamoto KT, Komatsu M, Ikeda H. Discovery of gene cluster for mycosporine-like amino acid biosynthesis from Actinomycetales microorganisms and production of a novel mycosporine-like amino acid by heterologous expression. Appl Environ Microbiol. 2014;80(16):5028–36.View ArticleGoogle Scholar
- Osborn AR, Almabruk KH, Holzwarth G, Asamizu S, LaDu J, Kean KM, Karplus PA, Tanguay RL, Bakalinsky AT, Mahmud T. De novo synthesis of a sunscreen compound in vertebrates. Elife. 2015;4(e05919):1-15.Google Scholar
- Amemiya CT, Alfoldi J, Lee AP, Fan S, Philippe H, Maccallum I, Braasch I, Manousaki T, Schneider I, Rohner N, et al. The African coelacanth genome provides insights into tetrapod evolution. Nature. 2013;496(7445):311–6.View ArticleGoogle Scholar
- Nikaido M, Noguchi H, Nishihara H, Toyoda A, Suzuki Y, Kajitani R, Suzuki H, Okuno M, Aibara M, Ngatunga BP, et al. Coelacanth genomes reveal signatures for evolutionary transition from water to land. Genome Res. 2013;23(10):1740–8.View ArticleGoogle Scholar
- Kasahara M, Naruse K, Sasaki S, Nakatani Y, Qu W, Ahsan B, Yamada T, Nagayasu Y, Doi K, Kasai Y, et al. The medaka draft genome and insights into vertebrate genome evolution. Nature. 2007;447(7145):714–9.View ArticleGoogle Scholar
- Aparicio S, Chapman J, Stupka E, Putnam N, Chia JM, Dehal P, Christoffels A, Rash S, Hoon S, Smit A, et al. Whole-genome shotgun assembly and analysis of the genome of Fugu rubripes. Science. 2002;297(5585):1301–10.View ArticleGoogle Scholar
- Lewis EB. A gene complex controlling segmentation in drosophila. Nature. 1978;276(5688):565–70.View ArticleGoogle Scholar
- Holland PW. Gene duplication: past, present and future. Semin Cell Dev Biol. 1999;10(5):541–7.View ArticleGoogle Scholar
- Duboule D. Temporal colinearity and the phylotypic progression: a basis for the stability of a vertebrate Bauplan and the evolution of morphologies through heterochrony. Development. 1994;1994(Supplement):135–42.Google Scholar
- Holland PW, Garcia-Fernandez J, Williams NA, Sidow A. Gene duplications and the origins of vertebrate development. Development. 1994;1994(Supplement):125–33.Google Scholar
- Dehal P, Boore JL. Two rounds of whole genome duplication in the ancestral vertebrate. PLoS Biol. 2005;3(10):e314.View ArticleGoogle Scholar
- Kuraku S, Meyer A, Kuratani S. Timing of genome duplications relative to the origin of the vertebrates: did cyclostomes diverge before or after? Mol Biol Evol. 2009;26(1):47–59.View ArticleGoogle Scholar
- Duboule D. The rise and fall of Hox gene clusters. Development. 2007;134(14):2549–60.View ArticleGoogle Scholar
- Kuraku S, Meyer A. The evolution and maintenance of Hox gene clusters in vertebrates and the teleost-specific genome duplication. Int J Dev Biol. 2009;53(5–6):765–73.View ArticleGoogle Scholar
- Amores A, Force A, Yan YL, Joly L, Amemiya C, Fritz A, Ho RK, Langeland J, Prince V, Wang YL, et al. Zebrafish hox clusters and vertebrate genome evolution. Science. 1998;282(5394):1711–4.View ArticleGoogle Scholar
- Moriyama S, Ayson FG, Kawauchi H. Growth regulation by insulin-like growth factor-I in fish. Biosci Biotechnol Biochem. 2000;64(8):1553–62.View ArticleGoogle Scholar
- Zou S, Kamei H, Modi Z, Duan C. Zebrafish IGF genes: gene duplication, conservation and divergence, and novel roles in midline and notochord development. PLoS One. 2009;4(9):e7026.View ArticleGoogle Scholar
- Schlueter PJ, Royer T, Farah MH, Laser B, Chan SJ, Steiner DF, Duan C. Gene duplication and functional divergence of the zebrafish insulin-like growth factor 1 receptors. FASEB J. 2006;20(8):1230–2.View ArticleGoogle Scholar
- Hwa V, Oh Y, Rosenfeld RG. Insulin-like growth factor binding proteins: a proposed superfamily. Acta Paediatr Suppl. 1999;88(428):37–45.View ArticleGoogle Scholar
- Macqueen DJ, Garcia de la Serrana D, Johnston IA. Evolution of ancient functions in the vertebrate insulin-like growth factor system uncovered by study of duplicated salmonid fish genomes. Mol Biol Evol. 2013;30(5):1060–76.View ArticleGoogle Scholar
- Garcia de la Serrana D, Macqueen DJ. Insulin-like growth factor-binding proteins of teleost fishes. Front Endocrinol (Lausanne). 2018;9:80.View ArticleGoogle Scholar
- Daza DO, Sundstrom G, Bergqvist CA, Duan C, Larhammar D. Evolution of the insulin-like growth factor binding protein (IGFBP) family. Endocrinology. 2011;152(6):2278–89.View ArticleGoogle Scholar
- Grimholt U, Tsukamoto K, Azuma T, Leong J, Koop BF, Dijkstra JM. A comprehensive analysis of teleost MHC class I sequences. BMC Evol Biol. 2015;15:32.View ArticleGoogle Scholar
- Pan Q, Anderson J, Bertho S, Herpin A, Wilson C, Postlethwait JH, Schartl M, Guiguen Y. Vertebrate sex-determining genes play musical chairs. C R Biol. 2016;339(7–8):258–62.View ArticleGoogle Scholar
- Matsuda M, Nagahama Y, Shinomiya A, Sato T, Matsuda C, Kobayashi T, Morrey CE, Shibata N, Asakawa S, Shimizu N, et al. DMY is a Y-specific DM-domain gene required for male development in the medaka fish. Nature. 2002;417(6888):559–63.View ArticleGoogle Scholar
- Anderson JL, Rodriguez Mari A, Braasch I, Amores A, Hohenlohe P, Batzel P, Postlethwait JH. Multiple sex-associated regions and a putative sex chromosome in zebrafish revealed by RAD mapping and population genomics. PLoS One. 2012;7(7):e40701.View ArticleGoogle Scholar
- Yano A, Guyomard R, Nicol B, Jouanno E, Quillet E, Klopp C, Cabau C, Bouchez O, Fostier A, Guiguen Y. An immune-related gene evolved into the master sex-determining gene in rainbow trout, Oncorhynchus mykiss. Curr Biol. 2012;22(15):1423–8.View ArticleGoogle Scholar
- Martinez P, Bouza C, Hermida M, Fernandez J, Toro MA, Vera M, Pardo B, Millan A, Fernandez C, Vilas R, et al. Identification of the major sex-determining region of turbot (Scophthalmus maximus). Genetics. 2009;183(4):1443–52.View ArticleGoogle Scholar
- Purcell CM, Seetharam AS, Snodgrass O, Ortega-Garcia S, Hyde JR, Severin AJ. Insights into teleost sex determination from the Seriola dorsalis genome assembly. BMC Genomics. 2018;19(1):31.View ArticleGoogle Scholar
- Nanda I, Kondo M, Hornung U, Asakawa S, Winkler C, Shimizu A, Shan Z, Haaf T, Shimizu N, Shima A, et al. A duplicated copy of DMRT1 in the sex-determining region of the Y chromosome of the medaka, Oryzias latipes. Proc Natl Acad Sci U S A. 2002;99(18):11778–83.View ArticleGoogle Scholar
- Kamiya T, Kai W, Tasumi S, Oka A, Matsunaga T, Mizuno N, Fujita M, Suetake H, Suzuki S, Hosoya S, et al. A trans-species missense SNP in Amhr2 is associated with sex determination in the tiger pufferfish, Takifugu rubripes (fugu). PLoS Genet. 2012;8(7):e1002798.View ArticleGoogle Scholar
- Martinez P, Vinas AM, Sanchez L, Diaz N, Ribas L, Piferrer F. Genetic architecture of sex determination in fish: applications to sex ratio control in aquaculture. Front Genet. 2014;5:340.PubMedPubMed CentralGoogle Scholar
- Sun F, Liu S, Gao X, Jiang Y, Perera D, Wang X, Li C, Sun L, Zhang J, Kaltenboeck L, et al. Male-biased genes in catfish as revealed by RNA-Seq analysis of the testis transcriptome. PLoS One. 2013;8(7):e68452.View ArticleGoogle Scholar
- Zhang S, Chen X, Wang M, Zhang W, Pan J, Qin Q, Zhong L, Shao J, Sun M, Jiang H, et al. Genome-wide identification, phylogeny and expressional profile of the sox gene family in channel catfish (Ictalurus punctatus). Comp Biochem Physiol Part D Genomics Proteomics. 2018;28:17–26.View ArticleGoogle Scholar
- Hill MM, Broman KW, Stupka E, Smith WC, Jiang D, Sidow A. The C. savignyi genetic map and its integration with the reference sequence facilitates insights into chordate genome evolution. Genome Res. 2008;18(8):1369–79.View ArticleGoogle Scholar
- Andrew S: FastQC: a quality control tool for high throughput sequence data. 2010.Google Scholar
- Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20.View ArticleGoogle Scholar
- Schmieder R, Edwards R. Quality control and preprocessing of metagenomic datasets. Bioinformatics. 2011;27(6):863–4.View ArticleGoogle Scholar
- Leggett RM, Clavijo BJ, Clissold L, Clark MD, Caccamo M. NextClip: an analysis and read preparation tool for Nextera long mate pair libraries. Bioinformatics. 2014;30(4):566–8.View ArticleGoogle Scholar
- Marcais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011;27(6):764–70.View ArticleGoogle Scholar
- Vurture GW, Sedlazeck FJ, Nattestad M, Underwood CJ, Fang H, Gurtowski J, Schatz MC. GenomeScope: fast reference-free genome profiling from short reads. Bioinformatics. 2017;33(14):2202–4.View ArticleGoogle Scholar
- Kajitani R, Toshimoto K, Noguchi H, Toyoda A, Ogura Y, Okuno M, Yabana M, Harada M, Nagayasu E, Maruyama H, et al. Efficient de novo assembly of highly heterozygous genomes from whole-genome shotgun short reads. Genome Res. 2014;24(8):1384–95.View ArticleGoogle Scholar
- Parra G, Bradnam K, Korf I. CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics. 2007;23(9):1061–7.View ArticleGoogle Scholar
- Huang S, Kang M, Xu A. HaploMerger2: rebuilding both haploid sub-assemblies from high-heterozygosity diploid genome assembly. Bioinformatics. 2017;33(16):2577–9.View ArticleGoogle Scholar
- Price AL, Jones NC, Pevzner PA. De novo identification of repeat families in large genomes. Bioinformatics. 2005;21(Suppl 1):i351–8.View ArticleGoogle Scholar
- Smit A, Hubley R: RepeatModeler - 1.0.9. 2017.Google Scholar
- Smit A, Hubley R, Green P: RepeatMasker 4.0.7. 2017.Google Scholar
- Stanke M, Waack S. Gene prediction with a hidden Markov model and a new intron submodel. Bioinformatics. 2003;19(Suppl 2):ii215–25.View ArticleGoogle Scholar
- Hoff KJ, Lange S, Lomsadze A, Borodovsky M, Stanke M. BRAKER1: unsupervised RNA-Seq-based genome annotation with GeneMark-ET and AUGUSTUS. Bioinformatics. 2016;32(5):767–9.View ArticleGoogle Scholar
- Slater GS, Birney E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics. 2005;6:31.View ArticleGoogle Scholar
- Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14(4):R36.View ArticleGoogle Scholar
- Skinner ME, Uzilov AV, Stein LD, Mungall CJ, Holmes IH. JBrowse: a next-generation genome browser. Genome Res. 2009;19(9):1630–8.View ArticleGoogle Scholar
- Mount DW. Using the basic local alignment search tool (BLAST). CSH Protoc. 2007;2007:pdb top17.PubMedGoogle Scholar
- Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, Potter SC, Punta M, Qureshi M, Sangrador-Vegas A, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44(D1):D279–85.View ArticleGoogle Scholar
- Henkel CV, Burgerhout E, de Wijze DL, Dirks RP, Minegishi Y, Jansen HJ, Spaink HP, Dufour S, Weltzien FA, Tsukamoto K, et al. Primitive duplicate Hox clusters in the European eel's genome. PLoS One. 2012;7(2):e32231.View ArticleGoogle Scholar
- Kim BM, Lee BY, Lee JH, Rhee JS, Lee JS. Conservation of Hox gene clusters in the self-fertilizing fish Kryptolebias marmoratus (Cyprinodontiformes; Rivulidae). J Fish Biol. 2016;88(3):1249–56.View ArticleGoogle Scholar
- Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21(18):3674–6.View ArticleGoogle Scholar
- Grimholt U. MHC and Evolution in Teleosts. Biology (Basel). 2016;5(6):1–20.View ArticleGoogle Scholar
- Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32(5):1792–7.View ArticleGoogle Scholar
- Dierckxsens N, Mardulyn P, Smits G. NOVOPlasty: de novo assembly of organelle genomes from whole genome data. Nucleic Acids Res. 2017;45(4):e18.PubMedGoogle Scholar
- Stamatakis A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics. 2006;22(21):2688–90.View ArticleGoogle Scholar
- Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–80.View ArticleGoogle Scholar
- Kumar S, Stecher G, Tamura K. MEGA7: molecular evolutionary genetics analysis version 7.0 for bigger datasets. Mol Biol Evol. 2016;33(7):1870–4.View ArticleGoogle Scholar
- Saitou N, Nei M. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol. 1987;4(4):406–25.PubMedPubMed CentralGoogle Scholar
- Kappas I, Vittas S, Pantzartzi CN, Drosopoulou E, Scouras ZG. A time-calibrated Mitogenome phylogeny of catfish (Teleostei: Siluriformes). PLoS One. 2016;11(12):e0166988.View ArticleGoogle Scholar
- Ovcharenko I, Loots GG, Hardison RC, Miller W, Stubbs L. zPicture: dynamic alignment and visualization tool for analyzing conservation profiles. Genome Res. 2004;14(3):472–7.View ArticleGoogle Scholar