X. couchianus and X. hellerii genome models provide genomic variation insight among Xiphophorus species
© Shen et al. 2016
Received: 16 July 2015
Accepted: 30 December 2015
Published: 7 January 2016
Xiphophorus fishes are represented by 26 live-bearing species of tropical fish that express many attributes (e.g., viviparity, genetic and phenotypic variation, ecological adaptation, varied sexual developmental mechanisms, ability to produce fertile interspecies hybrids) that have made them attractive research models for over 85 years. Use of various interspecies hybrids to investigate the genetics underlying spontaneous and induced tumorigenesis has resulted in the development and maintenance of pedigreed Xiphophorus lines specifically bred for research. The recent availability of the X. maculatus reference genome assembly now provides unprecedented opportunities for comparative research studies among Xiphophorus species.
We present sequencing, assembly and annotation of two new genomes representing Xiphophorus couchianus and Xiphophorus hellerii. The final X. couchianus and X. hellerii assemblies have total sizes of 708 Mb and 734 Mb and correspond to 98 % and 102 % of the X. maculatus Jp 163 A genome size, respectively. The rates of single nucleotide change range from 1 per 52 bp to 1 per 69 bp among the three genomes, and the impact of putatively damaging variants is presented. In addition, a survey of transposable elements (TEs) allowed us to deduce an ancestral TE landscape, uncover potentially active TEs and document a recent burst of TEs during evolution of this genus.
Two new Xiphophorus genomes and their corresponding transcriptomes were efficiently assembled, the former using a novel guided assembly approach. Three assembled genome sequences within this single vertebrate order of New World live-bearing fishes will accelerate our understanding of the relationship between environmental adaptation and genome evolution. In addition, these genome resources provide the capability to determine allele-specific gene regulation among interspecies hybrids produced by crossing any of the three species that are known to produce progeny predisposed to tumor development.
Xiphophorus fishes have been used as an experimental vertebrate biomedical research model for nearly 90 years. Xiphophorus interspecies hybrids have been a long-standing experimental model for both spontaneous and UV- or carcinogen-induced melanoma [6, 7]. The first Xiphophorus interspecies backcross leading to spontaneous development of melanoma among backcross hybrids was described in 1927 [8]. Since then, many other interspecies crosses have been described that produce animals displaying genetic predisposition to various types of induced tumors (i.e., backcross hybrids require treatment to develop melanoma), and these are still actively utilized experimental models for assessment of genetic interactions leading to tumor development [6, 7].
Due to this scientific history, and an ever increasing use of Xiphophorus in contemporary experimental biology, the Xiphophorus Genetic Stock Center (XGSC) was first established in the 1930s and has remained in continuous operation as one of the oldest live animal resource centers worldwide. Twenty-four Xiphophorus species and 55 pedigreed lines are maintained in the XGSC, and the fish lines that have been sequenced for this study are available for research upon request [1, 9].
The X. maculatus Jp 163 A utilized for genome sequencing was a female derived from the 104th generation of sibling inbreeding within the XGSC. The X. maculatus Jp 163 A genome assembly comprises 20,640 scaffolds with an N50 of 1.3 Mb, and the final assembled sequence length is 730 Mb. More recently, an extremely dense RAD-tag map (16,114 markers) scored from an X. maculatus Jp 163 A (x) X. hellerii backcross has been produced, and this meiotic map has been aligned with the genome assembly. Consolidation of the genome assembly and RAD-tag maps provides one of the most detailed and highly resolved gene maps for any vertebrate experimental model system. However, a single map remains problematic when one wishes to assess the contribution of each parental allele to complex traits that appear within interspecies backcross hybrids, such as the genes underlying induced melanoma.
Availability of new Xiphophorus genomic resources, coupled with the capability of producing fertile interspecies hybrids and ample polymorphic content among the varied Xiphophorus species, can fully unleash the potential of Xiphophorus as an experimental model for understanding the molecular basis of morphological and physiological differences, and the inheritance of complex traits. Herein, we report sequencing and genome assembly of X. hellerii, also known as the “green swordtail”, and X. couchianus, commonly called the “Monterrey platyfish”. These two species, in conjunction with X. maculatus, serve as parents in four distinct spontaneous and induced melanoma models, as well as a cross leading to increased incidence of induced retinoblastoma, neurofibrosarcoma, and Schwannoma [6, 11]. The two genome assemblies detailed herein, together with the previously assembled X. maculatus genome, represent a system for assessing allele-specific gene regulation and detailing gene-gene interactions within a varied array of Xiphophorus interspecies hybrids.
Results and discussion
Genome sequencing of X. couchianus and X. hellerii
Table 1. Assembly statistics of genomes of three Xiphophorus species. Reported measures: N50 length (Mb), GC content (%), and total size (Mb).
Advances in sequencing technologies have greatly reduced the cost of genome sequencing, but more importantly, the algorithms designed to derive assemblies from short sequences have significantly improved. Here we show that within a genus, high quality assemblies can be cost effectively derived from about half the traditional Illumina coverage (~100X) used for de novo assembly. Thus, it is now possible to sequence and assemble all 23 remaining extant Xiphophorus species with significant cost savings. To provide the two new Xiphophorus genomes, we used an approach that combined de novo and reference-guided assemblies. For each species, two independent genome assemblies were built with all sequence data, one using the SOAPdenovo2 assembler and the other an assisted assembly. Total input sequence coverage in whole-genome shotgun reads was roughly 52X for X. hellerii (a combination of 30X fragments, 17X 3 kb mate pairs, and 5X 8 kb mate pairs) and 51X for X. couchianus (29X fragments, 14X 3 kb mate pairs, and 8X 8 kb mate pairs). It is important to follow our outlined iterative steps to ensure that new within-genus references are not a mere syntenic reflection of the genome reference used for assisted assembly. We therefore contend that additional genome references within a genus can, in most cases, be of at least as high quality as the original reference that serves as a starting point.
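The coverage totals quoted above are simple sums over the sequencing libraries. As a minimal sketch (the dictionary layout and function name are illustrative, not part of the published pipeline), the per-library values from the text can be tallied as follows:

```python
# Per-library sequence coverage (X) as quoted in the text.
# The data structure and names are hypothetical, chosen for illustration.
LIBRARY_COVERAGE = {
    "X. hellerii": {"fragments": 30, "3 kb mate pairs": 17, "8 kb mate pairs": 5},
    "X. couchianus": {"fragments": 29, "3 kb mate pairs": 14, "8 kb mate pairs": 8},
}


def total_coverage(libraries):
    """Sum the coverage contributed by each library of one species."""
    return sum(libraries.values())


totals = {species: total_coverage(libs) for species, libs in LIBRARY_COVERAGE.items()}

for species, coverage in totals.items():
    print(f"{species}: {coverage}X total input coverage")
```

Both totals are about half of the ~100X coverage cited as traditional for de novo Illumina assembly.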
Annotation of X. hellerii and X. couchianus genomes
Table 2. Statistics of transcriptomes of three Xiphophorus species. Reported measures: number of gene models, N50 length (bp), average length (bp), and total size (Mb).
There are several reasons why the RATT tool fails to transfer some gene models to new genomes. For example, there are 174 genes annotated in the X. maculatus genome that were not transferred to X. hellerii. Attempts to manually align these gene models failed for 15 of them; three are located at contig breakpoints, 13 mapped to multiple locations, and the remainder could be aligned but failed one of the quality control steps during RATT transfer. Gene models that align to the new genomes but are not transferred by RATT may potentially be rescued through manual curation.
The opportunity to obtain a genome reference and corresponding gene set is most desired by biologists. Previously, genome annotation required expensive computational effort, yet with the RATT genome annotation approach, the computational demands of annotating a genome are greatly reduced. In our study, it required about 10 Gb of memory and four days of manual curation, compared with the weeks needed by approaches based on gene annotation pipelines. However, significantly shorter computational times are forthcoming that promise to speed up methods such as MAKER. For the reference-based approach, there is no additional sequencing cost once the genome is sequenced and assembled, but we emphasize that it does require a well-developed reference genome from a closely related species.
Sequence variations among Xiphophorus genomes
In a previous study based on de novo assembled transcriptomes, we estimated the frequency of SNCs between X. maculatus and X. couchianus to be about 1 base in every 700 bp, yet the polymorphism frequency of 1 base in 69 bp observed in this study is considerably higher. Not surprisingly, protein coding sequences are more conserved, and our sensitivity is elevated as a result of deeper sequence coverage of the entire genome, in contrast to the previous method that only considered polymorphisms in transcribed sequences. It will be necessary to further resequence X. couchianus populations to refine our preliminary estimates of genome variation.
Structural variation among Xiphophorus genomes
In addition to SNCs, we also identified inter-chromosomal rearrangements among species. To call an inter-chromosomal rearrangement event, at least a 20 kb sequence from a single de novo assembled contig must be aligned to two different chromosomes. In total, 24 inter-chromosomal rearrangement events were found between X. couchianus and X. maculatus and 4 events were found between X. hellerii and X. maculatus (Additional file 1: Table S1 and Additional file 2: Table S2). There are thus six times more genomic rearrangement events between X. maculatus and X. couchianus than between X. maculatus and X. hellerii (24 vs. 4). This result does not agree with phylogenetic studies indicating that X. maculatus and X. couchianus are less evolutionarily divergent. We note that the X. couchianus contigs are on average longer than the contigs of X. hellerii and are thus more likely to span chromosome breakpoints. With alternative computational methods for detecting large-scale variants based on paired-end reads, such as BreakDancer and LUMPY, and the resequencing of population individuals for each species, it should be possible to resolve the presence of large-scale rearrangements relative to the reference in future studies.
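The calling rule described above can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: it assumes alignments are available as (contig, chromosome, aligned length) records and interprets the rule as requiring at least 20 kb of aligned sequence on each of two or more chromosomes. The function name and input format are hypothetical.

```python
from collections import defaultdict

MIN_ALIGNED_BP = 20_000  # 20 kb threshold quoted in the text


def call_interchromosomal(alignments, min_bp=MIN_ALIGNED_BP):
    """Return contigs whose aligned sequence is split across chromosomes.

    alignments: iterable of (contig, chromosome, aligned_bp) tuples.
    A contig is reported when at least two chromosomes each receive
    min_bp or more of its aligned sequence (an assumed reading of the rule).
    """
    per_contig = defaultdict(lambda: defaultdict(int))
    for contig, chrom, aligned_bp in alignments:
        per_contig[contig][chrom] += aligned_bp

    calls = {}
    for contig, by_chrom in per_contig.items():
        supported = sorted(c for c, bp in by_chrom.items() if bp >= min_bp)
        if len(supported) >= 2:
            calls[contig] = supported
    return calls


example = [
    ("contig_1", "chr1", 45_000),
    ("contig_1", "chr7", 12_000),
    ("contig_1", "chr7", 9_000),   # two blocks on chr7 sum to 21 kb
    ("contig_2", "chr3", 80_000),
    ("contig_2", "chr5", 4_000),   # too short to support an event
]
calls = call_interchromosomal(example)
```

Because longer contigs are more likely to cross a breakpoint, a rule of this kind is sensitive to assembly contiguity, which is the caveat raised in the text for X. couchianus.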
Single base variation predicted to impact protein function
Table 3. Number and percentage of polymorphism effects in X. couchianus and X. hellerii compared with the X. maculatus reference genome. Effect categories include non-synonymous coding, loss of start codon, gain of stop codon, and loss of stop codon.
The overall landscape of effects of polymorphisms in X. hellerii is very similar to that in X. couchianus (Table 3). The overall rate of variants between X. hellerii and X. maculatus is higher than between X. couchianus and X. maculatus, in accord with previous studies that suggest X. hellerii is more distantly related to X. maculatus than X. couchianus is [13, 16].
To test for the distributional randomness of putatively high impact gene variants in the genome, we plotted the coordinates of affected genes (Fig. 2b). For the 452 genes in X. couchianus (orange bars) and 1,505 genes in X. hellerii (green bars) that have high impact variants relative to X. maculatus, we found gene positions to be randomly distributed and correlated with the density of localized gene models (black bars, Fig. 2b). Among these genes, 55 (blue bars, see Additional file 3: Table S3 for a complete list) are shared between species, suggesting fixation in the genus, and are of increased scientific interest. To better understand these 55 genes with high impact variants in both X. couchianus and X. hellerii, we performed GO categorical and KEGG pathway enrichment tests. Fifteen of these genes are annotated as uncharacterized proteins, which prevents further biological inference. For the remaining 40 genes, GO and KEGG pathway enrichment analyses show that genes associated with categories involving regulation of homeostasis (RYR2, CORIN, ADCYAP1R1, ITPR1, WNK2) and response to leucine (PIK3C3 and UBR1) are significantly enriched (FDR < 0.01, Additional file 4: Figure S1 and Additional file 5: Table S4). These results may suggest that dietary traits or preferences, or some environmental or physical parameter, placed selective pressure on X. maculatus to alter its protein composition.
Conserved synteny among three Xiphophorus genomes
Analyses of transposable elements in Xiphophorus genomes
The genome of the platyfish, X. maculatus, was the first to provide an overview of the diversity and content of transposable elements in Poeciliid genomes. Most of the TE superfamilies were identified in the different classes and subclasses (LTR, LINE, DNA), and the most active families identified from transcriptome BLAST analyses were hAT transposons and RTE (especially Rex3) LINE retrotransposons. The sequencing of two other Xiphophorus species provides the ability to perform comparative genomics of TEs in closely related species. We took advantage of this to complete the TE library by including automatic TE detection and to compare the diversity, content and age of TEs in the three genomes.
Table 4. Statistics of transposable elements in Xiphophorus genomes. Left panels: genomes without filtration, reported as coverage (%). Right panels: genomes after removing small (less than 80 bp) and divergent (less than 80 % identity) TE elements.
The X. couchianus and X. hellerii genomes were analyzed using the same library. Incomplete sequences in X. maculatus were manually verified or completed before analyses. By comparison, the three Xiphophorus genomes seem to be very close in terms of diversity and content of TEs (Table 4) containing 21.38 % (X. maculatus), 21.13 % (X. hellerii) and 21.8 % (X. couchianus) of TEs, respectively.
For the three genomes, TE sequences smaller than 80 nucleotides and sharing less than 80 % identity with reference sequences from the library were discarded. After filtering, TEs comprised about 12 % of the genomes.
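The filtering step can be illustrated with a short sketch. It assumes each TE hit is a record with a length in nucleotides and a percent identity to its library sequence, and it assumes a hit is discarded when it fails either threshold; the text leaves open whether both conditions must hold. Field and function names are hypothetical, and overlapping hits are not merged, so coverage could be over-counted on real data.

```python
def filter_te_hits(hits, min_length=80, min_identity=80.0):
    """Keep hits at least min_length nt long with at least min_identity % identity.

    Assumption: a hit failing either threshold is discarded.
    """
    return [
        hit for hit in hits
        if hit["length"] >= min_length and hit["identity"] >= min_identity
    ]


def coverage_percent(hits, genome_size):
    """Percentage of the genome covered by hits (overlaps are not merged)."""
    return 100.0 * sum(hit["length"] for hit in hits) / genome_size


# Toy data, not taken from the study.
toy_hits = [
    {"family": "hAT", "length": 600, "identity": 93.0},
    {"family": "Tc-Mariner", "length": 60, "identity": 97.0},   # too short
    {"family": "Rex-Babar", "length": 900, "identity": 71.5},   # too divergent
    {"family": "L2", "length": 400, "identity": 85.0},
]
kept = filter_te_hits(toy_hits)
raw_cov = coverage_percent(toy_hits, genome_size=10_000)
filtered_cov = coverage_percent(kept, genome_size=10_000)
```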
We also searched for TEs in the inferred transcriptomes. We found that 5 to 6 % of the transcriptomes are derived from TEs. This result is quite similar to the 4.8 % previously found for X. maculatus. The most represented families are Tc-Mariner and hAT, as observed in the genome, followed by Jockey, LINE2, Rex-Babar and Helitron. Some superfamilies, such as Copia retrotransposons, are not found in the transcriptomes.
Finally, we represented the quantity (Log[content %]) of each superfamily in both the genome and transcriptome, in a spider graph to observe the relationship between genome copy number and TE quantity in transcriptomes. In the case of basal transcription, we expect proportionality between the number of copies in the genome and the quantity of copies in the transcriptome. A family with a high copy number in the genome should be highly represented in the transcriptome. In this way, we highlight superfamilies that could be over-represented in transcriptomes compared to their respective quantity in genomes.
At first glance, genome and transcriptome spider graphs look very similar. For the three species, the most abundant superfamilies in the genomes are Tc-Mariner, hAT, L2, Rex-Babar, PIF-Harbinger and RTE. In transcriptomes, Tc-Mariner, hAT, I-Nimb-Jockey, L2, Rex-Babar and Helitron are the most represented superfamilies. Our spider graphs show that Tc-Mariner, hAT, L2 and Rex-Babar are indeed highly repeated in genomes and represented in transcriptomes. Many copies of these families are probably still active since they are located in recent bursts (Fig. 4b). We can point out interesting cases, such as PIF-Harbinger, PiggyBac, L1, RTE and BEL-Pao, that are more represented in transcriptomes. This is also the case for Academ transposons in the southern platyfish. These could reflect real expression rather than basal transcription, although this requires more rigorous testing. Conversely, for Jockey and MITE, we observe an under-representation in the transcriptomes.
In the work presented, a variety of genomic and transcriptomic resources and methods were employed to sequence, assemble and compare the genomes of two new Xiphophorus species, X. couchianus and X. hellerii, with that of X. maculatus Jp 163 A.
The traditional strength of the Xiphophorus experimental model involves the non-biased assessment of genetic inheritance patterns associated with complex phenotypes within intact animals. The high genetic variability among Xiphophorus species and capability of producing fertile interspecies hybrids allows inheritance of any observable trait to be followed into individual backcross hybrid progeny.
Improvement of genomic capabilities for the Xiphophorus genetic system, as undertaken herein, promises to produce new fundamental knowledge regarding shifts in genetic regulation within interspecies hybrids that produce altered gene expression patterns in complex traits. The genome sequences and assemblies for the species utilized herein (X. maculatus, X. couchianus, and X. hellerii) will allow researchers to mechanistically dissect traits that appear among progeny from interspecies crosses between any pair of these three species. For example, interspecies crosses between pairs of these three species are known to produce several distinct experimental models for induction and progression of melanoma [5, 6]. The ability to obtain both the genome and transcriptome sequences of both parental species involved in an interspecies cross will allow unequivocal assessment of the expression of every allele, from either parent, within individual F1 or backcross hybrid progeny.
The large-scale identification of polymorphisms in genomes provides researchers with resources to further investigate and characterize Poeciliid genomes and to provide more precise analyses of genetic diversity and speciation. Such information is crucial to identification of key regulators of important complex biological traits, such as the etiology of pigment pattern compartmentalization and adaptation to divergent environmental conditions and stressors. Previous studies in Xiphophorus have associated several traits with defined DNA segments in the genome. The tumor suppressor of interspecies hybrid melanoma, termed Diff, or R, the P locus, controlling age and size at sexual maturation, and the various mechanisms employed by different Xiphophorus species for sexual differentiation serve as a few examples of well defined complex traits that can be better understood with structural characterization of the genomic regions from new species. Historically, the lack of good genetic markers has prevented fine mapping of the structural regions harboring loci associated with these interesting biological events. The newly sequenced and assembled genomes and the ample polymorphisms identified present an opportunity to define the size of the effective genomic regions and to highlight gene candidates. Altogether, the benefits of having three high quality genomes may represent a key to answering many long-standing biological questions in Xiphophorus.
All fishes utilized were supplied by the Xiphophorus Genetic Stock Center, Texas State University, San Marcos, TX (http://www.Xiphophorus.txstate.edu). The X. maculatus Jp 163 A [pedigree Jp 163 A104(A)] was in its 104th generation of sibling inbreeding, while the X. couchianus [pedigree Xc77(B)] was in its 77th generation of inbreeding. The X. hellerii (Sarabia) [pedigree 11317] stock is maintained by reciprocal cross breeding between two distinct X. hellerii strains differing by sword color (orange or green sword). In all cases, a single female was utilized for DNA isolation as described previously. X. maculatus, the Southern platyfish, was originally collected in 1939 from the Rio Jamapa in Veracruz, MX. Representatives of this species have also been found in several places throughout Mexico, ranging southward to Guatemala (Fig. 1). The Northern platyfish, X. couchianus, was collected in 1961 near Nuevo Leon, MX, and due to urban expansion is very likely extinct in the wild. The swordtail, X. hellerii, was originally collected in 1963 from Rio Sarabia, Oaxaca, MX. This species exhibits a very large range from central Mexico southward to Honduras (http://www.Xiphophorus.txstate.edu/stockcenter/stockcentermanual.html).
All animal studies were approved by the Texas State University Institutional Animal Care and Use Review Board (IACUC protocol # 201498170). All fish used in this study were from aquarium-housed stock and were kept and sampled in accordance with the applicable national legislation and regulations governing animal experimentation.
Genome sequencing and assembly
Genomic DNAs of X. couchianus and X. hellerii were sequenced on an Illumina HiSeq 2000 platform using libraries with tiered insert sizes from 300 bp to 8 kb. After standard quality filtering steps, over 700 and 360 million 100 bp paired-end reads were obtained for X. couchianus and X. hellerii, respectively. Genome assembly occurred in three phases: first, de novo assembly of all sequences using SOAPdenovo (Additional files 8 and 9); second, assisted assembly using phased alignment to the X. maculatus reference; and finally, a merge of the two independent assemblies. The assembly methods utilized are similar to those described previously. The final merge ensures that unaligned sequences are incorporated as de novo assembled contigs or scaffolds, following strict alignment criteria. Prior to submission, each assembly was gap filled and cleaned of vector and contaminating contigs.
De novo assembled contigs and Illumina reads were aligned to the X. maculatus reference genome with a novel multi-phase aligner (SRprism; ftp://ftp.ncbi.nlm.nih.gov/pub/agarwala/srprism), and then, using a heuristic-governed search, attempts were made to fill scaffold gaps. SRprism reported that all alignments were of equally good quality. Filtering was performed by first building the histogram of per-library insert sizes observed in alignments, deciding which range to use (usually the tightest or the 99th percentile), and then retaining paired reads that had the correct orientation and an insert size in the desired range. Next, the filtered reads were mapped to build consensus contigs by locating consecutive contigs that were bridged by mate pairs having 30-mers on each side of the gap. De novo assembly in gaps was then performed between bridged contigs, with 30-mers from reads used to build an index for de novo assembly. Only filtered reads and reads mapped to contig ends went into the de novo assembly index. A predefined maximum gap size and number of iterations were used to limit the resources spent on any particular gap. A final step was to find structural differences between the built scaffolds and the reference using paired reads with mates on different scaffolds, and then to perform de novo gap filling between reordered scaffolds. Overall, the scaffold-level genomes of X. couchianus and X. hellerii consisted of 45,442 and 71,868 scaffolds with total sizes of 715 Mb and 746 Mb, respectively.
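The read-pair filtering described above (choose an insert-size range from the observed distribution, then keep correctly oriented pairs within that range) might look like the following sketch. It is not the SRprism-based pipeline itself: the record fields, the orientation labels, and the choice of a central interval as the "desired range" are all assumptions made for illustration.

```python
def central_range(insert_sizes, fraction=0.99):
    """Return (low, high) bounds enclosing roughly the central fraction of sizes."""
    ordered = sorted(insert_sizes)
    n = len(ordered)
    tail = (1.0 - fraction) / 2.0
    low = ordered[int(tail * n)]
    high = ordered[min(n - 1, int((1.0 - tail) * n))]
    return low, high


def filter_pairs(pairs, expected_orientation="FR", fraction=0.99):
    """Keep pairs with the expected orientation and an in-range insert size.

    pairs: list of dicts with hypothetical keys "insert" and "orientation".
    """
    low, high = central_range([p["insert"] for p in pairs], fraction)
    return [
        p for p in pairs
        if p["orientation"] == expected_orientation and low <= p["insert"] <= high
    ]


# Toy library: mostly 300 bp inserts, two outliers, one misoriented pair.
toy_pairs = (
    [{"insert": 300, "orientation": "FR"} for _ in range(97)]
    + [{"insert": 300, "orientation": "RF"}]
    + [{"insert": 10, "orientation": "FR"}, {"insert": 5000, "orientation": "FR"}]
)
kept_pairs = filter_pairs(toy_pairs, fraction=0.9)
```

In practice the range would be chosen per library, since fragment and mate-pair libraries have very different insert-size distributions and expected orientations.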
To allocate assembled scaffolds to chromosomes, the existing X. maculatus genome with 24 cytogenetically identified chromosomes was used as the reference to order and orient scaffolds for X. couchianus and X. hellerii using Nucmer3.0, with a minimum cluster match length of 400 bp and a maximum gap size of 500 bp. After Nucmer alignment, de novo assembled scaffolds of X. couchianus and X. hellerii were placed using a custom Perl script based on the nucleotide alignment position of the scaffolds relative to the X. maculatus chromosomes. Scaffolds or contigs that could not be placed onto chromosomes were collected into a file called “unplaced”.
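The placement logic can be sketched as follows. The original work used a custom Perl script whose details are not given, so this Python version is only one plausible reading: each scaffold is assigned to the chromosome receiving most of its aligned bases, ordered by its leftmost reference coordinate, and oriented by the strand carrying most aligned bases. The input tuple layout is hypothetical and would need to be derived from Nucmer output.

```python
from collections import defaultdict


def place_scaffolds(alignments):
    """Order and orient scaffolds along reference chromosomes.

    alignments: iterable of (scaffold, chromosome, ref_start, aligned_bp, strand).
    Returns {chromosome: [(scaffold, orientation), ...]} sorted by position.
    Scaffolds without any alignment never appear (they remain "unplaced").
    """
    by_scaffold = defaultdict(list)
    for record in alignments:
        by_scaffold[record[0]].append(record)

    placed = defaultdict(list)
    for scaffold, records in by_scaffold.items():
        bp_per_chrom = defaultdict(int)
        for _, chrom, _, aligned_bp, _ in records:
            bp_per_chrom[chrom] += aligned_bp
        best = max(bp_per_chrom, key=bp_per_chrom.get)

        on_best = [r for r in records if r[1] == best]
        start = min(r[2] for r in on_best)
        plus_bp = sum(r[3] for r in on_best if r[4] == "+")
        minus_bp = sum(r[3] for r in on_best if r[4] == "-")
        orientation = "+" if plus_bp >= minus_bp else "-"
        placed[best].append((start, scaffold, orientation))

    return {
        chrom: [(scaffold, orient) for _, scaffold, orient in sorted(entries)]
        for chrom, entries in placed.items()
    }


toy_alignments = [
    ("s1", "chr1", 5000, 3000, "+"),
    ("s1", "chr2", 100, 500, "+"),
    ("s2", "chr1", 1000, 2000, "-"),
    ("s2", "chr1", 1500, 500, "+"),
    ("s3", "chr2", 10, 800, "+"),
]
layout = place_scaffolds(toy_alignments)
```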
Sequences and accession codes
Xiphophorus hellerii (Sarabia)
Genome assembly annotations for both genomes and AGP files for ordered scaffolds are available at the Xiphophorus Genetic Stock Center webpage (http://www.xiphophorus.txstate.edu/).
For the X. maculatus transcriptome, cDNA sequences were downloaded from Ensembl (Build 71). Manually annotated genes (569) were compared with the Ensembl transcriptome, and sequences missing from Ensembl were added to produce the enhanced version of the X. maculatus transcriptome. To build the transcriptomes of X. couchianus and X. hellerii, the scaffold-level genomes of these two query species were aligned to the X. maculatus genome using Nucmer3.0 with the parameters implemented by the Rapid Annotation Transfer Tool (RATT) for transferring annotations between species.
Using RATT, synteny between the reference and the query was established, and InDels between species were identified to avoid frame shifts. Gene models from X. maculatus were then transferred onto the X. couchianus and X. hellerii genomes and corrected by RATT. Of the total of 20,482 gene models annotated in X. maculatus, 20,300 and 20,325 were transferred to X. couchianus and X. hellerii, respectively. Custom Perl scripts were used to make RATT executable on multiple threads and to convert the RATT output to the latest EMBL format implementation.
To analyze conserved syntenies between species, we constructed dot plots based on orthologs identified by RATT lift over results and reciprocal best-BLAST alignment of transcriptomes. Positive orthologs and paralogs were plotted on the chromosomes based on the coordinates of the same species and the chromosome index of the other species.
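The reciprocal best-BLAST criterion can be illustrated with a small sketch. It assumes hits have already been parsed into (query, subject, score) tuples, for example from tabular BLAST output with the bit score as the ranking value; the function names and input format are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    a_best = best_hits(a_vs_b)
    b_best = best_hits(b_vs_a)
    return sorted((a, b) for a, b in a_best.items() if b_best.get(b) == a)


# Toy transcript identifiers, not taken from the study.
mac_vs_hel = [("mac1", "hel1", 950.0), ("mac1", "hel2", 400.0), ("mac2", "hel2", 610.0)]
hel_vs_mac = [("hel1", "mac1", 940.0), ("hel2", "mac1", 420.0), ("hel2", "mac2", 300.0)]
orthologs = reciprocal_best_hits(mac_vs_hel, hel_vs_mac)
```

In the toy data, mac2 and hel2 are not reported because the best hit of hel2 is mac1, so the relationship is not reciprocal.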
Identification and annotation of variants among Xiphophorus species
To identify genome-wide variants, sequences of each species were trimmed using Flexbar and were aligned to the reference assembly from which they were derived using BWA-mem. VarScan 2.3 was used to detect single nucleotide changes (SNCs) and InDels from alignment results, with a minimum coverage of three reads and a p-value cutoff of 0.1.
For all variants, potential alterations of protein function were predicted using SnpEff 3.3h. High impact variants are defined as those causing one of the following events: large chromosome deletion (over 1 % of the chromosome), exon deleted, frame shift, rare amino acid, splice site acceptor, splice site donor, stop lost, start lost, or stop gained.
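As a hedged illustration of how the high impact definition translates into a gene list, the sketch below flags genes carrying at least one high impact effect and intersects the lists from two species. The effect labels are written in the upper-case style of SnpEff 3.x output but should be treated as assumptions and checked against the actual annotation files; the function names and the (gene, effect) input format are hypothetical.

```python
# Effect labels assumed to correspond to the events listed in the text.
HIGH_IMPACT_EFFECTS = {
    "CHROMOSOME_LARGE_DELETION",
    "EXON_DELETED",
    "FRAME_SHIFT",
    "RARE_AMINO_ACID",
    "SPLICE_SITE_ACCEPTOR",
    "SPLICE_SITE_DONOR",
    "STOP_LOST",
    "START_LOST",
    "STOP_GAINED",
}


def genes_with_high_impact(variants):
    """Return the sorted genes having at least one high impact variant.

    variants: iterable of (gene, effect) tuples.
    """
    return sorted({gene for gene, effect in variants if effect in HIGH_IMPACT_EFFECTS})


def shared_genes(genes_a, genes_b):
    """Genes with high impact variants in both species."""
    return sorted(set(genes_a) & set(genes_b))


# Toy annotations, not taken from the study.
couchianus = genes_with_high_impact([
    ("geneA", "STOP_GAINED"),
    ("geneB", "SYNONYMOUS_CODING"),
    ("geneC", "FRAME_SHIFT"),
])
hellerii = genes_with_high_impact([
    ("geneA", "START_LOST"),
    ("geneD", "SPLICE_SITE_DONOR"),
    ("geneC", "NON_SYNONYMOUS_CODING"),
])
shared = shared_genes(couchianus, hellerii)
```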
Pathway enrichment analyses
Genes with high impact variants identified among the three Xiphophorus species were further tested for significant association with known canonical pathways. We define variants that change coding sequences among species as critical variants. HGNC gene symbols annotated from Ensembl, or the top BLAST hit (NCBI non-redundant protein database, e-value cutoff 1e-10) for genes that were not annotated in Ensembl, were used for functional analyses. The WEB-based GEne SeT AnaLysis Toolkit (WebGestalt) was used for functional characterization and classification of gene symbols harboring high impact variants. Enriched functional groups and pathways were identified using the Benjamini & Hochberg method for multiple test adjustment.
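For readers unfamiliar with the Benjamini & Hochberg adjustment, a minimal self-contained sketch of the standard step-up procedure is given below. In this study the adjustment was applied through WebGestalt, so this code is only an illustration of the method; the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), and a
    running minimum taken from the largest rank downward enforces monotonicity.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Toy p-values, not taken from the study.
raw_p = [0.01, 0.04, 0.03, 0.005]
adj_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in adj_p]
```

A category is then reported as enriched when its adjusted p-value falls below the chosen false discovery rate threshold.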
Analyses of transposable elements
Transposable elements were investigated in the three genomes and transcriptomes using the previously established library. This library was further enhanced by automatic annotation using RepeatScout and RepeatModeler (http://www.repeatmasker.org/RepeatModeler) employing default parameters. All detected sequence redundancies were discarded. Genome assemblies and transcriptomes were masked using RepeatMasker 3.3.0 [A.F.A. Smit, R. Hubley & P. Green, unpublished data] with default parameters, and RepeatMasker outfiles (“.out”) were parsed, using a custom Perl script, to establish repeat coverage and copy numbers. The number and coverage of repeat sequences smaller than 80 nucleotides and with less than 80 % identity to the reference sequence were also established to determine the quantity of small sequences in Xiphophorus genomes. Kimura distances between genome copies and library sequences were calculated to evaluate the age (divergence) of TE copies. This analysis assumes that most TE copies are silenced by the host genome after insertion and then accumulate neutral mutations. The proportions of transitions (purine-purine or pyrimidine-pyrimidine mutations, noted “p”) and transversions (purine-pyrimidine mutations, noted “q”) were calculated from the alignment between genome copies and the matching sequences in the library. These proportions were transformed into Kimura distances using K = −½ ln(1 − 2p − q) − ¼ ln(1 − 2q).
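The Kimura two-parameter distance can be computed directly from an aligned pair of sequences. The sketch below is an illustration under stated assumptions, not the parsing code used in the study: it takes p as the proportion of transitions (A/G or C/T exchanges) and q as the proportion of transversions (purine/pyrimidine exchanges), skips any column containing a gap or ambiguous base, and uses hypothetical function names.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def transition_transversion_proportions(seq_a, seq_b):
    """Return (p, q): proportions of transitions and transversions.

    Sequences must be aligned and of equal length; columns with gaps or
    non-ACGT symbols are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in PURINES | PYRIMIDINES or b not in PURINES | PYRIMIDINES:
            continue
        compared += 1
        if a == b:
            continue
        same_class = (a in PURINES) == (b in PURINES)
        if same_class:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return transitions / compared, transversions / compared


def kimura_distance(p, q):
    """K = -1/2 ln(1 - 2p - q) - 1/4 ln(1 - 2q)."""
    first = 1.0 - 2.0 * p - q
    second = 1.0 - 2.0 * q
    if first <= 0.0 or second <= 0.0:
        raise ValueError("sequences too divergent for the Kimura correction")
    return -0.5 * math.log(first) - 0.25 * math.log(second)


# Toy alignment: one transition (A>G) and one transversion (G>T) in 10 sites.
p, q = transition_transversion_proportions("AAAACCCCGG", "GAAACCCCGT")
k = kimura_distance(p, q)
```

Larger K values indicate older, more diverged copies, while values near zero correspond to recent insertions.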
We thank the Xiphophorus Genetic Stock Center, Texas State University, for maintaining the pedigreed fish lines, helping with dissections, and caring for the animals used in this study. This work was supported by the National Institutes of Health, Division of Comparative Medicine, R24 OD-011120 and R24 OD-011198, R24 OD-011199, R24 OD-018555 and Natural Science Foundation of Fujian Province of China (No.2015J05074).
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Kallman KD, Kazianis S. The genus Xiphophorus in Mexico and central america. Zebrafish. 2006;3(3):271–85.PubMedView ArticleGoogle Scholar
- Schartl M, Walter RB, Shen Y, Garcia T, Catchen J, Amores A, et al. The genome of the platyfish, Xiphophorus maculatus, provides insights into evolutionary adaptation and several complex traits. Nat Genet. 2013;45(5):567–72.PubMedView ArticleGoogle Scholar
- Fraser BA, Kunstner A, Reznick DN, Dreyer C, Weigel D. Population genomics of natural and experimental populations of guppies (Poecilia reticulata). Mol Ecol. 2015;24(2):389–408.PubMedView ArticleGoogle Scholar
- Shen Y, Catchen J, Garcia T, Amores A, Beldorth I, Wagner J, et al. Identification of transcriptome SNPs between Xiphophorus lines and species for assessing allele specific gene expression within F(1) interspecies hybrids. Comp Biochem Physiol C Toxicol Pharmacol. 2012;155(1):102–8.PubMedPubMed CentralView ArticleGoogle Scholar
- Shen Y, Garcia T, Pabuwal V, Boswell M, Pasquali A, Beldorth I, et al. Alternative strategies for development of a reference transcriptome for quantification of allele specific expression in organisms having sparse genomic resources. Comp Biochem Physiol Part D Genomics Proteomics. 2013;8(1):11–6.PubMedPubMed CentralView ArticleGoogle Scholar
- Walter RB, Kazianis S. Xiphophorus interspecies hybrids as genetic models of induced neoplasia. ILAR J. 2001;42(4):299–321.PubMedView ArticleGoogle Scholar
- Nairn RS, Kazianis S, Della Coletta L, Trono D, Butler AP, Walter RB, et al. Genetic analysis of susceptibility to spontaneous and UV-induced carcinogenesis in Xiphophorus hybrid fish. Mar Biotechnol (NY). 2001;3(Supplement 1):S24–36.View ArticleGoogle Scholar
- Kosswig C. Uber bastarde der teleostier Platypoecilus und Xiphophorus. Zeitschrift fur induktive Abstammungs- und Vererbungslehre. 1927;44:253.Google Scholar
- Walter RB, Hazelwood L, Kazianis S, editors. The Xiphophorus Genetic Stock Center Manual. 1st ed. San Marcos: Texas State University; 2006.Google Scholar
- Amores A, Catchen J, Nanda I, Warren W, Walter R, Schartl M, et al. A RAD-tag Genetic Map for the Platyfish (Xiphophorus maculatus) Reveals Mechanisms of Karyotype Evolution Among Teleost Fish. Genetics. 2014;197:625–41.PubMedPubMed CentralView ArticleGoogle Scholar
- Walter RB, Ju Z, Martinez A, Amemiya C, Samollow PB. Genomic resources for Xiphophorus research. Zebrafish. 2006;3(1):11–22.
- Layer RM, Hall IM, Quinlan AR. LUMPY: A probabilistic framework for structural variant discovery. arXiv preprint arXiv:1210.2342. 2012.
- Jones JC, Perez-Sato JA, Meyer A. A phylogeographic investigation of the hybrid origin of a species of swordtail fish from Mexico. Mol Ecol. 2012;21(11):2692–712.
- Otto TD, Dillon GP, Degrave WS, Berriman M. RATT: Rapid Annotation Transfer Tool. Nucleic Acids Res. 2011;39(9):e57.
- Chen K, Wallis JW, McLellan MD, Larson DE, Kalicki JM, Pohl CS, et al. BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nat Methods. 2009;6(9):677–81.
- Kang JH, Schartl M, Walter RB, Meyer A. Comprehensive phylogenetic analysis of all species of swordtails and platies (Pisces: Genus Xiphophorus) uncovers a hybrid origin of a swordtail fish, Xiphophorus monticolus, and demonstrates that the sexually selected sword originated in the ancestral lineage of the genus, but was lost again secondarily. BMC Evol Biol. 2013;13.
- Howe K, Clark MD, Torroja CF, Torrance J, Berthelot C, Muffato M, et al. The zebrafish reference genome sequence and its relationship to the human genome. Nature. 2013;496(7446):498–503.
- Catchen JM, Braasch I, Postlethwait JH. Conserved synteny and the zebrafish genome. Methods Cell Biol. 2011;104:259–85.
- Kasahara M, Naruse K, Sasaki S, Nakatani Y, Qu W, Ahsan B, et al. The medaka draft genome and insights into vertebrate genome evolution. Nature. 2007;447(7145):714–9.
- Naruse K, Tanaka M, Mita K, Shima A, Postlethwait J, Mitani H. A medaka gene map: the trace of ancestral vertebrate proto-chromosomes revealed by comparative gene mapping. Genome Res. 2004;14(5):820–8.
- Anders F. Contributions of the Gordon-Kosswig melanoma system to the present concept of neoplasia. Pigment Cell Res. 1991;4(1):7–29.
- Kallman KD. How the Xiphophorus problem arrived in San Marcos, Texas. Mar Biotechnol. 2001;3:S6–16.
- Luo R, Liu B, Xie Y, Li Z, Huang W, Yuan J, et al. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience. 2012;1(1):18.
- Kurtz S, Phillippy A, Delcher AL, Smoot M, Shumway M, Antonescu C, et al. Versatile and open software for comparing large genomes. Genome Biol. 2004;5(2):R12.
- Dodt M, Roehr JT, Ahmed R, Dieterich C. FLEXBAR-Flexible Barcode and Adapter Processing for Next-Generation Sequencing Platforms. Biology. 2012;1(3):895–905.
- Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25(14):1754–60.
- Koboldt DC, Zhang Q, Larson DE, Shen D, McLellan MD, Lin L, et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 2012;22(3):568–76.
- Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, et al. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w(1118); iso-2; iso-3. Fly. 2012;6(2):80–92.
- Wang J, Duncan D, Shi Z, Zhang B. WEB-based GEne SeT AnaLysis Toolkit (WebGestalt): update 2013. Nucleic Acids Res. 2013;41(Web Server issue):W77–83.
- Benjamini Y, Hochberg Y. Controlling the False Discovery Rate - a Practical and Powerful Approach to Multiple Testing. J Roy Stat Soc B Met. 1995;57(1):289–300.
- Price AL, Jones NC, Pevzner PA. De novo identification of repeat families in large genomes. Bioinformatics. 2005;21 Suppl 1:i351–8.