- Research article
- Open Access
Characterization of the heart transcriptome of the white shark (Carcharodon carcharias)
© Richards et al.; licensee BioMed Central Ltd. 2013
- Received: 25 February 2013
- Accepted: 20 September 2013
- Published: 11 October 2013
The white shark (Carcharodon carcharias) is a globally distributed apex predator possessing physical, physiological, and behavioral traits that have garnered it significant public attention. As a representative of the oldest extant jawed vertebrate lineage, the white shark is of interest for the genetic basis of its form and function; it is also of conservation concern due to its small population size and the threat from overfishing. Despite this, surprisingly little is known about the biology of white sharks, and genomic resources are unavailable. To address this deficit, we combined Roche-454 and Illumina sequencing technologies to characterize the first transcriptome of any tissue for this species.
From white shark heart cDNA we generated 665,399 Roche 454 reads (median length 387 bp) that were assembled into 141,626 contigs (mean length 503 bp). We also generated 78,566,588 Illumina reads, which we aligned to the 454 contigs, producing 105,014 454/Illumina consensus sequences. To these, we added 3,432 non-singleton 454 contigs. By comparing these sequences to the UniProtKB/Swiss-Prot database we were able to annotate 21,019 translated open reading frames (ORFs) of ≥ 20 amino acids. Of these, 19,277 were additionally assigned Gene Ontology (GO) functional annotations. While acknowledging the limitations of our single tissue transcriptome, Fisher tests showed the white shark transcriptome to be significantly enriched for numerous metabolic GO terms compared to the zebrafish and human transcriptomes, with white shark showing more similarity to human than to zebrafish (i.e. fewer terms were significantly different). We also compared the transcriptome to other available elasmobranch sequences for signatures of positive selection and identified several genes of putative adaptive significance on the white shark lineage. The white shark transcriptome also contained 8,404 microsatellites (dinucleotide, trinucleotide, or tetranucleotide motifs with at least five perfect repeats). Detailed characterization of these microsatellites showed that ORFs with trinucleotide repeats were significantly enriched for transcription regulatory roles and that trinucleotide frequency within ORFs was lower than for a wide range of taxonomic groups including other vertebrates.
The white shark heart transcriptome represents a valuable resource for future elasmobranch functional and comparative genomic studies, as well as for population and other biological studies vital for effective conservation of this globally vulnerable species.
- White shark
- Carcharodon carcharias
- Heart transcriptome
- Positive selection
Cartilaginous fishes (Class Chondrichthyes: sharks, skates, rays, chimaeras) provide a notable example of successful evolutionary perseverance, with a fossil record extending to at least the Lower Devonian over 400 million years ago. Given their extraordinary evolutionary history and basal phylogenetic origin relative to other jawed vertebrates, chondrichthyans have been proposed as an important comparative model for understanding vertebrate genome evolution in general and various specific evolutionary and mechanistic aspects of vertebrate development, physiology and immune function [2–5].
One group of chondrichthyans, the modern sharks (subclass Elasmobranchii), comprises over 500 extant species displaying an impressive diversity of form and function, including a broad spectrum of sizes (e.g. 20-1200 cm as adults), functional morphologies (e.g. fusiform heads to the novel, widened heads of hammerhead sharks), physiology (e.g. ectothermy to regional endothermy), reproduction (e.g. egg laying to live births) and habitat use (marine to freshwater; shallow waters to abyssal depths). Sharks have also become a major target of human exploitation for their fins, resulting in widespread concerns that their rapidly declining populations coupled with unique life history characteristics will not permit recovery if ongoing exploitation rates continue.
Despite representing a major vertebrate lineage of evolutionary uniqueness and ecological and conservation importance, sharks remain the least explored vertebrate group at the genome level. The handful of genome level studies conducted on sharks have already revealed some distinctive features, including the absence of the HoxC cluster of developmental pattern genes found in all non-elasmobranch vertebrate lineages, and the presence of a substantial number of expressed sequence tags for which no homologues in other organisms could be identified. These apparent distinctions hint that other genomic novelties are possible in this lineage and await discovery.
The white shark, Carcharodon carcharias (Lamnidae), a large apex predator, is one of the highest-profile marine species, capturing extraordinary attention from the public and media. Although it demonstrates a cosmopolitan distribution, the species is believed to have a low abundance throughout its range, leading to international concerns about its conservation (IUCN Red List Category: Vulnerable A2cd+3cd) in the face of known market utilization for its body parts and widespread shark overfishing practices [9–11]. Arguably, the white shark may be a “poster child” for marine, large animal conservation attention. The white shark also possesses some notable physical and physiological characteristics that make it an interesting biological study, including an estimated genome size (C-value = 6.45 pg) nearly twice that of humans, large adult sizes reaching up to ~6 m in length, a thermal regulatory capability uncommon in fishes, a slow reproductive cycle with oophagous embryos, extensive migratory capabilities, and an ability to utilize a wide thermal niche including diving to near 1000 m depths [12–14].
Despite the high public profile of white sharks, their serious conservation needs, and their noteworthy evolutionary and life-history characteristics, this species is still largely uncharacterized at the molecular level, and no genomics resources for it exist. Given the white shark’s rather large genome size, a transcriptome characterization using next-generation sequencing technology provides a tractable entry into providing the first genomic view and genome resource for this remarkable species. However, obtaining white shark tissue is extremely difficult (see Methods), and as a consequence our study was restricted to one tissue type (heart) from one individual. This precluded examination of expression differences among tissue types, and we acknowledge the obvious limitation of a single transcriptome that may not be typical of the species.
Typically, de-novo transcriptomes for non-model organisms where no reference genome exists have been obtained using Roche 454 pyrosequencing technology because of the generation of longer sequencing reads e.g. [15–22]. However, recent advances in de-novo assembly for shorter Illumina reads are now making this approach a more viable alternative. In addition, some workers have combined both approaches e.g. [15, 24], and here we adopt this latter approach for deriving the first transcriptome dataset for the white shark. Specifically, Illumina reads were aligned to 454 contigs to produce a 454/Illumina consensus sequence. By utilizing the strengths of both sequencing technologies, this approach yielded a considerable increase (~20%) in transcriptome annotation when compared to 454 alone. We utilize this sequence dataset to provide a general characterization of the heart transcriptome with regard to gene discovery and annotation, identification and characterization of multiple microsatellite markers, and detection of genes under positive selection.
Roche 454 sequencing of the white shark heart cDNA produced 665,399 reads ranging in size from 100-931 bp (median = 387 bp) for a total of 240,894,914 bp. The de-novo assembly produced 141,626 contigs (unigenes) ranging in size from 101–12,997 bp, with a mean of 503 bp. The distribution of the number of reads per contig was as follows: 87,500 contigs (62%) = 1 read (singletons), 37,915 contigs (27%) = 2–5 reads, 6,595 contigs (5%) = 6–10 reads, and 9,616 contigs (7%) >10 reads (max = 568). The Illumina HiSeq run produced 78,566,588 100 bp reads. Aligning these data to the 454 contigs produced 105,014 454/Illumina consensus sequences (36,612 454 contigs lacked a consensus sequence). A total of 86,785 (82.6%) of the consensus sequences contained an ORF of 20 amino acids or longer. Of the 454 contigs lacking a 454/Illumina consensus sequence, 3,432 (9.4%) were non-singletons and 2,750 contained an ORF of 20 amino acids or longer. These ORFs were combined with the 86,785 ORFs obtained from the consensus sequences resulting in a total of 89,535 ORFs that were subsequently annotated. For purposes of quantitative evaluation of our combined 454/Illumina approach (e.g. number and length of contigs and number of annotated ORFs), we also processed the 454 data exclusively. Non-singleton 454 contigs (54,126) contained 52,841 ORFs of 20 amino acids or longer (97.6%). The 454 and Illumina derived short read files were deposited in the Sequence Read Archive at NCBI under the study accession number SRP016555. The 454 contigs, 454/Illumina consensus sequences, and 454/Illumina consensus ORFs (89,535) are included as Additional files 1, 2, and 3 respectively.
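The read-per-contig tally reported above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the bin boundaries (1, 2–5, 6–10, >10 reads) are taken from the text rather than from the assembler's own output.

```python
def bin_reads_per_contig(read_counts):
    """Tally contigs into the bins used in the text: 1, 2-5, 6-10, >10 reads."""
    bins = {"1": 0, "2-5": 0, "6-10": 0, ">10": 0}
    for n in read_counts:
        if n < 1:
            raise ValueError("a contig must contain at least one read")
        if n == 1:
            bins["1"] += 1          # singleton
        elif n <= 5:
            bins["2-5"] += 1
        elif n <= 10:
            bins["6-10"] += 1
        else:
            bins[">10"] += 1
    return bins


def percentages(bins):
    """Express each bin as a rounded percentage of all contigs."""
    total = sum(bins.values())
    return {label: round(100 * count / total) for label, count in bins.items()}


# Contig counts per bin as quoted in the text.
reported = {"1": 87_500, "2-5": 37_915, "6-10": 6_595, ">10": 9_616}

if __name__ == "__main__":
    print(sum(reported.values()), percentages(reported))
```

Summing the reported bins is a quick consistency check against the total number of contigs, and subtracting the singleton bin gives the number of non-singleton contigs used in the 454-only comparison.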
For a 454 contig, if there were nucleotide sites lacking consensus with Illumina data (possibly due to lack of coverage), the consensus sequence would contain Ns at the relevant positions. This in turn would lead to Xs (unspecified or unknown amino acids) in the subsequent translated ORF. The 86,785 ORFs generated from the 105,014 consensus sequences contained 7,674,130 amino acids (AA) including 783,158 Xs (10.2%). To place this apparent loss of AA data in perspective, the 52,841 ORFs generated from the 454 data alone contained 5,579,487 AA. Therefore, despite the ~10% loss of AA data in the consensus approach, we were still able to generate 1,311,485 more AA data, an increase of approximately 24%, using the combined platform approach.
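The amino-acid accounting in this paragraph is simple arithmetic and can be rechecked directly. The sketch below uses only the figures quoted in the text; the variable and function names are hypothetical.

```python
# Figures quoted in the text above.
CONSENSUS_AA_TOTAL = 7_674_130    # amino acids in the 86,785 consensus ORFs
CONSENSUS_AA_UNKNOWN = 783_158    # of which X (unspecified or unknown)
ONLY_454_AA_TOTAL = 5_579_487     # amino acids in the 52,841 454-only ORFs


def aa_accounting(total, unknown, baseline):
    """Return (fraction of X residues, net gain in resolved residues, relative gain)."""
    resolved = total - unknown        # residues that are not X
    gain = resolved - baseline        # extra resolved residues versus 454 alone
    return unknown / total, gain, gain / baseline


frac_x, gain, rel_gain = aa_accounting(
    CONSENSUS_AA_TOTAL, CONSENSUS_AA_UNKNOWN, ONLY_454_AA_TOTAL
)

if __name__ == "__main__":
    print(f"X fraction: {frac_x:.1%}")
    print(f"net gain: {gain:,} residues ({rel_gain:.1%} relative to 454 alone)")
```

Note that the net gain is computed after discarding the X residues, so it measures resolved amino acids only.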
Annotation and comparative gene ontology
The ORF annotation was performed by searching the Swiss-Prot database using Blast2GO, and a total of 21,019 consensus ORFs (23.5%) had blast hits with the database, with 19,277 (21.5%) of these receiving annotation with Gene Ontology (GO) terms (see Additional file 4). In comparison, 16,996 454-derived ORFs (32.2%) had blast hits with the Swiss-Prot database, with 15,597 (29.5%) annotated with GO terms (see Additional file 5). Consequently, although the mean ORF length for the consensus data was lower, there was a considerable increase in the number of annotated ORFs obtained (approximately one fifth), highlighting the improvement gained when 454 and Illumina data are combined.
For the Molecular Function domain (Figure 3), comparison to the zebrafish showed 20 (32%) GO terms to be significantly enriched (i.e. had a significantly higher proportion of ORFs assigned) in the white shark, whereas comparison to human showed 18 (29%) terms to be enriched. There were 11 terms enriched in the zebrafish comparison that were not enriched in the human comparison. In general, these enriched terms described ion/nucleic acid/RNA binding, and enzyme/peptidase/nuclease/hydrolase/electron carrier activity. In turn, there were nine terms enriched in the human comparison that were not enriched in the comparison to zebrafish. In general, these terms described pyrophosphatase/phosphotransferase/hydrolase/nucleoside/transferase/kinase activity. While many of these enzymatic terms are likely involved in metabolic processes, two terms for the zebrafish comparison are perhaps particularly noteworthy: electron carrier and peptidase activity. Enrichment in these may again reflect the endothermic nature of the white shark. For example, electron carrier term enrichment suggests elevated oxidative metabolism, which is consistent with the increased energetic needs of an endothermic physiology, a continuous swimming lifestyle (required to obtain sufficient ventilation and hydrostatic lift) and the very long distance migratory capability of white sharks [29, 30]. Enrichment for the peptidase activity term suggests increased digestive rates in white sharks, consistent with previous hypotheses of this capability based on the elevated temperatures observed in the stomach and other viscera of white sharks.
The Cellular Component domain (Figure 4) describes where a gene product is active, and a notably large number of GO terms in this domain were enriched for the white shark: 56% were enriched compared to human and 77% were enriched compared to zebrafish. This unexpectedly large difference in GO term enrichment in the white shark - zebrafish comparison compared to the white shark - human comparison also hints at the possibility that a component of the white shark transcriptome may be more similar to human than zebrafish. Similarities between another chondrichthyan and humans were also apparent in the genome sequence comparisons of Venkatesh et al. [4, 31], in which the elephant shark (a chimaera; subclass Euchondrocephali) surprisingly shared a higher degree of gene synteny and more conserved non-coding elements (CNEs) with humans than with either the zebrafish or puffer fish (Fugu rubripes).
Expansion of trinucleotide repeats within ORFs has been implicated in human neurodegenerative disorders and some cancers [43–47]. Notably, elasmobranchs allegedly have the lowest incidence of malignant neoplasia (tumors) of any vertebrate group, although this claim remains controversial due to a lack of sufficient study. If further studies demonstrate that elasmobranchs do indeed have a lower susceptibility to cancer, the relatively lower proportion of trinucleotide microsatellite repeats within ORFs, as seen here for the white shark, may provide a genetic mechanism hypothesis for further exploration.
There were 1,600 ORFs that contained one or more microsatellites (1,888 microsatellites in total). Of these, 1,331 ORFs contained one or more dinucleotide repeats, 255 ORFs contained one or more trinucleotide repeats, and 14 ORFs contained one or more tetranucleotide repeats. A total of 413 (~ 26%) of these microsatellite-containing ORFs were annotated (motif distribution: di = 284, tri = 122, and tetra = 7). For these ORFs, we investigated whether any of the GO terms assigned to them appeared in significantly higher proportions (i.e. were relatively enriched) compared to the remainder of the transcriptome’s non-microsatellite containing ORFs. For ORFs containing dinucleotide or tetranucleotide repeats, a Fisher test showed that no GO term was significantly enriched (FDR = 0.05). For ORFs containing trinucleotide repeats, however, terms within the Molecular Function domain (nucleic acid/DNA binding and transcription factor/regulator activity) and Cellular Component domain (nucleoplasm) were significantly enriched.
The Molecular Function domain enriched terms described gene products that (i) interacted selectively and non-covalently with nucleic acids, and (ii) interacted selectively and non-covalently with specific DNA sequences in order to modulate transcription. These results suggest that white shark ORFs containing trinucleotide repeats may have regulatory roles involved in the control of transcription (see Additional file 7 for a list of these ORFs).
Finally, the large pool of microsatellites discovered here provides the potential to greatly expand the limited microsatellite marker resources available for this vulnerable species. To this end and as part of a separate study, we are developing microsatellite PCR primers on a global set of white shark fin tissue samples. To date, we have tested 35 loci (mostly dinucleotide and trinucleotide repeats). Of these, 14 are scorable (an individual can be genotyped), suggesting good prospects for the development of additional loci (A. Bernard, VPR, MJS, MSS; data not shown).
We searched the white shark transcriptome for genes showing signs of positive selection by comparing it to embryo transcriptomes of two additional elasmobranch species: Scyliorhinus canicula (cat shark) and Leucoraja erinacea (little skate). For each of the three species, we tested each species’ lineage for positive selection using the branch-site test as implemented in PAML. Before correction for multiple testing, there were ten, three, and five genes on the white shark, cat shark, and skate lineages respectively that had significant results for positive selection (Additional file 8 shows results for white shark). After correction (FDR = 0.05), four white shark genes remained significant: UN031816 (TIP41-like protein), UN034361 (mediator of RNA polymerase II transcription subunit 20), UN050025 (protein MIS12 homolog), and UN034642 (uncharacterized protein C12orf12 homolog). None of the cat shark or skate genes remained significant after correction for multiple testing.
In yeast, TIP41 indirectly regulates cell growth by regulating SIT4 (serine/threonine-protein phosphatase 2A) activity. More specifically, when nutrients such as nitrogen or carbon are abundant, the rapamycin-sensitive TOR signaling pathway promotes binding of the inhibitor protein TAP42 to SIT4, thereby inhibiting its activity. However, when nutrients are low, TOR does not promote binding of TAP42; the inhibitor dissociates from SIT4 and is bound by TIP41, which in turn permits SIT4 activity. The regulatory role of protein phosphatases is debated in the literature, and they may function to both up and down regulate cell growth. Furthermore, given the important role of these enzymes in cell growth, they have been actively studied by cancer researchers, with some studies suggesting that they might possess tumor suppressive capabilities. However, other studies have emphasized their requirement for active cell growth and survival. Nevertheless, the finding here of positive selection for a white shark gene involved in their regulation warrants further investigation given (i) an apparent low incidence of malignant tumors reported for elasmobranchs, (ii) the high levels of nitrogen (urea) in shark tissue, and (iii) the unique ability of elasmobranchs when compared to higher vertebrates to regenerate kidney tissue [57, 58].
Mediator of RNA polymerase II transcription subunit 20 is a component of the Mediator complex. This large multiprotein complex, which is conserved among eukaryotes, binds RNA polymerase II and regulates transcription of class II genes [59–61]. In addition to controlling cell growth, the TOR signaling pathway has also been implicated in the regulation of transcription. For example, in yeast, TOR limits transcription when nitrogen levels are low. Perhaps the elevated level of urea in shark tissue is a factor contributing to positive selection for the Mediator subunit 20.
Comparison of GO terms for genes under positive selection for the white shark, two teleosts, and six mammals
(shared between teleosts and mammals)
Response to stimulus
Regulation of transcription from RNA polymerase II promoter
Response to stress* or wounding**
Immune system process, Immune response
Regulation of transcription DNA-dependent
(shared between teleosts and mammals)
DNA binding, or mismatch DNA binding
RNA polymerase II transcription mediator activity
Protein binding**, chemokine receptor binding**, interleukin binding**, interleukin-1 receptor binding*
DNA-directed RNA polymerase activity
Metal ion binding
Utilizing an approach that combined Roche 454 and Illumina sequencing technologies, we assembled and characterized the first white shark transcriptome. This combined approach yielded a considerable improvement over Roche 454 technology alone, generating 21,019 annotated transcripts. The white shark transcriptome is a valuable resource that adds to the currently nascent field of cartilaginous fish genomics and provides a reference for characterization of genomic datasets from other elasmobranchs, which we anticipate will emerge with increasing frequency. This resource also provides the first large-scale view of the gene content of a major marine apex predator that displays a collection of remarkable physical, physiological, and behavioral properties. Of particular interest is the observation that the proportion of annotated transcripts involved in metabolic processes was more similar between the white shark and humans than between the white shark and a teleost, a finding consistent with those of Venkatesh et al. [4, 31] who found genomic non-coding elements and the relative position of genes to be more similar between another cartilaginous fish (the elephant shark) and humans than between the elephant shark and a teleost. We also compared the white shark transcriptome to other available elasmobranch sequences for signatures of positive selection and identified several genes of putative adaptive significance on the white shark lineage. The transcriptome resource also provides a large set of new microsatellites that will be immediately useful as markers in studies of population structure, dispersal dynamics, genetic diversity, and mating system biology to further the conservation and management of this vulnerable species.
The white shark is protected by many countries, including the US, and is also a CITES Appendix II listed species. Consequently, obtaining white shark tissue is extremely difficult. However, we were able to obtain tissue from a juvenile white shark illegally landed by an independent fisher off the Delaware, USA coast in 2007. The shark was confiscated from the fisher by the US National Oceanic and Atmospheric Administration Office for Law Enforcement. The heart was collected during a subsequent necropsy of the shark conducted by the National Oceanic and Atmospheric Administration for scientific data collection, and provided to us by this agency for further analysis. The heart was kept frozen at -80°C until sub-sampled for RNA isolation.
cDNA library construction and Roche 454/Illumina sequencing
Total RNA was isolated by homogenization of heart tissue in TRIzol (Invitrogen, Carlsbad, USA) followed by phenol-chloroform extraction. Full-length cDNA was synthesized using two sets of oligo dT primers in a two-step procedure, and single-stranded cDNA was used for hybridization instead of double-stranded. After hybridization, reassociated ds-cDNA was separated from ss-cDNA (normalized cDNA) by hydroxyapatite chromatography. Normalized cDNA was re-amplified using an oligo dT specific primer (L4N). cDNA was sequenced using a single run on the Roche 454 GS FLX platform and a single lane of Illumina HiSeq 2000 (100 bp reads, single end).
Sequence assembly and annotation
Roche 454 adaptor sequences were removed using LUCY and the script SeqClean (http://compbio.dfci.harvard.edu/tgi/software). SeqClean was also used to remove reads containing low complexity sequence and reads shorter than 100 bp, and to clip low quality read ends (ends rich in undetermined bases). 454 reads were assembled into contigs de-novo using iAssembler v1.3. Contigs were searched for di, tri, and tetra microsatellites of five repeats or more using Phobos v3.3.12 as implemented in Geneious v5.5.3. Illumina HiSeq reads were aligned to 454 contigs using the program Burrows-Wheeler Aligner (BWA), and consensus sequences were built using the pileup format as implemented in SAMtools.
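To make the microsatellite criterion concrete (perfect di-, tri-, or tetranucleotide repeats of five or more units), the following Python sketch scans a sequence with regular expressions. It is not the Phobos algorithm and makes its own assumptions, which are labeled in the comments: only A/C/G/T are considered, motifs that are themselves repeats of a shorter unit (for example AA or ATAT) are ignored, and each run is reported once at its leftmost start. All function names are hypothetical.

```python
import re


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    return motif not in (motif + motif)[1:-1]


def find_microsatellites(seq, min_repeats=5, motif_lengths=(2, 3, 4)):
    """Return sorted (start, motif, repeat_count) tuples for perfect repeats.

    Assumptions of this sketch: 0-based start positions, A/C/G/T only,
    and non-primitive motifs are skipped so a run is counted once.
    """
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        # Lookahead lets us test every start position, including overlapping ones.
        pattern = re.compile(r"(?=(([ACGT]{%d})\2{%d,}))" % (k, min_repeats - 1))
        covered_until = 0
        for match in pattern.finditer(seq):
            start, run, motif = match.start(), match.group(1), match.group(2)
            if start < covered_until or not is_primitive(motif):
                continue  # inside a run already reported, or a degenerate motif
            hits.append((start, motif, len(run) // k))
            covered_until = start + len(run)
    return sorted(hits)


if __name__ == "__main__":
    example = "GG" + "CA" * 6 + "TTG" + "AAT" * 5 + "C" + "A" * 12
    for start, motif, count in find_microsatellites(example):
        print(start, motif, count)
```

In the example, the run of twelve A residues is deliberately not reported, because a mononucleotide run is not a dinucleotide, trinucleotide, or tetranucleotide microsatellite under the assumptions above.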
Roche 454/Illumina consensus sequences were searched for open reading frames (ORFs) of 20 amino acids or longer (including the start codon for methionine, but omitting unspecified or unknown amino acids [coded as X]) using the script longorf.pl (available at http://search.cpan.org/~cjfields/BioPerl-1.6.901/examples/longorf.pl). Non-singleton 454 contigs lacking Illumina read coverage were also searched for ORFs using the longorf.pl procedure. Annotation for ORFs was obtained using Blast2GO v.2.5.0. Amino acids were searched against the UniProtKB/Swiss-Prot database using an E value cut-off = 1e-6 (retaining best 20 hits), with a minimum amino acid alignment length cut-off (high-scoring segment pair length) of 33. Blast2GO was also used to assign GO terms.
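The ORF criterion described above (start at methionine, at least 20 amino acids once X residues are discounted) can be illustrated with a short Python sketch. This is not a port of longorf.pl: the six-frame search, the use of the standard genetic code, the treatment of any codon containing N as X, and the acceptance of ORFs that run off the end of a contig without a stop codon are all assumptions of this sketch, and the function names are hypothetical.

```python
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(nt):
    """Translate in frame 0; any codon not in the table (e.g. with N) becomes X."""
    return "".join(CODON_TABLE.get(nt[i:i + 3], "X") for i in range(0, len(nt) - 2, 3))


def reverse_complement(nt):
    return nt.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def resolved_length(protein):
    """Length of a peptide with unknown residues (X) discounted."""
    return len(protein) - protein.count("X")


def longest_orf(nt, min_aa=20):
    """Return the longest M-initiated ORF over six frames, or None if too short."""
    nt = nt.upper()
    best = ""
    for strand in (nt, reverse_complement(nt)):
        for frame in range(3):
            # Stops split the translation into candidate segments.
            for segment in translate(strand[frame:]).split("*"):
                start = segment.find("M")
                if start == -1:
                    continue
                orf = segment[start:]
                if resolved_length(orf) > resolved_length(best):
                    best = orf
    return best if resolved_length(best) >= min_aa else None


if __name__ == "__main__":
    contig = "CC" + "ATG" + "GCT" * 20 + "TAA" + "GG"
    print(longest_orf(contig))
```

Discounting X when applying the length threshold mirrors the statement that unknown amino acids were omitted from the 20-residue minimum.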
For purposes of quantitative evaluation of our combined 454/Illumina sequencing platform approach, we also processed the 454 data without combining it with Illumina data. After singletons were removed, the contigs were searched for ORFs and annotated using the same procedure as for the combined data.
Blast2GO was also used to annotate ORFs and assign GO terms for the zebrafish and human transcriptomes (same procedure as above) to provide comparison to the white shark. Transcriptomes for zebrafish and human were obtained from Ensembl (Danio_rerio.Zv9.66.cdna.all.fa, Homo_sapiens.GRCh37.67.cdna.all.fa). Note: the Ensembl cdna.all files contain “the super set of all transcripts resulting from Ensembl known, novel and pseudo gene predictions” (see the associated readme file for a complete description). Relative enrichment of GO terms for white shark when compared to zebrafish and human (separate comparisons) was assessed using a Fisher exact test. The test was performed using the Gossip statistical package implemented within Blast2GO. The false discovery rate (FDR) procedure of Benjamini and Hochberg was used to correct for multiple hypothesis testing (FDR = 0.05). We did not test for underrepresentation (lower proportion of terms) as the white shark transcriptome was obtained from a single tissue type and may therefore not represent complete genomic expression.
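The enrichment logic described here, a one-sided Fisher exact test per GO term followed by Benjamini-Hochberg adjustment, can be sketched in plain Python. This does not reproduce the Gossip or Blast2GO implementation; the function names are hypothetical and the term counts in the example are invented purely for illustration. Exact integer binomials are used for clarity, which is slow for very large tables.

```python
from math import comb


def fisher_enrichment_p(a, b, c, d):
    """One-sided Fisher exact P value for over-representation.

    2x2 table: a = test ORFs with the term, b = test ORFs without it,
               c = reference ORFs with the term, d = reference ORFs without it.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    # Hypergeometric tail: probability of seeing a or more annotated test ORFs.
    tail = sum(comb(col1, x) * comb(total - col1, row1 - x) for x in range(a, upper + 1))
    return tail / comb(total, row1)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from the largest P value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical counts: (with term, without term) in test and reference sets.
    tables = {
        "term_A": (120, 1880, 60, 1940),
        "term_B": (15, 1985, 14, 1986),
    }
    names = list(tables)
    raw = [fisher_enrichment_p(*tables[name]) for name in names]
    for name, p, q in zip(names, raw, benjamini_hochberg(raw)):
        print(name, p, q, "enriched" if q < 0.05 else "not significant")
```

Testing only the upper tail matches the decision stated above not to test for underrepresentation.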
Branch-site test of positive selection
In order to detect genes under positive selection using the branch-site test, we obtained embryo transcriptome data for two additional elasmobranch species: Scyliorhinus canicula (smallspotted cat shark) and Leucoraja erinacea (little skate). The data were downloaded from the Gene Expression Omnibus database at NCBI (accession number GSE26235). Transcripts were searched for open reading frames of 20 amino acids or longer using the same procedure as for the white shark. For each of the three elasmobranch species (C. carcharias, S. canicula and L. erinacea), and their putative homologous loci (procedure described below), we tested genes in each species’ lineage for positive selection using the branch-site test as implemented in codeml in PAML (Phylogenetic Analysis by Maximum Likelihood) version 4.4. The test was performed on homologous core genes (those genes shared among all three species). Homologous genes were delineated using the MCL algorithm as implemented in the MCLBLASTLINE pipeline (available at http://micans.org/mcl). The pipeline uses Markov clustering (MCL) to assign genes to homologous clusters based on a BLASTp search between all species pairs of protein sequences using an E value cut-off of 1e-5. The MCL algorithm was implemented using an inflation parameter of 1.2. Only single copy core genes were used (i.e. clusters containing paralogs were excluded). The nucleotide sequences corresponding to each set of homologous core genes were aligned using Probalign. Alignment columns with a posterior probability <0.6 were removed, and alignments with >50% of the sites removed were discarded from the analysis. Using each of the alignments and the three elasmobranch species tree topology, positive selection was assessed for each lineage by performing likelihood ratio tests. We compared two branch-site models: (i) a null model that does not allow positive selection (model M1a) and (ii) an alternative model that allows positive selection (model A).
P values were calculated under the assumption that the likelihood ratio follows a chi-square distribution with one degree of freedom. Multiple testing adjustment was performed using a false discovery rate approach (significance level = 0.05).
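The P value calculation described above can be sketched as follows. The test statistic is twice the difference in log-likelihood between the alternative and null models, and for one degree of freedom the chi-square survival function has the closed form erfc(sqrt(x/2)), so no external library is needed. The function name and the log-likelihood values in the example are hypothetical and are not taken from the codeml output of this study.

```python
from math import erfc, sqrt


def lrt_pvalue_df1(lnl_null, lnl_alt):
    """P value for a likelihood ratio test with one degree of freedom.

    lnl_null, lnl_alt: maximized log-likelihoods of the null and
    alternative models. A negative statistic (possible through
    optimization noise) is clamped to zero.
    """
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Survival function of chi-square with 1 d.f.: P(X > x) = erfc(sqrt(x / 2)).
    return erfc(sqrt(statistic / 2.0))


if __name__ == "__main__":
    # Hypothetical log-likelihoods for one gene alignment.
    print(lrt_pvalue_df1(lnl_null=-2351.84, lnl_alt=-2346.10))
```

The resulting per-gene P values would then be passed to a false discovery rate procedure, as stated in the text.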
Availability of supporting data
The 454 and Illumina derived short read files are available at the NCBI Sequence Read Archive (SRA) under the study accession number SRP016555.
We are very grateful to Dr. L. Natanson (NOAA NEFSC) for providing the white shark heart and the NOAA Office for Law Enforcement for making the confiscated shark carcass available for scientific study. We thank the Cornell Life Sciences Core Laboratory for performing the 454 and Illumina sequencing, and we are grateful to Qi Sun and Jaroslaw Pillardy for their assistance with the bioinformatics analyses. This study was supported by grants from the Save Our Seas Foundation, Guy Harvey Ocean Foundation, and Hai Stiftung/Shark Foundation to MSS.
- Grogan ED, Lund R: The origin and relationships of Early Chondrichthyes. Biology of sharks and their relatives [CRC Marine Biology Series]. Edited by: Carrier JC, Musick JA, Heithaus MR. 2004, Boca Raton, London etc: CRC Press, 3-31.
- Schneider I, Aneas I, Gehrke AR, Dahn RD, Nobrega MA, Shubin NH: Appendage expression driven by the Hoxd Global Control Region is an ancient gnathostome feature. Proc Natl Acad Sci U S A. 2011, 108 (31): 12782-12786. 10.1073/pnas.1109993108.
- Tan YY, Kodzius R, Tay BH, Tay A, Brenner S, Venkatesh B: Sequencing and analysis of full-length cDNAs, 5′-ESTs and 3′-ESTs from a cartilaginous fish, the elephant shark (Callorhinchus milii). PLoS One. 2012, 7 (10): e47174. 10.1371/journal.pone.0047174.
- Venkatesh B, Kirkness EF, Loh Y-H, Halpern AL, Lee AP, Johnson J, Dandona N, Viswanathan LD, Tay A, Venter JC: Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome. PLOS Biology. 2007, 5 (4): 0932-0944.
- Wang Q, Arighi CN, King BL, Polson SW, Vincent J, Chen C, Huang H, Kingham BF, Page ST, Rendino MF: Community annotation and bioinformatics workforce development in concert--Little Skate Genome Annotation Workshops and Jamborees. Database (Oxford). 2012, 2012: bar064. 10.1093/database/bar064.
- Clarke SC, McAllister MK, Milner-Gulland EJ, Kirkwood GP, Michielsens CGJ, Agnew DJ, Pikitch EK, Nakano H, Shivji MS: Global estimates of shark catches using trade records from commercial markets. Ecol Lett. 2006, 9 (10): 1115-1126. 10.1111/j.1461-0248.2006.00968.x.
- King BL, Gillis JA, Carlisle HR, Dahn RD: A natural deletion of the HoxC Cluster in Elasmobranch fishes. Science (Washington D C). 2011, 334 (6062): 1517. 10.1126/science.1210912.
- Parton A, Bayne CJ, Barnes DW: Analysis and functional annotation of expressed sequence tags from in vitro cell lines of elasmobranchs: Spiny dogfish shark (Squalus acanthias) and little skate (Leucoraja erinacea). Comp Biochem Physiol Genom Proteomics. 2010, 5 (3): 199-206. 10.1016/j.cbd.2010.04.004.
- Chapple TK, Jorgensen SJ, Anderson SD, Kanive PE, Klimley AP, Botsford LW, Block BA: A first estimate of white shark, Carcharodon carcharias, abundance off Central California. Biol Lett. 2011, 7 (4): 581-583. 10.1098/rsbl.2011.0124.
- Fergusson I, Compagno L, Marks M: IUCN 2012. IUCN Red List of Threatened Species. Version 2012.2. 2009, http://www.iucnredlist.org
- Shivji MS, Chapman DD, Pikitch EK, Raymond PW: Genetic profiling reveals illegal international trade in fins of the great white shark, Carcharodon carcharias. Conserv Genet. 2005, 6 (6): 1035-1039.
- Castro JI: The sharks of North America. 2011, Oxford, New York etc: Oxford University Press.
- Gregory TR: Animal Genome Size Database. 2005, http://www.genomesize.com
- Nasby-Lucas N, Dewar H, Lam CH, Goldman KJ, Domeier ML: White shark offshore habitat: a behavioral and environmental characterization of the eastern Pacific offshore foraging area. PLoS One. 2009, 4 (12): e8163. 10.1371/journal.pone.0008163.
- Cahais V, Gayral P, Tsagkogeorga G, Melo-Ferreira J, Ballenghien M, Weinert L, Chiari Y, Belkhir K, Ranwez V, Galtier N: Reference-free transcriptome assembly in non-model animals from next-generation sequencing data. Mol Ecol Resour. 2012, 12: 834-845. 10.1111/j.1755-0998.2012.03148.x.
- Miller HC, Biggs PJ, Voelckel C, Nelson NJ: De novo sequence assembly and characterisation of a partial transcriptome for an evolutionarily distinct reptile, the tuatara (Sphenodon punctatus). BMC Genomics. 2012, 13: 439. 10.1186/1471-2164-13-439.
- Fraser BA, Weadick CJ, Janowitz I, Rodd FH, Hughes KA: Sequencing and characterization of the guppy (Poecilia reticulata) transcriptome. BMC Genomics. 2011, 12: 202. 10.1186/1471-2164-12-202.
- Parchman TL, Geist KS, Grahnen JA, Benkman CW, Buerkle CA: Transcriptome sequencing in an ecologically important tree species: assembly, annotation, and marker discovery. BMC Genomics. 2010, 11: 180. 10.1186/1471-2164-11-180.
- Meyer E, Aglyamova GV, Wang S, Buchanan-Carter J, Abrego D, Colbourne JK, Willis BL, Matz MV: Sequencing and de novo analysis of a coral larval transcriptome using 454 GSFlx. BMC Genomics. 2009, 10: 219. 10.1186/1471-2164-10-219.
- Vera JC, Wheat CW, Fescemyer HW, Frilander MJ, Crawford DL, Hanski I, Marden JH: Rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing. Mol Ecol. 2008, 17 (7): 1636-1647. 10.1111/j.1365-294X.2008.03666.x.
- Der JP, Barker MS, Wickett NJ, de Pamphilis CW, Wolf PG: De novo characterization of the gametophyte transcriptome in bracken fern, Pteridium aquilinum. BMC Genomics. 2011, 12: 99. 10.1186/1471-2164-12-99.
- Reading BJ, Chapman RW, Schaff JE, Scholl EH, Opperman CH, Sullivan CV: An ovary transcriptome for all maturational stages of the striped bass (Morone saxatilis), a highly advanced perciform fish. BMC Res Notes. 2012, 5: 111-10.1186/1756-0500-5-111.PubMed CentralView ArticlePubMedGoogle Scholar
- Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q: Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011, 29 (7): 644-652. 10.1038/nbt.1883.PubMed CentralView ArticlePubMedGoogle Scholar
- Angeloni F, Wagemaker CA, Jetten MS, Op den Camp HJ, Janssen-Megens EM, Francoijs KJ, Stunnenberg HG, Ouborg NJ: De novo transcriptome characterization and development of genomic tools for Scabiosa columbaria L. using next-generation sequencing techniques. Mol Ecol Resour. 2011, 11 (4): 662-10.1111/j.1755-0998.2011.02990.x.View ArticlePubMedGoogle Scholar
- Gotz S, Garcia-Gomez JM, Terol J, Williams TD, Nagaraj SH, Nueda MJ, Robles M, Talon M, Dopazo J, Conesa A: High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res. 2008, 36 (10): 3420-3435. 10.1093/nar/gkn176.PubMed CentralView ArticlePubMedGoogle Scholar
- Katz SL: Design of heterothermic muscle in fish. J Exp Biol. 2002, 205 (Pt 15): 2251-2266.PubMedGoogle Scholar
- Goldman KJ: Regulation of body temperature in the white shark, Carcharodon carcharias. J Comp Physiol B Biochem Syst Environ Physiol. 1997, 167 (6): 423-429. 10.1007/s003600050092.View ArticleGoogle Scholar
- Dickson KA, Graham JB: Evolution and consequences of endothermy in fishes. Physiol Biochem Zool. 2004, 77 (6): 998-1018. 10.1086/423743.View ArticlePubMedGoogle Scholar
- Bonfil R, Meyer M, Scholl MC, Johnson R, O’Brien S, Oosthuizen H, Swanson S, Kotze D, Paterson M: Transoceanic migration, spatial dynamics, and population linkages of white sharks. Science (Washington D C). 2005, 310 (5745): 100-103. 10.1126/science.1114898.View ArticleGoogle Scholar
- Carlson JK, Goldman KJ, Lowe CG: Metabolism, energetic demand, and endothermy. Biology of sharks and their relatives [CRC Marine Biology Series]. Edited by: Carrier JC, Musick JA, Heithaus MR. 2004, Boca Raton, London etc: CRC Press, 203-224.View ArticleGoogle Scholar
- Venkatesh B, Kirkness EF, Loh YH, Halpern AL, Lee AP, Johnson J, Dandona N, Viswanathan LD, Tay A, Venter JC: Ancient noncoding elements conserved in the human genome. Science. 2006, 314 (5807): 1892-10.1126/science.1130708.View ArticlePubMedGoogle Scholar
- Hoffman JI, Nichols HJ: A novel approach for mining polymorphic microsatellite markers in silico. PLoS One. 2011, 6 (8): e23283-10.1371/journal.pone.0023283.PubMed CentralView ArticlePubMedGoogle Scholar
- Babik W, Stuglik M, Qi W, Kuenzli M, Kuduk K, Koteja P, Radwan J: Heart transcriptome of the bank vole (Myodes glareolus): towards understanding the evolutionary variation in metabolic rate. BMC Genomics. 2010, 11: 390-10.1186/1471-2164-11-390.PubMed CentralView ArticlePubMedGoogle Scholar
- Mikheyev AS, Vo T, Wee B, Singer MC, Parmesan C: Rapid microsatellite isolation from a butterfly by de novo transcriptome sequencing: performance and a comparison with AFLP-derived distances. PLoS One. 2010, 5 (6): e11212-10.1371/journal.pone.0011212.PubMed CentralView ArticlePubMedGoogle Scholar
- Teacher AGF, Kahkonen K, Merila J: Development of 61 new transcriptome-derived microsatellites for the Atlantic herring (Clupea harengus). Conserv Genet Resour. 2012, 4 (1): 71-74. 10.1007/s12686-011-9477-5.View ArticleGoogle Scholar
- Saarinen EV, Austin JD: When technology meets conservation: increased microsatellite marker production using 454 genome sequencing on the endangered Okaloosa Darter (Etheostoma okaloosae). J Hered. 2010, 101 (6): 784-788. 10.1093/jhered/esq080.View ArticlePubMedGoogle Scholar
- Boomer JJ, Stow AJ: Rapid isolation of the first set of polymorphic microsatellite loci from the Australian gummy shark, Mustelus antarcticus and their utility across divergent shark taxa. Conserv Genet Resour. 2010, 2: 393-395. 10.1007/s12686-010-9274-6.View ArticleGoogle Scholar
- Chabot CL: Characterization of 11 microsatellite loci for the brown smooth-hound shark, Mustelus henlei (Triakidae), discovered with next-generation sequencing. Conserv Genet Resour. 2012, 4 (1): 23-25. 10.1007/s12686-011-9464-x.View ArticleGoogle Scholar
- Chabot CL, Nigenda S: Characterization of 13 microsatellite loci for the tope shark, Galeorhinus galeus, discovered with next-generation sequencing and their utility for eastern Pacific smooth-hound sharks (Mustelus). Conserv Genet Resour. 2011, 3 (3): 553-555. 10.1007/s12686-011-9402-y.View ArticleGoogle Scholar
- Pardini AT, Jones CS, Scholl MC, Noble LR: Isolation and characterization of dinucleotide microsatellite loci in the great white shark. Carcharodon carcharias. Mol Ecol. 2000, 9 (8): 1176-1178. 10.1046/j.1365-294x.2000.00954-4.x.View ArticlePubMedGoogle Scholar
- Toth G, Gaspari Z, Jurka J: Microsatellites in different eukaryotic genomes: survey and analysis. Genome Res. 2000, 10 (7): 967-981. 10.1101/gr.10.7.967.PubMed CentralView ArticlePubMedGoogle Scholar
- Edwards YJ, Elgar G, Clark MS, Bishop MJ: The identification and characterization of microsatellites in the compact genome of the Japanese pufferfish, Fugu rubripes: perspectives in functional and comparative genomic analyses. J Mol Biol. 1998, 278 (4): 843-854. 10.1006/jmbi.1998.1752.View ArticlePubMedGoogle Scholar
- Arzimanoglou II, Gilbert F, Barber HR: Microsatellite instability in human solid tumors. Cancer. 1998, 82 (10): 1808-1820. 10.1002/(SICI)1097-0142(19980515)82:10<1808::AID-CNCR2>3.0.CO;2-J.View ArticlePubMedGoogle Scholar
- Bates G, Lehrach H: Trinucleotide repeat expansions and human genetic disease. Bioessays. 1994, 16 (4): 277-284. 10.1002/bies.950160411.View ArticlePubMedGoogle Scholar
- Reddy PS, Housman DE: The complex pathology of trinucleotide repeats. Curr Opin Cell Biol. 1997, 9 (3): 364-372. 10.1016/S0955-0674(97)80009-9.View ArticlePubMedGoogle Scholar
- Warren ST, Nelson DL: Trinucleotide repeat expansions in neurological disease. Curr Opin Neurobiol. 1993, 3 (5): 752-759. 10.1016/0959-4388(93)90149-S.View ArticlePubMedGoogle Scholar
- Wooster R, Cleton-Jansen AM, Collins N, Mangion J, Cornelis RS, Cooper CS, Gusterson BA, Ponder BA, von Deimling A, Wiestler OD: Instability of short tandem repeats (microsatellites) in human cancers. Nat Genet. 1994, 6 (2): 152-156. 10.1038/ng0294-152.View ArticlePubMedGoogle Scholar
- Ballantyne JS: Jaws: the inside story. The metabolism of elasmobranch fishes. Comp Biochem Physiol B Biochem Mol Biol. 1997, 118B (4): 703-742.View ArticleGoogle Scholar
- Ostrander GK, Cheng KC, Wolf JC, Wolfe MJ: Shark cartilage, cancer and the growing threat of pseudoscience. Cancer Res. 2004, 64 (23): 8485-8491. 10.1158/0008-5472.CAN-04-2260.View ArticlePubMedGoogle Scholar
- Pearson CE, Nichol Edamura K, Cleary JD: Repeat instability: mechanisms of dynamic mutations. Nat Rev Genet. 2005, 6 (10): 729-742. 10.1038/nrg1689.View ArticlePubMedGoogle Scholar
- Berger M, Vogt Sionov R, Levine AJ, Haupt Y: A role for the polyproline domain of p53 in its regulation by Mdm2. J Biol Chem. 2001, 276 (6): 3785-3790. 10.1074/jbc.M008879200.View ArticlePubMedGoogle Scholar
- Gerber HP, Seipel K, Georgiev O, Hofferer M, Hug M, Rusconi S, Schaffner W: Transcriptional activation modulated by homopolymeric glutamine and proline stretches. Science. 1994, 263 (5148): 808-811. 10.1126/science.8303297.View ArticlePubMedGoogle Scholar
- Perutz MF, Johnson T, Suzuki M, Finch JT: Glutamine repeats as polar zippers: their possible role in inherited neurodegenerative diseases. Proc Natl Acad Sci U S A. 1994, 91 (12): 5355-5358. 10.1073/pnas.91.12.5355.PubMed CentralView ArticlePubMedGoogle Scholar
- Yang Z: PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007, 24 (8): 1586-1591. 10.1093/molbev/msm088.View ArticlePubMedGoogle Scholar
- Jacinto E, Guo B, Arndt KT, Schmelzle T, Hall MN: TIP41 interacts with TAP42 and negatively regulates the TOR signaling pathway. Mol Cell. 2001, 8 (5): 1017-1026. 10.1016/S1097-2765(01)00386-0.View ArticlePubMedGoogle Scholar
- Schonthal AH: Role of serine/threonine protein phosphatase 2A in cancer. Cancer Lett. 2001, 170 (1): 1-13. 10.1016/S0304-3835(01)00561-4.View ArticlePubMedGoogle Scholar
- Little MH: Regrow or repair: potential regenerative therapies for the kidney. J Am Soc Nephrol. 2006, 17 (9): 2390-2401. 10.1681/ASN.2006030218.View ArticlePubMedGoogle Scholar
- Elger M, Hentschel H, Litteral J, Wellner M, Kirsch T, Luft FC, Haller H: Nephrogenesis is induced by partial nephrectomy in the elasmobranch Leucoraja erinacea. J Am Soc Nephrol. 2003, 14 (6): 1506-1518. 10.1097/01.ASN.0000067645.49562.09.View ArticlePubMedGoogle Scholar
- Blazek E, Mittler G, Meisterernst M: The mediator of RNA polymerase II. Chromosoma. 2005, 113 (8): 399-408. 10.1007/s00412-005-0329-5.View ArticlePubMedGoogle Scholar
- Malik S, Roeder RG: Dynamic regulation of pol II transcription by the mammalian Mediator complex. Trends Biochem Sci. 2005, 30 (5): 256-263. 10.1016/j.tibs.2005.03.009.View ArticlePubMedGoogle Scholar
- Asturias FJ, Jiang YW, Myers LC, Gustafsson CM, Kornberg RD: Conserved structures of mediator and RNA polymerase II holoenzyme. Science. 1999, 283 (5404): 985-987. 10.1126/science.283.5404.985.View ArticlePubMedGoogle Scholar
- Beck T, Hall MN: The TOR signalling pathway controls nuclear localization of nutrient-regulated transcription factors. Nature. 1999, 402 (6762): 689-692. 10.1038/45287.View ArticlePubMedGoogle Scholar
- Montoya-Burgos JI: Patterns of positive selection and neutral evolution in the protein-coding genes of Tetraodon and Takifugu. PLoS One. 2011, 6 (9): e24800-10.1371/journal.pone.0024800.PubMed CentralView ArticlePubMedGoogle Scholar
- Kosiol C, Vinar T, da Fonseca RR, Hubisz MJ, Bustamante CD, Nielsen R, Siepel A: Patterns of positive selection in six Mammalian genomes. PLoS Genet. 2008, 4 (8): e1000144-10.1371/journal.pgen.1000144.PubMed CentralView ArticlePubMedGoogle Scholar
- Cohen S: Strong positive selection and habitat-specific amino acid substitution patterns in MHC from an estuarine fish under intense pollution stress. Mol Biol Evol. 2002, 19 (11): 1870-1880. 10.1093/oxfordjournals.molbev.a004011.View ArticlePubMedGoogle Scholar
- Xu T, Sun Y, Shi G, Wang R: Miiuy croaker hepcidin gene and comparative analyses reveal evidence for positive selection. PLoS One. 2012, 7 (4): e35449-10.1371/journal.pone.0035449.PubMed CentralView ArticlePubMedGoogle Scholar
- Sundaram AY, Consuegra S, Kiron V, Fernandes JM: Positive selection pressure within teleost toll-like receptors tlr21 and tlr22 subfamilies and their response to temperature stress and microbial components in zebrafish. Mol Biol Rep. 2012, 39 (9): 8965-8975. 10.1007/s11033-012-1765-y.View ArticlePubMedGoogle Scholar
- van der Aa LM, Levraud JP, Yahmi M, Lauret E, Briolat V, Herbomel P, Benmansour A, Boudinot P: A large new subset of TRIM genes highly diversified by duplication and positive selection in teleost fish. BMC Biol. 2009, 7: 7-10.1186/1741-7007-7-7.PubMed CentralView ArticlePubMedGoogle Scholar
- Hinds KR, Litman GW: Major reorganization of immunoglobulin VH segmental elements during vertebrate evolution. Nature. 1986, 320 (6062): 546-549. 10.1038/320546a0.View ArticlePubMedGoogle Scholar
- Zhu C, Feng W, Weedon J, Hua P, Stefanov D, Ohta Y, Flajnik MF, Hsu E: The multiple shark Ig H chain genes rearrange and hypermutate autonomously. J Immunol. 2011, 187 (5): 2492-2501. 10.4049/jimmunol.1101671.PubMed CentralView ArticlePubMedGoogle Scholar
- Litman GW, Stolen JS, Sarvas HO, Makela O: The range and fine specificity of the anti-hapten immune response: phylogenetic studies. J Immunogenet. 1982, 9 (6): 465-474. 10.1111/j.1744-313X.1982.tb01008.x.View ArticlePubMedGoogle Scholar
- Makela O, Litman GW: Lack of heterogeneity in antihapten antibodies of a phylogenetically primitive shark. Nature. 1980, 287 (5783): 639-640. 10.1038/287639a0.View ArticlePubMedGoogle Scholar
- IUCN: Review of Migratory Chondrichthyan Fishes. Secretariat of the Convention on the Conservation of Migratory Species of Wild Animals (CMS). CMS Technical Report Series. 2007, 68-Google Scholar
- Patanjali SR, Parimoo S, Weissman SM: Construction of a uniform-abundance (normalized) cDNA library. Proc Natl Acad Sci USA. 1991, 88 (5): 1943-1947. 10.1073/pnas.88.5.1943.PubMed CentralView ArticlePubMedGoogle Scholar
- Chou HH, Holmes MH: DNA sequence quality trimming and vector removal. Bioinformatics. 2001, 17 (12): 1093-1104. 10.1093/bioinformatics/17.12.1093.View ArticlePubMedGoogle Scholar
- Zheng Y, Zhao L, Gao J, Fei Z: iAssembler: a package for de novo assembly of Roche-454/Sanger transcriptome sequences. BMC Bioinforma. 2011, 12: 453-10.1186/1471-2105-12-453.View ArticleGoogle Scholar
- Drummond A, Ashton B, Buxton S, Cheung M, Cooper A, Heled J, Kearse M, Moir R, Stones-Havas S: Geneious v5.1. 2010, Available from http://www.geneious.comGoogle Scholar
- Li H, Durbin R: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009, 25 (14): 1754-1760. 10.1093/bioinformatics/btp324.PubMed CentralView ArticlePubMedGoogle Scholar
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R: The sequence alignment/map format and SAMtools. Bioinformatics. 2009, 25 (16): 2078-2079. 10.1093/bioinformatics/btp352.PubMed CentralView ArticlePubMedGoogle Scholar
- Blüthgen N, Brand K, Cajavec B, Swat M, Herzel H, Beule D: Biological profiling of gene groups utilizing gene ontology. Genome Inform. 2005, 16 (1): 106-115.PubMedGoogle Scholar
- Benjamini Y, Hochberg Y: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B. 1995, 57: 289-300.Google Scholar
- Zhang J, Nielsen R, Yang Z: Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol Biol Evol. 2005, 22 (12): 2472-2479. 10.1093/molbev/msi237.View ArticlePubMedGoogle Scholar
- van Dongen S: Graph clustering by flow simulation PhD thesis. 2000, University of UtrechtGoogle Scholar
- Roshan U, Livesay DR: Probalign: multiple sequence alignment using partition function posterior probabilities. Bioinformatics. 2006, 22 (22): 2715-2721. 10.1093/bioinformatics/btl472.View ArticlePubMedGoogle Scholar
- Benjamini Y, Yekutieli D: The control of the false discovery rate in multiple testing under dependency. Ann Statist. 2001, 29 (4): 1165-1188. 10.1214/aos/1013699998.View ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.