- Research article
- Open Access
The venom-gland transcriptome of the eastern coral snake (Micrurus fulvius) reveals high venom complexity in the intragenomic evolution of venoms
BMC Genomicsvolume 14, Article number: 531 (2013)
Snake venom is shaped by the ecology and evolution of venomous species, and signals of positive selection in toxins have been consistently documented, reflecting the role of venoms as an ecologically critical phenotype. New World coral snakes (Elapidae) are represented by three genera and over 120 species and subspecies that are capable of causing significant human morbidity and mortality, yet coral-snake venom composition is poorly understood in comparison to that of Old World elapids. High-throughput sequencing is capable of identifying thousands of loci, while providing characterizations of expression patterns and the molecular evolutionary forces acting within the venom gland.
We describe the de novo assembly and analysis of the venom-gland transcriptome of the eastern coral snake (Micrurus fulvius). We identified 1,950 nontoxin transcripts and 116 toxin transcripts. These transcripts accounted for 57.1% of the total reads, with toxins accounting for 45.8% of the total reads. Phospholipases A2 and three-finger toxins dominated expression, accounting for 86.0% of the toxin reads. A total of 15 toxin families were identified, revealing venom complexity previously unknown from New World coral snakes. Toxins exhibited high levels of heterozygosity relative to nontoxins, and overdominance may favor gene duplication leading to the fixation of advantageous alleles. Phospholipase A2 expression was uniformly distributed throughout the class while three-finger toxin expression was dominated by a handful of transcripts, and phylogenetic analyses indicate that toxin divergence may have occurred following speciation. Positive selection was detected in three of the four most diverse toxin classes, suggesting that venom diversification is driven by recurrent directional selection.
We describe the most complete characterization of an elapid venom gland to date. Toxin gene duplication may be driven by heterozygote advantage, as the frequency of polymorphic toxin loci was significantly higher than that of nontoxins. Diversification among toxins appeared to follow speciation reflecting species-specific adaptation, and this divergence may be directly related to dietary shifts and is suggestive of a coevolutionary arms race.
Snake venom is a cocktail of biologically active proteins with multifarious pharmacological effects representing the inverse of the physiological processes that maintain prey homeostasis . Venom defines the ecology, life history, and evolution of venomous species due to its involvement in prey capture, digestion, and defense . Positive selection has been repeatedly detected in toxin genes and reflects the significant contribution of venoms to fitness [2–4]. These molecular signals of adaptive evolution coupled with compositional variation suggest that toxin diversification is an adaptation to diet and may reflect a predator-prey arms race . Snake venom components exert selective pressures on both prey [6, 7] and predators , and potentially offer a unique, dual perspective on predator-prey coevolution.
New World coral snakes (Elapidae) consist of three genera (Leptomicrurus, Micrurus, and Micruroides) and over 120 species and subspecies, inhabiting a diverse array of habitats from the southern United States to central Argentina . The eastern coral snake (Micrurus fulvius) is native to the forests of the southeastern United States and is mainly ophiophagous, preying upon smaller snakes and other squamates . Bites from Micrurus species can be lethal  due to the pre- and postsynaptic effects of the neurotoxic venom components that dominate coral snake venoms (LD50 = 7–76 μ g venom/18–22 g mouse ), and polyvalent antivenom has been shown to be ineffective at treating bites from all Micrurus species [11, 12] and is currently unavailable in the United States . A full characterization of all venom components may allow the design of more effective polyvalent antivenom [13, 14], but coral-snake venom composition is poorly understood in comparison to that of Old World elapids , mainly due to the difficulty of procuring sufficient venom quantities during milking for standard proteomic techniques . High-throughput sequencing approaches are capable of identifying thousands of loci, enabling a detailed examination of the evolutionary forces shaping venom composition at the molecular level. We describe the first high-throughput transcriptomic characterization of an elapid venom gland to date. We sequenced the venom-gland transcriptome of M. fulvius with Illumina technology using the paired-end approach of Rokyta et al. , and used the generated sequence data to examine the relationship between toxin heterozygosity and gene duplication events and uncover distinct expression patterns in highly expressed and extremely diverse toxin gene families.
Results and discussion
High venom complexity revealed by means of sequencing
Our high-throughput transcriptomic analysis revealed high venom complexity in M. fulvius, comparable to the diversity of toxin components recently identified in the venom-gland transcriptome of the eastern diamondback rattlesnake (Crotalus adamanteus: Viperidae) . We generated 79,573,048 pairs of 100 nucleotide raw reads and merged 61,609,456 pairs on the basis of overlap at their 3’ ends, following the approach of Rokyta et al.  and Rodrigue et al. . These merged reads had an average length of 134 nucleotides with average phred scores of 46. The iterative assembly process described by Rokyta et al.  coupled with a reference-based assembly using nontoxin transcripts previously described in the venom-gland transcriptome of C. adamanteus resulted in the identification of 1,950 unique, full-length nontoxin coding sequences and 116 unique, full-length toxin coding transcripts (Figure 1A).
Transcript abundances were estimated by mapping 10 million reads to unique, full-length sequences with a minimum match percentage of 95% as described in Rokyta et al.  for C. adamanteus, against whose results we will be making comparisons. Toxin transcripts were grouped into clusters based on <1% nucleotide divergence for abundance estimates. This enabled us to account for allelic variation, recent duplication events, and possible sequencing errors while also allowing reads to be mapped to a unique sequence rather than to multiple sequences, resulting in a more accurate estimation of abundance levels. Our estimates of abundance reflect mRNA levels and not necessarily protein abundances; the relationship between mRNA and protein levels is not always straightforward , and the construction of a complete genotype-phenotype map requires proteomic analyses (e.g., mass spectrometry).
The 1,950 nontoxin transcripts accounted for 11.3% of the reads. The 116 full-length toxin transcripts were grouped into 75 clusters (Figure 1B, Table 1) and accounted for 45.8% of the reads. In total, we accounted for 57.1% of the reads (Figure 2), comparable to the amount identified for C. adamanteus using a similar approach . While the overall percentage of reads mapping to identified transcripts was similar for M. fulvius and C. adamanteus, toxin expression levels were quite different (Figure 2). The toxin transcripts identified in M. fulvius accounted for nearly half of the total sequencing reads (45.8%) while the toxin transcripts in C. adamanteus accounted for approximately one-third (35.4%) of the total reads (Figure 2) . The numbers and abundances of nontoxin coding sequences were much lower in M. fulvius than in C. adamanteus, despite an increase in assembly effort (e.g., the addition of the reference-based assembly), as 1,950 nontoxin transcripts accounted for 11.3% of the total reads in M. fulvius while 2,879 nontoxin transcripts accounted for 27.5% of the total reads in C. adamanteus (Figure 2). The venom-gland transcriptome of C. adamanteus was characterized by large, hemorrhage-inducing toxins such as snake venom metalloproteinases (SVMPs), proteins that presumably require extensive downstream processing by nontoxin machinery prior to becoming mature, active toxins . The vast majority of highly expressed nontoxin sequences identified in the transcriptome of C. adamanteus were involved with proteostasis (e.g., protein folding, degradation, and transport) , and the reduction in the expression levels of nontoxin transcripts in M. fulvius could potentially reflect a difference in the maintenance, production, and folding requirements of the venom components of each species. The venom of M. fulvius was dominated by three-finger toxins (3FTxs) and phospholipases A2 (PLA2s), relatively short toxins that may not require the degree of downstream processing needed by larger toxins to become functional. This suggests that venoms dominated by smaller proteins differ in the transcriptional effort expended on toxins relative to nontoxins in comparison to venoms characterized by high-molecular weight components, with small-component venoms expressing toxins at much higher levels relative to nontoxin production. The largely proteinaceous composition of venom makes it metabolically costly to produce , and a reduction in the machinery necessary for the production of functional toxic proteins may confer an energetic benefit to species expressing smaller peptides and enzymes. Simple, smaller toxins have a reduced mutational target relative to larger proteins as a function of sequence length, potentially reducing the ability to evolve effective counterdefenses to resistance development in frequently envenomed prey [6, 7] and predators  where more complex venoms would be advantageous. However, as our hypotheses are based on a comparison between a single representative of each family, sequencing additional members of Viperidae and Elapidae are needed to test whether these putative differences in transcriptional effort are fixed or unique to M. fulvius and/or C.adamanteus.
Toxin class expression patterns
We identified 116 unique, full-length toxin transcripts representing 15 toxin classes or families which clustered into 75 groups with <1% nucleotide divergence. Clusters could include alleles, recent duplicates, or assembly/sequencing errors. These clusters accounted for 45.8% of the total reads (Figure 2), and nearly all of the high-abundance transcripts coded for toxins (Figure 1A). Of the 75 toxin clusters identified, 71 were among the top 200 most highly expressed genes (Figure 1A), while only 63 of the 78 toxin clusters identified in C. adamanteus were among the top 200 most highly expressed genes . The expression patterns of the venom gland appear to be biased towards toxin production in C. adamanteus, but reach a more extreme level of specialization in M. fulvius. Although sequencing a single specimen may not accurately reflect gene expression for all individuals of M. fulvius, our analyses provide a reference transcriptome for future work with the species. We used the number of reads mapping to a specific transcript as an estimate of its abundance as per Rokyta et al. . Toxin transcripts were named by the toxin-class abbreviation followed by the cluster number and a letter if multiple transcripts were present within the cluster.
We identified 54 unique sequences of PLA2s that grouped into 31 clusters, which accounted for 64.9% of toxin reads and 29.7% of the total reads (Figure 2). PLA2s were the most abundant and diverse toxin class in the M. fulvius venom-gland transcriptome. PLA2s are esterolytic en- zymes and are among the most toxic components of snake venoms, causing pre- and postsynaptic neurotoxicity, myotoxicity, cardiotoxicity, hemolytic activity, anticoagulant activity, and hypotensive activity among other effects . Conversely, SVMPs were the most abundant toxin gene family in C. adamanteus, with 16 clusters representing 39 sequences accounting for 24.4% of the toxin reads and 8.6% of the total reads . PLA2s accounted for more than twice the percentage of toxin reads and over three times the percentage of total reads than the most abundant class in C. adamanteus, demonstrating the extreme dominance of gene expression by the PLA2 gene family in M. fulvius.
We identified 26 full-length transcripts and 15 clusters of 3FTxs, which accounted for 21.1% of toxin reads and 9.7% of the total reads (Figure 2). 3FTxs were the second most abundant and diverse toxin class, and with PLA2s accounted for 86.0% of the total toxin reads. 3FTxs are short (60–71 amino-acid residues), nonenzymatic proteins characterized by three β-stranded loops extending from a center protein core . These proteins are common components of elapid snake venoms and exhibit postsynaptic neurotoxicity by inhibiting the binding of acetylcholine to the muscle nicotinic acetylcholine receptors, causing a blockage of neuromuscular transmission . The venom-gland transcriptome of M. altirostris was dominated by 3FTxs, as this toxin family accounted for approximately half of the identified expressed sequence tags . In M. fulvius, 3FTxs only accounted for 9.7% of the total reads, reflecting vast differences in expression patterns within the genus.
Long-chain neurotoxins (LCNs) were the third most abundant and fourth most diverse toxin class. We identified nine sequences, which grouped into seven clusters and accounted for 4.0% of toxin reads and 1.8% of the total reads (Figure 2). LCNs are similar to the short-chain 3FTxs described above but contain an additional 6–12 amino acid residues and an additional disulfide bond . While both antagonize muscle nicotinic acetylcholine receptors, differences in the functional sites of these toxins correlate to differences in the specificity of their targets . LCNs were much more similar to one another than 3FTxs, being 5.3% divergent at the nucleotide level and 12.8% divergent at the amino acid level whereas divergence among members of the 3FTx class was extremely high, with a paucity of conserved nucleotides and amino acids across cluster members.
Snake venom metalloproteinases
SVMPs were the fourth most abundant toxin class, represented by a single transcript which accounted for 2.9% of the toxin reads and 1.3% of the total reads (Figure 2). SVMPs display extensive local and systemic hemorrhagic activity associated with envenomation in viperids, but the role of these enzymes following Micrurus envenomation is uncertain. These enzymes can be subdivided into classes based on differences in domain structure, and the transcript identified in the venom-gland transcriptome of M. fulvius is a member of the class SVMP-III, characterized by disintegrin-like, cysteine-rich, and metalloproteinase domains . Larger molecular weight toxins, such as SVMPs, are often absent from elapid venoms and are more common components in the venoms of viperids [16, 23]. High metalloproteinase expression has been shown to be incompatible with high toxicity in rattlesnakes, being described by Mackessy as a tradeoff between neurotoxicity and hemorrhagic activity . High-molecular weight toxins such as SVMPs are the major causes of hemorrhage and necrosis commonly associated with viperid envenomations, and presumably aid in digestion and are most effective when localized at the site of the bite. Conversely, toxins with systemic effects (e.g., toxicity) such as 3FTxs are more effective when spread through the envenomated organism is rapid, resulting in prey immobilization. This dichotomy in pharmocological effects appears to hold in elapids as well, demonstrating the antagonistic effects of these components. Yet in comparison to other transcriptomic work with Micrurus species, SVMP expression levels are nearly five-times higher in M. fulvius than in congeneric individuals , suggesting a more active functional role for this toxin in M. fulvius. This hypothesis awaits proteomic verification and testing via comparative functional assays to quantify the activity of SVMPs in M. fulvius relative to congeners.
Kunitz-type inhibitors (KUNs) were the fifth most abundant and third most diverse toxin class present in the M. fulvius transcriptome, with nine transcripts and eight clusters accounting for 2.2% of the toxin reads and 1.0% of the total reads (Figure 2). KUNs are characterized by possessing a compact tertiary fold and three disulfide bonds, functioning as both inhibitors of serine proteases and as neurotoxins by inhibiting calcium and potassium ion channels . KUN-4 was a very divergent transcript, being 504 nucleotides longer than the next longest transcript. It shared 92.0% identity with a putative KUN known from the Australian elapid Austrelaps labialis and 85.0% identity with a KUN identified in the venom-gland transcriptome of C. adamanteus. This transcript is considered putative until a comprehensive functional characterization is completed. The removal of this extremely divergent transcript resulted in a maximum pair-wise nucleotide divergence of 31.0% within KUNs and a maximum pair-wise amino acid divergence of 68.0%, reflecting the diversity of this toxin class, especially in comparison to LCNs.
PLA2s, 3FTxs, LCNs, SVMPs, and KUNs accounted for 95.1% of the reads mapping to toxin sequences (Figure 1B), 82.7% of the toxin clusters, and 85.3% of the unique toxin transcripts. The remaining low-abundance toxins fell into ten different classes, are listed under “Other” in Figure 1B and Figure 2, and are described in Table 2.
Heterozygosity and gene duplication
Single nucleotide polymorphisms (SNPs) were identified in the coding regions of toxin clusters and nontoxin sequences using the SeqMan module from the DNAStar Lasergene software suite following a templated alignment in NGen. We analyzed 2,025 contigs and detected 98 SNPs in 78 transcripts (Table 3). Of these 78 transcripts, 69 coded for nontoxin proteins while the remaining nine SNP-containing transcripts were identified as toxins. Ten of the 69 nontoxin transcripts and two of the nine toxin sequences with variable sites contained multiple SNPs. A total of 87 SNPs were spread over 3.7% of nontoxin transcripts while 11 SNPs were found over 12.0% of toxin cluster sequences. Of the SNPs identified in nontoxin transcripts, 26.4% resulted in alterations to the amino-acid sequence, while 45.5% of the SNPs identified in toxin transcripts resulted in a nonsynonymous mutation. Overdominant selection favors polymorphism and increases genetic variability. Synonymous SNPs are often predicted to be invisible to the sieve of selection, and elevated nonsynonymous rates at polymorphic toxin loci suggest that the diversity of toxin genes can at least, in part, be explained by overdominant selection. Our SNP calling approach only identified polymorphic sites as SNPs if the nucleotide occurred with a frequency ranging from 40–60%. Therefore, sequences possessing SNPs were considered putative heterozygous loci, and we compared the frequency of heterozygous loci between classes to a χ2 distribution with one degree of freedom. Toxin and nontoxin sequences accounted for 5.6% and 94.4% of the total annotated transcripts, respectively. The frequency of heterozygous toxin loci was significantly greater than the frequency of heterozygous nontoxin sequences ( χ2 = 6.383,d f = 1,p = 0.0115). Toxin transcripts also possessed a much higher SNP density than nontoxins. SNP density was calculated as the number of SNPs per 1,000 bases (kb) in the coding regions of toxin and nontoxin transcripts containing SNPs . Nontoxin transcripts had a SNP density of 0.96 SNPs/kb while toxins had a SNP density of 1.93 SNPs/kb.
Sequences in multitranscript clusters appear to represent different allelic states of heterozygous loci, potentially reflecting how heterozygote advantage can favor gene duplication. If duplicate genes confer an advantage by providing more of a specific gene product, duplication events involving genes that are already highly expressed would be most beneficial . While duplication events of highly expressed toxin transcripts within the snake venom gland have been rampant throughout the evolutionary history of venomous species (e.g., the formation of large gene families), this does not explain the increased heterozygosity detected in our analyses. Gene duplication may also produce a selective advantage when overdominance occurs. In this scenario, carrying different alleles at a single locus is beneficial and gene duplication can ultimately result in the fixation of multiple advantageous alleles. Heterozygosity occurred with a significantly higher frequency in toxin than in nontoxin sequences with nearly half of the polymorphic sites in toxins resulting in a nonsynonymous mutation, suggesting that heterozygote advantage may play a key role in driving gene duplication and allelic variation within toxin genes  (but see ).
Expression biases and sequence divergence following speciation
The maximum-likelihood phylogeny for 15 unique M. fulvius 3FTx cluster members was estimated under the HKY+G model (Figure 3A). Expression levels were extremely biased toward a handful of sequences; four transcripts accounted for more than 67.0% of the toxin reads mapping to 3FTxs. These high-abundance transcripts (3FTx-1a, 2a, 3a, and 4a) are all members of multitranscript clusters, and duplication events and the number of alleles appear to be positively correlated with expression within the 3FTx toxin gene family. The maximum-likelihood phylogeny for 15 M. fulvius 3FTx clusters and 23 3FTx transcripts from M. altirostris and M. corallinus was estimated under the HKY+G model (Figure 3B). Two of the four most highly expressed M. fulvius 3FTx paralogs are in well-supported clades con- taining orthologs from both M. altirostris and M. corallinus. 3FTx-13, while accounting for approximately 2.0% of the 3FTx reads in M. fulvius, is in a well supported clade containing ten M. altirostris sequences indicating that orthologs to 3FTx-13 have undergone duplication events in M. altirostris. M. altirostris sequences were identified in a venom-gland transcriptome analysis where the toxin family accounted for 52.8% of the 867 expressed sequence tags generated , demonstrating that these transcripts are highly expressed. The apparent dominance of the 3FTx toxin family in M. altirostris by these transcripts, and the extremely low expression of 3FTx-13 in M. fulvius could reflect dietary differences between the species, as shifts in prey consumption have been known to be coupled with expression modifications that augment venom efficacy . Genes that are no longer effective are lost or silenced while effective genes are highly expressed and diversify, suggesting functional divergence among toxins occurs following speciation events .
A maximum-likelihood phylogeny for 31 unique M. fulvius PLA2 clusters was estimated under the SYM+G model (Figure 4A). Expression levels for PLA2 transcripts were much more evenly distributed across the entire gene family in comparison to 3FTx sequences. The most highly expressed 3FTx transcript, 3FTx-2a, accounted for 22.7% of the total 3FTx reads while four transcripts accounted for approximately 67.0% of the 3FTx reads. In the PLA2 gene family, PLA2-2a was the most abundant toxin family member, accounting for 8.3% of the total PLA2 reads and the four most highly expressed transcripts accounted for only 30.8% of the reads mapping to PLA2s. Although these two families dominated toxin transcript expression levels in the M. fulvius venom gland, PLA2s did so through uniform expression while 3FTx expression patterns were tremendously biased. PLA2 transcripts also demonstrated less sequence divergence among clusters relative to 3FTx sequences. PLA2s are esterolytic enzymes that share a conserved three-dimensional structure [4, 20], and the greater similarity between PLA2 transcripts may be a result of more stringent conformational constraints. The conservation of crucial structures in PLA2 enzymes ensures a functioning active site whereas the relatively short 3FTx peptides may be free of this limitation. Lynch  found that functionally critical sites were under strong purifying selection in PLA2s, with strong directional selection being restricted to surface residues due to their interactions with specific targets in prey, enabling prey-specific adaptation while ensuring the functionality of the enzyme. The maximum-likelihood phylogeny for 31 M. fulvius PLA2 clusters, three M. altirostris sequences, and a single M. corallinus sequence was estimated under the SYM+G model (Figure 4B). Functional divergence among PLA2s may also occur following speciation events , as all three M. altirostris sequences constitute a monophyletic clade and are sister to PLA2-21, a transcript that accounts for <1% of PLA2 reads in M. fulvius (although the pattern is not as strong as in 3FTxs as the PLA2-21/ M. altirostris clade is not well-supported).
Rampant gene duplication following the initial recruitment of toxin genes into the venom gland has led to the production of multigene toxin families , including the highly expressed 3FTx and PLA2 gene families present in the M. fulvius transcriptome. Gene duplication can be advantageous by increasing the production of a beneficial gene product or permanently fixing multiple advantageous alleles. Gene duplication can be followed by elevated rates of selection , enabling genes to acquire new functions or divide single functions among several genes . Phylogenetic analyses of the 3FTx and PLA2 toxin classes (Figures 3 and 4) demonstrate the pervasiveness of gene duplication throughout the evolutionary history of these toxin families, and suggest that the divergence of toxin genes occurs following speciation. Zhang  stated that species-specific duplication events can result in species-specific gene function and, subsequently, adaptation. This adaptation can result in divergence and the development of unique phenotypes , which may be reflected in our phylogenetic analyses of multiple Micrurus species.
Intragenomic evolution of venom genes
Positive selection in toxin genes has been repeatedly documented, reflecting the significance of venom to the fitness of venomous species [2–4]. Selection analyses have taken a somewhat unsystematic approach, analyzing all sequences available with little regard for the evolutionary histories of the taxa involved [4, 40]. Here, we examine the intragenomic evolution of toxin classes within the M. fulvius genome. The codeml results for detecting the presence of positive selection are summarized in Table 4. The results from the M7 and M8 models are nearly identical to the results from the M1 and M2 models and are therefore not shown. We examined 3FTxs, KUNs, LCNs, and PLA2s for the presence of positive selection as these represent the four most diverse and four of the five most abundant toxin classes identified. While the fourth most highly expressed toxin class was SVMPs, this class was represented by a single transcript and thus could not be included in our analyses.
The M0 model estimates a single ω value over all branches in the phylogeny and therefore is only capable of detecting evidence of selection when the majority of sites have experienced positive selection throughout their evolutionary histories . Under the M0 model, we detected strong evidence of positive selection ( 1.48≤ω≤999.00) for the 3FTx, LCN, and PLA2 toxin families. Evidence of positive selection was not detected in the KUN toxin family ( ω=0.58), and the extraordinarily high ω calculated for the LCN family ( ω=999.00) was a result of all polymorphic sites resulting in nonsynonymous substitutions. We also used site specific models (M1, M2, M7, and M8) to detect rate variation among sites [2, 4]. The 3FTx, LCN, and PLA2 toxin classes rejected the null or nearly neutral model (M1) in favor of the selection model (M2) ( 0.00≤p≤4.49×10−5), demonstrating the pervasiveness of positive selection experienced by highly expressed toxin transcripts within the M. fulvius genome. The percentage of sites experiencing selection ranged from 16–56% with 3.79≤ω≤999.00. The KUN toxin class did not reject the M1 model in favor of the selection model ( p=0.0539). The high ω value under the site specific models for the LCN class again reflects the absence of synonymous substitutions within the class. Elevated rates of evolution have previously been documented in toxin genes [2, 4, 41, 42], and this accelerated evolution is most likely due to their direct involvement in fitness [2, 4] and may be reflective of a coevolutionary arms race with specific prey taxa [6, 7].
Sequence accession numbers
The original, unmerged reads were submitted to the National Center for Biotechnology Information (NCBI) Sequence Read Archive under accession number SRA062772. The annotated sequences were submitted to the GenBank Transcriptome Shotgun Assembly database under accession numbers GAEP01000001–GAEP01001950 (nontoxins) and GAEP01001951–GAEP01002066 (toxins).
We have described the most comprehensive transcriptomic characterization of an elapid venom gland to date, revealing venom complexity previously unknown from any New World coral snake [11, 13, 15, 22, 43–45]. Transcriptional effort expended on toxins relative to nontoxins may differ between venoms dominated by high-molecular weight components and venoms dominated by smaller proteins. This reduction in the machinery required for the production of functional toxins may confer a metabolic advantage to species expressing smaller peptides and enzymes, but may also reduce the capacity of species to evolve effective counterdefenses to resistant prey or predators. Toxin expression was dominated by PLA2s and 3FTxs, yet these two gene families greatly differed in expression patterns. Expression within the 3FTx family was extremely biased, being dominated by a handful of transcripts while PLA2 expression was much more uniform. SNP analysis revealed the frequency of heterozygous loci was significantly higher in toxins than in nontoxins with nearly half of the polymorphic sites in toxins resulting in a nonsynonymous substitution, suggesting overdominance may ultimately favor gene duplication and permanent fixation of advantageous alleles within the venom gland. We detected evidence of positive selection in three of the four most diverse and highly expressed toxin classes; sequence evolution or modifications of toxin expression patterns could increase the specificity of venoms for frequently envenomed prey items . Diet has been proposed to be an important selective regime in determining venom composition within Micrurus species , and elevated rates of selection suggest a coevolutionary arms race .
Venom-gland transcriptome sequencing
We followed the approach described in Rokyta et al.  for the preparation and sequencing of the venom gland. We sequenced the venom-gland transcriptome of an adult female M. fulvius from Wakulla County, FL, with a snout-to-vent length of 620 mm and a total length of 685 mm. The snake was anesthetized by propofol injection (10 mg/kg), and venom was extracted by electrostimulation . The animal was allowed to recover for four days for transcription to be maximized , at which point the snake was euthanized by sodium pentobarbitol injection and its venom glands removed . This procedure was approved by the Florida State University Institutional Animal Care and Use Committee under protocol #0924. Sequencing and nonnormalized cDNA library preparation was performed by the HudsonAlpha Institute for Biotechnology Genomic Services Laboratory (http://www.hudsonalpha.org/gsl/). Total RNA was reduced to poly-A + RNA with oligo-dT beads. Two rounds of poly-A + selection were performed. The mRNA then underwent a mild heat fragmentation followed by random priming for first-strand synthesis. Standard second-strand synthesis was followed by library preparation with the double-stranded cDNA as input material. Sequencing was performed in a single lane on the Illumina HiSeq 2000 with 100-base-pair, paired-end reads.
Transcriptome assembly and analysis
We followed the general approach described in Rokyta et al.  for the de novo assembly of the M. fulvius venom-gland transcriptome. Most of our read pairs overlapped at the 3’ end and these reads were subsequently merged to produce longer, composite reads [16, 17]. Quality scores were updated accordingly and only these merged reads were used in the assembly process. The Extender program  was used as a de novo assembler of 1,000 random reads to eliminate extremely high-abundance transcripts. Full-length transcripts were identified with blastx searches and annotated. These sequences were then used as a template to filter a set of the unassembled, original merged reads using NGen 3.1 and a minimum match percentage of 98%. This de novo approach using the Extender program was repeated a second time as described above, using 1,000 of the filtered reads as the program continued to be productive at assembling full-length, unique transcripts. We next identified nontoxin transcripts by aligning 10 million of the unassembled reads to nontoxin transcripts previously identified in the venom-gland transcriptome of C. adamanteus using the NGen assembler. We performed an initial alignment with a minimum match percentage of 93%. We then performed a second alignment with a reduced minimum match percentage of 90% to assemble more divergent transcripts. All other parameter values were consistent with our de novo assemblies in NGen. Coding regions with minimum ten-fold coverage were annotated, and all regions below this threshold were removed. These transcripts were combined with the Extender results and this unique set of sequences was then used as a template to filter a set of the merged reads using NGen 3.1 and a minimum match percentage of 98%. Next, 10 million filtered reads were used in a de novo assembly with NGen 3.1 with a minimum match percentage of 93%. Contigs with a minimum of 200 reads were identified with blastx searches, annotated, and duplicates removed, producing a unique set of identified transcripts. This process of filtering, NGen assembly, and annotating was performed an additional two times. Abundances were estimated by mapping 10 million merged reads to full-length sequences with a minimum match percentage of 95% using a reference-based assembly in NGen . The percentage of reads mapping to an individual transcript was used to estimate abundance. Toxin transcripts were clustered based on <1% nucleotide divergence for abundance estimates.
Ten million merged reads were aligned to the 2,025 annotated transcripts in a reference-based assembly in NGen with a minimum match percentage of 95%. SNPs were identified by using the SeqMan module of the DNAStar Lasergene software suite. Toxin transcripts were clustered based on <1% nucleotide divergence as SNP detection provided an approach to identify allelic variation within toxin clusters. SNPs were only considered if they occurred in the coding sequence of full-length, annotated transcripts, had a SNP% ranging from 40–60%, and at least 20-fold coverage with maximum coverage bounded at 20,000-fold. These parameters are more stringent than previous SNP identification approaches  and provide a conservative estimation of variable sites in the coding regions of annotated transcripts. The frequency of toxin versus nontoxin heterozygous loci, relative to the number of transcripts belonging to each class identified in the transcriptome, was compared to a χ2 distribution with one degree of freedom.
Sequences of 3FTxs and PLA2s were independently aligned on the basis of the amino-acid sequence in the MegAlign module of the DNAStar Lasergene software suite with ClustalW . Model selection was performed using the Akaike Information Criterion values with MrModelTest2.3 , and maximum-likelihood phylogenies were estimated using PAUP* version 4.0a126  and the iterative search strategy described by Rokyta et al. . Nodal support was determined in MrBayes v3.1.2 by the estimation of Bayesian posterior probabilities [53, 54]. Markov Chain Monte Carlo searches were run for 10 million generations with four chains and a temperature parameter of 0.20. Samples were taken every 1,000 generations and the first million generations were discarded as burn-in. A related transcript from the mangrove snake (Boiga dendrophila: Colubridae) served as the outgroup to root each 3FTx phylogeny while a transcript from the eastern diamondback rattlesnake (C. adamanteus) served as the outgroup to root each PLA2 phylogeny. Orthologous transcripts of both toxin families from congeners were downloaded from the NCBI database and included in a second analysis following the method described above.
Transcripts from the four major toxin families (3FTx, KUN, LCN, and PLA2) were used in selection analyses. Sequences were independently aligned according to class on the basis of the amino-acid sequence in the MegAlign module of the DNAStar Lasergene software suite with ClustalW . Gaps, stop codons, and signal peptides were excluded from all analyses, and only transcripts possessing signal peptides were included. Signal peptides mediate the targeting and transporting of the pre-protein and are cleaved prior to expression . Their exclusion from our analyses ensured that only the mature amino-acid sequences of expressed toxins that are targets of selection were examined. Model selection was performed using the Akaike Information Criterion values with MrModelTest2.3 . A maximum likelihood phylogeny was constructed using PAUP* version 4.0a126  and the iterative search strategy previously described by Rokyta et al. .
A likelihood-ratio test for positive selection was conducted with codeml from the PAML version 4.4 package [56, 57] with the maximum-likelihood phylogeny estimated as described above . The null model, or the nearly neutral model (M1), allows for a class of sites to be evolving under neutral selection (dN/dS=1) while constraining the dN/dS for a second class to be <1. The alternative model, or the positive selection model (M2), incorporates an additional class that allows for a proportion of codon sites to be experiencing positive selection (dN/dS >1). To test for the presence of positive selection, negative twice the difference in log likelihoods were compared between models to a χ2 distribution with two degrees of freedom. A similar test was also performed to verify the results of the initial analysis, comparing models M7 (Beta) and M8 (Beta with positive selection) to a χ2 distribution with two degrees of freedom . To estimate an overall dN/dS, the M0 model was used. This model averages the dN/dS across the entire phylogeny, producing a single ratio for all sites. While this model has been shown to have limited power at detecting positive selection , it provides a broader perspective and a more conservative estimate of selection within the M. fulvius genome than the site-specific models described above.
Cysteine-rich with EGF-like domain
Kunitz-type protease inhibitor
L amino-acid oxidase
Nerve growth factor
Reverse-phase high-performance liquid chromatography
Single nucleotide polymorphism
Snake venom metalloproteinase.
Jansa SA, Voss RS: Adaptive evolution of the venom-targeted vWF protein in opossums that eat pitvipers. PLoS ONE. 2011, 6 (6): e20997-10.1371/journal.pone.0020997.
Rokyta DR, Wray KP, Lemmon AR, Lemmon EM, Caudle SB: A high-throughput venom-gland transcriptome for the eastern diamondback rattlesnake (Crotalus adamanteus) and evidence for pervasive positive selection across toxin classes. Toxicon. 2011, 57: 657-671. 10.1016/j.toxicon.2011.01.008.
Pahari S, Mackessy SP, Kini RM: The venom gland transcriptome of the desert massasauga rattlesnake (Sistrurus catenatus edwardsii): towards an understanding of venom composition among advanced snake (superfamily Colubroidea). BMC Mol Biol. 2007, 8: 115-10.1186/1471-2199-8-115.
Lynch VJ: Inventing an arsenal: adaptive evolution and neofunctionalization of snake venom phospholipase A2 genes. BMC Evol Biol. 2007, 7: 2-10.1186/1471-2148-7-2.
Casewell NR, Harrison RA, Wüster W, Wagstaff SC: Comparative venom gland transcriptome surveys of the saw-scaled vipers (Viperidae: Echis) reveal substantial intra-family gene diversity and novel venom transcripts. BMC Genomics. 2009, 10: 564-10.1186/1471-2164-10-564.
Biardi JE, Coss RG, Smith DG: California ground squirrel (Spermophilus beecheyi) blood sera inhibits crotalid venom proteolytic activity. Toxicon. 2000, 38: 713-721. 10.1016/S0041-0101(99)00179-8.
Biardi JE, Coss RG: Rock squirrel (Spermophilus variegatus) blood sera affects proteolytic and hemolytic activities of rattlesnake venoms. Toxicon. 2011, 57: 323-331. 10.1016/j.toxicon.2010.12.011.
Renjifo C, Smith EN, Hodgson WC, Renjifo JM, Sanchez A, Acosta R, Maldonado JH, Riveros A: Neuromuscular activity of the venoms of the Colombian coral snakes Micrurus dissoleucus and Micrurus mipartitus: an evolutionary perspective. Toxicon. 2012, 59: 132-142. 10.1016/j.toxicon.2011.10.017.
Tennant A, Bartlett RD: Snakes of North America : Eastern and Central Regions. 2000, Houston: Gulf Publishing Company
Norris RL, Pfalzgraf RR, Laing G: Death following coral snake bite in the United States - First documented case (with ELISA confirmation of envenomation) in over 40 years. Toxicon. 2009, 53 (6): 693-697. 10.1016/j.toxicon.2009.01.032.
Tanaka GD, Furtado MdFD, Portaro FCV, Sant’Anna OA, Tambourgi DV: Diversity of Micrurus snake species related to their venom toxic effects and the prospective of antivenin neutralization. PLoS Negl Trop Dis. 2010, 4 (3): e622-10.1371/journal.pntd.0000622.
Bohlen CJ, Chesler AT, Sharif-Naeini R, Medzihradszky KF, Zhou S, King D, Sãnchez EE, Burlingame AL, Basbaum AI, Julius D: A heteromeric Texas coral snake toxin targets acid-sensing ion channels to produce pain. Nature. 2011, 479 (7373): 410-414. 10.1038/nature10607.
Ciscotto PHC, Rates B, Silva DAF, Richardson M, Silva LP, Andrade H, Donato MF, Cotta GA, Maria WS, Rodrigues RJ, Sanchez E, De Lima ME, Pimenta AMC: Venomic analysis and evaluation of antivenom cross-reactivity of South American Micrurus species. J Proteomics. 2011, 74: 1810-1825. 10.1016/j.jprot.2011.07.011.
Dokmetjian JC, del Canto S, Vinzõn S, Biscoglio de Jime~nez Bonino MJ: Biochemical characterization of the Micrurus pyrrhocryptus venom. Toxicon. 2009, 53: 375-382. 10.1016/j.toxicon.2008.12.015.
Leão LI, Ho PL, Junqueira-de-Azevedo ILM: Transcriptomic basis for an antiserum against Micrurus corallinus (coral snake) venom. BMC Genomics. 2009, 10: 112-10.1186/1471-2164-10-112.
Rokyta DR, Lemmon AR, Margres MJ, Aronow K: The venom-gland transcriptome of the eastern diamondback rattlesnake (Crotalus adamanteus). BMC Genomics. 2012, 13: 312-10.1186/1471-2164-13-312.
Rodrigue S, Materna A, Timberlake S, Blackburn M, Malmstrom R, Alm E, Chisholm S: Unlocking short read sequencing for metagenomics. PLoS One. 2010, 5: e11840-10.1371/journal.pone.0011840.
Alberch P: From genes to phenotype: dynamical systems and evolvability. Genetics. 1991, 84: 5-11.
Barlow A, Pook CE, Harrison RA, Wüster W: Coevolution of diet and prey-specific venom activity supports the role of selection in snake venom evolution. Proc R Soc B. 2009, 276: 2443-2449. 10.1098/rspb.2009.0048.
Doley R, Zhou X, Kini RM: Snake venom phospholipase A2 enzymes. Handbook of Venoms and Toxins of Reptiles. Edited by: Mackessy SP. 2010, Boca Raton: CRC Press, 173-205.
Hegde RP, Rajagopalan N, Doley R, Kini RM: Snake venom three-finger toxins. Handbook of Venoms and Toxins of Reptiles. Edited by: Mackessy S P., Mackessy S P. 2010, Boca Raton: CRC Press, 287-301.
Corrêa-Netto C, Junqueira-de Azevedo IdL, Silva D, Ho P, Leitão-de Araújo M, Alves M, Sanz L, Foguel D, Zingali R, Calvete J: Snake venomics and venom gland transcriptomic analysis of Brazilian coral snakes, Micrurus altirostris and M. corallinus. J Proteomics. 2011, 74: 1795-1809. 10.1016/j.jprot.2011.04.003.
Fox JW, Serrano SMT: Snake venom metalloproteinases. Handbook of Venoms and Toxins of Reptiles. Edited by: Mackessy SP. 2010, Boca Raton: CRC Press, 95-113.
Mackessy SP: Evolutionary trends in venom composition in the western rattlesnakes (Crotalus viridis sensu lato): toxicity vs. tenderizers. Toxicon. 2010, 55: 1463-1474. 10.1016/j.toxicon.2010.02.028.
St Pierre L, Earl ST, Filippovich I, Sorokina N, Masci PP, de Jersey J, Lavin MF: Common evolution of waprin and kunitz-like toxin families in Australian venomous snakes. Cell Mol Life Sci. 2008, 65: 4039-4054. 10.1007/s00018-008-8573-5.
Doley R, Tram NN, Reza MA, Kini RM: Unusual accelerated rate of deletions and insertions in toxin genes in the venom glands of the pygmy copperhead (Austrelaps labialis) from Kangaroo island. BMC Evol Biol. 2008, 8: 70-10.1186/1471-2148-8-70.
Du XY, Clemetson KJ: Reptile C-type lectins. Handbook of Venoms and Toxins of Reptiles. Edited by: Mackessy SP. 2010, Boca Raton: CRC Press, 359-375.
Smith KM, Gaultier A, Cousin H, Alfandari D, White JM, DeSimone DW: The cysteine-rich domain regulates ADAM protease function ‘in vivo’. J Cell Biol. 2002, 159: 893-902. 10.1083/jcb.200206023.
Kemparaju K, Girish KS, Nagaraju S: Hyaluronidases, a neglected class of glycosidases from snake venom, beyond a spreading factor. Handbook of Venoms and Toxins of Reptiles. Edited by: Mackessy SP. 2010, Boca Raton: CRC Press, 237-258.
Tan NH, Fung SY: Snake venom L-amino acid oxidases. Handbook of Venoms and Toxins of Reptiles. Edited by: Mackessy SP. 2010, Boca Raton: CRC Press, 221-235.
Lavin MF, Earl S, Birrel G, St Pierre L, Guddat L, de Jersey J, Masci P: Snake venom nerve growth factors. Handbook of Venoms and Toxins of Reptiles. Edited by: Mackessy SP. 2010, Boca Raton: CRC Press, 377-391.
Dhananjaya BL, Vishwanath BS, D’Souza CJM: Snake venom nucleases, nucleotidases, and phosphomonoesterases. Handbook of Venoms and Toxins of Reptiles. Edited by: Mackessy SP. 2010, Boca Raton: CRC Press, 155-171.
Bernheimer AW, Linder R, Weinstein SA, Kim KS: Isolation and characterization of a phospholipase B from venom of collett’s snake, Pseudechis colletti. Toxicon. 1987, 25 (5): 547-554. 10.1016/0041-0101(87)90290-X.
Yamazaki Y, Matsunaga Y, Tokunaga Y, Obayashi S, Saito M, Morita T: Snake venom vascular endothelial growth factors (VEGF-Fs) exclusively vary their structures and functions among species. J BiolChem. 2009, 284 (15): 9885-9891.
Chu JH, Lin RC, Yeh CF, Hsu YC, Li SH: Characterization of the transcriptome of an ecologically important avian species, the vinous-throated parrotbill Paradoxornis webbianus bulomachus (Paradoxornithidae; Aves). BMC Genomics. 2012, 13: 149-10.1186/1471-2164-13-149.
Zhang J: Evolution by gene duplication: an update. Trends Ecol Evol. 2003, 18 (6): 292-298. 10.1016/S0169-5347(03)00033-8.
Otto SP, Yong P: The evolution of gene duplicates. Adv Genet. 2002, 46: 451-483.
Qian W, Zhang J: Gene dosage and gene duplicability. Genetics. 2008, 179: 2319-2324. 10.1534/genetics.108.090936.
Bielawski JP, Yang Z: Maximum likelihood methods for detecting adaptive evolution after gene duplication. J Struct Funct Genomics. 2003, 3 (1–4): 201-212.
Fry BG, Wüster W, Kini RM, Brusic V, Khan A, Venkataraman D, Rooney AP: Molecular evolution and phylogeny of elapid snake venom three-finger toxins. J Mol Evol. 2003, 57: 110-129. 10.1007/s00239-003-2461-2.
Gibbs HL, Rossiter W: Rapid evolution by positive selection and gene gain and loss: PLA2 venom genes in closely related Sistrurus rattlesnakes with divergent diets. J Mol Evol. 2008, 66: 151-166. 10.1007/s00239-008-9067-7.
Kordiš D, Gubenšek F: Adaptive evolution of animal toxin multigene families. Gene. 2000, 261: 43-52. 10.1016/S0378-1119(00)00490-X.
Cecchini AL, Marcussi S, Silveira LB, Borja-Oliveira CR, Rodrigues-Simioni L, Amara S, Stábeli RG, Giglio JR, Arantes EC, Soares AM: Biological and enzymatic activities of Micrurus sp. (coral) snake venoms. Comp Biochem Physiol. 2005, Part A (140): 125-134.
Fernandez J, Alape-Giron A, Angulo Y, Sanz L, Gutierrez J, Calvete J, Lomonte B: Venomic and antivenomic analyses of the Central American Coral Snake, Micrurus nigrocinctus (Elapidae). J Proteome Res. 2011, 10: 1816-1827. 10.1021/pr101091a.
Rey-Suárez P, Núñez V, Gutiérrez J, Lomonte B: Proteomic and biological characterization of the venom of the redtail coral snake, Micrurus mipartitus (Elapidae), from Colombia and Costa Rica. J Proteomics. 2011, 75 (2): 655-667. 10.1016/j.jprot.2011.09.003.
da Silva Jr NJ, Aird SD: Prey specificity, comparative lethality and compositional differences in coral snake venoms. Comp Biochem Physiol. 2001, 128: 425-456. 10.1016/S1096-4959(00)00347-X.
McCleary RJR, Heard DJ: Venom extraction from anesthetized Florida cottonmouths, Agkistrodon piscivorus conanti, using a portable nerve stimulator. Toxicon. 2010, 55: 250-255. 10.1016/j.toxicon.2009.07.030.
Rotenberg D, Bamberger ES, Kochva E: Studies on ribonucleic acid synthesis in the venom glands of Vipera palaestinae (Ophidia, Reptilia). Biochem J. 1971, 121: 609-612.
Thompson J, Higgins D, Gibson T, CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 1994, 22: 4673-4680. 10.1093/nar/22.22.4673.
Nylander JAA: MrModeltest v2. Tech. rep., Evolutionary Biology Centre, Uppsala University. 2004
Swofford DL: Phylogenetic Analysis Using Parsimony* (PAUP*), Volume Version 4.0. Sunderland: Sinauer Associates
Rokyta D, Burch C, Caudle S, Wichman H: Horizontal gene transfer and the evolution of microvirid coliphage genomes. J Bacteriol. 2006, 188: 1134-1142. 10.1128/JB.188.3.1134-1142.2006.
Huelsenbeck JP, Ronquist F: MRBAYES: Bayesian inference of phylogeny. Bioinformatics. 2001, 17: 754-755. 10.1093/bioinformatics/17.8.754.
Ronquist F, Huelsenbeck JP: MRBAYES 3: Bayesian phylogenetic inference under mixed models. Bioinformatics. 2003, 19: 1572-1574. 10.1093/bioinformatics/btg180.
Choo KH, Ranganathan S, Tan TW: A comprehensive assessment of N-terminal signal peptides prediction methods. Nucleic Acids Res., Volume 10, Asia Pacific Bioinformatics Network (APBioNet) Eighth International Conference on Bioinformatics (InCoB2009). 2009, Singapore BMC Bioinformatics
Yang Z: PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 1997, 13: 555-556.
Yang Z, Nielsen R, Goldman N, Pedersen AMK: Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000, 155: 431-449.
Yang Z, Swandon WJ: Codon-substitution models to detect adaptive evolution that acount for heterogeneous selective pressures among site classes. Mol Biol Evol. 2002, 19: 49-57. 10.1093/oxfordjournals.molbev.a003981.
Crandall KA, Kelsey CR, Imamichi H, Lane HC, Salzman NP: Parallel evolution of drug resistance in HIV: failure of non-synonymous/synonymous substitution rate ratio to detect selection. Mol Biol Evol. 1999, 16: 372-382. 10.1093/oxfordjournals.molbev.a026118.
The authors thank Kenneth P. Wray and S. Brian Caudle for assistance in the laboratory. Computational resources were provided by the Florida State University High-Performance Computing cluster and funding for this work was provided to DRR by the National Science Foundation (DEB 1145987).
The authors declare that they have no competing interests.
The project was conceived and planned by DRR. MJM, KA, JL, and DRR collected and analyzed the data. MJM wrote the manuscript. All authors read and approved the final manuscript.