Single nucleotide polymorphism discovery in cutthroat trout subspecies using genome reduction, barcoding, and 454 pyro-sequencing

Background Salmonids are popular sport fishes, and as such have been subjected to widespread stocking throughout western North America. Historically, stocking was done with little regard for genetic variation among populations and has resulted in genetic mixing among species and subspecies in many areas, thus putting the genetic integrity of native salmonid populations at risk and creating a need to assess the genetic constitution of native salmonid populations. Cutthroat trout is a salmonid species with pronounced geographic structure (there are 10 extant subspecies) and a recent history of hybridization with introduced rainbow trout in many populations. Genetic admixture has also occurred among cutthroat trout subspecies in areas where introductions have brought two or more subspecies into contact. Consequently, management agencies have increased their efforts to evaluate the genetic composition of cutthroat trout populations to identify populations that remain uncompromised and manage them accordingly, but additional genetic markers are needed to do so effectively. Here we used genome reduction, MID-barcoding, and 454-pyrosequencing to discover single nucleotide polymorphisms that differentiate cutthroat trout subspecies and can be used as a rapid, cost-effective method to characterize the genetic composition of cutthroat trout populations. Results Thirty cutthroat and six rainbow trout individuals were subjected to genome reduction and next-generation sequencing. A total of 1,499,670 reads averaging 379 base pairs in length were generated by 454-pyrosequencing, resulting in 569,060,077 total base pairs sequenced. A total of 43,558 putative SNPs were identified, and of those, 125 SNP primers were developed that successfully amplified 96 cutthroat trout and rainbow trout individuals. These SNP loci were able to differentiate most cutthroat trout subspecies using distance methods and Structure analyses. 
Conclusions Genomic and bioinformatic protocols were successfully implemented to identify 125 nuclear SNPs that are capable of differentiating most subspecies of cutthroat trout from one another. The ability to use this suite of SNPs to identify individuals of unknown genetic background to subspecies can be a valuable tool for management agencies in their efforts to evaluate the genetic structure of cutthroat trout populations prior to constructing and implementing conservation plans.


Background
Single nucleotide polymorphisms (SNPs) are powerful genetic markers that are increasingly being used in phylogenetic and population genetic studies [1,2]. Advances in high-throughput sequencing technologies have made SNP identification faster and cheaper than traditional methods that utilize Sanger sequencing (see Metzker [3] for a review of next-generation techniques). The relative ease by which SNPs can now be identified makes SNP discovery more attainable [2,4]. Indeed, many researchers have recently used next generation sequencing to detect SNPs in a variety of organisms [5][6][7][8][9]. SNP discovery in salmonid fishes has garnered much attention in recent years [10][11][12][13][14][15][16][17][18][19], but there is a growing need for additional SNP discovery in certain groups of salmonids.
Cutthroat trout, Oncorhynchus clarkii, a native western North American salmonid, has ten extant subspecies (and two extinct subspecies). The species appears to be monophyletic [20], and diversification among subspecies is postulated to have begun approximately two million years ago [20,21]. The extant subspecies are: Bonneville cutthroat trout (O. c. utah) in the Bonneville Basin; Coastal cutthroat trout (O. c. clarkii) in coastal drainages from Alaska to northern California; Colorado River cutthroat trout (O. c. pleuriticus) in the upper Colorado River basin; Greenback cutthroat trout (O. c. stomias) in the Arkansas and the South Platte river basins in eastern Colorado (federally listed as threatened); Lahontan cutthroat trout (O. c. henshawi) in the western Lahontan Basin of Nevada and several closed basins in Oregon (federally listed as threatened); Humboldt cutthroat trout (O. c. humboldtensis) in the Humboldt River in the eastern portion of the Lahontan Basin, recently designated as a separate subspecies from the Lahontan cutthroat trout [22]; Paiute cutthroat trout (O. c. seleniris) in Silver Creek on the eastern slope of the Sierra Nevada (federally listed as threatened); Rio Grande cutthroat trout (O. c. virginalis) in tributaries to the Rio Grande in southern Colorado and New Mexico; Westslope cutthroat trout (O. c. lewisi) in drainages of the Rocky Mountains in Alberta, British Columbia, northern Idaho, and Montana, with disjunct populations in Oregon and Washington; and Yellowstone cutthroat trout (O. c. bouvieri) in the Yellowstone River, Yellowstone Lake, and the upper Snake River drainages of Wyoming, Idaho, and Montana.
The Snake River fine spotted cutthroat trout (in the upper Snake River) was designated as a separate subspecies, O. c. behnkei [23]. The only documented differences between it and sympatric Yellowstone cutthroat trout are the spotting pattern and behavior, whereas other morphological and meristic characters are the same [23,24]. Genetic analyses have not revealed differences between Snake River fine spotted and Yellowstone cutthroat trout [21,25,26]. Unfortunately, Montgomery [23] did not designate a type specimen when he named O. c. behnkei, thus technically rendering the subspecies designation invalid. Therefore, we treat the Yellowstone cutthroat trout and the Snake River fine spotted cutthroat trout as O. c. bouvieri. Similarly, while Paiute cutthroat trout have fewer spots than Lahontan cutthroat trout (many have no spots), they do not differ from Lahontan cutthroat trout in any other morphological or meristic characters that have been examined [24], nor do the two subspecies appear to be genetically distinct based on electrophoretic data [27]. Recently, Finger et al. [28] characterized six SNPs that differentiate rainbow trout from Lahontan and Paiute cutthroat trout, but the two cutthroat trout subspecies were identical at all six SNP loci. Additionally, Paiute cutthroat trout and Lahontan cutthroat trout carried identical haplotypes of the second subunit of the NADH dehydrogenase-ubiquinone oxidoreductase enzyme complex I (ND2) of the mitochondrial genome [21]. Hence, Paiute cutthroat trout and Lahontan cutthroat trout are likely to be very similar genetically.
The Bonneville cutthroat trout includes a morphologically and ecologically unique lineage of cutthroat trout in the Bear River drainage [24,29]. Based on allozyme [30,31] and mtDNA [21,26,32] data, the Bear River strain of Bonneville cutthroat trout (hereafter referred to as Bear River cutthroat trout for simplicity) is genetically more closely related to Yellowstone cutthroat trout than to the Bonneville cutthroat trout in the main Bonneville Basin [26,30,31,32], although an associated taxonomic revision has not yet been made. The sister relationship between Yellowstone and Bear River cutthroat trout makes biogeographic sense because the Bear River was part of the upper Snake River drainage until the late Pleistocene, at which time the Bear River was redirected into the Bonneville Basin (~35 Ka) [33][34][35]. Contemporary gene flow between the Bear River and other drainages within the Bonneville Basin is prevented by the Great Salt Lake. While the cutthroat trout in the Bear River system are currently classified as Bonneville cutthroat trout, herein we treat them as an additional lineage within the species O. clarkii.
Despite these unique lineages, cutthroat trout were stocked within and among major drainages with little concern for genetic variability among subspecies [24,36]. Introgression among cutthroat trout subspecies has resulted from these stocking practices. Additionally, rainbow trout (Oncorhynchus mykiss) have been stocked extensively throughout western North America. Rainbow trout readily hybridize with cutthroat trout in areas where the two species did not formerly co-occur, posing a serious threat to the genetic integrity of the native cutthroat trout populations [25,37,38,39]. Rainbow trout x cutthroat trout hybrids can be identified reasonably well using morphological and meristic characters; however, in populations with extensive introgressive hybridization this is not always the case, and hybrids between cutthroat trout subspecies are not easily recognized. Because of widespread intra- and interspecific hybridization, management agencies have increased efforts to assess the genetic composition of native cutthroat trout populations. A range of genetic markers has been used to assess introgressive hybridization in cutthroat trout populations, and recently SNPs useful for species identification and detecting introgression between rainbow trout and cutthroat trout have been developed [28,40,41]. Some SNPs differentiate certain subspecies of cutthroat trout from rainbow trout (e.g., rainbow trout vs. westslope cutthroat trout [39,42]; rainbow trout vs. westslope and Yellowstone cutthroat trout [43]; rainbow trout vs. westslope, Yellowstone, coastal, and Lahontan cutthroat trout [44]; rainbow trout vs. westslope, Bonneville, Yellowstone, coastal and Lahontan cutthroat trout [45]). However, it remains unclear whether these SNPs are unique to those cutthroat trout subspecies, or if they are shared with other subspecies not included in those studies.
Here we use next-generation sequencing technologies to identify additional nuclear SNPs that collectively characterize most of the cutthroat trout subspecies. Specifically, we used genome reduction, MID-barcoding, and 454-pyrosequencing in an attempt to discover nuclear SNPs that differentiate nine lineages of cutthroat trout (i.e., Bear River cutthroat, Bonneville cutthroat, coastal cutthroat, Colorado River cutthroat, greenback cutthroat, Lahontan cutthroat, Rio Grande cutthroat, westslope cutthroat, and Yellowstone cutthroat) and rainbow trout (including Columbia redband trout and steelhead). These SNPs were used to develop a SNP assay that can be easily used to evaluate the genetic integrity of cutthroat trout populations across the entire range of the species. The SNP assays are based on KASPar™ genotyping chemistry and were detected using the Fluidigm dynamic array platform.

Methods
A brief overview of our methods is as follows: DNA was extracted from tissue samples, followed by genome reduction and 454 pyro-sequencing. The sequences produced were assembled and scanned for SNPs bioinformatically. SNPs were then genotyped and SNP diversity analyses were performed. Individuals that yielded unexpected results according to a priori subspecies designations were re-sequenced for the ND2 mitochondrial DNA gene using Sanger sequencing for further evaluation.

DNA extraction
Fin clips or muscle tissues were obtained from thirty-six individuals representing nine subspecies of cutthroat trout, plus the Bear River cutthroat trout and rainbow trout (including Columbia redband trout and steelhead, which are unique forms of O. mykiss) from the Monte L. Bean Life Science Museum ichthyological collection (Table 1). The samples were collected by field biologists familiar with cutthroat trout subspecies identifications based on phenotypic characters. While subspecies identifications are typically accurate when made by experts, especially when accounting for geographic distribution, the presence of cryptic hybrids among these samples is possible. We were unable to obtain tissues for Paiute cutthroat trout, but given the lack of distinguishing genetic characteristics between them and Lahontan cutthroat trout, they are likely very similar to Lahontan cutthroat trout. Whole genomic DNA was extracted using Qiagen DNeasy Tissue kits following the manufacturer's recommended protocol. All extracted DNA was quantified using a NanoDrop 1000 Spectrophotometer (NanoDrop Technologies Inc., Montchanin, DE), and each sample was diluted to a concentration of 150 ng/μl using nuclease free water.
Some of the DNA samples that were extracted were not high enough quality to be included in the genomic reduction and 454-pyrosequencing steps (outlined below). In most cases this resulted in simply using one less individual from a subspecies, but was more problematic for Lahontan and Humboldt cutthroat trout where only one individual of each subspecies had DNA of a high enough quality to include. To ensure getting enough reads, those two individuals were pooled, resulting in a Lahontan Basin complex where Lahontan and Humboldt cutthroat were treated as a single unit. Hereafter we refer to both groups as Lahontan Basin cutthroat, but this is merely for convenience, and is not meant to be a statement regarding a taxonomic update.
To verify that the discovered SNPs were able to differentiate subspecies of cutthroat trout from each other and from rainbow trout, DNA was extracted from an additional sixty cutthroat trout individuals representing ten trout lineages. These sixty samples, along with the thirty-six samples that were extracted for 454-pyrosequencing, were used to verify SNPs. Hence, each cutthroat trout lineage was represented by ten individuals, and rainbow trout was represented by six individuals for SNP verification. In an attempt to account for geographic variation within cutthroat trout subspecies, samples were included from populations different from those included on the initial SNP discovery panel when possible (Table 1). Samples were chosen from populations that are believed to be nonadmixed, but this was not verifiable in all cases.

Genomic reduction and 454-pyrosequencing
We followed the genomic reduction methodology described in detail by Maughan et al. [6]. In brief, genomic reduction was carried out using restriction enzymes EcoRI and BfaI to double digest genomic DNA at restriction sites that were conserved across all subspecies, and then attaching EcoRI and BfaI adapter sequences to the sticky ends using T4 DNA ligase. Approximately 90% of the genome was then discarded through size exclusion via spin chromatography and biotin-streptavidin paramagnetic [...] according to the MID-barcodes that were attached to DNA fragments from each of the individual samples. All reads from all subspecies were pooled and a de novo assembly was created using Newbler v.2.6 (454 Life Sciences 2006-2011), after which all reads from all subspecies were mapped onto this assembly using the reference mapping function in CLCBio Workbench.
Putative autapomorphic SNPs were identified by comparing sequences derived from a single cutthroat trout subspecies to the sequences of the other subspecies combined. Comparisons were completed using SNP_Finder_Plus, a custom Perl script described by Maughan et al. [6]. A similar comparison between rainbow trout and cutthroat trout (all subspecies sequences pooled) was used to identify species-specific SNPs. A subset of SNPs was selected for validation from the pool of all putative SNPs using the following criteria: 1) Polymorphisms were unique to a single subspecies of cutthroat or rainbow trout so that only potentially diagnostic SNPs were selected. 2) The SNP locus had a read coverage depth of at least 8 to exclude putative SNPs that were based on too few reads. 3) A minimum of 50 bp existed on either side of the putative SNP for primer binding sites, with no indels or ambiguities within 20 bp of the SNP. 4) The minor allele of the SNP had to be supported by a minimum of 3 reads comprising at least 4% of the total alleles observed at that position, or it was not considered a SNP. This was an arbitrary cutoff designed as an extra filter to ensure that the minor allele is not erroneous (particularly for contigs with high coverage depth). 5) The SNP had at least 95% identity within the subspecies so that only alleles that were fixed or nearly fixed for a given subspecies were considered, and alleles that were unique to a small proportion of individuals within a subspecies were not. This criterion also guards against calling SNPs in misaligned sites. 6) The SNP did not appear in known repeat regions or reside within the mtDNA genome, which was determined using RepeatMasker v.3.3.0 [46] against the Danio reference database and by performing BLAST searches on all contigs with a rainbow trout mtDNA reference genome [GenBank: NC_001717].
We acknowledge that mtDNA SNPs are useful in many cases, but the purpose of this study was to discover unlinked nuclear markers for additional statistical power in future population genetic studies. 7) SNPs were limited to one per contig in an attempt to minimize the number of linked alleles included. Limiting the SNPs to one per contig also served as a screen to eliminate paralogous sequence variants (PSVs). A position-weighted window filter was also used to determine whether SNPs were in a poor alignment region. Every polymorphism that occurred within the window was assigned a score based on its position relative to the SNP, and the scores were summed; if the total reached a threshold of ≥6, the SNP was considered erroneous and skipped. This window filter was designed to avoid calling SNPs in misaligned regions, as well as to avoid areas that aligned paralogous loci, thus serving as an additional screen for PSVs. Primers for 288 SNP loci that met the above criteria (enough to fill three Fluidigm chips) were designed using the default settings in the PrimerPicker software program [47]. Primer sequences are listed in Additional file 1.
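The numeric selection criteria above can be expressed as a simple filter. The following is a minimal Python sketch, not the SNP_Finder_Plus Perl script used in this study; the record layout and all field names are hypothetical, chosen only for illustration. The position-weighted window filter is omitted because its scoring scheme is not specified here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PutativeSNP:
    # All field names are hypothetical; they are not taken from SNP_Finder_Plus.
    contig: str
    position: int                      # 0-based position of the SNP on the contig
    contig_length: int
    depth: int                         # total reads covering the SNP
    minor_allele_reads: int
    subspecies_with_allele: int        # number of subspecies carrying the allele
    within_subspecies_identity: float  # fraction of reads agreeing within the subspecies
    nearest_indel_distance: Optional[int]  # bp to nearest indel/ambiguity, None if absent
    in_repeat_or_mtdna: bool


def passes_filters(snp: PutativeSNP) -> bool:
    """Apply criteria 1-6 from the text to a single putative SNP."""
    if snp.subspecies_with_allele != 1:            # 1) unique to one subspecies
        return False
    if snp.depth < 8:                              # 2) coverage depth of at least 8
        return False
    left_flank = snp.position
    right_flank = snp.contig_length - snp.position - 1
    if min(left_flank, right_flank) < 50:          # 3) 50 bp on either side
        return False
    if snp.nearest_indel_distance is not None and snp.nearest_indel_distance <= 20:
        return False                               # 3) no indel/ambiguity within 20 bp
    if snp.minor_allele_reads < 3:                 # 4) at least 3 minor-allele reads
        return False
    if snp.minor_allele_reads / snp.depth < 0.04:  # 4) at least 4% of observed alleles
        return False
    if snp.within_subspecies_identity < 0.95:      # 5) at least 95% identity
        return False
    if snp.in_repeat_or_mtdna:                     # 6) not in repeats or mtDNA
        return False
    return True


def select_one_per_contig(snps: List[PutativeSNP]) -> List[PutativeSNP]:
    """Criterion 7: keep only the first passing SNP on each contig."""
    chosen = {}
    for snp in snps:
        if passes_filters(snp) and snp.contig not in chosen:
            chosen[snp.contig] = snp
    return list(chosen.values())
```

Keeping the first passing SNP per contig is an arbitrary choice made for this sketch; the text does not state how a single SNP was picked when a contig carried several candidates.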

SNP genotyping
Prior to genotyping the SNPs, specific target amplification (STA) was used to pre-amplify each SNP locus. STA primers are non-allele specific and do not carry a polymorphic base on their 3' end (Additional file 1). STA reactions consisted of 2.5 μl of Qiagen 2X Multiplex PCR Master Mix, 0.5 μl of the 10X STA primer assays (which consisted of 192 μl STA primer, 192 μl constant primer [2 μl each primer per reaction × 96 reactions], and 16 μl TE buffer [10 mM Tris, 1 mM EDTA, pH 7.5]), and 0.75 μl DNase free water per reaction. A total of 3.75 μl of the STA premix was combined with 1.25 μl of genomic DNA and amplified with the following thermal profile: 95°C for 15 minutes followed by 14 cycles of 95°C for 15 seconds and 60°C for 4 minutes. Following PCR, we diluted the STA products (1:100) with nuclease free water.
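As a quick consistency check on the volumes given above, the arithmetic can be reproduced in a few lines of Python. This is only an illustrative sketch; the function `scale_premix` and its `overage` parameter are hypothetical conveniences and are not part of the protocol.

```python
# Volumes (in microliters) taken from the STA protocol described in the text.
N_REACTIONS = 96
PRIMER_UL_PER_REACTION = 2.0

# 10X STA primer assay mix: STA primer + constant primer + TE buffer.
sta_primer_ul = PRIMER_UL_PER_REACTION * N_REACTIONS       # 192
constant_primer_ul = PRIMER_UL_PER_REACTION * N_REACTIONS  # 192
te_buffer_ul = 16.0
sta_assay_mix_total_ul = sta_primer_ul + constant_primer_ul + te_buffer_ul

# Per-reaction premix components.
PREMIX_UL = {
    "master_mix": 2.5,  # Qiagen 2X Multiplex PCR Master Mix
    "sta_assay": 0.5,   # 10X STA primer assay mix
    "water": 0.75,      # DNase free water
}
premix_total_ul = sum(PREMIX_UL.values())
GENOMIC_DNA_UL = 1.25
reaction_total_ul = premix_total_ul + GENOMIC_DNA_UL


def scale_premix(n_reactions, overage=0.0):
    """Scale premix volumes to n_reactions; overage is a hypothetical pipetting margin."""
    factor = n_reactions * (1.0 + overage)
    return {name: volume * factor for name, volume in PREMIX_UL.items()}


if __name__ == "__main__":
    print("assay mix total (ul):", sta_assay_mix_total_ul)
    print("premix per reaction (ul):", premix_total_ul)
    print("reaction volume (ul):", reaction_total_ul)
    print("premix for 96 reactions:", scale_premix(96))
```

The per-reaction premix components sum to the 3.75 μl stated in the text, and adding 1.25 μl of genomic DNA gives a 5 μl reaction.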

SNP diversity data analysis
To visualize the ability of the SNPs to discern among cutthroat trout lineages, three analyses were performed. First, a NeighborNet phylogenetic network was created using the software program SplitsTree v.4.12.6 [51]. Second, principal coordinates analysis (PCoA) was performed using the Ecodist package v. 1.2.7 [52] available in the statistical software program R v. 2.14.1 [53]. Third, the population genetic software program Structure v.2.2.3 [54] and Structure Harvester [55] were used to infer the number of distinct populations within our panel. Structure analysis was evaluated 30 times for each K (with K ranging from 2 to 18), with 1,000,000 repetitions per run after discarding an initial 100,000 repetitions as burn-in. In three instances, Structure was unable to differentiate among some subspecies (see Results). Each of those three groups of subspecies was extracted from the original input file and re-evaluated 20 times for each K (with K ranging from 1 to 6) following the same procedure as outlined above.

MtDNA Re-sequencing
The mitochondrial ND2 gene was re-sequenced for some individual samples that did not group with their predefined subspecies as expected (see Results). Amplification of ND2 was achieved using PCR primers Gln56F (5'-ACT ACA CCA CTT TCT AGT AAG GTC AGC-3') and Ala13R (5'-GCA TTC AGA AGA TGT GGG ATA AAG TC-3'). Reaction cocktails were 12.5 μl in volume, and contained ~100 ng genomic DNA, 2.25 μl nuclease free water, 0.5 μl each primer, and 6.25 μl Promega GoTaq® Hot Start Green Master Mix. The thermal profile contained an initial denature of 95°C for two minutes to activate the enzyme, 35 cycles of 95°C for 30 seconds, 48°C for 30 seconds, and 72°C for 90 seconds, followed by a rapid cool down to 12°C. The light and heavy strands were each sequenced in 10.5 μl reactions using the same primers that amplified the ND2 gene and Big Dye chemistry. Excess dye terminator was removed using Sephadex columns. Sequencing was performed on an ABI 3730xl automated sequencer located in the DNA Sequencing Center at Brigham Young University.

Results

454-Pyrosequencing, genome assembly and SNP discovery
454-pyrosequencing produced 1,499,670 reads, with an average read length of 379 bp, for a total of 569,060,777 bp from a single run. The reads have been made publicly available via the NCBI Sequence Read Archive (Study #SRA062178). Read counts were not equal among subspecies. This inequality was likely influenced by the differences in sample size, since N ranged from 2 to 4 for each subspecies of cutthroat that was included on the 454 plate (Table 1). Additionally, some MID-barcodes produced more reads than others (Figure 1) even though an attempt was made to mix the samples in equimolar amounts before sequencing. The discrepancy is likely due to difficulties associated with fluorometric quantification of the PCR samples before pooling and/or inaccurate pipetting during the pooling process. Of the cutthroat trout lineages, Bear River cutthroat trout had the most reads (187,976) and coastal cutthroat trout had the fewest (14,607) (see Figure 2).
We discovered 43,558 putative SNPs, of which 6,383 (15%) had BLAST hits to the GenBank refseq_protein database, and 37,175 (85%) did not. Our selection process revealed that 28,887 of those putative SNPs met the first five criteria outlined above for designing SNP primers (unique to a single subspecies, ≥8X coverage at the SNP, 50 bp flanking the SNPs for primer binding sites, minor allele frequency of at least 3 reads, and 95% identical for a subspecies). Pairwise comparisons showing the number of putative SNPs between each subspecies are shown in Table 2. A total of 8,627 of these putative SNPs that met the first five criteria were excluded from primer picking because they were determined to occur in repeat regions or in the mitochondrial genome. Of the remaining 20,260 putative SNPs, we developed primers for 288 SNP loci, taking care not to choose primers from the same contig when possible in an attempt to avoid picking linked loci.
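The counts reported in this paragraph form a simple filtering funnel, and the arithmetic can be checked with a short Python sketch. The variable names are illustrative only; every number comes from the paragraph above.

```python
# Counts reported in the text.
total_putative = 43_558
with_blast_hit = 6_383
passed_first_five = 28_887
in_repeat_or_mtdna = 8_627
primers_designed = 288

# Derived quantities.
without_blast_hit = total_putative - with_blast_hit
pct_with_hit = round(100 * with_blast_hit / total_putative)
pct_without_hit = round(100 * without_blast_hit / total_putative)
remaining_after_masking = passed_first_five - in_repeat_or_mtdna

if __name__ == "__main__":
    print("no BLAST hit:", without_blast_hit)
    print("percent with / without hit:", pct_with_hit, pct_without_hit)
    print("candidates left for primer design:", remaining_after_masking)
    print("loci with primers designed:", primers_designed)
```

The derived values agree with those stated in the text: 37,175 SNPs without BLAST hits, a 15%/85% split, and 20,260 candidates remaining after excluding repeat and mitochondrial regions.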

SNP genotyping and diversity analysis
A total of 125 of the subset of 288 SNP loci for which we developed primers (43%) produced clean amplification signal, yielding genotypic clusters that were visibly separated from each other and could be scored following PCR amplification and SNP genotyping. Sequence information for all 125 validated SNPs has been deposited in the GenBank database [GenBank accession #s are listed in Additional file 1]. All 125 validated SNPs are listed with the minor allele frequencies and the subspecies that carried the minor allele in Additional file 2. Westslope cutthroat trout had the lowest number of polymorphic SNPs with 20, whereas greenback cutthroat trout had the highest number of polymorphic SNPs with 75 (Table 3). Westslope cutthroat trout also had the lowest number of highly polymorphic SNPs with 7, and Bear River cutthroat had the highest number of highly polymorphic SNPs with 22 (Table 3).
The diversity panel comprised 95 individuals representing nine lineages of cutthroat trout (n=10 per lineage), rainbow trout (n=5), and a negative control. The total number of polymorphic SNPs ranged from 19 to 75 for these cutthroat trout subspecies and rainbow trout (Table 3). Minor-allele frequencies (MAF) ranged from 0.05 to 0.50 for each subspecies of cutthroat trout, and from 0.10 to 0.50 for rainbow trout (Table 3). The mean MAF value for all subspecies of cutthroat trout was 0.198 per SNP locus. Because SNPs are biallelic markers, the maximum MAF for any given SNP locus is 0.50, which occurs when both SNP alleles are present at equal frequencies in the sample population. Therefore, considering SNP loci that exhibited a MAF ≥ 0.3 to be highly polymorphic, the number of highly polymorphic SNPs in these cutthroat trout subspecies ranged from 7 to 22 (Table 3). Minor allele frequencies for all 125 SNPs reported here are listed in Additional file 2.
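To make the MAF definition concrete, the following Python sketch computes the minor-allele frequency of one biallelic locus from diploid genotype calls and applies the MAF ≥ 0.3 threshold. The genotype encoding (two-letter strings such as "AG", with None for a missing call) and the function names are assumptions made for illustration; they do not reflect the genotyping software's output format.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the MAF for one biallelic locus.

    genotypes: iterable of two-character diploid calls (e.g. "AG"),
    with None for missing calls. This encoding is hypothetical.
    """
    counts = Counter()
    for call in genotypes:
        if call is None:
            continue
        counts.update(call)  # count each of the two alleles
    if len(counts) > 2:
        raise ValueError("locus is not biallelic")
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # monomorphic or no data
    return min(counts.values()) / total


def is_highly_polymorphic(maf, threshold=0.3):
    """Threshold of 0.3 follows the definition used in the text."""
    return maf >= threshold


if __name__ == "__main__":
    # Ten individuals, one heterozygote: the rarer allele appears once in 20 copies.
    locus = ["AA"] * 9 + ["AG"]
    maf = minor_allele_frequency(locus)
    print(maf, is_highly_polymorphic(maf))
```

With ten diploid individuals per lineage, a single heterozygote contributes one minor allele out of 20 copies, so the smallest possible nonzero MAF is 0.05, matching the lower bound reported for the cutthroat trout lineages.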
The SplitsTree analysis produced a NeighborNet phylogenetic network that separates the cutthroat trout subspecies (Figure 3). Coastal cutthroat trout clustered together with rainbow trout. Westslope cutthroat clustered with the coastal-rainbow group, but with a large genetic distance between them. Lahontan Basin cutthroat trout exhibited great genetic distance between them and the other subspecies, as did the coastal-westslope-rainbow group. Of the interior cutthroat trout subspecies (i.e., Bear River, Bonneville, Colorado River, Greenback, Rio Grande and Yellowstone), Bonneville and Rio Grande cutthroat separated from the others, although Rio Grande cutthroat appear in two separate parts of the network (Figure 3). Bear River and Yellowstone cutthroat cluster together, as do Colorado River and greenback cutthroat trout. Moreover, 6 of the 95 individuals did not separate as expected according to their a priori subspecies designation. One Lahontan Basin cutthroat individual (LAH95362) grouped with rainbow trout rather than with the other Lahontan Basin individuals. Three Rio Grande cutthroat individuals (RG90712, RG90714, and RG90732) clustered with the group containing Yellowstone and Bear River cutthroat rather than with the other seven Rio Grande cutthroat individuals. One Bear River cutthroat individual (BR239318) grouped with Bonneville cutthroat, and one Colorado River cutthroat individual (CR134518) connected to the base of a branch leading to a group containing seven Rio Grande cutthroat individuals. Re-sequencing of the mitochondrial gene ND2 showed that the "problematic" Lahontan Basin cutthroat individual carried rainbow trout mtDNA, and that the Colorado River cutthroat individual carried Greenback cutthroat trout mtDNA. The Rio Grande and Bear River cutthroat individuals carried mtDNA sequences of their respective subspecies. Principal coordinates analysis results showed that Principal Coordinate 1 explained 35.8% of the total variance, and Principal Coordinate 2 explained 15.7% (Figure 4). Thus, the first two principal coordinates combined explained 51.5% of the total variance observed in the distance matrix. Groups on the PCoA plot (Figure 4) replicate the groups observed on the phylogenetic network (Figure 3).
Figure 1 Individual read numbers. Number of reads for each cutthroat trout and rainbow trout individual, identified by BYU # and MID-barcode. BYU #s are prefaced with a corresponding subspecies abbreviation as follows: BR=Bear River cutthroat, BON=Bonneville cutthroat, COA=Coastal cutthroat, CR=Colorado River cutthroat, GB=Greenback cutthroat, LAH=Lahontan Basin cutthroat, RG=Rio Grande cutthroat, WS=westslope cutthroat, YS=Yellowstone cutthroat, and RBT=rainbow trout.

Structure analysis and the results of Structure Harvester using the Evanno method revealed six distinct populations of cutthroat trout and rainbow trout, not the ten that we defined a priori (see Figure 5). Structure grouped Colorado River cutthroat trout and greenback cutthroat trout together rather than as distinct subspecies, and did the same for Bear River and Yellowstone cutthroat. Similarly, coastal cutthroat, westslope cutthroat and rainbow trout were also considered to be a single population in the Structure results. Several individuals, including those that did not cluster as expected on the phylogenetic network, showed evidence of hybridization, as illustrated by mixed colors in the bars representing those individuals (Figure 5). Further Structure analyses revealed two distinct populations within the westslope/coastal/rainbow group, with westslope cutthroat cleanly separating from coastal cutthroat and rainbow trout (Figure 6). Reanalysis of the Bear River/Yellowstone group resulted in two distinct populations of cutthroat trout, although the boundary between them was not as clean cut as the boundary between westslope cutthroat and the coastal cutthroat/rainbow trout group (Figure 6). Reanalysis of the Colorado River/Greenback group resulted in three distinct populations, although many individuals showed signs of genetic admixture, and all individuals that fell into the third group showed signs of hybridization with other subspecies of cutthroat trout (Figure 5 and Figure 6).
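For readers unfamiliar with the Evanno method named above, the ΔK statistic is the absolute second-order change in ln P(D) across successive K values, divided by the standard deviation of ln P(D) among replicate runs at K. The Python sketch below is a minimal illustration, assuming the second-order difference is taken on the mean ln P(D) per K; it is not the Structure Harvester code, and the likelihood values in the example are invented placeholders rather than results from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute Evanno's delta K for each interior K.

    lnp_by_k: dict mapping K -> list of ln P(D) values from replicate runs.
    Returns a dict K -> delta K for every K that has both K-1 and K+1 present
    and a nonzero standard deviation among replicates.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if (k - 1) not in means or (k + 1) not in means:
            continue  # second-order difference undefined at the ends
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        sd = stdev(lnp_by_k[k])
        if sd > 0:
            delta[k] = second_diff / sd
    return delta


if __name__ == "__main__":
    # Invented placeholder likelihoods, two replicates per K, for illustration only.
    example = {
        2: [-1000.0, -1002.0],
        3: [-900.0, -902.0],
        4: [-890.0, -892.0],
        5: [-888.0, -890.0],
    }
    dk = evanno_delta_k(example)
    best_k = max(dk, key=dk.get)
    print(dk, best_k)
```

The K with the largest ΔK is usually taken as the best-supported number of clusters. ΔK is undefined for the smallest and largest K tested, which is one reason to test a range wider than the expected number of groups.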

Discussion
We have implemented genomic and bioinformatic protocols to discover over 28,000 putative SNPs among cutthroat trout subspecies. We were able to scrutinize these data to develop a SNP assay that contains 125 nuclear SNPs and is capable of differentiating most subspecies of cutthroat trout from one another and from rainbow trout. The SNP assay is a fast and cost-effective way to identify individuals of unknown genetic background to subspecies, and can be a valuable tool for management agencies in their efforts to evaluate the genetic structure of cutthroat trout populations in western North America, especially prior to constructing and implementing conservation plans. These 125 putatively unlinked nuclear SNPs also allow for the detection of hybrid individuals, as evidenced by the results of our SNP diversity analyses. Indeed, the majority of the individuals that did not cluster as predicted in the phylogenetic network (see Results and Figure 3) carried a signature of genetic admixture between cutthroat trout subspecies in the results of Structure analysis (see Figure 5). For example, the three Rio Grande cutthroat individuals that clustered with Bear River and Yellowstone cutthroat in the phylogenetic network (RG90712, RG90714, and RG90732) carried Rio Grande cutthroat mtDNA haplotypes, but appear to carry both Rio Grande and Bear River-Yellowstone cutthroat alleles in the Structure bar graph (Figure 5). Moreover, the Lahontan Basin cutthroat trout individual that clustered with rainbow trout on the network did, in fact, carry rainbow trout mtDNA, and has mostly rainbow trout SNP alleles (Figure 5), thus illustrating a case where a probable misidentified specimen was successfully identified using the SNP markers (the sample was provided as a fin clip, so we were unable to go back to the voucher specimen to reassess the species identification). The Structure analysis was also useful in detecting other heterozygous/introgressed individuals, even if they did cluster with the pre-assigned subspecies on the network, as evidenced by a number of mixed bars representing Bear River, Bonneville, coastal, Colorado River, greenback, Lahontan Basin and Rio Grande cutthroat trout individuals (Figure 5).
Figure 3 Phylogenetic network. NeighborNet phylogenetic network illustrating genetic distances among cutthroat trout species and rainbow trout. Individual names are represented by abbreviations for species/subspecies designations followed by BYU ID numbers. Subspecies abbreviations are as follows: BR=Bear River cutthroat, BON=Bonneville cutthroat, COA=Coastal cutthroat, CR=Colorado River cutthroat, GB=Greenback cutthroat, LAH=Lahontan Basin cutthroat, RG=Rio Grande cutthroat, WS=westslope cutthroat, YS=Yellowstone cutthroat, and RBT=rainbow trout. Individuals are highlighted with the same colors used to designate unique groups in the Structure results (see Figures 5 and 6).
The groups on the phylogenetic network (Figure 3) and in the PCoA results (Figure 4) accurately reflect what is known about phylogenetic relationships among cutthroat trout subspecies. It is generally accepted that coastal cutthroat trout was the first to branch off from the other cutthroat trout after the initial divergence between cutthroat and rainbow trout [25], so the apparent close genetic distance between coastal cutthroat trout and rainbow trout is not surprising. While there are many SNPs that differentiate cutthroat trout from rainbow trout [28,40,41], the majority of the SNP primers examined herein were chosen specifically to detect differences among subspecies of cutthroat trout. Of the 288 SNP loci for which primers were developed in this study, only six were selected that should have detected differences between rainbow and cutthroat trout, and not all of those amplified reliably (Additional file 2), so the inability to clearly differentiate the two species likely results from an ascertainment bias that is a direct result of that under-sampling.
[Figure caption, beginning missing] [...] and moving to the right: Colorado River/greenback cutthroat (red), westslope/coastal/rainbow trout (green), Bear River/Yellowstone cutthroat (blue), Lahontan Basin cutthroat (yellow), Bonneville cutthroat (purple), and Rio Grande cutthroat (orange). Bars represent unique individuals, each of which is labeled using an abbreviation for the a priori subspecies designation (given in Figure 3) followed by that individual's BYU identification number.
It is somewhat surprising that the initial Structure analysis was unable to separate westslope cutthroat from coastal cutthroat and rainbow trout, because both the phylogenetic network (Figure 3) and the PCoA results (Figure 4) show what appears to be a large genetic distance between westslope cutthroat and the other lineages. The fact that westslope cutthroat did separate from coastal cutthroat and rainbow trout in the second Structure analysis suggests that the signal from a small number of alleles unique to westslope cutthroat trout (Additional file 2) may have been overridden by the signal from the larger dataset. Lahontan Basin cutthroat trout are also separated from the other cutthroat trout subspecies by large genetic distances on our network (Figure 3), and they separated from the other subspecies in PCoA (Figure 4) and Structure analyses (Figure 5), which is consistent with previously published genetic distances and their hypothesized positions in published phylogenies [20,21,25,32]. The relatively smaller genetic distances among the other five lineages of cutthroat trout are also consistent with previously published data, and seem to correspond well with the evolution of these lineages in separate drainage basins. However, results of our Structure analyses show that these SNPs were initially unable to differentiate Bear River cutthroat from Yellowstone cutthroat, and only did so when the Bear River/Yellowstone subset was reanalyzed. The inability of even the second Structure analyses to cleanly separate these two subspecies may be because the final separation between Bear River and Yellowstone cutthroat likely corresponds with late Pleistocene events that resulted in the capture of the Bear River into the Bonneville Basin [26]. Gene flow was likely possible between Bear River and Yellowstone cutthroat trout populations up until the Bear River was diverted into the Bonneville Basin in the late Pleistocene [33-35]. It is possible that these two lineages have not been separated long enough for mutations to become fixed in each subspecies. It is also possible that our results are confounded by widespread stocking of Yellowstone cutthroat trout by management agencies prior to the recognition of unique lineages. Unfortunately, we are unable to distinguish between these two scenarios. Similarly, we were not able to cleanly differentiate Colorado River and greenback cutthroat trout from each other using this suite of SNPs, even after reanalyzing the reduced datasets using Structure. It is unclear whether the inability of these SNPs to differentiate between Colorado River and greenback cutthroat arises because these subspecies have diverged too recently for alleles to have reached fixation, or because introductions have resulted in introgressive hybridization in what we initially treated as non-admixed populations. Considering the close proximity of the drainages in which these subspecies reside and the frequency with which cutthroat trout were stocked in the past, the latter scenario is certainly plausible. A number of bars on the Structure bar graph represent Colorado River and greenback cutthroat trout individuals that appear to be hybrids (Figure 5), which lends support to the latter scenario. Clearly, additional research is needed to resolve this issue.
Figure 6 Additional Structure results. Results of secondary Structure analyses showing additional substructure in three groups that clustered as unique populations in the first set of analyses. Those groups are as follows: westslope/coastal/rainbow, Bear River/Yellowstone, and Colorado River/greenback. Secondary Structure analyses show two distinct populations in the westslope/coastal/rainbow group, two distinct populations in the Bear River/Yellowstone group, and three distinct populations in the Colorado River/greenback group.
Additional studies focused on SNP development in the Interior group of cutthroat trout (i.e., Bear River, Bonneville, Colorado River, greenback, Rio Grande, and Yellowstone) are warranted, particularly to search for fixed alleles, if they exist, between Bear River and Yellowstone cutthroat trout and between Colorado River and greenback cutthroat trout.
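To make the notion of a fixed allelic difference concrete, the following is a minimal Python sketch of how a single locus could be screened for one. It is an illustration only, not the procedure used in this study: the function names, the two-letter genotype encoding (e.g., "AG"), and the example genotypes are all hypothetical.

```python
from collections import Counter


def allele_counts(genotypes):
    """Count alleles across diploid genotype strings such as "AG".

    Missing calls (None or empty strings) are skipped.
    """
    counts = Counter()
    for genotype in genotypes:
        if genotype:
            counts.update(genotype)  # adds one count per allele character
    return counts


def is_fixed_difference(group_a, group_b):
    """Return True if each group carries exactly one allele and the alleles differ."""
    a = allele_counts(group_a)
    b = allele_counts(group_b)
    return len(a) == 1 and len(b) == 1 and set(a) != set(b)


# Hypothetical genotypes at a single locus (not data from this study).
bear_river = ["AA", "AA", None, "AA"]
yellowstone = ["GG", "GG", "GG"]
colorado_river = ["CC", "CT", "CC"]
greenback = ["CC", "CC", "TT"]

fixed_br_ys = is_fixed_difference(bear_river, yellowstone)
fixed_cr_gb = is_fixed_difference(colorado_river, greenback)
```

In this toy example the first pair shows a fixed difference and the second does not, because both alleles occur in each of the latter two groups. With small samples, an apparent fixed difference should be treated as provisional, since a rare allele may simply not have been sampled.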

Conclusions
The SNP markers reported here have added to a rapidly growing body of markers that can be used in cutthroat trout population genetic studies, and should be a valuable resource in future attempts to evaluate the genetic composition of cutthroat trout populations in western North America, including the detection of hybrids. These results reiterate that cutthroat trout subspecies are geographically and evolutionarily distinct, and ought to continue to be managed as such by state and federal agencies.
The method used to discover these SNP loci in cutthroat trout was developed for SNP discovery in the eudicot genus Amaranthus [6]. Because the method was also successful for SNP discovery in something as evolutionarily distant as cutthroat trout, it should be applicable to SNP discovery in many different kinds of non-model organisms.

Additional files
Additional file 1: SNP primer table. SNP marker names, GenBank accession numbers, the type of polymorphism for each SNP, allele-specific primers, common reverse primers, and specific target amplification primers are listed herein.
Additional file 2: Characterization of each SNP locus. Major and minor alleles for all 125 SNP loci, along with the proportions of individuals within each a priori designated subspecies that carry the minor allele, as well as minor allele frequencies for each SNP locus, are listed herein.
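The per-locus quantities described for Additional file 2 can be illustrated with a short Python sketch. It assumes a biallelic locus and genotypes encoded as two-letter strings grouped by subspecies abbreviation; the function name, the data layout, and the example genotypes are hypothetical and do not reproduce the authors' workflow or data.

```python
from collections import Counter


def summarize_locus(genotypes_by_group):
    """Summarize one biallelic SNP locus.

    genotypes_by_group maps a group label to a list of diploid genotype
    strings such as "AG"; missing calls are None or empty strings.
    Returns (major allele, minor allele, minor allele frequency,
    proportion of called individuals per group carrying the minor allele).
    """
    counts = Counter()
    for genotypes in genotypes_by_group.values():
        for genotype in genotypes:
            if genotype:
                counts.update(genotype)
    if len(counts) != 2:
        raise ValueError("expected a biallelic locus")

    # If both alleles are equally common, the first one seen is reported as major.
    (major, _), (minor, minor_n) = counts.most_common(2)
    maf = minor_n / sum(counts.values())

    carriers = {}
    for group, genotypes in genotypes_by_group.items():
        called = [g for g in genotypes if g]
        # Proportion of called individuals with at least one copy of the minor allele.
        carriers[group] = (
            sum(minor in g for g in called) / len(called) if called else None
        )
    return major, minor, maf, carriers


# Hypothetical genotypes at one locus (not data from this study).
example = {"BON": ["AA", "AG", "AA"], "RG": ["GG", "AG"]}
major, minor, maf, carriers = summarize_locus(example)
```

Here the minor allele frequency is computed over all called alleles pooled across groups, whereas the carrier proportion is computed per group and counts heterozygotes and minor-allele homozygotes alike.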