More comprehensive forensic genetic marker analyses for accurate human remains identification using massively parallel DNA sequencing

Background: Although the primary objective of forensic DNA analyses of unidentified human remains is positive identification, cases involving historical or archaeological skeletal remains often lack reference samples for comparison. Massively parallel sequencing (MPS) offers an opportunity to provide biometric data in such cases, and these cases in turn provide valuable data on the feasibility of applying MPS to the characterization of modern forensic casework samples. In this study, MPS was used to characterize 140-year-old human skeletal remains discovered at a historical site in Deadwood, South Dakota, United States. The remains were in an unmarked grave, and no records or other metadata were available regarding the identity of the individual. Due to the high throughput of MPS, a variety of biometric markers could be typed using a single sample.

Results: Using MPS and suitable forensic genetic markers, more relevant information could be obtained from a sample of limited quantity and quality. Results were obtained for 25/26 Y-STRs, 34/34 Y-SNPs, 166/166 ancestry-informative SNPs, 24/24 phenotype-informative SNPs, 102/102 human identity SNPs, 27/29 autosomal STRs (plus amelogenin), and 4/8 X-STRs (as well as ten regions of mtDNA). The Y-chromosome (Y-STR, Y-SNP) and mtDNA profiles of the unidentified skeletal remains are consistent with the R1b and H1 haplogroups, respectively. Each is the most common haplogroup of its type in Western Europe. Ancestry-informative SNP analysis also supported European ancestry. The genetic results are consistent with anthropological findings that the remains belong to a male of European ancestry (Caucasian). Phenotype-informative SNP data provided strong support that the individual had light red hair and brown eyes.

Conclusions: This study is among the first to genetically characterize historical human remains with forensic genetic marker kits specifically designed for MPS. The outcome demonstrates that substantially more genetic information can be obtained from the same initial quantities of DNA as are used in current CE-based analyses.

Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3087-2) contains supplementary material, which is available to authorized users.


Background
The paramount goal of forensic DNA testing of human skeletal remains is identification of the unknown individual. A variety of genetic markers can be used to achieve identification, including highly polymorphic autosomal short tandem repeat (STR) loci and lineage markers [Y-STRs, Y chromosome single nucleotide polymorphisms (Y-SNPs), mitochondrial DNA (mtDNA)]. However, reference samples must be available for comparison for these markers to be informative. In mass disasters, missing persons cases, or cases involving historical/archaeological remains, sometimes there are no clues as to the person's potential identity and/or no associations can be made with a reference sample or reference pedigree via a database search [1]. In such scenarios, identification can be difficult or impossible using solely autosomal STRs and lineage markers. However, other genetic markers can extend human identification capabilities, such as ancestry-informative markers [2][3][4][5][6][7] and phenotype-informative SNPs [7][8][9][10][11].
Massively parallel sequencing (MPS) of ancestry- and phenotype-informative SNPs, with its expanded capacity for marker typing, offers the ability to develop investigative leads in such cases [12][13][14][15][16]. Thus, more genetic information can be gleaned from a sample without further consumption of material that is often very limited in quantity and quality. In this study, MPS was used in an effort to help characterize 140-year-old human skeletal remains that were buried in an unmarked grave in Deadwood, South Dakota, USA, a famous town of the American Old West.
In 1874, the discovery of gold in the Black Hills of South Dakota set off one of the last great gold rushes in America. In 1876, miners moved to the area and formally established the city of Deadwood, now a U.S. historical landmark. Deadwood's original cemetery, Ingleside, was located near the town's core business district and contained approximately 100 burials (although cemetery records are incomplete and some were buried in unmarked graves). In 1878, the individuals interred there were relocated to the hills above Deadwood, and Mount Moriah Cemetery was established.
In 2012, a set of unidentified human skeletal remains was unearthed by a construction crew in Deadwood's Presidential District, the original site of Ingleside Cemetery [17][18][19]. South Dakota State archaeologists and historic preservation officials for the City of Deadwood recovered the skeleton from the site (with the exception of one tooth and a few bones from the hands and feet). Anthropological analyses indicated that the remains are consistent with a male of European ancestry (Caucasian) who was 18-24 years of age at the time of death and 65.7-70.7 inches tall. No indications of the cause of death were evident in the skeletal samples [19][20][21].
Forensic odontological analyses determined that this unknown individual was a habitual tobacco user and had nine dental fillings (3 gold, 6 tin/amalgam). The latter observation is indicative of some level of affluence/wealth, as most individuals in the late 19th century would simply have had unhealthy teeth extracted [20,21].
In June 2014, the City of Deadwood and the Deadwood Historic Preservation Commission requested that the Institute of Applied Genetics (IAG) conduct DNA testing on the remains to provide some level of identification [18][19][20][21]. Given that the remains were in an unmarked grave and no investigative leads existed regarding his identity, Deadwood city officials were interested in the analysis of DNA markers that could help predict the individual's ancestry and external physical traits. Markers chosen for analysis included Y-STRs, Y-SNPs, ancestry-informative SNPs, phenotype-informative SNPs, and mitochondrial DNA (mtDNA). To the best of our knowledge, this study is among the first to genetically characterize historical human remains with forensic genetic marker kits specifically designed for MPS.

Methods
The practices for minimizing contamination during the analysis of the Deadwood remains were the same contamination controls recommended for archaeological and ancient DNA specimens, including: (a) use of protective suits, gloves, and masks; (b) bleach decontamination and UV-irradiation of work benches and associated equipment; (c) physical removal and/or chemical destruction of contaminant/exogenous DNA on external bone surfaces; (d) extraction of bone samples in a designated low-copy area; (e) PCR amplification in a location that is physically separated from the extraction area; (f) use of appropriate negative controls, reagent blanks, and positive controls; and (g) replicate testing [22][23][24][25][26][27][28].

Bone processing and DNA extraction
The right femur was provided to the IAG for DNA testing (Loan Accession No. 12-0051, South Dakota Archaeological Research Center) (Fig. 1). A portion of the . DNA extractions were performed on six of the eight bone sections in a designated low-copy number (LCN) area of the laboratory, as described in Ambers et al. [29].

DNA quantification
The quantity of DNA from seven bone powder fractions was determined using the Quantifiler® Human DNA Quantification Kit and an ABI 7500 Real-Time PCR System (Thermo Fisher Scientific, Waltham, Massachusetts USA), according to the manufacturer's recommendations [30].

Traditional Y-STR typing via capillary electrophoresis
Human genomic DNA was amplified with reagents from the AmpFlSTR® Yfiler™ PCR Amplification Kit and a GeneAmp® PCR System 9700 (Thermo Fisher Scientific), according to the manufacturer's recommendations [31]. Negative controls consisted of 10 μl low-TE buffer and 10 μl 9947A female DNA (0.1 ng/μl); 10 μl 007 Male Control DNA (0.1 ng/μl) served as the positive control. PCR products were separated via capillary electrophoresis (CE) on a 3500xl Genetic Analyzer and analyzed using GeneMapper® ID-X software (Thermo Fisher Scientific). DNA (elution #1 and elution #2) from seven bone powder fractions was typed.

Final data analysis
Minimum detection thresholds of 30X and 10X coverage were set for the autosomal markers and mtDNA typed by MPS in this study, respectively. The Y haplogroup was determined using the ancestry feature and metapopulation tool of the Y-STR haplotype reference database YHRD (www.yhrd.org). A PCA plot of ancestry-informative SNP data was generated with the Illumina® ForenSeq™ Universal Analysis Software. Mitochondrial DNA sequence alignment was performed with the mitoSAVE workbook [38], and haplogroup determination was made using HaploGrep software (http://haplogrep.uibk.ac.at/) [39]. Phenotypic SNP data were analyzed with the Illumina® ForenSeq™ Universal Analysis Software as well as with the HIrisPlex hair/eye color prediction tool (http://hirisplex.erasmusmc.nl) [9,10].
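The coverage thresholds stated above can be applied as a simple filter. The following Python sketch is illustrative only: the `Call` record, the marker labels, the example calls, and the decision to treat each threshold as inclusive (coverage at or above the minimum) are assumptions and not part of the published workflow. Only the 30X and 10X values come from the text.

```python
from typing import Dict, List, NamedTuple

# Minimum read depth per marker class, as stated in the text.
MIN_COVERAGE: Dict[str, int] = {"autosomal": 30, "mtDNA": 10}


class Call(NamedTuple):
    marker: str        # hypothetical marker label
    marker_class: str  # "autosomal" or "mtDNA"
    genotype: str
    coverage: int      # read depth supporting the call


def reportable(calls: List[Call]) -> List[Call]:
    """Keep calls whose depth meets the threshold for their class.

    Assumption: the threshold is inclusive (coverage >= minimum).
    """
    return [c for c in calls if c.coverage >= MIN_COVERAGE[c.marker_class]]


# Hypothetical example calls (not data from the study).
example_calls = [
    Call("snp_A", "autosomal", "A/G", 31),
    Call("snp_B", "autosomal", "C/C", 12),
    Call("mt_region_1", "mtDNA", "T", 10),
    Call("mt_region_2", "mtDNA", "C", 9),
]

kept = [c.marker for c in reportable(example_calls)]
print(kept)
```

Keeping the thresholds in a single lookup table makes it easy to see which minimum applies to which marker class when several panels are analyzed together.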

Results and discussion
DNA concentrations recovered from the right femur powder fractions ranged from 0.0147-0.3350 ng/μl for elution #1 and 0-0.0579 ng/μl for elution #2, respectively. The elution volume for each DNA extract was 30 μl, and the total DNA recovered per elution is reported in Table 1.
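Total DNA per elution follows from the concentration multiplied by the elution volume. As a worked check, the Python sketch below applies this to the endpoints of the concentration ranges quoted above. The function and variable names are illustrative assumptions; the per-fraction totals themselves are reported in Table 1 and are not reproduced here.

```python
ELUTION_VOLUME_UL = 30.0  # elution volume per extract, from the text


def total_dna_ng(concentration_ng_per_ul: float,
                 volume_ul: float = ELUTION_VOLUME_UL) -> float:
    """Total DNA (ng) = concentration (ng/ul) x elution volume (ul)."""
    return concentration_ng_per_ul * volume_ul


# Endpoints of the concentration ranges quoted in the text (ng/ul).
ranges = {
    "elution #1": (0.0147, 0.3350),
    "elution #2": (0.0, 0.0579),
}

# Round to three decimals to suppress floating-point noise.
totals = {
    name: tuple(round(total_dna_ng(c), 3) for c in bounds)
    for name, bounds in ranges.items()
}

for name, (low, high) in totals.items():
    print(f"{name}: {low} to {high} ng")
```

On these figures, the extracts span roughly 0.44 to 10.05 ng of total DNA for elution #1 and 0 to 1.74 ng for elution #2.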
A variety of STR and SNP markers were analyzed via CE and MPS. No DNA was detected in any of the negative controls and reagent blanks, and positive controls yielded the correct type for all analyses.  Table 2. Fifteen of the twenty-six Y-STR markers analyzed with the Illumina® ForenSeq™ DNA Signature Prep Kit overlap with the AmpFlSTR® Yfiler™ PCR Amplification Kit. The Y-STR alleles recovered from all bone samples among the common markers between MPS and CE were concordant. Y-STR typing results were obtained for 17 of the 26 markers assayed with MPS (Table 3); coverage ranged from 31x to 620x [148 ± 137 (Avg ± SD)]. The total number of Y-STR loci that yielded results for both methods was 25.
The composite 25-locus Y-STR profile generated with AmpFlSTR® Yfiler™ and the additional Y-STR loci from the Illumina® ForenSeq™ DNA Signature Prep Kit is consistent with the R1b haplogroup. R1b is the most common Y haplogroup in Western Europe, found in 80 % of the population in Ireland, western Wales, the Scottish Highlands, the Atlantic fringe of France, Catalonia, and the Basque country. It is also common around the Caucasus and in Anatolia, in parts of Russia, and in Central and South Asia [40][41][42][43][44][45].
In addition to Y-STR data, a consensus Y-SNP profile was compiled using data from three different bone powder fractions from the Deadwood unidentified skeletal remains. All 34 upper clade Y-SNPs in the HID-Ion AmpliSeq™ Identity Panel provided typing results [Additional file 1], and these haplogroup-informative Y-SNP results also supported an R1b haplogroup assignment.
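A consensus profile of this kind can be sketched as follows. The rule used here (accept a call at a marker only when at least two fractions report it and no fraction reports a different one) is an assumption made for illustration, since the text does not state the exact consensus criteria. The marker labels and allele calls are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional

Profile = Dict[str, Optional[str]]  # marker -> allele call, None = no result


def consensus_profile(replicates: List[Profile], min_support: int = 2) -> Profile:
    """Build a consensus call per marker across replicate profiles.

    Assumed rule: a call is accepted when it is seen in at least
    `min_support` replicates and no replicate reports a conflicting call.
    """
    markers = sorted({m for rep in replicates for m in rep})
    consensus: Profile = {}
    for marker in markers:
        observed = Counter(
            rep[marker] for rep in replicates if rep.get(marker) is not None
        )
        if len(observed) == 1:
            call, support = next(iter(observed.items()))
            consensus[marker] = call if support >= min_support else None
        else:
            # No data at all, or conflicting calls between replicates.
            consensus[marker] = None
    return consensus


# Hypothetical calls from three bone powder fractions.
fraction_1 = {"ysnp_1": "G", "ysnp_2": "T", "ysnp_3": None}
fraction_2 = {"ysnp_1": "G", "ysnp_2": None, "ysnp_3": "A"}
fraction_3 = {"ysnp_1": "G", "ysnp_2": "T", "ysnp_3": "C"}

result = consensus_profile([fraction_1, fraction_2, fraction_3])
print(result)
```

A stricter or looser rule (for example, requiring agreement in all replicates) only changes `min_support` and the conflict handling; the structure of the comparison stays the same.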

Ancestry informative SNPs
Ancestry-informative SNP results were obtained for 51 of the 54 SNP markers amplified via the Illumina® ForenSeq™ DNA Signature Prep Kit, and for all 165 markers tested using the HID-Ion AmpliSeq™ Ancestry Panel. Depth of coverage ranged from 31x to 2240x (170 ± 107) and 53x to 1190x (379 ± 243), respectively. Fifty-three of the ancestry-informative SNPs in the Illumina® ForenSeq™ kit are included in the HID-Ion AmpliSeq™ Ancestry Panel, and 51 of these SNPs yielded results with both panels. The results were concordant, and a composite profile was generated [Additional file 2]. Using the ancestry-informative SNP data, the major population bio-ancestry was determined to be European (Fig. 2).
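The concordance check and composite profile described above can be illustrated with a short Python sketch. It is a minimal example under stated assumptions: the function names, marker labels, and genotypes are hypothetical, and excluding discordant markers from the composite is a choice made here, not a rule taken from the text. The marker counts passed to `distinct_markers` are the ones given in the paragraph above.

```python
from typing import Dict, List, Tuple


def normalise(genotype: str) -> str:
    """Make genotype comparison order-insensitive, e.g. 'C/A' == 'A/C'."""
    return "/".join(sorted(genotype.split("/")))


def composite_profile(
    panel_a: Dict[str, str], panel_b: Dict[str, str]
) -> Tuple[Dict[str, str], List[str]]:
    """Merge two panels; return (composite, discordant shared markers)."""
    composite: Dict[str, str] = {}
    discordant: List[str] = []
    for marker in sorted(set(panel_a) | set(panel_b)):
        calls = {normalise(p[marker]) for p in (panel_a, panel_b) if marker in p}
        if len(calls) == 1:
            composite[marker] = calls.pop()
        else:
            # Assumption: markers with conflicting calls are left out.
            discordant.append(marker)
    return composite, discordant


def distinct_markers(n_a: int, n_b: int, n_shared: int) -> int:
    """Inclusion-exclusion count of distinct markers across two panels."""
    return n_a + n_b - n_shared


# Hypothetical genotypes for two panels that share snp_2 and snp_3.
panel_a = {"snp_1": "A/A", "snp_2": "C/A", "snp_3": "G/G"}
panel_b = {"snp_2": "A/C", "snp_3": "G/G", "snp_4": "T/T"}

composite, discordant = composite_profile(panel_a, panel_b)
print(composite, discordant)

# Marker counts from the text: 165 and 54 markers, 53 of them shared.
n_distinct = distinct_markers(165, 54, 53)
print(n_distinct)
```

With the counts given above, the two panels together cover 166 distinct ancestry-informative SNPs, which is the total quoted in the abstract.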

Mitochondrial DNA (mtDNA) Analysis
An in-house PCR multiplex assay composed of short amplicons (~200 bp in length) at targeted sites in the coding and non-coding regions (HVI and HVII) of the mitochondrial DNA (mtDNA) genome was used to characterize the maternal lineage of the Deadwood remains. The resulting mtDNA profile is consistent with haplogroup H1, as assigned with HaploGrep [39]. Mitochondrial haplogroup H1 is the most common in Western Europe and is found throughout Europe, northern Africa, the Levant, the Caucasus, Anatolia, and as far as Central Asia and Siberia [46][47][48][49][50][51][52]. Hence, the biogeographic ancestry inferred from the Y-STR, Y-SNP, ancestry-informative SNP, and mtDNA data is consistent with the European ancestry indicated by the anthropological analyses.

Forensic DNA phenotyping
Twenty-four phenotype-informative SNPs were assayed using the Ion Torrent PGM® and HID-Ion AmpliSeq™ Externally Visible Characteristics (EVC) Prototype Panel. Results were obtained for 23 of the 24 phenotype-informative SNPs assayed, with a depth of coverage of 33x to 1419x (282 ± 205) (Table 4). Additional testing was performed on the skeletal samples using the Illumina® ForenSeq™ DNA Signature Prep Kit and MiSeq® platform. Results were obtained for all 24 phenotype-informative SNP markers assayed, with a depth of coverage of 32x to 1187x (288 ± 407). Typing results were concordant between assays and between the two MPS platforms. A composite phenotype-informative SNP profile was generated and is shown in Additional file 4.
Phenotypic SNP analysis was performed using the HIrisPlex hair/eye color prediction tool (http://hirisplex.erasmusmc.nl), which generates individual probabilities for four hair color categories (red, blonde, brown, black), two hair color shades (light, dark), and three eye color categories (blue, intermediate, brown) [9,10]. The 24 predictive DNA variants (23 SNPs and 1 INDEL) of the HIrisPlex assay are included in the Illumina® ForenSeq™ and HID-Ion AmpliSeq™ Externally Visible Characteristics (EVC) Prototype Panel, and the system was designed to cope with low template and degraded DNA. All 24 DNA variants have small amplicon sizes (< 160 bp). In terms of predictive performance, HIrisPlex variants provide blue and brown human eye color predictions with over 90 % precision [9] and average hair color prediction accuracies of 0.70, 0.79, 0.80, and 0.88 for red, blonde, brown, and black hair, respectively [10].
Analysis of the Deadwood skeletal remains indicated that this individual likely had light red hair and brown eyes. Probabilities for hair color, hair color shade, and eye color were 0.69, 0.71, and 0.72, respectively (Table 5).

Other markers assayed with MPS panels
The Illumina® ForenSeq™ DNA Signature Prep Kit and the HID-Ion AmpliSeq™ Identity Panel also contain markers that do not contribute to the characterization of bioancestry or phenotype, but that could nonetheless be typed. With the Illumina® ForenSeq™ DNA Signature Prep Kit, results were obtained for 88/95 human identity SNPs, 27/29 autosomal STRs (plus amelogenin), and 4/8 X-STRs. Coverage ranges for the human identity SNPs, autosomal STRs, and X-STRs were 32x-1085x (217 ± 213), 31x-2838x (297 ± 485), and 31x-361x (170 ± 107), respectively. With the HID-Ion AmpliSeq™ Identity Panel, results were obtained for 90/90 human identity SNPs [Additional files 5, 6 and 7], with a depth of coverage of 33x-1419x (282 ± 205). There are 80 human identity SNPs in common between the two kits, and 75 of these common markers yielded results with both panels. Results were concordant between the two identity SNP panels. These results further support the potential of MPS to enable typing of a much larger number of genetic markers from the same amount of DNA than would have been possible with current CE-based systems.

Conclusion
In an effort to learn more about the late-19th-century human skeletal remains discovered at the site of Deadwood's first cemetery, historic preservation officials enlisted a number of forensic specialists to conduct analyses on the remains that could assist in his identification [17][18][19][20][21][53]. Since the individual was buried in an unmarked grave and no investigative leads existed regarding his identity, lineage testing and forensic DNA phenotyping were performed to predict ancestry and external physical traits.
The Y-chromosome (Y-STR, Y-SNP) and mitochondrial DNA (mtDNA) profiles of the unidentified skeletal remains are consistent with the R1b and H1 haplogroups, respectively. Both of these haplogroups are the most common ones in Western Europe. The ancestry-informative SNPs also indicated a European background. These genetic results are consistent with the findings of a previous anthropological report which determined that the Deadwood unidentified skeletal remains belong to a male of European ancestry (Caucasian). The phenotype-informative SNPs provided strong support that the individual had light red hair and brown eyes. This study is among the first known cases of historical remains to be characterized with genetic panels designed specifically for MPS.
An important point about DNA testing of historical or archaeological skeletal remains should be emphasized. Six bone sections/cuttings were taken, and bone powder fractions from each were analyzed. Adjacent bone sections yielded vastly different results in terms of DNA quantity and number of allele calls; some regions of bone did not yield any DNA, while other areas yielded complete profiles. These findings are consistent with a previous study performed on the 120-year-old skeletal remains of an American Civil War soldier [29], which required testing of multiple bone sections and a consensus testing approach to obtain a complete Y-STR haplotype.
With its capacity for simultaneous analysis of a multitude of different types of DNA markers, MPS technology holds promise for use in the characterization of historical and archaeological remains, and in missing persons cases. In addition, in mass disasters or other types of cases where reference samples are not available/known, genetic markers such as ancestry-informative and phenotype-informative SNPs can provide data for craniofacial reconstructions that could be useful for positive identification.