Sequencing and analysis of a South Asian-Indian personal genome
© Gupta et al.; licensee BioMed Central Ltd. 2012
Received: 21 March 2012
Accepted: 18 August 2012
Published: 31 August 2012
With over 1.3 billion people, India is estimated to contain three times more genetic diversity than does Europe. Next-generation sequencing technologies have facilitated the understanding of diversity by enabling whole genome sequencing at greater speed and lower cost. While genomes from people of European and Asian descent have been sequenced, only recently has a single male genome from the Indian subcontinent been published at sufficient depth and coverage. In this study we have sequenced and analyzed the genome of a South Asian Indian female (SAIF) from the Indian state of Kerala.
We identified over 3.4 million SNPs in this genome, including 89,873 private variations. Comparison of the SAIF genome with several published personal genomes revealed that this individual shared ~50% of the SNPs with each of these genomes. Analysis of the SAIF mitochondrial genome showed that it was closely related to the U1 haplogroup, which has been previously observed in Kerala. We assessed the SAIF genome for SNPs with health and disease consequences and found that the individual was at a higher risk for multiple sclerosis and a few other diseases. In analyzing SNPs that modulate drug response, we found a variation that predicts a favorable response to metformin, a drug used to treat diabetes. SNPs predictive of adverse reaction to warfarin indicated that the SAIF individual is not at risk for bleeding if treated with typical doses of warfarin. In addition, we report the presence of several additional SNPs of medical relevance.
This is the first study to report the complete whole genome sequence of a female from the state of Kerala in India. The availability of this complete genome and variants will further aid studies aimed at understanding genetic diversity, identifying clinically relevant changes and assessing disease burden in the Indian population.
Keywords: Indian genome; Personal genomics; Whole genome sequencing
Since the publication of the first human reference genome in 2001, sequencing technologies have rapidly evolved, leading to increased throughput and reduced cost. Currently, one can obtain a complete human genome in less than two weeks at a cost of ~USD 5,000 or less, whereas the human genome project took over a decade and ~USD 3 billion to complete. This advance has paved the way for obtaining personal human genomes quickly and inexpensively. Comparison of personal genomes and select regions of the genomes against the reference genome has provided a comprehensive view of human genetic diversity. Rapid advances in sequencing technologies have enabled the identification of rare disease risk alleles and facilitated the practice of personalized medicine when making treatment decisions, though such applications are in their infancy[2–8].
Currently, published personal genomes predominantly represent individuals of European ancestry[9–13]. Genomes of individuals representing the Yoruba West-African, Han Chinese, South Korean, Khoisan and Bantu of Africa, Japanese, and Australian aborigines have also been published[14–19]. Recently, an Indian male genome was also published. While a few studies have been conducted to understand the genetic diversity across populations in India, none have catalogued genetic variation at the whole genome level of a female individual from the subcontinent[20–22]. Understanding the extent of variation in the Indian population will be important for identifying clinically relevant changes in the context of the Indian subcontinent.
Using a massively parallel sequencing approach, we have obtained the complete sequence of a South Asian Indian female (SAIF) genome. We identified over 3.4 million SNPs from this genome of which over 89,000 were found to be private SNPs. In performing an analysis of clinically relevant variants we have identified SNPs that indicate susceptibility to multiple sclerosis.
Genome sequencing and alignment to the human reference
Table: Sequencing and analysis statistics. Rows: total paired-end raw reads (each of 100 bases), in millions; total raw bases (Gb); total mapped bases (Gb); mean mapped depth (x); bases accessed (% of genome).
We performed a de novo assembly of reads that did not align to the chromosomes in GRCh37, using SOAPdenovo. This generated 57,426 contigs comprising 23,683,357 bases with an average contig length of 412 bp. Of these, 42.69% of the sequences aligned to the unanchored contigs and chromosomes in GRCh37, and another 9.25% aligned to the alternative human assemblies. About 33.05% of the assembled sequences aligned to other human sequences in the NT database, while another 3.64% aligned to non-human primate sequences with an E-value < 10⁻⁵.
SNPs and indels
In addition to SNPs, insertions and deletions (indels) are a class of variations that shape evolution of genomes[29, 30]. In the SAIF genome, of the total 384,926 indels identified, 190,533 (49.5%) were found in gene coding regions. As observed with SNPs, only 7,871 (4.1%) of these indels (1,591 in coding exons, 1,769 in non-coding exons, 620 in 5’UTR and 3,891 in 3’UTR) occurred within exons. Of the total indels, 248,309 (64.5%) were found in repetitive regions, a proportionally higher share than for SNPs in these regions. This very likely reflects the slippage that occurs during replication, leading to increased occurrence of indels in repeat regions. Further, it is interesting to note that while 34% (85,193) of the indels in repetitive regions fell in simple repeats (Figure 2, Additional file1: Table S1B), only 2% of the SNPs were found in simple repeat regions.
Coding SNPs are predominant in olfactory genes
Variants within gene coding regions
Unlike SNPs, indels in coding regions can lead to frame-shift changes in addition to non-sense mutations. Of the 372 coding region indels, 172 are in-frame and 200 lead to a frame-shift change (Additional file1: Table S4A, Additional file1: Table S4B). Genes where the indel leads to a frame-shift include HIF3A (hypoxia inducible factor 3 alpha subunit), thought to be a negative regulator of hypoxia-inducible gene expression; MMP28, a matrix metallopeptidase involved in the breakdown of extracellular matrix in both normal physiological and disease processes; and HNF1A, a transcription factor required for the expression of several liver-specific genes. The frame-shift at position 147 in the MMP28 protein introduces a premature stop codon at position 179. This results in loss of the zinc-dependent metalloprotease and hemopexin-like repeat domains, leading to a truncated MMP28 protein that lacks a catalytic domain (Additional file2: Figure S2). SIFT analysis indicated 126 indels to be deleterious (Additional file1: Table S5).
Comparison and novel variants
We compared SAIF SNPs against those from other published personal genomes, the variations from the 1000 Genomes Project and dbSNP database (dbSNP132). The personal genomes used to perform the comparison had a sequencing coverage of at least 10X. Shared SNP sites, where both the SAIF genome and the genome it is compared to carry a SNP, provide a measure of the degree of similarity between the genomes. We also compared the indels found in the SAIF genome with those reported by the 1000 Genomes Project.
A SNP-level comparison found that the SAIF individual shared 48.77% of SNP sites with the NA12891 (Caucasian) genome, 48.82% with the NA12892 (Caucasian) genome, 52.5% with the Venter (Caucasian) genome, 50.68% with the NA18507 (YRI) genome, 44.29% with the NA19238 (YRI) genome, 44.33% with the NA19239 (YRI) genome, 53.75% with the YH (Han Chinese) genome, 59.24% with the SJK (Korean) genome, 46.5% with the ABT (South Africa) genome, 51.1% with the Irish (Caucasian) genome, 49.86% with the KB1 (Southern Kalahari, Africa) genome, 59.41% with the recently published Indian male genome, 95.18% with dbSNP 132, and 92.44% with the 1000 Genomes Project variation collection.
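As a rough illustration of the shared-site measure used in this comparison, the following Python sketch computes the percentage of one genome's SNP sites that are also SNP sites in another genome. The function name, the (chromosome, position) representation of a site, and the toy coordinates are assumptions made for illustration; they are not taken from the study's pipeline.

```python
def percent_shared_sites(query_sites, other_sites):
    """Percentage of SNP sites in `query_sites` that are also SNP sites in `other_sites`.

    Each site is a (chromosome, position) tuple. Alleles are ignored, which is
    a simplifying assumption of this sketch.
    """
    query = set(query_sites)
    if not query:
        raise ValueError("query genome has no SNP sites")
    shared = query & set(other_sites)
    return 100.0 * len(shared) / len(query)


# Toy coordinates, invented for illustration only.
genome_a = [("chr1", 1000), ("chr1", 2500), ("chr2", 300), ("chr3", 42)]
genome_b = [("chr1", 1000), ("chr2", 300), ("chr5", 77)]
print(percent_shared_sites(genome_a, genome_b))  # 50.0 (2 of 4 sites shared)
```

Note that the measure is not symmetric: the denominator is the number of sites in the first genome, so swapping the arguments generally gives a different percentage.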
SNPs with health and medical relevance
We assessed cSNPs identified in the SAIF genome using annotations in SNPedia and OMIM for their health and disease relevance. This analysis identified 59 and 63 cSNPs with implications in health and disease from the SNPedia and OMIM databases[45, 46], respectively (Additional file1: Table S8 and Additional file1: Table S9). Interestingly, this analysis revealed several SNPs with implications for susceptibility to cancer and cardiovascular diseases. The cancer susceptibility SNPs included the variation in the SDHB gene (S163P; OMIM_ID # 185470.0015) responsible for Cowden-like syndrome, resulting in enrichment of carcinomas of the human breast due to downstream inactivation of PTEN. We also found an exon 10 BRCA2 variant (N372H; OMIM_ID # 600185.0133) and an EPCAM variant identified in a Chinese population (M143T; rs1126497) that are associated with increased risk for breast cancer. Further, a SNP in the CENPF gene (R2943G; rs438034) that occurs in the SAIF genome is associated with poor breast cancer survival. Other SNPs with increased cancer susceptibility include FCGR2A H166R (rs1801274), associated with increased risk for non-Hodgkin’s lymphoma; ANKK1 E713K (rs1800497), involved in advanced adenoma recurrence; HNF1A S487N (rs2464196), MMP9 Q166R (rs17576-rs2250889), and XPC Q939K (rs2228001) variants associated with lung cancer; ATG16L1 T137A (rs2241880)[56, 57], associated with Crohn’s disease; and OGG1 P332A (rs1052133)[58–60], associated with bladder and gall-bladder cancer in Japanese, Chinese and Indian populations. An ATR variant (M211T; rs2227928) found in the genome has been associated with a poorer response to gemcitabine and radiation therapy in pancreatic cancer. We also found a protective SNP in the PON1 gene (Q192R; rs662) that is known to lower the risk (0.65x) for ovarian cancer.
Two common missense variations in the ELAC2 gene (A541T; OMIM_ID # 605367.0002 and S217L; OMIM_ID # 605367.0001) implicated in genetic susceptibility to hereditary prostate cancer were found in the SAIF genome. While not of direct significance to the SAIF individual, these could be of relevance to her male children, if any[63–65].
The cardiovascular disease associated SNPs found in this individual include those in LRP8 (R952Q; rs5174/OMIM_ID # 602600.0001) and MMP9 (Q166R; rs17576), both of which increase risk for myocardial infarction; a ROS1 variation (S2229C; rs619203) associated with increased coronary heart disease; an AKAP10 SNP (I646V; OMIM_ID # 604694.0001) associated with susceptibility to cardiac conductivity defects; and an ADRB3 variant (W64R; rs4994) implicated in higher risk of cardiac events. Also, two SNPs in PON1 (Q192R; rs662 and L55M; rs854560) indicate a high risk of cardiovascular disease and a higher risk of coronary artery disease[72, 73]. A SNP in SNX19 (L878R; rs2298566) is linked to elevated risk of coronary heart disease but has also been shown to be associated with better response to statins and may be of clinical significance. Other SNPs affecting cholesterol levels (EDN1 K198N; OMIM_ID # 131240), familial obesity (FAM71F1 E143K; rs6971091) and hypertension susceptibility (PPARGC1A G482S; rs8192678 and CYP4A11 F434S; rs1126742) were also found in the genome.
In addition to this, several other SNPs associated with Alzheimer’s disease, diabetes, tuberculosis susceptibility and macular degeneration were also detected. A SNP in ICAM1 (K469E; rs5498), associated with increased resistance to malarial infection, originally identified in a study of over 552 Indian individuals, was also observed in the SAIF genome. It must be noted that a majority of the SNPs of health relevance used to annotate the coding SNPs were derived from studies involving western populations. Hence, validating the relevance of these in the context of Asian Indian population will require controlled studies in a cohort representative of the Indian subcontinent.
We further assessed the relative genetic risk of SAIF against Gujarati Indians in Houston (GIH) population represented in HapMap III. We used the set of disease SNPs measured in both SAIF and GIH, and recalculated the likelihood ratio (LR) for SAIF and each of 101 GIH individuals. We found that the SAIF individual had a higher genetic risk than 80% of GIH for eight diseases (Additional file2: Figure S5). Intersecting both results, we found that SAIF had a high genetic risk for four diseases, including multiple sclerosis (post-test probability = 5%, relative risk > 100% GIH), uterine leiomyoma (post-test probability = 47%, relative risk > 97% GIH), asthma (post-test probability = 17%, relative risk > 90% GIH), and obesity (post-test probability = 34%, relative risk > 82% GIH).
In addition to multiple sclerosis, SAIF had a high genetic risk of uterine leiomyoma, driven by a rare heterozygous CT variant at rs7913069 (Additional file2: Figure S6). The T allele has been validated to increase the risk of uterine leiomyoma with an odds ratio of 1.47 and a p-value of 8.65 × 10⁻¹⁴ in Japanese women. A high genetic risk for asthma and obesity was also identified in the SAIF individual[89, 90] (Figure 8, Additional file2: Figures S7 and S8).
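The comparison against the GIH individuals amounts to asking what fraction of a reference cohort has a lower risk score than the individual. A minimal Python sketch of that ranking step follows; the function name and the toy likelihood ratios are hypothetical, and the study's exact scoring may differ.

```python
def fraction_of_cohort_below(individual_score, cohort_scores):
    """Fraction of cohort members whose score is strictly lower than the individual's."""
    if not cohort_scores:
        raise ValueError("cohort is empty")
    lower = sum(1 for score in cohort_scores if score < individual_score)
    return lower / len(cohort_scores)


# Toy combined likelihood ratios for five cohort members (invented values).
cohort = [0.5, 0.8, 1.0, 1.2, 2.0]
print(fraction_of_cohort_below(1.5, cohort))  # 0.8
```

In this toy case the individual's score exceeds that of 80% of the cohort, which is the kind of statement reported above as "relative risk > 80% GIH".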
SNPs of pharmacogenomic relevance
Drug response SNPs
Drug-related outcome:
- Interferon beta therapy for multiple sclerosis (MS): will likely not show increased response to interferon beta therapy in case of relapsed MS
- Lumiracoxib-related liver toxicity: increase in liver toxicity risk in response to lumiracoxib used to treat acute pain and osteoarthritic symptoms
- will respond better to metformin
- greatly decreased odds of developing anemia when taking PEG-IFN/RBV
- Statin-induced myopathy: typical dose of simvastatin will not increase myopathy risk
- Floxacillin and liver toxicity: at typical dose liver toxicity is not expected in response to floxacillin
- Beta-blocker, heart failure risk: bucindolol is unlikely to reduce mortality odds in case of heart failure
- Response to amitriptyline: typical response to depression when treated with Elavil, Paxil, Effexor, or Celexa
- typical dose of warfarin does not increase risk of bleeding
- No copies of the DPYD*2A mutation. May still be at risk for 5-FU toxicity due to other genetic or non-genetic factors
We have sequenced the genome of a female from Kerala in southern India and identified 3,459,784 SNPs and 384,926 short indels. Comparison with published personal genomes revealed that SAIF shared ~50% of the SNPs with each of the personal genomes published so far and had 89,873 private SNPs. Of the total SNPs detected, we identified 11,107 missense substitutions and 109 non-sense mutations. We found olfactory genes to be enriched for non-synonymous SNPs suggesting that this family of genes may be under reduced evolutionary constraint in humans. Besides the nuclear genome, analysis of the mitochondrial genome showed that SAIF mitochondria belonged to the U1 haplogroup which is known to occur in the southern Indian state of Kerala.
SNPs in personal genomes can be used to assess disease risk, carrier status and drug response/interaction. We have assessed the SAIF genome using OMIM, SNPedia and Varimed databases for SNPs with health and disease consequences. We identified higher risk for multiple sclerosis, among other diseases. Drug response related SNP assessment revealed that the SAIF genome carried a SNP in the ATM gene that predicts a favorable response to metformin used in treating diabetes. These and the other annotations made using experimentally verified variants will very likely be used by physicians for counseling and making treatment decisions.
A recent study on variations in India using SNP arrays suggests that genetic diversity within India is at least three times that observed within Europe. In India, the burden of recessive genetic disorders is predicted to be high and is likely to be unique within each population group. Additional personal genomes from the Indian subcontinent that represent population groups within India will be critical to assessing the variation and disease burden.
In this study we report the first complete sequence of a South Asian Indian female from the state of Kerala in India. The availability of this genome and the variants identified is a first step in understanding the genetic diversity in the Indian subcontinent. In addition, the clinically relevant changes identified in this personal genome, along with further studies on additional genomes from this region, should provide a comprehensive assessment of the disease burden in the Indian population.
Sample collection, library construction and sequencing
Informed consent was obtained from the individual prior to initiation of this study. The donor is a healthy 48-year-old female from Kerala in the southern part of India. A blood sample (8.5 ml) was collected in a PAXgene Blood DNA Tube (Qiagen, CA) and processed as per the manufacturer’s instructions. The high molecular weight genomic DNA obtained was sheared and used in the preparation of the whole genome shotgun libraries as per Illumina’s library preparation protocols (Illumina, CA). The libraries were then sequenced on a HiSeq 2000 sequencing machine (Illumina, CA) to obtain the sequence data.
Alignment to the reference
We used BWA (version 0.5.9) to align the reads to the human reference sequence (GRCh37/hg19). All default parameters were used, with the exception of “-q 15” which allows read trimming at the 3’ ends, down to 35 bp, prior to alignment. Potential PCR duplicates, which can adversely affect the variant calls, were removed using the MarkDuplicates tool from Picard version 1.4.0 (http://picard.sourceforge.net). The resulting BAM file was used for all subsequent analysis.
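The quality-based 3’ trimming enabled by “-q 15” can be illustrated with a short sketch. The Python function below uses a running-sum scheme of the kind commonly described for this style of trimming: walking in from the 3’ end, it accumulates (threshold minus base quality) and keeps the cut point where that sum peaks, never cutting below a minimum length. It is an approximation written for this explanation, not BWA's code; the function name, its parameters and the toy quality values are assumptions.

```python
def trimmed_length(qualities, threshold=15, min_length=35):
    """Return how many leading bases to keep after 3' quality trimming.

    `qualities` holds per-base Phred scores in read order (5' to 3').
    The read is never shortened below `min_length` bases.
    """
    running_sum = 0
    best_sum = 0
    keep = len(qualities)
    # Walk from the last base toward the 5' end, stopping at min_length.
    for pos in range(len(qualities) - 1, min_length - 1, -1):
        running_sum += threshold - qualities[pos]
        if running_sum < 0:
            break  # high-quality bases now outweigh the low-quality tail
        if running_sum > best_sum:
            best_sum = running_sum
            keep = pos  # cutting here removes bases pos..end
    return keep


# Toy read: 40 good bases followed by 5 poor ones (invented values).
read_quals = [30] * 40 + [2] * 5
print(trimmed_length(read_quals))  # 40
```

A read whose last base is already above the threshold is returned untrimmed, and a read with a long low-quality tail is cut back only as far as the 35-base floor.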
De novo assembly of unaligned reads
We used SOAPdenovo with a K-mer size of 39 and with the “-R” option to use reads to solve tiny repeats. The resulting contigs were first aligned to unanchored contigs in hg19 using LASTZ, requiring an identity > 95% and more than 80% of the assembled contig sequence in the alignment. The assembled sequences that did not align to hg19 were compared against all existing human assemblies using BLASTN, requiring an E-value < 10⁻⁵. The remaining sequences were then compared against the NT database using BLAST.
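The two alignment thresholds applied to the assembled contigs (identity above 95% and more than 80% of the contig in the alignment) can be expressed as a simple predicate. The sketch below is illustrative only: the function name and arguments are hypothetical, and it assumes identity is computed as matching bases over aligned bases, which may not be exactly how the aligner reports it.

```python
def passes_contig_filter(contig_length, aligned_bases, matching_bases,
                         min_identity=0.95, min_coverage=0.80):
    """True if an alignment meets both the identity and coverage thresholds."""
    if contig_length <= 0 or aligned_bases <= 0:
        return False
    identity = matching_bases / aligned_bases   # assumed definition of identity
    coverage = aligned_bases / contig_length    # share of the contig in the alignment
    return identity > min_identity and coverage > min_coverage


# Toy alignments for a 412 bp contig (invented values).
print(passes_contig_filter(412, 400, 392))  # True: identity 0.98, coverage ~0.97
print(passes_contig_filter(412, 300, 299))  # False: coverage ~0.73
print(passes_contig_filter(412, 400, 370))  # False: identity 0.925
```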
SNP and Indel identification
We used SAMtools (version 0.1.12a) to call variants (substitutions and small indels) from the alignments generated above. All default parameters were used in conjunction with “-C 50” to reduce the effects of the sequences with excessive mismatches. The variants were filtered to keep the ones where the depth of coverage was ≥ 5 and ≤ 60 for all chromosomes except the mitochondria. A total of 3,620,895 single nucleotide substitutions and 509,994 indels were identified in this sample, and we further filtered the variants to only keep the ones with a SNP quality score ≥ 30. Also, heterozygous variants that did not share any alleles with the reference sequence were excluded. The SNP calls made using the whole genome sequencing data were further validated using SNP calls for this individual derived using the Illumina HumanOmni1-Quad Beadchip SNP array. We found the calls from the sequencing data and the SNP array to be concordant at 989,747 of 1,003,031 SNP array positions (98.7% concordance).
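The filtering rules described above (a depth window outside the mitochondria, a minimum quality score, and exclusion of heterozygous calls that share no allele with the reference) can be sketched as a single predicate. The record type, field names and the "chrM" label below are hypothetical stand-ins chosen for this illustration; they are not the SAMtools output format or the authors' scripts.

```python
from dataclasses import dataclass


@dataclass
class VariantCall:
    chrom: str
    depth: int
    quality: float
    ref: str
    alleles: tuple  # called genotype alleles, e.g. ("A", "G")


def keep_variant(call, min_depth=5, max_depth=60, min_quality=30, mito_name="chrM"):
    """Apply the depth, quality and genotype filters to one variant call."""
    # The depth window applies to every chromosome except the mitochondria.
    if call.chrom != mito_name and not (min_depth <= call.depth <= max_depth):
        return False
    if call.quality < min_quality:
        return False
    # Drop heterozygous calls in which neither allele matches the reference.
    is_heterozygous = len(set(call.alleles)) > 1
    if is_heterozygous and call.ref not in call.alleles:
        return False
    return True


# Toy calls, invented for illustration.
print(keep_variant(VariantCall("chr1", 20, 45.0, "A", ("A", "G"))))  # True
print(keep_variant(VariantCall("chr1", 3, 45.0, "A", ("A", "G"))))   # False: depth too low
print(keep_variant(VariantCall("chr1", 20, 45.0, "A", ("C", "G"))))  # False: no reference allele
```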
SNP and Indel annotation
We designed a pipeline to annotate SNPs and indels. The human gene annotation release 62 provided by the Ensembl database (http://www.ensembl.org/info/data/ftp/index.html) was used for annotating variants with gene, exon and UTR information. The repeat definitions, conserved TFBS and enhancer information were obtained from the UCSC genome browser database (http://genome.ucsc.edu). SIFT annotation was performed using the online version available at http://sift.bii.a-star.edu.sg/. The pathway analysis was performed using the DAVID program, and an FDR ≤ 0.05 was used to identify significant pathways.
Comparison and novel variants
The personal genome information was obtained from Ensembl, UCSC, Galaxy and published articles. The variant annotation for the 1000 Genomes Project was obtained from http://www.1000genomes.org/. The common SNP database (dbSNP132) was downloaded from Ensembl and UCSC. The LiftOver program (http://genome.ucsc.edu/cgi-bin/hgLiftOver) was used to convert coordinates from the hg18 to the hg19 version of the genome.
From the comparison of the SAIF mt genome and the reference sequence (rCRS, NC_012920), 35 single nucleotide variants were found. Those variants were used to identify the haplotype of SAIF, using Haplogrep. To examine phylogenetic relationships of the haplotype of SAIF with closely related haplotypes, a Neighbor-Joining tree was constructed with MEGA5 for 210 complete mitochondrial genomes belonging to the U and K haplogroups. The genome sequences were retrieved from GenBank. The coalescence time for the U haplogroups was estimated using BEAST. For the BEAST analysis, 313 mt genome sequences obtained from GenBank and evenly distributed throughout all lineages were used. The following parameters were used for the BEAST analysis: strict molecular clock model, exponential growth tree prior, Markov chain Monte Carlo (MCMC) chain length 2 M, and 10% burn-in.
OMIM, SNPedia and Varimed annotation
We compared the SNPs predicted from the SAIF genome against disease associated OMIM variants. We also annotated the SAIF genome against SNPedia to understand the effect of the variants. Annotation using the Varimed database was performed as described recently. Briefly, we first retrieved the SAIF genotypes, including variants and ref-ref calls, for all the significant SNPs represented in the Varimed database that are known to be associated with disease based on genome-wide association studies. For multiple SNPs in linkage disequilibrium with one another (R² > 0.3), we only kept the one with the strongest evidence. Finally, we multiplied the likelihood ratios (LRs) from independent SNPs and combined the product with the pre-test probability to estimate the post-test probability of the disease.
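The final step, combining per-SNP likelihood ratios with a pre-test probability, can be sketched using the odds form of Bayes' rule: pre-test odds are multiplied by the product of the LRs and converted back to a probability. This is the standard way such a combination is done and is assumed here; the text does not spell out the formula, and the function name and toy numbers are hypothetical.

```python
from math import prod


def post_test_probability(pre_test_probability, likelihood_ratios):
    """Combine a pre-test probability with independent likelihood ratios."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must be strictly between 0 and 1")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * prod(likelihood_ratios)
    return post_test_odds / (1.0 + post_test_odds)


# Toy example: 20% pre-test probability and two independent SNPs (invented LRs).
print(round(post_test_probability(0.20, [2.0, 1.5]), 3))  # 0.429
```

Here the pre-test odds of 0.25 become post-test odds of 0.75, that is, a post-test probability of 3/7. The multiplication is only justified when the SNPs are independent, which is why SNPs in linkage disequilibrium are first reduced to a single representative.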
This project is funded, in part, under a grant by the Pennsylvania Department of Health using Tobacco CURE Funds to AR. The Pennsylvania Department of Health specifically disclaims responsibility for any analyses, interpretations or conclusions. We thank Devi Santhosh and Sneha Somasekar for helping edit the manuscript.
- Venter JC: Multiple personal genomes await. Nature. 2010, 464 (7289): 676-677. 10.1038/464676a.
- Meyer UA: Personalized medicine: a personal view. Clin Pharmacol Ther. 2012, 91 (3): 373-375. 10.1038/clpt.2011.238.
- Ginsburg GS, Willard HF: Genomic and personalized medicine: foundations and applications. Transl Res: The J of lab and Clin Med. 2009, 154 (6): 277-287.
- Chan IS, Ginsburg GS: Personalized medicine: progress and promise. Ann Rev of genomics and Hum Genet. 2011, 12: 217-244. 10.1146/annurev-genom-082410-101446.
- Hong KW, Oh B: Overview of personalized medicine in the disease genomic era. BMB reports. 2010, 43 (10): 643-648. 10.5483/BMBRep.2010.43.10.643.
- Pasche B, Absher D: Whole-genome sequencing: a step closer to personalized medicine. JAMA: The J of the Am Med Assoc. 2011, 305 (15): 1596-1597. 10.1001/jama.2011.484.
- Welch JS, Westervelt P, Ding L, Larson DE, Klco JM, Kulkarni S, Wallis J, Chen K, Payton JE, Fulton RS, et al: Use of whole-genome sequencing to diagnose a cryptic fusion oncogene. JAMA: The J of the Am Med Assoc. 2011, 305 (15): 1577-1584. 10.1001/jama.2011.497.
- Link DC, Schuettpelz LG, Shen D, Wang J, Walter MJ, Kulkarni S, Payton JE, Ivanovich J, Goodfellow PJ, Le Beau M, et al: Identification of a novel TP53 cancer susceptibility mutation through whole-genome sequencing of a patient with therapy-related AML. JAMA: The J of the Am Med Assoc. 2011, 305 (15): 1568-1576. 10.1001/jama.2011.473.
- Levy S, Sutton G, Ng PC, Feuk L, Halpern AL, Walenz BP, Axelrod N, Huang J, Kirkness EF, Denisov G, et al: The diploid genome sequence of an individual human. PLoS biology. 2007, 5 (10): e254. 10.1371/journal.pbio.0050254.
- Wheeler DA, Srinivasan M, Egholm M, Shen Y, Chen L, McGuire A, He W, Chen YJ, Makhijani V, Roth GT, et al: The complete genome of an individual by massively parallel DNA sequencing. Nature. 2008, 452 (7189): 872-876. 10.1038/nature06884.
- Rothberg JM, Hinz W, Rearick TM, Schultz J, Mileski W, Davey M, Leamon JH, Johnson K, Milgrew MJ, Edwards M, et al: An integrated semiconductor device enabling non-optical genome sequencing. Nature. 2011, 475 (7356): 348-352. 10.1038/nature10242.
- Ashley EA, Butte AJ, Wheeler MT, Chen R, Klein TE, Dewey FE, Dudley JT, Ormond KE, Pavlovic A, Morgan AA, et al: Clinical assessment incorporating a personal genome. Lancet. 2010, 375 (9725): 1525-1535. 10.1016/S0140-6736(10)60452-7.
- Sirota M, Schaub MA, Batzoglou S, Robinson WH, Butte AJ: Autoimmune disease classification by inverse association with SNP alleles. PLoS genetics. 2009, 5 (12): e1000792. 10.1371/journal.pgen.1000792.
- Schuster SC, Miller W, Ratan A, Tomsho LP, Giardine B, Kasson LR, Harris RS, Petersen DC, Zhao F, Qi J, et al: Complete Khoisan and Bantu genomes from southern Africa. Nature. 2010, 463 (7283): 943-947. 10.1038/nature08795.
- Wang J, Wang W, Li R, Li Y, Tian G, Goodman L, Fan W, Zhang J, Li J, Zhang J, et al: The diploid genome sequence of an Asian individual. Nature. 2008, 456 (7218): 60-65. 10.1038/nature07484.
- Ahn SM, Kim TH, Lee S, Kim D, Ghang H, Kim DS, Kim BC, Kim SY, Kim WY, Kim C, et al: The first Korean genome sequence and analysis: full genome sequencing for a socio-ethnic group. Genome Res. 2009, 19 (9): 1622-1629. 10.1101/gr.092197.109.
- Kim JI, Ju YS, Park H, Kim S, Lee S, Yi JH, Mudge J, Miller NA, Hong D, Bell CJ, et al: A highly annotated whole-genome sequence of a Korean individual. Nature. 2009, 460 (7258): 1011-1015.
- Rasmussen M, Guo X, Wang Y, Lohmueller KE, Rasmussen S, Albrechtsen A, Skotte L, Lindgreen S, Metspalu M, Jombart T, et al: An Aboriginal Australian genome reveals separate human dispersals into Asia. Science. 2011, 334 (6052): 94-98. 10.1126/science.1211177.
- Fujimoto A, Nakagawa H, Hosono N, Nakano K, Abe T, Boroevich KA, Nagasaki M, Yamaguchi R, Shibuya T, Kubo M, et al: Whole-genome sequencing and comprehensive variant analysis of a Japanese individual using massively parallel sequencing. Nat Genet. 2010, 42 (11): 931-936. 10.1038/ng.691.
- Patowary A, Purkanti R, Singh M, Chauhan RK, Bhartiya D, Dwivedi OP, Chauhan G, Bharadwaj D, Sivasubbu S, Scaria V: Systematic analysis and functional annotation of variations in the genome of an Indian individual. Hum Mutat. 2012, 33 (7): 1133-1140. 10.1002/humu.22091.
- Brahmachari SK MP, Mukerji M, Habib S, Dash D, Ray K, Bahl S, Singh L, Sharma A, Roychoudhury S, Chandak GR, Thangaraj K, Parmar D, Sengupta S, Bharadwaj D, Rath SK, Singh J, Jha GN, Virdi K, Rao VR, Sinha S, Singh A, Mitra AK, Mishra SK, Pasha Q, Sivasubbu S, Pandey R, Baral A, Singh PK, Sharma A, Kumar J, et al: Genetic landscape of the people of India: a canvas for disease gene exploration. J Genet. 2008, 87 (1): 3-20.
- Reich D, Thangaraj K, Patterson N, Price AL, Singh L: Reconstructing Indian population history. Nature. 2009, 461 (7263): 489-494. 10.1038/nature08365.
- Li H, Durbin R: Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009, 25 (14): 1754-1760. 10.1093/bioinformatics/btp324.
- Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z, Li Y, Li S, Shan G, Kristiansen K, et al: De novo assembly of human genomes with massively parallel short read sequencing. Genome Res. 2010, 20 (2): 265-272. 10.1101/gr.097261.109.
- Hodgkinson A, Eyre-Walker A: Variation in the mutation rate across mammalian genomes. Nat Rev Genet. 2011, 12 (11): 756-766. 10.1038/nrg3098.
- Kryazhimskiy S, Plotkin JB: The population genetics of dN/dS. PLoS genetics. 2008, 4 (12): e1000304. 10.1371/journal.pgen.1000304.
- Madsen BE, Villesen P, Wiuf C: Short tandem repeats and genetic variation. Methods Mol Biol. 2010, 628: 297-306. 10.1007/978-1-60327-367-1_16.
- Hannan AJ: TRPing up the genome: Tandem repeat polymorphisms as dynamic sources of genetic variability in health and disease. Discovery medicine. 2010, 10 (53): 314-321.
- Wetterbom A, Sevov M, Cavelier L, Bergstrom TF: Comparative genomic analysis of human and chimpanzee indicates a key role for indels in primate evolution. J Mol Evol. 2006, 63 (5): 682-690. 10.1007/s00239-006-0045-7.
- Mills RE, Pittard WS, Mullaney JM, Farooq U, Creasy TH, Mahurkar AA, Kemeza DM, Strassler DS, Ponting CP, Webber C, et al: Natural genetic variation caused by small insertions and deletions in the human genome. Genome Res. 2011, 21 (6): 830-839. 10.1101/gr.115907.110.
- Chen JQ, Wu Y, Yang H, Bergelson J, Kreitman M, Tian D: Variation in the ratio of nucleotide substitution and indel rates across genomes in mammals and bacteria. Mol Biol Evol. 2009, 26 (7): 1523-1531. 10.1093/molbev/msp063.
- Kasowski M, Grubert F, Heffelfinger C, Hariharan M, Asabere A, Waszak SM, Habegger L, Rozowsky J, Shi M, Urban AE, et al: Variation in transcription factor binding among humans. Science. 2010, 328 (5975): 232-235. 10.1126/science.1183621.
- Karolchik D, Hinrichs AS, Kent WJ: The UCSC Genome Browser. Current protocols in human genetics / editorial board, Jonathan L Haines [et al.]. 2011, Chapter 18:Unit18 16
- Visel A, Minovitsky S, Dubchak I, Pennacchio LA: VISTA Enhancer Browser--a database of tissue-specific human enhancers. Nucleic Acids Res. 2007, 35 (Database issue): D88-92.PubMed CentralView ArticlePubMed
- da Huang W, Sherman BT, Lempicki RA: Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat Protoc. 2009, 4 (1): 44-57.View ArticlePubMed
- Hasin-Brumshtein Y, Lancet D, Olender T: Human olfaction: from genomic variation to phenotypic diversity. Trends in genetics: TIG. 2009, 25 (4): 178-184. 10.1016/j.tig.2009.02.002.View ArticlePubMed
- Kachapati K, O'Brien TR, Bergeron J, Zhang M, Dean M: Population distribution of the functional caspase-12 allele. Hum Mutat. 2006, 27 (9): 975-View ArticlePubMed
- Xue Y, Daly A, Yngvadottir B, Liu M, Coop G, Kim Y, Sabeti P, Chen Y, Stalker J, Huckle E, et al: Spread of an inactive form of caspase-12 in humans is due to recent positive selection. Am J Hum Genet. 2006, 78 (4): 659-670. 10.1086/503116.
- Yngvadottir B, Xue Y, Searle S, Hunt S, Delgado M, Morrison J, Whittaker P, Deloukas P, Tyler-Smith C: A genome-wide survey of the prevalence and evolutionary forces acting on human nonsense SNPs. Am J Hum Genet. 2009, 84 (2): 224-234. 10.1016/j.ajhg.2009.01.008.
- Koyano S, Emi M, Saito T, Makino N, Toriyama S, Ishii M, Kubota I, Kato T, Kawata S: Common null variant, Arg192Stop, in a G-protein coupled receptor, olfactory receptor 1B1, associated with decreased serum cholinesterase activity. Hepatol Res: The Off J of the Japan Soc of Hepatol. 2008, 38 (7): 696-703.
- Ng PC, Henikoff S: SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res. 2003, 31 (13): 3812-3814. 10.1093/nar/gkg509.
- Cann RL, Stoneking M, Wilson AC: Mitochondrial DNA and human evolution. Nature. 1987, 325 (6099): 31-36. 10.1038/325031a0.
- Palanichamy MG, Sun C, Agrawal S, Bandelt HJ, Kong QP, Khan F, Wang CY, Chaudhuri TK, Palla V, Zhang YP: Phylogeny of mitochondrial DNA macrohaplogroup N in India, based on complete sequencing: implications for the peopling of South Asia. Am J Hum Genet. 2004, 75 (6): 966-978. 10.1086/425871.
- Forster L, Forster P, Lutz-Bonengel S, Willkomm H, Brinkmann B: Natural radioactivity and human mitochondrial DNA mutations. Proc Natl Acad Sci U S A. 2002, 99 (21): 13950-13954. 10.1073/pnas.202400499.
- Hamosh A, Scott AF, Amberger JS, Bocchini CA, McKusick VA: Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 2005, 33 (Database issue): D514-517.
- Cariaso M, Lennon G: SNPedia: a wiki supporting personal genome annotation, interpretation and analysis. Nucleic Acids Res. 2012, 40 (Database issue): D1308-1312.
- Ni Y, Zbuk KM, Sadler T, Patocs A, Lobo G, Edelman E, Platzer P, Orloff MS, Waite KA, Eng C: Germline mutations and variants in the succinate dehydrogenase genes in Cowden and Cowden-like syndromes. Am J Hum Genet. 2008, 83 (2): 261-268. 10.1016/j.ajhg.2008.07.011.
- Healey CS, Dunning AM, Teare MD, Chase D, Parker L, Burn J, Chang-Claude J, Mannermaa A, Kataja V, Huntsman DG, et al: A common variant in BRCA2 is associated with both breast cancer risk and prenatal viability. Nat Genet. 2000, 26 (3): 362-364. 10.1038/81691.
- Jiang L, Zhang C, Li Y, Yu X, Zheng J, Zou P, Bin X, Lu J, Zhou Y: A non-synonymous polymorphism Thr115Met in the EpCAM gene is associated with an increased risk of breast cancer in Chinese population. Breast Cancer Res Treat. 2011, 126 (2): 487-495. 10.1007/s10549-010-1094-6.
- Brendle A, Brandt A, Johansson R, Enquist K, Hallmans G, Hemminki K, Lenner P, Forsti A: Single nucleotide polymorphisms in chromosomal instability genes and risk and clinical outcome of breast cancer: a Swedish prospective case–control study. Eur J Cancer. 2009, 45 (3): 435-442. 10.1016/j.ejca.2008.10.001.
- Wang SS, Cerhan JR, Hartge P, Davis S, Cozen W, Severson RK, Chatterjee N, Yeager M, Chanock SJ, Rothman N: Common genetic variants in proinflammatory and other immunoregulatory genes and risk for non-Hodgkin lymphoma. Cancer Res. 2006, 66 (19): 9771-9780. 10.1158/0008-5472.CAN-06-0324.
- Murphy G, Cross AJ, Sansbury LS, Bergen A, Laiyemo AO, Albert PS, Wang Z, Yu B, Lehman T, Kalidindi A, et al: Dopamine D2 receptor polymorphisms and adenoma recurrence in the Polyp Prevention Trial. Int J Cancer. 2009, 124 (9): 2148-2151. 10.1002/ijc.24079.
- Heikkila K, Silander K, Salomaa V, Jousilahti P, Koskinen S, Pukkala E, Perola M: C-reactive protein-associated genetic variants and cancer risk: findings from FINRISK 1992, FINRISK 1997 and Health 2000 studies. Eur J Cancer. 2011, 47 (3): 404-412. 10.1016/j.ejca.2010.07.032.
- Hu Z, Huo X, Lu D, Qian J, Zhou J, Chen Y, Xu L, Ma H, Zhu J, Wei Q, et al: Functional polymorphisms of matrix metalloproteinase-9 are associated with risk of occurrence and metastasis of lung cancer. Clin Cancer Res. 2005, 11 (15): 5433-5439. 10.1158/1078-0432.CCR-05-0311.
- Qiu L, Wang Z, Shi X: Associations between XPC polymorphisms and risk of cancers: A meta-analysis. Eur J Cancer. 2008, 44 (15): 2241-2253. 10.1016/j.ejca.2008.06.024.
- Hampe J, Franke A, Rosenstiel P, Till A, Teuber M, Huse K, Albrecht M, Mayr G, De La Vega FM, Briggs J, et al: A genome-wide association scan of nonsynonymous SNPs identifies a susceptibility variant for Crohn disease in ATG16L1. Nat Genet. 2007, 39 (2): 207-211. 10.1038/ng1954.
- Rioux JD, Xavier RJ, Taylor KD, Silverberg MS, Goyette P, Huett A, Green T, Kuballa P, Barmada MM, Datta LW, et al: Genome-wide association study identifies new susceptibility loci for Crohn disease and implicates autophagy in disease pathogenesis. Nat Genet. 2007, 39 (5): 596-604. 10.1038/ng2032.
- Arizono K, Osada Y, Kuroda Y: DNA repair gene hOGG1 codon 326 and XRCC1 codon 399 polymorphisms and bladder cancer risk in a Japanese population. Jpn J Clin Oncol. 2008, 38 (3): 186-191. 10.1093/jjco/hym176.
- Jiao X, Huang J, Wu S, Lv M, Hu Y, Jianfu , Su X, Luo C, Ce B: hOGG1 Ser326Cys polymorphism and susceptibility to gallbladder cancer in a Chinese population. Int J Cancer. 2007, 121 (3): 501-505. 10.1002/ijc.22748.
- Srivastava A, Srivastava K, Pandey SN, Choudhuri G, Mittal B: Single-nucleotide polymorphisms of DNA repair genes OGG1 and XRCC1: association with gallbladder cancer in North Indian population. Ann Surg Oncol. 2009, 16 (6): 1695-1703. 10.1245/s10434-009-0354-3.
- Okazaki T, Jiao L, Chang P, Evans DB, Abbruzzese JL, Li D: Single-nucleotide polymorphisms of DNA damage response genes are associated with overall survival in patients with pancreatic cancer. Clin Cancer Res. 2008, 14 (7): 2042-2048. 10.1158/1078-0432.CCR-07-1520.
- Lurie G, Wilkens LR, Thompson PJ, McDuffie KE, Carney ME, Terada KY, Goodman MT: Genetic polymorphisms in the Paraoxonase 1 gene and risk of ovarian epithelial carcinoma. Cancer Epidemiol Biomarkers Prev. 2008, 17 (8): 2070-2077. 10.1158/1055-9965.EPI-08-0145.
- Tavtigian SV, Simard J, Teng DH, Abtin V, Baumgard M, Beck A, Camp NJ, Carillo AR, Chen Y, Dayananth P, et al: A candidate prostate cancer susceptibility gene at chromosome 17p. Nat Genet. 2001, 27 (2): 172-180. 10.1038/84808.
- Rokman A, Ikonen T, Mononen N, Autio V, Matikainen MP, Koivisto PA, Tammela TL, Kallioniemi OP, Schleutker J: ELAC2/HPC2 involvement in hereditary and sporadic prostate cancer. Cancer Res. 2001, 61 (16): 6038-6041.
- Wang L, McDonnell SK, Elkins DA, Slager SL, Christensen E, Marks AF, Cunningham JM, Peterson BJ, Jacobsen SJ, Cerhan JR, et al: Role of HPC2/ELAC2 in hereditary prostate cancer. Cancer Res. 2001, 61 (17): 6494-6499.
- Shen GQ, Li L, Girelli D, Seidelmann SB, Rao S, Fan C, Park JE, Xi Q, Li J, Hu Y, et al: An LRP8 variant is associated with familial and premature coronary artery disease and myocardial infarction. Am J Hum Genet. 2007, 81 (4): 780-791. 10.1086/521581.
- Horne BD, Camp NJ, Carlquist JF, Muhlestein JB, Kolek MJ, Nicholas ZP, Anderson JL: Multiple-polymorphism associations of 7 matrix metalloproteinase and tissue inhibitor metalloproteinase genes with myocardial infarction and angiographic coronary artery disease. Am Heart J. 2007, 154 (4): 751-758. 10.1016/j.ahj.2007.06.030.
- Shiffman D, Ellis SG, Rowland CM, Malloy MJ, Luke MM, Iakoubova OA, Pullinger CR, Cassano J, Aouizerat BE, Fenwick RG, et al: Identification of four gene variants associated with myocardial infarction. Am J Hum Genet. 2005, 77 (4): 596-605. 10.1086/491674.
- Tingley WG, Pawlikowska L, Zaroff JG, Kim T, Nguyen T, Young SG, Vranizan K, Kwok PY, Whooley MA, Conklin BR: Gene-trapped mouse embryonic stem cell-derived cardiac myocytes and human genetics implicate AKAP10 in heart rhythm regulation. Proc Natl Acad Sci U S A. 2007, 104 (20): 8461-8466. 10.1073/pnas.0610393104.
- Pacanowski MA, Zineh I, Li H, Johnson BD, Cooper-DeHoff RM, Bittner V, McNamara DM, Sharaf BL, Merz CN, Pepine CJ, et al: Adrenergic gene polymorphisms and cardiovascular risk in the NHLBI-sponsored Women's Ischemia Syndrome Evaluation. J Transl Med. 2008, 6: 11. 10.1186/1479-5876-6-11.
- Garin MC, James RW, Dussoix P, Blanche H, Passa P, Froguel P, Ruiz J: Paraoxonase polymorphism Met-Leu54 is associated with modified serum concentrations of the enzyme. A possible link between the paraoxonase gene and increased risk of cardiovascular disease in diabetes. J Clin Invest. 1997, 99 (1): 62-66. 10.1172/JCI119134.
- Serrato M, Marian AJ: A variant of human paraoxonase/arylesterase (HUMPONA) gene is a risk factor for coronary artery disease. J Clin Invest. 1995, 96 (6): 3005-3008. 10.1172/JCI118373.
- Odawara M, Tachi Y, Yamashita K: Paraoxonase polymorphism (Gln192-Arg) is associated with coronary heart disease in Japanese noninsulin-dependent diabetes mellitus. J Clin Endocrinol Metab. 1997, 82 (7): 2257-2260. 10.1210/jc.82.7.2257.
- Bare LA, Morrison AC, Rowland CM, Shiffman D, Luke MM, Iakoubova OA, Kane JP, Malloy MJ, Ellis SG, Pankow JS, et al: Five common gene variants identify elevated genetic risk for coronary heart disease. Genet Med. 2007, 9 (10): 682-689. 10.1097/GIM.0b013e318156fb62.
- Sinha S, Qidwai T, Kanchan K, Anand P, Jha GN, Pati SS, Mohanty S, Mishra SK, Tyagi PK, Sharma SK, et al: Variations in host genes encoding adhesion molecules and susceptibility to falciparum malaria in India. Malar J. 2008, 7: 250. 10.1186/1475-2875-7-250.
- Chen R, Davydov EV, Sirota M, Butte AJ: Non-synonymous and synonymous coding SNPs show similar likelihood and effect size of human disease association. PLoS One. 2010, 5 (10): e13574. 10.1371/journal.pone.0013574.
- Pierce BL, Ahsan H: Clinical assessment incorporating a personal genome. Lancet. 2010, 376 (9744): 869; author reply 869–870.
- De Jager PL, Jia X, Wang J, de Bakker PI, Ottoboni L, Aggarwal NT, Piccio L, Raychaudhuri S, Tran D, Aubin C, et al: Meta-analysis of genome scans and replication identify CD6, IRF8 and TNFRSF1A as new multiple sclerosis susceptibility loci. Nat Genet. 2009, 41 (7): 776-782. 10.1038/ng.401.
- Goris A, Walton A, Ban M, Dubois B, Compston A, Sawcer S: A Taqman assay for high-throughput genotyping of the multiple sclerosis-associated HLA-DRB1*1501 allele. Tissue antigens. 2008, 72 (4): 401-403. 10.1111/j.1399-0039.2008.01101.x.
- Hafler DA, Compston A, Sawcer S, Lander ES, Daly MJ, De Jager PL, de Bakker PI, Gabriel SB, Mirel DB, Ivinson AJ, et al: Risk alleles for multiple sclerosis identified by a genomewide study. N Engl J Med. 2007, 357 (9): 851-862.
- Hoppenbrouwers IA, Aulchenko YS, Janssens AC, Ramagopalan SV, Broer L, Kayser M, Ebers GC, Oostra BA, van Duijn CM, Hintzen RQ: Replication of CD58 and CLEC16A as genome-wide significant risk genes for multiple sclerosis. J Hum Genet. 2009, 54 (11): 676-680. 10.1038/jhg.2009.96.
- Rioux JD, Goyette P, Vyse TJ, Hammarstrom L, Fernando MM, Green T, De Jager PL, Foisy S, Wang J, de Bakker PI, et al: Mapping of multiple susceptibility variants within the MHC region for 7 immune-mediated diseases. Proc Natl Acad Sci U S A. 2009, 106 (44): 18680-18685.
- Rubio JP, Stankovich J, Field J, Tubridy N, Marriott M, Chapman C, Bahlo M, Perera D, Johnson LJ, Tait BD, et al: Replication of KIAA0350, IL2RA, RPL5 and CD58 as multiple sclerosis susceptibility genes in Australians. Genes and immunity. 2008, 9 (7): 624-630. 10.1038/gene.2008.59.
- Cree BA, Rioux JD, McCauley JL, Gourraud PA, Goyette P, McElroy J, De Jager P, Santaniello A, Vyse TJ, Gregersen PK, et al: A major histocompatibility Class I locus contributes to multiple sclerosis susceptibility independently from HLA-DRB1*15:01. PLoS One. 2010, 5 (6): e11296. 10.1371/journal.pone.0011296.
- Zivkovic M, Stankovic A, Dincic E, Popovic M, Popovic S, Raicevic R, Alavantic D: The tag SNP for HLA-DRB1*1501, rs3135388, is significantly associated with multiple sclerosis susceptibility: cost-effective high-throughput detection by real-time PCR. Clinica Chimica Acta; Int J of Clin Chem. 2009, 406 (1–2): 27-30.
- Ritchie MD, Denny JC, Crawford DC, Ramirez AH, Weiner JB, Pulley JM, Basford MA, Brown-Gentry K, Balser JR, Masys DR, et al: Robust replication of genotype-phenotype associations across multiple diseases in an electronic medical record. Am J Hum Genet. 2010, 86 (4): 560-572. 10.1016/j.ajhg.2010.03.003.
- Hoppenbrouwers IA, Aulchenko YS, Ebers GC, Ramagopalan SV, Oostra BA, van Duijn CM, Hintzen RQ: EVI5 is a risk gene for multiple sclerosis. Genes and immunity. 2008, 9 (4): 334-337. 10.1038/gene.2008.22.
- Cha PC, Takahashi A, Hosono N, Low SK, Kamatani N, Kubo M, Nakamura Y: A genome-wide association study identifies three loci associated with susceptibility to uterine fibroids. Nat Genet. 2011, 43 (5): 447-450. 10.1038/ng.805.
- Moffatt MF, Gut IG, Demenais F, Strachan DP, Bouzigon E, Heath S, von Mutius E, Farrall M, Lathrop M, Cookson WO: A large-scale, consortium-based genomewide association study of asthma. N Engl J Med. 2010, 363 (13): 1211-1221. 10.1056/NEJMoa0906312.
- Hirota T, Takahashi A, Kubo M, Tsunoda T, Tomita K, Doi S, Fujita K, Miyatake A, Enomoto T, Miyagawa T, et al: Genome-wide association study identifies three new susceptibility loci for adult asthma in the Japanese population. Nat Genet. 2011, 43 (9): 893-896. 10.1038/ng.887.
- Owen RP, Altman RB, Klein TE: PharmGKB and the International Warfarin Pharmacogenetics Consortium: the changing role for pharmacogenomic databases and single-drug pharmacogenetics. Hum Mutat. 2008, 29 (4): 456-460. 10.1002/humu.20731.
- Zhou K, Bellenguez C, Spencer CC, Bennett AJ, Coleman RL, Tavendale R, Hawley SA, Donnelly LA, Schofield C, Groves CJ, et al: Common variants near ATM are associated with glycemic response to metformin in type 2 diabetes. Nat Genet. 2011, 43 (2): 117-120. 10.1038/ng.735.
- Chakravarti A: Human genetics: Tracing India's invisible threads. Nature. 2009, 461 (7263): 487-488. 10.1038/461487a.
- Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL: BLAST+: architecture and applications. BMC Bioinforma. 2009, 10: 421. 10.1186/1471-2105-10-421.
- Kloss-Brandstatter A, Pacher D, Schonherr S, Weissensteiner H, Binna R, Specht G, Kronenberg F: HaploGrep: a fast and reliable algorithm for automatic classification of mitochondrial DNA haplogroups. Hum Mutat. 2011, 32 (1): 25-32. 10.1002/humu.21382.
- Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S: MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011, 28 (10): 2731-2739. 10.1093/molbev/msr121.
- Drummond AJ, Rambaut A: BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 2007, 7: 214. 10.1186/1471-2148-7-214.
- Green RE, Malaspinas AS, Krause J, Briggs AW, Johnson PL, Uhler C, Meyer M, Good JM, Maricic T, Stenzel U, et al: A complete Neandertal mitochondrial genome sequence determined by high-throughput sequencing. Cell. 2008, 134 (3): 416-426. 10.1016/j.cell.2008.06.021.
- Gonder MK, Mortensen HM, Reed FA, de Sousa A, Tishkoff SA: Whole-mtDNA genome sequence analysis of ancient African lineages. Mol Biol Evol. 2007, 24 (3): 757-768.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.