- Research article
- Open Access
Regional genetic differences among Japanese populations and performance of genotype imputation using whole-genome reference panel of the Tohoku Medical Megabank Project
- Jun Yasuda1,
- Fumiki Katsuoka1,
- Inaho Danjoh1,
- Yosuke Kawai1, 9,
- Kaname Kojima1,
- Masao Nagasaki1,
- Sakae Saito1,
- Yumi Yamaguchi-Kabata1,
- Shu Tadaka1,
- Ikuko N. Motoike1,
- Kazuki Kumada1,
- Mika Sakurai-Yageta1,
- Osamu Tanabe1,
- Nobuo Fuse1,
- Gen Tamiya1,
- Koichiro Higasa2,
- Fumihiko Matsuda2,
- Nobufumi Yasuda3,
- Motoki Iwasaki4,
- Makoto Sasaki5, 6,
- Atsushi Shimizu6,
- Kengo Kinoshita1, 7 and
- Masayuki Yamamoto1, 8
© The Author(s). 2018
- Received: 4 April 2018
- Accepted: 16 July 2018
- Published: 24 July 2018
Genotype imputation from single-nucleotide polymorphism (SNP) genotype data using a haplotype reference panel consisting of thousands of unrelated individuals from populations of interest can help to identify strongly associated variants in genome-wide association studies. The Tohoku Medical Megabank (TMM) project was established to support the development of precision medicine, together with the whole-genome sequencing of 1070 human genomes from individuals in the Miyagi region (Northeast Japan) and the construction of the 1070 Japanese genome reference panel (1KJPN). Here, we investigated the performance of 1KJPN for genotype imputation of Japanese samples not included in the TMM project and compared it with other population reference panels.
We found that the 1KJPN population was more similar to other Japanese populations, Nagahama (south-central Japan) and Aki (Shikoku Island), than to East Asian populations in the 1000 Genomes Project other than JPT, suggesting that the large-scale collection (more than 1000) of Japanese genomes from the Miyagi region covered many of the genetic variations of Japanese in mainland Japan. Moreover, 1KJPN outperformed the phase 3 reference panel of the 1000 Genomes Project (1KGPp3) for Japanese samples, and 1KJPN showed similar imputation rates for the TMM and other Japanese samples for SNPs with minor allele frequencies (MAFs) higher than 1%.
1KJPN covered most of the variants found in the samples from areas of the Japanese mainland outside the Miyagi region, implying 1KJPN is representative of the Japanese population’s genomes. 1KJPN and successive reference panels are useful genome reference panels for the mainland Japanese population. Importantly, the addition of whole genome sequences not included in the 1KJPN panel improved imputation efficiencies for SNPs with MAFs under 1% for samples from most regions of the Japanese archipelago.
- Genome reference panel
- Genotype imputation
- Population genetics
Genotype imputation is an important step in current genome-wide association studies. Imputation accuracy, as well as genomic coverage of highly accurate imputed genotypes, confers elevated statistical power in association tests.  The choice of a haplotype reference panel to maximize imputation performance has often been debated. [2–4] Haplotype reference panels are used to identify haplotypes of individual genomes genotyped by single-nucleotide polymorphism (SNP) arrays, and then to estimate the genotypes missing in the SNP array data. Thus, to enable high-density genotype imputation for SNPs with minor allele frequencies (MAFs) > 1% in a population, reference panels are constructed preferably based on the whole-genome sequencing (WGS) of large samples. The influence of panel selection on imputation accuracy in terms of panel size and ancestry matching between the panel and study samples has been assessed by cross-validation.  The results showed that better imputation performances were achieved when more samples from various populations were included in the reference panel. Thus, along with improved algorithms for genotype imputation using large panels, great efforts are being made to construct large reference panels for highly accurate genotype imputation, such as that by the Haplotype Reference Consortium.  In addition, several cohort studies have been conducted using WGS to construct better, more detailed haplotype reference panels. [2, 6] These studies suggest that increasing the sample sizes of population-specific haplotype reference panels is more effective for improving genotype imputation accuracy than aggregating the haplotype collection from worldwide resources, because the focus is then on specific populations. Although recent studies in human population genetics have revealed clear regional variation in haplotype diversity, even within a single population,  the influence of such variation on imputation performance has not yet been assessed.
The Tohoku Medical Megabank (TMM) project was launched in 2011 to investigate effects in the aftermath of the Great East Japan Earthquake in the Miyagi and Iwate prefectures (Northeast Japan). The TMM project developed prospective cohorts in these two prefectures with the aim of aiding in the establishment of precision medicine in this region. To contribute to this, the WGS of 1070 human genomes from individuals in the Miyagi region was undertaken (beginning in 2015) and a 1070 Japanese genome reference panel containing haplotype information was constructed using these data (the 1KJPN panel). In a previous study, we reported that the imputation performance using the 1KJPN panel was better than the performance using the 1000 Genomes Project phase 1 panel. We also designed a custom SNP array for a Japanese population (the Japonica array).
The 1KJPN panel consists of haplotypes derived from a cohort of participants in the Miyagi Prefecture, which has approximately 5% of Japan’s total population. However, it is not known how much region-specific sampling for a reference panel affects the performance of genotype imputation for samples collected nationwide.
Customized reference panel construction
We performed WGS for all the available samples from Iwate, Nagahama, and Aki to construct an extended haplotype reference panel (1KJPN+ panel; Fig. 1a). After removing one Aki sample because of its cryptic relatedness to another member of the cohort, the 1KJPN+ panel consisted of 2560 haplotypes from 1280 samples (1070, 136, 39, and 35 samples from Miyagi, Iwate, Nagahama, and Aki, respectively). We also compared our panels with the phase 3 panel of the 1000 Genomes Project (1KGPp3 panel).
Genetic diversity of Japanese and other East Asian populations
We compared the diversity of Japanese populations (namely, the Miyagi cohort used to construct 1KJPN and other Japanese populations) with the diversity of populations from elsewhere in East Asia to determine how 1KJPN might reflect these populations. Principal component analysis (PCA) plots with 35,596 SNP genotypes (not indels) on chromosome 1 (see Methods) are shown in Fig. 1b. The proportions of variance explained by the first and second principal components were 15.4% and 3.53%, respectively. Japanese individuals from Aki, Iwate, and Nagahama who were newly added to the dataset clustered with the Miyagi population (= 1KJPN) but were separated from other East Asian populations analyzed in the 1000 Genomes Project. Indeed, the Miyagi samples overlapped with most of the other Japanese populations, as shown in Fig. 1c, which is a magnified view of the Japanese populations in Fig. 1b. This indicates that the 1KJPN population is sufficiently similar to populations elsewhere in Japan and that 1KJPN can be used as genomic data representative of the whole Japanese population in mainland Japan.
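As a rough illustration of what the PCA step computes, the sketch below extracts the first principal component from a small genotype matrix. It is not the PLINK procedure used in the study: it only centres each SNP (no variance standardisation), uses plain power iteration, and the function name `first_principal_component` is hypothetical.

```python
import math


def first_principal_component(genotypes, n_iter=500):
    """Return (scores, explained_fraction) for the first principal component.

    genotypes: list of samples, each a list of allele counts (0, 1 or 2)
    over the same SNPs, with no missing values (an assumption of this sketch).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Centre every SNP column on its mean allele count.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    centred = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    # Sample-by-sample cross-product matrix (n x n).
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centred]
            for r1 in centred]
    total = sum(gram[i][i] for i in range(n))
    if total == 0:
        raise ValueError("genotypes show no variation")

    # Power iteration from an arbitrary, non-constant starting vector.
    vec = [float(i + 1) for i in range(n)]
    for _ in range(n_iter):
        new = [sum(g * v for g, v in zip(row, vec)) for row in gram]
        norm = math.sqrt(sum(x * x for x in new))
        if norm == 0:
            raise ValueError("starting vector is orthogonal to the data")
        vec = [x / norm for x in new]

    # Rayleigh quotient gives the leading eigenvalue for the unit vector.
    eigenvalue = sum(v * sum(g * w for g, w in zip(row, vec))
                     for v, row in zip(vec, gram))
    scores = [math.sqrt(eigenvalue) * v for v in vec]
    return scores, eigenvalue / total
```

The second return value corresponds to the "proportion of variance explained" quoted for each component; samples from one population would be expected to receive similar scores.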
Genetic diversity in Miyagi and other parts of Japan
Fixation index (FST) estimation* of genetic differentiation of samples in four parts of Japan
Moreover, we chose 288 of the 1070 samples from Miyagi residents whose maternal grandmother was also born in Miyagi Prefecture to analyze haplotype sharing among the four Japanese regions using fineSTRUCTURE  (Additional file 2: Figure S1). We identified four clusters and found that the cluster positions corresponded to the geographic relationship among the regions. For example, cluster A consisted of samples from Iwate and Miyagi, which are adjacent prefectures, whereas cluster B was dominated by Miyagi samples with small numbers of Iwate and Nagahama samples. However, further investigation is needed to clarify whether the cluster separation among the regions, as shown in Additional file 2: Figure S1, comes from the simple reduction of analyzed individuals, the increase of the Miyagi-specific population, or both.
1KJPN genotype imputation efficiencies and effects of genetic differences
Genotype imputation is difficult for rare variants, so this decrease in aggregate r2 values was expected. However, SNPs with about 1% MAFs were efficiently imputed with 1KJPN for the Nagahama and Aki samples (Fig. 3; r2 values around 0.75). These results indicate that 1KJPN is adequate for use as a reference panel for populations from mainland Japan.
SNPs that differentiate between Miyagi and Nagahama or Aki
SNPs that differentiate between Miyagi and Nagahama or Aki based on MAF differences
rsid (dbSNP 138)
Numbers of SNPs found in three populations but not found in 1KJPN (Miyagi population)
Iwate (per person)
Nagahama (per person)
Aki (per person)
Total SNPs (AC ≥ 2)*
We evaluated the influence of genetic diversity on the accuracy of genotype imputation among populations from different parts of Japan. Previous studies reported clear genetic differentiation between individuals from Okinawa (Ryukyu area) and individuals from the rest of Japan (Hondo), but genetic differentiation among local regions in the Hondo area has been reported to be very low and not to show distinct clusters. [15, 16] In this study, however, we found genetic clusters separated in accordance with geographic location (Miyagi, Iwate, Nagahama, and Aki) using a haplotype-based statistical method  (Fig. 1b). Among these areas, genetic diversity was shown to be correlated with geographic distance; for example, the Miyagi and Iwate populations were genetically closer than any other pair of areas. The differences in imputation accuracy with the 1KJPN panel among samples from these regions (Fig. 3) were also consistent with this diversity. Because the 1KJPN panel contains samples only from the Miyagi area, more haplotype segments were shared with this area than with other regions. Notably, the imputation accuracy of the Iwate samples was very close to that of the Miyagi samples, even though the Iwate samples are not included in the 1KJPN panel. This is consistent with another report showing that genetic similarities among subpopulations were correlated with geography on the Japanese archipelago.  These observations support the idea that ancestry matching between the subjects of genotype imputation and the donors of genomic data for a whole-genome reference panel is effective for improving imputation accuracy, especially for low-frequency and rare variants.
We demonstrated that haplotypes specific to samples in a local area had substantial impact on imputation performance, especially for low-frequency and rare variants. As mentioned above, genetic differentiation within the Hondo population was small in terms of SNP frequency, but apparent when haplotype sharing between samples was considered. Because imputation algorithms essentially rely on haplotype sharing between the reference panel and the study samples, the genetic differentiation between 1KJPN and other Japanese regions might have been substantial. Our results show that the imputation accuracy for common variants was only marginally affected by the combination of the panel and the area (Fig. 3), and by the addition of region-specific haplotypes to the panel (Fig. 4), suggesting that the common haplotypes contained in the 1KJPN panel cover the haplotype diversity of the Hondo area. However, the imputation accuracy of low frequency and rare variants improved when area-specific haplotypes were added to the 1KJPN panel. This means that long-persisting yet rare haplotypes may exist in each area, and that the imputation accuracy can be improved when matching the haplotype panel and samples in that area. These results provide important information for future extension of haplotype reference panels in population cohorts.
Our data suggest that 1KJPN can cover most of the variants found in the samples from other areas in the Japanese mainland outside of Miyagi and that 1KJPN can be used as a representative of the Japanese population’s genomes, making it a useful genome reference panel for other parts of the Japanese mainland. We also showed that the addition of samples not included in 1KJPN improved imputation efficiencies for SNPs with MAFs under 1% for samples from most of the Japanese archipelago.
The haplotype reference panel (1KJPN panel) was constructed from the whole-genome sequences of 1070 participants from the prospective cohort study of the TMM project. All samples in this panel were obtained from individuals recruited in the Miyagi Prefecture. In the present study, we added samples from individuals recruited in the Iwate Prefecture to our cohort, as well as age- and sex-stratified random samples from external cohorts for comparison, namely, the Nagahama study [17, 18] and the Aki area from the JPHC-NEXT study. WGS and SNP array genotyping were conducted for 136, 39, and 36 samples from the Iwate, Nagahama, and Aki cohorts, respectively. Participants from these cohorts provided written informed consent to undergo WGS in the collaboration studies. For the WGS, 2.5 μg of DNA was dissolved in TE buffer (Tris pH 8.0 10 mM, EDTA 1 mM) or distilled water (100 ng/μL). Aliquots of 500 ng were prepared for the SNP array analyses. Detailed WGS methods followed previous studies. [9, 20] SNP array genotyping was performed using Japonica arrays (Toshiba Corporation, Tokyo, Japan).
Construction of haplotype reference panel
The haplotype reference panel was constructed from WGS data. We constructed an extended haplotype reference panel (1KJPN+ panel) consisting of newly sequenced samples from the Iwate, Nagahama, and Aki cohorts, using the method described previously, in addition to the 1070 samples originally included in the 1KJPN panel. Variant calling with filtering and haplotype phasing followed the methods used to construct the 1KJPN panel. Briefly, read mapping and genotype calling were performed using Bowtie2 (version 2.1.0) and Bcftools (version 0.1.17-dev), respectively. Sequence depth criteria for filtering unreliable genotypes were determined on an individual basis to achieve a genotype concordance of 99.8% between next-generation sequencing variant call data and SNP array call data. We then phased the genotypes obtained from WGS using the SHAPEIT2 program (version 2.r644). Cryptic relatives inferred through an identity by descent estimate (PI_HAT value > 0.125) were removed from the reference panel. PI_HAT values were calculated using the PLINK (version 1.9) program.
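The relatedness filter described above can be sketched as follows. The input is assumed to be a list of (sample, sample, PI_HAT) tuples, for example parsed from a pairwise relatedness report; the function name `prune_relatives` and the greedy rule for choosing which member of a related pair to drop are assumptions of this sketch, since the text does not state how that choice was made.

```python
from collections import defaultdict

PI_HAT_THRESHOLD = 0.125  # cutoff stated in the text


def prune_relatives(pairs, threshold=PI_HAT_THRESHOLD):
    """Return the set of sample IDs to drop so that no remaining pair
    has PI_HAT above the threshold.

    pairs: iterable of (sample_a, sample_b, pi_hat) tuples.
    Greedy rule (an assumption): repeatedly drop the sample that is
    involved in the largest number of remaining related pairs.
    """
    related = defaultdict(set)
    for a, b, pi_hat in pairs:
        if pi_hat > threshold:
            related[a].add(b)
            related[b].add(a)

    removed = set()
    while any(related.values()):
        # Ties are broken by sorted ID so the result is deterministic.
        worst = max(sorted(related), key=lambda s: len(related[s]))
        for partner in related.pop(worst):
            related[partner].discard(worst)
        removed.add(worst)
    return removed
```

Dropping the most connected sample first tends to keep more samples in the panel than dropping an arbitrary member of each pair, which matters when every retained genome adds two haplotypes.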
SNP array genotyping
Genotype calling was conducted using the apt-probeset-genotype program in the Affymetrix Power Tools suite (version 1.18.2; Thermo Fisher Scientific Inc., Waltham, MA). Quality control (QC) criteria were set in accordance with the manufacturer’s recommendations (dish QC ≥ 0.82; sample call rate ≥ 97%) and were met by all the samples. SNP-based QC was conducted using the Ps classification function in the SNPolisher package (version 1.5.2; Thermo Fisher Scientific Inc.). SNPs that were categorized as “recommended” by the Ps classification were retained. SNPs with call rate < 97.0%, Hardy–Weinberg equilibrium p < 10^−6, or MAF < 0.5% were excluded from the downstream analysis.
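The three SNP-level thresholds above can be expressed as a single check on genotype counts. This is a minimal sketch: the function name `snp_passes_qc` is hypothetical, and the Hardy–Weinberg p value is computed with a 1-degree-of-freedom chi-square test, which is an assumption because the text does not say which test was used.

```python
import math


def snp_passes_qc(n_aa, n_ab, n_bb, n_missing,
                  min_call_rate=0.97, hwe_p_min=1e-6, min_maf=0.005):
    """Apply the call-rate, HWE and MAF thresholds from the text to one SNP.

    n_aa, n_ab, n_bb: counts of the three called genotypes.
    n_missing: number of samples without a genotype call.
    """
    n_called = n_aa + n_ab + n_bb
    n_total = n_called + n_missing
    if n_called == 0:
        return False

    call_rate = n_called / n_total
    freq_a = (2 * n_aa + n_ab) / (2 * n_called)
    maf = min(freq_a, 1 - freq_a)
    if call_rate < min_call_rate or maf < min_maf:
        return False

    # Expected genotype counts under Hardy-Weinberg proportions.
    expected = (n_called * freq_a ** 2,
                2 * n_called * freq_a * (1 - freq_a),
                n_called * (1 - freq_a) ** 2)
    chi2 = sum((obs - exp) ** 2 / exp
               for obs, exp in zip((n_aa, n_ab, n_bb), expected))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return p_value >= hwe_p_min
```

Because the MAF check runs first, every expected count is positive by the time the chi-square statistic is computed, so the division is safe.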
Population structure analysis
The SNP genotype data of the TMM samples were obtained by whole-genome sequencing or using the Japonica array. We obtained the corresponding SNP genotype data from next-generation sequencing analysis for the cases that were not analyzed with the Japonica array. To analyze the population genetic structure compared with that of East Asian populations, we downloaded the SNP data of chromosome 1 of unrelated East Asian populations (Dai Chinese, CDX; Han Chinese in Beijing, CHB; Han Chinese South, CHS; Tokyo Japanese, JPT; and Kinh Vietnamese, KHV) from the 1000 Genomes Project. We selected the SNPs for which probes were included on the Japonica array and used the VCFtools package to further filter the variants and individuals and the PLINK software package to calculate r2 scores. Indels and SNPs with maximum detection fraction > 0.1, smallest MAF 0.05, and maximum r2 0.8 were filtered out. The calculation of principal components for the SNP genotype was performed using the PLINK package.
Weir and Cockerham’s FST value estimators were calculated between all pairs of populations using PLINK. Based on the resultant FST matrix, a network was inferred among populations using the neighbor-net method in the SplitsTree program. Sample clustering by haplotype sharing was performed with the fineSTRUCTURE program. Haplotype phasing for this analysis was carried out using the SHAPEIT2 program (version 2.r644) with the default settings.
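To give a feel for what a pairwise FST value measures, the sketch below implements Hudson's estimator, combined across SNPs as a ratio of sums. This is a deliberately simpler stand-in and not the Weir and Cockerham estimator computed with PLINK in this study; the function name `hudson_fst` and its inputs (per-SNP allele frequencies plus the number of sampled chromosomes in each population) are assumptions of the example.

```python
def hudson_fst(freqs_1, freqs_2, n_chrom_1, n_chrom_2):
    """Hudson's FST between two populations, as a ratio of sums over SNPs.

    freqs_1, freqs_2: allele frequencies of the same allele at the same
    SNPs in population 1 and population 2.
    n_chrom_1, n_chrom_2: number of sampled chromosomes (2 x individuals
    for diploids), assumed constant across SNPs in this sketch.
    """
    if n_chrom_1 < 2 or n_chrom_2 < 2:
        raise ValueError("need at least two chromosomes per population")
    numerator = 0.0
    denominator = 0.0
    for p1, p2 in zip(freqs_1, freqs_2):
        # Squared frequency difference, corrected for sampling noise.
        numerator += ((p1 - p2) ** 2
                      - p1 * (1 - p1) / (n_chrom_1 - 1)
                      - p2 * (1 - p2) / (n_chrom_2 - 1))
        # Probability that two alleles, one from each population, differ.
        denominator += p1 * (1 - p2) + p2 * (1 - p1)
    if denominator == 0:
        raise ValueError("no SNP is polymorphic between the populations")
    return numerator / denominator
```

Values near zero indicate little differentiation and small negative values can arise from the sampling correction; a matrix of such pairwise values is the kind of input a neighbor-net analysis takes.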
Evaluation of imputation performance
We performed genotype imputation using the IMPUTE2 program. Variants in the reference panel that had the same position in the Japonica array (32,913 SNPs) were extracted for use in genotype imputation. The remaining variants in the panel (1,012,074 SNPs) were used to evaluate the accuracy of imputation of the true genotypes. Because the Miyagi samples used to test the reference panels are included in the reference panel (i.e., 1KJPN), we conducted a leave-one-out cross-validation experiment. Namely, each sample in the panel was extracted from the panel one after the other, and then genotype imputation of that sample was conducted against the entire panel without that sample. Because this procedure was repeated for all samples in the panel, it required intensive computational resources, so the evaluation of imputation performance was conducted for SNPs only on chromosome 10 (1,044,987 SNPs). Through this process, we obtained imputed genotypes for every sample in the panel. Genotype imputation for the samples that were not included in the reference panel (i.e., Iwate, Nagahama, and Aki samples) was done with the IMPUTE2 program. Imputation accuracy was measured using the squared Pearson correlation coefficient (r2) between true genotypes, taking a value of 0, 1, or 2, and imputed genotype dosages with values between 0 and 2. The r2 values were estimated upon aggregating the variants in the reference panel that were stratified by non-reference allele frequency to visualize the imputation accuracies for rare SNPs. These evaluations were conducted on the SNPs that were identified in all the examined reference panels.
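The accuracy measure can be sketched as follows, under one reading of "aggregating": all genotype and dosage pairs from the SNPs in a frequency bin are pooled and a single r2 is computed per bin. The function names, the record layout, and the bin edges are hypothetical and are not taken from the study's own scripts.

```python
def pearson_r2(xs, ys):
    """Squared Pearson correlation; returns NaN when it is undefined."""
    n = len(xs)
    if n < 2:
        return float("nan")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy * sxy / (sxx * syy)


def aggregate_r2_by_frequency(records, bin_edges):
    """Compute one pooled r2 per allele-frequency bin.

    records: iterable of (allele_freq, true_genotypes, imputed_dosages),
    one entry per SNP; true genotypes are 0/1/2, dosages lie in [0, 2].
    bin_edges: ascending boundaries; a SNP with frequency f goes into
    bin i when bin_edges[i] <= f < bin_edges[i + 1].
    """
    n_bins = len(bin_edges) - 1
    pooled_true = [[] for _ in range(n_bins)]
    pooled_dosage = [[] for _ in range(n_bins)]
    for freq, true_gt, dosage in records:
        for i in range(n_bins):
            if bin_edges[i] <= freq < bin_edges[i + 1]:
                pooled_true[i].extend(true_gt)
                pooled_dosage[i].extend(dosage)
                break
    return [pearson_r2(t, d) for t, d in zip(pooled_true, pooled_dosage)]
```

In a leave-one-out setting, each record would hold the genotypes of the held-out samples together with the dosages imputed for them from the panel that excludes them. Pooling within bins keeps the estimate stable for rare variants, where a per-SNP r2 is often undefined because most samples carry no copy of the minor allele.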
We thank Nozomi Hatanaka, Noriko Takahashi, Masae Kimura, Keiko Tateno, and Chizuru Abe for their technical assistance. We thank Margaret Biswas, PhD, from Edanz Group (www.edanzediting.com/ac) for editing a draft of this manuscript.
This work was supported in part by the Tohoku Medical Megabank Project from the Ministry of Education, Culture, Sports, Science and Technology (MEXT) and the Reconstruction Agency; the Ministry of Education, Culture, Sports, Science and Technology (MEXT); the Japan Agency for Medical Research and Development (AMED; Grant Numbers JP17km0105001 and JP17km0105002) for Tohoku University and (Grant Numbers 17km0105003j0006 and 17km0105004j0006) for Iwate Medical University; and the Center of Innovation Program from the Japan Science and Technology Agency (JST) for Tohoku University. All computational resources were provided by the Tohoku University Tohoku Medical Megabank Organization (ToMMo) supercomputer system (http://sc.megabank.tohoku.ac.jp/en), which is supported by Facilitation of R&D Platform for AMED Genome Medicine Support conducted by AMED (Grant Number JP17km0405001). The Japan Public Health Center-Based Prospective Study for the Next Generation (JPHC-NEXT) was supported by the National Cancer Center Research and Development Fund and the Japan Science and Technology Agency (JST). The Nagahama Prospective Genome Cohort for Comprehensive Human Bioscience (the Nagahama Study) was supported by MEXT and the Takeda Science Foundation.
Availability of data and materials
The datasets used and/or analyzed in the current study are available from the representative authors of each of the cohorts on reasonable request. 1KJPN can be obtained from MY; Nagahama data can be obtained from FM; Aki data can be obtained from MI; and Iwate data can be obtained from MS.
The genotype data of the other East Asian populations were obtained from the 1000 Genomes Project website. The data are described in the 1000 Genomes Project Consortium paper: A global reference for human genetic variation. Nature 2015, 526:68–74.
JY and MY planned and organized the study. NM, FM, KH, NY, MI, and MS contributed to sample collection and DNA preparation. FK, ID, SS, and AS performed the whole-genome sequencing and other experimental procedures. JY, YK, KKo, and MN performed the bioinformatics and statistical analyses. ST confirmed sample ID concordances. YYK, INM, KKu, GT, and KKi supervised the bioinformatics and statistical analyses. MSY, OT, and NF supervised the whole genome-sequencing and other experimental procedures. JY and YK contributed to writing the manuscript. YYK, KH, FM, NY, MI, NF, and MY amended the manuscript. All authors read and approved the final manuscript.
Ethics approval and consent to participate
The ethics approvals were obtained from the ethics committees of the Tohoku University Tohoku Medical Megabank Organization (ID = 2017–4-054) for 1KJPN; the Graduate School of Medicine and Faculty of Medicine, Kyoto University for Nagahama samples (IDs = G0278–11 and G0455–8); the National Cancer Center (IDs =2011–186 and 2016–305) and Kochi University Medical School (ID = 24–94) for the Aki samples; and Iwate Medical University for the Iwate samples (ID = HGH25–19). Written informed consent was obtained from the participants in all cohorts whose samples were provided for this study.
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Nelson SC, Doheny KF, Pugh EW, Romm JM, Ling H, Laurie CA, Browning SR, Weir BS, Laurie CC. Imputation-based genomic coverage assessments of current human genotyping arrays. G3 (Bethesda). 2013;3:1795–807.
- Deelen P, Menelaou A, van Leeuwen EM, Kanterakis A, van Dijk F, Medina-Gomez C, Francioli LC, Hottenga JJ, Karssen LC, Estrada K, et al. Improved imputation quality of low-frequency and rare variants in European samples using the 'Genome of the Netherlands'. Eur J Hum Genet. 2014;22:1321–6.
- Huang J, Howie B, McCarthy S, Memari Y, Walter K, Min JL, Danecek P, Malerba G, Trabetti E, Zheng H-F, et al. Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel. Nat Commun. 2015;6:8111.
- McCarthy S, Das S, Kretzschmar W, Delaneau O, Wood AR, Teumer A, Kang HM, Fuchsberger C, Danecek P, Sharp K, et al. A reference panel of 64,976 haplotypes for genotype imputation. Nat Genet. 2016;48:1279–83.
- Howie B, Marchini J, Stephens M. Genotype imputation with thousands of genomes. G3 (Bethesda). 2011;1:457–70.
- The UK10K Consortium. The UK10K project identifies rare variants in health and disease. Nature. 2015;526:82–90.
- Leslie S, Winney B, Hellenthal G, Davison D, Boumertit A, Day T, Hutnik K, Royrvik EC, Cunliffe B, et al. The fine-scale genetic structure of the British population. Nature. 2015;519:309–14.
- Kuriyama S, Yaegashi N, Nagami F, Arai T, Kawaguchi Y, Osumi N, Sakaida M, Suzuki Y, Nakayama K, Hashizume H, et al. The Tohoku Medical Megabank Project: design and mission. J Epidemiol. 2016;26:493–511.
- Nagasaki M, Yasuda J, Katsuoka F, Nariai N, Kojima K, Kawai Y, Yamaguchi-Kabata Y, Yokozawa J, Danjoh I, Saito S, et al. Rare variant discovery by deep whole-genome sequencing of 1,070 Japanese individuals. Nat Commun. 2015;6:8018.
- 1000 Genomes Project Consortium, Abecasis GR, Auton A, Brooks LD, DePristo MA, Durbin RM, Handsaker RE, Kang HM, Marth GT, McVean GA. An integrated map of genetic variation from 1,092 human genomes. Nature. 2012;491(7422):56–65.
- Kawai Y, Mimori T, Kojima K, Nariai N, Danjoh I, Saito R, Yasuda J, Yamamoto M, Nagasaki M. Japonica array: improved genotype imputation by designing a population-specific SNP array with 1070 Japanese individuals. J Hum Genet. 2015;60(10):581–7.
- The 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature. 2015;526:68–74.
- Takeuchi F, Katsuya T, Kimura R, Nabika T, Isomura M, Ohkubo T, Tabara Y, Yamamoto K, Yokota M, Liu X, et al. The fine-scale genetic structure and evolution of the Japanese population. PLoS One. 2017;12(11):e0185487.
- Lawson DJ, Hellenthal G, Myers S, Falush D. Inference of population structure using dense haplotype data. PLoS Genet. 2012;8:e1002453.
- Yamaguchi-Kabata Y, Nakazono K, Takahashi A, Saito S, Hosono N, Kubo M, Nakamura Y, Kamatani N. Japanese population structure, based on SNP genotypes from 7003 individuals compared to other ethnic groups: effects on population-based association studies. Am J Hum Genet. 2008;83:445–56.
- Jinam T, Nishida N, Hirai M, Kawamura S, Oota H, Umetsu K, Kimura R, Ohashi J, Tajima A, Yamamoto T, et al. The history of human populations in the Japanese archipelago inferred from genome-wide SNP data with a special reference to the Ainu and the Ryukyuan populations. J Hum Genet. 2012;57:787–95.
- Higasa K, Miyake N, Yoshimura J, Okamura K, Niihori T, Saitsu H, Doi K, Shimizu M, Nakabayashi K, Aoki Y, et al. Human genetic variation database, a reference database of genetic variations in the Japanese population. J Hum Genet. 2016;61(6):547–53.
- Terao C, Bayoumi N, McKenzie CA, Zelenika D, Muro S, Mishima M, Connell JM, Vickers MA, Lathrop GM, Farrall M, et al. Quantitative variation in plasma angiotensin-I converting enzyme activity shows allelic heterogeneity in the ABO blood group locus. Ann Hum Genet. 2013;77(6):465–71.
- JPHC-NEXT. http://epi.ncc.go.jp/jphcnext/index.html.
- Motoike IN, Matsumoto M, Danjoh I, Katsuoka F, Kojima K, Nariai N, Sato Y, Yamaguchi-Kabata Y, Ito S, Kudo H, et al. Validation of multiple single nucleotide variation calls by additional exome analysis with a semiconductor sequencer to supplement data of whole-genome sequencing of a human population. BMC Genomics. 2014;15:673.
- Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9.
- Li H. A statistical framework for SNP calling, mutation discovery, association mapping and population genetical parameter estimation from sequencing data. Bioinformatics. 2011;27:2987–93.
- Delaneau O, Zagury J-F, Marchini J. Improved whole-chromosome phasing for disease and population genetic studies. Nat Methods. 2013;10:5–6.
- Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. GigaScience. 2015;4:7.
- IGSR: The International Genome Sample Resource. http://www.internationalgenome.org.
- Weir BS, Cockerham CC. Estimating F-statistics for the analysis of population structure. Evolution. 1984;38:1358.
- Bryant D, Moulton V. Neighbor-net: an agglomerative method for the construction of phylogenetic networks. Mol Biol Evol. 2004;21(2):255–65.
- Huson DH, Bryant D. Application of phylogenetic networks in evolutionary studies. Mol Biol Evol. 2006;23(2):254–67.
- Howie BN, Donnelly P, Marchini J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 2009;5(6):e1000529.