Skip to main content
  • Research article
  • Open access
  • Published:

Genome-wide association analysis and QTL mapping reveal the genetic control of cadmium accumulation in maize leaf



Accumulation of cadmium (Cd) in maize (Zea mays L.) poses a significant risk to human health as it is ingested via the food chain. A genome-wide association study (GWAS) was conducted in a population of 269 maize accessions with 43,737 single nucleotide polymorphisms (SNPs) to identify candidate genes and favorable alleles for controlling Cd accumulation in maize.


When grown in contaminated soil, accessions varied significantly in leaf Cd concentration at both the seeding and maturing stages with phenotypic variation and the coefficient of variation all above 48%. The co-localized region between SYN27837 (147,034,650 bp) and SYN36598 (168,551,327 bp) on chromosome 2 was associated with leaf Cd under three soil conditions varying in Cd content in 2015 and 2016. The significant SNP (SYN25051) at position 161,275,547 could explained 27.1% of the phenotype variation. Through QTL mapping using the IBMSyn10 double haploid (DH) population, we validated the existence of a major QTL identified by GWAS; qLCd2 could explain the 39.8% average phenotype variation across the experiments. Expression of GRMZM2G175576 encoding a cadmium/zinc-transporting ATPase underlying the QTL was significantly increased in roots, stems and leaves of B73, a low Cd accumulation line in response to Cd stress.


Our findings provide new insights into the genetic control of Cd accumulation and could aid rapid development of maize genotypes with low-Cd accumulation by manipulation of the favorable alleles.


Cadmium (Cd) is a heavy metal that is highly toxic to all organisms and is one of the major environmental pollutants. One primary concern is its transfer from crop plants to the human diet. It is concluded that crop plants contribute more than 70% of cadmium intake in humans [1]. Excessive intake may result in damage from cancers of the prostate, lungs, and testes, kidney tubule damages, emphysema, and bone fractures [2]. Maize (Zea mays L.) is not only a global crop but also a primary food source for hundreds of millions of people in developing countries. In Asia, up to 50% of the ingested Cd comes from crops and their products, including maize [3]. Because of a serious concern in widespread food safety, research into a better understanding of the mechanisms driving Cd accumulation in crop plants has become increasingly important.

The pathway of Cd transport has been elucidated in rice by genetic and genomic approaches [4]. There are three main transport processes most likely to mediate Cd accumulation in plants. First, roots take up Cd from soil through the symplastic pathway. In plants, many of the transporters for divalent transition metals have Cd uptake activity. For example, OsIRT1 controls Cd absorption and overexpression of OsIRT1 increases Cd accumulation [5]. Secondly, Cd accumulation happens through root-to-shoot translocation via xylem loading. The ability of xylem-mediated Cd translocation into shoots has shown to be a major determinant for shoot Cd accumulation in many plants [6, 7]. OsHMA3 has been identified as a regulator for xylem Cd transport in rice by mediating vacuolar sequestration of Cd in root cells [8]. Third, Cd is redirected to phloem at nodes and remobilized from leaves to grains. Remobilization of Cd from leaf blades to grains appears to be regulated by phloem transport [9]. In rice, OsLCT1in leaf blades plays a role in translocation of Cd from enlarged vascular bundles to diffuse vascular bundles, which connects to the panicle [10]. OsHMA2 is involved in the preferential distribution of Zn and Cd at the roots and nodes [11]. All research reports suggest that the various transport systems are involved in Cd accumulation and differ in their roles in controlling Cd uptake and translocation.

Cd accumulation is a complex quantitative trait, controlled by multiple genes. Although the mechanisms of Cd accumulation are widespread and comprehensive in rice and Arabidopsis thaliana, none of homologous genes that have been involved in the natural genetic variation of Cd accumulation in rice and A. thaliana were cloned in maize. Traditionally, quantitative trait loci (QTLs) have been identified using linkage mapping populations such as recombinant inbred lines (RILs) and double haploid lines (DHs). However, such mapping populations are often generated from a cross between two parental lines. As a result, only a limited amount of natural allelic diversity can be captured in the population, leading to the identified QTLs spanning relatively large genomic regions and making identification of causal genes more difficult. Genome-wide association study (GWAS) utilizes the natural populations for QTL detection though marker-trait association [12, 13]. Integration of linkage mapping and GWAS provides a more powerful tool for identifying and verifying candidate genes underlying complex traits [14]. To date, genetic control of Cd accumulation is not well understood in crop species, particularly for marker and gene identification using integrated approaches of linkage mapping and GWAS. In this study, we assessed Cd accumulation in 269 maize genotypes grown at different levels of Cd contaminated soils in a greenhouse and under field conditions and conducted GWAS for leaf Cd concentration with the Illumina Infinium maize SNP50K. In addition, a co-located region identified by GWAS was validated through linkage mapping in a ten-generation intermated B73 × Mo17 (IBMSyn10) DH population with an ultra-high density bin map. Candidate genes and favorable alleles identified in the diversity accessions would assist in further revealing genetic control of Cd accumulation in maize.


Phenotypic variation and heritability

Two hundred and sixty nine diverse accessions were evaluated for Cd accumulation of leaves under low-Cd (LSLCd) and middle-Cd (MSLCd) conditions at the seeding stage and under high-Cd condition at the maturing stage (HLCd) in 2015 and 2016, respectively. Analysis of variance (ANOVA) indicated that diverse accessions differed significantly (P < 0.01) in Cd concentration in leaves, while significant genotype by environment (Y × G) interactions were observed (P < 0.01) (Table 1). The cultivation environment had a great impact on Cd accumulation.

Table 1 Phenotypic variations for leaf Cd concentration in 269 maize accessions in experiments conducted in 2015 and 2016

The mean HLCd value for individual accessions ranged from 6.76 to 128.4 mg·kg−1 under the high-Cd field condition at the maturing stage. Higher values for HLCd were observed in 2015 with a mean of 47.3 mg·kg−1, compared with 2016 (mean = 21.9 mg·kg−1) (Table 1). At the seeding stage, the mean Cd concentration of leaves (LSLCd) at the low-Cd level ranged from 0.14 to 5.90 with a mean of 1.61 mg·kg−1, whereas the Cd concentration of leaves at the middle-Cd level (MSLCd) varied from 2.79 to 64.3 mg·kg−1, with an average of 22.3 mg·kg−1. The results showed that the Cd accumulation of leaves in the natural population had a large variation with variation coefficient (CV) over 48%. In addition, the mean value of leaf Cd concentration in the tropical group was significantly lower than that in the temperate group (P < 0.001) in different environments (Fig. 1). Furthermore, the broad-sense heritability (h2) for Cd accumulation across all measured environments ranged from 0.62 (HLCd) to 0.82 (LSLCd), indicating high repeatability over testing environments as well as roles of genetic factors in determining the trait. Overall, the maize plants exhibited significant genetic variations in Cd concentration when grown at different Cd levels.

Fig. 1
figure 1

Distribution of subgroups in maize lines under low-Cd condition (a) and middle-Cd condition (b) at seeding stage and under high-Cd field condition at maturing stage (c). Trop = Tropical group; Temp = Temperature group; Mix = other group with no clear identity

Population structure and linkage disequilibrium (LD)

The population structure was calculated using 5200 SNPs. Because the log likelihood of data (LnP(D)) from the STRUCTURE output continuously increased with K value, it was not capable of identifying groups. Accordingly, the ∆K value was calculated for each K, suggesting that the 269 genotypes could be assigned into three groups (Fig. 2a). The accessions were divided into tropical, stiff stalk (SS) and non-stiff stalk (NSS) groups. Among them, 50 individuals were assigned to the NSS group, 101 individuals to the SS group, and 118 individuals to the tropical group (Additional file 1). Among them, 92 lines had mixed assignment into these three groups. The average relative kinship between any two lines was 0.067. A total of 62.8% of kinship coefficients were 0 while 27.2% were between 0 and 0.2 (Fig. 3a), indicating a weak relative kinship in the diverse population.

Fig. 2
figure 2

Analysis of the population structure of 269 maize inbred lines estimated from 5200 SNPs. a Estimated LnP(D) and ∆k over five repeats of STRUCTURE analysis; b Population structure of the 269 lines from K = 3. SS = Stiff Stalk; NSS = Non-Stiff Stalk

Fig. 3
figure 3

The distributions of pair-wise kinship and LD decay rate per chromosome. a The distributions of pair-wise kinship between 269 inbred lines; b LD decay rate per chromosome based on mean r2 per 50 kb region

The extent of LD was estimated with all 43,737 SNPs using TASSEL 5.0. A rapid decline in LD was observed with increasing physical distance on all chromosomes (Fig. 3b), but the decay rate varied over different chromosomes. At a cutoff of r2 = 0.1, equilibrium was reached within 200–250 kb on chromosome 1, 250–300 kb on chromosome 2, and 300–600 kb on the rest of the chromosomes. The mean LD decay was 350–400 kb across all chromosomes.

Genome-wide association analysis of Cd accumulation

To determine which model was more suitable for association mapping analysis, we used and compared four models for testing each trait. QQ plots showed that the distribution of observed -log10(P) values from the simple model and Q model departed far from the expected distribution, leading to a high level of false-positive signals. The model controlling K and Q + K had similar effects on reducing the false positives, except that LSLCd16 in the Q + K model had lower power than the K model (Additional file 2). Thus, we selected the Q + K model to identify association signals for LSLCd15, MSLCd15, HLCd15, MSLCd16, HLCd16, and the Q model for LSLCd16.

Across three soil conditions, 63 SNPs located on 5 of the 10 chromosomes were significantly associated with leaf Cd concentration (P < 1.18 × 10−6) (Fig. 4, Additional file 3). The association explained approximately 15.9% of the phenotypic variations. The greatest number of significant SNPs were found with traits collected from the high-Cd field environment (HLCd16, 47 SNPs) at the maturing stage in 2016, followed by the same condition (HLCd15, 30 SNPs) in 2015. However, no significant associations were detected between SNPs and leaf Cd concentration under the low-Cd condition in 2016 (Additional file 4).

Fig. 4
figure 4

Manhattan plots of association analysis between leaf Cd concentration and single-nucleotide polymorphism (SNP) markers in maize. The horizontal dashed blue line represents the significance threshold -log10(P) = 5.94. Genome-wide association (GWA) mapping under low-Cd condition (a) and middle-Cd condition (b) at the seeding stage in 2015. GWA mapping of leaf Cd concentration under high-Cd field condition at maturing stage of maize in 2015 (c) and 2016 (d)

Forty SNPs in a single region on chromosome 2 were highly associated with leaf Cd concentration (Fig. 4, Additional file 3). Meanwhile, the co-localized peak between SYN27837 (147,034,650 bp) and SYN36598 (168,551,327 bp) explained an average of 17.5% of the phenotypic variation, based on R2 values. SYN25051 (A/G), the most highly significant SNP with -log10(P) = 13.7 at position 161,275,547 on Chr2, explained 27.1% of the phenotype variation. In addition, 5 SNPs located on chromosome 5 and 6 SNPs located on chromosome 7 were associated with HLCd16 with R2 values ranging from 12.3 to 17.3%. Nine SNPs on chromosome 1were significantly associated with HLCd16.

QTL mapping for Cd accumulation in IBM Syn10 DH population

An ultra-high density bin map was used to further confirm the QTL identified from GWAS. The IBMSyn10 DH population was created by crossing the high leaf Cd line of Mo17 with an A at Chr2:161,275,547 with a low leaf Cd line of B73 with a G at Chr2:161,275,547 (Additional file 5). The field results showed that the leaf Cd concentration in B73 (23.3 mg·kg−1) was significantly lower than that in Mo17 (64.1 mg·kg−1). Leaf Cd concentration ranged from 9.5 to 114.2 mg·kg−1 in the mapping population, but the phenotypic distribution exhibited a transgressive segregation (Additional file 6). The variation in leaf Cd concentration suggested that Cd accumulation was controlled possibly by major-effect QTLs.

Composite interval mapping was performed using the threshold of LOD ≥ 3.84. Five QTLs for leaf Cd concentration were mapped and located on chromosomes 2, 5, 7, 8, and 9 (Table 2, Fig. 5). Noteworthy, an overlapped major QTL named qLCd2 (a QTL for leaf Cd concentration on chromosome 2 from GWAS) was identified, spanned a 13.83 Mb region (153.75–167.58 Mb) and explained 41.2% and 38.4% of the phenotypic variation in 2015 and 2016, respectively. This major QTL had positive additive effects, which indicated that B73 contributed more to Mo17. Although LOD values of the other 4 QTLs were high, those QTLs explained only 3.4% to 4.4% of the phenotypic variation for leaf Cd accumulation (Table 2). No QTLs in other chromosomal regions were observed by explaining more than 5% of the variation in leaf Cd accumulation, suggesting that the qLCd2 is the major genetic locus controlling natural variation in leaf Cd accumulation when maize is grown in Cd contaminated soil.

Table 2 QTLs detected for leaf Cd accumulation in maize by composite interval mapping
Fig. 5
figure 5

Logarithm of odds (LOD) score curves for the QTLs for the leaf Cd accumulation of IBMSyn10 DH population in 2015 and 2016. Gray dotted line indicates LOD threshold (3.84)

Prediction of candidate genes

Based on the results of GWAS and QTL mapping, a major and co-located QTL was associated on chromosome 2, which encompassed a 13.83 Mb region (153.75–167.58 Mb) (Fig. 6a, b). In the comparatively narrow region, GWAS detected a highly significant cluster of 16 SNPs. To determine the trait-associated loci, all significant SNPs located in the target region were clumped at D’ > 0.80. We observed that these 12 SNPs exhibited strong LD and could form two LD blocks (Fig. 6c), which caused an overlap with QTL mapping. The first LD block spanned around 3 Mb including 6 SNPs and the second big LD block spanned around 400 kb including 6 SNPs. Candidate genes were predicted based on the 12 SNPs and their extension regions from 300 kb upstream to 300 kb downstream (LD distance of chromosome 2).

Fig. 6
figure 6

Genome-wide association analysis detected a significant signal associated with leaf Cd concentration in maize. a Manhattan plot of association analysis for HLCd15; b The QTL associated with leaf Cd accumulation in maize at chromosome 2. c Linkage disequilibrium (LD) between the SNPs in target region (Chr2: 153.75–167.58 Mb) and the magnitude of LD indexed by the D’ statistic. Red squares without numbers indicate complete LD (D’ = 1, P < 0.01). D’ values are shown in the squares for values < 1.0

Detailed descriptions of 50 candidate genes were summarized in Additional file 7. According to the gene functional annotations in the maize B73 genome (RefGen_v2), 8 causal candidate genes were predicted for 12 loci associated with Cd accumulation (Table 3). Among these genes in the first LD block, GRMZM2G455491 and GRMZM2G175576, simultaneously encoding a cadmium/zinc-transporting ATPase, were located at 64.18 kb and 42.64 kb upstream of the PZE-102118224 and SYN395 loci, respectively. On average, the individuals carrying the major frequency alleles (G/G + G/G) of two SNPs (PZE-102118224, SYN395) had 27.8 mg lower leaf Cd content than those with minor frequency alleles (A/A + A/A) (Fig. 7a). A putative gene, GRMZM2G124103, encoding a vacuolar ATPase, was found at 72.3 kb apart from the peak SNP (SYN25051). The individuals carrying the major frequency allele (G/G) had 29.3 mg Cd content higher than those with the minor frequency allele (A/A) (Fig. 7b). A ZIP transcription factor, encoded by the putative gene GRMZM2G171370, was targeted by three SNPs, SYN30994, SYN30995, SYN30993. On average, there was 26.4 mg difference in leaf Cd content between major and minor alleles in three SNPs (Fig. 7d). In the second LD block, SYN33611 and PZE-102120786 were located 85.30 kb upstream and 81.87 kb downstream of GRMZM2G085939, respectively. The leaf Cd concentration of individuals with minor frequency alleles (G/G + A/A) at these loci was 28.1 mg higher than those with major frequency alleles (A/A + C/C) (Fig. 7c).

Table 3 Significant associations and corresponding QTLs detected by genome-wide association study and QTL mapping of leaf Cd concentration at seeding and maturing stage in maize
Fig. 7
figure 7

The boxplot of phenotype analysis between the candidate genes for locus associated with leaf Cd concentration in maize and phenotypic difference between minor alleles and major alleles. The number above box represents the number of inbred lines homozygous for a determined allelic variant. Δm, the difference of mean of leaf Cd concentration between the minor alleles and major alleles at maturing stage over 2 years and 2 replications. a Two combinations of the minor and major between SYN395 and SYN30994 in GRMZM2G175576and GRMZM2G455491region; b The single SYN25051 loci in GRMZM2G124103; c Two combinations of the minor and major between SYN33611 and PZE-102120786 in GRMZM2G085939region; d Three combinations of the minor and major among SYN30994, SYN30995 and SYN30993 in GRMZM2G171370 region

Expression analysis of candidate genes

Expression profiles of eight candidate genes were examined in maize B73 under non-limiting growth conditions. Using online transcriptome data from maize B73 (MaizeGDB), an expression heatmap was constructed for these candidate genes in different tissues from 8 developmental stages (Fig. 8a). The results showed that the expression patterns of different candidate genes varied greatly in different tissues. GRMZM2G124103 and GRMZM2G047727 showed a relatively high level of expression in all developmental stages compared to the other candidate genes. GRMZM2G175576 in roots had a relatively high level of expression compared to other tissues. On the contrary, GRMZM2G455491 and GRMZM2G386138 exhibited a constitutively low level of expression in different tissues. The expression of 6 candidate genes were examined in leaves, stems and roots of Cd-stressed B73 seedlings by quantitative real-time PCR analyses. As illustrated in Fig. 8b, a dramatic upregulation of the GRMZM2G175576 gene was observed in response to Cd stress, especially in the stems, exhibiting about a 9.5-fold increase in transcript abundance compared to the control. The relative expression levels of GRMZM2G085939 also significantly increased in the stems and leaves in response to Cd stress. In contrast to the other candidate genes, expression levels of GRMZM2G124103 and GRMZM2G171370 significantly decreased in roots and stems under Cd stress. The expression of GRMZM2G047727 and GRMZM2G085153 had minor response to Cd exposure. These results suggested that 4 candidate genes possibly played diverse roles in maize development and Cd-stress response and tolerance. Furthermore, possible sub-cellular locations of the candidate genes were predicted using ProtComp 9.0. This prediction suggested that GRMZM2G175576, GRMZM2G124103 and GRMZM2G171370 might be localized in plasma membrane, and vacuolar and nucleus organelles, respectively (Additional file 8).

Fig. 8
figure 8

Expression profiles of putative genes. a A heat map illustrating levels of gene expression of the putative genes in nine different tissues from various developmental stages. b Relative levels of gene expression in maize B73 roots, stems and leaves for response to Cd stress; Cd (200 mg/L) treat for 0 h (control), 12 h, 24 h, 48 h samples


Diverse responses of maize accessions to Cd treatments provided valuable information about the range and distribution of Cd concentration in maize and offered new insights into germplasm enhancement for low Cd accumulation. The mean leaf Cd concentration in the temperate group was nearly double (P = 8 × 10−13) that in the tropical group under high-Cd condition (Fig. 1). The relative values of Cd concentration in the three subpopulations (tropic < Mix < SS < NSS) were consistent with the levels of genetic relatedness of these associations to biotic and abiotic stresses [15, 16]. The results suggested that the tropical germplasm contained alleles that would provide useful information for exploring genetic variations in reducing Cd accumulation within SS and NSS. The low-Cd level in the eight lines of 11GP66-1, T32, 98WV9, Wa138, CIMMYT-1, B047, Y8G and CML282 indicated that these lines could be used for developing new maize varieties with low-Cd accumulation.

Cd accumulation is a complex quantitative trait, controlled by multiple genes. Most previous research on the mechanisms of Cd accumulation has focused on rice and A. thaliana. In rice, a major QTL controlling the translocation of Cd from roots to shoots at the seedling stage was mapped [17]. Through GWAS, Chao et al. successfully identified a single strong peak of SNPs associated with leaf Cd accumulation in A. thaliana [18]. In this study, we found 63 loci associated with leaf Cd accumulation through GWAS. Noteworthy, we observed a single region on chromosome 2 that contained 40 SNPs highly associated with leaf Cd concentration (Fig. 2). Across all experiments in 2015 and 2016, a major locus identified for Cd accumulation through GWAS was successfully validated through QTL mapping (Fig. 5). This QTL could explained over 38% of the phenotypic variation (Table 2), implying that Cd accumulation in leaves could be controlled by a major QTL. The remaining minor novel QTLs are also of interest but further work is needed to validate these QTLs in controlling Cd accumulation.

In the comparatively narrow region, GWA detected a highly significant cluster of 16 SNPs on chromosome 2, which extended to the candidate region to a 300 kb window. According to their putative functions, the candidate genes in support region are primarily involved with DNA binding, catalytic activity, transcription regulator activity and transport. Within the QTL support intervals, the R2 of all markers were over 13.6%, indicating that all these markers had high LD with the causal gene. According to a previous study, LD has been observed for the chromosomes that could be involved in the domestication process and selection signatures [19]. More importantly, a rapid breakdown of linkage-related LD can be favorable for association testing of candidate genes that are located nearby the mapped QTL and have functional relevance to trait variation [20]. Among these loci on the target region, 12 potential causative loci exhibited the strongest associations and rapid LD (Fig. 6c). Therefore, the LD might cause overlapping effects and/or co-selection by reorganization, mutation, and selection in the evolution process. These chromosome segments can be incorporated into breeding efforts in cultivating a low-Cd accumulation germplasm.

Knowing genes within a QTL region can help understand trait architecture if their function can be related to the associated trait. Once metal ions are absorbed, translocation of Cd from the roots to the shoots requires loading of Cd into the xylem from the symplast in the stele. Xylem loading of Cd in plants requires Heavy Metal ATPases [4]. In this study, GRMZM2G175576, close to the SNP of PZE-102118224, was predicted to encode a heavy metal transporting ATPase, which was homologous to rice clone-gene OsHMA3 [21]. OsHMA3 has been identified as a firewall by sequestrating Cd into the vacuoles in the roots, keeping the Cd away from the above-ground tissues [8, 22]. OsHMA3 is localized to the vacuolar membrane, but GRMZM2G175576 is predicted to be localized to the plasma-membrane (Additional file 8). The plant plasma membrane is regarded as the first ‘living’ structure that is a target for heavy metal toxicity [23]. These plasma membrane proteins, such as OsHMA2, also mediate Cd transport in plants. Thus a high expression of such proteins might reduce Cd accumulation in the plant [24]. Exposure of plants to Cd in nutrient solution for 12 h caused significant increase in the expression levels of GRMZM2G175576 in roots, stems and leaves of B73 (Fig. 8). All these results indicate that the putative protein GRMZM2G175576, played an important role in controlling Cd accumulation of maize.

Excess toxic metal may activate defense mechanisms against these toxic metals [25]. Among all defense mechanisms, vacuolar sequestration and plant-mediated bioremediation have received attention [26]. We found that GRMZM2G124103 encoding a vacuolar-type ATPase was located 72.3 kb apart from the peak SNP (SYN25051), sharing 77% amino acid sequence identity with the AVMA10. V-ATPase, a type of H pump, could maintain or adjust the concentration balance of cations on both sides of a membrane, including Na+, Ca2+, and Cd2+ [27]. In a previous report, LbVHA-c1 may confer stress tolerance through tolerance peroxidase and superoxide dismutase activities, protecting membranes from damage and decreasing lipid peroxidation under salt stress [28]. These protein pumps control the movement of ions across the vacuolar membrane, possibly leading to increased Cd tolerance of the plant. GRMZM2G085939, located downstream of PZE-102120786, encodes a calmodulin binding heat shock protein. Heat shock proteins are expressed in organisms at temperatures above the optimum growth temperatures. HSP17 and HSP70 have been involved in root responses of peruvian tomato (Lycopersicon peruvianum) to Cu and Cd [23]. We found that the expression of GRMZM2G085939 significantly increased in roots, stems, and leaves in response to Cd stress, suggesting that GRMZM2G085939 might be involved in protection and repair of the cellular proteins.


In briefly, temperate varieties of maize had twice the Cd concentration as tropical varieties. One region identified by GWAS was co-localized with quantitative trait loci (QTL) and linkage mapping. Three candidate genes, GRMZM2G175576, GRMZM2G124103, and GRMZM2G085939,underlying the major QTL were proposed, including a gene (GRMZM2G175576) homologous to rice genes (OsHMA3, OsHMA2) that functioned similarly in phenotypic response. Future work will include cloning the genes and illustrating the molecular mechanisms for controlling Cd accumulation in maize plants. Meanwhile, the identified QTL region could be utilized for the development of Cd resistant plants.


Plant materials

Two hundred and sixty nine maize accessions for the GWAS were selected from the Southwest China breeding program [29], including 17 parents of major expanded hybrids from Southwest China, 160 newly improved inbred lines, 35 representative inbred lines of temperate germplasm (Reid, Lancaster, etc.), and 62 CIMMYT and US imported lines and tropical germplasm. Detailed information of the accessions is listed in Additional file 9. Meanwhile, an IBMSyn10 DH population [30], including 197 doubled haploid lines, was used for QTL analysis.

Plant growing conditions

Two experiments were conducted for GWAS in maize. The seeding experiment was conducted in pots in a green house at Sichuan Agricultural University (Yaan, Sichuan, China; N 30o08’, E 103o14’) during the months of October, 2015 and April, 2016. Eight seeds of each of 269 accessions were initially sown in plastic pots (22 cm diameter and 28 cm deep) containing 14 kg clayey soil with low Cd and pH of 6.3, and three seedlings were kept after germination. Plants were grown in the greenhouse for 12 d prior to Cd treatment. Cd treatments were applied as Cd solutions with CdCl2·2.5H2O at concentrations of 0 (low-Cd level, available Cd of 3.28 mg·kg−1 in soil) and 0.1 mmol·kg−1 (middle-Cd level, available Cd 18.8 mg·kg−1 in soil). The average air temperatures in the greenhouse were 29 °C/21 °C (day/night). Irrigation was controlled manually during the experiment to avoid loss of mineral elements and Cd caused by excessive water. After 15 d of growth, the third and fourth leaves were harvested for measuring Cd concentration. The greenhouse experiment was a randomized complete block design with two replications.

A field trial for GWAS was conducted in the summers of 2015 and 2016 in Deyang city, Sichuan province of China (104°06’N, 31°11′E). Initially, 269 accessions were grown in a greenhouse for 2 weeks under environmental conditions described above. To ensure and obtain uniform plant growth, seedlings were then visually selected and transplanted to the field for phenotypic trait evaluation. Plants were grown in the contaminated soil with 32.5 mg·kg−1 of Cd (high-Cd level). Each 3 m row contained 14 plants 0.80 m apart. The maintenance of plants in the field followed routine practices of maize during the experiment. After seeds matured, five consecutive plants were chosen from the middle of each row, and the middle leaf below the tassel was harvested for measuring Cd concentration.

For the QTL mapping study, plants of IBMsyn10 DH population were initially grown in a greenhouse for 2 weeks and transplanted to the field. Phenotypic traits were collected from the matured plants of IBMsyn10 DH population in the summers of 2015 and 2016 in Deyang city, Sichuan province of China (104°06’N, 31°11′E). Plant growing conditions and the procedures of plant sampling were the same as described previously. The field experiment followed a complete randomized plot design with two replications for both GWAS and QTL mapping for both years.

Determination of Cd concentration and phenotypic data analysis

All samples were washed thoroughly with tap water to remove soil and dirt, and then were washed with deionized water three times. The leaves were dried and ground until 95%of the sample could pass through a 1 mm screen. The powdered sample (0.5 g) was digested with 20 mL of 67% concentrated nitric acid (HNO3) and 33% hydrogen-peroxide (H2O2) on a heating block at 90 °C for 1 h and 180 °C for 5 h. The digested solutions were filtered after dilution with deionized water. Subsequently, the Cd concentration in the solutions was measured by using the inductively coupled plasma-atomic emission spectrometry (ICP-MS) (Nippon-Jarrell-Ash, Tokyo, Japan).

Analysis of variance (ANOVA) and heritability (h2) of Cd concentration in leaves were performed using SPSS statistics 21.0. Broad-sense (h) was estimated as h2 = σg2/(σg2 + σge2 + σe2), where σg2, σe2, σge2 represent TypeIIISS (sums of squares) for genotype (G), environment (E), and interaction variance of G and E, respectively [31, 32].

Population structure, relative kinship and linkage disequilibrium analysis

In this study, the maizeSNP50K was used to genotype the population as described by Zhang et al. [29]. Through removal of missing rate > 20%, heterozygosity >20% and minor allele frequency (MAF) < 0.05, a total of 43,737 high-quality SNPs were obtained for association analysis.

Population structure (Q matrix) of 269 maize individuals was estimated from a randomly selected set of 5200 high-quality SNPs using STRUCTURE 2.3.4 software with the “admixture model” [33]. The parameter settings for estimating membership coefficients for lines in each subpopulation consisted of a burn-in length of 50,000 followed by 50,000 iterations for each of the clusters (K) from 1 to 10, with each K being run five times. Maximum likelihood and delta K (∆K) tests were used to determine the optimum number of subgroups. The relative kinship (K matrix) between lines was calculated using SPAGeD software [34]. All negative values from this software were set to 0.

Linkage disequilibrium (LD) between pairs of SNPs was estimated by using squared allele frequency correlations between two loci (r2) in TASSEL 5.0 [35]. Standardized disequilibrium coefficient (D’) and common haplotype patterns were assessed in Haploview version 4.2 [36]. Haplotype blocks were defined with the confidence interval method [37].

Genome-wide association analysis

Marker-trait association analysis was performed by TASSEL 5.0 with the General Linear Model (GLM) and Mixed Linear Model (MLM) procedures [35] to control for population structure (Q) and relative kinship (K). The simple linear model (S), Q (Q matrix) model, K (K matrix) model, and Q + K model were applied to assess the suitability of each model for false-positive correction using quantile-quantile (QQ) plots for association analyses. QQ plots and Manhattan plots were generated using the ‘qqman’ package in R. Associations between SNPs and traits were considered significant only if the P-value was lower than threshold Pthreshold = 0.05/N, where N was the total number of SNP markers.

QTL mapping

Genotypic data previously obtained for the IBMsyn10 DH population was used for this QTL mapping [38]. The linkage map included 5955 bins with an average bin size of 344 kb. QTL for leaf Cd accumulation was detected using the composite interval mapping method and Model 6 of the Zmapqtl module of QTL Cartographer 1.17 [39]. The LOD threshold was determined by 1000 permutations and a 10 cM window with scanning intervals at 1 cM interval. Signals were treated as separate QTLs when their peaks were more than 20 cM apart. The support interval of a QTL was defined as the segment of the chromosome in which the LOD at the peak decreased by half [40].

Prediction of candidate genes and expression analysis

We used the maize B73 reference genome (B73 RefGen_v2, [41] to identify candidate genes that were either included or close to the significantly associated SNPs. Based on overlapping regions of GWAS and QTL mapping, a region of approximately 300 kb around the SNP was examined for annotated genes putatively involved in iron transporter and/or regulation by transcription factor [4]. To validate the expression levels of candidate genes obtained by GWAS, expression patterns of candidate genes in different tissues of maize under non-limiting growth conditions were analyzed using online data from MaizeGDB. In addition, the responses of each candidate gene to Cd stress were analyzed by quantitative real-time PCR (qRT-PCR). Briefly, the 2-week plants of the B73 line were grown in ½ Hoagland’s nutrient solution amended with CdCl2·2.5H2O (200 μmol·L−1) for 0 h (control), 12, 24, and 48 h. Total RNA was extracted using TRIZOL reagent (Invitrogen, USA) and RNase-free DNase (Takara, Japan) from the roots, stems and leaves. cDNA was synthesized using 1 mg total RNA from each sample by PrimeScript RT Reagent Kit With gDNA Eraser (Perfect Real Time, Takara, Japan). The primer sequences are shown in Additional file 10. Subsequently, qRT-PCR was conducted using the SYBR premix Ex Taq kit (Takara) on an ABI 7500 Real-Time System (Applied Biosystems) and the expression of GADPH was used for normalization. Three replicates were used to calculate expression levels of candidate genes by using the 2-ΔΔCT method [42] for each sample. Putative protein sub-cellular localization was predicted using ProtComp version 9 ( which compared homologous proteins of known localization and pentamer distributions in the LocDB and PotLocDB databases.



Composite interval mapping


International Maize and Wheat Improvement Center


Linear model


Genome-wide association study


The Cd concentration of leaves under the high-Cd condiction at the maturing stage

IBM Syn10 DH population:

Intermated B73 × Mo17 synthetic 10 doubled haploid population


Relative kinship


Linkage disequilibrium


Logarithm of odds


Cd concentration of leaves under low-Cd condition at the seeding stage


Mixed linear model


The Cd concentration of leaves under the middle-Cd condition at seeding stage


Non-stiff stalk


Population structure




Quantitative trait locus

R2 :

Percentage of phenotypic variation explained by the identified SNPs


Single nucleotide polymorphisms


Stiff stalk


  1. Wagner GJ. Accumulation of cadmium in crop plants and its consequences to human health. Adv Agron. 1993;51:173–212.

    Article  CAS  Google Scholar 

  2. Nawrot T, Plusquin M, Hogervorst J, Roels HA, Celis H, Thijs L, Vangronsveld J, Hecke EV, Staessen JA. Environmental exposure to cadmium and risk of cancer: a prospective population-based study. Lancet Oncol. 2006;7(2):119–26.

    Article  CAS  PubMed  Google Scholar 

  3. Cheng F, Zhao N, Xu H, Li Y, Zhang W, Zhu Z, Chen M. Cadmium and lead contamination in japonica rice grains and its variation among the different locations in southeast China. Sci Total Environ. 2006;359(1–3):156–66.

    CAS  PubMed  Google Scholar 

  4. Uraguchi S, Fujiwara T. Cadmium transport and tolerance in rice: perspectives for reducing grain cadmium accumulation. Rice. 2012;5:5–10.

    Article  PubMed  PubMed Central  Google Scholar 

  5. Lee S, Gynheung AN. Over-expression of OsIRT1 leads to increased iron and zinc accumulations in rice. Plant Cell Environ. 2009;32(4):408–16.

    Article  CAS  PubMed  Google Scholar 

  6. Murakami M, Nakagawa F, Ae N, Ito M, Arao T. Phytoextraction by rice capable of accumulating Cd at high levels: reduction of Cd content of rice grain. Environ Sci Techno. 2009;43(15):5878–83.

    Article  CAS  Google Scholar 

  7. Uraguchi S, Mori S, Kuramata M, Kawasaki A, Arao T, Ishikawa S. Root-to-shoot Cd translocation via the xylem is the major process determining shoot and grain cadmium accumulation in rice. J Exp Bot. 2009;60(9):2677–88.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  8. Ueno D, Yamaji N, Kono I, Huang CF, Ando T, Yano M, Ma JF. Gene limiting cadmium accumulation in rice. Proc Natl Acad Sci U S A. 2010;107(38):16500–5.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  9. Matthew SR, Gang L, Robert JR. The timing of grain Cd accumulation in rice plants: the relative importance of remobilisation within the plant and root Cd uptake post-flowering. Plant Soil. 2011;347(1–2):105–14.

    Google Scholar 

  10. Uraguchi S, Kamiya T, Sakamoto T, Kasai K, Sato Y, Nagamura Y, Yoshida A, Kyozuka J, Ishikawa S, Fujiwara T. Low-affinity cation transporter (OsLCT1) regulates cadmium transport into rice grains. Proc Natl Acad Sci U S A. 2011;108(52):20959–64.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  11. Yamaji N, Xia J, Mitani-Ueno N, Yokosho K, Ma J. Preferential delivery of zinc to developing tissues in rice is mediated by P-type heavy metal ATPase OsHMA2. Plant Physiol. 2013;162(2):927–39.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  12. Ueda Y, Frimpong F, Qi Y, Matthus E, Wu L, Höller S, Kraska T, Frei M. Genetic dissection of ozone tolerance in rice (Oryza sativa L.) by a genome-wide association study. J Exp Bot. 2015;66(1):293–306.

    Article  CAS  PubMed  Google Scholar 

  13. Zhang X, Warburton ML, Setter T, Liu H, Xue Y, Yang N, Yan J, Xiao Y. Genome-wide association studies of drought-related metabolic changes in maize using an enlarged SNP panel. Theor Appl Genet. 2016;129(8):1449–63.

    Article  CAS  PubMed  Google Scholar 

  14. Nicod J, Davies RW, Cai N, Hassett C, Goodstadt L, Cosgrove C, Yee BK, Lionikaite V, McIntyre RE, Remme CA. Genome-wide association of multiple complex traits in outbred mice by ultra-low-coverage sequencing. Nat Genet. 2016;48(8):912–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  15. Zila CT, Ogut F, Romay MC, Gardner CA, Buckler ES, Holland JB. Genome-wide association study of Fusarium ear rot disease in the USA maize inbred line collection. BMC Plant Biol. 2014;14(1):372.

    Article  PubMed  PubMed Central  Google Scholar 

  16. Yang X, Gao S, Xu S, Zhang Z, Prasanna BM, Li L, Li J, Yan J. Characterization of a global germplasm collection and its potential utilization for analysis of complex quantitative traits in maize. Mol Breed. 2011;28(4):511–26.

    Article  Google Scholar 

  17. Ueno D, Kono I, Yokosho K, Ando T, Yano M, Ma J. A major quantitative trait locus controlling cadmium translocation in rice (Oryza Sativa). New Phytol. 2009;182(3):644–53.

    Article  CAS  PubMed  Google Scholar 

  18. Chao DY, Silva A, Baxter I, Huang YS, Nordborg M, Danku J, Lahner B, Yakubova E, Salt DE. Genome-wide association studies identify heavy metal ATPase3 as the primary determinant of natural variation in leaf cadmium in Arabidopsis Thaliana. PLoS Genet. 2012;8(9):e1002923.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  19. Li Y, Zhao S, Ma J, Li D, Yan L, Li J, Qi X, Guo X, Zhang L, He W. Molecular footprints of domestication and improvement in soybean revealed by whole genome re-sequencing. BMC Genomics. 2013;14(1):579.

    Article  PubMed  PubMed Central  Google Scholar 

  20. Zhu C, Gore M, Buckler ES, Yu J. Status and prospects of association mapping in plants. Plant Genome. 2008;1(1):5–20.

    Article  CAS  Google Scholar 

  21. Jin T, Chen J, Zhu L, Zhao Y, Guo J, Huang Y. Comparative mapping combined with homology-based cloning of the rice genome reveals candidate genes for grain zinc and iron concentration in maize. BMC Genet. 2015;16(1):17.

    Article  PubMed  PubMed Central  Google Scholar 

  22. Miyadate H, Adachi S, Hiraizumi A, Tezuka K, Nakazawa N, Kawamoto T, Katou K, Kodama I, Sakurai K, Takahashi H. OsHMA3, a P1B-type of ATPase affects root-to-shoot cadmium translocation in rice by mediating efflux into vacuoles. New Phytol. 2011;189(1):190–9.

    Article  CAS  PubMed  Google Scholar 

  23. Hall J. Cellular mechanisms for heavy metal detoxification and tolerance. J Exp Bot. 2002;53(366):1–11.

    Article  CAS  PubMed  Google Scholar 

  24. Takahashi R, Ishimaru Y, Shimo H, Ogo Y, Senoura T, Nishizawa NK, Nakanishi H. The OsHMA2 transporter is involved in root-to-shoot translocation of Zn and Cd in rice. Plant Cell Environ. 2012;35(11):1948–57.

    Article  CAS  PubMed  Google Scholar 

  25. Sytar O, Kumar A, Latowski D, Kuczynska P, Strzałka K, Prasad M. Heavy metal-induced oxidative damage, defense reactions, and detoxification mechanisms in plants. Acta Physiol Plant. 2013;35(4):985–99.

    Article  CAS  Google Scholar 

  26. Sharma SS, Dietz KJ, Mimura T. Vacuolar compartmentalization as indispensable component of heavy metal detoxification in plants. Plant Cell Environ. 2016;39:1112–26.

    Article  CAS  PubMed  Google Scholar 

  27. Dietz K-J, Tavakoli N, Kluge C, Mimura T, Sharma S, Harris G, Chardonnens A, Golldack D. Significance of the V-type ATPase for the adaptation to stressful growth conditions and its regulation on the molecular and biochemical level. J Exp Bot. 2001;52(363):1969–80.

    Article  CAS  PubMed  Google Scholar 

  28. Xu C, Zheng L, Gao C, Wang C, Liu G, Jiang J, Wang Y. Ovexpression of a vacuolar H+-ATPase c subunit gene mediates physiological changes leading to enhanced salt tolerance in transgenic tobacco. Plant Mol Biol Rep. 2011;29(2):424–30.

    Article  CAS  Google Scholar 

  29. Zhang X, Zhang H, Li L, Ren Z, Liu D, Wu L, Liu H, Jennifer J, Li B, Pan G, et al. Characterizing the population structure and genetic diversity of maize breeding germplasm in Southwest China using genome-wide SNP markers. BMC Genomics. 2016;17(1):697.

    Article  PubMed  PubMed Central  Google Scholar 

  30. Hussain T, Tausend P, Graham G, Ho J. Registration of IBM2 SYN10 doubled haploid mapping population of maize. J Plant Reg. 2007;1:81.

    Article  Google Scholar 

  31. Holland JB, Nyquist WE, Cervantes-Martínez CT. Estimating and interpreting heritability for plant breeding: an update. Plant Breed Rev. 2003;22:9–112.

    Google Scholar 

  32. Samayoa LF, Malvar RA, Olukolu BA, Holland JB, Butrón A. Genome-wide association study reveals a set of genes associated with resistance to the Mediterranean corn borer (Sesamia nonagrioides L.) in a maize diversity panel. BMC Plant Biol. 2015;15(1):35.

    Article  PubMed  PubMed Central  Google Scholar 

  33. Falush D, Stephens M, Pritchard JK. Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics. 2003;164(4):1567–87.

    CAS  PubMed  PubMed Central  Google Scholar 

  34. Hardy OJ, Xavier V. SPAGeDi: a versatile computer program to analyse spatial genetic structure at the individual or population levels. Mol Ecol Notes. 2002;2(2):618–20.

    Article  Google Scholar 

  35. Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics. 2007;23(19):2633–5.

    Article  CAS  PubMed  Google Scholar 

  36. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005;21(2):263–5.

    Article  CAS  PubMed  Google Scholar 

  37. Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumenstiel B, Higgins J, DeFelice M, Lochner A, Faggart M. The structure of haplotype blocks in the human genome. Science. 2002;296(5576):2225–9.

    Article  CAS  PubMed  Google Scholar 

  38. Liu H, Niu Y, Gonzalez-Portilla PJ, Zhou H, Wang L, Zuo T, Qin C, Tai S, Jansen C, Shen Y. An ultra-high-density map as a community resource for discerning the genetic basis of quantitative traits in maize. BMC Genomics. 2015;16(1):1078.

    Article  PubMed  PubMed Central  Google Scholar 

  39. Basten CJ, Weir BS, Zeng ZB. QTL Cartographer, version 1.17, vol. 189. Raleigh: Department of Statistics, North Carolina State University; 2004.

    Google Scholar 

  40. Ruta N, Liedgens M, Fracheboud Y, Stamp P, Hund A. QTLs for the elongation of axile and lateral roots of maize in response to low water potential. Theor Appl Genet. 2010;120(3):621–31.

    Article  CAS  PubMed  Google Scholar 

  41. Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA. The B73 maize genome: complexity, diversity, and dynamics. Science. 2009;326(5956):1112–5.

    Article  CAS  PubMed  Google Scholar 

  42. Schefe JH, Lehmann KE, Buschmann IR, Unger T, Funke-Kaiser H. Quantitative real-time RT-PCR data analysis: current concepts and the novel “gene expression’s CT difference” formula. J Mol Med. 2006;84(11):901–10.

    Article  CAS  PubMed  Google Scholar 

Download references


Authors thank anonymous reviewers for their comments on the manuscript. We would like to thank Thomas Lübberstedt (Iowa State University) for providing the IBMSyn10 DH population.


National Natural Science Foundation of China (31,471,513, and 31,601,236).

Availability of data and materials

All supporting data can be found within the manuscript and its additional files.

Author information

Authors and Affiliations



HL and GP conceived and designed the experiments. XZ, LL, YC, YL, YL, WW, and YL performed the experiments. XZ, YJ and YC supervised manuscript discussion and writing. SG provided pedigree information and genotyping. HL, ZZ, YS, GP conceived the project. All authors discussed the results and commented on the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Guangtang Pan or Haijian Lin.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1: Table S1.

The population structure analysis for accessions using 5200 SNPs by Structure 2.2.3. (XLS 53 kb)

Additional file 2: Figure S1.

Quantile-quantile (QQ) plots for leaf Cd concentration at seeding stage and maturing stage of maize. (PNG 104 kb)

Additional file 3: Table S2.

SNPs significantly associated with leaf Cd concentration in maize. (XLS 40 kb)

Additional file 4: Figure S2.

Manhattan plots of association analysis for leaf Cd concentration at seeding stage of maize in 2016. (PNG 64 kb)

Additional file 5: Table S3.

Haplotype analysis of polymorphic SNPs contained in the overlapped region (153.75–167.58 Mb) on chromosome 2. (XLS 84 kb)

Additional file 6: Figure S3.

The frequency distribution of leaf Cd concentration in maize IBMSyn10 double haploid (DH) population. (PNG 7 kb)

Additional file 7: Table S4.

Details of the candidate genes in the overlapped region and their putative function. (XLS 30 kb)

Additional file 8: Table S5.

The prediction analysis of putative protein sub-cellular localization. (XLS 28 kb)

Additional file 9: Table S6.

Pedigree information of 269 maize accessions used in this study. (XLS 45 kb)

Additional file 10: Table S7.

Primers of qRT-PCR assay used for quantifying expression levels of candidate genes within a QTL interval. (XLS 26 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Zhao, X., Luo, L., Cao, Y. et al. Genome-wide association analysis and QTL mapping reveal the genetic control of cadmium accumulation in maize leaf. BMC Genomics 19, 91 (2018).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: