Genome wide association mapping for heat tolerance in sub-tropical maize

Background Heat tolerance is becoming increasingly important where maize is grown under spring season in India which coincide with grain filling stage of crop resulting in tassel blast, reduced pollen viability, pollination failure and barren ears that causes devastating yield losses. So, there is need to identify the genomic regions associated with heat tolerance component traits which could be further employed in maize breeding program. Results An association mapping panel, consisting of 662 doubled haploid (DH) lines, was evaluated for yield contributing traits under normal and natural heat stress conditions. Genome wide association studies (GWAS) carried out using 187,000 SNPs and 130 SNPs significantly associated for grain yield (GY), days to 50% anthesis (AD), days to 50% silking (SD), anthesis-silking interval (ASI), plant height (PH), ear height (EH) and ear position (EPO) were identified under normal conditions. A total of 46 SNPs strongly associated with GY, ASI, EH and EPO were detected under heat stress conditions. Fifteen of the SNPs was found to have common association with more than one trait such as two SNPs viz. S10_1,905,273 and S10_1,905,274 showed colocalization with GY, PH and EH whereas S10_7,132,845 SNP associated with GY, AD and SD under normal conditions. No such colocalization of SNP markers with multiple traits was observed under heat stress conditions. Haplotypes trend regression analysis revealed 122 and 85 haplotype blocks, out of which, 20 and 6 haplotype blocks were associated with more than one trait under normal and heat stress conditions, respectively. Based on SNP association and haplotype mapping, nine and seven candidate genes were identified respectively, which belongs to different gene models having different biological functions in stress biology. Conclusions The present study identified significant SNPs and haplotype blocks associated with yield contributing traits that help in selection of donor lines with favorable alleles for multiple traits. These results provided insights of genetics of heat stress tolerance. The genomic regions detected in the present study need further validation before being applied in the breeding pipelines. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07463-y.


Background
Maize is the third dominant cereal crop next to rice and wheat worldwide. However, in the current scenario climate change poses serious threat to maize productivity. Among the environmental stresses, heat stress (HS) is the more demanding problem which hampers maize production and there is an expectation that heat stress will further reduce the crop yields. It is predicted that at the end of twenty-first century, climate of the world will be evident to increase up to 2-4°C. It has been forecast that upcoming catastrophe of heat stress will affect the tropical and subtropical regions of the world more based on global climate model analysis [1]. It has been described that central and eastern Asia, central North America and northern part of Indian subcontinents will be more liable regions to suffer from heat stress for growing maize and other crops [2]. It has been reported that global maize yield potential decreases to 45% by 2080s as compared to 1980s at extreme heat stress during anthesis [3].
Heat stress is a major concern to physiologists and plant breeders as nature and intensity of damage to crop varies extensively across the growing seasons due to its complex inheritance. Plant phenology, developmental phases, growth rates, yield components and final yield of plant are critically affected by thermal regimes. Other than morphological changes, several physiological and biochemical changes (photosynthetic acclimation, stalk sugar mobilization, chlorophyll content), and reproductive organ malfunctioning (low pollen viability, silk receptibility, lack of fertilization, embryo abortion, shrunken kernels) are known to be associated with HS in maize. Prolonged anthesis-silking interval, reduction in kernel set [4][5][6][7][8], decreased photosynthesis rate [4,9,10], damaged cellular membrane [11] and decreased chlorophyll content [12,13] have been reported under heat stress. Therefore, breeding for heat tolerance in maize is crucial for sustainable productivity. Improved heat tolerance will increase resource use efficiency by reducing levels of irrigation and will increase resilience of yield in the face of the more variable and warmer climatic conditions predicted by climate change models. However, most of these traits are not used in breeding programme as selection indices because the secondary traits data recording is time consuming and requires high end instruments. Hence, alternate cost-effective methods must be worked out for robust phenotyping of these traits along with most influenced primary morphological traits like leaf firing, tassel blast and anthesis-silking interval. Exploitation of these robust phenotypic data could be done with the help of advanced genomic techniques for breeding new cultivars that are "climate resilient".
Genomics studies along with phenotypic information can provide knowledge to the breeders that they need to make more rapid selections and application of advanced breeding strategies to produce climate-resilient crops. It is a promising tool for identifying genes or QTLs underlying heat responsive traits for translating the stress responsiveness of crop species towards marker-assisted selection approaches. Thus, analysis of genetic control of heat stress via QTL or association mapping would accelerate maize breeding programs [14]. Genome wide association studies (GWAS) would detect genomic regions controlling candidate genes by conducting statistical association between DNA polymorphisms and trait variations in diverse collection of germplasm. Genotyping-bysequencing (GBS) platform generates millions of SNPs distributed throughout the genome in a cost-effective manner for conducting effective GWAS [15,16]. Together with next generation sequencing and GWAS, mapping resolution of accurate position of genes/alleles/ QTL has increased [17][18][19]. GWAS not only facilitated to identify the marker-trait association but also refined our understanding of the genetic architecture of complex quantitative traits [20,21]. GWAS analysis have been reported in maize for flowering time [22], kernel shape, 100 seed weight [23], kernel quality [24], functional mechanisms related to drought [25] and a number of other target genes for crop improvement [26,27], root system architecture traits [28,29] and key traits of plant lodging and leaf angle [30]. Thus, the present study aims to (i) explore the genetic variation for heat stress responsive traits in the doubled haploid (DH) association mapping panel under normal and heat stress conditions across the environments, (ii) identify genomic regions associated with the heat tolerance traits through GWAS and (iii) identify candidate genes associated with heat stress tolerance.

Results
Phenotypic analysis and genetic correlation among the traits Substantial significant variability was observed among the association mapping panel of DH lines for the agronomic traits under normal and heat stress conditions on pooled analysis of two years ( Table 1). The phenotypic range was large for each trait in both the conditions and it's indicated that there is wide range of diversity within the association mapping panel (Table 1). Frequency distribution of the lines for the investigated traits at normal and heat-stressed growth conditions are presented in an Additional file 1and 2: Figures S1 and S2. The mean performance under normal conditions was significantly different for each trait under study from the mean performance under heat stress conditions. The mean value of anthesis-silking interval (ASI), plant height (PH) and ear height (EH) was 3.6 days, 175.44 cm and 80.71 respectively, under normal conditions. While mean value of ASI, PH and EH was 5.63 days, 149.13 cm and 62.74 respectively, under heat stress conditions. Broad-sense heritability for all traits ranged from 0.41 to 0.77 under normal conditions and significantly decreased to 0.07-0.42 under heat stress ( Table 1). The mean performance and statistics descriptive of days to 50% anthesis (AD) under heat stress was not presented in Table 1 as heritability was found to be zero. This may be due to opposite variation found in two years (2016 and 2017), which lead to cancellation of each other. Under heat stress, leaf firing and tassel blast were more prominent as compared to normal conditions (Data was not presented in paper). Some lines showed higher ASI and few lines had no silk emergence under heat stress conditions. Average grain yield (GY; 3.46 t/ha) was low in heat stress environment as compared to normal environment conditions (5.45 t/ ha). Significant genotypic and genotypic × environment (Gen × Env) interactions were observed among the traits under study except ear position (EPO) both under nonstress and heat stress conditions. The results clearly indicated that the traits under studied were affected by high temperature.
The traits viz., AD, SD (days to 50% silking) and ASI showed negative and significant relationship with GY in both normal and heat stress conditions (Table 2). This illustrated that prolonged ASI (> 5 days) will results in increase in grain yield reduction. Under normal and heat stress conditions, the traits: PH, EH and EPO displayed positive and significant association with GY. The correlation analysis inferred significant association among the traits except AD and SD that showed negative and nonsignificant association with EPO under normal conditions.

Principal component analysis
The objective of principal component analysis (PCA) is to measure the variance that exists in genome-wide markers. The first four principal components (PCs) explained 28.7% of the total variance (Additional file 3: Figure S3). The PC1, PC2, PC3 and PC4 explained 16.2, 6.8, 3.6 and 2.7% of the total variance, respectively. The panel with 662 DH lines revealed only a moderate population structure with the first four principal components using genome-wide markers.

Genome wide association mapping based on the SNPs
The traits which showed heritability greater or equal to 0.20 were used for genome-wide association studies. A total of 187,000 SNPs was employed for GWAS analysis. The quantile-quantile (QQ)-plot was used to assess the significance of SNPs at a threshold level using MLM   From the QQ-plot, it was evident that most of the SNPs were not associated with the traits under study. The presence of spurious associations was shown by deviation from the diagonal line due to population structure and familial relatedness while the SNPs on the upper section of the graph deviated from the diagonal were most likely to be associated with the traits. A total of 130 highly significant marker-trait associations (P = ≤10 − 5 ) were observed for the target traits under normal conditions ( Table 3, Fig. 2a-g, Additional file 4: Table  S1). Out of which, twenty-five major SNPs were associated at p-value 10 − 5 to 10 − 7 for GY localized on chromosomes 1, 3, 7 and 10 ( Fig. 2a). The phenotypic variation expressed by these SNPs ranged from 21.90 to 23.23%. Seventeen significant SNPs detected on chromosomes 1, 3, 6 and 10 were related to AD at p-value 10 − 5 and phenotypic variation explained by these SNPs ranged from 45.45 to 45.78% (Fig. 2b). Nineteen SNPs for SD trait identified on chromosomes 1, 3, 4, 6, 7 and 10 accounting phenotypic variation from 36.69 to 39.21% (Fig. 2c) whereas twenty-five significant SNPs present on chromosomes 1, 6, 8, and 10 were identified for ASI at p-value 10 − 5 to10 − 7 (Fig. 2d) with variance of 4.39 to 5.85% (Fig. 2g). Under heat stress, a total of 46 SNPs significantly associated with the yield contributing traits were identified ( Table 3, Fig. 2h-k, Additional file 4: Table S1). Twelve SNPs present on chromosomes 1, 3, 6, 7 and 10 for GY were detected (Fig. 2h). The phenotypic variation documented by these SNPs ranged from 18.14 to 18.65%. In case of ASI at p-value 10 − 5 to10 − 6 seventeen significant SNPs were identified on chromosomes 1, 2, 3 and 6 and the proportion of variation ranged from 27.81 to 28.53% (Fig. 2i). On chromosomes 3, 8 and 10 fourteen significant SNPs were found for EH explaining phenotypic variation from 25.67 to 26.27% (Fig. 2j). Only three significant SNPs on chromosome 8 were detected for EPO Fig. 1 Quantile-qantile (Q-Q)-plots showing inflation of estimated -log10 P-values on the X-axis versus observed -log10 P-values on the Y-axis for the traits using MLM model under pooled normal (A-G) and heat stress conditions (H-K). GY Grain yield, AD Days to 50% anthesis, SD Days to 50% silking, ASI Anthesis-silking interval, PH Plant height, EH Ear height and EPO Ear position at p-value 10 − 5 with proportion of variation from 21.56 to 21.95% (Fig. 2k). These results showed so many haplotypes in SNPs marker-trait association studies because of DH population as only one recombination occurred during generation of DHs.
From breeding point of view, SNPs that were associated with more than one trait gains more importance. In present study, 15 SNPs were co-localized for multiple agronomic traits and 9 SNPs were present within the gene model ( Table 4). Out of which, six SNPs (S6_156, 527,428, S6_156,527,431, S6_156,527,432, S1_4,752,039, S6_156,527,380 and S1_228,565,627) were found commonly associated with AD and SD as evident from strong correlation between these two traits ( Table 2). The proportion of phenotypic variation explained by these SNPs ranged from 38.71 to 45.78%, respectively. The SNP, S10_7,132,845 was linked with AD, SD and GY. The phenotypic variation accounted by this SNP ranged from 22.91-45.51%. Two SNPs (S10_8,852,411 and S10_9,473,175) showed association with ASI and SD; one SNP (S10_1,888,234) with EH and GY; two SNPs (S10_1,905,273 and S10_1,905,274) with PH, EH and GY; one SNP (S10_10,826,645) with GY and SD; and two SNPs (S10_1,148,841 and S10_1,883,817) with GY and PH. Among these associations for agronomic traits, favorable associations have been often observed on chromosomes 6 and 10 suggesting the presence of either a common gene or gene clusters responsible for heat stress tolerance.
A total of 20 and 6 significant haplotypes were detected, which control more than one trait under normal and heat stress conditions, respectively (Tables 6    observed on chromosomes 8 and 10 which may be due to pleiotropic effect or presence of gene clusters that are responsible for heat stress tolerance.

Candidate genes associated with the target traits
Based on SNP and haplotypes genome wide association mapping, 9 and 7 candidate genes were identified and annotated with different functions (Additional file 5: Table S2). Out of which, GRMZM5G877815 detected for AD and SD, GRMZM2G031624 for AD, SD and GY, GRMZM2G438176 and GRMZM2G048850 for ASI and SD, AC198366.3_ FGT004 for EH and GY, GRMZM2G104620 for EH, GY and PH, GRMZM2G418432 for GY and SD, GRMZM2G057557 and GRMZM2G317287 for GY and PH. The SNPs (S4_4748055, S10_1,148,841, S10_1902587, S1_ 225496598, S5_213280266, S8_146060437, S8_170952700 and S8_102137856) involved in haplotype blocks were colocalised within the gene model: GRMZM2G583593, GRMZM2G057557, GRMZM2G104620, GRMZM2G0565 94, GRMZM2G018484, GRMZM2G109144, GRMZM2 G379128 and GRMZM2G887068. Out of 7 candidate genes, GRMZM2G583593 was detected for GY, ASI and SD; GRMZM2G056594, GRMZM2G018484 and GRMZM2 G109144 for AD and SD while GRMZM2G379128 and GRMZM2G887068 for EP and EPO.

Discussion
Heat stress due to rise in temperature beyond optimum is a major threat for sustainable production of crops and shortening of cropping periods [31]. Climatic model analysis delineates central and eastern Asia, central North America and northern part of Indian subcontinent as the major heat stress prone regions. Maize is highly sensitive to heat stress during the months of April end and May which coincide with grain filling of the crop, and results in tassel blast, reduced pollen viability, fertilization failure and barren ears that causes devastating yield losses [32]. Since, heat tolerance is a polygenic trait and is more prone to genotype-environment interactions, agronomic interventions could do a very little to alleviate this stress. Breeding for heat stress tolerance is the most economical approach to challenge climate change globally [33,34]. To meet this challenge advances in genomics assisted pre-breeding approaches by identifying superior alleles from well adapted genetic resources and their utilization in breeding programs is now widely recommended [35]. We conducted GWAS to map QTLs and identified SNP markers associated with heat tolerance using DH mapping panel for various agronomic traits under normal and heat stress conditions. Exploiting the variability and identification of  significant marker-trait associations based on GWAS might provide a platform for heat stress tolerance in maize breeding through marker-assisted introgression of heat tolerance alleles into elite germplasm. All the traits were adversely affected by high temperature as indicated by significant reduction in PH, EH and GY whereas ASI increased dramatically under heat stress conditions as compared to normal conditions in the present investigation. This might be due to decreased partitioning of assimilate to the ear and low pollen viability. The mechanisms of increased ASI under heat stress conditions have been discussed by Cicchino et al. [36]. Reduction of GY up to 70% under heat stress has been reported by Khodarahmpour and Choukan [37] which might be due to low pollen viability, silk receptivity and longer ASI duration. Similar observations have been made in our study. Leaf firing and tassel blast were observed among the panel under heat stress as heat waves resulted in cell injury which led to chlorosis and death of the tissue that affected the photosynthetic apparatus and reduction in GY [38]. Broad-sense heritability was low for the traits under heat stress as compared to moderate under normal conditions. So, it could be hypothesized that genotypic variability and genotypeenvironment interactions played an important role for the expression of a particular trait under heat stress and normal conditions. Previous results depicted significant high heritability for traits viz., GY, AD, SD, ASI, leaf firing and tassel blast for hybrids [39] and for inbred lines [40] under heat stress conditions. However, our results strongly agreed with the findings of Noor et al. [40]. We found low to high positive (r between 0.22-0.99) and significant correlation between PH and SD (0.99), SD and EH (0.82), EH and PH (0.95), and PH and EPO (0.95) under heat stress whereas EH and PH (0.72), EH and EPO (0.88), and AD and SD (0.87) showed strong positive association under normal conditions. The positive significant associations were observed among traits suggested that these traits might share some common genomic regions. Similar significant relationships for agronomic traits under normal and stress conditions were reported in previous studies [26,28,41]. Moderate to strong negative correlations was observed between ASI and other traits (PH, EH, EPO, GY) under both normal and heat stress conditions. The associating mapping panel showed moderate principal components within it. Warburton et al. [42] also observed large amount of diversity within, rather than between source populations due to the heterogeneous nature of CIMMYT populations from which most of the sub-tropical and tropical lines have been derived. The moderate structure that was observed in the present study panel may be due to the inclusion of DH lines derived from crosses of selected inbreds. Before proceeding for GWAS analysis, it is necessary to use appropriate models to control false associations which could confound association mapping due to the population structure and the kinship relationship within the panel [43].
In present study, a set of 176 SNPs were associated with the various agronomic traits under normal and heat stress conditions. Out of these, some SNPs existed within different gene models whose genetic role is associated with heat stress mechanism. It has been postulated that genes associated with the target traits significantly could be resequenced from contrasting lines to identify favorable alleles for trait improvement and SNP/InDELs could be potentially converted to simple PCR-based markers to follow MAS in molecular breeding [44,45]. Several studies have been made in maize for identification of associated SNPs with a particular trait  It could be concluded that there is switch off mechanism for heat stress tolerance under stress conditions as compared to normal conditions. Xue et al. [26] identified 42 Hap_10.3 10 GY S10_1902587, S10_1902592, S10_1902657 2 0.134391 5.94E-17 Hap_10.1 EH S10_1902587, S10_1902592, S10_1902657 2 0.086085 1.07E-10 Hap_10.4 10 GY S10_1905237, S10_1905245, S10_1,905,273, S10_1, 905,274 3 0.071581 3.63E-09 GRMZM2G104620 Hap_10.3 PH S10_1905237, S10_1905245, S10_1,905,273, S10_1 Chr Chromosome, GY Grain yield, AD Days to 50% anthesis, SD Days to 50% silking, ASI anthesis-silking interval, PH Plant height, EH Ear height and EPO Ear position SNPs located in 33 genes associated with 126 traits × environment × treatment combinations and three SNPs were co-localized to drought-related QTL regions. One of the genes GRMZM2G125777 that encodes NAC domain containing protein 2, a transcription factor expressed in different tissues, was strongly associated with ear relative position, hundred kernel weights and timing of male and female flowering [26]. Similarly, Wang et al. [41] identified 206 significant SNPs associated with 115 candidate genes for drought tolerance and related traits. Zaidi et al. [28] revealed 67 significantly associated SNPs for root structural traits whereas Frey et al. [46] revealed 607 heat responsive genes as well as 39 heat tolerance genes. Compared to these studies, our results depicted only nine candidate genes because we had looked only for the colocalized SNPs. It might be possible to predict a greater number of candidate genes if we considered all the 130 SNPs. On the other hand, under heat stress we identified 46 SNPs associated with ASI, EH, EPO and GY distributed on chromosomes 1, 2, 3, 6, 7, 8 and 10 with variance of 18.14-35.69%. It is surprised to conclude that none of the SNP colocalized with multiple traits under heat stress.
Haplotype-based analysis for identifying markerphenotype associations is beneficial for the genetic dissection of loci underlying complex traits [47,48]. As compared to individual SNP-based associations, haplotype-based analysis could lessen the number of multiple evaluations since haplotype could cluster SNPs from the LD configuration [30]. Also, in this study, as DH lines formed from F 1 source populations were used in the AM panel, and hence the LD blocks are expected to be large in the panel, which can have an impact on mapping resolution. Haplotype analysis, which takes into consideration this LD structure, will be extremely useful in such cases. In our study, a total of 210 haplotypes were identified at Bonferroni P = ≤0.05 for the various agronomic traits under normal and heat stress conditions. A total of 40% haplotypes were significantly associated with ASI, EH, EPO and GY under heat stress. The present study illustrated that both SNP-based and haplotype-based association mapping are robust for deciphering the genetic architecture of heat stress using high density SNPs. Comprehensively, 9 and 7 candidate genes, respectively, were identified from SNP-based and haplotype-based association mapping. A total of 38% significantly associated haplotypes comprising one or more SNPs were identified by Contreras-Soto et al. [48] while Chen et al. [35] identified 15 haplotypes significantly associated with Fusarium ear rot (FER) resistance and each SNP had comparatively small additive effects on disease resistance explaining 1-4% of phenotypic variation. Similarly, Rashid et al. [49] found 12 SNPs and four haplotype blocks associated with sorghum downy mildew in maize and one of the SNP S2_6154311 present on chromosome 2 contributed 26.7% of the trait variation. Likewise, in the present study we identified 25 haplotypes for GY and ASI under normal conditions as compared to 12 and 17 haplotypes for GY and ASI, respectively, under heat stress. Carlos et al. [30] identified 40 haplotype blocks that were significantly associated with leaf angle and one of the haplotypes hapLA4.04 accounted up to 29% of the phenotypic variation which was consistent over two seasons.
A SNP, S10_1,905,274 and three blocks (Hap_10.4, Hap_10.3 and Hap_10.2) present on chromosome 10, was co-localized with GY, PH and EH under normal conditions and this SNP was confined to gene model, GRMZM2G104620. Almeida et al. [50] reported that chromosome 7 at 123.61 to 132.68 Mb and chromosome 3 at 169.75 to 178.23 Mb genomic regions have been associated with drought tolerance or drought adaptation for grain yield and ASI under well-watered and drought stress conditions as these genomic regions harbor 5 and 6 meta QTLs [50]. In the present study, the SNPs found on chromosomes 3 and 7 did lies within the given base pairs. Hence, it needs further thorough study and validation of the associated SNPs to be utilized in the breeding programs for developing heat tolerance cultivar. We were able to identify the marker trait associations (MTAs) under heat stress and to pinpoint the involvement of different chromosomal regions due to normal and heat-stressed conditions. Some significant SNPs and haplotype blocks were found within different gene models such as 40S ribosomal protein subunit, universal stress protein domain containing protein, expressed protein, protein kinase family protein, GRAS family transcription factor containing protein, GATA zinc finger domain containing protein, SAP domain containing protein, outer membrane protein OMP85 family, pentatricopeptide repeat super family protein, Pheophorbide a oxygenase chloroplast precursor and U-box domain-containing protein.
These genes are involved in various cellular and metabolic processes in crops. One of the candidate genes, GRMZM2G887068, detected under heat stress for EH and EPO has a physiological role associated with ion, scavenging hypoxia responses, cellular mobility and regulation of cell growth and development, and role in resistance to multiple stresses [51]. Other predicted candidate genes were associated with different traits under normal conditions. The candidate genes significantly associated with GY + other traits were GRMZM2G31624, GRMZM2G104620, AC198366.3_FGT004 & GRMZM 2G317287, GRMZM2G057557 and GRMZM2G418432 encoding U-box domain-containing protein, expressed protein, GRAS family transcription factor containing protein expressed, outer membrane protein OMP85 family putative expressed and protein kinase family protein, respectively (Additional file 5: Table S2). These genes are involved in regulation of plant growth and development and protein translocation. The respective genes could be targeted in maize breeding programs for enhancement of multiple traits simultaneously.

Conclusions
Significant SNPs and haplotype blocks found associated with yield related traits, directly or indirectly related to heat stress mechanism, which would help in selection of donor lines or lines with favorable alleles for several traits. Highest number of SNPs were identified on the chromosome 8 (47 SNPs) followed by chromosome 10 (46 SNPs), chromosome 1 (36 SNPs), chromosome 3 (25 SNPs), chromosome 7 (19 SNPs) and chromosome 6 (10 SNPs) for the various agronomic traits under normal and heat stress conditions. Before introducing to breeding pipelines, significant SNPs and haplotype blocks need further validation and it will be great help for understanding of complex genetic architecture traits under heat stress.

Plant materials
A total of six hundred and sixty-two doubled haploid (DH) lines test crosses derived with a heat susceptible tester line (CML-474) were received from CIMMYT-Asia Regional Maize Program, Hyderabad, India. CML474 belongs to heterotic group B that is why it was used as male. The evaluation of the test cross with susceptible tester always reflects the female line effect. The DH lines were derived from nine bi-parental crosses involving elite plant material of the working germplasm from CIMMYT-Asia. The female parental lines involved in the cross have been identified as potential heat tolerant elite lines in previous experiments, while the male parental lines are susceptible but elite lines of the program. The DH lines are being maintained with CIMM YT-Asia maize breeding program hub based out of Hyderabad, India, which are available for research on request. A detailed description of these lines has been provided in Additional file 6: Table S3.

Phenotypic evaluation of the association mapping panel
The DH lines were phenotypically evaluated at Punjab Agricultural University, Ludhiana (Punjab), a hot spot location for heat stress in India. Ludhiana is geographically situated at 30.91°N latitude and 75.85°E longitude and at an altitude of 256 m above sea level. Field-based assessment of panel was done during spring seasons (February-May) 2016 and 2017. The experiment was conducted in Alpha Lattice design with two replications each under normal sown and late sown conditions. Each entry was represented by single row of 3 m length with a spacing of 65 cm between rows and 20 cm between plants. The late sowing trial was delayed (1st week of March) to ensure maximal heat stress during flowering and grain formation during April/May months. Control or normal sowing was done on 1st week of February with same set of lines. As per meteorological data, late sown genotypes (natural heat stress) were coincided with high temperature during plant developmental and pollination stages (Additional file 7: Table S4).

Observation recorded
During the growing season, data was recorded for grain yield contributing traits and secondary traits. Days to 50% anthesis (AD) and silking (SD) were recorded as the date when half of the plants extruded the first anthers and silks in the plot. Anthesis-silking interval (ASI) was calculated as the difference between AD and SD. Leaf firing and tassel blast was recorded as the percentage of plants in a plot with leaf firing and tassel blast symptoms. Plant height (PH) was measured from the soil surface to the base of the tassel (excluding tassel length) and ear height (EH) was measured from the soil surface to the base of the ear. Ear position (EPO) was calculated as the ratio of plant height and ear height. Grain yield (GY) was recorded in terms of ear weight per plot immediately after crop harvest at 12.5% grain moisture content and 80% shelling percentage and transformed to tons per hectare.

Phenotypic data analysis
Genotypic (σ 2 Gen), error (σ 2 e), and genotype × environment (σ 2 Gen × Env) variance components for the phenotypic data were estimated from analysis of variance (ANOVA) and other statistical analysis were performed using standard procedures with R (METAR) [52]. The linear model used for the analysis is where Y ijkl represents the trait, μ is the mean effect, Env i represents the effect of ith environment Rep j (Env i ) is the effect of the jth replicate in environment i, Block k (Env i Rep j ) is the effect of kth incomplete block within the jth replicate in ith environment, Gen l is the effect of the kth genotype, Env i × Gen l represents the genotype × environment effects and ε ijkl is the residual associated with the ith environment, jth replication, kth incomplete block and lth genotype, which is assumed to be normally and independently distributed, with mean zero and homocedastic variance. The best linear unbiased predictors and variance components were estimated considering all the effects in the model as random. Broad-sense heritability (H 2 ) of the traits was estimated using the formula [53]: Where, σ 2 Gen = genotypic variance, σ 2 Gen × Env = genotype × environment variance, σ 2 e = error variance, l = number of environments and r = number of replications.
Genotyping DNA was extracted from leaf samples of three to 4 weeks old seedlings using the standard CIMMYT laboratory protocol [54]. The nine DH populations under study were genotyped using Genotyping-by-Sequencing (GBS) approach at the Institute for Genomic Diversity, Cornell University, Ithaca, NY, USA [55]. Genomic DNA was restricted with ApeK1 and GBS libraries were generated in 96-plex which were further sequenced on Illumina HiSeq 2000. TASSEL-GBS pipeline was used for SNP calling with B73 as the reference genome [56]. Physical co-ordinates of all SNPs were ascertained from B73 AGPV2 reference genome. The genotypic data of approximately 22,000 maize samples comprises 955,690 SNPs across all the chromosomes is available at https:// hdl.handle.net/11529/10548550 in the imputed GBS SNP dataset. A total of 187,000 SNPs for 662 DHs from the above mentioned dataset was used for GWAS which met the filtering criteria of Call Rate (CR) ≥0.7 and Minor Allele Frequency (MAF) ≥0.1.

GWAS analysis
The principal component analysis (PCA) was computed using the R package GAPIT (Genome Association and Prediction Integrated Tool) to estimate the number of subpopulations within the DH lines [57]. For each trait, BLUPs values of 643 DHs under normal and heat stress conditions across the years were used for genome-wide association studies (GWAS) (Additional file 8: Table S5 and Additional file 9: Table S6). The traits which had heritability equal to 0.2 or greater were used for further GWAS analysis as it was considered as moderate heritability for traits under abiotic stress management. Mixed linear model (MLM) was implemented for detecting significantly associated SNPs using R package GAPIT [57]. The false associations were corrected using both principal components (PCs) and kinship matrix for high precision. The combination of all PCs has greater power to detect the genetic variants (SNPs) with more robustness. The quantile-quantile (Q-Q)-plot and Manhattan plot were generated with customized R scripts to identify the SNP-trait associations [57]. Significant traits-SNPs associations were selected based on an arbitrary but high threshold cut-off P value (≤10 − 5 ) [58][59][60]. The location of each associated SNP within gene was attained from the Maize GDB genome browser (www.maizegdb.org) and functional gene annotations were retrieved from http://ensembl.gramene. org/Zea_mays.

Haplotypes trend regression
Haplotype frequency estimation was done among all the SNPs within the bottom 0.1 percentile of the distribution in the GWAS [60] for all traits, using 50 Expectation Maximization (EM) algorithm with EM convergence tolerance of 0.0001 and a frequency threshold of 0.01 [61] as implemented in SVS V_8.6.0 (Golden Helix, Inc., Bozeman, MT, www.goldenhelix.com) [49]. Haplotype blocks were identified to minimize historical recombinations based on the block-defining algorithm [62].
Stepwise regression of the trait phenotype with the haplotypes and SNPs was performed with forward elimination and a P-value cut-off of 0.001 [48]. A Bonferroni correction P value cut off of less than or equal to 0.05 was used to define significant association. Additional file 9: Table S6. Best linear unbiased prediction value (BLUPs) for DH population under heat stress conditions across two years.