Genome-wide association study reveals significant genomic regions for improving yield, adaptability of rice under dry direct seeded cultivation condition

Background Puddled transplanted system of rice cultivation despite having several benefits, is a highly labor, water and energy intensive system. In the face of changing climatic conditions, a successful transition from puddled to dry direct seeded rice (DDSR) cultivation system looks must in future. Genome-wide association study was performed for traits including, roots and nutrient uptake (14 traits), plant-morphological (5 traits), lodging-resistance (4 traits) and yield and yield attributing traits (7 traits) with the aim to identify significant marker-trait associations (MTAs) for traits enhancing rice adaptability to dry direct-seeded rice (DDSR) system. Results Study identified a total of 37 highly significant MTAs for 20 traits. The false discovery rate (FDR) ranged from 0.264 to 3.69 × 10− 4, 0.0330 to 1.25 × 10− 4, and 0.0534 to 4.60 × 10− 6 in 2015WS, 2016DS and combined analysis, respectively. The percent phenotypic variance (PV) explained by SNPs ranged from 9 to 92%. Among the identified significant MTAs, 15 MTAs associated with the traits including nodal root, root hair length, root length density, stem and culm diameter, plant height and grain yield were reported to be located in the proximity of earlier identified candidate gene. The significant positive correlation of grain-yield with seedling establishment traits, root morphological and nutrient-uptake related traits and grain yield attributing traits pointing towards combining target traits to increase rice yield and adaptability under DDSR. Seven promising progenies with better root morphology, nutrient-uptake and higher grain yield were identified that can further be used in genomics assisted breeding for DDSR varietal development. Conclusions Once validated, the identified MTAs and the SNPs linked with trait of interest could be of direct use in genomic assisted breeding (GAB) to improve grain yield and adaptability of rice under DDSR. Electronic supplementary material The online version of this article (10.1186/s12864-019-5840-9) contains supplementary material, which is available to authorized users.


Background
Developing resource-water-labor-energy efficient, dry direct seeded adapted rice varieties under reducing water-labor availability in agriculture is necessary for the sustainability of world's future food security [1]. The continuous efforts involving yield improvement under direct seeded conditions through better exploitation of genetic and genomic resources may lead to a breakthrough in improving rice yield and adaptability under dry direct seeded rice (DDSR) situation. The rapid development of next-generation sequencing (NGS) technologies has opened new avenues to harness existing genetic diversity to better understand the genetic basis of targeted traits and to deploy in genomics-assisted breeding (GAB) program [2,3]. Advanced genotyping technologies offer opportunities to plant breeding community to sequence every individual of GAB in several folds of crop genome at lower cost [4].
Water-saving technologies such as SRI (system of rice intensification) [5], AWD (alternate wetting and drying) [6] and DDSR (dry direct seeded rice) [7] can be advocated as suitable alternative to PTR. DDSR has the potential to ensure overall sustainability of rice cultivation systems when applied as a full package comprising laserassisted land levelling, conservation tillage, crop establishment, harvesting and processing using smart machines, integrated weed management, precision input delivery especially of water and nutrients and suitable varieties. DDSR accounted for 28% area in India, 45% in Korea, 39 to 47% in Vietnam and more than 90% collectively in United States, Sri Lanka, and Malaysia [8]. The mechanized DDSR method of rice cultivation has been estimated to provide a potential irrigation water savings of 40 cm ha − 1 , labor savings of 25 person-days ha − 1 , energy savings of 1500 MJ ha − 1 , a reduction of GHG emissions of 1500 kg CO 2 equivalent ha − 1 , yield increase of 0.5 t ha − 1 and increased net economic return of USD 50 ha − 1 with positive effect on soil heath in most of the rice-growing countries [8]. DDSR combines the advantage of better adaptation of upland rice varieties in aerobic soil and high yield potential of lowland varieties adapted to anaerobic soil.
A number of plant morphological features such as leaf size, thickness, angle, shape [9] play a remarkable role in photosynthesis [10], flowering time [11][12][13], lodging resistance [14] and plant height [15] and ultimately determine the crop productivity. The phenotypic root plasticity, root morphology and plant architecture largely affect crop productivity and therefore represent the key targets for the marker-assisted breeding schemes designed for increasing yield, biotic/abiotic stress tolerance and grain quality [16][17][18]. DDSR faces the key challenge of reduced nutrient uptake especially nitrogen, phosphorus and iron [19] than traditional puddled rice cultivation system. The lower uptake of nutrients create adverse effect on plant agronomic traits such as height, growth rate, tillering ability, leaf area index, spikelet number, grain filling and physiological processes such as photosynthesis, respiration and hormonal metabolism [20]. Lodging has been found as one of the major constraints in attaining high yield in DDSR compared to puddled system of rice cultivation. Moderate plant height, culm and stem diameter, thickness, length and strength are important traits contributing to lodging resistance under DDSR [21]. Root development and root system architecture is highly responsive to nutrient availability [18,19,22] and root traits reported to be associated with water uptake and grain yield in rice [23,24], wheat [25], maize [26] and soybean [27]. Addressing the challenge of appropriate plant and root architecture, efficient nutrient uptake, early vegetative vigor, early and uniform emergence, lodging resistance under DDSR and dissecting the genetic basis to maximize the potential to breed high-yielding resource-efficient DDSR varieties using modern biotechnological and bioinformatic approaches is mandatory. A better knowledge of rice root system including number of nodal roots, lateral roots, root length density, root biomass and root branching with more root hairs is must. The proper understanding of the relationship of these root morphological traits with nutrient uptake and grain yield may be important to the development of rice varieties with flexibility to be grown under variable situations ranging from wet and dry direct seeded to transplanted situations.
Genome wide association mapping has been reported as an effective method to fine map the complex traits contributing to productivity in crop species [34,38]. GWAS can be applied to study the statistical marker-trait associations involving diverse germplasm, landraces, multi-parent breeding population and NAM (nested association mapping population) populations [39]. Identification of costeffective, easy to use, widely distributed, co-dominant, phenotype-associated and regulatory SNPs, candidate genes and regulatory pathways could represent a significant milestone to accelerate the global hunt to improve rice yield under DDSR. Crop improvement programs can use association studies to access the allelic diversity and to identify the best alleles to be assembled in developing superior DDSR adapted rice varieties. A well-defined image of traits needed for DDSR, linked SNPs and functional characterization of underlying candidate genes will help rice breeders to choose specific donors, recipients and traits to undertake a systematic breeding program enhancing adaptability and improving rice yield under DDSR.
In the present study, we performed GWAS for 39 traits including seedling establishment traits, root and plant morphological traits, yield and yield attributing traits in a MAGIC (Multi-parent advanced generation inter-cross) population with the aim to identify significant marker-trait associations to be used directly to breed high yielding and nutrient-efficient rice varieties for DDSR condition.

Development of MAGIC population
The MAGIC population used in the study was developed by the inter-crossing of five diverse parents: IR  Table 1. These parents were intermated and recombined through further selfing to develop a MAGIC population comprised of 300 progenies. The detailed schematic representation of the development of the MAGIC population and phenotyping and genotyping screening is provided in Fig. 1.

Field experiments
The field experiments were conducted in upland farm fields at the International Rice Research Institute (IRRI), Los Banos, Laguna, Philippines (14 0 10 ′ 11.81 ″ N, 121 0 15 ′ 22 ″ E) in 2015 wet season (2015WS) and 2016 dry season (2016DS). The data on average rainfall, temperature, humidity, solar radition and pressure across 2015WS and 2016DS is presented in Additional file 2: Figure S1. In the present study, the term "dry direct-seeded" refers to the sowing of rice directly in the dry field without any nursery-bed raising in non-puddled, well-levelled fields. To ensure better pulverization, better germination and weed control, the field was prepared 1 month prior to sowing. Furrow was prepared using a tractor, and basal fertilizer was incorporated in the soil. To control weeds, combination of pre-emergence (Oxadiazon @ 0.5 kg ai ha − 1 at 6 days after seeding (DAS)), early post emergence (Bispyribac sodium @ 0.03 kg ai ha − 1 (9.7%, nominee) at 11 and 22 DAS) followed by spot weeding at 35 and 55 DAS was used. Integrated pest management (IPM) practices involving rat baiting using ditrac @ 0.05 g kg − 1 = 0.005% brodifacoum bait to control rats with the preseeding application of Fipronil @ 0.075 kg ai ha − 1 along the bunds and at 7 DAS along the plot edges was followed. The seeding was done at ̴ 2 cm depth in furrows in the dry plots. The seeding density was 2 g per progeny (approximately 30-35 seeds per row) for the two rows. Out of total 3 m row, the central 2 m was used for the measurement of grain yield and yield related traits and remaining 1 m (0.5 m at both side) was used for destructive sampling for the measurement of root traits at different growth stages. Sprinkler method of irrigation was used during the seedling establishment stage (1 to 21 DAS) and thereafter surface irrigation was applied once or twice a week depending on the weather and crop water status. Then irrigated field was allowed to drain naturally through normal seepage and percolation.

2015WS
A total of 450 F 2 derived F 3 progenies with significant phenotypic variability for grain yield and yield attributing traits along with five parents and five checks were evaluated in an augmented design (100 progenies per block × 5 blocks) under DDSR conditions. The total fertilizer rate was 100-35-30 NPK kg ha − 1 .

2016DS
A total of 300 F 3 derived F 4 progenies with significant phenotypic variability for grain yield and yield attributing traits along with five parents and five checks were evaluated in an alpha-lattice design (5 progenies per block × 62 blocks) with two replications under DDSR conditions. The total fertilizer rate was 120-40-40 NPK kg ha − 1 .
The detailed description of experiments conducted and traits measured under the present study is presented in Table 2.

Measurements of phenotypic traits
Five parental progenies, five checks and 300 progenies from the MAGIC population were characterized for a total of 39 agronomically important traits (Table 2; Additional file 2: Figure S2). These traits include the seedling establishment traits (9), root morphological and nutrient uptake traits (14), lodging resistance traits (4), plant morphological traits (5), grain yield and yield-related traits (7). Evaluation and characterization was done under DDSR in 2015WS and 2016DS at IRRI, Philippines.

Seedling establishment traits
Seedling emergence was recorded regularly from 3 to 12 days after seeding (DAS). Days to first emergence (DTFirst) refers to the emergence of seedling (one or two) from the soil and days to full emergence (DTFull) refers to the emergence of 90-95% of seedling per plot. For early vegetative vigor, the randomly three seedlings per plot were uprooted from the soil using trowel at 15, 22 and 30 days after seeding. The roots and shoots were separated and the shoot was oven dried at 60°C for 72 h. The oven dried shoot was then weighed to calculate RGR (Relative Growth Rate). RGR for 15 to 22 DAS (RGR1), 22 to 29 DAS (RGR2) and 15 to 29 DAS (RGR3) was calculated by using following formula [19]. log dry shoot weight at sampling 2 ð Þlog dry shoot weight at sampling 1 ð Þ date of sampling 2-dateof sampling 1 ð Þ The separated roots were used for the root traits measurements. Vegetative vigor (VVG) was also recorded following 1-9 scale (1: extra vigorous, 3: vigorous, 5: normal, 7: weak, 9: very weak) at 30 DAS according to the IRRI Standard Evaluation System for Rice [41]. The nodal roots were counted manually, and the maximum root length was measured using ruler measured in centimetre. The measurements on root hair length and density were recorded on six root samples following the procedure as described by Sandhu et al. [19].

Measurements of root traits
At booting stage, the Leaf Color Chart (LCC) developed by IRRI was used along with the chlorophyll meter (SPAD) (Minolta 502) to estimate and validate the relative accuracy of LCC in measuring greenness of fully expanded flag leaf nitrogen (N) concentration. The color of the flag leaf from the three randomly chosen plants was measured by placing the middle part of the middle lobe of the flag leaf in front of the color strip (LCC) for the comparison. The three phenotypically similar plants with approximately same value of LCC and SPAD were then uprooted for the total above-ground biomass measurement. The above ground fresh biomass was recorded immediately after uprooting, and then all the three plant samples were cut into small pieces and mixed properly, and a total of 200 g plant sample were taken for each progeny and then oven dried in 70°C for 3 days and then weight for dry plant biomass and nutrient uptake measurements [19]. In 2015WS a total of 60 progenies (30 high yielding and 30 low grain yielding along with the parents) were selected and analyzed for nutrient (N, P, Fe and Zn) uptake at IRRI Analytical Service Laboratory. The nitrogen estimation was carried out using Kjeldahl digestion method. The plant material was digested with a mixture of K 2 SO 4 (potassium sulphate) and concentrated H 2 SO 4 (sulphuric acid) in presence of catalyst (fine powdered selenium). The NH 3 (ammonia) produced was estimated by colorimetric method using Technicon autoanalyzer. The other nutrients phosphorus (P), Iron (Fe) and Zinc (Zn) were estimated using Nitric/Perchloric Acid digestion method using AIM 500 Digestion Block System.

Lodging resistance traits
Randomly three plants per plot were chosen for the lodging resistance traits measurements. Traits related to lodging resistance including stem diameter (SD; mm), culm diameter (CD; mm), bending strength (BS; kg cm) and bending moment (BM, kg cm − 2 ) were recorded. Stem and culm diameter were measured using vernier caliper. Stem strength at the breaking point was measured using the prostrate tester (Daiki Rika Kogyou Co., Tokyo; Additional file 2: Figure S3) [42]. At 20 days after flowering, the main stem of the plant was cut from the ground level and the force was given to the 2 nd and 3 rd internode of the stem at the middle (5 cm) of the internode. The dispacement in mm was measured as the stem strength at breaking point. The bending ability of internode was measured by bending the stem to point at which the stem break and the scale displacement (mm) due to bending ability was recorded.

Plant morphological, grain yield and yield-related traits
Five random plants were taken for recording of plant morphological, grain yield and yield-related traits. Days to 50% flowering (DTF) was recorded when~50% of the plants in a plot showed panicle excertion. Plant height WS wet season, DS Dry season, DAS days after seeding (DAS) IR 94225-B-82-B (Aus276/3*IR64 derived progenies with better root traits and grain yield under dry direct seeding conditions [19]; IR 94226-B-177-B (Aus276/ 3*MTU1010 derived progenies with better root traits and grain yield under dry direct seeding conditions [19]; IR 91648-B-32-B (Moroberekan/3* Swarna derived progenies with genetic region for early and uniform germination characteristics [40]; IR 91648-B-153-B and IR 91648-B-230-B (Moroberekan/3* Swarna derived progenies with genetic region for lodging resistance [40] (PHT), panicle length (PL), flag leaf length (FLL) and flag leaf width (FLW) were measured using ruler in centimeter scale. The number of productive tillers per plant (NPT) were counted manually. The PHT was measured from the ground level to the tip of the highest panicle at the maturity stage. PL was measured from the node of the panicle to the tip of the panicle. The FLL and FLW was measured from the base to tip of flag leaf and at the middle part of the flag leaf, respectively. Flag leaf area (FLA; cm 2 ) was calculated according to Palamiswamy and Gomez [43]. Flag leaf angle (FLAngle) was measured using protector keeping stem as a horizontal base. The harvested grain was threshed, cleaned and oven dried for 3 days at 50°C to 14% moisture content and then the weighing to record grain yield. The filled grains per panicle (FG/P) were counted manually and the 1000 grain weight (1000GW) was recorded in g. For straw yield (SY), the harvested straw was oven dried for 3 days at 70°C and weighed. The biomass at flowering (BMF) was measured (in g) after harvesting and drying the above ground biomass at 70°C in oven till there is no change in the dry weight of the plants.

DNA isolation, genotyping-by-sequencing and SNP calling
Samples comprised of the five parents, five checks and 300 progenies from the MAGIC population. The fresh leaf tissue samples from six plants per progeny were collected at 40 DAS. To increase the throughput and efficiency coupled with less chances of error, automated leaf sampling and high-throughput DNA extraction using the Brooks' PlantTrak Hx rice leaf tissue sampler and LGC Genomics' oKtopure systems was used. For genotyping by sequencing (GBS) a type II restriction endonuclease ApeKI was used for DNA digestion, and the digested DNAs were ligated to the adapter, and then 96plex library was constructed as per GBS protocol. GBS was carried out using HiSeq2000 100PE platform of Macrogen Inc. (Korea). From the GBS data a total of 2, 75,586 SNPs were called from the GBS genotyping. A stringent selection criterion to filter out the SNPs was used including missing percentage, MAF (minor allele frequency) and percent heterozygosity to select the panel of robust SNPs. SNP with call rate 80% and MAF (minor allele frequency) of > 5% (24,306 SNPs) were filtered using Tassel 5. The 24,306 SNPs were used to estimate the genetic relationship, the building of neighbor Joining tree and were used for GWAS for emergence, root, grain yield and yield-related traits. To detect and correct for population structure, a PCA was carried out using 24, 306 SNP markers.

Phenotypic data analysis
Analysis of variance (ANOVA) was computed using PBTools V 1.4.0 (http://bbi.irri.org/products). The trial mean and trait mean for each season/years was computed using mixed model analysis considering replications, and block within replication as random effect and progenies as a fixed effect. The broad-sense heritability was computed using the equation, H = σ 2 G /(σ 2 G + σ 2 E /r). Where H represents the broad sense heritability, σ 2 G represents the genetic variance, σ 2 E the error variance and r the number of replications. Correlation among traits was calculated using function rcorr() [in Hmisc package] in R. v.1.1.423.
The best parent for the promising traits was selected and the percent improvement in selected promising breeding progenies over the best parent for each promising trait was calculated as: % improvement = (trait value in selected promising progenytrait value in best parent/trait value in best parent) *100.

Population structure, linkage disequilibrium and association analysis
The 300 progenies and five parents included in the present study include the progenies of indica and Aus parents. The population structure was estimated using STRUCTURE V.2.3.4. Using 24,306 SNPs, a series of models with K value ranging from 1 to 10 was run with a burn-in period to 50,000 and running length to 10,000 to give consistent results overruns. To verify the consistency and accuracy of the results, ten independent runs for each K was performed. The K value with maximum likelihood over the runs was considered as the most probable number of clusters [44]. The genetic structure of the population was also described on Principal components (PC) calculated with R/GAPIT [45] and iteratively added to fixed part of the model, ranging from PC1 to PC10. The distance matrix was calculated using Tassel 5.0 [46], and an unweighted neighbor-joining tree was visualized using FigTree v1.4.2 [47]. The pairwise r 2 values were estimated using Tassel 5.0 and were averaged over the SNPs grouped based on stepwise increasing base-pair distance (0-10, 10-50, 50-500, 500-1000, 1000-1500, 1500-2000, 2000-2500, 2500-3000, 3000-3500, 3500-4000, 4000-4500 and 4500-5000). The average LD over these base-pair distances was used to estimate the linkage disequilibrium (LD) decay in the MAGIC population. GAPIT (Genome Association and Prediction Integrated Tool) was used to conduct genome wide association studies on 2015WS, 2016DS and the combined data over seasons to identify the genomic regions associated with the traits of interest [45].
The allelic effects of the significantly associated markers for root, nutrient uptake, and grain yield traits were determined by representing phenotypic data for the alleles as boxplots and the significant allelic variation for the associated traits was determined to perform Kruskal-Wallis test in "R."

Correction of false discovery rate (FDR) and candidate gene prediction
To correct the false positive in genome wide association analysis even keeping the stringent p-value benchmark, "Bonferroni Correction" the most stringent correction method was used. After the Bonferroni multiple test correction (0.01/24,306; significance level of 1%/total number of marker used in analysis), the calculated threshold value was 4.1 × 10 − 7 . Only the marker-trait associations (MTA) that surpass the Bonferroni threshold and consistent over seasons for the recorded traits were reported. For the identified significant marker-trait associations, candidate gene prediction was performed within 120-kb region around i.e. 120 kb upstream and 120 kb downstream of the SNPs with significant association signals. Earlier reported QTL that co-localized with the present study SNPs/QTLs, literature searches were performed. Regions covered by SNPs were searched for candidate genes using the MSU v.7 rice genome browser (http://rice.plantbiology.msu.edu/cgibin/gbrowse/rice/#search) and QTARO database (http:// qtaro.abr.affrc.go.jp).

Results and discussion
Phenotypic variations for targeted raits A number of seedling establishment, root and leaf morphological, agronomic, grain yield and yield related traits were measured to investigate their role in improving rice adaptation under DDSR condition. A better understanding of root traits enhancing nutrient uptake under DDSR in the framework of breeding strategies is a significant step forward to increase nutrient uptake under DDSR condition. Significant variations were observed for all the selected traits in both the seasons, except for number of filled grains per panicle, number of productive tillers per plant, panicle length and total biomass at 50% flowering ( Table 3) (Table 3). Similarly, NSICRc 222 and NSICRc 192 were having higher root hair length and density and flag leaf area compared to other parents. In term of SPAD and LCC, IR 87707-446-B-B-B and Vandana showed better performance. The trend of change of LCC and SPAD value across the progenies was almost similar indicating LCC (which is cheap to buy for farmers) is a good alternative to SPAD (which is expensive for farmers to buy) to decide on the time and dose of fertilizer application under DDSR. The greater radiations during grain filling stage in dry season contributing to the higher grain yield of dry season experiment compared to wet season experiment. The 1000 grain weight was found to be higher for IR 87707-446-B-B-B and NSICRc 192 across seasons. Transgressive segregants were observed for different seedling establishment, root morphological, plant morphological, nutrient uptake, grain yield and yield contributing traits. The traits such as SY and GY (except 2016DS) showed normal distribution. Positive skewness was observed for DTFirst emergence, DTFull emergence, NR, RHL, RHD, BS, FLAngle, SPAD, PHT, DTF and N, P, Fe, Zn uptake. The traits FLA, CD and SD exhibited negative skewness.
The grain yield was significantly and positively correlated with seedling establishment traits (vegetative vigor, days to first and full emergence), root morphological and nutrient uptake related traits (nodal root, root hair length, density, LCC, SPAD) and grain yield attributing traits (number of productive tillers per plant, number of filled grains/panicle, 1000 grain weight) in 2015WS (Fig. 2a), 2016DS (Fig. 2b) and combined seasons (Fig. 2c) analysis.
Strong seedling vigor and early and uniform emergence are desirable traits for enhancing crop establishment and weed competitiveness under DDSR. Early vigor and uniform emergence are complex traits which are influenced by different environmental and soil factors. Across the seasons, significant variability was observed among the progenies for DTFirst and DTFull emergence. The existing variations could be exploited in identifying the favourable alleles to be further used in marker-assisted introgression program. In the present study, direct significant positive correlation of grain yield with seedling establishment traits and grain yield attributing traits indicated the complexity of the relationship of these traits from seedling to reproductive stage in improving grain yield under DDSR. Investigating the genetic basis and relatedness of these traits and using them in marker-assisted breeding program is the urgent need of DDSR varietal development program.
The nutrient uptake (N, P, K and Zn) was significantly and positively correlated with nodal root, root length, root hair density, LCC, SPAD, number of productive tillers per plant, number of filled grains/panicle and grain yield (Fig. 2d). The significant positive correlation of nutrient uptake with grain yield and root morphological traits indicated the role of root traits in water-nutrient uptake and their effect in improving grain yield under DDSR. The significant positive correlation of a number of nodal roots at 15 DAS (seedling stage) with nitrogen, phosphorus, iron and zinc and at later stage 22 DAS with zinc uptake and at 29 DAS with nitrogen and phosphorus uptake (Fig. 2) indicated the importance of nodal root at each growth stage. Across the seasons, although the direction of correlation (whether positive or negative) for the seedling establishment, root and grain yield attributing traits such as DTFull emergence, nodal root,  NS non-significant, *significant at < 0.05 level, **significant at < 0.01 level, *significant at < 0.001 level maximum root length, LCC, SPAD, FLAngle, CD, DTF, NPT and SY with GY was same but the significance level was different. To investigate the underlying relationship among the proxy traits for high yield and better waternutrient uptake under DDSR, the traits data were also analyzed with PCA. This partitioned the total variations into correlated (RGR, NR, RHD, LCC, BS, SY) and inversely correlated traits (DTFirst, DTFull and DTF) with GY (Additional file 2: Figure S4). Some traits such as RHD, RHL, LCC, SD, CD, BS, PHT and FLAngle showed the effect of seasonal variability. This highlighted the role and phenotypic plasticity behaviour of different traits in increasing yield across seasons. Therefore, a better understanding of the relationship between root morphological traits at different time points, across different seasons and at different developmental stages with water-nutrient uptake is likely to provide a novel dimension in selecting traits and high-yielding varieties needed for DDSR. The correlation among the traits across seasons is shown in Additional file 1: Table S1.

Population structure and linkage disequilibrium (LD)
Population structure and LD decay were estimated using the genetic structure of the 300 progenies from MAGIC population and 5 parents using 24,306 SNPs distributed across 12 rice chromosomes. The scree plot produced from GAPIT showed that first four principal components were informative, and then there was a gradual decrease (Fig. 3a)   13.2% and PC4 explained 8.5% of total variance (Fig. 3b). All the progenies were divided into two distinct major groups and further subdivided into subgroups (Fig. 3c). The divergent aus and indica subpopulations of rice are adapted to different ecologies and geographies, and also harbor different traits of interest and alleles. The aus subpopulation is of interest as it has important source of alleles for drought tolerance and root traits while indica subpopulation harbors huge genetic variations. The kinship heatmap showed that most of the kinship value concentrated at *0.0 level indicating a very weak relatedness in the MAGIC population derived GWAS panel (Fig.  3c). Pairwise linkage disequilibrium (r 2 ) was calculated between all 24,306 markers. The average value of r 2 as a function of the inter-marker distance was used for LD decay calculation. The average r 2 for close markers of 5 kbs starts at 0.4, and the LD (r 2 ) dropped to half of its maximum value (0.2) at around 500 kb (Fig. 3d). The phylogenetic structure of the MAGIC population panel was illustrated using unweighted NJ tree (Fig. 3e).

Genome wide association study
To take advantage of identified marker-trait associations directly in MAS (marker-assisted selection), GWAS was performed directly on the MAGIC population. The high phenotypic variability existing in the MAGIC population coupled with high marker density across all chromosomes provided a strong base to the whole genome association mapping. About 25 and 12.5% of the genotypes in F 3 and F 4 populations were heterozygous. A total of 2,75,586 SNPs were called from the GBS data. After filtering SNP with call rate 80% and MAF (minor allele frequency) of > 5%, a total of 24,306 polymorphic SNPs were retained for further GWAS. The information's on p-values, FDR, SNP, SNP position and major and minor alleles are shown in Table 4. The Manhattan plots depicting -log (p-values) and Q-Q plots (quantilequantile) showing the expected vs. observed p-values for the SNP-based progeny-trait of interest associations are reported in Fig. 4 (seedling establishment traits and plant morphological traits), Fig. 5 (root morphological traits and nutrient uptake) and Fig. 6 (agronomic traits, grain yield, and lodging resistance traits).

Significant MTAs for molecular breeding for DDSR
The main objective of the present study was to identify the significant MTAs for the traits related to improving grain yield and adaptability of progenies under DDSR. A total of 37 significant trait-SNP associations were identified for 20 traits. The percent phenotypic variance (PV) explained by the SNP ranged from 9 to 92% for all the traits measured in the present study. The FDR ranged from 0.264 to 3.69 × 10 − 4 , 0.0330 to 1.25 × 10 − 4 , 0.0534 to 4.60 × 10 − 6 was observed in 2015WS, 2016DS and combined seasons analysis respectively (Table 4). Among the 37-significant marker-trait associations, 15 were located in

7-3 [74]
NR3 number of nodal roots at 29 DAS, RHL root hair length, RHD root hair density, FLA flag leaf area, FLAngle Flag leaf angle, DTFirst days to first emergence, DTFull days to full emergence, SPAD cholorophyll content, BS bending strength, SD stem diameter, CD culm diameter, PHT plant height, GY grain yield, SY straw yield, DTF days to 50% flowering, VVG vegetative vigor, Fe uptake iron uptake, N uptake nitrogen uptake, P uptake phosphorus uptake, Zn uptake zinc uptake, DS dry season, WS wet season, R 2 percent phenotypic variance (PV) explained by SNPs, FDR false discovery rate a The associated SNPs with very small interval (≤14 kb) can be considered as the same locus the proximity of earlier identified candidate genes (http:// qtaro.abr.affrc.go.jp) (Additional file 1: Table S2). The SNP-trait association with lowest p-values were reported for S2_16191048 on chromosome 2 for SPAD (p-value = 5.18 × 10 − 9 in 2016DS and 1.90 × 10 − 10 in combined seasons). A total of 12 SNPs reported being associated with root morphological and nutrient uptake traits whereas 5 SNPs reported to be associated with grain yield and yield contributing traits. Five SNPs observed to be associated with morphological traits (FLA, FLAngle, PHT). The plant height related SNP S1_38481437 on chromosome 1 colocalized with rice semi-dwarfing gene (Sd1; [63,64] which encodes a mutant enzyme Gibberellin 20-Oxidase, involved in the synthesis of gibberellin and QTL qPHT 1.1 [19]. Similarly, five SNPs reported to be significantly associated with lodging resistance traits (BS, SD, CD) and nine SNPs with seedling establishment traits (DTFirst, DTFull, VVG). S2_31078747 and S2_30984762 on chromosome 2 significantly associated with stem and culm diameter respectively, found to be colocated with Rice BRITTLE CULM 3 (BC3) gene which is known to be involved in secondary cell wall synthesis [62]. A significant association signal associated with vegetative vigor was detected for 14 kb region span between SNPs S1_37639734 and S1_ 37653811. This region was located in proximity to lax [65] and moc2 [66] candidate genes controlling shoot branching and tiller growth respectively.
The SNP S2_31078747 (SD, CD), S5_14987295 (N, P uptake) and S5_15470880 (RHL, RHD) observed to be significantly associated with more than one trait. For SNPs, S5_15470847 and S5_15470880 for root hair length and root hair density the colocalization was found with genes OsIPT3 [52] and OsEXPA3 [53] which were reported to betightly linked with the root growth in rice.
OsEXPA3 gene reported to be required for the development of primary root length, lateral root density, root vascular bundle cell length and growth in rice. In addition, these identified SNPs also located in the QTL region associated with root hair density as reported by Sandhu et al. [19]. Three SNPs S11_17412133, S11_ 17412134 and S11_17412139 on chromosome 11 significantly associated with grain yield located in the previously identified QTLs (yld11.1, gpl11.1, gw11.1, 17246592-23651853 bp) region. This region was reported to play an important role in improving grain yield in an Oryza sativa × Oryza rufipogon BC 2 F 2 population under upland conditions [59].
A 0.483 Mb region on chromosome 5 found to be associated with N, P uptake, FLA, RHL and RHD and 0.489 Mb region on chromosome 11 with DTF, PHT, and grain yield. A 0.655 Mb region on chromosome 11 reported being associated with seedling establishment traits (DTFull and VVG). For most of the traits like DTFirst, DTFull, CD, GY and VVG, the identified significant multiple SNPs which were present in the full LD region on the same chromosome were having almost the same level of significance across seasons. Some SNPs associated with more than one trait such as S5_ 14987295 (N, P uptake), and S5_15470880 (RHL, RHD) were having the same level of significance in 2015WS, 2016DS and combined seasons. The colocation of SNPs related to nutrient uptake and root traits further confirmed by the significant positive correlation of root traits with nutrient uptake. The identification of progenies carrying multiple superior haplotype associated with various colocated traits would significantly aid to the development of rice varieties adapted to DDSR. Introgression of superior haplotypes that are responsible for improving grain yield and water-nutrient uptake under DDSR using haplotype-based breeding may open avenues for designing next generation DDSR adapted rice varieties.
The identified significant marker-trait associations on the same chromosome within the 500 kb region, where the LD decays half of its maximum value were assigned as putative QTLs. The associated SNPs with very small Fig. 6 Manhattan and Q-Q plots of genome wide association mapping of agronomic traits (DTF: days to 50% flowering, PHT: plant height, GY: grain yield) and lodging resistance traits (CD: culm diameter) across seasons under direct seeded cultivation conditions. The y axis in each graph represent −log 10 P for the P value of the MTAs, while linkage groups are indicated on the x axis interval (≤14 kb) can be considered as the same locus. The putative QTLs for RHL spanning 0.033 kb region on chromosome 5, for DTFull spanning 0.036 kb and for GY spanning 0.005 kb on chromosome 11, for SPAD spanning 177 kb, and for Fe uptake spanning 0.012 kb on chromosome 2, for VVG spanning 14 kb region on chromosome 1, and for DTFirst spanning 0.045 kb (Table 4).
For the number of nodal roots at 29 DAS on chromosome 4: S4_31728342,~115 kb upstream the positional match was identified as nal1(nal5) gene, NARROW LEAF1 which regulates leaf and adventitious root development in rice [48]. QTLs for root fresh weight, root thickness [48,50] in upstream region and for rooting depth [51] in the downstream region has been reported. The p-values of significant allelic variation for the number of nodal roots at 29DAS were 5.71 × 10 − 5 , 7.13 × 10 − 6 and 6.267 × 10 − 6 in 2015WS, 2016DS and combined season analysis.
The p-values of significant variations for all three significant SNPs associated with grain yield QTL ranged from 3.125 × 10 − 5 to 8.287 × 10 − 8 across seasons. The pvalues representing allelic variation for nutrient uptake were 0.01184 (for S6_30887442), 0.04321 (for S6_ 30887442) and 0.02327 (for S5_14987295) for P uptake, 0.02449 (S5_14987295) for N uptake and 0.001039 (S7_ 26042045) for Zn uptake. Two SNPs S2_29221003 and S2_29221015 on chromosome 2 showed significant associations with iron (Fe) uptake under direct seeded cultivation conditions. Interestingly, the genes OsYSL15; a root tissue expressed iron-regulated gene for iron uptake at seedling development stage [67], OsCKI1; a rice casein kinase I gene associated with root development) [68], OsGPX3; mitochondrial glutathione peroxidase GPX3 essential in root-shoot development) [69], and MAIF1; Fbox protein gene promotes root growth in rice [70] reported to be colonized with the iron uptake gene mapped in the present study. In the present study, SNP S5_14987295 on chromosome 5 associated with both nitrogen and phosphorus uptake was identified to be colocated with the gene OsIPT3 which was involved in cytokinin biosynthesis [52] and QTL qP 5.2 , qPU 5.2 [19]. The zinc uptake linked SNP S7_26042045 on chromosome 7 reported to be located in the QTL region associated with root morphology and distribution in differential water-regime under upland conditions [74]. In addition, this SNP was reported to be colocated with qZn 7.1 ; the QTL detected for grain zinc concentration in a rice MAGIC plus population [75] and with QTL for grain breadth in an indica MAGIC population [76]. The colocation of QTLs and positive correlation between grain Zn and grain width [75] indicated the importance of grain dimensional traits during selection process.
The allelic effects of the significantly associated markers for root, nutrient uptake and grain yield traits are shown in Fig. 7. The parental allele contributing for the increased grain yield under DDSR were coming from IR 87707-446-B-B-B and Kali Aus parents. The parental allele contributing for increase in nodal roots number and Fe uptake was coming from Kali Aus, for N and P uptake from IR 87707-446-B-B-B and for Zn uptake coming from Vandana.
To identify the candidate genes underlying the significant marker-trait association/putative QTLs MSU v.7 rice genome browser (http://rice.plantbiology.msu.edu/ cgi-bin/gbrowse/rice/#search) and the existing literature were searched. The identified candidate genes were in the vicinity of the reported SNPs/putative QTLs in the present study which warrant further investigation and validation. The detailed description of the candidate genes is presented in Additional file 1: Table S2. The LD decays half of its maximum value at 500 kb in the present study and the candidate genes were searched within 120-kb region around, indicating the SNPs have strong LD within the ORF. All these identified significant MTAs and identified the breeding progenies with favourable alleles combinations can be deployed after validation for improving grain yield and adaptability through molecular breeding.
Performance of selected promising progenies DDSR technology has huge potential to save water, labor and energy with profitable results along with ecofriendly characteristics avoiding all the penalties entailed with the puddled transplanting system. DDSR offers a choice of rice establishment methods and water requirements for crop establishment and the subsequent crop growth, thus increasing the capacity of poor farmers to cope with the changes induced by the climatic conditions and the reducing soil resources. Farmers can go ahead with direct seeding with minimal soil moisture rather than waiting for sufficient rainfall for transplanting in case of early drought. DDSR also reduces the cost of additional irrigation and the risk of yield loss from the late-season drought. The generations of new DDSR rice varieties would most likely be needed for the sustainable rice production.  (Table 5).
The MAGIC population developed in the present study has an advantage over the existing MAGIC population in rice [76] as it has been developed involving upland adapted parents having better root system and improved yield under rainfed environments. The identified promising breeding progenies with favourable alleles in combination for the multiple traits might serve as potential donors for improving grain yield and adaptability under DDSR. The percent improvement in selected promising breeding progenies for nitrogen uptake ranged from 10 to 99% and phosphorus uptake from 18 to 28% over the best performing parent IR 87707-446-B-B-B (Table 5). The percent improvement in the selected progenies for iron uptake ranged from 4 to 8% and for zinc uptake ranged from 3 to 24% over the best performing parents Kali Aus and Vandana, respectively ( Table 5) . The percent improvement of number of nodal roots at 29 DAS in selected promising progenies varied from 7 to 35%, 8 to 42% and 11 to 30% in 2015WS, 2016DS and combined season, respectively over the best parent Kali Aus across seasons ( Table 5). The percent grain yield advantage of selected promising progenies ranged from 15 to 70% in 2015WS, 21 to 37% in 2016DS and 21 to 56% in combined season analysis over the stable yielding parent Kali Aus across seasons ( Table 5). The parents NSICRc 192 and NSICRc 222 were superior to other parents for root hair length and root density. Identification of suitable traits and progenies may acclerate the development of DDSR adapted rice varieties.

Conclusions
A major and successful genome-wide association study was carried out for a set of traits-seedling establishment, root morphological, nutrient uptake, grain yield and yield contributing traits increasing rice yield and adaptation in DDSR. The considerable phenotypic variability observed in the MAGIC population coupled with high marker density across all chromosomes provided a strong case for the success of the whole genome association trait mapping. A total of 37 significant marker-trait associations including nine putative QTLs related to the different DDSR traits were detected. The effectiveness of the identified marker-trait associations/putative QTLs was validated by confirming the involvement of collocated QTLs/candidate genes reported previously. The identified putative QTLs and the promising progenies possessing the identified QTLs may serve as the potential candidate donors to be use in GAB programmes . e, f, g P (phosphorus) uptake. h N (nitrogen) uptake. i Zn (zinc) uptake. The significant difference (p value in green colour) between the mean values of the allelic class was determined using Kruskal-Wallis test  ), N uptake nitrogen uptake (kg ha −1 ), P uptake phosphorus uptake (kg ha −1 ), Fe uptake iron uptake (kg ha −1 ), Zn uptake zinc uptake (kg ha −1 ) improving grain yield and adaptability of rice under DDSR. Identification of high yielding, resource efficient and direct seeded adapted breeding progenies with favourable alleles for multiple traits may help breeders to recommend for release as varieties to be cultivated by farmers under DDSR.

Additional files
Additional file 1: Table S1. The correlation observed among traits across seasons. Table S2. The detailed description on the candidate gene identified near or in the region of significant marker-trait association/ putative QTLs identified in the present study using MSU v.7 rice genome browser (http://rice.plantbiology.msu.edu/cgi-bin/gbrowse/ rice/#search). (XLSX 73 kb) Additional file 2: Figure S1. Average rainfall data (mm) collected during (a) 2015WS (b) 2016DS. Figure S2. The details on the seedling establishment traits, root and nutrient uptake traits, lodging resistance traits, Plant morphological, grain yield and yield attributing traits meausred under the present study. Figure S3. Prostrate tester quantifying stem strength at breaking point. Figure S4