Genome-wide association mapping and genomic prediction of agronomical traits and breeding values in Iranian wheat under rain-fed and well-watered conditions

Rabieyan, Ehsan; Bihamta, Mohammad Reza; Moghaddam, Mohsen Esmaeilzadeh; Mohammadi, Valiollah; Alipour, Hadi

doi:10.1186/s12864-022-08968-w

Research
Open access
Published: 15 December 2022

Ehsan Rabieyan¹,
Mohammad Reza Bihamta¹,
Mohsen Esmaeilzadeh Moghaddam²,
Valiollah Mohammadi¹ &
…
Hadi Alipour³

BMC Genomics volume 23, Article number: 831 (2022)

Abstract

Background

The markers detected by genome-wide association study (GWAS) make it possible to dissect genetic structure and diversity at many loci. This can enable a wheat breeder to reveal and use genomic loci controlling drought tolerance. This study focused on determining the population structure of 208 Iranian wheat landraces and 90 cultivars via genotyping-by-sequencing (GBS), and on detecting marker-trait associations (MTAs) by GWAS and performing genomic prediction (GS) of wheat agronomic traits for drought-tolerance breeding. GWASs were conducted using both the original phenotypes (pGWAS) and estimated breeding values (eGWAS). Bayesian ridge regression (BRR), genomic best linear unbiased prediction (gBLUP), and ridge regression-best linear unbiased prediction (rrBLUP) were used to estimate breeding values and prediction accuracies in genomic selection.

Results

Population structure analysis using 2,174,975 SNPs revealed three genetically distinct sub-populations among the wheat accessions. The D genome harbored the lowest number of significant marker pairs and the highest linkage disequilibrium (LD), reflecting the different evolutionary histories of the wheat genomes. From pGWAS, BRR, gBLUP, and rrBLUP, 284, 363, 359 and 295 significant MTAs were found under normal and 195, 365, 362 and 302 under stress conditions, respectively. gBLUP showed the greatest similarity to pGWAS in terms of the significant SNPs discovered (80.98 and 71.28% in well-watered and rain-fed environments, respectively), suggesting the potential of gBLUP in uncovering SNPs. Results from gene ontology revealed that 29 and 30 SNPs in the imputed dataset were located in protein-coding regions for well-watered and rain-fed conditions, respectively. The gBLUP model revealed genetic effects better than the other models, suggesting that it is a suitable tool for genomic selection in wheat.

Conclusion

We show that Iranian landraces of bread wheat contain novel alleles that are adaptive to drought stress environments. The gBLUP model can be helpful for fine mapping and cloning of the relevant QTLs and genes, and for carrying out trait introgression and marker-assisted selection in both normal and drought environments in wheat collections.

Background

Wheat (Triticum aestivum L., AABBDD, 2n = 6x = 42) is an economically important crop that provides iron, calcium, zinc, vitamin B, starch, fiber, fats, and dietary proteins [1, 2]. Genetic research on this crop has improved its productivity; for example, the last decade (2011-2020) witnessed a yield increase of ~ 1% per annum [3]. However, further improvement is imperative to feed the global population, which will exceed 9 billion by 2050 [4]. Water limitation is the most important factor restricting wheat production in most parts of the world. Improving crop tolerance to drought stress is therefore essential to guarantee sustainable yield in wheat fields [2, 4]. Current research focuses on exploring the genetic basis of drought tolerance traits through association analysis of agronomic characteristics and genomic regions [5].

Breeding high-yielding and drought-tolerant wheat varieties remains a challenging task because of large environment × genotype interactions and the low heritability of yield, a complex agronomic trait [6]. To overcome this problem, high-throughput methods in phenomics, including digital imaging, and in genomics, including association mapping, have been used to uncover the genetic mechanisms underlying yield and its related characteristics under drought. The findings obtained from these methodologies have been useful for further enhancing wheat yield not only in water-restricted environmental conditions but also in drought-stressed environments [3].

The advent of next-generation sequencing technologies has provided an opportunity to evaluate genetic variation and discover new markers through the genotyping-by-sequencing (GBS) approach [7]. Molecular markers obtained in this way, such as single nucleotide polymorphisms (SNPs), have been successfully used to dissect complex agronomical traits of wheat and are key elements of the genome-wide association study (GWAS) approach [8]. The purpose of GWAS is to detect genomic regions (QTLs, genes, or markers) related to important traits for gene introgression, gene discovery, or marker-assisted breeding [2]. The markers detected by GWAS make it possible to dissect genetic structure and diversity at many loci. This can enable a wheat breeder to reveal and use genomic loci controlling drought tolerance [5].

In addition to trait mean-based GWAS (pGWAS), breeding values can be estimated by methods such as BRR (Bayesian ridge regression), gBLUP (genomic best linear unbiased prediction), and rrBLUP (ridge regression-best linear unbiased prediction) and then used in association mapping (i.e., eGWAS). There is no consensus on the best algorithm when a multiple-regression model is used in genomic selection and GWAS, since the structure of the population and the architecture of the trait strongly affect the estimation of marker effects [9]. It is therefore important to compare the findings from the various algorithms when dissecting the genetic basis of a complex trait in a crop population for the first time. This comparison supports the efficient detection of QTLs controlling a quantitative trait and better control of type I error, which is often high in association mapping studies [10].

To date, about 800 marker-trait associations (MTA) and quantitative trait loci (QTL) have been discovered for wheat drought tolerance traits, including yield, root, physiological, and agronomic ones by using association mapping (~ 100 MTAs) and bi-parental mapping (~ 700 QTLs). Only 70 loci, however, are known as the major genomic regions explaining more than 20% of phenotype diversity [11]. In the past, association mapping research in drought-stressed wheat has utilized a small number of molecular markers [12,13,14,15,16], which seems inadequate for efficiently exploring diversity in diverse wheat collections.

Genomic prediction (GP) is a powerful tool to boost the efficiency and speed of breeding programs by shortening breeding cycles and increasing selection accuracy. This approach allows selection candidates to be chosen via genotyping before their phenotypes are determined [17]. Genomic prediction uses all genetic markers to train a prediction model that captures all genetic effects. The model is then applied to a validation set to estimate its accuracy [18]. Several studies have demonstrated high or moderate GP accuracy for quantitative characteristics in barley (Hordeum vulgare L.) [19], maize (Zea mays L.) [20], rice (Oryza sativa L.) [21], oat (Avena sativa L.) [22] and wheat (Triticum aestivum L.) [17].
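The train-then-validate workflow described above can be sketched in a few lines. The following is a minimal illustration, not the pipeline used in this study: it assumes simulated marker codes (-1/0/1), a fixed ridge penalty `lam`, a single train/validation split, and prediction accuracy measured as the Pearson correlation between predicted and observed phenotypes. All names (`ridge_fit`, `predict`, `pearson`, the simulated data) are hypothetical.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ridge_fit(Z, y, lam):
    """Marker effects minimising ||y - Z b||^2 + lam * ||b||^2 (no intercept)."""
    m = len(Z[0])
    lhs = [[sum(r[i] * r[j] for r in Z) + (lam if i == j else 0.0)
            for j in range(m)] for i in range(m)]
    rhs = [sum(r[i] * v for r, v in zip(Z, y)) for i in range(m)]
    return solve(lhs, rhs)

def predict(Z, effects):
    """Genomic values as the sum of marker codes times marker effects."""
    return [sum(z * e for z, e in zip(row, effects)) for row in Z]

def pearson(a, b):
    """Pearson correlation between two equally long sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (v - mb) for x, v in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5

# Simulated data (assumption): 60 lines, 10 markers, additive effects plus noise.
random.seed(1)
n, m, lam = 60, 10, 1.0
true_effects = [random.gauss(0, 1) for _ in range(m)]
Z = [[random.choice([-1, 0, 1]) for _ in range(m)] for _ in range(n)]
y = [g + random.gauss(0, 1) for g in predict(Z, true_effects)]

# Train on the first 45 lines, validate on the remaining 15.
Z_train, y_train = Z[:45], y[:45]
Z_valid, y_valid = Z[45:], y[45:]
b_hat = ridge_fit(Z_train, y_train, lam)
accuracy = pearson(predict(Z_valid, b_hat), y_valid)
print(f"prediction accuracy (validation r): {accuracy:.3f}")
```

In practice the split would be repeated (for example by k-fold cross-validation) and the penalty would be estimated from the data rather than fixed.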

This study aimed to detect candidate QTLs, genes, or markers for drought tolerance linked with agronomical traits by using GWAS in 208 wheat landraces and 90 cultivars grown under normal and drought conditions. In eGWAS, the goal is to identify SNPs related to the breeding values of the traits, which are passed on to the next generation. A further aim was to select the best model for estimating prediction accuracies in genomic selection. To the best of our knowledge, this is the first study on pGWAS and eGWAS of agronomical characteristics in Iranian wheat landraces under rain-fed and well-watered conditions. The findings will be a useful resource for marker-assisted breeding, genomic selection, introgression of favorable genes into high-yielding cultivars, and improvement of yield-associated characteristics under drought.

Results

Phenotypic data summary

In this study, 298 landraces and cultivars of bread wheat were grown under rain-fed and well-watered conditions and analyzed for various agronomic traits. According to the analysis of variance, genotypic, environmental, and genotype × environment effects on agronomical traits were significant under rain-fed and well-watered environments. Variances associated with genotypic effects were higher than those associated with environment and genotype × environment effects across all traits, indicating that genotypic effects had the greatest impact. Heritability was high for plant height but low for grain yield; nevertheless, the agronomical traits of wheat grain showed acceptable heritability (Table S1). Box plots of the eight agronomical traits of wheat landraces and cultivars under favorable (well-watered) and drought-stress (rain-fed) conditions are shown in Fig. 1. The means of all traits decreased under stress compared with normal conditions in both cultivars and native populations. The accessions showed considerable diversity in agronomical traits, and this variation was greater in native populations. The means of all traits except plant height were higher in cultivars than in landraces under both conditions.

Correlation analysis between traits in the normal environment showed that grain yield had its highest significant positive correlations with the following traits: spike harvest index (r = 0.72**), spike weight (r = 0.71**), 1000-kernel weight (r = 0.69**), and the number of grains (r = 0.61**). In the stress environment, grain yield had its highest significant positive correlations with spike harvest index (r = 0.76**), 1000-kernel weight (r = 0.74**), grains per spike (r = 0.66**), and spike weight (r = 0.54**) (Fig. S1).

Clustering analysis

Under normal conditions, heatmaps were plotted based on the mean of agronomic traits and on breeding values estimated by three methods: BRR, gBLUP, and rrBLUP. In each case, wheat accessions were clustered into four groups. In clustering based on the mean of traits, Group No.1 included 82 high-yielding genotypes (41 cultivars and 41 landraces), Group No.2 consisted of 89 genotypes with average to high yield (24 cultivars and 65 landraces), Group No.3 contained 44 genotypes with average to low yield (21 cultivars and 23 landraces), and Group No.4 comprised 83 low-yielding genotypes that were mainly native populations (4 cultivars and 79 landraces) (Fig. 2a). With the BRR method, the first, second, third, and fourth groups consisted of 61, 42, 104, and 91 genotypes, respectively (Fig. 2b). With gBLUP, the first group included 85 genotypes with a high breeding value for grain yield (72 cultivars and 13 landraces), the second group consisted of 102 genotypes with medium to high breeding values for yield and yield components (16 cultivars and 86 landraces), the third group contained 97 genotypes with medium to low breeding values for yield and yield components (2 cultivars and 97 landraces), and the fourth group comprised genotypes (17 landraces) with low breeding values for yield and yield components (Fig. 2c). With the rrBLUP method, the first, second, third, and fourth groups consisted of 69 (67 cultivars and 2 landraces), 59 (9 cultivars and 50 landraces), 88 (12 cultivars and 76 landraces), and 82 genotypes (2 cultivars and 88 landraces), respectively (Fig. 2d). The results of gBLUP were most similar to the trait mean method in terms of genotype clustering.

Drought-stressed genotypes were also classified into four groups based on the trait mean and the breeding value methods. In clustering based on the mean of traits, cluster 1 included 31 genotypes with high yield, which were mainly cultivars (18 cultivars and 13 landraces), cluster 2 consisted of 123 genotypes with average to high yield (24 cultivars and 99 landraces), cluster 3 contained 43 genotypes with average to low yield (19 cultivars and 24 landraces), and cluster 4 comprised 101 genotypes with low average yield, which were mainly native populations (29 cultivars and 72 landraces) (Fig. 3a). With BRR, the first group included 61 cultivars with a high breeding value for grain yield, the second group consisted of 67 genotypes (18 cultivars and 49 landraces) with medium to high breeding values for yield and yield components, the third group contained 53 genotypes with medium to low breeding values for yield and yield components (8 cultivars and 45 landraces), and the fourth group comprised 117 genotypes (3 cultivars and 114 landraces) with low breeding values for yield and yield components (Fig. 3b). With the gBLUP method, the first, second, third, and fourth groups consisted of 65, 83, 48, and 102 genotypes, respectively (Fig. 3c). Clustering based on breeding values from BRR, gBLUP, and rrBLUP showed 42, 48, and 39% similarity, respectively, in the assignment of genotypes to clusters. This indicates that gBLUP categorized wheat accessions more accurately than BRR and rrBLUP (Fig. 3b, c and d).

Linkage Disequilibrium (LD)

LD varied among chromosomes and along each chromosome, and generally decreased with increasing distance between SNP loci. In cultivars, a total of 1,858,425 marker pairs with r² = 0.211 were identified, of which 700,991 (37.72%) showed significant linkage at P < 0.001. The strongest LD was recorded between marker pairs on chr 4A (r² = 0.318). Genomes D and B had the lowest (63,924) and highest (370,359) numbers of significant marker pairs, respectively. A similar assessment of wheat landraces found 1,867,575 marker pairs with r² = 0.182, of which 847,725 (45.39%) showed significant linkage at P < 0.01. As in cultivars, marker pairs on chr 4A showed the strongest LD (r² = 0.369). Genomes D and B had the lowest and highest numbers of marker pairs (92,702 and 427,017), respectively. LD decayed more slowly in the D genome than in the A and B genomes, indicating that linkage blocks are larger in the D genome. In addition, LD in genome D decayed more slowly in cultivars than in native populations, which probably reflects selection on traits associated with this genome during breeding. Most of the significant marker pairs in wheat landraces were found at distances < 10 cM (Table 1).
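As an illustration of the pairwise statistic reported above, r² between two markers can be computed as the squared Pearson correlation of their genotype vectors. This is a minimal sketch under stated assumptions: genotypes are coded as allele dosages (0 or 2 for the largely homozygous lines assumed here), the example vectors are invented, and the software actually used for the LD analysis may apply a different estimator.

```python
def ld_r2(g1, g2):
    """Squared Pearson correlation between two genotype vectors (allele dosages)."""
    n = len(g1)
    m1, m2 = sum(g1) / n, sum(g2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    v1 = sum((a - m1) ** 2 for a in g1)
    v2 = sum((b - m2) ** 2 for b in g2)
    return cov * cov / (v1 * v2)

# Invented genotypes for eight accessions at three SNPs.
snp_a = [0, 0, 2, 2, 0, 2, 0, 2]
snp_b = [0, 0, 2, 2, 0, 2, 2, 0]  # differs from snp_a in the last two accessions
snp_c = [0, 2, 0, 2, 0, 2, 2, 0]  # uncorrelated with snp_a

print(ld_r2(snp_a, snp_a))  # complete LD
print(ld_r2(snp_a, snp_b))  # partial LD
print(ld_r2(snp_a, snp_c))  # no LD
```

Averaging such values over all marker pairs within a distance class gives the kind of LD decay summary reported per genome and chromosome.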

Table 1 A summary of LD observed among marker pairs and the number of significant marker pairs per genome and chromosome

Population structure

To estimate the number of subpopulations, the ΔK value was plotted against the number of clusters (K). The largest ΔK value was found at K = 3, indicating three subpopulations: Sub.1, Sub.2, and Sub.3 (Fig. 4a). Sub.1 included 113 genotypes (6 cultivars and 107 landraces); Sub.2 contained 111 genotypes (14 cultivars and 97 landraces); and Sub.3 consisted of 74 genotypes (70 cultivars and 4 landraces) (Fig. 4b). In the PCA, PC1 and PC2 explained 10.29 and 6.28% of the genotypic variation, respectively (Fig. 4c). Cluster analysis using the kinship matrix also supported the STRUCTURE results (Fig. 4d).

Genome-wide association studies for agronomic traits and estimated breeding values

Under optimal irrigation, using imputed markers and a threshold of -log10 P > 3, 283 significant SNPs were discovered for agronomic characteristics by MLM. Of these, 106, 137, and 40 markers were in genomes A, B, and D, respectively; genome B therefore had the highest number of significant SNPs. The numbers of significant markers for PH, GY, GN, TKW, SW, SA, SH, and SF were 39, 57, 19, 48, 11, 31, 43, and 35, respectively (Fig. S2a). The numbers of significant SNPs based on BRR, gBLUP, and rrBLUP were 362, 358, and 294, respectively (Fig. S2b, c and d). Among the breeding value methods, gBLUP showed the greatest similarity (81.27%) to pGWAS in terms of significant markers (Table 2). BRR, gBLUP, and rrBLUP identified 125, 118, and 111 significant SNPs for genome A; 201, 195, and 147 significant SNPs for genome B; and 36, 45, and 36 significant SNPs for genome D, respectively (Fig. S2b, c and d). Manhattan plots for the original trait means (Fig. 5a) and for the breeding values from BRR, gBLUP, and rrBLUP (Fig. 5b, c and d) are shown in Fig. 5. The circular Manhattan plot shows significant markers at P value < 0.001 (black) and < 0.00001 (red). The rectangular Manhattan and Q-Q plots are shown in Fig. S3. Markers obtained with the mean of agronomic traits were very similar to those obtained with the breeding value methods, especially gBLUP.
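The significance filter and the comparison between pGWAS and eGWAS marker sets can be sketched as follows. This is an illustration only: the P values are invented, and the similarity measure (the share of pGWAS-significant markers that are also significant in an eGWAS run) is an assumption, since the exact definition behind Table 2 is not given in the text.

```python
import math

def significant(pvalues, threshold=3.0):
    """Return the marker IDs whose -log10(P) exceeds the threshold."""
    return {snp for snp, p in pvalues.items() if -math.log10(p) > threshold}

def shared_fraction(reference, other):
    """Fraction of markers in `reference` that also appear in `other`."""
    return len(reference & other) / len(reference)

# Invented P values for four markers under two analyses.
pgwas = {"snp1": 2e-5, "snp2": 4e-4, "snp3": 0.02, "snp4": 9e-4}
egwas = {"snp1": 1e-6, "snp2": 0.003, "snp3": 5e-4, "snp4": 1e-4}

sig_p = significant(pgwas)
sig_e = significant(egwas)
print(sorted(sig_p), sorted(sig_e))
print(f"share of pGWAS markers recovered by eGWAS: {shared_fraction(sig_p, sig_e):.2%}")
```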

Table 2 Similarity of expected MTAs using assigned SNPs for pGWAS and eGWAS

Under stress, fewer significant markers were identified than under normal conditions: 194 significant SNPs were identified by the MLM method. Of these, 48, 129, and 17 markers belonged to genomes A, B, and D, respectively. Genome B had the highest percentage of significant SNPs in the stress environment. The numbers of significant markers for PH, GY, GN, TKW, SW, SA, SH, and SF were 9, 30, 16, 21, 15, 31, 31, and 41, respectively (Fig. S4a). The numbers of significant SNPs obtained by the BRR, gBLUP, and rrBLUP methods were 364, 361, and 301, respectively (Fig. S4b, c and d). Among the breeding value methods, gBLUP showed the greatest similarity (71.64%) to pGWAS in terms of significant markers (Table 2). BRR, gBLUP, and rrBLUP identified a total of 134, 121, and 97 significant SNPs for genome A; 187, 198, and 167 SNPs for genome B; and 43, 42, and 37 SNPs for genome D, respectively (Fig. S4b, c and d). The circular Manhattan plot shows significant SNPs at P value < 0.001 (black) and < 0.00001 (red) (Fig. 6). The rectangular Manhattan and Q-Q plots are shown in Fig. S5.

Gene ontology

The markers with the highest significance (P < 0.0001) and pleiotropic impact were studied in more detail. In the normal environment, 29 markers overlapping genes involved in important biological and molecular processes were identified. Twelve markers were identified by the pGWAS method and 17 by the eGWAS method. The numbers of GO terms based on BRR, gBLUP, and rrBLUP were 18, 15, and 16, respectively. The gBLUP and BRR methods were most similar (66.67%) to the pGWAS method. The most significant markers were located on chr 6B, 5B, and 5A. Of these, 8 SNPs were detected by both pGWAS and eGWAS methods. Some of the uncovered MTAs were responsible for the following molecular and biological processes: lipid biosynthetic process, protein-binding, carbohydrate-binding, lipid transport, RNA-binding, protein ubiquitination, protein deubiquitination, protein catabolic regulation, nucleoside metabolic process, UMP salvage, CTP salvage, and ubiquitin-dependent protein catabolic process (Table 3).

Table 3 Description of some expected MTAs using imputed SNPs for agronomic traits of Iranian wheat accessions in well-watered environment

In the stress environment, 30 markers overlapping genes were identified. The most significant SNPs were located on genome B. Thirteen and 17 markers were identified by the pGWAS and eGWAS methods, respectively. Of these, 10 markers were uncovered by both pGWAS and eGWAS, which supports the reliability of these methods in discovering significant markers. Some of the uncovered MTAs were responsible for the following molecular and biological processes: nucleosome assembly, response to water deprivation, protein-binding, peptidase, monooxygenase, ATP-binding, acyltransferase, oxidoreductase, microtubule-binding, ADP-binding, methyltransferase activity, metal ion-binding, protein dimerization, serine-type endopeptidase, ATPase, serine-type peptidase, hydrolase, ATP-dependent microtubule motor activity, and heme-binding (Table 4). The following pathways were identified using rice reference genomes: metabolic pathways (Fig. S6), oxidative phosphorylation (Fig. S7), biosynthesis of amino acids (Fig. S8), ascorbate and aldarate metabolism (Fig. S9), sulfur metabolism (Fig. S10), and fatty acid elongation (Fig. S11) ([23,24,25], www.kegg.jp/kegg/kegg1.html).

Table 4 Description of some expected MTAs using imputed SNPs for agronomic traits of Iranian wheat accessions in rain-fed environment

Genomic prediction

Using imputed SNPs, the gBLUP, rrBLUP and BRR approaches gave the highest prediction accuracies for 5, 2, and 1 phenotypes in rain-fed, and 5, 3, and zero phenotypes in well-watered environments, respectively (Fig. 7). Under rain-fed conditions, the highest prediction accuracy was obtained with the gBLUP model for GY (0.381), PH (0.369), SA (0.347), SH (0.104), and TKW (0.253); with rrBLUP for GN (0.396) and SW (0.359); and with BRR for SF (0.179). Under well-watered conditions, the highest prediction accuracies were obtained with gBLUP for GY (0.521), SA (0.269), SH (0.384), SW (0.432), and TKW (0.470), and with rrBLUP for GN (0.379), PH (0.499), and SF (0.265) (Fig. 7).

Discussion

Shedding light on the genetic mechanisms controlling quantitative traits such as grain yield in wheat represents an opportunity for the improvement of drought tolerance. To this end, this experiment aimed to explore the structure of the population and to uncover MTAs in Iranian wheat accessions. Significant positive correlations among the wheat characteristics confirmed the value of the data for the current GWAS analysis. This agrees with Laido et al. [26], who highlighted the usefulness of highly correlated morphological characteristics for detecting relevant QTLs.

High correlations between agronomic traits can be explained by indirect or direct contributions of one trait to another [27]. At the genome level, the regions responsible for such agronomic characteristics may coincide. This is supported by multi-trait correlations in which one gene has a pleiotropic impact on highly associated characteristics [2]. For example, Mwadzingeni et al. [8] showed that one locus controls several wheat properties such as grains per spike, spike length, and plant height, which are often highly correlated [28]. Such observations underline the need to check whether a locus is also linked to another trait because it shares similar sequences with the regions responsible for that trait. Some loci, however, affect only one crop property [8].

Clustering based on breeding values from BRR, gBLUP, and rrBLUP had 77, 68, and 83% similarity, respectively, with the trait mean method in terms of the grouping of wheat accessions. This indicates that rrBLUP can categorize wheat accessions more accurately than the other methods. Moreover, rrBLUP showed the greatest similarity with the trait mean method in terms of the significant markers discovered, suggesting its potential in uncovering SNPs. As a result, the rrBLUP model can detect genetic effects in wheat populations better than the other models. Overall, obtaining the best outcomes from the breeding value-based methods depends on the genetic architecture of the trait, genetic variation, etc. [18].

Linkage disequilibrium of markers

The SNPs covered the wheat genome well and were most numerous in genome B. The higher frequency of SNPs in genome B results from evolutionary events [29]. Genome D had the highest LD, followed by genome A and then genome B. At the chromosome level, the strongest LD was recorded between marker pairs on chr 4A. The fact that cultivars exhibited higher LD than landraces, particularly in genome D, is presumably a consequence of selection during breeding [30]. The presence of closely linked marker pairs with non-significant LD, and of marker pairs in LD over long distances, has been shown previously in wheat and other crops [8, 31]. This reflects that LD is not static, because it can be affected by various factors including genetic admixture [8].

Population structure of Iranian wheat accessions

The population under consideration was divided into three distinct sub-populations. This is expected because the wheat accessions have diverse pedigrees. Of course, the presence of common parents or origins in the pedigree of accessions often leads to some relationships among them [2]. The findings derived from the population substructure analysis are useful for identifying superior parents that can be used to improve wheat tolerance to drought stress [3]. Future researchers can therefore use this gene pool to exploit genetically distinct accessions that also exhibit farmer-preferred properties.

SNPs and MTAs for wheat agronomic traits

Fewer significant SNPs were recorded under drought than under normal conditions, indicating that GWAS for drought tolerance is strongly affected by environment × genotype interactions [8].

This experiment discovered a total of 29 and 30 highly significant MTAs under normal and drought conditions, respectively. Although only associations at P < 0.0001 were regarded as significant, the remaining MTAs may also be helpful for enhancing wheat tolerance to drought stress. These associations may be located in genomic regions affecting the agronomic characteristics. The MTAs for yield appeared significant at a higher P value, because this trait is genetically complex and has low heritability [32].

To date, many studies have focused on locating QTLs and genes affecting wheat traits in drought environments to facilitate marker-assisted breeding [2, 3]. The MTAs detected in this study add to the existing pool of candidate genes and markers. However, it is challenging to align our results with earlier work because of the use of reference genomes other than the IWGSC RefSeq, the lack of accurate genomic locations, or the use of different marker types (GBS-derived SNP vs. SSR and DArT) [2, 3, 5, 9]. Nevertheless, the detection of MTAs on the same chromosomes as in previous projects increases confidence in these MTAs.

Four MTAs for grain yield were recorded on chr 3B, 4A, 5A, and 3D in this study. Earlier research efforts have discovered MTAs/QTLs for grain yield on wheat chr 7B [31, 33, 34], 7A [31, 34,35,36], 5B [15, 31, 34], 3D [34], 3A [31, 34, 37, 38], 2B [34, 37,38,39,40], and 1B [34, 38, 39]. Thus, the MTAs on chr 3B, 4A, and 5A have not been reported previously and are new for wheat yield. Six MTAs for TKW were found on chr 5A, 1B, 3B, 6B, 1D, and 2D. Earlier reports have detected MTAs/QTLs for TKW on chr 7D [35], 7B [31], 5B [41], 3B [35], 3A [40, 41], 2D [39], 2B [31, 35, 39, 42], 2A [35], 1A [31, 39,40,41] and 1B [43]. For plant height, two MTAs were revealed on each of chr 5B, 6B, and 2D. All 21 chromosomes carry genes that control plant height in wheat [42, 44, 45]. To date, 24 reduced height (Rht) genes (Rht1–Rht24) have been catalogued in wheat [46, 47], of which Rht8 on chromosome arm 2DS has been extensively explored [48, 49]. We could locate only two QTLs on chromosome 2DL, whereas those reported by Borner et al. [50] on chromosome 2DS could not be detected. Other MTAs detected in our study were associated with grains per spike, spike weight, spike fertility, spike area, and spike harvest index. Some of the MTAs detected in this study were involved in the following important biological and molecular processes: metal ion binding, monooxygenase, acyltransferase, oxidoreductase, methyltransferase, peptidase, and ATP-dependent microtubule motor activity.

gBLUP showed the greatest similarity to the trait mean method in terms of the significant SNPs discovered (80.98 and 71.28% in well-watered and rain-fed environments), suggesting the potential of gBLUP in uncovering SNPs. The results show that the gBLUP method performs better than the rrBLUP and BRR methods in terms of the accuracy of predicted genomic breeding values. In gBLUP, genomic relationships, estimated from DNA marker information, are used to estimate an individual's genetic merit. To make better predictions of merit, the relationship matrix defines the covariance between individuals on the basis of observed similarity rather than expected similarity based on pedigree. Several studies have described the gBLUP method for estimating genomic breeding values [51,52,53,54]. Research shows that gBLUP and rrBLUP are similar models. The advantages of gBLUP over rrBLUP that have been mentioned include the reduction of the dimensions of the mixed-model equations to the number of individuals in the reference population, the calculation of the accuracy and prediction error of breeding values as commonly done in pedigree-based methods, and the ability to combine the information of genotyped and non-genotyped individuals simultaneously in the mixed-model equations [18].
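The relationship between the two models, and the reduction in the size of the equations, can be illustrated with a toy example. The sketch below rests on simplifying assumptions that are not taken from the study: centred marker codes, a genomic relationship matrix computed as G = ZZ'/m (other scalings are common), a fixed shrinkage parameter `lam`, and no fixed effects. Under these assumptions the marker-based system (m × m) and the relationship-based system (n × n) give the same genomic values. All data and names are invented.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

# Invented data: 3 individuals x 5 centred markers, centred phenotypes.
Z = [[1, -1, 0, 1, 1],
     [-1, 1, 1, 0, -1],
     [0, 0, -1, -1, 0]]
y = [1.2, -0.7, -0.5]
lam = 2.0
n, m = len(Z), len(Z[0])
Zt = transpose(Z)

# Marker-effect model (rrBLUP-like): solve an m x m system, then g = Z u.
ZtZ = matmul(Zt, Z)
lhs_rr = [[ZtZ[i][j] + (lam if i == j else 0.0) for j in range(m)] for i in range(m)]
rhs_rr = [sum(Zt[i][k] * y[k] for k in range(n)) for i in range(m)]
u = solve(lhs_rr, rhs_rr)
g_rr = [sum(z * e for z, e in zip(row, u)) for row in Z]

# Relationship model (gBLUP-like): solve an n x n system using G = Z Z' / m.
G = [[v / m for v in row] for row in matmul(Z, Zt)]
lhs_gb = [[G[i][j] + (lam / m if i == j else 0.0) for j in range(n)] for i in range(n)]
alpha = solve(lhs_gb, y)
g_gb = [sum(G[i][j] * alpha[j] for j in range(n)) for i in range(n)]

print(len(lhs_rr), len(lhs_gb))  # sizes of the two systems
print(g_rr)
print(g_gb)
```

With many more markers than individuals, as is typical for GBS data, the n × n system is far smaller than the m × m system.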

Based on the GO results, the BRR and gBLUP methods were better able to identify the relationships between the studied traits and were most similar to the pGWAS results. Generally speaking, genes/markers affecting a trait under drought are also responsible for that trait under normal conditions [8]. Ideally, the effects of such genes/markers are not influenced by moderate changes in environmental conditions, so they can be helpful in gene introgression or marker-assisted selection for improving adaptation [55]. Some genes/markers, on the other hand, may affect specific traits differentially under different conditions [55].

Our findings suggest that genomic prediction is a helpful tool for predictive characterization of wheat genotypes, permitting phenotyping to be limited to a fraction of the germplasm rather than the whole collection [56,57,58]. Similarly, Kehel et al. [59] stated that genomic selection can be used within wheat accessions to predict key traits with an accuracy of more than 0.7, especially for traits with high to moderate heritability. Population stratification is usually accounted for by including the first five principal components as covariates in the prediction model [57, 60, 61]. As expected, a significant population structure was identified in the Iranian wheat landraces, with the first five eigenvalues accounting for 30.5% of genetic diversity. Population structure had a negative effect on the performance of GWAS and GP models, which has also been reported in other studies [61, 62]. In our study, the highest prediction accuracy was achieved with the gBLUP model. Shabannejad et al. [18] evaluated the GP accuracy of the BRR, gBLUP, and rrBLUP models in normal and drought environments in wheat cultivars and landraces. They obtained the highest GP accuracies with the gBLUP and BRR methods. The authors observed that obtaining the highest GP accuracy depends on the genetic variation, the genetic architecture of the trait, the level of LD, and the genomic selection approach. As a result, the gBLUP model can detect genetic effects in wheat populations better than the other genomic prediction models.

Conclusion

MTAs are key to detecting genomic regions related to wheat agronomic traits under drought stress. The current experiment found 29 and 30 highly significant MTAs under normal and drought conditions, respectively. The detected markers would be useful genomic resources for cloning and fine mapping of the underlying genes, and for gene introgression and marker-based selection in wheat under normal and drought conditions. Further research is needed to validate the markers detected in the current project using a larger wheat population.

Methods

Plant material and experimental conditions

A field experiment was performed in two growing seasons (2018-19 and 2019-20) under rain-fed (drought) and well-watered (normal) conditions at the research farm, University of Tehran, Iran. In this study, 90 cultivars and 208 landraces (Table S2) of wheat were investigated in an alpha-lattice design with two replications. The wheat accessions were grown in plots of four rows (1 × 1 m²) at 0.5 m intervals. In the well-watered plots, irrigation was triggered after 40 mm of evaporation from a standard pan. The reference crop evapotranspiration [ET₀ = E_pan × K_pan, where K_pan is a pan coefficient (0.8) for each month and E_pan is the evaporation depth from the pan surface (40 mm)] and the crop coefficient [K_C] were estimated to measure crop evapotranspiration (ET_C = K_C × ET₀) [63]. The irrigation time was determined from the ratio of the water assigned to 1400 m² (the cultivation area of all genotypes in two replications) to the water discharge (10.8 m³/h). The volume of water required per hectare (m³/ha) was calculated as the depth of ET₀ (mm) multiplied by ten. The rain-fed plots received rainfall only, which was their sole water source. The monthly rainfall pattern for the growing seasons is presented in Table S3. At maturity, 20 plants were harvested from the middle rows of each plot to measure the traits, including spike fertility (ratio of grain number to spike weight), thousand-kernel weight (g), grain yield (g per plant), grain number per spike, spike weight (g), spike harvest index (ratio of spike grain weight to spike weight, %), spike area (cm²), and plant height (cm).
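The irrigation arithmetic described above can be sketched as follows. This is an illustrative calculation only: the function names are hypothetical, no K_C value is given in the text, and applying the ET₀ depth (rather than ET_C) to the 1400 m² area is an assumption based on the per-hectare statement. With the stated values, ET₀ is 32 mm, which corresponds to 320 m³/ha, about 44.8 m³ for 1400 m², and roughly 4.15 h at 10.8 m³/h.

```python
def reference_et_mm(e_pan_mm, k_pan):
    """Reference evapotranspiration ET0 = E_pan x K_pan, in mm."""
    return e_pan_mm * k_pan


def crop_et_mm(k_c, et0_mm):
    """Crop evapotranspiration ETc = Kc x ET0, in mm (Kc is not given in the text)."""
    return k_c * et0_mm


def volume_m3_per_ha(depth_mm):
    """1 mm of water over 1 ha equals 10 m3, hence the factor of ten."""
    return depth_mm * 10.0


def irrigation_time_h(depth_mm, area_m2, discharge_m3_per_h):
    """Hours needed to deliver a water depth over an area at a given discharge."""
    volume_m3 = volume_m3_per_ha(depth_mm) * area_m2 / 10_000.0
    return volume_m3 / discharge_m3_per_h


# Values stated in the text: E_pan = 40 mm, K_pan = 0.8, 1400 m2, 10.8 m3/h.
et0 = reference_et_mm(40.0, 0.8)
print(f"ET0 = {et0:.1f} mm")
print(f"Volume = {volume_m3_per_ha(et0):.1f} m3/ha")
# Assumption: the ET0 depth itself is applied to the whole trial area.
print(f"Irrigation time = {irrigation_time_h(et0, 1400.0, 10.8):.2f} h")
```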

GBS analysis

To sequence the wheat accessions, GBS libraries were constructed following the procedure described by Alipour et al. [29]. After trimming the reads to 64 bp and grouping them, single nucleotide polymorphisms (SNPs) were discovered by internal alignment. SNPs were called with the UNEAK GBS pipeline, and SNPs with a minor allele frequency < 1% or a quality score < 15 were discarded to reduce false positives. SNP imputation was carried out in BEAGLE V.3.3.2 using the available allele frequencies [64]. LD was calculated in TASSEL V.5 [65]. The W7984 reference genome was adopted in the present study because it gave the highest imputation accuracy among the wheat references [30].
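The filtering step above can be illustrated with a minimal sketch. It is not the UNEAK pipeline: the genotype coding (0, 1, 2 copies of the alternative allele, with NaN for missing calls), the per-SNP quality vector, and the function names are all assumptions made for the example. Only the two cut-offs (1% and 15) come from the text.

```python
import numpy as np


def minor_allele_frequency(genotypes):
    """Per-SNP minor allele frequency from a samples x SNPs dosage matrix."""
    alt_freq = np.nanmean(genotypes, axis=0) / 2.0  # missing calls are ignored
    return np.minimum(alt_freq, 1.0 - alt_freq)


def snp_keep_mask(genotypes, quality, min_maf=0.01, min_quality=15.0):
    """True for SNPs that pass both the frequency and the quality cut-off."""
    maf = minor_allele_frequency(genotypes)
    return (maf >= min_maf) & (np.asarray(quality) >= min_quality)


# Toy data: 4 accessions x 3 SNPs (hypothetical values).
geno = np.array([
    [0.0, 0.0, 0.0],
    [2.0, 0.0, 2.0],
    [2.0, 0.0, np.nan],
    [0.0, 0.0, 2.0],
])
qual = np.array([30.0, 30.0, 10.0])

keep = snp_keep_mask(geno, qual)
# SNP 0 passes; SNP 1 is monomorphic; SNP 2 fails the quality cut-off.
print(keep)
filtered = geno[:, keep]
```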

Structure of wheat population

Population structure in the Iranian wheat accessions was inferred with STRUCTURE V.2.3.4, using a burn-in period of 30,000 iterations followed by 30,000 MCMC iterations [66]. Ten replicate runs were performed for each K value from 1 to 10 so that the run with the highest Ln likelihood could be selected. Genotypic data of the wheat accessions were imputed with TASSEL software [67]. Moreover, principal component analysis (PCA) was conducted to verify the STRUCTURE outcome. To determine the relationships among accessions, a neighbor-joining analysis was carried out in TASSEL V.5. Linkage disequilibrium (LD) was measured as r², the squared allele frequency correlation, and the significance of allele pairs was estimated with 1,000 permutations.
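As a rough illustration of the LD measure described above, the sketch below computes r² as the squared Pearson correlation between allele dosages at two loci and attaches a permutation p-value. It is not TASSEL's implementation: the dosage coding, the function names, and the way the permutation p-value is formed are assumptions, and monomorphic markers are assumed to have been removed beforehand.

```python
import numpy as np


def ld_r2(locus_a, locus_b):
    """Squared correlation between allele dosages at two loci."""
    r = np.corrcoef(locus_a, locus_b)[0, 1]
    return float(r * r)


def permutation_p_value(locus_a, locus_b, n_perm=1000, seed=0):
    """Observed r2 and the share of shuffled data sets reaching it."""
    rng = np.random.default_rng(seed)
    observed = ld_r2(locus_a, locus_b)
    hits = 0
    for _ in range(n_perm):
        # Shuffling one locus breaks any association between the two loci.
        if ld_r2(locus_a, rng.permutation(locus_b)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an exact zero.
    return observed, (hits + 1) / (n_perm + 1)


# Toy data: 30 inbred accessions, second locus differs at one accession.
a = np.array([0, 0, 0, 0, 2, 2, 2, 2, 0, 2] * 3, dtype=float)
b = a.copy()
b[0] = 2.0

r2, p = permutation_p_value(a, b)
print(f"r2 = {r2:.3f}, permutation p = {p:.4f}")
```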

Trait mean-based GWAS (pGWAS)

The mixed linear model (MLM) was used to estimate marker effects in the wheat population. The general linear model (GLM) included the population structure matrix (Q) as a covariate to correct for the effect of subpopulations. The MLM included both the family structure matrix (kinship, K) and Q to control both type I and type II errors. Association mapping was implemented with the MLM functions of TASSEL V.5. To correct for multiple testing, a false discovery rate was used to declare significant MTAs [66, 68]. In the present study, only the outcomes of the MLM procedure are reported. There are several methods for setting the significance threshold in GWAS, each with advantages and disadvantages; the most important step is confirming the results with further analysis. Here, a threshold of −log10(P) > 3 was used to retain a larger number of significant SNPs, whose importance was then assessed by GO and pathway analysis, while a threshold of −log10(P) > 5 was used to identify highly significant SNPs. To visualize associations between genotype and phenotype, Manhattan plots were produced with the CMplot package [69].
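The two fixed thresholds and the false discovery rate correction mentioned above can be sketched as follows. The text does not state which FDR procedure or level was used, so the Benjamini-Hochberg step-up rule at 0.05 is an assumption, as are the function names and the toy p-values.

```python
import numpy as np


def neg_log10(p_values):
    """Transform p-values to the -log10 scale used in Manhattan plots."""
    return -np.log10(np.asarray(p_values, dtype=float))


def benjamini_hochberg(p_values, alpha=0.05):
    """Boolean mask of tests declared significant by the BH step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # Compare the i-th smallest p-value with alpha * i / m.
    below = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        last = np.max(np.nonzero(below)[0])
        reject[order[: last + 1]] = True
    return reject


# Hypothetical p-values for five markers.
p_vals = [1e-6, 5e-4, 2e-3, 0.03, 0.5]
scores = neg_log10(p_vals)

suggestive = scores > 3        # broader set passed on to GO and pathway analysis
highly_significant = scores > 5
fdr_significant = benjamini_hochberg(p_vals)

print(suggestive, highly_significant, fdr_significant)
```

Fixed −log10(P) cut-offs and an FDR rule can disagree on individual markers, which is one reason the text stresses confirmation by downstream analysis.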

Breeding value-based GWAS (eGWAS)

Breeding values were obtained with three methods, rrBLUP [70], BRR [71], and gBLUP [72], as implemented in the Intelligent Prediction and Association Tool (iPat) software. A mixed linear model (MLM) was then used to estimate marker effects from the breeding values in the wheat population [9].

Annotation of putative candidate MTAs

The Ensembl-Gramene database [http://www.gramene.org/] was used to extract the molecular and biological functions of SNPs from the gene ontology, using IWGSC RefSeq V.2.0, which has been provided for Chinese Spring. Furthermore, the significant SNPs were analyzed with KOBAS version 2.0 for gene ontology enrichment analysis in KEGG [https://www.genome.jp/kegg/].

Genomic prediction strategies

GP was performed with three approaches: BRR [71, 73], gBLUP [72, 73], and rrBLUP [70, 73]. All analyses were performed in iPat [74]. For the population, 20% of the genotypes were randomly assigned to a validation set and the remaining genotypes were used as the training set. This process was repeated 100 times for each prediction approach. GP accuracy was calculated as Pearson’s correlation (r) between BLUPs and GEBVs over the validation and training sets [75].
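The validation scheme above (a random 20% hold-out repeated 100 times, with accuracy as Pearson's r) can be sketched with a plain ridge regression on markers. This is a simplified stand-in, not iPat or the rrBLUP package: the shrinkage parameter is fixed rather than estimated from variance components, the data are simulated, and accuracy is computed here between predictions and simulated phenotypes in the validation set. All names are hypothetical.

```python
import numpy as np


def ridge_predict(x_train, y_train, x_test, lam):
    """Ridge regression on centered markers, solved in its n x n dual form."""
    mu = y_train.mean()
    col_means = x_train.mean(axis=0)
    xc = x_train - col_means
    gram = xc @ xc.T + lam * np.eye(xc.shape[0])
    marker_effects = xc.T @ np.linalg.solve(gram, y_train - mu)
    return mu + (x_test - col_means) @ marker_effects


def cv_accuracy(x, y, n_rep=100, val_frac=0.2, lam=20.0, seed=0):
    """Pearson r between predicted and observed values over repeated splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_val = int(round(val_frac * n))
    accuracies = []
    for _ in range(n_rep):
        shuffled = rng.permutation(n)
        val, train = shuffled[:n_val], shuffled[n_val:]
        predicted = ridge_predict(x[train], y[train], x[val], lam)
        accuracies.append(np.corrcoef(predicted, y[val])[0, 1])
    return np.array(accuracies)


# Simulated data: 150 lines, 100 markers, 20 of them with a true effect.
sim = np.random.default_rng(1)
markers = sim.integers(0, 3, size=(150, 100)).astype(float)
effects = np.zeros(100)
effects[:20] = sim.normal(size=20)
genetic = markers @ effects
phenotype = genetic + sim.normal(scale=0.5 * genetic.std(), size=150)

acc = cv_accuracy(markers, phenotype)
print(f"mean accuracy over {acc.size} splits: {acc.mean():.2f}")
```

Repeating the split many times averages out the dependence of the accuracy estimate on which accessions happen to fall into the validation set.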

Statistical analysis

Descriptive statistics and correlation analysis were carried out in R V.4.1 using the dplyr, ggpubr, psych, and ggplot2 packages. Heatmap analysis was carried out with the heatmap.2 function in the gplots R package to classify the wheat accessions.

Availability of data and materials

The datasets generated and analyzed during the current study are available in the Figshare repository [https://doi.org/10.6084/m9.figshare.18774476.v1].

References

Rabieyan E, Alipour H. NGS-based multiplex assay of trait-linked molecular markers revealed the genetic diversity of Iranian bread wheat landraces and cultivars. Crop Pasture Sci. 2021;72(3):173–82. https://doi.org/10.1071/CP20362.
Rabieyan E, Bihamta MR, Esmaeilzadeh Moghaddam M, Mohammadi V, Alipour H. Morpho-colorimetric seed traits for the discrimination, classification and prediction of yield in wheat genotypes under rainfed and well-watered conditions. Crop Pasture Sci. 2022;73. https://doi.org/10.1071/CP22127.
Arif MAR, Waheed MQ, Lohwasser U, Shokat S, Alqudah AM, Volkmar C, Börner A. Genetic insight into the insect resistance in bread wheat exploiting the untapped natural diversity. Front Genet. 2022;13:828905. https://doi.org/10.3389/fgene.2022.828905.
Rabieyan E, Bihamta MR, Esmaeilzadeh Moghaddam M, Mohammadi V, Alipour H. Imaging-based screening of wheat seed characteristics towards distinguishing drought-responsive Iranian landraces and cultivars. Crop Pasture Sci. 2022;73(4):337–55. https://doi.org/10.1071/CP21500.
Gahlaut V, Jaiswal V, Singh S, et al. Multi-Locus Genome Wide Association Mapping for Yield and Its Contributing Traits in Hexaploid Wheat under Different Water Regimes. Sci Rep. 2019;9:19486. https://doi.org/10.1038/s41598-019-55520-0.
Mathew I, Shimelis H, Shayanowako AIT, Laing M, Chaplot V. Genome-wide association study of drought tolerance and biomass allocation in wheat. PLoS ONE. 2019;14(12):e0225383. https://doi.org/10.1371/journal.pone.0225383.
Rabieyan E, Bihamta MR, Esmaeilzadeh Moghaddam M, Mohammadi V, Alipour H. Genome-wide association mapping for wheat morphometric seed traits in Iranian landraces and cultivars under rain-fed and well-watered conditions. Sci Rep. 2022;12(1):1–21. https://doi.org/10.1038/s41598-022-22607-0
Mwadzingeni L, Shimelis H, Rees DJG, Tsilo TJ. Genome-wide association analysis of agronomic traits in wheat under drought-stressed and non-stressed conditions. PLoS ONE. 2017;12(2):e0171692. https://doi.org/10.1371/journal.pone.0171692.
Esmaeili-Fard SM, Gholizadeh M, Hafezian SH, Abdollahi-Arpanahi R. Genes and Pathways Affecting Sheep Productivity Traits: Genetic Parameters, Genome-Wide Association Mapping, and Pathway Enrichment Analysis. Front Genet. 2021;12:710613. https://doi.org/10.3389/fgene.2021.710613.
Vallejo RL, Cheng H, Fragomeni BO, Shewbridge KL, Gao G, MacMillan JR, Towner R, Palti Y. Genome-wide association analysis and accuracy of genome-enabled breeding value predictions for resistance to infectious hematopoietic necrosis virus in a commercial rainbow trout breeding population. Genet Sel Evol. 2019;51(1):47. https://doi.org/10.1016/j.gene.2020.144993.
Gupta PK, Balyan HS, Gahlaut V. QTL analysis for drought tolerance in wheat: present status and future possibilities. Agronomy. 2017;7(1):5. https://doi.org/10.3390/agronomy7010005.
Maulana F, Huang W, Anderson JD, Ma X. Genome wide association mapping of seedling drought tolerance in winter wheat. Front Plant Sci. 2020;11:573786. https://doi.org/10.3389/fpls.2020.573786.
Ballesta P, Mora F, Pozo AD. Association mapping of drought tolerance indices in wheat: QTL-rich regions on chromosome 4A. Sci Agric. 2020;77:2. https://doi.org/10.1590/1678-992X-2018-0153.
Edae EA, Byrne PF, Manmathan H, Haley SD, Moragues M, Lopes MS, et al. Association mapping and nucleotide sequence variation in five drought tolerance candidate genes in spring wheat. Plant Genome. 2013;6:13. https://doi.org/10.3835/plantgenome2013.04.0010.
Edae EA, Byrne PF, Haley SD, Lopes MS, Reynolds MP. Genome-wide association mapping of yield and yield components of spring wheat under contrasting moisture regimes. Theor Appl Genet. 2014;127:791–807. https://doi.org/10.1007/s00122-013-2257-8.
Dodig DM, Zoric B, Kobiljski J, Savic V, Kandic S, Quarrie S, Barnes J. Genetic and association mapping study of wheat agronomic traits under contrasting water regimes. Int J Mol Sci. 2012;13:6167–88. https://doi.org/10.3390/ijms13056167.
Poland JA, Endelman J, Dawson J, Rutkoski J, Wu S, Manes Y, Jannink JL. Genomic selection in wheat breeding using genotyping-by-sequencing. Plant Genome. 2012;5(3):103–13. https://doi.org/10.3835/plantgenome2012.06.0006.
Shabannejad M, Bihamta MR, Majidi-Hervan E, Alipour H, Ebrahimi A. A classic approach for determining genomic prediction accuracy under terminal drought stress and well-watered conditions in wheat landraces and cultivars. PLoS ONE. 2021;16(3):e0247824. https://doi.org/10.1371/journal.pone.0247824.
Sallam AH, Endelman JB, Jannink JL, Smith KP. Assessing genomic selection prediction accuracy in a dynamic barley breeding population. Plant Genome. 2015;8(1):2014–05. https://doi.org/10.3835/plantgenome2014.05.0020.
Zhao Y, Gowda M, Liu W, Würschum T, Maurer HP, Longin FH, Reif JC. Accuracy of genomic selection in European maize elite breeding populations. Theor Appl Genet. 2012;124(4):769–76. https://doi.org/10.1007/s00122-011-1745-y.
Spindel J, Begum H, Akdemir D, Virk P, Collard B, Redona E, McCouch SR. Genomic selection and association mapping in rice (Oryza sativa): effect of trait genetic architecture, training population composition, marker number and statistical model on accuracy of rice genomic selection in elite, tropical rice breeding lines. PLoS Genet. 2015;11(2):e1004982. https://doi.org/10.1371/journal.pgen.1004982.
Asoro FG, Newell MA, Beavis WD, Scott MP, Jannink JL. Accuracy and training population design for genomic selection on quantitative traits in elite North American oats. Plant Genome. 2011;4(2):132. https://doi.org/10.3835/plantgenome2011.02.0007.
Kanehisa M, Goto S. KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000;28:27–30. https://doi.org/10.1093/nar/28.1.27.
Kanehisa M. Toward understanding the origin and evolution of cellular organisms. Protein Sci. 2019;28:1947–51. https://doi.org/10.1002/pro.3715.
Kanehisa M, Furumichi M, Sato Y, Ishiguro-Watanabe M, Tanabe M. KEGG: integrating viruses and cellular organisms. Nucleic Acids Res. 2021;49:D545–51. https://doi.org/10.1093/nar/gkaa970.
Laido G, Marone D, Russo MA, Colecchia SA, Mastrangelo AM, De Vita P, et al. Linkage disequilibrium and genome-wide association mapping in tetraploid wheat (Triticum turgidum L.). PloS One. 2014;9(4):e95211. https://doi.org/10.1371/journal.pone.0095211.
Dholakia B, Ammiraju J, Singh H, Lagu M, Röder M, Rao V, et al. Molecular marker analysis of kernel size and shape in bread wheat. Plant Breed. 2003;122(5):392–5. https://doi.org/10.1046/j.1439-0523.2003.00896.x.
Kashif M, Khaliq I. Heritability, correlation and path coefficient analysis for some metric traits in wheat. Int J Agric Biol. 2004;6(1):138–42.
Alipour H, Bihamta MR, Mohammadi V, Peyghambari SA, Bai G, Zhang G. Genotyping-by-sequencing (GBS) revealed molecular genetic diversity of Iranian wheat landraces and cultivars. Front Plant Sci. 2017;8:1293. https://doi.org/10.3389/fpls.2017.01293.
Alipour H, Bai G, Zhang G, Bihamta MR, Mohammadi V, Peyghambari SA. Imputation accuracy of wheat genotyping-by-sequencing (GBS) data using barley and wheat genome references. PLoS One. 2019;14(1):e0208614. https://doi.org/10.1371/journal.pone.0208614.
Neumann K, Kobiljski B, Denčić S, Varshney R, Börner A. Genome-wide association mapping: a case study in bread wheat (Triticum aestivum L.). Mol Breed. 2011;27(1):37–58. https://doi.org/10.1007/s11032-010-9411-7.
Yagdi K, Sozen E. Heritability, variance components and correlations of yield and quality traits in durum wheat (Triticum durum Desf.). Pak J Bot. 2009;41(2):753–9.
Rahimi Y, Bihamta MR, Taleei A, Alipour H, Ingvarsson PK. Genome-wide association study of agronomic traits in bread wheat reveals novel putative alleles for future breeding programs. BMC Plant Biol. 2019;19(1):1–19. https://doi.org/10.1186/s12870-019-2165-4.
Bordes J, Goudemand E, Duchalais L, Chevarin L, Oury FX, Heumez E, Lapierre A, Perretant MR, Rolland B, Beghin D, et al. Genome-wide association mapping of three important traits using bread wheat elite breeding populations. Mol Breed. 2014;33:755–68. https://doi.org/10.1007/s11032-013-0004-0.
Sukumaran S, Lopes M, Dreisigacker S, Reynolds M. Genetic analysis of multi-environmental spring wheat trials identify genomic regions for locus-specific trade-offs for grain weight and grain number. Theor Appl Genet. 2018;131:985–98. https://doi.org/10.1007/s00122-017-3037-7.
Kumar N, Kulwal PL, Balyan HS, Gupta PK. QTL mapping for yield and yield contributing traits in two mapping populations of bread wheat. Mol Breed. 2007;19:163–77. https://doi.org/10.1007/s11032-006-9056-8.
Hoffstetter A, Cabrera A, Sneller C. Identifying quantitative trait loci for economic traits in an elite soft red winter wheat population. Crop Sci. 2016;56(2):547–58. https://doi.org/10.2135/cropsci2015.06.0332.
Sehgal D, Autrique E, Singh R, Ellis M, Singh S, Dreisigacker S. Identification of genomic regions for grain yield and yield stability and their epistatic interactions. Sci Rep. 2017;7(1):1–12. https://doi.org/10.1038/srep41578.
Ogbonnaya FC, Rasheed A, Okechukwu EC, Jighly A, Makdis F, Wuletaw T, Hagras A, Uguru MI, Agbo CU. Genome-wide association study for agronomic and physiological traits in spring wheat evaluated in a range of heat prone environments. Theor Appl Genet. 2017;130:1819–35. https://doi.org/10.1007/s11032-006-9056-8.
Lozada DN, Mason RE, Babar MA, Carver BF, Guedira GB, Merrill K, Arguello MN, Acuna A, Vieira L, Holder A, et al. Association mapping reveals loci associated with multiple traits that affect grain yield and adaptation in soft winter wheat. Euphytica. 2017;213(9):1–15. https://doi.org/10.1007/s10681-017-2005-2.
Sun C, Zhang F, Yan X, Zhang X, Dong Z, Cui D, Chen F. Genome-wide association study for 13 agronomic traits reveals the distribution of superior alleles in bread wheat from the Yellow and Huai Valley of China. Plant Biotechnol J. 2017;15:953–69. https://doi.org/10.1111/pbi.12690.
Arif MAR, Shokat S, Plieske J, Lohwasser U, Chesnokov YV, Kumar N, Kulwal P, McGuire P, Sorrells M, Qualset CO, Börner A. A SNP-based genetic dissection of versatile traits in bread wheat (Triticum aestivum L.). Plant J. 2021;108:960–76. https://doi.org/10.1111/tpj.15407.
Akram S, Arif MA, Hameed A. A GBS-based GWAS analysis of adaptability and yield traits in bread wheat (Triticum aestivum L.). J Appl Genet. 2021;62(1):27–41. https://doi.org/10.1007/s13353-020-00593-1.
Borner A, Plaschke J, Korzun V, Worland AJ. The relationships between the dwarfing genes of wheat and rye. Euphytica. 1996;89:69–75. https://doi.org/10.1007/BF00015721.
Snape JW, Law CN, Worland AJ. Whole chromosome analysis of height in wheat. Heredity. 1977;38:25–36. https://doi.org/10.1038/hdy.1977.4.
Said AA, MacQueen AH, Shawky H, Reynolds M, Juenger TE, El-Soda M. Genome-wide association mapping of genotype-environment interactions affecting yield-related traits of spring wheat grown in three watering regimes. Environ Exp Bot. 2022;194:104740. https://doi.org/10.1016/j.envexpbot.2021.104740.
Mo Y, Howell T, Vasquez-Gross H, de Haro LA, Dubcovsky J, Pearce S. Mapping causal mutations by exome sequencing in a wheat TILLING population: a tall mutant case study. Mol Genet Genomics. 2018;293:463–77. https://doi.org/10.1007/s00438-017-1401-6.
Gasperini D, Greenland A, Hedden P, Dreos R, Harwood W, Griffiths S. Genetic and physiological analysis of Rht8 in bread wheat: an alternative source of semi-dwarfism with a reduced sensitivity to brassinosteroids. J Exp Bot. 2012;63(12):4419. https://doi.org/10.1093/jxb/ers138.
Korzun V, Röder MS, Ganal MW, Worland AJ, Law CN. Genetic analysis of the dwarfing gene (Rht8) in wheat. Part I. Molecular mapping of Rht8 on the short arm of chromosome 2D of bread wheat (Triticum aestivum L.). Theor Appl Genet. 1998;96(8):1104–9. https://doi.org/10.1007/s001220050845.
Börner A, Schumann E, Fürste A, Cöster H, Leithold B, Röder M, Weber W. Mapping of quantitative trait loci determining agronomic important characters in hexaploid wheat (Triticum aestivum L.). Theor Appl Genet. 2002;105(6):921–36. https://doi.org/10.1007/s00122-002-0994-1.
Hayes BJ, Visscher PM, Goddard ME. Increased accuracy of artificial selection by using the realized relationship matrix. Genet Res. 2009;91:47–60. https://doi.org/10.1017/S0016672308009981.
Moser G, Tier B, Crump RE, Khatkar MS, Raadsma HW. A comparison of five methods to predict genomic breeding values of dairy bulls from genome-wide SNP markers. Genet Sel Evol. 2009;41:56. https://doi.org/10.1186/1297-9686-41-56.
VanRaden PM, Van Tassell CP, Wiggans GR, Sonstegard TS, Schnabel RD, Taylor JF, Schenkel F. Invited review: Reliability of genomic predictions for North American Holstein bulls. J Dairy Sci. 2009;92:16–24. https://doi.org/10.3168/jds.2008-1514.
Hayes BJ, Bowman PJ, Chamberlain AC, Goddard ME. Invited review: genomic selection in dairy cattle: progress and challenges. J Dairy Sci. 2009;92:433–43. https://doi.org/10.3168/jds.2008-1646.
Mathews KL, Malosetti M, Chapman S, McIntyre L, Reynolds M, Shorter R, et al. Multi-environment QTL mixed models for drought stress adaptation in wheat. Theor Appl Genet. 2008;117(7):1077–91. https://doi.org/10.1007/s00122-008-0846-8.
Thorwarth P, Ahlemeyer J, Bochard AM, Krumnacker K, Blümel H, Laubach E, Schmid KJ. Genomic prediction ability for yield-related traits in German winter barley elite material. Theor Appl Genet. 2017;130(8):1669–83. https://doi.org/10.1007/s00122-017-2917-1.
Crossa J, Jarquín D, Franco J, Pérez-Rodríguez P, Burgueño J, Saint-Pierre C, Singh S. Genomic prediction of gene bank wheat landraces. G3 (Bethesda). 2016;6(7):1819–34. https://doi.org/10.1534/g3.116.029637.
Azevedo Peixoto L, Moellers TC, Zhang J, Lorenz AJ, Bhering LL, Beavis WD, Singh AK. Leveraging genomic prediction to scan germplasm collection for crop improvement. PLoS ONE. 2017;12(6):e0179191. https://doi.org/10.1371/journal.pone.0179191.
Kehel Z, Sanchez-Garcia M, El Baouchi A, Aberkane H, Tsivelikas A, Charles C, Amri A. Predictive characterization for seed morphometric traits for genebank accessions using genomic selection. Front Ecol Evol. 2020;8:32. https://doi.org/10.3389/fevo.2020.00032.
Norman A, Taylor J, Edwards J, Kuchel H. Optimising genomic selection in wheat: effect of marker density, population size and population structure on prediction accuracy. G3 (Bethesda). 2018;8(9):2889–99. https://doi.org/10.1534/g3.118.200311.
Daetwyler HD, Bansal UK, Bariana HS, Hayden MJ, Hayes BJ. Genomic prediction for rust resistance in diverse wheat landraces. Theor Appl Genet. 2014;127(8):1795–803. https://doi.org/10.1007/s00122-014-2341-8.
Guo X, Xin Z, Yang T, Ma X, Zhang Y, Wang Z, Lin T. Metabolomics response for drought stress tolerance in chinese wheat genotypes (Triticum aestivum). Plants. 2020;9(4):520. https://doi.org/10.3390/plants9040520.
Kang S, Gu B, Du T, Zhang J. Crop coefficient and ratio of transpiration to evapotranspiration of winter wheat and maize in a semi-humid region. Agric Water Manag. 2003;59:239–54. https://doi.org/10.1016/S0378-3774(02)00150-6.
Browning BL, Browning SR. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals. Am J Hum Genet. 2009;84(2):210–23. https://doi.org/10.1016/j.ajhg.2009.01.005.
Team R. RStudio: integrated development for R. RStudio. Inc. Boston. 2015;42:14. http://www.rstudio.com.
Pritchard JK, Stephens M, Donnelly P. Inference of population structure using multilocus genotype data. Genetics. 2000;155(2):945–59. https://doi.org/10.1093/genetics/155.2.945.
Bradbury PJ, Zhang Z, Kroon DE, Casstevens TM, Ramdoss Y, Buckler ES. TASSEL: software for association mapping of complex traits in diverse samples. Bioinformatics. 2007;23(19):2633–5. https://doi.org/10.1093/bioinformatics/btm308.
Pérez P, de Los Campos G. Genome-wide regression and prediction with the BGLR statistical package. Genetics. 2014;198(2):483–95. https://doi.org/10.1534/genetics.114.164442.
Yin L, Zhang H, Tang Z, Xu J, Yin D, Zhang Z, Yuan X, Zhu M, Zhao S, Li X, Liu X. rMVP: a memory-efficient, visualization-enhanced, and parallel-accelerated tool for genome-wide association study. Genomics Proteomics Bioinformatics. 2021;19(4):619–28. https://doi.org/10.1016/j.gpb.2020.10.007.
Endelman JB. Ridge regression and other kernels for genomic selection with R package rrBLUP. Plant Genome. 2011;4(3):250–5. https://doi.org/10.3835/plantgenome2011.08.0024.
Joukhadar R, Thistlethwaite R, Trethowan RM, Hayden MJ, Stangoulis J, Cu S, Daetwyler HD. Genomic selection can accelerate the biofortification of spring wheat. Theor Appl Genet. 2021;134(10):3339–50. https://doi.org/10.1007/s00122-021-03900-4.
Clark SA, van der Werf J. Genomic best linear unbiased prediction (gBLUP) for the estimation of genomic breeding values. Methods Mol Biol. 2013;1019:321–30. https://doi.org/10.1007/978-1-62703-447-0_13.
Rabieyan E, Bihamta MR, Esmaeilzadeh Moghaddam M, Mohammadi V, Alipour H. Genome-wide association mapping and genomic prediction for preharvest sprouting resistance, low α-amylase and seed color in Iranian bread wheat. BMC Plant Biol. 2022;22(1):1–23. https://doi.org/10.1186/s12870-022-03628-3.
Chen CJ, Zhang Z. iPat: intelligent prediction and association tool for genomic research. Bioinformatics. 2018;34(11):1925–7. https://doi.org/10.1093/bioinformatics/bty015.
Resende MF, Munoz P, Resende MD, Garrick DJ, Fernando RL, Davis JM, Kirst M. Accuracy of genomic selection methods in a standard data set of loblolly pine (Pinus taeda L.). Genetics. 2012;190(4):1503–10. https://doi.org/10.1534/genetics.111.137026.

Acknowledgements

Not applicable.

Permission for land study

The authors declare that all land experiments and studies were carried out according to authorized rules.

Funding

This research did not receive any specific funding.

Author information

Authors and Affiliations

Department of Agronomy and Plant Breeding, Faculty of Agricultural Sciences and Engineering, University of Tehran, Karaj, Iran
Ehsan Rabieyan, Mohammad Reza Bihamta & Valiollah Mohammadi
Cereal Department, Seed and Plant Improvement Institute, Karaj, Iran
Mohsen Esmaeilzadeh Moghaddam
Department of Plant Production and Genetics, Faculty of Agriculture, Urmia University, Urmia, Iran
Hadi Alipour

Contributions

M.R. Bihamta and H. Alipour conceived the idea. M.R. Bihamta provided the plant materials. E. Rabieyan, M.E. Moghaddam and V. Mohammadi performed the field trial and were involved in designing and conducting the experiment. H. Alipour helped with the genomic data analysis. E. Rabieyan analyzed the field data and wrote the initial draft. All authors contributed to revising and editing the manuscript. All authors have read and approved the final manuscript.

Corresponding author

Correspondence to Mohammad Reza Bihamta.

Ethics declarations

Ethics approval and consent to participate

The authors declare that all the experimental research and field studies on plants (either cultivated or wild), including the collection of plant material, were carried out in accordance with relevant institutional, national, and international guidelines and legislation. Samples are provided from the Gene Bank of Agronomy and Plant Breeding Group and these samples are available at USDA and CIMMYT with USDA PI number and CIMMYT number (Table S2), respectively.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1:

Supplementary Table 1. Mean, coefficient of variation (CV), broad-sense heritability (H²), and combined analysis of variance based on studied traits in 298 Iranian wheat landraces and cultivars.
Supplementary Table 2. Overview of the landraces and cultivars of Iranian wheat studied.
Supplementary Table 3. Pattern of total monthly precipitation and irrigation for the 2018-19 and 2019-20 cropping seasons.
Supplementary Fig. 1. Correlation coefficients between the studied agronomic traits for Iranian wheat landraces and cultivars. (A, Well-watered; B, Rain-fed). Abbreviations: PH, Plant height; GY, Grain yield; GN, Grain number per spike; TKW, Thousand kernel weight; SW, Spike weight; SA, Spike area; SH, Spike harvest index; SF, Spike fertility.
Supplementary Fig. 2. GWAS results for agronomic traits and breeding values of Iranian landraces and cultivars in well-watered environments. Agronomic traits (A), BRR (B), gBLUP (C), and rrBLUP (D). Abbreviations: PH, Plant height; GY, Grain yield; GN, Grain number per spike; TKW, Thousand kernel weight; SW, Spike weight; SA, Spike area; SH, Spike harvest index; SF, Spike fertility.
Supplementary Fig. 3. Manhattan and QQ-plots of highly associated haplotypes for the MLM in Iranian wheat landraces and cultivars in well-watered environments. X axis represents chromosomes: 1) 1A, 2) 1B, 3) 1D, 4) 2A, 5) 2B, 6) 2D, 7) 3A, 8) 3B, 9) 3D, 10) 4A, 11) 4B, 12) 4D, 13) 5A, 14) 5B, 15) 5D, 16) 6A, 17) 6B, 18) 6D, 19) 7A, 20) 7B, 21) 7D.
Supplementary Fig. 4. GWAS results for agronomic traits and breeding values of Iranian landraces and cultivars in rain-fed environments. Agronomic traits (A), BRR (B), gBLUP (C), and rrBLUP (D). Abbreviations: PH, Plant height; GY, Grain yield; GN, Grain number per spike; TKW, Thousand kernel weight; SW, Spike weight; SA, Spike area; SH, Spike harvest index; SF, Spike fertility.
Supplementary Fig. 5. Manhattan and QQ-plots of highly associated haplotypes for the MLM in Iranian wheat landraces and cultivars in rain-fed environments. X axis represents chromosomes: 1) 1A, 2) 1B, 3) 1D, 4) 2A, 5) 2B, 6) 2D, 7) 3A, 8) 3B, 9) 3D, 10) 4A, 11) 4B, 12) 4D, 13) 5A, 14) 5B, 15) 5D, 16) 6A, 17) 6B, 18) 6D, 19) 7A, 20) 7B, 21) 7D.
Supplementary Fig. 6. The KEGG pathway of metabolic pathways.
Supplementary Fig. 7. The KEGG pathway of oxidative phosphorylation.
Supplementary Fig. 8. The KEGG pathway of biosynthesis of amino acids.
Supplementary Fig. 9. The KEGG pathway of ascorbate and aldarate metabolism.
Supplementary Fig. 10. The KEGG pathway of sulfur metabolism.
Supplementary Fig. 11. The KEGG pathway of fatty acid elongation.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Rabieyan, E., Bihamta, M.R., Moghaddam, M.E. et al. Genome-wide association mapping and genomic prediction of agronomical traits and breeding values in Iranian wheat under rain-fed and well-watered conditions. BMC Genomics 23, 831 (2022). https://doi.org/10.1186/s12864-022-08968-w

Received: 16 May 2022
Accepted: 26 October 2022
Published: 15 December 2022
DOI: https://doi.org/10.1186/s12864-022-08968-w
