- Research article
- Open Access
Construction of high-density genetic map and QTL mapping of yield-related and two quality traits in soybean RILs population by RAD-sequencing
BMC Genomics volume 18, Article number: 466 (2017)
One of the overarching goals of soybean breeding is to develop lines that combine increased yield with improved quality characteristics. High-density-marker QTL mapping can serve as an effective strategy to identify novel genomic information to facilitate crop improvement. In this study, we genotyped a recombinant inbred line (RIL) population (Zhonghuang 24 × Huaxia 3) using a restriction-site associated DNA sequencing (RAD-seq) approach. A high-density soybean genetic map was constructed and used to identify several QTLs that were shown to influence six yield-related and two quality traits.
A total of 47,472 single-nucleotide polymorphisms (SNPs) were detected for the RILs and integrated into 2639 recombination bin units, with an average distance of 1.00 cM between adjacent markers. Forty-seven QTLs for yield-related traits and 13 QTLs for grain quality traits were found to be distributed on 16 chromosomes across the 2 years of study. Among them, 18 QTLs were stable and were identified in both years. Twenty-six QTLs were identified for the first time, with a single QTL (qNN19a) in a 56 kb region explaining 32.56% of phenotypic variation, and 10 of these were novel, stable QTLs. Moreover, 8 QTL hotspots on four different chromosomes were identified for the correlated traits.
With RAD-sequencing, some novel QTLs and important QTL clusters for both yield-related and quality traits were identified based on a new, high-density bin linkage map. Three predicted genes were selected as candidates that likely have a direct or indirect influence on both yield and quality in soybean. Our findings will be helpful for understanding common genetic control mechanisms of co-localized traits and to select cultivars for further analysis to predictably modulate soybean yield and quality simultaneously.
Soybeans [Glycine max (L.) Merrill] contain complete protein and oil, providing all the essential amino acids necessary for the human diet. Hence, a great effort has been made to increase soybean yield while maintaining a high level of quality characteristics. Yield and quality-related traits of soybean are quantitative traits that are controlled by a combination of genetic and environmental factors.
Genetic maps built with traditional molecular markers, including restriction fragment length polymorphism (RFLP), simple sequence repeats (SSR) and amplified fragment length polymorphism (AFLP), have long been used to identify the genetic basis of complex traits in plants [4,5,6,7]. However, conventional molecular markers often have a low density and are unevenly distributed throughout the genome. The genetic maps developed with these markers have therefore limited both the efficiency and accuracy of QTL positioning. Recently, with the rapid development of high-throughput sequencing technology, single nucleotide polymorphism (SNP) markers have emerged as the molecular markers of choice because of their high density and relatively even distribution across plant genomes. They have also resolved many of the problems associated with the efficiency and accuracy of QTL mapping [8,9,10,11,12]. Several new technologies for SNP genotyping have been developed over the last few years. Huang et al. developed a high-throughput method for genotyping recombinant populations that uses whole-genome resequencing to construct a dense genetic map with recombination bins as markers. Restriction-site associated DNA sequencing (RAD-seq), one of the next generation sequencing (NGS) methods, has been effectively applied in high-throughput SNP marker discovery and quantitative trait loci (QTL) analysis, including the mapping of quality and agronomic trait loci in soybean.
Based on these new technologies for SNP genotyping, numerous QTLs associated with yield or quality traits have been identified in soybean [15,16,17]. For example, Kim et al. evaluated two populations for seed yield and other agronomic traits using 1536 SNP markers. In total, 8 QTLs for plant height and 3 QTLs for seed yield were identified. In another study, two QTLs for protein content and six QTLs for oil content were identified by Akond and colleagues using a RIL population derived from a cross of PI43848913 × Hamilton. Further, a high-density map was developed using the 5376 SNP markers from the Illumina Infinium BeadChip array. In addition, one protein and 11 oil content QTLs were detected in the MD96–5722 by ‘Spencer’ RIL population. Hwang et al. detected 40 SNPs associated with seed protein content and 25 SNPs associated with seed oil content. Among these markers, 7 SNPs were found to be significantly associated with both protein and oil content.
The objectives of the research reported here were (1) to develop a high-density soybean molecular genetic bin map with the RAD-seq method, (2) to map QTLs for yield and quality-related traits in the RIL population and compare these data with previous research (http://www.soybase.org), (3) to determine whether any QTLs were identified in both years and were co-localized with other trait-related QTLs, and (4) to select candidate genes that may influence both yield and quality using Gene Ontology (GO) enrichment analysis.
Plant materials and field trials
A RIL population was developed from a cross between Zhonghuang 24 (female parent) and Huaxia 3 (male parent) using a modified single seed method. Zhonghuang 24 is a variety with high oil content adapted to the Huang-Huai-Hai region. Huaxia 3 was derived from a cross between ‘Guizao 1’ and ‘BRSMG68 (Brazilian variety)’ that is a high-yielding soybean cultivar. The 164 F8 RILs were grown together with both parents at the Zengcheng Experimental Station (South China Agricultural University, Guangzhou, China) following a randomized complete block design with three replications in the summer of 2012. Each plot contained 10 plants per row, with 0.5 m between rows and 0.1 m between plants. The 146 F11 RILs were grown using the same methods in the same location in 2015. Field management followed normal soybean production practices for the area.
Measurement of yield-related and two quality traits
The five plants in the middle of each row were individually harvested to score the following traits: plant height (PH), number of nodes (NN), number of branches (BN), number of effective pods (EP), number of invalid pods (IP), 100-seed weight (SW), seed protein content (Pro) and seed oil content (Oil). PH was measured in mature plants as the distance (cm) from the cotyledonary node to the top node of the main stem. NN was measured by counting the number of nodes from the cotyledonary node to the top of the main stem. BN was determined by counting the number of branches with podding on the main stem. EP were obtained by counting the number of pods with more than one filled seed per pod. IP were obtained by counting number of pods that did not contain seed. SW was measured by weighing 100 random filled seeds. 50 g of seed from each line were used for protein and oil determination by an Infratec 1241 Grain Analyzer based on 10% moisture.
Frequency distributions and correlations for the parental lines and the RIL population were analyzed with SPSS Statistics 17.0 and Microsoft Excel 2007.
Genetic map and QTL detection
All the genotyping work was conducted at the Beijing Genome Institute (BGI) Tech, Shenzhen, China. The soybean reference genome from Williams 82 was used for read mapping with SOAP software. Input data for SNP calling with realSFS was prepared by SAMtools. Population SNP calling was performed with realSFS according to the site frequency at every site. The genotype likelihoods for each individual were integrated to extract candidate SNPs, which were then filtered using the following criteria: 40 ≤ depth ≤ 2500 and site probability ≥ 95%. The homozygous genotypes of the parents and their populations were obtained based on the high-fidelity SNPs. Using a sliding window approach with 15 SNPs per window and a step of one SNP, we identified the genotype of each window and the exchange sites for each individual, and then used the genotypes of each individual to generate bin information.
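The sliding window step can be illustrated with a minimal sketch. Only the window size (15 SNPs) and the step (one SNP) come from the text above; the calling rule (at least 12 of 15 SNPs from one parent), the 'A'/'B'/'H' coding and the function names are assumptions made for illustration and may differ from the pipeline actually used.

```python
from collections import Counter

def call_window_genotypes(snp_calls, window=15, min_count=12):
    """Slide a fixed-size window one SNP at a time and call each window.

    snp_calls: string of per-SNP parental origins ('A' or 'B') in physical order.
    min_count: assumed number of SNPs from one parent needed to call the window.
    Returns one call per window: 'A', 'B', or 'H' (ambiguous).
    """
    calls = []
    for start in range(len(snp_calls) - window + 1):
        counts = Counter(snp_calls[start:start + window])
        if counts["A"] >= min_count:
            calls.append("A")
        elif counts["B"] >= min_count:
            calls.append("B")
        else:
            calls.append("H")
    return calls

def breakpoints(window_calls):
    """Return the window indices where the unambiguous call switches parent."""
    points = []
    last = None
    for i, call in enumerate(window_calls):
        if call == "H":
            continue  # ambiguous windows do not define an exchange site
        if last is not None and call != last:
            points.append(i)
        last = call
    return points

# Hypothetical individual: 20 SNPs from parent A (one miscalled), then 20 from parent B.
individual = "A" * 10 + "B" + "A" * 9 + "B" * 20
calls = call_window_genotypes(individual)
print(breakpoints(calls))
```

The single miscalled SNP does not create a spurious exchange site, because each call is a majority over 15 neighbouring SNPs. Consecutive markers that share the same window calls across all individuals can then be merged into one recombination bin.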
A high-density genetic map was constructed using MST software (http://alumni.cs.ucr.edu/~yonghui/mstmap.html). The composite interval mapping (CIM) method was employed to scan for QTLs. The LOD thresholds for QTL significance were determined by a test with 1000 replications at a genome-wide significance level of 5%. The location of a QTL was described by its LOD peak position and the surrounding 95% confidence interval calculated using WinQTLCart software. The software output reports the additive effects of the QTLs and the phenotypic variation they explain. The LOD values are shown in Additional file 1. QTL mapping results were comprehensively compared to SoyBase (http://www.soybase.org/).
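A genome-wide threshold from 1000 replications is commonly obtained by permutation, and the sketch below assumes that this is the kind of test meant here. For brevity it uses a single-marker LOD scan rather than CIM, and the data, function names and effect size are hypothetical; it does not reproduce the WinQTLCart analysis.

```python
import math
import random

def single_marker_lod(genotypes, phenotypes):
    """LOD for one marker: two-group-mean model versus a single-mean model."""
    n = len(phenotypes)
    mean_all = sum(phenotypes) / n
    rss0 = sum((y - mean_all) ** 2 for y in phenotypes)
    rss1 = 0.0
    for allele in set(genotypes):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        mean_g = sum(group) / len(group)
        rss1 += sum((y - mean_g) ** 2 for y in group)
    return (n / 2) * math.log10(rss0 / rss1)

def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the genome-wide maximum LOD under permutation."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        max_lods.append(max(single_marker_lod(m, shuffled) for m in markers))
    max_lods.sort()
    return max_lods[math.ceil((1 - alpha) * n_perm) - 1]

# Hypothetical toy data: 80 lines, 10 unlinked markers coded 'A'/'B'.
rng = random.Random(1)
markers = [[rng.choice("AB") for _ in range(80)] for _ in range(10)]
# Marker 0 carries a simulated effect of 2 units; everything else is noise.
phenotypes = [(2.0 if g == "A" else 0.0) + rng.gauss(0, 1) for g in markers[0]]

threshold = permutation_threshold(markers, phenotypes)
observed = [single_marker_lod(m, phenotypes) for m in markers]
print(f"5% genome-wide LOD threshold: {threshold:.2f}")
print("markers above threshold:", [i for i, lod in enumerate(observed) if lod > threshold])
```

Taking the maximum LOD over all markers in each permutation is what makes the threshold genome-wide rather than per-marker.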
Method for naming QTLs
All QTLs were named according to Cui et al. as follows: the initial ‘q’ denotes ‘QTL’; the letters following it are the abbreviation of the corresponding trait; the next number is the soybean chromosome on which the corresponding QTL is located; then, ‘a’ and ‘b’ indicate that the QTL was identified in 2012 and 2015, respectively; if more than one QTL for a certain trait was found on the same chromosome, a serial number (-1, -2, etc.) is added after ‘a’ or ‘b’ to describe their order.
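The naming rule can be made concrete with a small parser. This is an illustrative sketch: the regular expression and the function name `parse_qtl_name` are assumptions, written to match names that appear in this article such as qNN19a and qPH19b-2.

```python
import re

# q + trait abbreviation + chromosome number + year code (a/b) + optional serial number
QTL_NAME = re.compile(r"^q([A-Za-z]+)(\d+)([ab])(?:-(\d+))?$")
YEAR_BY_CODE = {"a": 2012, "b": 2015}

def parse_qtl_name(name):
    """Split a QTL name into trait, chromosome, year and optional serial number."""
    match = QTL_NAME.match(name)
    if match is None:
        raise ValueError(f"not a valid QTL name: {name!r}")
    trait, chrom, year_code, serial = match.groups()
    return {
        "trait": trait,
        "chromosome": int(chrom),
        "year": YEAR_BY_CODE[year_code],
        "serial": int(serial) if serial else None,
    }

for name in ("qNN19a", "qPH19b-2", "qOil02b-1"):
    print(name, parse_qtl_name(name))
```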
Phenotypic analysis of the RIL population
Most of the traits of Huaxia 3 showed higher values compared with those of Zhonghuang 24, providing ideal material for population construction and QTL analysis, with the exception of oil (Additional file 2). Figure 1 shows the frequency distribution for eight traits in 2 years. Phenotypic values were found to be continuous with normal or skew normal distributions. Transgressive segregation in the RILs was shown for eight traits, suggesting that alleles with positive effects on the measured traits are distributed among the parents.
The correlation analysis showed that most of the yield-related traits were correlated with each other in both years (Table 1). PH was positively correlated with NN, BN, EP, IP and SW, except for EP and IP in 2012 and SW in 2015, where the correlation was not significant. NN also showed significant positive correlations with BN and EP in both years, but no correlation was detected with SW. Significant negative correlations were found for SW with BN and EP, ranging from r = −0.215** to r = −0.327** in both years, whereas SW had a significant positive correlation with protein (r = 0.245**) in 2012. Most previous studies reported a strong negative correlation between seed protein and seed oil content [20, 27]. In our study, a highly significant negative correlation (r = −0.775**, r = −0.761**) was observed between protein and oil in both years.
High-density SNP linkage map construction
Based on 0.2× RAD-seq (restriction-site associated DNA sequencing) of the Zhonghuang 24 and Huaxia 3 RIL population, 57.40 G sequence reads were obtained and the average read number was 311.97 M. Half of them had more than 200 M reads. From these data, a total of 47,472 high-quality polymorphic SNP sites were detected for the RILs. All of the SNP sites in the RILs were integrated into recombination bin units, and 2639 recombinant bins were obtained. The average physical length of the bins was 360.01 kb, ranging from 20.01 kb to 17.43 Mb. A total of 1126 bins were shorter than 100 kb, 609 bins ranged from 100 kb to 200 kb, 291 bins from 200 kb to 300 kb, 175 bins from 300 kb to 400 kb, and 438 bins were longer than 400 kb. Based on the genotypes of the 2639 bins, a high-density bin linkage map was constructed covering 2638.24 cM, with an average distance of 1.00 cM between adjacent markers. For each chromosome, the average genetic distance between adjacent bins ranged from 0.67 to 1.51 cM (Table 2). Therefore, the linkage map constructed with recombination bins resulted in well-distributed linkage distances and has higher resolution than conventional maps.
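The map statistics above can be cross-checked with a few lines of arithmetic. The sketch uses only numbers quoted in this section; the variable names are illustrative, and the average spacing is computed as map length divided by the number of bins, which is an assumption that happens to reproduce the reported 1.00 cM.

```python
# Bin counts per physical length class, as reported in the text.
bin_classes = {
    "<100 kb": 1126,
    "100-200 kb": 609,
    "200-300 kb": 291,
    "300-400 kb": 175,
    ">400 kb": 438,
}

total_bins = sum(bin_classes.values())
map_length_cm = 2638.24
mean_bin_length_kb = 360.01

mean_spacing_cm = map_length_cm / total_bins          # assumed definition of the average distance
covered_mb = total_bins * mean_bin_length_kb / 1000   # physical length implied by the bin sizes
share_short = bin_classes["<100 kb"] / total_bins     # fraction of bins shorter than 100 kb

print(total_bins)
print(round(mean_spacing_cm, 2))
print(round(covered_mb, 2))
print(round(100 * share_short, 1))
```

The five length classes sum to the 2639 bins reported, and the implied physical coverage is about 950 Mb.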
QTL analysis for yield-related traits
Forty-seven QTLs associated with yield-related traits, including PH, NN, BN, EP, IP and SW, were identified on 13 chromosomes (Chr04, Chr05, Chr06, Chr07, Chr08, Chr11, Chr12, Chr13, Chr14, Chr15, Chr17, Chr19, Chr20) (Fig. 2). A single QTL explained from 3.78% (qPH13a) to 32.56% (qNN19a) of phenotypic variance. Among the QTLs, 28 were identified on ten chromosomes in 2012. The most prominent QTL, with the highest LOD score (15.63), was identified in a 56 kb region and designated qNN19a; it explained 32.56% of phenotypic variation and displayed a negative additive effect, with the positive allele coming mainly from the male parent Huaxia 3. Nineteen QTLs on nine chromosomes were detected in 2015, of which qPH19b-2 had the highest LOD score (10.34), explained 24.49% of phenotypic variation and showed a negative additive effect, with the allele from the male parent Huaxia 3. Of these QTLs, 24 were in agreement with earlier reports and 23 were found to be novel (Additional file 3 and Table 3). Eight QTL clusters responsible for more than two traits were detected on four different chromosomes (Additional file 4). A total of 18 QTLs were stable across both years. Thirty-four of these QTLs had a positive additive effect, contributed by the female parent Zhonghuang 24, whereas 13 QTLs had a negative effect, with additive alleles from the male parent Huaxia 3.
QTL analysis for quality traits
A total of 13 QTLs were associated with quality traits on ten different chromosomes (Chr01, Chr02, Chr06, Chr07, Chr10, Chr11, Chr12, Chr13, Chr17, Chr20) across both growing seasons (Fig. 2). In 2012, three QTLs for protein content were identified on Chr07, Chr10 and Chr13. Five QTLs for oil content were identified on Chr01, Chr06, Chr10, Chr11 and Chr20, with the phenotypic variance explained ranging from 6.76% (qOil11a) to 13.30% (qOil01a). Four QTLs (qPro07a, qPro13a, qOil06a, qOil20a) showed positive additive effects ranging from 0.27 (qOil20a) to 0.42 (qPro13a), while the other four QTLs (qPro10a, qOil01a, qOil10a, qOil11a) showed negative additive effects ranging from −0.27 to −0.47. In 2015, a QTL (qPro17b) for protein content was detected in a 52 kb region on Chr17, explaining 9.29% of phenotypic variation. In addition, four QTLs on Chr02 and Chr12 were identified for oil content, which individually explained between 7.52% (qOil02b-1) and 12.49% (qOil02b-2) of the phenotypic variation. Three of these QTLs had positive additive effects, indicating that the female parent, Zhonghuang 24, contributed the alleles for increased oil content. A total of ten QTLs had been reported in prior studies, and three QTLs were identified for the first time in the present study.
Gene Ontology enrichment analysis based on the QTL hotspot
It is noteworthy that an important QTL hotspot was mapped to a physical position between 43,923,975 and 45,138,371 bp on Chr19. Seven QTLs associated with five traits, explaining up to 32.56% of phenotypic variation, were all detected within this genomic region, which was previously reported to be associated with seed weight, protein and oil in several different studies. In order to gain an in-depth understanding of which genes/QTLs were related to yield and quality in this region, we retrieved gene calls and annotations using the Glyma.Wm82.a1.v1.1 gene model from SoyBase (https://soybase.org/SequenceIntro.php#mapscompare). A total of 139 genes were found within this region using Gene Ontology enrichment analysis, and among them, 51 annotated genes were closely related to yield or quality and could be classified into five groups (Additional file 5). The first group contains 13 genes associated with phytohormone regulation, including hormones such as auxin, abscisic acid and ethylene, which play an essential role in coordinating in vitro and in vivo regulation mechanisms to simultaneously improve yield and quality. The second group comprises 19 genes associated with metabolic processes, including carbohydrate metabolism, lipid metabolism, fatty acid catabolism and brassinosteroid metabolism, which are known to affect the growth and development of soybean. The third group contains 6 genes associated with protein phosphorylation, which could be related to the functional properties of food protein. The fourth group is made up of 16 genes associated with cellular processes, including cell differentiation, cell proliferation, multicellular organism reproduction, and cell growth, which may have positive consequences for grain yield and quality in plants.
The fifth group consists of 16 genes associated with organ morphogenesis, including the development of root, stamen, leaf and seedling, which may directly influence soybean yield and quality.
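Retrieving the gene models that fall inside the Chr19 hotspot amounts to an interval overlap query. The sketch below shows the idea; the interval is taken from the text, but the gene records, their identifiers and their coordinates are hypothetical placeholders, not SoyBase data.

```python
# Hotspot interval on Chr19 as given in the text (closed interval, bp).
HOTSPOT = ("Chr19", 43_923_975, 45_138_371)

def genes_in_region(genes, region):
    """Return IDs of genes that overlap the region on the same chromosome."""
    chrom, start, end = region
    return [
        gene["id"]
        for gene in genes
        if gene["chrom"] == chrom and gene["start"] <= end and gene["end"] >= start
    ]

# Hypothetical gene records for illustration only.
example_genes = [
    {"id": "geneA", "chrom": "Chr19", "start": 43_900_000, "end": 43_925_000},  # overlaps left edge
    {"id": "geneB", "chrom": "Chr19", "start": 44_500_000, "end": 44_503_200},  # fully inside
    {"id": "geneC", "chrom": "Chr19", "start": 45_200_000, "end": 45_204_000},  # downstream, outside
    {"id": "geneD", "chrom": "Chr06", "start": 44_000_000, "end": 44_002_000},  # wrong chromosome
]

hotspot_length_bp = HOTSPOT[2] - HOTSPOT[1] + 1
print(hotspot_length_bp)
print(genes_in_region(example_genes, HOTSPOT))
```

The hotspot spans roughly 1.21 Mb. Genes that only partly overlap the interval are kept here, which is a design choice for the sketch rather than a statement about how the 139 genes were selected.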
Main effect factors for QTL mapping
The utility of QTL mapping lies in obtaining valuable alleles and understanding genetic mechanisms, thus promoting genetic improvement of soybean by molecular methods, which is one of the main objectives in soybean breeding. Parental genetic diversity, environmental effects, and marker density are the main factors affecting QTL mapping. In this study, the parents of the RIL population are derived from geographically distinct locations. Zhonghuang 24 is a main variety grown in central China, while the male parent, Huaxia 3, is derived from high-yielding Brazilian soybean germplasm and has become a main variety grown in southern China. Our data indicated that there were more differences in yield and quality-related traits between Zhonghuang 24 and Huaxia 3 than in other similar studies. Thus, the QTLs detected for these traits could be more useful for soybean improvement. In addition, quantitative traits can be strongly affected by environmental factors. In order to find QTLs that are stably expressed across environments, we chose two non-consecutive years: 2012, which was determined to have a suitable climate, and 2015, which experienced greater rainfall throughout all growth stages. According to the Guangzhou Meteorological Service (http://www.gz121.gov.cn/), the total rainfall from July to October was 433 mm in 2012 and 1023 mm for the same period in 2015. Under these conditions, the QTLs identified in both years can be considered robust and environmentally stable. Furthermore, QTL mapping based on the resequencing genotyping method resulted in the integration of a total of 47,472 SNPs into 2639 recombination bin units. These were used to construct a high-density bin linkage map with an average distance of 1.00 cM between adjacent markers. The map has well-distributed linkage distances and higher resolution than conventional maps, making QTL mapping more accurate and reliable.
Comparison of the present study with previous research
In the present study, 14 QTLs were identified for PH, explaining 3.78 to 28.01% of phenotypic variation across the two growth seasons, of which qPH19a was a major QTL associated with PH and was detected in both years. This QTL has been previously reported by Lee et al. and Specht et al. [32, 33]. It is worth noting the importance of the novel QTLs (qPH04a, qPH04b) on Chr04 identified in this study, because they were expressed in both years and accounted for 15.71 and 21.53% of phenotypic variation, respectively. Three QTLs (qPH06a-1, qPH06a-2, qPH06b-2) were identified on Chr06 in regions similar to those previously reported by Wang et al. and Gai et al. [34, 35]. Four novel QTLs (qPH06b-1, qPH12a-1, qPH12a-2, qPH14a-2) were identified on Chr06, Chr12 and Chr14 for PH. NN was found to be influenced by nine distinct QTLs distributed across four chromosomes. The QTL detected on Chr04 in 2012, with an interval of 3,740,934–3,781,822 bp, was in a similar region to another one identified in 2015 (3,657,048–3,740,933 bp), and it is likely that they are the same. Two QTLs were identified on Chr19, qNN19a and qNN19b, which were consistent in both years and explained up to 32.56% of phenotypic variation. Interestingly, no similar positions were found for NN in prior studies. BN, a key constituent of soybean yield, has been studied extensively. Some researchers suggest that production could be increased by adjusting the branch number, which was confirmed by Panthee et al. In their study, sd yld24–1 was mapped for yield traits with satt076 on Chr19. Interestingly, qBN19a, which controls the number of branches in our study, falls within this interval. Moreover, sd wt4–1 and sd wt11–1 for seed weight were identified by Maughan and Lee [37, 38], and were located at the same position as qBN11a, qEP11a, and qEP11b on Chr11 in this study.
Three other novel QTLs (qBN04a, qBN05b, qBN08b) for BN were detected on Chr04, Chr05, and Chr08, accounting for 6.29 to 13.44% of phenotypic variation. Pod number and 100-seed weight are important parameters for measuring soybean yield and are controlled by multiple genes. Two QTLs, qEP19a and qIP19b, on Chr19 were found to be associated with pod number during both years, and were located in the same region as those previously reported by Zhang et al. Moreover, qSW19a-1 was shown to be associated with 100-seed weight and also mapped on Chr19 near this interval. Orf et al. reported a fine-mapped 100-seed weight QTL located on Chr15, which overlapped the intervals of the QTL for SW detected in both years in the present study.
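The statement that the 2012 and 2015 node-number QTLs on Chr04 likely represent the same locus can be supported by checking how their physical intervals relate. The sketch below uses the two intervals quoted above; the helper name `gap_bp` is an assumption for illustration.

```python
def gap_bp(interval_a, interval_b):
    """Bases strictly between two closed intervals.

    Returns 0 when the intervals are directly adjacent and a negative
    number when they overlap.
    """
    (first_start, first_end), (second_start, second_end) = sorted([interval_a, interval_b])
    return second_start - first_end - 1

nn_chr04_2012 = (3_740_934, 3_781_822)
nn_chr04_2015 = (3_657_048, 3_740_933)

gap = gap_bp(nn_chr04_2012, nn_chr04_2015)
combined_span = max(nn_chr04_2012[1], nn_chr04_2015[1]) - min(nn_chr04_2012[0], nn_chr04_2015[0]) + 1

print(gap)
print(combined_span)
```

A gap of zero means the two intervals are neighbouring bins with no bases between them, which is consistent with a single underlying QTL whose peak shifted by one bin between years.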
In our study, a total of 4 protein content QTLs and 9 seed oil content QTLs were identified across the 2 years. Three QTLs (qPro10a, qPro13a, qPro17b) were found to be novel, and no similar position has been identified previously for protein content. Ten of the 13 QTLs relevant to protein or oil content detected in the present study were consistent with previous research, and some of them narrowed the previously reported interval. For example, a QTL associated with oil content, qOil06a, was found on Chr06 (37,764,770–38,299,977 bp). Palomeque et al. also reported that a QTL for oil content fell within the same interval, and a similar locus for seed oil and ‘oil plus protein’ related traits was also published by other researchers [41, 42], which indicates that this QTL is stable and may have pleiotropic effects. Meanwhile, three other QTLs (qPH06a-2, qNN06a-2, qBN06a) for yield-related traits were mapped to a similar region in our study and explained 6.65 to 19.77% of phenotypic variation. qOil20a was mapped in a 39 kb region to bin 73 on Chr20 (34,770,628–34,809,740 bp), which falls within the same region identified by both Qi et al. and Reinprecht et al. [43, 44]. Moreover, qSW20b-2 (33,207,531–33,259,106 bp) for yield was also located near this position, suggesting that these two regions should be of great value for genetic improvement of both soybean yield and quality. The remaining QTLs associated with protein or oil content that agree with previous studies are presented in Additional file 3 [27, 45,46,47,48,49,50]. The coincidence of QTLs across different genetic backgrounds not only reveals the stability and reliability of the QTLs detected herein but also highlights the significance of these regions for marker-based breeding work designed to develop soybean cultivars with higher protein or oil content.
Important QTL hotspots
Most of the QTLs were clustered in eight genomic regions, particularly on Chr04, Chr06, Chr11 and Chr19 (Additional file 4). These QTL hotspots each included at least two traits, such as PH, NN and SW, and were previously reported to be associated with some other traits in different genetic sources. Four QTLs for yield-related traits were mapped in two intervals within 3,657,048–3,781,822 bp on Chr04, which explained 6.17–17.68% of phenotypic variation. These QTLs have not been published previously and add to the growing knowledge on the genetic control of these traits. Three other QTLs were also detected on Chr04 (3,815,206–5,131,478 bp), explaining 9.59–21.53% of phenotypic variation. This region was reported to be associated with seed protein and seed weight in some earlier studies [33, 40]. Seven QTLs for PH, NN, BN, and oil were identified in two regions (18,376,759–19,504,937 bp and 37,764,770–41,420,709 bp) that were separated by a distance of more than 7 cM on Chr06, and accounted for 5.18–19.77% of phenotypic variation. Previously, Sun et al. located two QTLs for pod number on Chr06 near these two regions. The first region on Chr06 in the present study has been shown to be associated with different traits by other researchers [42, 45, 52]. Moreover, Chen et al. found that two QTLs for pod number and seed oil plus protein were consistent with the second region on Chr06 in our study. More seed weight, protein and oil content QTLs were mapped to this locus in previous studies [17, 41, 45, 53, 54]. Three QTLs for BN and EP were identified on Chr11 that explained 4.97–9.31% of phenotypic variation. Of these, two QTLs for EP were expressed over 2 years. Three previously reported QTLs for protein content and seed weight were located in this region [35, 37, 38]. Seven QTLs were located in a physical position (43,923,975–45,138,371 bp) on Chr19, of which qPH19a, qNN19a and qPH19b-2 had larger effects on phenotypic variation (28.01, 32.56 and 24.49%, respectively) than the others.
Mansur et al. found that two QTLs associated with protein and oil were close to this region. Orf et al. also reported that this locus was associated with seed weight. In addition, three QTLs (qBN19a, qPH19b-1, qEP19b) were detected on Chr19 (40,662,371–40,701,058 bp) in this study. Similar loci have been previously reported for seed weight, protein and oil content [43, 46, 56]. Another two QTLs (qIP19a, qSW19a-2) on Chr19 were mapped to the interval of 42,309,067–42,469,449 bp. Some seed weight QTLs were detected near this position in past studies [40, 45, 56, 57]. Moreover, QTLs for protein and oil content were also previously identified in this region by both Orf et al. and Qi et al. [40, 43]. Interestingly, in this study, highly significant correlations were observed among PH, NN, BN, EP and SW. QTL mapping analysis showed that these traits were all linked to the same regions on three chromosomes (Chr04, Chr06, Chr19), which is consistent with the conclusion of the phenotypic correlation analysis and provides a genetic explanation for these associations. These QTL clusters may reflect pleiotropy or close linkage between the loci controlling the related traits. Every single cluster may function as an independent gene or as closely linked genes. More importantly, some of the QTLs on Chr04, Chr06, Chr11, and Chr19 were identified in both years. These chromosome regions can be considered robust and environmentally stable, which could be helpful for further studies aimed at simultaneously altering soybean yield and quality in a predictable manner.
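The hotspot regions whose coordinates are given above can be tabulated to compare their physical sizes. The sketch lists only the seven intervals stated in the text (the Chr11 hotspot is omitted because its coordinates are not given); the data structure and names are illustrative assumptions.

```python
from collections import Counter

# (chromosome, start bp, end bp) for the hotspot regions quoted in the text.
hotspot_regions = [
    ("Chr04", 3_657_048, 3_781_822),
    ("Chr04", 3_815_206, 5_131_478),
    ("Chr06", 18_376_759, 19_504_937),
    ("Chr06", 37_764_770, 41_420_709),
    ("Chr19", 40_662_371, 40_701_058),
    ("Chr19", 42_309_067, 42_469_449),
    ("Chr19", 43_923_975, 45_138_371),
]

def region_length_bp(region):
    """Length of a closed interval in base pairs."""
    _, start, end = region
    return end - start + 1

regions_per_chromosome = Counter(chrom for chrom, _, _ in hotspot_regions)
lengths_kb = {
    f"{chrom}:{start}-{end}": round(region_length_bp((chrom, start, end)) / 1000, 1)
    for chrom, start, end in hotspot_regions
}

print(dict(regions_per_chromosome))
for label, kb in lengths_kb.items():
    print(label, kb, "kb")
```

Together with the single Chr11 region, these seven intervals account for the eight hotspots on four chromosomes, and their sizes range from tens of kilobases to several megabases.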
Three candidate genes on Chr19
Based on the predicted function of the five groups, three predicted genes (Glyma19g37910, Glyma19g37570 and Glyma19g36990) were selected as the best candidate genes that may affect both yield and quality because they are involved in various biological processes (Table 4). Glyma19g37910 encodes a member of the basic leucine zipper transcription factor family, whose Arabidopsis counterpart is involved in abscisic acid signalling during seed maturation and germination. GO analysis showed that this gene participates in more than ten biological processes, including seed development, lipid storage, gibberellin biosynthesis, and the vegetative to reproductive phase transition of the meristem. Glyma19g37570 has a domain predicted to encode a serine/threonine protein kinase that could influence cells in various ways. This gene is related to stem cell division, protein phosphorylation, gibberellin biosynthesis and the timing of the transition from vegetative to reproductive development. Glyma19g36990 encodes a plastidic triose phosphate isomerase, and GO analysis revealed that this gene participates in three catabolic processes (glycine, tryptophan, and glycerol) and four biosynthetic processes (indoleacetic acid, cysteine, glyceraldehyde-3-phosphate, and isopentenyl diphosphate). Moreover, it also plays a key role in multicellular organism reproduction and primary root development, which may affect the yield and quality of crops. In general, these three candidates should be investigated in more detail in further studies to increase our understanding of the factors involved in improving quality and productivity in soybean.
In this study, we genotyped a recombinant inbred line (RIL) population (Zhonghuang 24 × Huaxia 3) using a restriction-site associated DNA sequencing (RAD-seq) approach. A high-density soybean genetic map with 2639 recombination bins was constructed and used to identify QTLs that were shown to influence six yield-related and two quality traits. A total of 47 QTLs for six yield-related traits and 13 QTLs for two quality traits were identified. Of these, 34 QTLs detected herein were coincident with those of previous research [18, 27, 32, 34, 35, 39–50, 56, 57, 59–64]. Eighteen QTLs were stable QTLs that were identified in 2 years. Twenty-six QTLs were shown for the first time in this research, of which 10 were novel and stable QTLs. In addition, eight QTL hotspots on four chromosomes were identified for the correlated traits. Three predicted genes were selected as candidate genes that may directly or indirectly influence both yield and quality in soybean.
BN: Number of branches on main stem
CIM: Composite interval mapping
EP: Number of effective pods
IP: Number of invalid pods
LOD: Logarithm of the odds
NN: Number of nodes on main stem
QTL: Quantitative trait loci
SNP: Single nucleotide polymorphism
- R2 :
RAD-seq: Restriction-site associated DNA sequencing
RIL: Recombinant inbred line
Warrington CV, Abdel-Haleem H, Hyten DL, Cregan PB, Orf JH, Killam AS, et al. QTL for seed protein and amino acids in the Benning × Danbaekkong soybean population. Theor Appl Genet. 2015;128(5):839–50.
Wang J, Chen P, Wang D, Shannon G, Zeng A, Orazaly M, et al. Identification and mapping of stable QTL for protein content in soybean seeds. Mol Breed. 2015;35(3):1–10.
Pathan SM, Vuong T, Clark K, Lee JD, Shannon JG, Roberts CA, et al. Genetic mapping and confirmation of quantitative trait loci for seed protein and oil contents and seed weight in soybean. Crop Sci. 2013;53(3):765–74.
Quinn TW, White BN. Identification of restriction-fragment-length polymorphisms in genomic DNA of the lesser snow goose (Anser caerulescens caerulescens). Mol Biol Evol. 1987;4(2):126–43.
Zietkiewicz E, Rafalski A, Labuda D. Genome fingerprinting by simple sequence repeat (SSR)-anchored polymerase chain reaction amplification. Genomics. 1994;20(2):176–83.
Vos P, Hogers R, Bleeker M, Reijans M, Vandelee T, Hornes M. AFLP: a new technique for DNA fingerprinting. Nucleic Acids Res. 1995;23(21):4407–14.
Powell W, Morgante M, Andre C, Hanafey M, Vogel J, Tingey S, et al. The comparison of RFLP, RAPD, AFLP and SSR (microsatellite) markers for germplasm analysis. Mol Breed. 1996;2(3):225–38.
Baird NA, Etter PD, Atwood TS, Currey MC, Shiver AL, Lewis ZA, et al. Rapid SNP discovery and genetic mapping using sequenced RAD markers. PLoS One. 2008;3(10):e3376.
Patil G, Do T, Vuong TD, Valliyodan B, Lee JD, Chaudhary J, et al. Genomic-assisted haplotype analysis and the development of high-throughput SNP markers for salinity tolerance in soybean. Sci Rep. 2016;6:19199.
Lee YG, Jeong N, Kim JH, Lee K, Kim KH, Pirani A, et al. Development, validation and genetic analysis of a large soybean SNP genotyping array. Plant J. 2015;81(4):625–36.
Wu J, Li LT, Li M, Khan MA, Li XG, Chen H, et al. High-density genetic linkage map construction and identification of fruit-related QTLs in pear using SNP and SSR markers. J Exp Bot. 2014;65(20):5771–81.
Juwattanasomran R, Somta P, Chankaew S, Shimizu T, Wongpornchai S, Kaga A, et al. A SNP in GmBADH2 gene associates with fragrance in vegetable soybean variety “Kaori” and SNAP marker development for the fragrance. Theor Appl Genet. 2011;122(3):533–41.
Huang X, Feng Q, Qian Q, Zhao Q, Wang L, Wang A, et al. High-throughput genotyping by whole-genome resequencing. Genome Res. 2009;19(6):1068–76.
Zhou L, Wang SB, Jian J, Geng QC, Wen J, Song Q, et al. Identification of domestication-related loci associated with flowering time and seed size in soybean with the RAD-seq genotyping method. Sci Rep. 2015;5:9350.
Liu WX, Kim MY, Van K, Lee YH, Li HL, Liu XH, et al. QTL identification of yield-related traits and their association with flowering and maturity in soybean. J Crop Sci Biotechnol. 2011;14(1):65–70.
Wang J, Chen PY, Wang DC, Shannon G, Shi AN, Shi A, et al. Identification of quantitative trait loci for oil content in soybean seed. Crop Sci. 2014;55(1):23–34.
Rossi ME, Orf JH, Liu LJ, Dong ZM, Rajcan I. Genetic basis of soybean adaptation to north American vs. Asian mega-environments in two independent populations from Canadian × Chinese crosses. Theor Appl Genet. 2013;126(7):1809–23.
Kim KS, Diers BW, Hyten DL, Rouf Mian MA, Shannon JG, Nelson RL. Identification of positive yield QTL alleles from exotic soybean germplasm in two backcross populations. Theor Appl Genet. 2012;125(6):1353–69.
Akond M, Ragin B, Bazzelle R, Kantartzi SK. Quantitative trait loci associated with moisture, protein, and oil content in soybean [Glycine max (L.) Merr.]. J Agric Sci. 2012;4(11):16–25.
Akond M, Liu SM, Boney M, Kantartzi SK, Meksem K, Bellaloui N. Identification of quantitative trait loci (QTL) underlying protein, oil, and five major fatty acids contents in soybean. Am J Plant Sci. 2014;5(1):158–67.
Hwang EY, Song QJ, Jia GF, Specht JE, Hyten DL, et al. A genome-wide association study of seed protein and oil content in soybean. BMC Genomics. 2014;15(1):1.
Jumbo MD, Weldekidan T, Holland JB, Hawk A. Comparison of conventional, modified single seed descent, and doubled haploid breeding methods for maize inbred line development using Germplasm enhancement of maize breeding crosses. Crop Sci. 2011;51(4):1534–43.
Li BR, Yu C, Li Y. SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics. 2009;25(15):1966–7.
Li R, Li Y, Fang X, Yang H, Wang J, Kristiansen K, et al. SNP detection for massively parallel whole-genome resequencing. Genome Res. 2009;19(6):1124–32.
Zeng Z. Theoretical basis for separation of multiple linked gene effects in mapping quantitative trait loci. Proc Natl Acad Sci U S A. 1993;90:10972–6.
Cui F, Zhao C, Ding A, Li J, Wang L, Li X, et al. Construction of an integrative linkage map and QTL mapping of grain yield-related traits using three related wheat RIL populations. Theor Appl Genet. 2014;127(3):659–75.
Eskandari M, Cober ER, Rajcan I. Genetic control of soybean seed oil: II. QTL and genes that increase oil concentration without decreasing protein or with increased seed yield. Theor Appl Genet. 2013;126(6):1677–87.
Buzzello GL, Trezzi MM, Marchese JA, Xavier E, Junior EM, Patel F, et al. Action of auxin inhibitors on growth and grain yield of soybean. Revista Cere. 2013;60(5):621–8.
Wang S, Wu K, Yuan Q, Liu X, Liu Z, Lin X, et al. Control of grain size, shape and quality by OsSPL16 in rice. Nat Genet. 2012;44(8):950–4.
Tan C, Han ZM, Yu HH, Zhan W, Xie WB, Chen X, et al. QTL scanning for rice yield using a whole genome SNP array. J Genet Genomics. 2013;40(12):629–38.
Lee S, Jun TH, Michel AP, Mian MA. SNP markers linked to QTL conditioning plant height, lodging, and maturity in soybean. Euphytica. 2015;203(3):521–32.
Lee SH, Bailey MA, Mian MAR, Shipe ER, Ashley DA, Parrott WA, et al. Identification of quantitative trait loci for plant height, lodging, and maturity in a soybean population segregating for growth habit. Theor Appl Genet. 1996;92(5):516–23.
Specht JE, Chase K, Macrander M, Graef GL, Chung J, Markell JP, et al. Soybean response to water: a QTL analysis of drought tolerance. Crop Sci. 2001;41(2):493–509.
Wang D, Graef GL, Procopiuk AM, Diers BW. Identification of putative QTL that underlie yield in interspecific soybean backcross populations. Theor Appl Genet. 2004;108(3):458–67.
Gai JY, Wang YJ, Wu XL, Chen SY. A comparative study on segregation analysis and QTL mapping of quantitative traits in plants—with a case in soybean. Front Agric China. 2007;1(1):1–7.
Panthee D, Pantalone V, Saxton A, West D, Sams C. Quantitative trait loci for agronomic traits in soybean. Plant Breed. 2007;126:51–7.
Maughan PJ, Maroof MAS, Buss GR. Molecular-marker analysis of seed-weight: genomic locations, gene action, and evidence for orthologous evolution among three legume species. Theor Appl Genet. 1996;93(4):574–9.
Lee SH, Park KY, Lee HS, Park EH, Boerma HR. Genetic mapping of QTLs conditioning soybean sprout yield and quality. Theor Appl Genet. 2001;103(5):702–9.
Zhang D, Cheng H, Wang H, Zhang HY, Liu CY, Yu DY. Identification of genomic regions determining flower and pod numbers development in soybean (Glycine max L.). J Genet Genomics. 2010;37(8):545–56.
Orf JH, Chase K, Jarvik T, Mansur LM, Cregan PB, Adler FR, et al. Genetics of soybean agronomic traits: II. Interactions between yield quantitative trait loci in soybean. Crop Sci. 1999;39(6):1642–51.
Palomeque L, Li JL, Li W, Hedges B, Cober E, Rajcan I. QTL in mega-environments: II. Agronomic trait QTL co-localized with seed yield QTL detected in a population derived from a cross of high-yielding adapted x high-yielding exotic soybean lines. Theor Appl Genet. 2009;119(3):429–36.
Chen Q, Zhang Z, Liu C, Xin D, Qiu H, Shan D, et al. QTL analysis of major agronomic traits in soybean. Agric Sci China. 2007;6(4):399–405.
Qi ZM, Wu Q, Han X, Sun YN, Du XY, Liu CY, et al. Soybean oil content QTL mapping and integrating with meta-analysis method for mining genes. Euphytica. 2011;179(3):499–514.
Reinprecht Y, Poysa VW, Yu K, Rajcan I, Ablett GR, Pauls KP. Seed and agronomic QTL in low linolenic acid, lipoxygenase-free soybean (Glycine max (L.) Merrill) germplasm. Genome. 2006;49(12):1510–27.
Hyten DL, Pantalone VR, Sams CE, Saxton AM, Landau-Ellis D, Stefaniak TR, et al. Seed quality QTL in a prominent soybean population. Theor Appl Genet. 2004;109(3):552–61.
Tajuddin T, Watanabe S, Yamanaka N, Harada K. Analysis of quantitative trait loci for protein and lipid contents in soybean seeds using recombinant inbred lines. Breed Sci. 2003;53(2):133–40.
Brummer EC, Graef GL, Orf J, Wilcox JR, Shoemaker RC. Mapping QTL for seed protein and oil content in eight soybean populations. Crop Sci. 1997;37(2):370–8.
Lee SH, Bailey MA, Mian MAR, Carter JTE, Shipe ER, Ashley DA, et al. RFLP loci associated with soybean seed protein and oil content across populations and locations. Theo Appl Genet. 1996;93(5–6):649–57.
Qiu BX, Arelli PR, Sleper DA. RFLP markers associated with soybean cyst nematode resistance and seed composition in a ‘Peking’ x ‘Essex’ population. Theor Appl Genet. 1999;98(3–4):356–64.
Kabelka EA, Diers BW, Fehr WR, LeRoy AR, Baianu IC, You T, et al. Putative alleles for increased yield from soybean plant introductions. Crop Sci. 2004;44(3):784–91.
Sun D, Li W, Zhang Z, Chen Q, Ning H, Qiu L, et al. Quantitative trait loci analysis for the developmental behavior of soybean (Glycine max L. Merr.). Theor Appl Genet. 2006;112(4):665–73.
Funatsuki H, Kawaguchi K, Matsuba S, Sato Y, Ishimoto M. Mapping of QTL associated with chilling tolerance during reproductive growth in soybean. Theor Appl Genet. 2005;111(5):851–61.
Kim H, Kim Y, Kim S, Son B, Choi Y, Kang J, et al. Analysis of quantitative trait loci (QTLs) for seed size and fatty acid composition using recombinant inbred lines in soybean. J Life Sci. 2010;20(8):1186–92.
Liang H, Yu Y, Wang S, Lian Y, Wang T, Wei Y, et al. QTL mapping of isoflavone, oil and protein contents in soybean (Glycine max L. Merr.). Agric Sci China. 2010;9(8):1108–16.
Mansur LM, Orf JH, Chase K, Jarvik T, Cregan PB, Lark KG. Genetic mapping of agronomic traits using recombinant inbred lines of soybean. Crop Sci. 1996;36(5):1327–36.
Han Y, Li D, Zhu D, Li H, Li X, Teng W, et al. QTL analysis of soybean seed weight across multi-genetic backgrounds and environments. Theor Appl Genet. 2012;125(4):671–83.
Stombaugh SK, Orf JH, Jung HG, Chase K, Lar KG, Somers DA. Quantitative trait loci associated with cell wall polysaccharides in soybean seed. Crop Sci. 2004;44:2101–6.
Xu Y, Li S, Li L, Zhang X, Xu H, An D, et al. Mapping QTLs for salt tolerance with additive, epistatic and QTL×treatment interaction effects at seedling stage in wheat. Plant Breed. 2013;132(3):276–83.
Mansur LM, Lark KG, Kross H, Oliveira A. Interval mapping of quantitative trait loci for reproductive, morphological, and seed traits of soybean (Glycine max L.). Theor Appl Genet. 1993;86(8):907–13.
Sayama T, Hwang T, Yamazaki H, Yamaguchi N, Komatsu K, Takahashi M, et al. Mapping and comparison of quantitative trait loci for soybean branching phenotype in two locations. Breed Sci. 2010;60(4):380–9.
Li W, Zheng D, Van K, Lee S. QTL mapping for major agronomic traits across 2 years in soybean (Glycine max L. Merr.). J Crop Sci Biot. 2008;11(3):171–90.
Yan L, Li Y, Yang C, Ren S, Chang R, Zhang M, et al. Identification and validation of an over-dominant QTL controlling soybean weight using populations derived from Glycine max x Glycine soja. Plant Breed. 2014;133(5):632–7.
Sebolt AM, Shoemaker RC, Diers BW. Analysis of a quantitative trait locus allele from wild soybean that increases seed protein concentration in soybean. Crop Sci. 2000;40(5):1438–44.
Zhang WK, Wang YJ, Luo GZ, Zhang JS, He CY, Wu XL, et al. QTL mapping of ten agronomic traits on the soybean ( Glycine max L. Merr.) genetic map and their association with EST markers. Theor Appl Genet. 2004;108(6):1131–9.
The authors wish to thank Dr. John H. Snyder for discussion, comments and language improvement.
This work was supported by the China Agricultural Research System (CARS-04-PS09) and the Major Projects of New Varieties Cultivation of Genetically Modified Organisms (2014ZX08004–002). The funding bodies had no role in study design, data collection, analysis and interpretation, the decision to publish, or writing of the manuscript.
Availability of data and materials
The next-generation sequencing data have been deposited in NCBI 181(SRP065356). The data sets supporting the results of this study are included in the manuscript. Soybean seeds are available from the Guangdong Subcenter of the National Center for Soybean Improvement, PR China.
HN was responsible for the experimental design, supervised the research and took the lead role in writing; NL participated in the experimental design, conducted field experiments and wrote the paper; ML participated in the analysis of phenotypic data and in drafting the manuscript; XH collected the phenotypic data for yield-related traits; QM and YM helped with data analysis and revised the manuscript; ZT helped with the candidate gene analysis; QX and GZ were responsible for the genotyping work and for mapping the QTLs. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
The distribution of LOD values for eight traits. The maximum LOD score of each major QTL is indicated next to the peak. Red lines indicate data collected in 2012 (yr12); blue lines indicate data collected in 2015 (yr15). (PDF 7993 kb)
Differences in agronomic traits between the parents and their recombinant inbred lines. a Data were generated in summer 2012; b data were generated in summer 2015. (PDF 37 kb)
QTLs detected in the RIL population that were reported by previous studies. ** next to a QTL name indicates a new, stable QTL that was detected in both years; a Chr indicates chromosome; b LOD indicates the logarithm of odds score; c percentage of phenotypic variation explained; d related QTLs reported in previous studies for the region identified in the Zhonghuang 24 × Huaxia 3 RIL population. (PDF 63 kb)
Eight QTL hotspots detected in the Zhonghuang 24 × Huaxia 3 RIL population across 2 years. ** next to a QTL name indicates a new, stable QTL that was detected in both years; a Chr indicates chromosome; b LOD indicates the logarithm of odds score; c percentage of phenotypic variation explained. (PDF 62 kb)
Annotation description of five gene groups based on GO analysis. (PDF 28 kb)