Genetic basis of heterosis for yield and yield components explored by QTL mapping across four genetic populations in upland cotton

Background Quantitative trait loci (QTL) mapping provides a powerful tool to unravel the genetic bases of cotton yield and its components, as well as their heterosis. In the present study, the genetic basis underlying inbreeding depression and heterosis for yield and yield components of upland cotton was investigated in recombinant inbred line (RIL), immortalized F2 (IF2), and two backcross (BCF1) populations based on a high-density SNP linkage map across four environments. Results Significant inbreeding depression of fruit branches per plant (FB), boll numbers per plant (BN), seed cotton yield (SY), and lint yield (LY) in RIL population and high levels of heterosis for SY, LY, and boll weight (BW) in IF2 and two BCF1 populations were observed. A total of 285 QTLs were identified in the four related populations using a composite interval mapping approach. In the IF2 population, 26.60% partially dominant (PD) QTLs and 71.28% over-dominant (OD) QTLs were identified. In two BCF1 populations, 42.41% additive QTLs, 4.19% PD QTLs, and 53.40% OD QTLs were detected. For multi-environment analysis, phenotypic variances (PV) explained by e-QTLs were higher than those by m-QTLs in each of the populations, and the average PV of m-QTLs and e-QTLs explained by QTL × environment interactions occupied a considerable proportion of total PV in all seven datasets. Conclusions At the single-locus level, the genetic bases of heterosis varied in different populations. Partial dominance and over-dominance were the main cause of heterosis in the IF2 population, while additive effects and over-dominance were the main genetic bases of heterosis in two BCF1 populations. In addition, the various genetic components to heterosis presented trait specificity. In the multi-environment model analysis, epistasis was a common feature of most loci associated with inbreeding depression and heterosis. Furthermore, the environment was a critical factor in the expression of these m-QTLs and e-QTLs. Altogether, additive effects, over-dominance, epistasis and environmental interactions all contributed to the heterosis of yield and its components in upland cotton, with over-dominance and epistasis more important than the others. Electronic supplementary material The online version of this article (10.1186/s12864-018-5289-2) contains supplementary material, which is available to authorized users.


Background
Inbreeding depression, the reduced fitness of progenies arising from increased homozygosity [1,2], and heterosis, wherein cross-fertilization hybrids between diverse varieties or different species exhibit superiority relative to parental performance [3], are fundamentally concerned with inbreeding and outbreeding. Inbreeding depression and heterosis are considered two aspects of the same phenomenon, and both have fundamental importance to applied genetics and breeding. In all cases, the reason for the inbreeding depression is that the inbreeding increases the probability of the homozygosity of deleterious recessive alleles in their progenies [4][5][6]. The vigor lost caused by consequence of inbreeding can be recovered by crossing [6]. Moreover, inbreeding depression may have a large impact on the formation of reproductive disorders between species, while heterosis may play a key role in maintaining genetic variation of populations [7].
In agriculture, the application of heterosis has contributed greatly to the production of many crops. However, the genetic basis of heterosis remains obscure. Three major hypotheses, including dominance [8,9], over-dominance [10][11][12], and certain types of epistasis [13][14][15] have been proposed to explain heterosis. Quantitative trait locus (QTL) mapping studies in major crops have been performed to explain the genetic basis of heterosis. An appropriate experimental design for the genetic dissection of heterosis is essential. Comstock and Robinson [16] devised the North Carolina design III (Design III) mating scheme, which was the first use of backcross designs to analyze the genetic basis of heterosis. Based on two maize backcross F 3 families (BCF 3 , a modified Design III scheme), Stuber et al. [12] reported that over-dominance was the major genetic basis of heterosis for grain yield. A study of Xiao et al. [9] investigated the genetic bases of heterosis in two rice BC 1 F 7 populations and concluded that dominance complementation was the major cause. Li et al. [15] and Luo at al. [17] reported that epistasis and over-dominance were the main causes of inbreeding depression and heterosis of grain yield, grain yield components, and biomass in five related rice mapping populations. By re-analyzing the data of maize [12] and rice [9], Garcia et al. [18] reported that dominance was the main contributor of heterosis in maize, while additive × additive epistatic interactions could be the major genetic basis of heterosis in rice. Schön et al. [19] compared QTL mapping results of three previous Design III studies [12,20,21] by advanced statistical methods [22]. Their results indicated that the positive interactions of alleles from the opposite heterotic pool would lead to high heterosis for grain yield of maize. Shang et al. [23] investigated the yield heterosis of upland cotton with two BCF 1 populations, which implied partial dominance and over-dominance were the main genetic bases. To dissect the heterotic effects directly, Hua et al. [24,25] introduced an "immortalized F 2 " (IF 2 ) population derived from pair crosses of RILs of rice and focused on detecting heterotic loci (HL) to explain the genetic basis of heterosis instead of using traditional QTLs. Based on this design, they discovered that single-locus heterotic effects and dominance × dominance (DD) interactions could explain the genetic basis of heterosis in hybrid rice. Zhou et al. [26] detected several HLs for yield and its components in a rice IF 2 population and found that the relative contributions of the genetic components varied with traits. Based on a maize IF 2 population, Tang et al. [27] demonstrated that dominance effects of HL as well as additive × additive interactions were the major genetic bases of heterosis for grain yield and its components. Using the same material, Guo et al. [28] re-analyzed yield heterosis using a reconstructed high-density linkage map. They found that dominance was more important for heterosis than other genetic effects. Moreover, over-dominance and epistasis also contributed to heterosis. IF 2 and BCF 1 populations are ideal materials for comprehensively dissecting the composition of heterosis. Firstly, the genotypes of IF 2 and BCF 1 populations can be clearly deduced from the original parents and RILs; secondly, these two populations permit replicated trials; lastly, most loci of IF 2 and BCF 1 populations are heterozygous. This provides an opportunity for mapping HL and studying heterosis, rather than analyzing solely based on measuring performance of the trait. It is well known that stably expressed QTLs across multiple environments are deeply favored in marker-assisted selection (MAS). Thus, identifying QTLs and exploring their expression levels and the genetic basis of heterosis under multiple environments in related populations would allow us to map stable QTLs and accelerate the process of breeding.
Upland cotton is the most important natural textile fiber source globally. Currently, it is grown on a total area of 30.9 million ha of land in more than 80 countries [29]. It is urgent to improve the yield of upland cotton cultivars to meet worldwide demand, and maintain profitability for cotton growers. Yield is a complex trait in cotton that is controlled by a large number of QTLs. Several studies have discovered that significant heterosis for yield and yield components exists in upland cotton [30][31][32]. In addition, some QTL mapping studies have been reported that dissect the composition of heterosis for yield and yield components of upland cotton [23,32,33], but no studies have been reported on different segregating populations from the same parental combination. In the present study, RIL, IF 2 , and two BCF 1 populations were used simultaneously to perform QTL genetic analysis for yield and yield components based on a high-density SNP intraspecific genetic map under multiple environmental conditions. The results will provide meaningful hints at the underlying genetic bases of inbreeding depression and heterosis for yield and yield components, which can be used in cotton breeding.

Plant materials
Four related genetic populations were used, including a set of 188 RILs (F 8 ), an IF 2 population, and two BCF 1 populations (Fig. 1). The RILs were derived by single-seed procedure from a cross between two elite upland germplasms, HS46 (P 1 ) and MARCABUCAG8US-1-88 (P 2 ). According to a diallel mating design [25], the IF 2 population was produced from crosses between the RILs chosen by random permutations of the 188 RILs (Fig. 1a). This procedure was repeated two times, with each time making 188 hybrids, forming a population consisting of 376 IF 2 hybrids. Both BCF 1 populations were derived from a modified Design III [16,21], in which two parents were used as the male parents backcrossed with the 188 RILs (Fig. 1b, c). The two BCF 1 populations each contained 188 hybrids named HSBCF 1 and MARBCF 1 , referring to HS46 (HS) and MARCABUCAG8US-1-88 (MAR) backcrossed with 188 RIL lines, respectively.

Field planting and phenotypic evaluation
Two separate experiments were conducted at two locations, Yacheng (inland climate) and Baogang (coastal climate) of Sanya, Hainan Province, China, in the cotton growing seasons of 2014 and 2015. All plants of the four populations and the two parental lines were planted in a randomized block design with two replications in each location and with 5.6 m 2 plot areas. Finally, 29 plants were grown in each row at a spacing of 25 cm between plants. Standard cultivation, weed and insect control practices were performed as the management of the local cotton production.
Ten consecutive plants in the middle of each row were tagged for trait measurement [32,34]. Data were collected for the following traits: seed cotton yield (SY), lint yield (LY), fruit branches per plant (FB), boll numbers per plant (BN), boll weight (BW), and lint percentage (LP). During the harvest season, twenty fully open bolls in each row were harvested for measurement of BW and LP. SY was determined as the seed cotton weight harvested from each plot and LY was determined by multiplying lint percentage with SY.

Genotype analysis and linkage maps
Young leaves were collected from RILs and two parents. Individual genomic DNA was extracted following a modified CTAB method [35].
The RIL population and two parents were genotyped with the cotton 63 K SNP array [36]. A total of 63,058 Fig. 1 Diagram of the development scheme for the RIL, IF 2 , and two BCF I populations. a Crosses were made between two different lines from RIL A and RIL B (for example, RIL A1 × RIL B2 ). Here, RIL A and RIL B represent female and male RIL lines, respectively. This procedure was repeated two times, and finally 376 lines were produced, forming the IF 2 population. The IF 2 MPH dataset derived from the IF 2 population included 376 mid-parental heterosis (MPH) values (H), which were estimated by MPH = F 1 -MP (Hua et al. 2003). Here, F 1 represents the observations in the IF 2 populations, and MP represents the average trait value between their parents. b, c The HSBCF 1 and MARBCF 1 populations were produced by RILs × P 1 and RILs × P 2 , respectively, and each contained 188 lines. The RILs were used as the female parents in two backcross designs with the two original parents. The HSBCF 1 MPH and MARBCF 1 MPH datasets also each included 188 individuals (H) and were estimated by MPH = F 1 -MP (Hua et al. 2003). Here, F 1 represents the observations in the two BCF 1 populations, and MP represents the average trait value between their corresponding parents. HS: HS46; MAR: MARCABUCAG8US-1-88 SNPs were screened for polymorphism between parents, in which a total of 2618 SNP markers were selected to genotype the RILs [37]. In the IF 2 and two BCF 1 populations, the genotypes for each F 1 were deduced from the RILs and the parental lines that were used as the parents for each cross.

Data analysis
Each year-location was treated as an independent environment. A descriptive statistics model was used to test the basic statistics of phenotypic data for RILs, IF 2 s, HSBCF 1 s, and MARBCF 1 s. One-way analysis of variance (ANOVA) was performed to analyze the difference for yield and yield components between the two parents using SPSS 20.0. Broad-sense heritability was estimated as where H 2 is broad-sense heritability, V G = genetic variance, V GE = genotype × environment interaction variance, V ε = error variance, and e and r are the numbers of environments and replicates, respectively. The V G , V GE , and V ε variances were calculated using the minimum norm quadratic unbiased estimation (MINQUE) approach [38] by in QGA Station 2.0 (http://ibi.zju.edu.cn/ software/qga/index.htm).
The hybrid breakdown value (HB), a component of inbreeding depression [39,40], was calculated for individual RILs as follows: HB = RIL− MP, where MP = (HS46 + MARCABUCAG8US-1-88) / 2. The equation for calculating values of the mid-parental heterosis (MPH) of individual IF 2 , HSBCF 1 , and MARBCF 1 hybrids for yield traits was as follows: MPH = F 1 -MP [25], where F 1 was the mean value of each hybrid in the IF 2 , HSBCF 1 , and MARBCF 1 populations, and MP was the average value of the corresponding parents. The MPH datasets were used as the raw data for exploring the genetic basis of yield heterosis.
WinQTL Cartographer 2.5 [41] was used to identify single-locus QTLs with the composite interval mapping (CIM) method. The LOD threshold for declaring a significant QTL was calculated by 1000 permutation tests with a mapping step of 1.0 cM and a significance level of P < 0.05. The MPH datasets only detected the dominance effect under the genetic model of CIM, where the QTL exhibited a significant difference between the heterozygote and the mean of the two parental homozygotes. QTLs were named as: q + trait abbreviation + chromosome number + QTL number [37]. A graphical representation of the linkage map with QTLs marked was created using Map Chart 2.2 [42].
The gene actions in different datasets were estimated as follows: a = (P 1 P 1 − P 2 P 2 ) / 2, d = (P 1 P 2 − (P 1 P 1 + P 2 P 2 ) / 2), BCF 1 = (a + d), where P 1 and P 2 represent the parents, P 1 P 1 and P 2 P 2 indicate the effects of homozygous genotypes observed in RILs, IF 2 s, and BCF 1 s; and P 1 P 2 represents the effects of heterozygous genotypes in hybrid. The gene action mode for each QTL was calculated by the absolute value of the ratio of dominant and additive effects (|d/a|) [26,35,36,43]. There were some differences between the assessment in IF 2 and BCF 1 populations. For the IF 2 population, QTLs with |d/a| ≤ 1 were considered as completely or partially dominant (D or PD) loci. If |d/a| > 1 or if it was only detectable for the MPH dataset, the QTL was referred to as an over-dominance (OD) locus. The |d/a| value was estimated in two ways; both a and d were estimated from IF 2 s for a QTL which only detected in IF 2 s; a was from RILs and d was from the MPH dataset for a QTL detected simultaneously in RILs and the IF 2 MPH dataset. Moreover, the value of |d/a| in IF 2 s was used as the criterion. For BCF 1 populations, a QTL was considered to be an OD locus in the following three cases: (1) MPH (d) times two was higher than BCF 1 performance (a + d) i.e., 2 |d| (MPH) > |a + d| (BCF 1 ) (equal to |d/a| > 1) for a QTL detected in BCF 1 s and the MPH dataset; (2) a was estimated from RILs and d from the MPH dataset with |d/a| > 1 for a QTL detected simultaneously in the RILs and MPH dataset; (3) only detected in MPH dataset. Otherwise, the QTL was considered to be a D or PD locus. QTLs detected only in BCF 1 were referred to as additive (A) loci. When a QTL was present in all three datasets, the judgment depended on the ratio of the effects in the BCF 1 s and MPH dataset.
A combined multi-environment model analysis that tests the main-effect QTLs (m-QTLs), epistatic QTLs (e-QTLs), and their environmental interactions (QTL × environment, QE) was implemented using the RILs, IF 2 s, two BCF 1 s, and three MPH datasets with the inclusive composite interval mapping (ICIM) method in IciMapping 4.1 [44]. The pre-set parameters Scan = 1 cM / PIN = 0.0001 and Scan = 5 cM / PIN = 0.0001 were used to conduct the additive and epistasis QTL mapping analyses, respectively. The threshold LOD for declaring m-QTLs and e-QTLs was calculated using a permutation test at a significance level of P < 0.05, n = 1000. The identified m-QTLs were named using the dataset abbreviation followed by "maq" (multi-environment additive QTL), and then suffixed with the abbreviation of trait and chromosome number, followed by the QTL number. The e-QTLs detected were named using the dataset abbreviation followed by "meq" (multi-environment epistatic QTL), and then with the abbreviation of the trait and the QTL pair number. Datasets were abbreviated was as follows: "R", "I", "B 1 ", and "B 2 " represent RILs, IF 2 s, HSBCF 1 s, and MARBCF 1 s, respectively, and "M" was added after the last three heterozygous population abbreviation to represent their corresponding MPH datasets, i.e., "IM", "B 1 M", and "B 2 M".

Inbreeding depression and heterosis for yield and yield components
The phenotypic variation for yield and its components among the parents, RIL, IF 2 , and two BCF 1 populations, as well as the estimated HB of RILs and the MPH of the IF 2 s and two BCF 1 s are shown in Table 1 and Table 2, respectively. The female parent, HS46, had significantly greater trait values for FB, BN, BW, SY, and LY than those of MARCABUCAG8US-1-88 in all environments (Additional file 1: Table S1). A wide range of variation was observed in yield and its components in the RILs, IF 2 s, HSBCF 1 s, and MARBCF 1 s (Table 1). In all environments, obvious reductions of the RILs were observed as a result of hybrid breakdown in the traits of FB, BN, SY, and LY ( Table 2, Additional file 2: Table S2). The mean deviation of the RILs from the midparental values for LP was found in three environments (but not in 2015Bg), while it was only detected in one environment for BW.
High levels of heterosis for SY, LY, and BW were observed in the IF 2 and two BCF 1 populations. However, other yield components showed lower levels of heterosis in these three populations.
Different levels of heterosis were found among the different populations across four environments (Additional file 2: Table S2). For SY, IF 2 and MARBCF 1 populations showed the same levels of heterosis, at 24.75 and 28.61%, respectively, which were higher than that of the HSBCF 1 population (15.42%). The mean levels of heterosis of for the LY trait showed the same trend as SY in different populations. For BW, different populations have similar levels of heterosis, although it was slightly higher in the MARBCF 1 population. For FB, the two BCF 1 populations showed the same levels of heterosis, and IF 2 s exhibited lower mean heterosis. For BN, the mean levels of heterosis in the MARBCF 1 population were higher than that of the HSBCF 1 population, while the IF 2 population exhibited negative heterosis (− 2.70%). For LP, the order of the mean values in the MPH datasets was IF 2 MPH > HSBCF 1 MPH > MARBCF 1 MPH. There were some differences between environments ( Table 2). In all three populations, the MPH (%) of SY was lower in 2015 than that in 2014. The same trend was found for LY and BW, caused by boll rot during experiments due to high rainfall in 2015 in Sanya. However, the heterosis level of FB showed the opposite trend, where higher levels of heterosis were detected in 2015. Moreover, all of the environments showed low levels of heterosis for LP in all populations except for the 2015Yc environment.
Within each population, heterosis values of individual hybrids varied considerably. Most of the trait values of extreme lines showed high MPH in all environments (Additional file 3: Table S3). For example, in 2014Yc and 2014Bg, the mean heterosis values of SY were more than 83% for the 20 highest-heterosis hybrids of the IF 2 , HSBCF 1 , and MARBCF 1 populations and were more than 46% in the 2015Yc and 2015Bg experiments.
The broad-sense heritability was analyzed using measurement data from four environments ( Table 3). All measures of yield and its components showed moderate heritability, ranging from 56.00 to 86.31%, 46.79 to 64.05%, 39.73 to 65.81%, and 41.20 to 67.21% in the RIL, IF 2 , HSBCF 1 , and MARBCF 1 populations, respectively, which presented significant genetic and environmental effects. LP exhibited nearly the highest heritability and FB the lowest in all populations. Interestingly, the heritability of all traits was highly consistent between the two BCF 1 populations, which might be related to their closer genetic basis.
The phenotypic correlations among the traits varied greatly in the RIL, IF 2 , and two BCF 1 populations (Additional file 4: Table S4). This can be illustrated with LY as an example. Consistent with previous reports [45,46], there were significant positive correlations between LY and SY in all populations, possibly because LY is derived from SY multiplied by lint percentage. Similarly, LY was positively and significantly correlated with BN and BW in all populations, indicating that variation in BN and BW contributed strongly to the variation in LY. The association between LY and LP was significant in three populations except for the RILs, and a significant positive correlation was recovered in two BCF 1 populations but was significantly negative in the IF 2 population, indicating that variation in LP contributed differently to LY variation in different populations. However, LY was only significantly positively correlated with FB in the RIL population.
QTL analysis of yield and yield components in RIL, IF 2 , and two BCF 1 populations A genetic linkage map was previously constructed based on the polymorphic loci identified [37] (Additional file 5: Figure S1). A total of 285 QTLs for yield and its components were detected using CIM in the RILs, IF 2 s, two BCF 1 s, and three MPH datasets (Additional file 5: Figure  S1, Table 4, Additional file 6: Table S5). Among them, 107 QTLs were identified in more than two environments or datasets.

Fruit branches per plant (FB)
A total of 40 QTLs were detected in seven datasets, explaining 3.15-31.66% of the total PV, and ten of them were the stable QTLs that were identified in at least two environments or datasets. Three, six, five, eight, 15, ten, and five QTLs were detected in the RILs, IF 2 s, HSBCF 1 s, MARBCF 1 s, IF 2 MPHs, HSBCF 1 MPHs, and MARB CF 1 MPHs, respectively. In the IF 2 population, three QTLs with PD or D effects and 16 with OD effects were observed. Two QTLs with PD effects were simultaneously detected in both IF 2 s and IF 2 MPHs. In the HSBCF 1 population, two QTLs with A effects and ten with OD effects were found. Four QTLs with apparent OD effects were detected in both HSBCF 1 s and HSBCF 1 MPHs. In the MARBCF 1 population, six QTLs with A effects, two with PD or D effects, and three with OD effects were observed. Two QTLs with OD effects were detected in MARBCF 1 and its MPH dataset.

Boll numbers per plant (BN)
Forty-two QTLs associated with BN were detected in seven datasets. Among those, 17 were detected in more than two environments or datasets. There were six, 11, ten, ten, six, seven, and seven QTLs in the RILs, IF 2 s, HSBCF 1 s, MARBCF 1 s, IF 2 MPHs, HSBCF 1 MPHs, and MARBCF 1 MPHs, respectively. In the IF 2 population, eight QTLs with PD effects and seven QTLs with OD effects were observed. qBN-C01-2, with PD effect, was The gene action of two QTLs, qLP-C09-3 and qLP-C25-2, was uncertain because of inconsistent dominance degree in different environments. In the HSBCF 1 population, seven QTLs with A effects and seven with OD effects were found. Two QTLs with apparent OD effects were detected in both HSBCF 1 s and HSBCF 1 MPHs. In the MARBCF 1 population, nine QTLs with A effects, one with PD or D effect, and eight with OD effects were observed. Six QTLs were detected in both MARBCF 1 and its MPH dataset.

Seed cotton yield (SY)
Fifty QTLs were identified on 22 chromosomes in the seven datasets, explaining 13.39% (ranging from 3.40 to 34.83%) of the mean total PV. Twenty-three QTLs were identified in more than two environments or datasets. There were eight QTLs detected in IF 2 s and its MPH dataset, among which two QTLs exhibited PD effects and six QTLs showed OD effects. qSY-C18-2, with PD effect, was identified in IF 2 s in 2015Bg and in IF 2 MPHs in 2015Yc and 2015Bg. In HSBCF 1 s and its MPH dataset, five QTLs with A effects and 19 with OD effects were detected. Up to ten QTLs with apparent OD effects were detected in both HSBCF 1 s and HSBCF 1 MPHs. Among them, qSY-C16-1 showed favorable alleles conferred by different parents in the two environments of HSBCF 1 MPHs. In the MARBCF 1 population, five QTLs with A effects, one with a PD or D effect, and 11 with OD effects were observed. Five QTLs were identified simultaneously in MARBCF 1 and its MPH dataset.

Lint yield (LY)
Forty-seven QTLs, explaining 3.06-34.06% of the total PV, were detected using the seven datasets. In the IF 2 hybrids, eight QTLs were detected. Four QTLs with PD effects and nine with OD effects were observed in a

Multi-environment analysis of main-effect QTL and environmental interactions
The m-QTLs and QEs detected for yield and yield components in the RILs, IF 2 s, HSBCF 1 s, MARBCF 1 s, IF 2 MPHs, HSBCF 1 MPHs, and MARBCF 1 MPHs are shown in Fig. 2, Additional file 7: Table S6, and Additional file 8: Table S7.
A total of 48 m-QTLs and QEs were identified in the RIL population. On average, m-QTLs explained 2.37% of the PV, and the QEs explained 0.90% of the PV. Three major m-QTLs related to LP, RmaqLP-C07-1, RmaqLP-C08-1, and RmaqLP-C09-1, were found to account for more than 10% of the total explained PV (PV (A) and PV (AE)). For the IF 2 population, 60 and 50 m-QTLs were identified in IF 2 and IF 2 MPH datasets, respectively. On average, m-QTLs detected in the IF 2 and IF 2 MPH datasets explained 0.92 and 1.36% of the PV (A), respectively, and the QEs explained 0.97 and 1.26% of the PV (AE), respectively. One locus, IMmaqLP-C10-1, was considered as a major m-QTL with 10.69% of the total PV explained. In the HSBCF 1 population, a total of 24 and 21 m-QTLs were detected in HSBCF 1 and HSBCF 1 MPH datasets, respectively. In HSBCF 1 s, the number of m-QTLs varied from two to five for different traits, with an average of 2.40% of the PV (A) and 2.53% of the PV (AE). Furthermore, in HSBCF 1 MPHs, the number of m-QTLs varied from zero to seven for different traits, with an average of 2.40% of PV (A) and 1.92% of PV (AE). No m-QTL was detected for BW. Two m-QTLs, B 1 MmaqSY-C14-1 and B 1 MmaqLY-C14-1, were found to have major effects, and were located in the same marker interval of i28957Gh-i36740Gh. In the MARBCF 1 population, there were 28 m-QTLs in MARBCF 1 s and 15 in MARBCF 1 MPHs detected, on average, explaining 4.25 and 6.12% of the total PV in F 1 performance and MPH, respectively. In MARBCF 1 s, B 2 maqBN-C03-1 and B 2 maqLY-C21-1 explained more than 10% of the total PV. In MARBCF 1 MPHs, B 2 MmaqBN-C21-1 was identified to be a major m-QTL, with 13.35% of the total PV explained.
Epistatic QTLs in RIL, IF 2 , and two BCF 1 populations A total of 72, 124, 126, 73, 147, 73, and 67 e-QTLs pairs were identified in the RILs, IF 2 s, HSBCF 1 s, MARBCF 1 s, IF 2 MPHs, HSBCF 1 MPHs, and MARBCF 1 MPHs, respectively (Fig. 2, Table 5, Additional file 9: Table S8, Additional file 10: Table S9). These e-QTLs explained most of the variation for yield traits. For example, the e-QTLs of SY in IF 2 s and IF 2 MPHs explained more than 80% of the total PV. The e-QTLs in HSBCF 1 s and HSBCF 1 MPHs explained 91.89 and 51.68% of the total PV for SY, respectively, and those in MARBCF 1 s and MARBCF 1 MPHs explained 47.75 and 71.12% of the total PV, respectively. In addition, the environmental interactions of these e-QTLs also accounted for considerable PV. On average, the QEs of e-QTLs for each trait explained 12 The e-QTLs were classified into three types: (I) two loci with m-QTL; (II) one loci with m-QTL and a locus without significant m-QTL; and (III) two loci without significant m-QTL [15]. For these e-QTLs in the RIL population, three pairs of LP QTLs were type II, and the remaining interactions were type III. No type I interactions were observed (Table 5). For these e-QTLs in the IF 2 population, five pairs in IF 2 s and seven pairs in IF 2 MPHs were type II and the remaining interactions were type III. Of these e-QTLs in the HSBCF 1 population, five pairs were detected between an interval with significant additive effect and other loci. The remaining interactions occurred between two complementary loci. Of these e-QTLs in the MARBCF 1 population, nine pairs in MARBCF 1 s and two pairs in MPHs were type II, and all remaining interactions were type III.

Congruence analysis of the single-locus QTLs and main-effect QTLs
Comparing the identified additive QTLs, the confidence intervals of 63 single-locus QTLs identified by the CIM method overlapped with 77 m-QTLs identified by the ICIM method, of which some single-locus QTLs harbored two or more m-QTLs identified in different datasets (Additional file 5: Figure S1, Additional file 6: Table S5, Additional file 7: Table S6, Additional file 8: Table S7).

Discussion
Application of RIL, IF 2 and BCF 1 populations RIL and doubled haploid (DH) populations are permanent populations that can be repeated in different environments to detect valuable QTLs in multi-environments [47,48]. BCF 1 or IF 2 populations based on RIL or DH populations have been constructed previously to conduct QTL mapping with respect to heterosis [15,25,27,43,49]. However, no one has used different segregating populations from the same parental combination to study heterosis. Our experimental schemes using related RIL, IF 2 and two BCF 1 populations were specifically designed to allow simultaneous and comprehensive mapping of loci contributing yield and yield components heterosis in upland cotton. Based on this, more heterozygous loci were uncovered, and more QTLs were detected than from a single population. Some QTLs that could not be identified in RIL population could be detected in IF 2 /BCF 1 populations, and the QTLs detected in RILs could be confirmed using the IF 2 /BCF 1 populations. Furthermore, through the combination of these four populations, both additive and non-additive gene actions of the detected loci were more accurately identified. For instance, the QTL main effects obtained using the F 1 mean values of the IF 2 /BCF 1 populations contained both additive and dominance effects while those obtained from the MPH values were estimates of the dominant effect [50]. Similarly, for the epistatic loci, the estimated epistatic effects using the mean F 1 values contained both additive and nonadditive epistatic interactions of the epistatic QTL, while those from MPH values represent the DD interactions [50].

Detection of heterotic loci
Detection of HL using the MPH measurements enabled by the IF 2 and two BCF 1 populations represents another feature of the study. This analysis method effectively separated the single-locus effects causing heterosis from the QTL concerning the trait performance as detected in most previous QTL studies. Making use of the MPH data, we detected 47, 65 and 45 HLs for yield and its components in the IF 2 , HSBCF 1 , and MARBCF 1 populations, respectively. Moreover, 16 stable HLs were detected in two environments, of which five showed inconsistent parental sources of favorable alleles in different environments, which indicated a high sensitivity of the HLs to the environment. To some extent, this should be taken into account in upland cotton breeding. The remaining 11 stable HLs, including three for BN, five for SY, and two for LY could be important for the application of MAS in upland cotton breeding in the future. Here, the BN trait harbored more stable HLs than other yield components, which may be attributed to its higher average heterosis of the 20 top high-heterosis hybrids (Additional file 3: Table S3). Hua et al. [25] found that only ten of 33 HLs identified in their analysis were detected by QTL analysis using trait performance and indicated that trait performance and heterosis were controlled by different sets of loci. However, HLs detected in our study were not independent, and a subset overlapped with QTLs controlling trait phenotypes. Among all HLs, 12, 30, and 20 HLs identified in IF 2 MPH, HSBCF 1 MPH, and MARBCF 1 MPH datasets were also detected by QTL analysis using the data of the IF 2 s, HSBCF 1 s, and MARBCF 1 s, respectively. This result suggests that the MPH and performance per se of the hybrid might share identical genetic modes of action in upland cotton. In fact, it is impossible to demonstrate the genetic mechanism underlying yield traits without involving heterosis.

Genetic bases of inbreeding depression and heterosis of yield and yield components
In the present study, the levels of hybrid breakdown were ordered as LY > SY > BN > LP > FB > > BW, while heterosis was LY > SY > BW > FB > LP > > BN. All traits showed moderate and low heritability except for LP. This tendency toward more complex yield traits showing much greater levels of inbreeding depression and heterosis has been universally observed in many crops [17,23]. The traits with serious inbreeding depression did not necessarily possess high heterosis in hybrids, which implied that there are differences in the mechanisms controlling these two biological phenomena. At the same time, wide variations were also observed in mid-parental heterosis of the IF 2 and two BCF 1 populations. The performances of some hybrids were better than those of the MP values of the original parents, while some other hybrids showed the opposite. Similar results were also obtained by Luo et al. [20,43]. Together, it can be speculated that high heterosis is derived from heterozygosity at certain loci but not from genome-wide heterozygosity [17,24,43,49,51,52].
The reduction of the RIL population from the midparental value was highly significant. This might be attributed to the homozygosity of deleterious alleles and/or less fit multilocus genotypes during the development of RILs [15,39,40,53]. Theoretically, the inbreeding depression values of individual RILs and the MPH values of IF 2 /BCF 1 hybrids for yield and its components have three components. The first is additive gene action, which leads to the deviation of the RILs from the midparental value and hybrid breakdown [1,39,40]. The genes of this group are directly detected in the RILs but are confounded in the IF 2 s and BCF 1 s. The second is dominant gene action, which leads to the deviation of the F 1 hybrids from their corresponding midparental value. Those genes are segregating and contribute to heterosis in the IF 2 s/BCF 1 s but are not directly detected in the RILs. The third is nonadditive gene action, which causes disharmonious interactions in inbreeding depression of RILs and beneficial interactions in heterosis of IF 2 s and BCF 1 s. In fact, there are some overlapping genes between inbreeding depression and heterosis. The overlapping genes of this type are particularly important since they contribute negatively to the mean value of the inbred RILs when homozygous (resulting in hybrid breakdown) and positively to heterosis when heterozygous.
In the IF 2 population of our present study, 25 (26.60%) PD QTLs and 67 (71.28%) OD QTLs were identified. In the HSBCF 1 population, 34 (34.34%) A QTLs, two (2.02%) PD QTLs, and 63 (63.64%) OD QTLs were detected. In the MARBCF 1 population, 47 (51.09%) A QTLs, six (6.52%) PD QTLs, and 39 (42.39%) OD QTLs were detected (Table 4). These results revealed that the genetic basis of heterosis was varied in different populations. At the single-locus level, partial dominance and over-dominance were the main causes of heterosis in the IF 2 population, and additive and over-dominance were the main genetic bases of heterosis in the two BCF 1 populations. Similar results have been discovered in previous studies [28,33,43]. In addition, similar to Zhou et al. [26], our results showed that the relative contributions of the various genetic components to heterosis were trait specificity. Over-dominance and additive effects were the main contributors to heterosis for SY, LY, BW, and LP. Over-dominance, partial dominance, and additive effects all had roles in heterosis of BN. Over-dominance was the most important contributor to heterosis of FB. Overall, over-dominance played an important role in the formation of heterosis in these traits.
Epistasis is a common feature of most loci associated with inbreeding depression and heterosis. First, the e-QTLs explained a much greater portion of the total PV than the m-QTLs for the yield in each of the mapping populations, but this was not true for the yield components (Fig. 2, Table 5). This was consistent with the results of Li et al. [39], which indicated that complex traits tended to be determined by a greater degree of epistasis. In a similar experimental design, Xiao et al. [9] detected a single main-effect QTL which had an R 2 of 6-7% for grain yield in each of the two rice BCF 1 populations. However, the majority of the phenotypic variation was unexplained. Apparently, their failure to detect epistasis was largely attributed to the unavailability of an appropriate analytical method. With a similar experimental design, Shang et al. [23] reported m-QTLs and e-QTLs that had different proportions for yield and yield components in an upland cotton BCF 1 population, but the relative importance of m-QTLs and e-QTLs was not evaluated. Second, most epistasis occurred between complementary loci with no detectable main effects (Table 5). Fewer cases of epistasis occur between m-QTLs and complementary loci. The predominance of epistasis between complementary loci indicate that yield and its component traits related e-QTLs occurred more in multilocus genotypes than in specific alleles at individual loci, which has been demonstrated by a large number of empirical studies [17,23,43]. In addition, the environment was a critical factor in the expression of these m-QTLs and e-QTLs. The average PV of m-QTLs and e-QTLs explained by QEs occupied a large proportion of the total PV in all seven datasets.

Implications for MAS in yield improvement of upland cotton
Numerous classical genetic studies have clearly found that the phenotypic relationships between yield and its components in crops are complex, and the genetic bases of heterosis in segregating populations remains poorly understood. The results of our study have several implications. For breaking the yield ceiling of hybrid upland cotton cultivars, simultaneous selection for all yield components, with an emphasis on increased BN and BW, should be much more efficient than selecting for only lint yield. This is because both BN and BW were significantly positively correlation with LY, and the heterosis of BW in three hybrid populations was obvious and showed the same trend with SY and LY.
Although several QTLs for cotton yield traits have been detected previously using an intraspecific map, few shared markers were used in the present research. Furthermore, it is difficult to compare our results with previous yield QTLs due to the use of different maps, population types, population structure, and environments, etc. [33]. In our study, 12 stable QTLs were identified across at least four datasets or environments. Additionally, there were six stable QTLs, qBN-C05-1, qBN-C05-2, qBN-C22-3, qSY-C16-2, qSY-C19-1, and qSY-C24-2, that are also stable HLs. These stable QTLs across multiple populations and environments should greatly promote further interest in the fine mapping of yield traits or implementation of MAS. When we compared the single-locus QTLs from CIM with m-QTLs from ICIM, 29 stable single-locus QTLs overlapped with m-QTLs. qSY-C24-2 could be an important QTL identified in this study, as it was not only identified as a stable QTL and HL across multiple environments but was also simultaneously confirmed by both CIM and ICIM. The large dominance effects of this QTL justify its potential use in genetic improvement of the yield of both inbred and hybrid cultivars through marker-assistant transfer in upland cotton breeding programs.