- Research article
- Open Access
Identification of favorable SNP alleles and candidate genes for traits related to early maturity via GWAS in upland cotton
© The Author(s). 2016
- Received: 23 February 2016
- Accepted: 5 July 2016
- Published: 30 August 2016
Early maturity is one of the most important and complex agronomic traits in upland cotton (Gossypium hirsutum L.). To dissect the genetic architecture of this agronomically important trait, a population consisting of 355 upland cotton germplasm accessions was genotyped using the specific-locus amplified fragment sequencing (SLAF-seq) approach, of which a subset of 185 lines representative of the diversity among the accessions was phenotypically characterized for six early maturity traits in four environments. A genome-wide association study (GWAS) was conducted using the generalized linear model (GLM) and mixed linear model (MLM).
A total of 81,675 SNPs in 355 upland cotton accessions were discovered using SLAF-seq and were subsequently used in GWAS. Thirteen significant associations between eight SNP loci and five early maturity traits were successfully identified using the GLM and MLM; two of the 13 associations were common between the models. By computing phenotypic effect values for the associations detected at each locus, 11 highly favorable SNP alleles were identified for five early maturity traits. Moreover, dosage pyramiding effects of the highly favorable SNP alleles and significant linear correlations between the numbers of highly favorable alleles and the phenotypic values of the target traits were identified. Most importantly, a major locus (rs13562854) on chromosome Dt3 and a potential candidate gene (CotAD_01947) for early maturity were detected.
This study identified highly favorable SNP alleles and candidate genes associated with early maturity traits in upland cotton. The results demonstrate that GWAS is a powerful tool for dissecting complex traits and identifying candidate genes. The highly favorable SNP alleles and candidate genes for early maturity traits identified in this study should show high potential for the improvement of early maturity in future cotton breeding programs.
- Gossypium hirsutum L
- Early maturity traits
- Candidate gene
- SNP alleles
Cotton is the most important natural textile fiber source worldwide. The tetraploid species Gossypium hirsutum L. (2n = 4x = 52, AD genome), also referred to as ‘upland cotton’, accounts for 95 % of the world’s cotton production. Early fiber production is one of the most important traits in cotton, and the selection and popularization of early-maturing cotton varieties are of significant value in reducing the dilemma of whether to plant farmlands with cotton or cereals during cropping system optimization in China [1, 2]. Early maturity is a complex quantitative trait that mainly includes components such as the growth period, growth stages (including the seedling period, squaring period, flowering and boll-setting period (FBP) and boll-opening period), yield percentage before frost (YPBF), node of the first fruiting branch (NFFB), and height of the node of the first fruiting branch (HNFFB) [1, 2]. These components of this quantitative trait are regulated by quantitative trait loci (QTLs) and the environment, as reflected in different genetic models in different cultivars. Early maturity has been reported to be negatively correlated with yield and fiber quality. It is difficult to simultaneously improve early maturity, yield and fiber quality using conventional breeding methods. Fortunately, the rapid development of applied genomics research has provided alternative tools to improve efficiency in plant breeding programs. For example, molecular markers linked to causal genes or QTLs can be used for marker-assisted selection (MAS) and genomic selection.
Over the last two decades, many QTLs related to target traits have been identified using QTL-mapping methods by constructing intraspecific segregating populations of G. hirsutum with different target traits, such as fiber quality traits [4–6], yield and its components, resistance traits [8–10], early maturation traits [2, 11, 12] and drought-related traits. In a study of traits associated with early maturity in cotton, more than 70 related QTLs were detected by linkage mapping [2, 11, 12]. These QTLs may be valuable for improving early maturity by MAS.
Association mapping is another effective approach for connecting phenotypes and genotypes in plants when information on population structure and linkage disequilibrium (LD) is available. This method is convenient because it helps to avoid the difficulty of screening large biparental mapping populations. Association mapping was introduced to maize genetics in 2001 and has been subsequently applied in studies of many plant species. Association mapping is widely used to identify molecular markers associated with target traits, and it has been employed in genetic studies of rice, maize, wheat and other important agricultural crops [16–19]. Genome-wide association studies (GWAS) represent a powerful approach for identifying the locations of genetic factors that underlie complex traits. GWAS have been successfully implemented in Arabidopsis thaliana [21, 22], rice [20, 23], maize and soybean for the identification of single nucleotide polymorphism (SNP) loci and candidate genes for various ecological and agricultural traits. In recent years, association mapping has also been widely used in studies of cotton [10, 19, 26–30]. For example, Abdurakhmonov et al. performed association mapping to examine QTLs related to fiber-quality traits in G. hirsutum accessions using microsatellite markers. Further, Kantartzi and Stewart detected QTLs related to fiber quality in G. arboreum accessions using association mapping with simple sequence repeat (SSR) markers. Recently, association mapping was performed to assess QTL alleles during three cotton breeding periods, revealing that some alleles could be detected in nearly all of the Chinese cotton cultivars studied. Favorable QTL alleles for yield and its components have been identified via association mapping in Chinese upland cotton cultivars. Some QTL alleles associated with verticillium wilt resistance in upland cotton have also been detected using this approach.
However, few QTLs for cotton early maturity traits have been identified via association mapping.
To better understand the genetic architecture of early maturity traits in upland cotton, genome-wide SNP discovery based on the specific-locus amplified fragment sequencing (SLAF-seq) method and a GWAS strategy were used to identify the SNP loci associated with early maturity traits. We successfully identified several significant associations between SNP loci and early maturity traits using the generalized linear model (GLM) and mixed linear model (MLM). The highly favorable SNP alleles for early maturity traits were mined by computing the phenotypic effect of each SNP locus identified, and the pyramiding effects of the highly favorable SNP alleles for these traits were assessed. Moreover, major SNP loci and potential candidate genes for early maturity were detected. The results of this important study serve as a foundation for analyses of the genetic mechanisms underlying cotton earliness and for MAS for early maturity in cotton.
Genome and chromosome characteristics of SLAF-based SNPs in upland cotton varieties
SNP distribution on each chromosome (table columns: chromosome length (Mb); SNP density (kb))
Population structure and linkage disequilibrium
To determine the mapping resolution for GWAS, we quantified the average extent of LD decay. Using the whole set of SNPs, the LD decay rate of the population for the entire genome was estimated to be 100 kb, with r² = 0.07 at half of the maximum value (Fig. 2c).
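The half-maximum convention used for the LD decay estimate can be illustrated with a minimal Python sketch. It assumes that mean r² values have already been binned by marker distance; the function name `ld_decay_distance` and the binned values are hypothetical and are not taken from the study data.

```python
from typing import List, Optional, Tuple


def ld_decay_distance(binned_r2: List[Tuple[float, float]]) -> Optional[float]:
    """Return the first distance (kb) at which mean r^2 is at or below half its maximum.

    binned_r2 holds (distance_kb, mean_r2) pairs; returns None if r^2 never
    falls to half of the maximum within the supplied bins.
    """
    ordered = sorted(binned_r2)                      # sort bins by distance
    half_max = max(r2 for _, r2 in ordered) / 2      # half of the maximum r^2
    for distance_kb, r2 in ordered:
        if r2 <= half_max:
            return distance_kb
    return None


# Hypothetical binned LD values (illustration only, not the study's data).
example_bins = [(10, 0.20), (50, 0.13), (100, 0.09), (200, 0.06), (500, 0.04)]
decay_kb = ld_decay_distance(example_bins)
```

With these made-up bins, half of the maximum r² is 0.10, and the first bin at or below that value is the 100 kb bin.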
Phenotypic characteristics of traits related to early maturity
Analysis of variance (ANOVA) indicated that the genotype (G) and interactions between the genotype and environmental factors (G × E) were both significant (P < 0.01) for all six traits (Additional file 1: Table S1). The correlation coefficients for the association of the WGP with the FT, FBP, NFFB, HNFFB and YPBF were 0.9541, 0.9659, 0.8775, 0.8513 and −0.9230, respectively. These results indicated that the WGP was significantly associated with the FT, FBP, NFFB, HNFFB and YPBF in all four environments (P < 0.01) (Additional file 1: Table S2).
GWAS for early maturity traits
Mining of highly favorable SNP alleles associated with early maturity traits
Favorable SNP alleles, their phenotypic effects (ai), and representative accessions
zhongmiansuo74, xia25, zhong416
zhongmiansuo74, xia25, zhong416
zhongmiansuo74, xia25, zhong416
zhong6426, zhong51822, xia13-7
xia25, zhong416, baimian17
1476, zhongmiansuo74, xiaomian3
xia25, xiazao3, zhongmiansuo14
xia25, 1476, xiazao3
xia25, 1476, xiazao3
zhong416, xia25, zhongmiansuo64
xiaozao2, xiazao3, xia25
zhongmiansuo74, xia25, xiazao3
Pyramiding effects of highly favorable SNP alleles associated with early maturity traits
Pyramiding effects of the highly favorable alleles that contribute to early maturity
No. of favorable alleles
Mean ± SD
125.05 ± 2.66 (A)
117.39 ± 5.83 (B)
113.55 ± 5.89 (B)
108.84 ± 2.63 (C)
69.75 ± 2.26 (A)
64.26 ± 1.99 (B)
64.42 ± 2.36 (B)
62.3 ± 1.10 (C)
54.71 ± 3.36 (A)
47.55 ± 2.1 (B)
7.23 ± 0.82 (A)
5.59 ± 0.38 (B)
61.03 ± 9.59 (A)
82.19 ± 4.48 (B)
A major locus on chromosome Dt3 and candidate genes that potentially underlie early maturity
Candidate genes most highly associated with early maturity within 500 kb of either side of the SNP locus rs13562854
Distance to SNP (kb)
Tetratricopeptide repeat-like superfamily protein, putative
Zinc finger protein, putative isoform 1
Enolase 1, chloroplastic-like protein
Zinc finger protein, putative isoform 1
Proline and serine-rich 1
Ribonuclease P subunit p30
Hypothetical protein F383_23360
Tetratricopeptide repeat-like superfamily protein, putative
Hypothetical protein F383_21541
ADP, ATP carrier protein ER-ANT1-like
DNA-directed RNA polymerases I and III subunit RPAC1
UDP-glucosyl transferase 89B1, putative
Zinc finger CONSTANS-LIKE 2-like protein
Agamous-like MADS-box protein A
Crooked neck-like protein 1
Serine/threonine protein kinase 16
OBF-binding protein 4, putative
Hypothetical protein CISIN_1g035470mg
Putative acyl-activating enzyme 17, peroxisomal-like protein
ARM repeat superfamily protein
Hypothetical protein F383_15236
WD repeat and HMG-box DNA-binding 1
Identification and verification of SNP loci associated with traits related to early maturity in upland cotton
Both linkage mapping and association analysis provide tools for identifying the genes that underlie complex traits. To date, linkage mapping has been the main method for mining QTLs for early maturity traits in cotton. Previous studies have achieved only preliminary progress in localizing QTLs with desirable effects on cotton early maturity traits in segregating populations (F2 populations and recombinant inbred lines (RILs)) [2, 11, 32], and these findings require further verification. Although several studies have identified QTLs for early maturity traits by association analysis in upland cotton [33, 34], these studies were limited by the numbers of SSR markers and the sizes of the germplasm populations. As whole-genome sequences have become more widely available and more cost-effective to generate, GWAS has become increasingly practical. In our study, to improve the efficiency and accuracy of association analysis, a wider selection of upland cotton germplasm resources was collected, with lines selected based on maturity traits. Further, a substantial number of SNP markers were developed by genome sequencing. Thirteen associations were identified between eight SNP loci and five early maturity traits (-lg(p) ≥ 6.21) (Additional file 1: Table S3). Thus, this study has addressed gaps in the study of cotton early maturity traits using GWAS. Most importantly, a main SNP locus for the WGP and FT was identified on chromosome Dt3.
Mining of favorable SNP alleles and candidate genes to improve early maturity in cotton
Obtaining satisfactory yield and quality during a short growing season is complicated by the conflict between early maturity and yield, as well as between early maturity and fiber quality; thus, it is increasingly difficult to simultaneously improve these agriculturally desirable traits in early-maturing cotton using traditional breeding methods. Therefore, the mining of favorable SNP (or QTL) alleles is necessary for improving important agronomic traits in upland cotton cultivars via MAS. Association mapping is one of the most effective approaches for the mining of favorable alleles. Elite alleles for fiber-quality traits and for yield and its components in upland cotton cultivars/accessions have been explored via association analysis. In our study, by comparing the average phenotypic effect value of each allele for the target traits in the thirteen stable associations detected, we identified eleven highly favorable alleles for five early maturity traits (Table 1). Moreover, the examination of favorable SNP alleles and germplasm resources for early maturity traits, such as zhongmiansuo74, xia25, and xiazao3, could be useful for plant breeders; however, the effects of these alleles must be verified. Therefore, the highly favorable alleles were selected and their positive effects were assessed. To date, many studies have demonstrated that marker-based gene pyramiding strategies are very effective [35–37]. Dosage pyramiding effects of the highly favorable SNP alleles were also demonstrated (Table 2, Fig. 5); thus, the highly favorable alleles identified in this study have substantial potential for the development of early-maturing upland cotton cultivars in future breeding programs.
Of particular interest, the detailed annotations revealed that the major locus rs13562854 was located on chromosome Dt3 and that the 32 candidate genes in the nearby region were the most highly associated with the WGP and FT. Specifically, four candidates (CotAD_01914, CotAD_01926, CotAD_01936 and CotAD_01947) related to plant floral development were annotated. CotAD_01947 and CotAD_01914 were located -452.83 kb (backward) and 495.80 kb (forward), respectively, from the peak SNP (rs13562854), with MADS-box genes that encode transcription factors involved in plant developmental control and signal transduction. Notably, a WD repeat (WDR) gene (CotAD_01936) was identified 154.53 kb from the rs13562854 locus. Plant WDR proteins are intimately involved in various cellular and organismal processes, including cell division and cytokinesis, apoptosis, light signaling and vision, cell motility, flowering, floral development and meristem organization. CotAD_01947 expression in the early-maturing varieties zhongmiansuo50 and zhongmiansuo74 was significantly higher than that in the late-maturing varieties lumianyan28 and zhongmiansuo41. However, expression of the other genes did not significantly differ between the early-maturing and late-maturing varieties (Additional file 6: Figure S5 B and C).
MADS-box family genes play significant roles in plant growth and development, and they also control flowering time and flower initiation [40, 41]. AGAMOUS-LIKE8 (AGL8, AT5G60910) in Arabidopsis is another MADS-box family member that regulates the transcription of genes required for cellular differentiation and floral determination [42–44]. The BLAST alignment results indicated that the coding sequence (CDS) identity of CotAD_01947 with the Arabidopsis AGL8 gene was as high as 47.50 % (Additional file 7: Figure S6A) and that CotAD_01947 encoded a protein that shared 50.90 % sequence identity with the Arabidopsis AGL8 protein (Additional file 7: Figure S6B). In addition, although fifty-three MADS-box genes have been identified in upland cotton to date, few molecular studies of MADS-box genes in G. hirsutum have been conducted. For example, GhMADS11 affects cell elongation in fibers, GhMADS7 regulates anther development, and GhMADS3 participates in flower development. GhMADS42 in Arabidopsis accelerates flowering, and GhMADS42 transgenic plants exhibit abnormal floral organ phenotypes. In addition, we found that CotAD_01947 shared 50.90 % amino acid sequence identity with Arabidopsis AGL8 (Additional file 7: Figure S6B), that most MADS-box family genes in upland cotton regulated flower development, and that CotAD_01947 expression in early-maturing cotton was higher than that in late-maturing cotton (Fig. 8e). Thus, it is reasonable to postulate that CotAD_01947 may be a candidate gene for improving early maturity traits via the regulation and control of early flowering time in upland cotton. However, clear and definite identification of CotAD_01947 as an annotated MADS-box family gene requires further validation.
A substantial number of SNP markers in upland cotton were developed through SLAF-seq technology and were used in a GWAS. Thirteen significant associations were identified among eight SNP loci and five traits related to early maturity using the GLM and MLM, and two of the 13 associations were observed in both models. Eleven highly favorable SNP alleles for the WGP, FT, FBP, NFFB and YPBF were identified. Moreover, dosage pyramiding effects of the highly favorable SNP alleles and significant linear correlations between the number of highly favorable alleles and the phenotypic values of target traits were detected. Most importantly, a major locus (rs13562854) on chromosome Dt3 and a potential candidate gene (CotAD_01947) for early maturity were detected. The beneficial alleles and candidate gene should be useful for improving early maturity in upland cotton breeding via a molecular design approach.
SLAF-seq, sequencing data analysis and SNP calling
Three hundred fifty-five upland cotton accessions (260 varieties and 71 accessions collected from China; ten varieties and ten accessions introduced from the United States, including the genetic standard line TM-1; and four varieties from central Asia) were used for genome sequencing. Seeds from the 355 upland cotton accessions were obtained from the cotton germplasm collection in our laboratory and from the low-temperature germplasm genebank of the Cotton Research Institute, Chinese Academy of Agricultural Sciences (CRI-CAAS). All accessions had been self-pollinated for more than three generations.
Young leaves of ten plants from each of the 355 varieties/accessions were collected, mixed, frozen in liquid nitrogen, and used for DNA extraction. Genomic DNA was isolated from samples from each cotton variety/accession using the cetyltrimethylammonium bromide (CTAB) method, as described by Paterson et al.; RNase A and proteinase K treatments were used to prevent RNA and protein contamination, and then the DNA extracts were subjected to Illumina sequencing and SSR-PCR amplification.
The SLAF library was constructed as described by Sun et al. with several modifications. A SLAF pilot experiment was performed, and the SLAF library was generated in accordance with the predesigned scheme. For this population, two enzymes (RsaI and HaeIII, New England Biolabs, NEB, USA) were used to digest the genomic DNA. A single nucleotide (A) overhang was subsequently added to the digested fragments using Klenow Fragment (3′ → 5′ exo−) (NEB) and dATP at 37 °C. Duplex tag-labeled sequencing adapters (PAGE-purified, Life Technologies, USA) were then ligated to the A-tailed fragments using T4 DNA ligase. PCR was performed using diluted restriction-ligation DNA samples, dNTP, Q5® High-Fidelity DNA Polymerase and PCR primers (forward primer: 5′-AATGATACGGCGACCACCGA-3′; and reverse primer: 5′-CAAGCAGAAGACGGCATACG-3′) (PAGE-purified, Life Technologies). Next, the PCR products were purified using Agencourt AMPure XP beads (Beckman Coulter, High Wycombe, UK) and pooled. The pooled samples were separated by 2 % agarose gel electrophoresis. Fragments that ranged in size from 314 to 364 bp (with indexes and adaptors) were excised and purified using a QIAquick gel extraction kit (Qiagen, Hilden, Germany). The gel-purified products were subsequently diluted. Paired-end sequencing (125 bp at each end) was performed using an Illumina HiSeq 2500 system (Illumina, Inc.; San Diego, CA, USA) according to the manufacturer’s recommendations.
The raw reads (100 bp in length) were filtered and trimmed as follows: reads with ≥10 % unknown nucleotides were removed; reads with ≥30 % low-quality bases (base quality ≤10) were removed; reads with clear index information were trimmed; and low-quality bases at the 3′ ends of reads were removed. Read quality was considered acceptable if the Q30 ratio was ≥80 % after trimming and a paired sequence length of 80 bp was retained at each end. To evaluate sequence quality, real-time monitoring was performed in each cycle during sequencing, and the ratio of the number of high-quality reads with quality scores > Q30 (a quality score of 30 indicates a 0.10 % chance of an error and thus 99.90 % confidence) to the total number of raw reads and the GC content were calculated. BWA software was used to map the raw paired-end reads to the reference genome (Gossypium hirsutum v 1.0). SLAF groups were generated by grouping reads that were mapped to the same position. If an accession was only partly digested by the restriction enzymes, some reads that mapped to the reference genome overlapped two SLAF tags. These reads were assigned to both SLAF tags in the accession. The GATK and SAMtools packages were used for SNP calling.
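The relationship between a quality score and its error probability, and the Q30 acceptance criterion, can be sketched in Python as follows. This is an illustration only: it assumes the standard Phred definition (error probability = 10^(−Q/10)) and a per-base Q30 ratio with a cutoff of Q ≥ 30, and the function names and example quality values are hypothetical rather than part of the sequencing pipeline described above.

```python
from typing import List


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q into a base-call error probability."""
    return 10 ** (-q / 10)


def q30_ratio(base_qualities: List[int]) -> float:
    """Fraction of bases with a quality score of at least 30 (assumed cutoff)."""
    if not base_qualities:
        raise ValueError("no base qualities supplied")
    return sum(q >= 30 for q in base_qualities) / len(base_qualities)


# Q30 corresponds to a 0.10 % chance of error (99.90 % confidence).
error_q30 = phred_to_error_prob(30)

# Hypothetical quality scores for ten bases (illustration only).
example_quals = [38, 37, 36, 35, 33, 31, 30, 29, 22, 12]
example_ratio = q30_ratio(example_quals)

# Acceptance rule from the text: Q30 ratio of at least 80 %.
passes_q30 = example_ratio >= 0.80
```

In this made-up example, seven of the ten bases reach Q30, so the ratio is 0.7 and the read set would not meet the 80 % criterion.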
Population structure and linkage disequilibrium estimation
The ADMIXTURE program was used to assess the population structure based on the maximum-likelihood method with 10,000 iterations, and the number of clusters (K) was set from 2 to 10. The SNPs were used after filtering for an MAF >0.05 and an identity of greater than 80 %. Pairwise LD between markers was calculated as the squared correlation coefficient (r²) of alleles using GAPIT software.
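The MAF filter and the pairwise r² statistic can be illustrated with a small Python sketch. It is not the GAPIT implementation: it assumes fully inbred lines with each SNP coded 0 (reference allele) or 1 (alternate allele) and no missing calls, and the function names and genotype vectors are hypothetical.

```python
from typing import List


def minor_allele_frequency(genotypes: List[int]) -> float:
    """MAF of one SNP, with genotypes coded 0 or 1 per inbred line."""
    alt_freq = sum(genotypes) / len(genotypes)
    return min(alt_freq, 1 - alt_freq)


def ld_r2(snp_a: List[int], snp_b: List[int]) -> float:
    """Squared correlation coefficient (r^2) between two 0/1-coded SNPs."""
    n = len(snp_a)
    pa = sum(snp_a) / n                                   # allele frequency at SNP A
    pb = sum(snp_b) / n                                   # allele frequency at SNP B
    pab = sum(a * b for a, b in zip(snp_a, snp_b)) / n    # frequency of the 1-1 haplotype
    d = pab - pa * pb                                     # disequilibrium coefficient D
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    return d * d / denom


# Hypothetical genotypes for eight inbred lines (illustration only).
snp_1 = [0, 0, 1, 1, 0, 1, 0, 1]
snp_2 = [0, 0, 1, 1, 0, 1, 1, 0]

keep_snp_1 = minor_allele_frequency(snp_1) > 0.05   # MAF filter from the text
r2_12 = ld_r2(snp_1, snp_2)
```

For these example vectors, both SNPs have an allele frequency of 0.5, D is 0.125, and r² is 0.25; a SNP compared with itself gives r² = 1.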
Field experiments and collection and analysis of phenotypic data
A subset of 185 lines was selected from the 355 upland cotton accessions from the cotton germplasm collection in our laboratory and from the low-temperature germplasm genebank of the CRI-CAAS. Selection was based on analyses of population structure and maturity, with the genotypes from the nine subpopulations characterized into two main groups according to maturity traits. The first group (103 genotypes) contained the early-maturing genotypes, including 76 varieties/accessions that originated from the Yellow River region, 15 varieties/accessions that originated from the northern specific early-maturing region, ten varieties/accessions that originated from the northwestern inland early-maturing region and two varieties introduced from the United States. The second group (82 genotypes) contained the late-maturing genotypes, including 69 varieties/accessions that originated from the Yellow River region, five varieties/accessions that originated from the Yangtze River region and 8 varieties introduced from the United States (Additional file 1: Table S4).
The population was planted at the experimental station of the CRI-CAAS in Anyang, Henan (36°05′N, 114°21′E). All cotton lines were sown at two time points, late April and late May (referred to as SP-sowing and SU-sowing, respectively), in 2013 and 2014. Each cotton variety/accession was grown in a single-row plot (5.0 m long and 0.8 m wide), with three replicates in a randomized complete block design. Field management conformed to local practices.
The following six traits related to early maturity were investigated in this study: WGP (the period from sowing to the first boll opening), FT (the period from sowing to the first flower blooming), FBP (the period from the first flower blooming to the first boll opening), NFFB (the number of nodes from the cotyledon node to the first fruiting branch node), HNFFB (the distance between the cotyledon node and the NFFB) and YPBF (the seed yield percentage before October 25th). Ten consecutive plants in the middle of each row were tagged for trait measurements. These plants were observed, and the average value of three replicates was recorded. The phenotypic data were analyzed using SAS 9.3 statistical software (SAS, Chicago, IL, USA). To reduce environmental error, BLUPs for the six early maturity traits per genotype were obtained using the PROC MIXED procedure of SAS 9.3. ANOVA was performed using PROC ANOVA. Linear regression analysis was conducted using the GLM procedure in SAS.
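As a rough illustration of the kind of linear regression used to relate the number of favorable alleles to a trait value, the following Python sketch fits an ordinary least-squares line. The study itself used the GLM procedure in SAS; the function `fit_line` and the data points below are hypothetical and serve only to show the calculation.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float, float]:
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical data: number of favorable alleles vs. whole growth period (days).
n_favorable = [0, 1, 2, 3]
wgp_days = [125.0, 119.0, 116.0, 110.0]

slope, intercept, r_squared = fit_line(n_favorable, wgp_days)
```

A negative slope in such a fit would indicate that each additional favorable allele is associated with a shorter growth period in the example data.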
GWAS and favorable allele identification
For all SNP loci and phenotypic data, we applied the GLM and MLM. In addition, to minimize the effects of environmental variation, BLUPs were computed for GWAS. The BLUP values for the four environments and the phenotypic values of six early maturity traits for each environment were used in GWAS. The high-quality SNPs were filtered according to the MAF (MAF >0.05) and the integrity of each SNP (>50 %). These SNPs from 185 cotton accessions were used in association analysis conducted using the GLM and MLM with GAPIT software. Bonferroni-adjusted P-values of ≤0.01 and 0.05 (-lg(p) ≥ 6.91 and -lg(p) ≥ 6.21, respectively) were used as thresholds to determine whether significant associations existed. SNP loci significantly associated with the target traits based on the GWAS results were analyzed. According to the computational method described by Zhang et al., the phenotypic effect of each SNP locus (ai) was estimated through comparison of the average phenotypic value for each accession for the specific locus with that of all accessions. The favorable alleles were subsequently identified according to the breeding objective of each target trait. For the WGP, FT, FBP, NFFB and HNFFB, ai < 0 indicates a favorable allele, and for the YPBF, ai > 0 indicates a favorable allele.
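The two calculations described above can be sketched in Python. The first function reproduces the Bonferroni thresholds; the reported values of 6.91 and 6.21 are consistent with dividing by the 81,675 SNPs, although the exact number of tests used is an inference. The second function is a simplified reading of the phenotypic effect ai as the mean phenotype of accessions carrying an allele minus the mean of all accessions. Function names and the example alleles and phenotypes are hypothetical.

```python
import math
from typing import Dict, List


def bonferroni_neg_log10(alpha: float, n_tests: int) -> float:
    """-lg(p) threshold for a Bonferroni-adjusted significance level."""
    return -math.log10(alpha / n_tests)


def allele_effects(alleles: List[str], phenotypes: List[float]) -> Dict[str, float]:
    """a_i = mean phenotype of accessions carrying allele i minus the overall mean."""
    overall_mean = sum(phenotypes) / len(phenotypes)
    groups: Dict[str, List[float]] = {}
    for allele, value in zip(alleles, phenotypes):
        groups.setdefault(allele, []).append(value)
    return {a: sum(v) / len(v) - overall_mean for a, v in groups.items()}


N_SNPS = 81_675  # SNP count reported in the article (assumed number of tests)
threshold_001 = bonferroni_neg_log10(0.01, N_SNPS)
threshold_005 = bonferroni_neg_log10(0.05, N_SNPS)

# Hypothetical alleles at one locus and WGP values in days (illustration only).
example_alleles = ["A", "A", "G", "G", "G", "A"]
example_wgp = [110.0, 112.0, 120.0, 122.0, 121.0, 111.0]
effects = allele_effects(example_alleles, example_wgp)

# For WGP a shorter period is the breeding objective, so a_i < 0 is favorable.
favorable_for_wgp = [a for a, e in effects.items() if e < 0]
```

For a trait such as the YPBF, where larger values are desirable, the final comparison would be reversed (ai > 0).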
Quantitative real-time PCR
Total RNA was isolated from the samples using a Plant RNA Purification Kit (Tiangen, Beijing, China). Reverse transcription was conducted using a SuperScript III First-Strand Synthesis System to obtain cDNA for qRT-PCR (Invitrogen, Carlsbad, CA, USA). Transcript levels were subsequently determined by qRT-PCR using a 7500 Real-Time PCR System (Applied Biosystems, Foster City, CA, USA) and SYBR Premix Ex Taq (2×) (TaKaRa). The gene-specific primer pairs used for PCR amplification are listed in Additional file 1: Table S5 and were designed to avoid conserved regions. To normalize the variance among samples, actin was used as an endogenous control, and the gene expression levels were calculated using the 2^−ΔΔCT method.
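The 2^−ΔΔCT calculation can be written out as a short Python sketch. It assumes the standard form of the method, in which the target gene is normalized to the endogenous control (here actin) in both the test sample and a calibrator sample; the function name and the CT values are hypothetical.

```python
def relative_expression(
    ct_target_sample: float,
    ct_control_sample: float,
    ct_target_calibrator: float,
    ct_control_calibrator: float,
) -> float:
    """Relative expression by the 2^-ddCT method.

    dCT  = CT(target) - CT(endogenous control), computed per sample
    ddCT = dCT(test sample) - dCT(calibrator sample)
    """
    delta_ct_sample = ct_target_sample - ct_control_sample
    delta_ct_calibrator = ct_target_calibrator - ct_control_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2 ** (-delta_delta_ct)


# Hypothetical CT values (illustration only): target gene vs. actin in an
# early-maturing sample, with a late-maturing sample as the calibrator.
fold_change = relative_expression(
    ct_target_sample=24.0,
    ct_control_sample=18.0,
    ct_target_calibrator=26.0,
    ct_control_calibrator=18.0,
)
```

Here ΔCT is 6 in the test sample and 8 in the calibrator, so ΔΔCT is −2 and the target transcript is estimated to be four times more abundant in the test sample.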
ANOVA, analysis of variance; BLUP, best linear unbiased prediction; CV, coefficient of variation; FBP, flowering and boll-setting period; FT, flowering time; GLM, generalized linear model; GWAS, genome-wide association study; HNFFB, height of the node of the first fruiting branch; LD, linkage disequilibrium; MAF, minor allele frequency; MAS, marker-assisted selection; MLM, mixed linear model; NFFB, node of the first fruiting branch; SLAF-seq, specific-locus amplified fragment sequencing; SNP, single nucleotide polymorphism; SSR, simple sequence repeat; WGP, whole growth period; YPBF, yield percentage before frost
This research was funded by the Chinese National Natural Science Foundation (31660409) and the National Key Technology R&D Program (2014BAD03B01).
Availability of data and materials
The sequence read data from SLAF-seq analysis for the 355 sequenced upland cotton lines are available in the Sequence Read Archive (http://www.ncbi.nlm.nih.gov/bioproject/PRJNA314284/) (SRS1322001 under the accession number PRJNA314284).
SXY and CSW designed and supervised the research; JJS, CYP, LBL and HLW analyzed the data; JJS, BL, SLF, MZS, XYJ and GZM conducted the field trial to evaluate early maturity traits; DDG and LH performed genome sequencing; JJS and SQZ analyzed gene expression by RT-PCR; and JJS, CXW, CYP and HTW wrote the manuscript. All authors read and approved the manuscript.
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Yu S, Huang Z. Inheritance analysis on earliness components of short season cotton varieties in G. hirsutum. Sci Agric Sin. 1990;23:48–54.Google Scholar
- Li C, Wang X, Dong N, Zhao H, Xia Z, Wang R, et al. QTL analysis for early-maturing traits in cotton uing two upland cotton (Gossypium hirsutum L.) crosses. Breed Sci. 2013;63:154–63.View ArticlePubMedPubMed CentralGoogle Scholar
- Song M, Yu S, Fan S, Ruan R, Huang Z. Genetic analysis of main agronomic traits in short season upland cotton(G. hirsutum L.). Acta Gossypii Sin. 2005;17:94–8.Google Scholar
- Shen X, Guo W, Zhu X, Yuan Y, Yu JZ, Kohel RJ, et al. Molecular mapping of QTLs for fiber qualities in three diverse lines in Upland cotton using SSR markers. Mol Breed. 2005;15:169–81.View ArticleGoogle Scholar
- Fang DD, Jenkins JN, Deng DD, Mccarty JC, Li P, Wu J. Quantitative trait loci analysis of fiber quality traits using a random-mated recombinant inbred population in Upland cotton (Gossypium hirsutum L.). BMC Genomics. 2014;15:2–14.View ArticleGoogle Scholar
- Tan Z, Fang X, Tang S, Zhang J, Liu D, Teng Z, et al. Genetic map and QTL controlling fiber quality traits in upland cotton (Gossypium hirsutum L.). Euphytica. 2014;203:615–28.
- Xia Z, Zhang X, Liu Y, Jia Z, Zhao H, Li C, et al. Major gene identification and quantitative trait locus mapping for yield-related traits in upland cotton (Gossypium hirsutum L.). J Integr Agric. 2014;13:299–309.
- Jiang F, Zhao J, Zhou L, Guo W, Zhang T. Molecular mapping of Verticillium wilt resistance QTL clustered on chromosomes D7 and D9 in upland cotton. Sci China C Life Sci. 2009;52:872–84.
- Ulloa M, Hutmacher RB, Roberts PA, Wright SD, Nichols RL, Davis RM. Inheritance and QTL mapping of fusarium wilt race 4 resistance in cotton. Theor Appl Genet. 2013;126:1405–18.
- Zhao Y, Wang H, Chen W, Li Y. Genetic structure, linkage disequilibrium and association mapping of verticillium wilt resistance in elite cotton (Gossypium hirsutum L.) germplasm population. PLoS One. 2014;9:e86308.
- Fan S, Yu S, Song M, Yuan R. Construction of molecular linkage map and QTL mapping for earliness in short-season cotton. Cotton Sci. 2006;18:135–9.
- Li C, Wang C, Dong N, Wang X, Zhao H, Converse R, et al. QTL detection for node of first fruiting branch and its height in upland cotton (Gossypium hirsutum L.). Euphytica. 2012;188:441–51.
- Levi A, Paterson AH, Cakmak I, Saranga Y. Metabolite and mineral analyses of cotton near-isogenic lines introgressed with QTLs for productivity and drought-related traits. Physiol Plant. 2011;141:265–75.
- Thornsberry JM, Goodman MM, Doebley J, Kresovich S, Nielsen D, Buckler 4th ES. Dwarf8 polymorphisms associate with variation in flowering time. Nat Genet. 2001;28:286–9.
- Zhu C, Gore M, Buckler ES, Yu J. Status and prospects of association mapping in plants. Plant Genome. 2008;1:5–20.
- Flint-Garcia SA, Thuillet AC, Yu J, Pressoir G, Romero SM, Mitchell SE, et al. Maize association population: a high-resolution platform for quantitative trait locus dissection. Plant J. 2005;44:1054–64.
- Maccaferri M, Sanguineti MC, Noli E, Tuberosa R. Population structure and long-range linkage disequilibrium in a durum wheat elite collection. Mol Breed. 2005;15:271–90.
- Eizenga GC, Agrama HA, Lee FN, Yan W, Jia Y. Identifying novel resistance genes in newly introduced blast resistant rice germplasm. Crop Sci. 2006;46:1870–8.
- Abdurakhmonov I, Kohel R, Yu J, Pepper A, Abdullaev A, Kushanov F, et al. Molecular diversity and association mapping of fiber quality traits in exotic G. hirsutum L. germplasm. Genomics. 2008;92:478–87.
- Zhao K, Tung CW, Eizenga GC, Wright MH, Ali ML, Price AH, et al. Genome-wide association mapping reveals a rich genetic architecture of complex traits in Oryza sativa. Nat Commun. 2011;2:467.
- Atwell S, Huang YS, Vilhjálmsson BJ, Willems G, Horton M, Li Y, et al. Genome-wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines. Nature. 2010;465:627–31.
- Horton MW, Hancock AM, Huang YS, Toomajian C, Atwell S, Auton A, et al. Genome-wide patterns of genetic variation in worldwide Arabidopsis thaliana accessions from the RegMap panel. Nat Genet. 2012;44:212–6.
- Huang X, Wei X, Sang T, Zhao Q, Feng Q, Zhao Y, et al. Genome-wide association studies of 14 agronomic traits in rice landraces. Nat Genet. 2010;42:961–7.
- Kump KL, Bradbury PJ, Wisser RJ, Buckler ES, Belcher AR, Oropeza-Rosas MA, et al. Genome-wide association study of quantitative resistance to southern leaf blight in the maize nested association mapping population. Nat Genet. 2011;43:163–8.
- Zhao X, Han Y, Li Y, Liu D, Sun M, Zhao Y, et al. Loci and candidate gene identification for resistance to Sclerotinia sclerotiorum in soybean (Glycine max L. Merr.) via association and linkage maps. Plant J. 2015;82:245–55.
- Kantartzi S, Stewart JM. Association analysis of fibre traits in Gossypium arboreum accessions. Plant Breed. 2008;127:173–9.
- Zeng L, Meredith WR, Gutiérrez OA, Boykin DL. Identification of associations between SSR markers and fiber traits in an exotic germplasm derived from multiple crosses among Gossypium tetraploid species. Theor Appl Genet. 2009;119:93–103.
- Mei H, Zhu X, Zhang T. Favorable QTL alleles for yield and its components identified by association mapping in Chinese Upland cotton cultivars. PLoS One. 2013;8:e82193.
- Zhang T, Qian N, Zhu X, Chen H, Wang S, Mei H, et al. Variations and transmission of QTL alleles for yield and fiber qualities in upland cotton cultivars developed in China. PLoS One. 2013;8:e57220.
- Cai C, Ye W, Zhang T, Guo W. Association analysis of fiber quality traits and exploration of elite alleles in upland cotton cultivars/accessions (Gossypium hirsutum L.). J Integr Plant Biol. 2014;56:51–62.
- Li F, Fan G, Lu C, Xiao G, Zou C, Kohel R, et al. Genome sequence of cultivated Upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution. Nat Biotechnol. 2015;33:524–30.
- Guo Y, McCarty JC, Jenkins JN, Saha S. QTLs for node of first fruiting branch in a cross of an upland cotton, Gossypium hirsutum L., cultivar with primitive accession Texas 701. Euphytica. 2008;163:113–22.
- Ai N, Liu R, Zhao T, Qin J, Zhang T. Analysis of early maturity gene sources in upland cotton using molecular markers. Acta Agron Sin. 2013;39:1548–61.
- Liang B, Fan S, Song M, Pang C, Wei H, Yu S. Association analysis of agronomic traits in upland cotton using SSR markers. Cotton Sci. 2014;26:387–95.
- Werner K, Friedt W, Ordon F. Strategies for pyramiding resistance genes against the barley yellow mosaic virus complex (BaMMV, BaYMV, BaYMV-2). Mol Breed. 2005;16:45–55.
- Sacco A, Di MA, Lombardi N, Trotta N, Punzo B, Mari A, et al. Quantitative trait loci pyramiding for fruit quality traits in tomato. Mol Breed. 2013;31:217–22.
- Zhang B, Li W, Chang X, Li R, Jing R. Effects of favorable alleles for water-soluble carbohydrates at grain filling on grain weight under drought and heat stresses in wheat. PLoS One. 2014;9:e102917.
- Riechmann JL, Meyerowitz EM. MADS domain proteins in plant development. Biol Chem. 1997;378:1079–101.
- Nocker SV, Ludwig P. The WD-repeat protein superfamily in Arabidopsis: conservation and divergence in structure and function. BMC Genomics. 2003;4:50.
- Theißen G. Development of floral organ identity: stories from the MADS house. Curr Opin Plant Biol. 2001;4:75–85.
- Becker A, Theißen G. The major clades of MADS-box genes and their role in the development and evolution of flowering plants. Mol Phylogenet Evol. 2003;29:464–89.
- Tabata S, Kaneko T, Nakamura Y, Kotani H, Kato T, Asamizu E, et al. Sequence and analysis of chromosome 5 of the plant Arabidopsis thaliana. Nature. 2000;408:823–6.
- Gu Q, Ferrándiz C, Yanofsky MF, Martienssen R. The FRUITFULL MADS-box gene mediates cell differentiation during Arabidopsis fruit development. Development. 1998;125:1509–17.
- Hempel FD, Weigel D, Mandel MA, Ditta G, Zambryski PC, Feldman LJ, et al. Floral determination and expression of floral regulatory genes in Arabidopsis. Development. 1997;124:3845–53.
- Jing S, Pang C, Song M, Wei H, Fan S, Yu S. Analysis of MIKCC-type MADS-box gene family in Gossypium hirsutum. J Integr Agric. 2014;13:1239–49.
- Li Y, Ning H, Zhang Z, Wu Y, Jiang J, Su S, et al. A cotton gene encoding novel MADS-box protein is preferentially expressed in fibers and functions in cell elongation. Acta Biochim Biophys Sin (Shanghai). 2011;43:607–17.
- Shao S, Li B, Zhang Z, Zhou Y, Jiang J, Li X. Expression of a cotton MADS-box gene is regulated in anther development and in response to phytohormone signaling. J Genet Genomics. 2010;37:805–16.
- Guo Y, Zhu Q, Zheng S, Li M. Cloning of a MADS box gene (GhMADS3) from cotton and analysis of its homeotic role in transgenic tobacco. J Genet Genomics. 2007;34:527–35.
- Zhang X, Wei J, Fan S, Song M, Pang C, Wei H, et al. Functional characterization of GhSOC1 and GhMADS42 homologs from upland cotton (Gossypium hirsutum L.). Plant Sci. 2016;242:178–86.
- Paterson AH, Brubaker CL, Wendel JF. A rapid method for extraction of cotton (Gossypium spp.) genomic DNA suitable for RFLP or PCR analysis. Plant Mol Biol Report. 1993;11:122–7.
- Sun X, Liu D, Zhang X, Li W, Liu H, Hong W, et al. SLAF-seq: an efficient method of large-scale de novo SNP discovery and genotyping using high-throughput sequencing. PLoS One. 2013;8:e58700.
- Alexander DH, Novembre J, Lange K. Fast model-based estimation of ancestry in unrelated individuals. Genome Res. 2009;19:1655–64.
- Lipka AE, Tian F, Wang Q, Peiffer J, Li M, Bradbury PJ, et al. GAPIT: genome association and prediction integrated tool. Bioinformatics. 2012;28:2397–9.
- Holm S. A simple sequentially rejective multiple test procedure. Scand J Stat. 1979;6:65–70.
- Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2−ΔΔCT method. Methods. 2001;25:402–8.
- Zhang T, Hu Y, Jiang W, Fang L, Guan X, Chen J, et al. Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement. Nat Biotechnol. 2015;33:531–7.