Cotton fiber elongation network revealed by expression profiling of longer fiber lines introgressed with different Gossypium barbadense chromosome segments

Background Cotton fiber, a highly elongated, thickened single cell of the seed epidermis, is a powerful cell wall research model. Fiber length, largely determined during the elongation stage, is a key property of fiber quality. Several studies using expressed sequence tags and microarray analysis have identified transcripts that accumulate preferentially during fiber elongation. To further show the mechanism of fiber elongation, we used Digital Gene Expression Tag Profiling to compare transcriptome data from longer fiber chromosome introgressed lines (CSILs) containing segments of various Gossypium barbadense chromosomes with data from its recurrent parent TM-1 during fiber elongation (from 5 DPA to 20 DPA). Results A large number of differentially expressed genes (DEGs) involved in carbohydrate, fatty acid and secondary metabolism, particularly cell wall biosynthesis, were highly upregulated during the fiber elongation stage, as determined by functional enrichment and pathway analysis. Furthermore, DEGs related to hormone responses and transcription factors showed upregulated expression levels in the CSILs. Moreover, metabolic and regulatory network analysis indicated that the same pathways were differentially altered, and distinct pathways exhibited altered gene expression, in the CSILs. Interestingly, mining of upregulated DEGs in the introgressed segments of these CSILs based on D-genome sequence data showed that these lines were enriched in glucuronosyltransferase, inositol-1, 4, 5-trisphosphate 3-kinase and desulfoglucosinolate sulfotransferase activity. These results were similar to the results of transcriptome analysis. Conclusions This report provides an integrative network about the molecular mechanisms controlling fiber length, which are mainly tied to carbohydrate metabolism, cell wall biosynthesis, fatty acid metabolism, secondary metabolism, hormone responses and Transcription factors. The results of this study provide new insights into the critical factors associated with cell elongation and will facilitate further research aimed at understanding the mechanisms underlying cotton fiber elongation. Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-838) contains supplementary material, which is available to authorized users.

Cotton fiber is an excellent model for studying the mechanisms of plant cell elongation, with peak rates of expansion of >2 mm/day in Gossypium hirsutum during the elongation period [5,8,9]. In recent years, cotton functional genomics studies have provided new insights into fiber development, and transcriptome profiling has been employed to analyze the early stages of fiber development and the elongation stage in G. hirsutum, G. barbadense and G. arboretum cotton species [1,[10][11][12]. Mutant analysis in combination with microarray or next generation sequencing provides a powerful approach for discovering fiber developmental mechanisms by comparing gene expression in mutant vs. wild-type plants [11,[13][14][15]. Phytohormones such as auxins [13,16,17], ethylene [11,18] and brassinosteroids [18,19] are involved in fiber development. In addition, carbohydrate and lipid metabolisms play important roles in fiber development by providing the plant with cell wall polysaccharides and fatty acids [13,20,21]. Some genes encoding members of the cell wall-loosening expansin family are highly expressed in elongating fiber cells [20] and downregulated in fuzzless-lintless mutants [22]. Several studies have elucidated the role of xyloglucan, pectin and the actin cytoskeleton in cotton fiber elongation [23][24][25][26]. Transcription factors such as MYB25 and MYB25like are also involved in fiber development [27][28][29].
G. hirsutum, which represents over 95% of cultivated cotton worldwide, is characterized by high yield and moderate fiber quality. G. barbadense, acultivated extralong staple tetraploid cotton, is characterized by low yield and increased fiber quality (fineness and strength). Chromosome segment introgression line (CSIL) production is an effective method for combining the high yield of G. hirsutum with the superior fiber properties of G. barbadense. With the exception of a single, homozygous chromosome segment transferred from a donor parent, the remaining genome of each CSIL is the same as that of the recipient parent [30]. CSILs consist of a battery of near-isogenic lines that have been developed to cover the entire genomes of some crops, including Lycopersicon esculentum (tomato), Oryza sativa (rice), Triticum aestivum (wheat) and Gossypium (cotton) [30][31][32][33][34].
In this study, we analyzed the transcriptome profiles of longer fiber CSILs containing inserts of various G. barbadense chromosome segments in the background of the standard genetic line G. hirsutum cv. TM-1, developed in our laboratory [30], using the Illumina HiSeq 2000 platform. These results were further validated by quantitative real-time PCR, and functional enrichment and metabolic pathway analysis were performed on the DEGs. This study showed a network including carbohydrate-, fatty acid-, secondary metabolism-, hormone-and transcription factor-related genes associated with fiber elongation. The goal of this study was to gain new insights into the molecular mechanisms behind superior quality fiber formation and to identify new candidate genes as potential targets for fiber property improvement.

Plant materials
G. hirsutum cv. TM-1, the genetic standard line for upland cotton [35], was obtained from the Southern Plains Agricultural Research Center, USDA-ARS, College Station, Texas, USA. G. barbadense cv. Hai7124, extra-long staple cotton, is widely grown in China [30,36]. The detailed method used to develop the CSILs has been described previously [30]. The introgressed G. barbadense chromosomal segments were different in all four lines [37].The samples were collected at 5, 10, 15 and 20 DPA, frozen in liquid nitrogen and stored at -70°C.

RNA isolation and evaluation
Total RNA was extracted from frozen tissue using an improved CTAB extraction protocol [38]. RNAs were evaluated for quality using RNA Pico Chips in an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). All RNA samples were quantified, and samples with an RNA Integrity Number (RIN) >8 and 28S/18S rRNA band intensity (2:1) were subjected to further analysis.

Library construction and sequencing
Digital gene expression (DGE) libraries were constructed using an Illumina Gene Expression Sample Preparation Kit according to the manufacturer's instructions. A total of 24 libraries derived from immature fibers at 5, 10, 15 and 20 DPA were constructed and sequenced using the Solexa Genome Sequencing Analyzer system provided by BGI (Beijing Genomics Institute at Shenzhen, China), which was described in detail previously [39].
Data processing, statistical evaluation and selection of differentially expressed genes Raw data reads were filtered by the Illumina pipeline to produce clean data. All low-quality data, such as short tags (<21 nt) and singletons, were removed. A database of 21-base-long sequences was produced beginning with CATG using 37,505 reference genes from the diploid species G. raimondii (http://www.phytozome.net). The remaining high quality sequences were then mapped to this database; only a single mismatch was allowed, and more than one match was excluded. Gene expression levels were the summation of tags aligned to different positions of the same gene. Expression levels were expressed as TPM, transcripts per million. To identify DEGs during fiber elongation, pairs of DEG profiles from different libraries were compared. Four fiber developmental periods for the five CSILs were compared with the same period for TM-1, and 20 comparisons were obtained. P-and Qvalues were also calculated for every comparison [40]. DEGs were defined as FDR ≤ 0.001, with an absolute value of |log 2 Ratio| ≥ 1, to judge the significance of differences in transcript abundance.

Digital tag profiling analysis
Genes expressed in more than half of the libraries were used for PCC analysis, and clustering of log2-transformed TPM values of these genes was performed with the "Selforganizing tree algorithm" (SOTA, Multiple Array Viewer software, MeV 4.9.0; http://www.tm4.org/mev.html) [41]. Clustering of DEGs in CSILs at different developmental stages was performed with Cluster3.0 (http://bonsai.hgc. jp/~mdehoon/software/cluster/software.htm).
Mapman was also used to analyze gene enrichment [42] and metabolic pathways based on the KEGG database [43]. GO enrichment and KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis were performed using BLAST2GO (http://www.blast2go.com/b2ghome).

Quantitative RT-PCR
Quantitative RT-PCR assays were performed in a 7500 Real-Time PCR system (Applied Biosystems, San Francisco, CA, USA). The reactions were performed in a final volume of 20 μL containing 2 μL of diluted cDNA, 10 μL of 2× SYBR mix (Roche, Basel, Switzerland) and 200 nM of forward and reverse primers. Primer lengths were designed to range from 18 to 24 nt using Beacon Designer 7, and PCR amplicon lengths were designed to range from 100 bp to 150 bp. The thermal cycling conditions were 40 cycles of 95°C for 15 s, 60°C for 30 s and 72°C for 30 s. All reactions were run in triplicate, and the cotton histone3 gene (ACC NO.AF024716) was used as an internal control for normalization of expression levels (F: 5′-GGTGGTGTGAAGAAGCCTCAT-3′, and R: 5′-AATTTCACGAACAAGCCTCTGGAA-3′). The relative gene expression levels were presented as 2-ΔCT. The Pfaffl method was used to analyze expression data [44].

Fiber quality of CSILs and TM-1
In this study, four CSILs with longer fibers than the recurrent parent TM-1, containing inserts of various G. barbadense chromosome segment(s), were identified. The average fiber length of CSIL-35431, CSIL-31134, CSIL-31068 and CSIL-31044 was 31.33, 30.63, 31.00 and 29.97 mm, respectively, which was significantly longer than that of TM-1, while CSIL-35368 had a shorter fiber length than TM-1, at 27.67 mm ( Table 1). The fiber quality of these CSILs showed significant difference compared to the recurrent parent TM-1 and these CSILs provide good materials for the study of fiber elongation and the functional genetic study of cotton fiber trait.

Gene expression patterns during cotton fiber elongation
To obtain a global view of transcriptome profiles relevant to cotton fiber elongation, we sequenced 24 libraries of elongating fibers from CSILs and their recurrent parent TM-1. The number of raw tag per library ranged from 7.0 to 8.7 million, and the number of clean tags named distinct sequences ranged from 6.8 to 8.5 million. The distribution of unambiguous clean tag mapping to genes was nearly 50%, and 55-60% of reference genes were mapped with unambiguous tags, showing highly similar tendencies for all libraries (Additional file 1: Table S1).
A total of 22,153 genes (76.4% of all expressed genes in all libraries) were expressed in more than half of the 24 libraries. To examine the relationship between the experimental samples, Pearson correlation coefficient (PCC) analysis was performed on these genes obtained from all 24 libraries. As shown in Figure 1A, the gene expression profiles in TM-1 showed low similarities at all four stages of fiber elongation, and we also found low similarities in all libraries at the early stages (5 DPA and 10 DPA). However, at later stages (15 DPA and 20 DPA), higher similarities were observed compared to the earlier stages, except  for TM-1, which indicates that the gene expression patterns were altered more dramatically in CSILs in the later stages than in the earlier stages, perhaps because the CSILs carried distinct G. barbadense chromosomal segments.
To examine gene expression patterns during fiber development, the 22,153 genes were classified into four groups, along with an unclassified group ( Figure 1B). Genes in groups 1 and 2 were more highly expressed in the early stage than in the later stage, but genes in groups 3 and 4 showed an opposite expression pattern.
Classification of gene functions showed that group 1 and 2 genes were enriched in the categories glycerolipid biosynthetic process, phospholipid biosynthetic process, glutamine biosynthetic process, auxin signal pathway and chromatin modification, and groups 3 and 4 were enriched in the categories sucrose metabolic process, cellulose biosynthetic process, cytoskeleton organization, secondary cell wall biogenesis and glucuronoxylan biosynthetic process ( Figure 1C). This unbalanced distribution of biological process reflects the different physiological events that occur during fiber elongation.

Cluster analysis of differentially expressed genes between and/or among CSILs
To identify differentially expressed genes (DEGs) in the CSILs, we examined 20 comparison groups between CSILs and TM-1 from 5 DPA to 20 DPA. The number of DEGs between the same developmental stage ranged from 4,500 to 8,000 (Additional file 2: Figure S1). However, the number of DEGs was lower in some libraries than in others, especially in CSIL-35431 and CSIL-35368 at 5 DPA and in CSIL-31044 at 15 DPA. Interestingly, we found that more genes were upregulated than downregulated throughout the elongation stage in CSIL-35431.
To examine the expression patterns of the DEGs, we performed cluster analysis of 19,806 DEGs expressed in four CSILs. These DEGs were grouped into six clusters according to their expression patterns, designated G1-G6 (Figure 2A). Compared to TM-1, 2,486 genes in the G1 category had low expression levels from 5 to 20 DPA, while 3,273 genes in G2 were highly expressed at 10 DPA and 20 DPA, 4,698 genes in G3 were highly expressed from 10 to 20 DPA, 2,698 genes in G4 were highly expressed at 5 DPA and 10 DPA, 3,970 genes were highly expressed at 5 DPA, 10 DPA and 20 DPA, and 2,681 genes in G6 were highly expressed throughout the elongation stage.

Gene functional annotation by GO enrichment
To identify possible biological pathways that were altered in the CSILs, GO functional enrichment was performed using an FDR adjusted p-value of ≤0.05 as the cutoff. The GO annotations of DEGs related to cell wall formation and hormone for all six groups are shown in Figure 2B. We found that DEGs enriched in cell wall formation were mainly in G1 and G6. In G1, DEGs involved in cell wall degradation were downregulated in CSILs, such as the categories pectate lyase and polygalacturonase. In G6, DEGs involved in hemicellulose synthesis, cell wall protein and pectin were upregulated in the CSILs, such as the categories glucuronoxylan, AGPs and pectin esterase. Plant hormones play an important role in fiber development. DEGs related to almost all known hormones including auxin, BRs, ABA and jasmonic acid were mainly in G4 and G5. DEGs related to ethylene, cytokinin and GAs were mainly in G6.
We also analyzed the functional enrichment of common DEGs in four longer fiber CSILs from 5 to 20 DPA and selected 424, 908, 493 and 956 common DEGs at 5, 10, 15 and 20 DPA, respectively (Additional file 3: Figure S2A). Although there were fewer common DEGs at 5 DPA and 15 DPA, more processes were enriched at these two stages, including major/minor CHO metabolism, glycolysis process, cell wall, lipid metabolism and secondary metabolism (Additional file 3: Figure S2B). In addition, GO enrichment analysis of common upregulated DEGs in these four CSILs showed that there were many DEGs associated with molecular functions related to carbohydrate metabolism (Additional file 4: Table S2).

Pathway analysis of DEGs in the CSILs
Based upon GO analysis, we determined that lots of biological processes were affected in the CSILs, but it was still not quite clear how the mechanisms underlying fiber elongation were affected in the CSILs. Therefore, we performed pathway analysis of 19,806 DEGs in the four CSILs, according to Figure 2. Notably, the most highly enriched pathways included starch and sucrose metabolism (355 genes), amino sugar metabolism (145 genes), glycolysis/gluconeogenesis (133 genes), pyruvate metabolism (116 genes) and fatty acid degradation (89 genes) (Figure 3). In addition, the pathways phenylalanine metabolism, galactose metabolism and oxidative phosphorylation were enriched; additional pathways are listed in Additional file 6: Table S3.
The expression of hormone-, cell wall-, lipid metabolismand transcription factor-related genes was substantially altered in the CSILs Phytohormones play an important regulatory role in various plant growth and developmental processes. In the present study, DEGs involved in hormone biosynthesis and signal transduction pathways were identified in the CSILs. Numerous genes involved in hormone signal transduction and biosynthesis of auxin, BR, ethylene, JA and ABA were upregulated in the CSILs ( Figure 4A and Additional file 9: Table S5). In both CSIL-35431 and CSIL-31134, the expression of genes involved in auxin was more dramatically altered than that of other hormones. There were more such upregulated genes in CSIL-35431 than in CSIL-31134 throughout the elongation stage, except at 10 DPA. Notably, more genes related to all five hormones were upregulated at 10 DPA.
Several genes involved in cell wall and fatty acid biosynthesis were differentially expressed at various stage of fiber elongation in the CSILs compared to TM-1 ( Figure 4B and Additional file 9: Table S5). More genes, such as genes involved in xyloglucan endotransglycosylases (XETs), arabinogalactan (AGP) and pectin (including pectin methyl esterase [PME] and pectin acetyl esterase 1-4), expansin and tubulin, were upregulated in CSIL-35431 than in CSIL-31134. However, more ACC synthase (ACS) and long-chain acyl-CoA synthetase (LACS) genes were upregulated in CSIL-31134 than in CSIL-35431. Similar to hormone-related genes, at the early stages, the expression of the above genes was distinctly altered at 5 DPA and 15 DPA in CSIL-31134.
Moreover, genes involved in transcription factors including ERF, NAC, MYB and WRKY were highly upregulated in the CSILs ( Figure 4C and Additional file 9: Table S5). These genes in CSIL-35431 were differentially expressed at the later elongation stage, while an opposite result was obtained for CSIL-31134.
It was interesting that most of these genes were differentially expressed at 10 DPA in CSIL-31134. However, in CSIL-35431, these genes were upregulated at early stage, and TFs were upregulated at 15 DPA and 20 DPA in CSIL-35431. Our result indicated that metabolic pathways were differentially altered in CSIL-35431 vs. CSIL-31134.

Metabolism associated with fiber elongation
Sugars represent a basic source of energy, and they provide carbon skeletons for all biomolecules and are required for the regulation of cell homeostasis and the synthesis of cell wall precursors. Our result show that genes involved in minor CHO (carbohydrate), cell wall,  lipid, starch and sucrose metabolism exhibited altered expression at 10 DPA in all three CSILs (Additional file 10: Figue S5). In particular, CSIL-31134 had more DEGs involved in metabolism, and most DEGs encoding cell wall proteins and cellulose synthase showed upregulated expression in CSIL-35431. In addition, numerous genes associated with minor CHO metabolism were downregulated in CSIL-35368, especially callose-related genes.
Regulatory networks showed genes involved in cell wall precursors, glycolysis, pentose phosphate, fatty acid and phenylpropanoid metabolism were mostly upregulated in the CSILs ( Figure 5). Pyruvate and acetyl-CoA are products of glycolysis and are related to VLCFA biosynthesis and phenylpropanoid metabolism, and very long-chain fatty acid (VLCFA) is linked to ethylene biosynthesis. Genes involved in encoding nucleotide sugars such as fucose, rhamnose, arabinose and sucrose, cell wall precursors, pectin and cellulose biosynthesis were more significantly upregulated in CSIL-35431, while galactose, VLCFA and ethylene biosynthesis genes were more significantly upregulated in 31134. In addition, genes encoding 4CL and CAD (in the phenylpropanoid metabolic pathway) were upregulated in different stages in CSIL-35431 and CSIL-31134, respectively. In other CSILs, different expression pattern of DEGs involved in fucose, xylose and lignin biosynthesis were showed in Additional file 11: Table S6. These results suggest that at 10 DPA, differences in metabolism may affect fiber elongation in the CSILs.
To further examine genes related to carbohydrate metabolism, we constructed a directed acyclic graph using common DEGs involved in glycosyltransferase activity (GO: 0016757) at 10 DPA and 15 DPA in the four longer fiber CSILs ( Figure 6). Almost all of the GO terms were enriched at 15 DPA, and six GO terms (asterisk marker in the graph) were enriched at 10 DPA. Genes encoding proteins such as xyloglucan endotransglycosylases (XET), glycogenin-like starch initiation protein (GUX), galatosyltransferase and others may play important roles in fiber elongation.

Mining of DEGs in introgressed G. barbadense chromosome segments
According to the locations of introgressed G. barbadense chromosome segments detected in a previous study [37], we searched the genome sequence of G. raimondii [45], selected related genes located in these segments and identified candidate genes comparing the DEGs from the transcriptome data using the same gene name. Fortunately, two genes in CSIL-35431, 12 genes in CSIL-31134 and 56 genes in CSIL-31044 exhibited upregulated expression in the transcriptome data during the fiber elongation stage. In addition, most of these confirmed genes were upregulated mainly at 10 DPA (Additional file 12: Table S7). ARM repeat and protein kinase superfamily genes showed upregulated expression in CSIL-35431, and upregulated genes encoding amino acid kinase family and zinc finger proteins were detected in CSIL-31134. Moreover, functional enrichment analysis of the 56 upregulated genes in CSIL-31044 indicated that genes involved in transferase activity, including glucuronosyltransferase activity (GO:0080116), inositol-1, 4, 5-trisphosphate 3-kinase activity (GO:0008440) and desulfoglucosinolate sulfotransferase activity (GO:0047364), were enriched in CSIL-31044 (Additional file 13: Figure S6); the results of enrichment analysis are similar to the results of transcriptome analysis of DEGs in the CSILs.
To validate the transcriptome results, we selected 13 genes in two plant materials for quantitative real-time PCR analysis. The results of pair-by-pair comparisons between the transcriptome data and qRT-PCR analysis and qRT-PCR primers are shown in Additional file 14: Table S8. The expression levels were highly correlated (r 2 = 0.75 -0.99); therefore, quantitative real-time PCR confirmed the accuracy and reliability of the expression levels determined by DGE analysis.

Discussion
Cotton fiber cell elongation is a complex and highly regulated process involving metabolic pathways, signal transduction and transcriptional regulation. To date, the roles of carbohydrate metabolism, phytohormones, lipid metabolism and transcription factors in promoting cotton fiber elongation have been underexplored. CSILs with longer fibers provide an excellent system for studying cotton fiber elongation. First, our results support the notion that these metabolic and signaling pathways are significantly upregulated and cooperate during cotton fiber elongation. In addition, we obtained detailed information about DEGs involved in cell wall precursor biosynthesis (fucose, rhamnose, galactose, arabinose and callose biosynthesis), cell wall-related protein (AGP, LRR), verylong-chain fatty acid biosynthesis (ACCase, FATB, KCS), phenylpropanoid pathway (OMT, 4CL, CAD) and so on. We also identified new candidate genes in introgressed segments by combing the transcriptome data. Our integrated analysis provided additional details and new insights into the mechanisms of fiber elongation.
Carbohydrate metabolism, cell wall-loosening and cytoskeleton genes associate with fiber cell elongation Function enrichment, KEGG analysis and metabolic pathway analysis showed that the categories carbohydrate metabolism and cell wall biosynthesis were highly enriched in the CSILs, and the categories cell wall processor synthesis, cellulose synthesis, hemicellulose synthesis, cell wall proteins (AGP, LRR) and pectin were enriched as well (Figure 2, 3 and Additional file 10: Figure S5). We also identified a large number of genes involved in primary cell wall biosynthesis and elongation, such as xyloglucan endotransglycosylases, pectin modification enzymes, expansins, tubulins and arabinogalactans that were upregulated in the CSILs during fiber elongation (Figure 4 and Additional file 9: Table S5).
Previous studies have shown that xyloglucan modifying enzymes such as xyloglucan endotransglycosylases (XETs) play a role in fiber cell development [22,23,46]. In the current study, XET22, XET28 and other XETs were highly upregulated in the CSILs during fiber elongation, which was similar to previous reports [46]. In particular, more XETs were upregulated in CSIL-35431 from 5 to 20 DPA.
Pectin is a polysaccharide component of the primary cell wall, and pectin modification enzymes play an important role in fiber elongation [20,22,50]. We examined the expression of genes encoding two types of pectin modification enzymes, such as pectin methyl esterase (PME) and pectin acetyl esterase, which were upregulated in CSILs from 5 to 20 DPA. In CSIL-31134, more pectin modification enzymes were upregulated during the early stage of cotton fiber development than.
Arabinogalactan proteins (AGPs), which are involved in many aspects of plant development, are abundant in developing fibers and are involved in fiber elongation. GhFLA1 improves fiber initiation and elongation by affecting the integrity of the primary cell wall matrix [51]. We found that AGPs genes were upregulated mainly at 10 DPA in CSIL-35431 and CSIL-31134, with high levels of increased expression ( Figure 4B). Similar results were obtained from metabolic pathway analysis (Additional file 10: Figure S5).
Expansins and tubulins are important for cell wallloosening and fiber elongation [25,26,46,52,53]. Expansins comprise four subfamilies including α-Expansin (EXPA), β-Expansin (EXPB), Expansin-like A (EXLA) and Expansinlike B (EXLB) [54]. EXPAs produce polysaccharide complexes such as xyloglucan and pectin, which link  Figure S2. Each square indicates one molecular function with GO term number. The GO terms level decreased from top to bottom. As the color deepens (from white to red), the molecular functions were enriched more significantly, with smaller P-values. Each gene family in a GO term is listed in black. DEGs indicated in red were upregulated in four CSILs, and DEGs indicated in blue were upregulated in three CSILs. The line with an asterisk indicates that the same molecular function is enriched, using common DEGs at 10 DPA. cellulosic microfibrils together [55]. In the current experiment, Expansin A1 (α-expansin 1) genes were highly expressed in the CSILs, while one expansin-like B1 gene was only upregulated in CSIL-35431. In addition, the βtubulin 6 gene was only upregulated in CSIL-35431 during the early stage of cotton fiber elongation.
Mining of DEGs in introgressed G. barbadense segments also showed that genes involved in glucuronosyltransferase, inositol-1, 4, 5-trisphosphate 3-kinase and desulfoglucosinolate sulfotransferase (SOT) activity were upregulated in the CSILs, indicating the important roles of these enzymes in carbohydrate metabolism. Several SOT proteins have been characterized in Flaveria sp. and Brassica napus L. that show substrate specificity for several flavonols, steroids and brassinosteroids, and these genes were involved in various physiological processes, such as growth, development and adaptation to stress [56][57][58]. Also, steroid sulfotransferases are targeted by small RNAs in fiber initials [59].
Fatty acids and secondary metabolic pathways associated with fiber cell elongation Fatty acids are another important factor involved in fiber elongation [60,61]. Several ACC synthase and long-chain acyl-CoA synthetase genes were upregulated in the CSILs, and more DEGs associated with metabolic pathways were detected ( Figure 5 and Additional file 10: Figure S5). KEGG and metabolic pathway analyses also identified numerous DEGs that participate in fatty acid metabolism. For example, ACCase, FATB, LACS and KCS genes, which are important for VLCFA biosynthesis, were upregulated in CSIL-31134 [62].
Secondary metabolism-related genes were among the most statistically significant differentially expressed categories between CSILs and TM-1 during fiber elongation. The phenylpropanoid pathway participates in the biosynthesis of many plant cell wall phenolics, which are responsible for the biosynthesis of a variety of products including lignins, lignans, hydroxycinnamic acid conjugates, flavonoids and other related constituents, and cinnamyl alcohol dehydrogenase (CAD) is considered to be a key enzyme in the biosynthesis of these products [63]. Our results show that 4CL and CAD genes were upregulated in CSIL-35431 and CSIL-31134, respectively. The expression of phenylpropanoid genes was highly correlated with specific fiber properties in an inter-specific cotton recombinant inbred line (RIL) population, supporting their role in determining fiber quality [64].

Phytohormones and TFs associated with fiber cell elongation
In this study, a large group of genes response to hormones including auxin, BR, ethylene, JA and ABA were upregulated in the CSILs, which were shown to play roles in fiber cell elongation (Figure 4 and Additional file 11: Table S6) [13,18,20]. We found that more auxin response genes were upregulated during the early stage (5 and 10 DPA) than during the later stage (15 and 20 DPA) of fiber development, including auxin response factor genes and SAURlike auxin responsive genes. In addition, the enrichment results show that genes involved in auxin signaling pathways were more highly expressed at the early stage, as previously reported [7]. Also, we found that nine ethylene responsive element binding factor genes had high expression levels in the CSILs in the early stage of fiber elongation; ethylene is an important factor in fiber elongation [11]. We also detected three upregulated BR genes in the CSILs; BR biosynthesis can induce the expression of GhTUB1, GhTUB3 and GhTUB9 in cultured cotton ovules [65].

Differentially altered metabolic pathway genes in the CSILs
Interestingly, by performing PCC analysis of all 24 libraries, we determined that different PCC values existed between the CSILs and TM-1, particularly at 15 DPA and 20 DPA (Figure 1). Normally, a low PCC value was observed between different stages only in TM-1 as a result of altered expression levels of most genes during fiber elongation. We propose that the expression of DEGs in different metabolic pathways is altered differentially in these lines, particularly during the later stage of fiber elongation.
Several metabolic pathways were also examined to help uncover the mechanism of altered fiber elongation, including cell wall, lipids, minor CHO, starch and sucrose pathways (Additional file 10: Figure S5). The number of DEGs in CSIL-35431, CSIL-31134 and CSIL-35368 differed at 10 DPA, with more DEGs identified in CSIL-31134. In CSIL-35431, most DEGs encoding cell wall protein (AGP, LRR), callose and cellulose (CESA, COBRA) and genes involved in cell wall precursor synthesis were upregulated. Fatty acid and ethylene-related genes were upregulated in CSIL-31134. In CSIL-35368, numerous downregulated genes were involved in lipid metabolism and starch and sucrose metabolism, especially in callose (Additional file 10: Figure S5). Callose serves as a matrix for the deposition of other cell wall materials and as a cell wallstrengthening material in cotton fibers and growing pollen tubes [10,69]. Our functional enrichment results also show that DEGs downregulated only in CSIL-35368 were enriched in microtubule-base movement, lipid catabolic process, wax biosynthetic in other stages; these categories were reported to be related to fiber elongation.