Identification, chromosomal distribution and subcellular localization of BBX gene family
To identify the BBX genes in the Gossypium spp. genome and obtain their sequences, a global search of the Gossypium spp. genomes were carried out by using HMM profiling of the BBX domain (PF00643). After ensuring that the identified members contained conserved domains and deleted the repeated sequences, in total, of 17, 18, 37 and 33 putative BBX sequences were identified in G. arboreum, G. raimondii, G. hirsutum and G. barbadense, respectively, via genome-wide identification analysis. In G. hirsutum, 1 BBX was located on scaffold fragments. The BBXs were named according to their location on the chromosomes (Fig. 1), and the BBXs located on the scaffold fragments in G. hirsutum is finally named. Table S2 contained detailed location information. The lengths of putative GaBBX protein sequences ranged from 163 aa (GaBBX3) to 374 aa (GaBBX13); GrBBXs, 197 aa (GrBBX16) to 374 aa (GrBBX10); GhBBXs, 166 aa (GhBBX29) to 374 aa (GhBBX32) and GbBBXs, 136 aa (GbBBX3) to 505 aa (GbBBX27). The predicted MW and pI of each BBX were shown in Table S2. The results of subcellular localization showed that all of the BBXs were located in the nucleus, indicating that the nucleus was the main region of biological functions of BBXs.
Based on the genomic location information of 105 BBX genes, we visualized the chromosome distribution of GaBBXs, GrBBXs, GhBBXs and GbBBXs (Fig. 1). In G. arboreum, 17 GaBBXs were unevenly distributed on 10 chromosomes. A12 and A13 contained 3 GaBBXs, whereas the other 8 chromosomes, A01, A02, A04, A05, A06, A08, A09 and A11, contained 1 or 2 GaBBXs. In G. raimondii, 18 GrBBXs were located on 9 chromosomes. D02 and D08 contained the most GrBBXs (3), while the other 6 chromosomes contained only 1 or 2 GrBBXs. In G. hirsutum, 37 GhBBXs were unevenly mapped to 21 chromosomes, while, GhBBX37 was located on unassembled scaffolds. At13 contained 4 GhBBXs. At01, Dt01, Dt12 and Dt13 contained 3 GhBBXs, and the other 16 chromosomes contained only 1 or 2 GhBBXs. In G. barbadense, 33 GrBBXs were unevenly mapped to 20 chromosomes. At12, At13, Dt12 and Dt13 contained 3 GrBBXs. The other 16 chromosomes contained only 1 or 2 GrBBXs.
Phylogenetic analysis of the BBX gene family
To investigate the phylogenetic relationships of BBXs, 137 BBX protein sequences (G. arboreum (17), G. raimondii (18), G. hirsutum (37), G. barbadense (33) and A. thaliana (32)) were used to construct a phylogenetic tree based on the NJ method. Members of the BBX family were classified into 5 major groups, I-V (Fig. 2), and each subgroup was named according to the taxonomic results of previous studies in Arabidopsis [4]. It was worth noting that although AtBBX26 and AtBBX27 belonged to group V of Arabidopsis according to their structural classification, they were phylogenetically closer to AtBBX12 and AtBBX13, which were in group II. As shown in Fig. 2, group II was the smallest subgroup, containing 7 BBXs. By contrast, group IV had the most massive numbers of BBX genes, including 69 BBXs. There were 29 BBXs in group I. No cotton species were divided into group III or group V. In G. hirsutum, BBX gene had 2, 9 and 26 members in group II, I and IV, respectively.
Replication events of BBX gene family
From the perspective of the cotton evolution, tetraploid cotton is the result of genome doubling of two diploid cotton hybrids. In terms of the number of genes, the sum of BBX genes in G. arboreum (17) and G. raimondii (19) was about equal to the number of those in G. hirsutum (37) or G. barbadense (33). The results further confirmed this view. To explore the replication events of the BBX gene family, MCScanX was used to analyze the collinearity between the At and Dt subgenomes of G. hirsutum and their corresponding ancestral A and D diploid genomes (Fig. 3). The data showed that most homologous gene pairs of the BBX gene family were amplified by segmental replication, which meant segmental replication played a key role in the evolution of the BBX gene family. However, the genomic evolution of allotetraploid cotton is extremely complex. In the process of evolution, the genome has experienced not only segmental duplication events but also many tandem duplication events. The duplicate types of BBXs in G. hirsutum were shown in detail in Table S3. In G. arboretum and G. raimondii, 1 and 2 tandem duplication events (GaBBX15/GaBBX16 as well as GrBBX1/GrBBX2 and GrBBX17/GrBBX18) were identified, respectively. In G. hirsutum, 4 tandem duplications (GhBBX1/GhBBX2, GhBBX15/GhBBX16, GhBBX18/GhBBX19, GhBBX34/GhBBX35) were discovered.
Ka/Ks ratios were calculated to evaluate the selection pressure of these homologous gene pairs. Among the 95 homologous gene pairs, 90 homologous gene pairs had Ka/Ks values < 1, which indicated that most of the homologous gene pairs had undergone purifying selection in the process of evolution, and these genes pairs might play a similar function. Only a few homologous gene pairs had experienced positive selection, which might lead to new biological functions of these genes.
Analysis of gene structure and conservative motif
The results of the phylogenetic analysis showed that 37 GhBBXs could be divided into 5 groups (A-E), which contained 9, 2, 12, 5 and 9 members, respectively (Fig. 4I). To better understand the structural characteristics of GhBBXs, the exon/intron structure was analyzed by GSDS (Fig. 4III). The GhBBX genes contained 3 to 7 exons, but most of them contained less than 5 exons. Moreover, the conserved motif was further analyzed by MEME program. The GhBBXs in the same group showed similar motif composition, which further validated the classification results (Fig. 4II). Except for group A, the order of motif 1 and motif 2 in GhBBX of other groups was the same. Motif 3 existed only in group A and group B, but motif 4 existed in all groups except group A and group B. Motif 5 existed only in group C, while motif 6 only existed in group E. Figure 4 showed that the distribution of conserved motif and exon/intron structure were different among different groups, but they were highly conservative on the same branches. The results showed apparent conservation, which laying a foundation for functional conservatism and providing guidance for follow-up functional research.
Analysis of cis-acting elements in GhBBX promoter regions
To better understand the regulation of GhBBXs gene transcription and expression, the promoter region of GhBBX (genomic DNA sequence 2 kb upstream of the transcription start site) were used to search the PlantCARE database. A variety of cis-elements were found in the GhBBX promoter region. Among the cis-acting elements, the cis-acting elements related to phytohormone and stress response were the focus of our attention. We found abscisic acid (ABA) response element, gibberellin (GA) response element, auxin (IAA) response element, salicylic acid (SA) response element and methyl jasmonate (MeJA) response element in 21, 19, 11, 17 and 17 GhBBX promoters, respectively. In some GhBBX promoters, there were cis-acting element related to multiple phytohormone, while in other GhBBX promoters, there were only cis-acting element related to a single phytohormone response. In terms of stress-related response elements, these cis-acting elements were mainly related to low temperature, drought, anaerobic and other defenses. In the midst of these elements, the anaerobic cis-acting element was the most frequent stress response element, which appeared in the promoters of 32 GhBBX genes, followed by the cis-acting element in response to low temperature. It existed in the promoters of 20 GhBBX genes. Thus, it could be seen that GhBBX might respond to stress response and abiotic stress of cotton. In addition, a large number of light response elements were found in the promoter region of GhBBXs, including Box-4, G-box, GT1-motif, TCT-motif and MRE.
Expression patterns of GhBBXs in different tissues
In order to study the expression pattern of GhBBXs in different tissues, we analyzed the transcriptomic data of root, stem, leaf, anther, filament, pistil and petal in TM-1. The results showed that different members of the cotton BBX family showed different expression patterns. According to the expression characteristics and based on hierarchical clustering analysis, 37 GhBBXs were divided into 3 categories (I-III) (Fig. 5). 6 GhBBXs (GhBBX5, 8, 9, 23 26, and 28) belonging to group II were highly expressed in nearly all tissues. 10 GhBBXs (GhBBX2, 4, 10, 16, 19, 21, 22, 29, 32 and 35) belonging to group I were poorly expressed in all tissues. The remaining members (GhBBX1, 3, 6, 7, 11, 12, 13, 14, 15, 17, 18, 20, 24, 25, 27, 30, 31, 33, 34, 36 and 37) belonging to group III exhibited slightly higher expression in vegetative organs, while others showed slightly higher expression in floral organs. These differences in expression patterns might be related to the various functions of GhBBXs.
Expression characterization of GhBBXs in cotton flower bud differentiation
Flower bud differentiation is an important sign that a plant is undergoing a transition from vegetative growth to reproductive growth [27]. To explore whether GhBBX gene, which was highly expressed in flower organs, was involved in the process of flower bud differentiation, we analyzed the relative expression of these genes in the leaf and shoot apex of the early-maturing cotton cultivar CCRI50 and late-maturing cotton cultivar GX11 from one-leaf stage to five-leaf stage. The graphical representation of the expression profiles of 6 genes in the leaf and shoot apex at 5 different times was shown in Fig. 6. In the leaf, the expression levels of these six genes in the three-leaf stage and five-leaf stage of early-maturing materials were higher than those of late-maturing materials, and showed a significant difference. There was a difference in the expression of GhBBX5, GhBBX8 and GhBBX26 between early-maturing materials and late-maturing materials at four-leaf stage, but in one-leaf stage and two-leaf stage, only the expression of GhBBX23 was different between early-maturing materials and late-maturing materials, there was no significant difference in the expression of the other five genes. In the shoot apex, the expression levels of GhBBX5, GhBBX8, GhBBX9, and GhBBX26 in the early-maturing material were higher than those in the late-maturing material, and significant or extremely significant differences were detected at 4, 1, 3, and 5 periods, respectively. The expression levels of the other 2 genes, GhBBX23 and GhBBX28, in the early-maturing material were higher than those in the late-maturing materials at the most stages, but at three-leaf stage, the expression levels in the early-maturing material were lower than those in the late-maturing material.
GhBBXs interact with key genes in cotton flowering pathway
The previous studies showed that GhFT, GhSOC1, GhLFY and GhAP1 might be key factors in cotton flowering pathway [28,29,30,31]. In order to explore the interaction of GhBBXs with key regulatory factors in flowering pathway, we selected GhBBX5, GhBBX8, GhBBX9, GhBBX23, GhBBX26 and GhBBX28 to detect their interaction with GhFT, GhSOC1, GhLFY and GhAP1 by yeast two-hybrid assay. The results showed that GhBBX5, GhBBX8, GhBBX23 and GhBBX26 could interacte with GhFT. At the same time, there was no interaction between these genes and GhSOC1, GhAP1 and GhLFY (Fig. 7). Therefore, the results suggested that some members of GhBBXs regulated the flowering time by interacting with GhFT.
Expression pattern of GhBBXs under multiple stress
The analysis of cis-acting elements in the promoter region showed that GhBBXs might be related to the abiotic stress, and some studies have shown that BBX genes play a positive role in abiotic stress resistance [23, 32]. Therefore, based on the transcriptomic data of TM-1, we further analyzed the expression patterns of GhBBX under PEG and NaCl stresses. The results showed that GhBBXs clustered mainly into three categories (I-III) based on a hierarchical clustering analysis according to expression features (Fig. 5B). The GhBBX genes that belonged to groups II and III responded to PEG and NaCl, especially at 12 h after treatment, showing significantly downregulated expression. 11 GhBBXs belonging to group I showed universally low expression after stress treatment.
Based on the transcriptomic data, GhBBX5, GhBBX8, GhBBX9, GhBBX17, GhBBX23, GhBBX26, GhBBX28 and GhBBX36 which belonged to group II and whose expression levels increased after stress treatment, were selected for qRT-PCR verification. As shown in Fig. 8, PEG stress affected the activity of GhBBXs. GhBBX5, GhBBX23 and GhBBX28 responded to PEG treatment at 1 h, after which their expression was downregulated continuously until a minimum point was reached at 12 h. Afterwards, their expression increased at 24 h. The expression of GhBBX8, GhBBX9, GhBBX17 and GhBBX26 increased at the beginning, after which it decreased but then increased again. The minimum expression level occurred at 12 h. GhBBX36 was the only gene whose expression abruptly decreased but then increased, followed by a sudden increase after reaching the minimum at 12 h. Under NaCl stress treatment, GhBBX5, GhBBX8, GhBBX9, GhBBX17, GhBBX23, GhBBX26 and GhBBX36 showed the same expression pattern. The expression of these genes quickly decreased but then increased. Afterwards, their expression reached a minimum at 12 h followed by an increase. GhBBX28 was the only gene whose expression began at 9 h, after which it decreased; after reaching a minimum at 12 h, the expression increased. Taken together, the results showed that GhBBXs could respond to plant abiotic stresses, and at the same time, it provided potential candidate genes for further study.
Expression pattern of GhBBXs under phytohormone
In order to study the response of GhBBXs to phytohormone, 9 genes containing corresponding hormone-response elements in their promoters were selected for qRT-PCR experiment to further analyze the transcript levels of BBX genes under three different hormone treatments (Fig. 9). After treated with ABA (Fig. 9 A), the expression of most of these genes increased early (mainly from 0.5 to 2 h), followed by a decrease. The expression of three genes decreased early, after which it increased and finally decreased with time. Under GA treatment (Fig. 9B), 4 expression patterns were found. the expression of most of these genes increased early, peaking at 0.5 or 1 h after GA application, after which their expression decreased over time. The expression of two genes (GhBBX18 and GhBBX20) was downregulated at all time points, and the expression of one gene, GhBBX11, was upregulated at all time points except the 9 h time point. The expression of GhBBX7 was upregulated early on, after which it decreased and then increased. In response to IAA treatment (Fig. 9 C), these genes could be divided into 2 major patterns on the basis of their expression characteristics. The expression of most of these genes increased early, followed by a decrease, peaking from 0.5 to 2 h. Three genes (GhBBX7, GhBBX13 and GhBBX25) presented downregulated expression at all time points, reaching a minimum at 12 h.