Genome-wide characterization of the GRF family and their roles in response to salt stress in Gossypium
BMC Genomics volume 21, Article number: 575 (2020)
Cotton (Gossypium spp.) is the most important world-wide fiber crop but salt stress limits cotton production in coastal and other areas. Growth regulation factors (GRFs) play regulatory roles in response to salt stress, but their roles have not been studied in cotton under salt stress.
We identified 19 GRF genes in G. raimondii, 18 in G. arboreum, 34 in G. hirsutum and 45 in G. barbadense, respectively. These GRF genes were phylogenetically analyzed leading to the recognition of seven GRF clades. GRF genes from diploid cottons (G. raimondii and G. arboreum) were largely retained in allopolyploid cotton, with subsequent gene expansion in G. barbadense relative to G. hirsutum. Most G. hirsutum GRF (GhGRF) genes are preferentially expressed in young and growing tissues. To explore their possible role in salt stress, we used qRT-PCR to study expression responses to NaCl treatment, showing that five GhGRF genes were down-regulated in leaves. RNA-seq experiments showed that seven GhGRF genes exhibited decreased expression in leaves under NaCl treatment, three of which (GhGRF3, GhGRF4, and GhGRF16) were identified by both RNA-seq and qRT-PCR. We also identified six and three GRF genes that exhibit decreased expression under salt stress in G. arboreum and G. barbadense, respectively. Consistent with its lack of leaf withering or yellowing under the salt treatment conditions, G. arboreum had better salt tolerance than G. hirsutum and G. barbadense. Our results suggest that GRF genes are involved in salt stress responses in Gossypium.
In summary, we identified candidate GRF genes that were involved in salt stress responses in cotton.
Cotton (from the genus Gossypium) is the most important fiber crop in the world. While the genus itself contains more than 50 species, cultivated cottons are composed of four independently domesticated species, among which the Upland cotton (G. hirsutum L.) is the most prominent cultivar, accounting for 95% of global cotton fiber output [1, 2]. Due to the agronomic importance of cotton, it is worth studying its defense and stress responses. Cotton plants produce gossypol and related sesquiterpene aldehydes that function as defensive compounds against pests and pathogens [3,4,5,6,7]. With respect to abiotic stresses, cotton has a moderate tolerance to salinity and consequently is cultivated in some saline-alkali soils; however, salt stress is still a limiting factor that affects cotton production .
Identification and characterization of salt response genes is important in dissecting the molecular mechanisms of plant adaptation to salt stresses and, ultimately, in engineering salt-tolerant crops. Consequently, research into the genetic responses to salt stress has yielded much information (see recent reviews, such as [9,10,11,12]), including the role of growth regulation on salt tolerance. Growth-regulating factors (GRFs) are plant-specific transcription factors, which form a relatively small family and function in plant development and stress responses . The first GRF was identified from rice, which uncovered a regulatory role for GRFs during leaf and stem development . Subsequent research showed that these transcription factors are also involved in other aspects of plant growth and adaptation, including root development , flowering [16, 17], leaf size and longevity , and response to abiotic stresses [19,20,21,22]. Recent research on abiotic stress responses have reported a role for GRF repression in abiotic stress tolerance. For example, the activation of the Arabidopsis stress-responsive gene AtDREB2A, whose expression increases plant tolerance to osmotic stress [23, 24], requires repression of at least one GRF. Under non-stress conditions, AtGRF7 binds to the DREB2A promoter to suppress expression; however, abiotic stress leads to suppression of AtGRF7 expression and consequently the activation of osmotic stress-responsive genes .
Recent advances in cotton genomics have produced the resources necessary to characterize the GRF gene family in Gossypium. Multiple high-quality genome sequences are available for several species, including two diploid species, i.e., G. raimondii (D5) [25, 26] and G. arboreum (A2; cultivated) [27, 28], and two cultivated allotetraploids, namely G. hirsutum (AD1) [29,30,31] and G. barbadense (AD2) [31,32,33,34]. These resources provide the foundation for identifying the suite of GRF genes in Gossypium and their potential roles as salt stress-related genes. Here, we report our genome-wide analysis of the GRF transcription factor genes in four important cotton species, reporting a modestly sized family (18 and 19 members in diploid species, 32 and 45 in polyploids) that are largely conserved during evolution and polyploidization. We further explore the role of GRF genes in response to salt stress. We identified three G. hirsutum GRF genes, six G. arboreum GRF genes, and three G. barbadense GRF genes that exhibit decreased expression under salt stress, respectively. Compared to the tetraploid species, diploid cotton G. arboreum was more salt tolerant. These candidate GRF genes may be useful in future molecular breeding of salt-tolerant cotton species.
Genome-wide identification and sequence analysis of genes encoding putative GRFs in G. hirsutum
GRF proteins are involved in plant responses to abiotic stresses [19, 22], including salt stress . These proteins are defined by the presence of a QLQ domain for protein-protein interactions, a WRC domain responsible for DNA binding and a putative nuclear localization signal [13, 14, 35]. Hmmersearch against the G. hirsutum genome database with these two conserved domains (PF08879 for WRC domain and PF08880 for QLQ domain) identified 34 putative GRF-encoding genes, here designated GhGRF1A–GhGRF17A (A homoeologs) and GhGRF1D–GhGRF17D (D homoeologs), where the A and D suffix in the gene name indicates the genome of origin. These 34 GhGRF genes are dispersed over 20 of the 26 G. hirsutum chromosomes, with most, but not all, homoeologs conserved (Fig. 1). Syntenic conservation of two homoeologous pairs, i.e., GhGRF5A/D and GhGRF11A/D, was disrupted by a known large chromosomal translocation [29, 30, 32]. This translocation resulted in these conserved homoeologs being located on the non-homoeologous chromosomes A02 and D03 (Fig. 1). In addition, a single homoeolog each was recovered for both GhGRF7D and GhGRF13A (Fig. 1) by conserved domain search; however, BLAST recovered syntenic copies of GRF-like genes for both genes that could represent the missing homoeologs, i.e., GhGRF7A and GhGRF13D. Both syntenic GRF-like homoeologs lack the canonical conserved GRF protein domains, possibly indicating a loss of, or change in, function; therefore, these homoeologs were excluded from further analyses.
Structural organization of GhGRF genes
More than threefold variation in length was detected in the predicted coding sequences (CDS) for the recovered GhGRFs, from 546 bp for GhGRF14A/D to 1833 bp for GhGRF2A/D, (Table 1), which translates to proteins ranging from 181 amino acids (aa) (20.73 kDa) to 610 aa (65.58 kDa). Predicted isoelectric points (pI) for members of this family also vary widely, from 5.89 to 9.59, possibly due to the composition of the C-terminal region (Additional file 3: Figure S1). All of the putative GhGRF proteins have both QLQ and WRC domains  in the N-terminal region (Table 1, Fig. 2, Additional file 3: Figure S1). GhGRF15A and GhGRF15D also contain a second WRC domain downstream of the first one (Additional file 3: Fig. S1). A zinc finger motif (CCCH) was also observed within the WRC domain in all putative GhGRF proteins (Fig. 2).
While all putative GhGRF genes contain predicted introns (Fig. 3), they also exhibit considerable variation, in both length and number. In general, homoeologous GRF genes show highly similar intron patterns (see exceptions below); however, intron structure among homoeologous pairs can exhibit variation in intron number (1 to 4) and length. Three of the homoeologous gene pairs did exhibit divergence in structure between homoeologs, namely GhGRF9A vs GhGRF9D, GhGRF16A vs GhGRF16D and GhGRF17A vs GhGRF17D. In two of the three cases (i.e., GhGRF9, GhGRF16), the acquisition of a splice site in the A-homoeolog led to an additional, large intron. For GhGRF17A/D, however, the D-homoeolog exhibits loss of splicing for the first intron, resulting in read-through transcription. Characterization of parental (in the diploids) gene structure for these three homoeologs suggests that this structural variation represents inherited, parental divergence. Phylogenetic analysis of the GRF gene family combined with intron/exon structure characterization naturally divides the family into six groups, here designated A-F, containing 3 to 12 genes (Fig. 3).
Phylogenetic analysis of GRF proteins in Gossypium
The general conservation of GRF genes between the two homoeologous genomes of G. hirsutum suggests minimal loss and/or gain since the divergence of the progenitor diploid genomes. We specifically assessed this using the protein sequences of 114 cotton GRFs (G. hirsutum, 32; G. barbadense, 45; G. raimondii, 19; and G. arboreum, 18; see methods) with 9 Arabidopsis thaliana GRFs for phylogenetic analysis (Fig. 4). The six previously designated clades (A–F) were again recovered plus extra clade “G”, with 1–2 A. thaliana genes associated with each clade except clades B and G, which are composed of Gossypium GRFs only.
Overall, the expected diploid-polyploid topology is reflected in the tree for each set of orthologous/homoeologous genes, indicating general preservation during diploid divergence and through polyploid evolution. That is, the number of GRF genes in G. hirsutum was generally additive with respect to the model diploid progenitors (G. raimondii and G. arboreum), with each homoeolog (At or Dt) sister to their respective parental copies.
Although the GRF family exhibits general preservation, a few deviations were noted. For example, Clade C exhibits evidence of homoeolog loss; that is, the D copy of GhGRF13 is missing from polyploid genome, as is both copy of GaGRF11/GrGRF13 (one from each subgenome). In clade D, genes related to AtGRF8 exhibit duplication in G. arboreum only, while the A-homoeolog from either polyploid species was not recovered from the genome sequence. This may indicate a duplication in G. arboreum after divergence from the polyploid ancestor coupled with loss in the A-subgenome of the polyploid. The evolution leading to clade E is less straightforward in that it is composed of eight genes, including one from each diploid progenitor and three from each tetraploid species. The phylogenetic topology, however, does not suggest a simple duplication of one homoeologous copy; rather, the homoeologous copies of GhGRF8 are placed sister first to each other and then to G. arboreum only. The remaining GRF from G. hirsutum, GhGRF7D, is sister to the only G. raimondii GRF of this clade and has no inferred homoeologous AT copy. Given this pattern of relationships and that G. raimondii has one additional GRF (relative to G. arboreum; see next section), it seems possible that GhGRF7D represents a uniquely inherited GRF present in the D ancestor only and that the G. raimondii ortholog of GhGRF8D was lost after polyploidization. Alternatively, this pattern could reflect a non-syntenic duplication of GhGRF8D, followed by conversion of the original GhGRF8D into an A-like copy. It bears noting, however, that these observed deviations could be due to errors in the genome sequences; however, the general preservation of GRF copies in the expected relationships suggests general robustness of the analysis. Clade G contains 11 GRF genes, one from G. raimondii and the other ten from G. barbadense, also indicative of substantial duplication in G. barbadense. These new genes belongs to one family and might specially originate in G. raimondii and the tetraploid progenitor of G. barbadense or undergone the homoeolog loss in G. arboreum and G. hirsutum.
Dynamic evolution of GRF family genes in plants
We further evaluated the general preservation of GRF genes in plants using 28 representative species and the same search criteria (see methods and Fig. 5). In Chlorophyta, only one GRF gene was identified as putatively encoding a GRF protein (in Chlamydomonas reinhardtii only); however, all land plant species surveyed recovered a minimum of five putative GRF genes in Physcomitrella patens and nearly double were found in angiosperms (8 in the basal angiosperm, Amborella trichopoda). Among angiosperms, GRF copy number varied from the minimum of 8, in Amborella trichopoda, to 48 copies in the tetraploid Brassica napus. Four monocot species were included, where between 9 (in rice) to 27 (in maize) copies were detected. Notably, this high copy number in maize is slightly more than double the copy number in sorghum, likely reflective of the duplicated history of maize. GRF copy numbers in the eudicot species surveyed varied even more, from 12 putative GRF genes in Carica papaya to 48 in Brassica napus, notably another polyploid species genome.
Comparatively, Gossypium GRF copy number is generally stable. We recovered 19 and 18 GRF genes from G. raimondii (model D-genome ancestor) and G. arboreum (model A-genome ancestor), respectively, dispersed among 10 of the 13 chromosomes (Additional file 4: Figure S2). The allotetraploid cotton species included here both contain a roughly additive number of genes; however, while G. hirsutum appears to have lost genes (34 GRF genes, versus the additive expectation of 36), G. barbadense has gained 9 additional predicted copies through duplication (45 versus 36; Fig. 5).
Expression changes of GhGRF genes in G. arboreum, G. hirsutum, and G. barbadense under salt stress
GRF genes have been linked to various aspects of growth, development, and response to stress. Long-term effects of salt stress affect all aspects of the plant life cycle, from seed germination through growth and development to future reproductive potential . While the effects of salt stress on roots is often considered [37, 38], equally important are the consequent physiological changes in leaves where the plants exclude sodium ions, potassium/potassium balance, and reduce water loss [39,40,41,42,43,44,45]. Previously, a single cotton GRF gene (GhGRF1 ;) was associated with responses to salt stress; however, the contributions of the remaining family members to salt stress responses is unknown.
To evaluate the contribution of these GhGRF genes to leaf physiology, we first evaluated their expression under normal conditions in 12 different cotton tissues, including root, stem, leaf, cotyledon, hypocotyl, petal, stamen, pistil, sepals, receptacle, ovules (0 dpa) and fiber (6 dpa), using both qPCR and existing RNA-seq data; homoeologs were not distinguished in the qPCR analysis. Most GhGRF genes exhibited different expression patterns among the various tissues; however, certain generalities were observed. Several tissues exhibited distinct paralog preference in our qPCR survey. For example, six of the examined GhGRF genes (GhGRF5, 6, 11, 15–17; Fig. 6) were broadly expressed across tissues, whereas six other GhGRF genes (GhGRF7, 9, 10, 12–14; Fig. 6) were mainly expressed in the ovule and fiber samples, with limited expression in most other tissues. Most GhGRF genes, including GhGRF5-GhGRF12, were expressed in the developing ovule (Fig. 6), which transitioned to preferential usage of seven genes (GhGRF1, GhGRF4, GhGRF7, GhGRF10, GhGRF13 and GhGRF14) during the primary elongation stage of cotton fiber development (6 dpa). Transcripts of GhGRF1, GhGRF2, GhGRF3 and GhGRF4 were most abundant in stem and root, whereas GhGRF16 and GhGRF17 were preferentially expressed in cotyledon. Over half of the GhGRF genes showed the relatively high level of transcripts in leaves.
RNA-sequencing (RNA-Seq) data (Additional file 5: Figure S3) obtained from the G. hirsutum genome sequencing project  and the ccNET database  are generally congruent with the expression patterns revealed by qPCR (Fig. 6), although there exist a few discrepancies (Additional file 5: Figure S3). These are generally limited to relative abundance among tissues, such as GhGRF14, which exhibited the highest expression in pistil by the RNA-Seq data, but showed higher expression in ovule (0 dpa) and fiber (6 dpa) using qPCR (Fig. 6). Expression in the leaf RNA-seq was much lower for most genes, except GhGRF1, and the observation may be attributable to differences in leaf maturity. Generally, however, the two expression analyses suggest a complex pattern of paralog usage among GhGRF genes indicative of multiple biological functions during plant growth and development.
Upland cotton can experience stress  due to salt accumulation in the soil. Since GRFs have been shown to respond to salt stress responses in other plants , we observed phenotypic changes under two concentrations of NaCl and concurrently analyzed GhGRF expression profiles using qRT-PCR (Fig. 7). Phenotypic changes in the leaves of Upland cotton were observed using both 200 and 500 mM NaCl over a period of 6 h. With 200 mM NaCl, leaves of G. hirsutum experienced yellowing only (indicative of chloroplast degradation), whereas treatment with 500 mM NaCl resulted in leaf yellowing and withering (Fig. 7a). Expression changes (relative to the control; Fig. 7b) were not significant for the 200 mM NaCl at three time-points (1 h, 3 h and 6 h) post-treatment; however, the higher NaCl concentration (500 mM) resulted in down-regulation of five genes (GhGRF3, GhGRF4, GhGRF5, GhGRF7, and GhGRF16; Fig. 7b). No GhGRF genes experienced up-regulation under salt stress, which is congruent with the observation that salt stress leads to diminished growth. The RNA-seq data derived from the G. hirsutum genome and ccNET  generally concur with these results. For example, 3 of the 5 genes downregulated here (i.e., GhGRF3, GhGRF4, and GhGRF16) also exhibited significantly decreased expression levels (≥2-fold change) at four time-points during NaCl stress in the available RNA-seq data (Fig. 8). These three GRFs in particular may be commonly responsive to salt stress and warrant further functional characterization.
We also observed the phenotypic changes of G. arboreum and G. barbadense under two concentrations of NaCl and concurrently analyzed the GaGRF and GbGRF expression profiles by qRT-PCR (Fig. 9). Phenotypic changes in the leaves of G. arboreum and G. barbadense were observed using both 200 and 500 mM NaCl over a period of 6 h. Although neither 200 nor 500 mM NaCl treatment resulting in yellowing or withering of G. arboreum leaves (Fig. 9a), downregulation of six GaGRF genes (Fig. 9b) was nevertheless detected at three time-points (1 h, 3 h and 6 h) post-treatment. Notably, only two of these genes, i.e., GaGRF5 and GaGRF16, were also down-regulated in G. hirsutum. Similar to G. hirsutum, G. barbadense was modestly affected by low salt treatment, with leaf withering beginning at 200 mM NaCl and both leaf yellowing and withering present at 500 mM NaCl (Fig. 9c); however, here only three GbGRF genes (Fig. 9d) exhibited down-regulation, only one of which (GbGRF4) exhibited downregulation in either G. hirsutum or G. arboreum. Of the three genes downregulated in G. barbadense, expression changes of GbGRF17 was significant for the 200 mM and 500 mM NaCl at three time-points (1 h, 3 h and 6 h) post-treatment. These results demonstrate that (1) G. arboreum has better salt tolerance than both G. hirsutum and G. barbadense, congruent with the previous observations [28, 48, 49], and (2) GRF responses to salt stress overlap, but are not consistent, among species.
The GRF transcription factor family of genes has been investigated in a number of plant species, including Arabidopsis thaliana [20, 50,51,52,53], Zea mays , Brassica napus , Brassica rapa , Oryza sativa [56,57,58], Brachypodium distachyon  and Solanum lycopersicum . The family generally contains conserved functional regions, including a protein-protein interaction QLQ (glutamine, leucine, glutamine) domain that was involved in chromatin remodeling  and a plant-specific WRC (tryptophan, arginine, cysteine) motif relevant to DNA binding and targeting of the TF to the nucleus [14, 58]. The conservation of these domains among GRF genes facilitates computational identification of the family from new and emerging genomes. Here, we identify 32 GRF genes from the allotetraploid G. hirsutum, and show that these genes are generally conserved among the diploid progenitors and the cultivated allotetraploid cottons. Analysis of the GRF genes in an additional cotton species, as well as representatives of diverse plant lineages, show that the family is somewhat labile but the most remarkable changes in copy number are attributable to retention after genome duplication. In tetraploid cottons, the gene copy number is nearly additive in G. hirsutum, but contains 25% more paralogs in the related (and also domesticated) G. barbadense genome. Phylogenetic analysis indicated that these additive genes belongs to one family and specially originate in G. barbadense. Further research is required to discern what functional relevance these additional copies have.
We systematically characterized cotton GRF family genes using bioinformatic and phylogenetic approaches and gene expression analyses. We analyzed gene structures, chromosomal locations, intron-exon organizations, phylogenetic relationships and expression profile patterns in different cotton tissues and under salt stress condition to predict their possible biological functions. GhGRF genes are variably expressed in different cotton tissues with particularly high expression in ovules. The decreased expression of several GhGRF genes in response to salt stress treatments implies their function in the regulations of growth and development under the abiotic stress conditions. Together, our results provide data to facilitate the functional identification of the GRF genes in cotton plant growth, development and stress tolerance.
Identification of GRF family genes and GRF proteins in diploid and tetraploid Gossypium species
We downloaded the genome sequences of four cotton species from the CottonGen database , including G. raimondii , G. arboreum , G. hirsutum , and G. barbadense . To identify all putative GRF transcription factor proteins in each genome assembly, the GRF protein conserved domains (PF08879 for WRC domain and PF08880 for QLQ domain) were used to develop a Hidden Markov Model  profile matrix via the hmmbuild program from the HMMER package  using default parameters. This HMM profile matrix was used in conjunction with hmmersearch with default parameters against three Gossypium genome databases, i.e., G. raimondii, G. arboreum, and G. hirsutum, to identify putative GRF genes (GhGRFs) that contain the relevant conserved protein domains. Genes were considered candidate GRFs if they harbored WRC and QLQ domains within the N-terminal . Previously identified GRF gene sequences from Arabidopsis thaliana (AtGRFs) were retrieved from the TAIR database  for phylogenetic comparison. The presence of conserved domains in each Arabidopsis gene was verified using the SMART conserved domain search tool  and Pfam databases . The same method was used to identify the number of GRF genes in other plant genera/species (Additional file 1: Table S1).
Chromosomal location and gene structure analyses
Chromosomal locations for each of the above identified GhGRFs were extracted from the genome annotation gff3 file . Chromosomal locations of the predicted GhGRFs was visualized using MapChart , and the exon-intron structure of each gene was displayed using the online tool GSDS 2.0 . The number of amino acids, molecular weight (MW), and theoretical isoelectric point (pI) of putative GhGRF proteins were determined using the ProtParam tool .
Sequence alignment and phylogenetic analyses
Complete protein-coding sequences for each of these genes from all three cotton species and Arabidopsis were aligned using MAFFT with the G-INS-i algorithm . Phylogenetic analyses based on the whole protein sequences were performed using Neighbor-Joining (NJ) and Maximum Likelihood (ML). The NJ tree was constructed using MEGA version 5.03  by sampling 1000 bootstrap replicates. The ML tree was also built using MEGA version 5.03, using the general time reversible (GTR) model, including rate variation among sites (+G) and invariable sites (+I; full model = GTR + G + I), and running 1000 bootstrap replicates of the data.
Transcriptome data-based gene expression analyses
Raw RNA-Seq data for G. hirsutum seed, root, stem, leaf, torus, petal, stamen, ovary, calyx, ovule (− 3 dpa, − 1 dpa, 0 dpa, 1 dpa, 3 dpa, 5 dpa, 10 dpa, 20 dpa, 25dpa, 35dpa) and fiber (5 dpa, 10 dpa, 20 dpa, 25dpa) were downloaded from the NCBI Sequence Read Archive (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA248163) , represented by one library each. Reads were mapped to G. hirsutum genome  via HISAT2 software with default parameters, and read abundance with calculated via StringTie [72, 73]. Read counts were normalized in R3.2 using RUVSeq  and the internal control reference gene GhUBQ7 (accession number: DQ116441), which is detected at relatively constant levels across different cotton samples . Potential batch-effects were corrected by an improved version of ComBat, ComBat-seq . Gene expression was estimated by Ballgown , using fragments per kilobase million (FPKM) values to calculate the gene expression levels across libraries. Expression levels of G. hirsutum leaf RNA-Seq data (in FPKM) for each GhGRF gene under salt stress (time points: 0, 1, 3, 6, 12 h) were retrieved from the ccNET database . Genes were considered differentially expressed if expression varied more than two-fold change with a p-value of less than 0.05. Multiple Experiment Viewer (MeV)  was used to display the gene expression patterns from the reported FPKM values.
Plant cultivation and treatment
To generate new expression information via qRT-PCR, we grew representatives of G. hirsutum, G. arboreum, and G. barbadense. For G. hirsutum, seeds of G. hirsutum cv. R15 , were germinated in potting soil in a growth chamber, and the resulting seedlings were maintained in a controlled environment at 28 °C day/20 °C night, with a 16-h light/8-h dark photoperiod. Roots, stems, leaves, cotyledons, and hypocotyls were collected from the three-week old plants, and additional samples were collected from older, flowering plants; these include petal, stamen, pistil, sepals, torus, ovules (0 dpa (days post anthesis)) and fiber (6 dpa). Three biological replicates were collected for each sample, each with three technical replicates. For salt treatment, 28-day old plants were sprayed with 200 and 500 mM NaCl solution after surfactant (Triton X-100) treatments. Leaves from salt-treated plants were collected at 0 (control), 1, 3, and 6 h post-NaCl treatment for further expression analyses. All plant tissues were frozen in liquid nitrogen immediately after collection and stored at − 80 °C until RNA extraction. All treatments was sampled at least three times.
Similarly, G. arboreum cv. Shixiya1 and G. barbadense H7124 were grown for qRT-PCR of salt-exposed leaf tissue timepoints only. For this experiment, seeds of G. arboreum cv. Shixiya1 were provided by Prof. Tianzhen Zhang and G. barbadense H7124 seeds were provided from the Esquel Group. These two Gossypium species were planted in the Damao field in Sanya, Hainan Province, China. For salt treatment, 50-day old plants were sprayed with 200 and 500 mM NaCl solution after surfactant (Triton X-100) treatment. Leaves from salt-treated plants were collected at 0 (control), 1, 3, and 6 h post-NaCl treatment as above.
RNA extraction, cDNA synthesis and qRT-PCR expression analyses
Total RNAs from cotton tissues were extracted using the RNAprep pure plant kit (TIANGEN, Shanghai, China) according to the manufacturer’s protocol. The resulting RNAs were treated with DNase I prior to synthesizing cDNA with oligo (dT) primers and M-MLV Reverse Transcriptase (Invitrogen); these products were diluted 5-fold before use. For quantitative real-time PCR (qRT-PCR), Primer5 software was used to design gene-specific forward and reverse primers (Additional file 2: Table S2). As these primers are not homoeolog specific, both copies were amplified when retained in duplicate. Analyses were performed with SYBR-Green PCR Mastermix (TaKaRa) on a cycler (Mastercycler RealPlex; Eppendorf Ltd., Shanghai, China). The G. hirsutum histone-3 (GhHIS3, AF024716) and GhUBQ7 (accession number: DQ116441) genes were used as internal references, and the relative amount of amplified product was calculated following the 2-∆∆Ct method . For the G. hirsutum samples, relative expression levels among different organs were normalized by calibrating with the root sample from that plant. The root sample was washed with DEPC sterile water three times before extracting the RNA.
Availability of data and materials
The genome sequences of four cotton species and the genome annotation gff3 file were downloaded from the CottonGen database (https://www.cottongen.org/data/download) . GRF gene sequences from Arabidopsis thaliana were retrieved from the TAIR database (https://www.arabidopsis.org/servlets/Search?type=general&search_action=detail&method=1&show_obsolete=F&name=GRF&sub_type=gene&SEARCH_EXACT=4&SEARCH_CONTAINS=1) . Raw RNA-Seq data for G. hirsutum seed, root, stem, leaf, torus, petal, stamen, ovary, calyx, ovule and fiber were downloaded from the NCBI Sequence Read Archive (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA248163) (NCBI Sequence Read Archive SRR1695173, SRR1695174, SRR1695175, SRR1695177, SRR1695178, SRR1695179, SRR1695181, SRR1695182, SRR1695183, SRR1695184, SRR1695185, SRR1695191, SRR1695192, SRR1695193,SRR1695194, SRR1768504, SRR1768505, SRR1768506, SRR1768507, SRR1768508, SRR1768509, SRR1768510, SRR1768511, SRR1768512, SRR1768513, SRR1768514, SRR1768515, SRR1768516, SRR1768517, SRR1768518 and SRR1768519) . The G. hirsutum histone-3 (GhHIS3, AF024716) and GhUBQ7 (accession number: DQ116441) genes were downloaded from the National Center for Biotechnology Information (NCBI) database, which were used as internal references. All other data generated or analyzed during this study are included in this published article and its Additional files.
Growth regulating factors
Days post anthesis
Fragments per kilobase of transcript per million mapped fragments
- G. arboreum :
- G. hirsutum :
- G. raimondii :
Quantitative real-time polymerase chain reaction
Wendel FJ, Grover C. Taxonomy and evolution of the cotton genus, Gossypium. 2nd ed. Madison: American Society of Agronomy Inc., Crop Science Society of America Inc., and Soil Science Society of America Inc; 2015.
Wang K, Wendel JF, Hua J. Designations for individual genomes and chromosomes in Gossypium. J Cotton Res. 2018;1(1):3.
Tian X, Ruan JX, Huang JQ, Yang CQ, Fang X, Chen ZW, Hong H, Wang LJ, Mao YB, Lu S, et al. Characterization of gossypol biosynthetic pathway. Proc Natl Acad Sci U S A. 2018;115(23):E5410–8.
van Kretschmar B, Cabrera AR, Bradley JR, Roe RM. Novel adult feeding disruption test (FDT) to detect insecticide resistance of lepidopteran pests in cotton. Pest Manag Sci. 2013;69(5):652–60.
Chen D, Chen F, Chen C, Chen X, Mao Y. Transcriptome analysis of three cotton pests reveals features of gene expressions in the mesophyll feeder Apolygus lucorum. Sci China Life Sci. 2017;60(8):826–38.
Zhang W, Lu Y, van der Werf W, Huang J, Wu F, Zhou K, Deng X, Jiang Y, Wu K, Rosegrant MW. Multidecadal, county-level analysis of the effects of land use, Bt cotton, and weather on cotton pests in China. Proc Natl Acad Sci U S A. 2018;115(33):E7700–9.
Huang JQ, Fang X, Tian X, Chen P, Lin JL, Guo XX, Li JX, Fan Z, Song WM, Chen FY, et al. Aromatization of natural products by a specialized detoxification enzyme. Nat Chem Biol. 2020.
Deinlein U, Stephan AB, Horie T, Luo W, Xu GH, Schroeder JI. Plant salt-tolerance mechanisms. Trends Plant Sci. 2014;19(6):371–9.
Liang W, Ma X, Wan P, Liu L. Plant salt-tolerance mechanism: a review. Biochem Biophys Res Commun. 2018;495(1):286–91.
Parihar P, Singh S, Singh R, Singh VP, Prasad SM. Effect of salinity stress on plants and its tolerance strategies: a review. Environ Sci Pollut Res Int. 2015;22(6):4056–75.
Fahad S, Hussain S, Matloob A, Khan FA, Khaliq A, Saud S, Hassan S, Shan D, Khan F, Ullah N, et al. Phytohormones and plant responses to salinity stress: a review. Plant Growth Regul. 2015;75(2):391–404.
Mickelbart MV, Hasegawa PM, Bailey-Serres J. Genetic mechanisms of abiotic stress tolerance that translate to crop yield stability. Nat Rev Genet. 2015;16(4):237–51.
Omidbakhshfard MA, Proost S, Fujikura U, Mueller-Roeber B. Growth-regulating factors (GRFs): a small transcription factor family with important functions in plant biology. Mol Plant. 2015;8(7):998–1010.
van der Knaap E, Kim JH, Kende H. A novel gibberellin-induced gene from rice and its potential regulatory role in stem growth. Plant Physiol. 2000;122(3):695–704.
Bao ML, Bian HW, Zha YL, Li FY, Sun YZ, Bai B, Chen ZH, Wang JH, Zhu MY, Han N. miR396a-mediated basic helix-loop-helix transcription factor bHLH74 repression acts as a regulator for root growth in Arabidopsis seedlings. Plant Cell Physiol. 2014;55(7):1343–53.
Liang G, He H, Li Y, Wang F, Yu DQ. Molecular mechanism of microRNA396 mediating pistil development in Arabidopsis. Plant Physiol. 2014;164(1):249–58.
Liu HH, Guo SY, Xu YY, Li CH, Zhang ZY, Zhang DJ, Xu SJ, Zhang C, Chong K. OsmiR396d-regulated OsGRFs function in floral organogenesis in Rice through binding to their targets OsJMJ706 and OsCR4. Plant Physiol. 2014;165(1):160–74.
Debernardi JM, Mecchia MA, Vercruyssen L, Smaczniak C, Kaufmann K, Inze D, Rodriguez RE, Palatnik JF. Post-transcriptional control of GRF transcription factors by microRNA miR396 and GIF co-activator affects leaf size and longevity. Plant J. 2014;79(3):413–26.
Hewezi T, Maier TR, Nettleton D, Baum TJ. The Arabidopsis MicroRNA396-GRF1/GRF3 regulatory module acts as a developmental regulator in the reprogramming of root cells during cyst nematode infection. Plant Physiol. 2012;159(1):321–35.
Kim JS, Mizoi J, Kidokoro S, Maruyama K, Nakajima J, Nakashima K, Mitsuda N, Takiguchi Y, Ohme-Takagi M, Kondou Y, et al. Arabidopsis growth-regulating Factor7 functions as a transcriptional repressor of abscisic acid- and osmotic stress-responsive genes, including DREB2A. Plant Cell. 2012;24(8):3393–405.
Casadevall R, Rodriguez RE, Debernardi JM, Palatnik JF, Casati P. Repression of growth regulating factors by the microRNA396 inhibits cell proliferation by UV-B radiation in Arabidopsis leaves. Plant Cell. 2013;25(9):3570–83.
Khatun K, Robin AHK, Park JI, Nath UK, Kim CK, Lim KB, Nou IS, Chung MY. Molecular characterization and expression profiling of tomato GRF transcription factor family genes in response to abiotic stresses and phytohormones. Int J Mol Sci. 2017;18(5):1056.
Heidel AJ, Clarke JD, Antonovics J, Dong XN. Fitness costs of mutations affecting the systemic acquired resistance pathway in Arabidopsis thaliana. Genetics. 2004;168(4):2197–206.
Sakuma Y, Maruyama K, Osakabe Y, Qin F, Seki M, Shinozaki K, Yamaguchi-Shinozaki K. Functional analysis of an Arabidopsis transcription factor, DREB2A, involved in drought-responsive gene expression. Plant Cell. 2006;18(5):1292–309.
Paterson AH, Wendel JF, Gundlach H, Guo H, Jenkins J, Jin D, Llewellyn D, Showmaker KC, Shu S, Udall J, et al. Repeated polyploidization of Gossypium genomes and the evolution of spinnable cotton fibres. Nature. 2012;492(7429):423–7.
Wang K, Wang Z, Li F, Ye W, Wang J, Song G, Yue Z, Cong L, Shang H, Zhu S, et al. The draft genome of a diploid cotton Gossypium raimondii. Nat Genet. 2012;44(10):1098–103.
Li F, Fan G, Wang K, Sun F, Yuan Y, Song G, Li Q, Ma Z, Lu C, Zou C, et al. Genome sequence of the cultivated cotton Gossypium arboreum. Nat Genet. 2014;46(6):567–72.
Du X, Huang G, He S, Yang Z, Sun G, Ma X, Li N, Zhang X, Sun J, Liu M, et al. Resequencing of 243 diploid cotton accessions based on an updated a genome identifies the genetic basis of key agronomic traits. Nat Genet. 2018;50(6):796–802.
Li FG, Fan GY, Lu CR, Xiao GH, Zou CS, Kohel RJ, Ma ZY, Shang HH, Ma XF, Wu JY, et al. Genome sequence of cultivated upland cotton (Gossypium hirsutum TM-1) provides insights into genome evolution. Nat Biotechnol. 2015;33(5):524–U242.
Zhang T, Hu Y, Jiang W, Fang L, Guan X, Chen J, Zhang J, Saski CA, Scheffler BE, Stelly DM, et al. Sequencing of allotetraploid cotton (Gossypium hirsutum L. acc. TM-1) provides a resource for fiber improvement. Nat Biotechnol. 2015;33(5):531–7.
Wang M, Tu L, Yuan D, Zhu SC, Li J, Liu F, Pei L, Wang P, Zhao G, et al. Reference genome sequences of two cultivated allotetraploid cottons, Gossypium hirsutum and Gossypium barbadense. Nat Genet. 2019;51(2):224–9.
Liu X, Zhao B, Zheng HJ, Hu Y, Lu G, Yang CQ, Chen JD, Chen JJ, Chen DY, Zhang L, et al. Gossypium barbadense genome sequence provides insight into the evolution of extra-long staple fiber and specialized metabolites. Sci Rep-Uk. 2015;5:14139.
Yuan D, Tang Z, Wang M, Gao W, Tu L, Jin X, Chen L, He Y, Zhang L, Zhu L, et al. The genome sequence of Sea-Island cotton (Gossypium barbadense) provides insights into the allopolyploidization and development of superior spinnable fibres. Sci Rep. 2015;5:17662.
Hu Y, Chen J, Fang L, Zhang Z, Ma W, Niu Y, Ju L, Deng J, Zhao T, Lian J, et al. Gossypium barbadense and Gossypium hirsutum genomes provide insights into the origin and evolution of allotetraploid cotton. Nat Genet. 2019.
Zhang DF, Li B, Jia GQ, Zhang TF, Dai JR, Li JS, Wang SC. Isolation and characterization of genes encoding GRF transcription factors and GIF transcriptional coactivators in maize (Zea mays L.). Plant Sci. 2008;175(6):809–17.
Dong Y, Hu G, Yu J, Thu SW, Grover CE, Zhu S, Wendel JF. Salt-tolerance diversity in diploid and polyploid cotton (Gossypium) species. Plant J. 2020;101(5):1135–51.
Shi GY, Guo XY, Guo JY, Liu LH, Hua JP. Analyzing serial cDNA libraries revealed reactive oxygen species and gibberellins signaling pathways in the salt response of upland cotton (Gossypium hirsutum L.). Plant Cell Rep. 2015;34(6):1005–23.
Guo JY, Shi GY, Guo XY, Zhang LW, Xu WY, Wang YM, Su Z, Hua JP. Transcriptome analysis reveals that distinct metabolic pathways operate in salt-tolerant and salt-sensitive upland cotton varieties subjected to salinity stress. Plant Sci. 2015;238:33–45.
Mitsuya S, Kawasaki M, Taniguchi M, Miyake H. Relationship between salinity-induced damages and aging in rice leaf tissues. Plant Prod Sci. 2003;6(3):213–8.
Chazen O, Hartung W, Neumann PM. The different effects of Peg-6000 and Nacl on leaf development are associated with differential inhibition of root water transport. Plant Cell Environ. 1995;18(7):727–35.
Shabala S, Demidchik V, Shabala L, Cuin TA, Smith SJ, Miller AJ, Davies JM, Newman IA. Extracellular Ca2+ ameliorates NaCl-induced K+ loss from Arabidopsis root and leaf cells by controlling plasma membrane K+-permeable channels. Plant Physiol. 2006;141(4):1653–65.
Speer M, Kaiser WM. Ion relations of Symplastic and Apoplastic space in leaves from Spinacia-Oleracea L and Pisum-Sativum L under salinity. Plant Physiol. 1991;97(3):990–7.
Wu HH, Shabala L, Barry K, Zhou MX, Shabala S. Ability of leaf mesophyll to retain potassium correlates with salinity tolerance in wheat and barley. Physiol Plantarum. 2013;149(4):515–27.
Wu HH, Shabala L, Zhou MX, Shabala S. Chloroplast-generated ROS dominate NaCl- induced K+ efflux in wheat leaf mesophyll. Plant Signal Behav. 2015;10(5).
Wu HHY, Zhu M, Shabala L, Zhou MX, Shabala S. K+ retention in leaf mesophyll, an overlooked component of salinity tolerance mechanism: a case study for barley. J Integr Plant Biol. 2015;57(2):171–85.
You Q, Xu WY, Zhang K, Zhang LW, Yi X, Yao DX, Wang CC, Zhang XY, Zhao XH, Provart NJ, et al. ccNET: database of co-expression networks with functional modules for diploid and polyploid Gossypium. Nucleic Acids Res. 2017;45(9):5625–6.
Chen L, Luan Y, Zhai J. Sp-miR396a-5p acts as a stress-responsive genes regulator by conferring tolerance to abiotic stresses and susceptibility to Phytophthora nicotianae infection in transgenic tobacco. Plant Cell Rep. 2015;34(12):2013–25.
Dilnur T, Peng Z, Pan Z, Palanga KK, Jia Y, Gong W, Du X. Association analysis of salt tolerance in Asiatic cotton (Gossypium arboreum) with SNP markers. Int J Mol Sci. 2019;20(9):2168.
Liu F, Zhou ZL, Wang CY, Wang YH, Cai XY, Wang XX, Zhang ZS, Wang KB. Genetic diversity and relationship analysis of Gossypium arboreum accessions. Genet Mol Res. 2015;14(4):14522–9.
Yang FX, Liang G, Liu DM, Yu DQ. Arabidopsis MiR396 mediates the development of leaves and flowers in transgenic tobacco. J Plant Biol. 2009;52(5):475–81.
Kim JH, Choi DS, Kende H. The AtGRF family of putative transcription factors is involved in leaf and cotyledon growth in Arabidopsis. Plant J. 2003;36(1):94–104.
Kim JH, Lee BH. Growth-regulating Factor4 of Arabidopsis thaliana is required for development of leaves, cotyledons, and shoot apical meristem. J Plant Biol. 2006;49(6):463–8.
Lee BH, Ko JH, Lee S, Lee Y, Pak JH, Kim JH. The Arabidopsis GRF-interacting factor gene family performs an overlapping function in determining organ size as well as multiple developmental properties. Plant Physiol. 2009;151(2):655–68.
Liu J, Hua W, Yang HL, Zhan GM, Li RJ, Deng LB, Wang XF, Liu GH, Wang HZ. The BnGRF2 gene (GRF2-like gene from Brassica napus) enhances seed oil production through regulating cell number and plant photosynthesis. J Exp Bot. 2012;63(10):3727–40.
Wang FD, Qiu NW, Ding Q, Li JJ, Zhang YH, Li HY, Gao JW. Genome-wide identification and analysis of the growth-regulating factor family in Chinese cabbage (Brassica rapa L. ssp pekinensis). BMC Genomics. 2014;15:807.
Kuijt SJH, Greco R, Agalou A, Shao JX, CCJ t H, Overnas E, Osnato M, Curiale S, Meynard D, van Gulik R, et al. Interaction between the growth-regulating factor and Knotted1-like homeobox families of transcription factors1. Plant Physiol. 2014;164(4):1952–66.
Luo AD, Liu L, Tang ZS, Bai XQ, Cao SY, Chu CC. Down-regulation of OsGRF1 gene in rice rhd1 mutant results in reduced heading date. J Integr Plant Biol. 2005;47(6):745–52.
Choi D, Kim JH, Kende H. Whole genome analysis of the OsGRF gene family encoding plant-specific putative transcription activators in rice (Oryza sativa L.). Plant Cell Physiol. 2004;45(7):897–904.
Filiz E, Koc I, Tombuloglu H. Genome-wide identification and analysis of growth regulating factor genes in Brachypodium distachyon: in silico approaches. Turk J Biol. 2014;38(2):296–306.
Treich I, Cairns BR, Delossantos T, Brewster E, Carlson M. Snf11, a new component of the yeast Snf-Swi complex that interacts with a conserved region of Snf2. Mol Cell Biol. 1995;15(8):4240–8.
Yu J, Jung S, Cheng CH, Ficklin SP, Lee T, Zheng P, Jones D, Percy RG, Main D. CottonGen: a genomics, genetics and breeding database for cotton research. Nucleic Acids Res. 2014;42(Database issue):1229–36.
Wong DC, Schlechter R, Vannozzi A, Holl J, Hmmam I, Bogs J, Tornielli GB, Castellarin SD, Matus JT. A systems-oriented analysis of the grapevine R2R3-MYB transcription factor family uncovers new insights into the regulation of stilbene accumulation. DNA Res. 2016.
Mistry J, Finn RD, Eddy SR, Bateman A, Punta M. Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions. Nucleic Acids Res. 2013;41(12):e121.
Poole RL. The TAIR database. Methods Mol Biol. 2007;406:179–212.
Letunic I, Bork P. 20 years of the SMART protein domain annotation resource. Nucleic Acids Res. 2018;46(D1):D493–6.
Finn RD, Coggill P, Eberhardt RY, Eddy SR, Mistry J, Mitchell AL, Potter SC, Punta M, Qureshi M, Sangrador-Vegas A, et al. The Pfam protein families database: towards a more sustainable future. Nucleic Acids Res. 2016;44(D1):D279–85.
Voorrips RE. MapChart: software for the graphical presentation of linkage maps and QTLs. J Heredity. 2002;93(1):77–8.
Hu B, Jin J, Guo AY, Zhang H, Luo J, Gao G. GSDS 2.0: an upgraded gene feature visualization server. Bioinformatics. 2015;31(8):1296–7.
Wilkins MR, Gasteiger E, Bairoch A, Sanchez JC, Williams KL, Appel RD, Hochstrasser DF. Protein identification and analysis tools in the ExPASy server. Methods Mol Biol. 1999;112:531–52.
Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772.
Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011;28(10):2731–9.
Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 2015;12(4):357–60.
Pertea M, Kim D, Pertea GM, Leek JT, Salzberg SL. Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nat Protoc. 2016;11(9):1650–67.
Risso D, Ngai J, Speed TP, Dudoit S. Normalization of RNA-seq data using factor analysis of control genes or samples. Nat Biotechnol. 2014;32(9):896–902.
Wang M, Wang Q, Zhang B. Evaluation and selection of reliable reference genes for gene expression under abiotic stress in cotton (Gossypium hirsutum L.). Gene. 2013;530(1):44–50.
Zhang Y, Parmigiani G, Johnson WE. Combat-seq: batch effect adjustment for RNA-seq count data. bioRxiv. 2020; 2020.2001.2013.904730.
Saeed AI, Sharov V, White J, Li J, Liang W, Bhagabati N, Braisted J, Klapa M, Currier T, Thiagarajan M, et al. TM4: a free, open-source system for microarray data management and analysis. BioTechniques. 2003;34(2):374–8.
Shan CM, Shangguan XX, Zhao B, Zhang XF, Chao LM, Yang CQ, Wang LJ, Zhu HY, Zeng YD, Guo WZ, et al. Control of cotton fibre elongation by a homeodomain transcription factor GhHOX3. Nat Commun. 2014;5:5519.
Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2(T)(−Delta Delta C) method. Methods. 2001;25(4):402–8.
We thank Prof. Tian-Zhen Zhang for providing the RNA-seq data and calculating the RPKM values and Prof. Xiao-Ya Chen participating in discussion and revising the manuscript.
This work reported in this publication was supported by the National Natural Science Foundation of China through the Awards Nos. 31690092, 31571251, 31788103, the National Key R&D Program of China (2016YFD0100500) and the Ministry of Agriculture of China (2016ZX08005–003), the China Postdoctoral Science Foundation through the Awards Nos. 2017 M621546 and 2018 T110411. The funding bodies did not participate in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
28 plant species used for identifying the number of GRF family genes.
List of forward and reverse primers used for qRT-PCR analyses
Multiple alignment of GhGRF (Gossypium hirsutum GRF) protein sequences
Chromosome distribution of GaGRF (Gossypium arboreum GRF) and GrGRF (Gossypium raimondii GRF) genes.
About this article
Cite this article
Cao, JF., Huang, JQ., Liu, X. et al. Genome-wide characterization of the GRF family and their roles in response to salt stress in Gossypium. BMC Genomics 21, 575 (2020). https://doi.org/10.1186/s12864-020-06986-0
- Growth regulating factor
- Organ-specific expression
- Salt stress