- Research article
- Open Access
Genome-wide analysis of the Hsf family in soybean and functional identification of GmHsf-34 involvement in drought and heat stresses
BMC Genomics volume 15, Article number: 1009 (2014)
High temperature affects organism growth and metabolic activity. Heat shock transcription factors (Hsfs) are key regulators of the heat shock response in eukaryotes and prokaryotes. Under high temperature conditions, Hsfs activate heat shock proteins (Hsps) by binding to heat stress elements (HSEs) in their promoters, which leads to defense against heat stress. Since the first plant Hsf gene was identified in tomato, several plant Hsf gene families have been thoroughly characterized. Although genome sequences are available for soybean (Glycine max), an important oilseed crop, the Hsf family genes in soybean have not been accurately characterized.
We analyzed the Hsf genetic structures and protein function domains using the GSDS, Pfam, SMART, PredictNLS, and NetNES online tools. The genome scanning of dicots (soybean and Arabidopsis) and monocots (rice and maize) revealed that the whole-genome replication occurred twice in soybean evolution. The plant Hsfs were classified into 3 classes and 16 subclasses according to protein structure domains. The A8 and B3 subclasses existed only in dicots and the A9 and C2 occurred only in monocots. Thirty eight soybean Hsfs were systematically identified and grouped into 3 classes and 12 subclasses, and located on 15 soybean chromosomes. The promoter regions of the soybean Hsfs contained cis-elements that likely participate in drought, low temperature, and ABA stress responses. There were large differences among Hsfs based on transcriptional levels under the stress conditions. The transcriptional levels of the A1 and A2 subclass genes were extraordinarily high. In addition, differences in the expression levels occurred for each gene in the different organs and at the different developmental stages. Several genes were chosen to determine their subcellular localizations and functions. The subcellular localization results revealed that GmHsf-04, GmHsf-33, and GmHsf-34 were located in the nucleus. Overexpression of the GmHsf-34 gene improved the tolerances to drought and heat stresses in Arabidopsis plants.
This present investigation of the quantity, structural features, expression characteristics, subcellular localizations, and functional roles provides a scientific basis for further research on soybean Hsf functions.
Heat stress, defined as a rise in temperature of 10-15°C above ambient, beyond a given threshold level for a period of time, is an agricultural problem in many areas of the world. It affects plant growth and development and often reduces yield. All organisms, including eukaryotes and prokaryotes, share a common heat shock response mechanism, which involves a number of reactions, including new protein synthesis, folding, intracellular targeting, specific biological functions, and protein degradation. Among these proteins, Hsps acting as molecular chaperones are essential for the maintenance and/or restoration of protein homeostasis [2–8]. Hsp expression is regulated by multiple mechanisms, and the central regulators are Hsfs. Under high temperature conditions, Hsfs activate Hsps by binding to HSEs in their promoters, which leads to defense against heat stress and even recovery from its effects.
A typical Hsf protein contains a modular structure with an N-terminal DNA-binding domain (DBD), an adjacent oligomerization domain (OD) composed of heptad repeats of hydrophobic amino acid residues (HR-A/B), a nuclear localization signal (NLS) region essential for nuclear uptake of the protein, a nuclear export signal (NES) region, and an activator motif (AHA) . Arabidopsis Hsfs were classified into A, B, and C classes according to the differences in their HR-A/B regions. Due to the insertion of 21 (class A) or 7 (class C) amino acid residues between the A and B parts of the HR-A/B regions, the class A and class C Hsfs have longer HR-A/B regions than class B, which is distinguished from class A and C by the presence of a heptad repeat pattern instead of an insertion. Unlike class B and class C, the class A members contain a C-terminal AHA motif relevant to their own activator function, and a hydrophobic, frequently leucine-rich NES required for the receptor-mediated nuclear export in complex with the NES receptor .
Under normal circumstances, the inactive state of a monomeric Hsf is maintained by interaction with molecular chaperones such as Hsp70 and Hsp90. In response to heat stress, Hsfs released from the chaperone complex are converted from a transcriptionally inactive monomer to an active trimer through association of their ODs. As sequence-specific trimeric DNA-binding proteins, the active Hsfs are capable of recognizing and binding HSEs in the promoters of Hsf-inducible genes. HSEs are formed of repetitive palindromic binding motifs of the 5’-AGAAnnTTCT-3’ sequence upstream of the TATA box in the Hsf-inducible genes [12–15].
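As an illustration of the HSE consensus described above, the following minimal Python sketch scans a DNA string for perfect matches to 5'-AGAAnnTTCT-3'. The function name `find_hse_sites` and the example sequence are hypothetical and are not taken from this study; real promoter analyses usually also tolerate degenerate or partial HSE variants, which this sketch does not attempt.

```python
import re

# Consensus HSE from the text: 5'-AGAAnnTTCT-3', where n is any nucleotide.
# The motif is its own reverse complement, so scanning one strand is enough.
# A lookahead is used so that overlapping matches would also be reported.
HSE_PATTERN = re.compile(r"(?=(AGAA[ACGT]{2}TTCT))")

def find_hse_sites(sequence):
    """Return 0-based start positions of perfect HSE consensus matches."""
    seq = sequence.upper()
    return [match.start() for match in HSE_PATTERN.finditer(seq)]

# Hypothetical promoter fragment, for illustration only.
example = "ccgtAGAAgcTTCTaaatAGAAttTTCTgg"
print(find_hse_sites(example))  # [4, 18]
```

A perfect-match scan keeps the example short; a position weight matrix would be a more realistic choice when mismatches must be scored.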
Since the first plant Hsf gene was identified in tomato, the Hsf family genes have been thoroughly characterized, and 21, 25, 25, and 27 Hsf genes were found in Arabidopsis, rice, maize, and tomato, respectively [9, 17–19]. In the present study, we scanned for and integrated all the nonredundant sets of soybean Hsf genes, determined their chromosomal locations, predicted their protein structures using available software and online tools, analyzed the expression levels of the soybean Hsf genes by qRT-PCR, and identified the function of GmHsf-34 in tolerance to drought and heat stresses. This study provides an overview of the structures and evolutionary history of the soybean Hsfs, and a candidate gene for crop molecular breeding.
Identification, phylogenetic, and evolutionary analyses
The amino acid sequences of Hsf-type DBD domains (Pfam: PF00447) were used as queries in BLASTP searches against the JGI Glyma1.0 annotation. Fifty-eight putative soybean Hsf sequences were acquired. After screening with the Pfam database and the SMART online tool, 4 soybean Hsf sequences were rejected due to the absence of typical Hsf DBD domains, and 16 were rejected due to the absence of coiled-coil structures. Consequently, 38 nonredundant soybean Hsfs were identified (Table 1). The polypeptide lengths of soybean Hsfs varied widely, ranging from 213 to 510 amino acids. Isoelectric points of the proteins were diverse (Table 1).
To determine the phylogenetic relationships among soybean Hsfs, a phylogenetic analysis of 38 soybean Hsfs, 25 maize Hsfs, 25 rice Hsfs, and 21 Arabidopsis Hsfs was performed by generating a neighbor-joining phylogenetic tree (Figure 1). According to differences in the amino acid sequences of DBD, the HR-A/B region, and the linker between them, the A, B, and C Hsf classes formed three clusters. Class A was divided into 10 sub-clusters, designated A1, A2, A3, A4, A5, A6, A7, A8, A9, and A10. Class B was divided into sub-clusters B1, B2, B3, and B4, and the class C contains sub-clusters C1 and C2. Soybean Hsfs were further divided into 12 sub-clusters according to their phylogenetic relationship, defined as A1, A2, A3, A4, A5, A6, A8, B1, B2, B3, B4, and C1 (Figure 1). As a dicot, soybean was more similar to Arabidopsis than to the monocots rice and maize. AtHsf-09 and AtHsf-10 were the only two members of subclass A7. The A8 and B3 subclasses were present only in the dicots, and A9 and C2 existed only in the monocots. Interestingly, soybean subclass B4 had higher similarity to Arabidopsis B4 than to the rice or maize B4 subclasses, and soybean subclass A6 Hsfs showed higher similarity to A4 rather than to Arabidopsis subclass A6.
Physical locations of soybean Hsfs
According to the soybean genome database, the 38 soybean Hsf genes were distributed among 15 of the 20 chromosomes; none were found on chromosomes 2, 6, 7, 12, and 18 (Figure 2). The number of soybean Hsf genes on each chromosome differed considerably. For example, chromosomes 1 and 10 each carried 5 soybean Hsf genes, whereas chromosomes 4, 14, and 19 each carried only one. Using soybean genome repeat information, 15 pairs of paralogous genes were identified (Figure 2).
Gene structures and cis-acting elements
Gene structure analysis revealed the existence of introns in the soybean Hsf genes. Four introns were found in GmHsf-20, 3 in GmHsf-23, and 2 each in GmHsf-02, GmHsf-12, GmHsf-17, and GmHsf-18 (Figure 3). Cis-element analysis demonstrated that every soybean Hsf member carried one or more MYB and MYC elements in its promoter. In addition, 52.6% of the members contained an ABA-responsive element (ABRE), 31.6% contained a dehydration-responsive element (DRE), and 42.1% contained a low-temperature responsive element (LTRE) (Table 2). These 5 elements have been reported to play distinct and significant roles in plant stress responses. For example, MYB is involved in drought, low temperature, salt, ABA, and GA responses. ABRE responds to drought and ABA through ABRE binding proteins (AREB). DRE, together with DRE binding proteins (DREB), participates in drought, salt, low temperature, and ABA responses. LTRE contributes primarily to low temperature response and regulation. Analyses of cis-elements in the promoters suggest that Hsfs are closely related to stress responses.
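The reported percentages can be related back to gene counts with a small calculation. The sketch below back-calculates the number of genes implied by each percentage, assuming the denominator is the 38 soybean Hsfs. The resulting counts are inferred from the percentages and are not stated in the text, and the helper name `implied_count` is hypothetical.

```python
TOTAL_HSFS = 38  # number of soybean Hsfs reported in the text

# Percentages reported in the text for each promoter element.
reported_percent = {"ABRE": 52.6, "DRE": 31.6, "LTRE": 42.1}

def implied_count(percent, total=TOTAL_HSFS):
    """Nearest whole number of genes consistent with a reported percentage."""
    return round(percent / 100 * total)

for element, pct in reported_percent.items():
    n = implied_count(pct)
    # Recompute the percentage from the inferred count as a consistency check.
    print(f"{element}: ~{n} of {TOTAL_HSFS} genes ({100 * n / TOTAL_HSFS:.1f}%)")
```

Under this assumption the percentages correspond to 20, 12, and 16 genes for ABRE, DRE, and LTRE, respectively.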
Conserved domains and motifs of soybean Hsfs
The modular structure of the Hsf family in plants has been described thoroughly in several model plants [9, 18, 24]. A typical soybean Hsf protein contains 5 conserved domains, arranged from the N-terminus to the C-terminus in the order DBD, OD, NLS, NES, and AHA (Table 3). The DBD, the most conserved section, is composed of approximately 100 amino acids and contains 3 α-helices and a four-stranded antiparallel β-sheet (α1-β1-β2-α2-α3-β3-β4) (Figure 4). Its helix-turn-helix motif (H2-T-H3) binds specifically to HSEs in the promoters of heat-stress-induced genes. The HR-A/B region, located C-terminal to the DBD, is characterized by a coiled-coil structure. According to the distinction between the HR-A and HR-B motifs, Hsfs are divided into A, B, and C classes. Because of the insertion of 21 (class A) or 7 (class C) amino acid residues between the A and B parts of the HR-A/B motif, class A and class C Hsfs have longer HR-A/B regions than class B Hsfs, which are distinguished from classes A and C by the presence of a heptad repeat pattern instead of an insertion (Figure 5).
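The class rule described above (a 21-residue insertion for class A, a 7-residue insertion for class C, and no insertion for class B) can be written as a short lookup. This is a minimal sketch: the function name `classify_hsf` is hypothetical, and it assumes the insertion length between the HR-A and HR-B parts has already been measured from an alignment, which is the harder step and is not shown.

```python
def classify_hsf(insert_length):
    """Assign an Hsf class from the number of residues inserted
    between the HR-A and HR-B parts of the oligomerization domain."""
    if insert_length == 21:
        return "A"
    if insert_length == 7:
        return "C"
    if insert_length == 0:
        return "B"
    # Anything else does not fit the rule stated in the text.
    return "unclassified"

for length in (21, 7, 0, 12):
    print(length, classify_hsf(length))
```

Returning "unclassified" for other lengths is a design choice of this sketch, made so that atypical sequences are flagged for manual inspection rather than forced into a class.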
Depending on the balance of nuclear import and export, the intracellular distribution of Hsfs changes dynamically between nucleus and cytoplasm [10, 25]. A hydrophobic, frequently leucine-rich NES at the C-terminal of many Hsfs is required for receptor-mediated nuclear export in a complex with the NES receptor. Together with the adjacent AHA motifs, NES serves as part of a type-specific signature region in the C-terminal of class A Hsfs in plants . AHA motifs exist only in class A Hsfs (Table 3). It was noted that the α2-α3 sequence in DBD of GmHsf-12 was unique (Figure 4). An NLS was not detected in GmHsf-10, GmHsf-14, GmHsf-05, and GmHsf-27; NES was located in HR-A/B in 9 Hsf proteins (GmHsf-03, GmHsf-07, GmHsf-16, GmHsf-18, GmHsf-19, GmHsf-21, GmHsf-26, GmHsf-32, and GmHsf-35); and two HR-A/B regions were found in GmHsf-23 (Table 3).
Expression patterns of soybean Hsf genes
To examine expression patterns in different soybean tissues and organs, an expression pattern map of soybean Hsf genes based on the gene-chip data downloaded from the soybean genome database was drawn (Figure 6 and Additional file 1: Table S1). The data analysis revealed that soybean Hsf genes were expressed in 14 tissues and organs and at different developmental stages. Moreover, soybean Hsf genes were expressed at the highest level in roots and at the lowest level in seeds after 21 days of development (Figure 6A).
Three soybean Hsf genes showed tissue-specific expression patterns. For example, GmHsf-02 was expressed in roots; GmHsf-28 in roots and seeds after 14 days of development; and GmHsf-37 in young leaves and root nodules. GmHsf-02, GmHsf-19, and GmHsf-28 were expressed at a low level, whereas GmHsf-08, GmHsf-25, GmHsf-33, and GmHsf-34 were expressed at an extremely high level. Expression levels differed among soybean Hsf subclasses. Compared with the others, the expression levels for subclass A3 were lower. Expression levels also varied within the same subclass. For example, GmHsf-17 transcripts reached maximum levels in young leaves, whereas GmHsf-33 reached maximum levels in flowers and pod shells at 14 DAF, and also in nodules. In addition, data from the tissue expression chip revealed differences in expression between the 15 pairs of paralogous genes. For example, although GmHsf-20 and GmHsf-28 were both expressed at relatively low levels, GmHsf-20 was expressed in 8 tissues and organs, whereas GmHsf-28 was expressed only in seeds at 14 DAF and in roots at a very low level. Similarly, although GmHsf-23 and GmHsf-37 were expressed at similar levels, GmHsf-37 was expressed only in young leaves and nodules, whereas GmHsf-23 was also expressed in flowers and seeds at 35 DAF.
qRT-PCR analyses of soybean Hsf genes
Nineteen soybean Hsf genes that were highly expressed in different tissues were selected to further confirm their responses to drought stress and heat stress. qRT-PCR was carried out using soybean plants exposed to drought (0, 6, and 12 h) and high temperature (0, 6, and 12 h). These genes showed diverse expression under both stresses (Figure 7A, B); 14 genes were up-regulated (>2-fold) by drought stress, and 13 were up-regulated by heat stress. Notably, 10 soybean Hsf genes (GmHsf-04, GmHsf-08, GmHsf-09, GmHsf-10, GmHsf-11, GmHsf-16, GmHsf-17, GmHsf-25, GmHsf-33, and GmHsf-34) showed up-regulation under both drought and heat stress conditions. Two soybean Hsf genes (GmHsf-03 and GmHsf-07) were greatly down-regulated (<0.5-fold) during the heat stress treatment. Moreover, the transcript level of GmHsf-38 was unchanged by either stress.
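The thresholds used above (>2-fold for up-regulation, <0.5-fold for down-regulation) amount to a simple three-way classification of fold-change values. The sketch below shows that rule; the function name `classify_response`, the gene labels, and the fold-change values are hypothetical and are not measurements from this study.

```python
def classify_response(fold_change, up=2.0, down=0.5):
    """Classify a relative expression value against the cutoffs in the text."""
    if fold_change > up:
        return "up-regulated"
    if fold_change < down:
        return "down-regulated"
    return "unchanged"

# Hypothetical fold changes relative to the 0 h control, for illustration only.
example_fold_changes = {"gene_a": 8.3, "gene_b": 0.3, "gene_c": 1.1}

for gene, fold in example_fold_changes.items():
    print(gene, classify_response(fold))
```

Values exactly equal to a cutoff are treated as unchanged here, because the text uses strict inequalities.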
GmHsf-04, GmHsf-33, and GmHsf-34 were localized in the nucleus
Three genes (GmHsf-04, GmHsf-33, and GmHsf-34) up-regulated strongly by both heat and drought were selected for subcellular localization. Expression vectors with green fluorescent protein (GFP) tags were constructed for subcellular localization analysis. The coding regions of GmHsf-04, GmHsf-33, and GmHsf-34 were amplified from the soybean cDNA by PCR with specific primers and fused to the N-terminal of GFP under control of the CaMV 35S promoter. Subcellular localization of GFP expression in mesophyll cell protoplasts of Arabidopsis was monitored by confocal microscopy 16 h after transformation mediated by PEG; 35S::GFP vector was transformed as the control. As shown in Figure 8, control hGFP was uniformly distributed throughout the mesophyll cell protoplast, whereas GmHsf-04, GmHsf-33, and GmHsf-34 fusion proteins were exclusively localized in the nucleus. These results suggest that GmHsf-04, GmHsf-33, and GmHsf-34 are nuclear proteins, possibly serving as transcription factors.
Overexpression of GmHsf-34 improved tolerance to drought and heat stresses in Arabidopsis
According to the expression analysis, GmHsf-34 was strongly induced by drought and heat stresses. To confirm the functions of GmHsf-34 in abiotic stress response, three lines of Arabidopsis overexpressing GmHsf-34 were tested under drought and high temperature conditions, respectively. Seed germination and root growth of transgenic Arabidopsis were tested in the presence of 4% PEG (Figure 9A to D). Under standard culture conditions, no significant differences in germination rate or morphology between transgenic and wild-type (Col-0) plants were observed. However, germination percentage of transgenic plants was enhanced by nearly 15% compared to wild-type after 2-3 days (Figure 9A and B). In the presence of 4% PEG, roots of transgenic lines were longer than those of wild-type plants (Figure 9C and D), showing that overexpression of GmHsf-34 improved tolerance to the imposed drought treatment in Arabidopsis. After heat stress treatment, survival rates of Arabidopsis seedlings overexpressing GmHsf-34 and wild-type were recorded (Figure 9E and F). The survival rate of wild-type Arabidopsis seedlings was about 11.5%, whereas that of Arabidopsis seedlings overexpressing GmHsf-34 was improved to 58.5-62.5%. Obviously, the transgenic seedlings displayed higher tolerance to high temperature compared to wild-type plants.
In previous work, 59 soybean Hsfs listed on the soybean Transcription Factor Database website (http://soybeantfdb.psc.riken.jp/) were reported. In this study, although a comprehensive set of 58 possible soybean Hsfs was obtained after scanning the current version of the soybean genome (JGI Glyma1.0 annotation), only 38 nonredundant soybean Hsfs were finally identified and characterized. Comparison of the locus numbers indicated that all the soybean Hsfs identified in our study were included in the 59 reported soybean Hsfs. The widely accepted model of Hsfs requires both an Hsf-type DBD and an OD characterized by a coiled-coil structure. Briefly, the DBD ensures that Hsfs bind to HSEs, and the coiled-coil domain is indispensable for the trimerization that leads to Hsf activity. Consequently, we examined and discarded the remaining 21 Hsf-like sequences due to the absence of Hsf-type DBD domains and/or coiled-coil structures.
The monocots rice and maize contain the same number of Hsf genes (25), whereas the numbers in the dicots soybean (38) and Arabidopsis (21) are quite different. This probably results from two rounds of genome duplication in soybean but only a single duplication in Arabidopsis during evolution. The cluster analysis indicated that Hsfs of the same subclass in Arabidopsis and soybean, or in maize and rice, belong to the same branch, in accord with the evolutionary relationships of Arabidopsis and soybean as dicots and of maize and rice as monocots. Several subclasses are unique to monocots or dicots. For example, subclasses A8 and B3 are restricted to dicots, and A9 and C2 are characteristic of monocots, suggesting that these subclasses evolved after the divergence of monocots and dicots. In addition, subclass A7 is absent in soybean and was presumably lost through gene recombination, mutation, or redundancy.
In recent years, research on the role of introns has made significant progress. Studies in mammals, nematodes, insects, fungi, and plants suggest that introns not only regulate gene expression but also participate in gene evolution. Analysis of gene structures revealed that soybean Hsf genes contain a single intron, except GmHsf-12, which contains two introns. According to the gene chip expression results (Figure 6), GmHsf-12 was not expressed at lower levels than the other genes under normal conditions. Seemingly, intron number does not affect gene expression. Together with the phylogenetic analysis (Figure 1), the gene structures show that the number and location of introns are conserved among Hsfs in the same subclass (Figure 3). For example, GmHsf-30 and GmHsf-34 in subclass A2 each contain an intron, at 307 bp to 2110 bp and 304 bp to 2113 bp, respectively, which indicates that introns could serve as a reference for studying the evolution of plant genes.
Transcriptional activity of class A Hsfs is normally mediated by the AHA motif in the C-terminal region. However, the AHA motif is absent in GmHsf-10 and GmHsf-14 in the A3 subclass in soybean (Table 3). It was proposed that proteins without an AHA motif are activated through the formation of hetero-polymers with other class A Hsfs. Unlike class A Hsfs, most class B and C Hsfs lack transcriptional activation ability, since their CTDs lack detectable AHA motifs. Instead, class B Hsfs are characterized by a tetrapeptide, -LFGV-, in the C-terminal region, which is assumed to function as a repressor motif in the transcription machinery. Previous research showed that several other transcription factors that function as repressors, such as ABI3/VP1, AP2/ERF, MYB, and GRAS, also contain a conserved -LFGV- tetrapeptide motif, although their mechanisms of action remain unclear [31, 32].
Signal transduction pathways are complicated networks in which components work together to control plant physiological and biochemical processes. AREB1 (ABRE-BINDING PROTEIN 1), AREB2, and ABF3, members of the class A bZIP transcription co-factors of the ABRE elements, regulate the response to osmotic stress by binding to the element in the DREB2A promoter region. The analysis of cis-acting elements in their promoter regions revealed that the soybean Hsf genes contain MYB/MYC elements and that some contain ABRE, DRE, and/or LTRE elements, suggesting that the soybean Hsfs play significant roles in the regulation of stress responses (Table 2). The MYB elements participate in drought, low temperature, salt, ABA, and GA responses [34, 35], and the MYC elements participate in drought, salt, and ABA responses. We show that the Arabidopsis, rice, and maize Hsf gene promoters contain MYB/MYC elements (Table 2), and we conclude that Hsfs are involved in the responses to drought, salt, and ABA in plants. However, the expression results from the gene chips did not agree perfectly with this conclusion. For instance, GmHsf-25 contained 15 cis-acting elements and its expression value was up to 1090, whereas the value for GmHsf-10, which carries 85 cis-acting elements, was only 228 (Table 2 and Additional file 1: Table S1). One explanation may be that several elements lost their activities or acted as negative regulators.
Because Hsf genes in the same subclass are likely to have similar functions, 19 genes covering all subclasses were selected to investigate responses to drought and heat stresses using qRT-PCR. The results showed that the soybean Hsf genes were differentially expressed under the drought stress and heat stress conditions. Three soybean Hsfs belonging to subclass A1 (GmHsf-04, GmHsf-17, and GmHsf-33) were expressed at significantly high levels under drought stress and heat stress (Figure 7). Moreover, GmHsf-08 and GmHsf-34 (subclass A2), GmHsf-11 (subclass A4), GmHsf-09 (subclass A5), GmHsf-10 (subclass A8), GmHsf-25 (subclass B1), and GmHsf-16 (subclass C1) were up-regulated under both drought stress and heat stress conditions. It was reported that tomato HsfA1a and Arabidopsis HsfA2 function as master regulators of acquired thermo-tolerance [36, 37], and that tomato HsfB1 is a co-regulator with HsfA1a. These results indicate that A1, A2, and several other subclass members may be involved in the drought stress and heat stress responses in plants. According to the soybean gene chip data, GmHsf-02, GmHsf-07, GmHsf-19, GmHsf-20, GmHsf-28, GmHsf-35, and GmHsf-37 were expressed at very low levels. Among them, GmHsf-07, GmHsf-20, and GmHsf-28 belong to subclass A3. It is likely that the functions of these soybean Hsf genes, especially those in subclass A3, are not related to drought or heat stress responses. Most of the soybean Hsf genes expressed in roots were regulated by drought, while those expressed in young leaves were regulated by heat. This is consistent with the fact that the root is the first organ to sense drought stress, whereas the leaf is the first to experience heat stress.
In a previous publication, expression analyses of 5 soybean Hsf genes (GmHsf12, GmHsf28, GmHsf34, GmHsf35, and GmHsf47) were performed under heat, low-temperature, NaCl, and drought stresses. These 5 genes are named GmHsf-08, GmHsf-21, GmHsf-26, GmHsf-25, and GmHsf-32, respectively, in our study. We found that GmHsf-08, GmHsf-21, and GmHsf-25 were markedly up-regulated by heat, which was consistent with the previous work, while GmHsf-26 showed no detectable alteration. Under drought conditions, GmHsf-08, GmHsf-25, and GmHsf-26 were strongly up-regulated, while GmHsf-21 expression was not influenced. GmHsf-32 expression under abiotic stress conditions was not surveyed in our study. GmHsf-34 was strongly induced by drought and heat stresses, and its overexpression improved survival rate and/or root development in Arabidopsis under simulated drought and heat conditions (Figure 9). Similarly, overexpression of Arabidopsis AtHsfA2, one of the most strongly induced genes, led to enhanced tolerance to heat stress. Given the close phylogenetic relationship of GmHsf-34 with AtHsfA2, it is speculated that GmHsf-34 functions as a typical transcription factor, owing to the presence of the Hsf-type DBD, OD, NES, NLS, and AHA motifs, and participates in heat and drought responses.
Thirty-eight soybean Hsf genes were identified and classified after scanning the soybean genome database. Their locations and duplications, intron-exon structures, DBD and HR-A/B structures, the distribution of cis-acting elements in their promoters, and their expression patterns were determined. Based on the expression analysis, we inferred that soybean Hsf subclasses A1 and A2 may be the primary regulators of the heat stress response in soybean. GmHsf-34, a member of subclass A2, played an important role in the response to the drought and heat stress treatments imposed in this study.
Database searches for Hsf genes in soybean, Arabidopsis, and rice genomes
The whole genome data and the repeat information of soybean and maize were obtained from the JGI Glyma1.0 annotation. The gene sequences and protein sequences of Arabidopsis and rice Hsfs were acquired from TAIR and TIGR, respectively. The gene chip data of soybean were derived from SoyBase.
Identification and physical locations of soybean Hsfs
To gather candidate soybean Hsf amino acid sequences, the Hsf-type DBD domain (Pfam: PF00447) was submitted as a query in a BLASTP (P = 0.001) search of the soybean genome database. A total of 38 soybean Hsfs were obtained after manually filtering out repeated sequences and sequences lacking complete Hsf-type DBD domains or classic coiled-coil structures as determined by SMART. All non-redundant Hsfs were mapped onto the 20 soybean chromosomes on the basis of the information in the soybean database using MapDraw software. The paralogous genes were identified and connected by lines according to Lin’s method.
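The filtering step described above (remove repeated sequences, then keep only candidates with both an Hsf-type DBD and a coiled-coil structure) can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure, which was manual: the `Candidate` record, its field names, the `filter_hsfs` function, and the locus labels are all hypothetical, and the domain flags are assumed to have been determined beforehand with tools such as Pfam and SMART.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    locus: str              # hypothetical locus identifier
    has_dbd: bool           # Hsf-type DBD detected
    has_coiled_coil: bool   # coiled-coil structure detected

def filter_hsfs(candidates):
    """Drop repeated loci, then keep candidates with both required domains."""
    seen = set()
    kept = []
    for cand in candidates:
        if cand.locus in seen:
            continue  # repeated sequence
        seen.add(cand.locus)
        if cand.has_dbd and cand.has_coiled_coil:
            kept.append(cand)
    return kept

# Hypothetical BLASTP hits, for illustration only.
hits = [
    Candidate("locus_1", True, True),
    Candidate("locus_1", True, True),    # duplicate hit
    Candidate("locus_2", False, True),   # no DBD
    Candidate("locus_3", True, False),   # no coiled coil
    Candidate("locus_4", True, True),
]
print([cand.locus for cand in filter_hsfs(hits)])  # ['locus_1', 'locus_4']
```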
Genetic structure and cis-acting elements
An exon-intron structure map was produced with the online tool GSDS, and Promoter 2.0 was applied to predict the soybean Hsf promoters. Cis-acting elements were analyzed by referring to the plant cis-acting element database PLACE 26.0.
Domain prediction and phylogenetic relationships
Clustal X 2.0 was used for protein sequence comparisons of Hsfs in Arabidopsis, rice, and soybean. The database tools Pfam, SMART, PredictNLS, and NetNES were used to analyze their typical functional domains. A phylogenetic tree was constructed by the neighbor-joining method in MEGA5.0 with 1000 bootstrap replicates.
Using the soybean gene chip expression data, the expression of the 38 soybean Hsfs was analyzed in different tissues and at different developmental stages, and the differences among genes, subclasses, organs, tissues, and developmental stages were compared.
Plant materials and stress treatments
Soybean seeds were germinated in vermiculite in a light chamber at 25°C for 14 days. For heat stress, the soybean seedlings were removed and exposed to 42°C for 0, 6, and 12 h, after which they were sampled for RNA extraction. For drought stress, soybean seedlings were removed from the soil and dehydrated for 0, 6, and 12 h before being sampled, frozen in liquid nitrogen, and stored at -80°C.
RT-PCR and qRT-PCR
Total RNA was isolated from whole plants using an RNeasy Plant Mini Kit (Qiagen) according to the manufacturer’s handbook. cDNA synthesis and reverse transcription-PCR (RT-PCR) were conducted as previously described. Quantitative real-time PCR (qRT-PCR) of the soybean Hsfs was performed with the SYBR Premix ExTaq™ kit (TaKaRa) on an ABI 7300 according to the manufacturer’s protocols (Applied Biosystems). The expression patterns were analyzed with an ABI Prism 7300 sequence detection system (Applied Biosystems) as previously described. The qRT-PCR primers for the soybean Hsf genes were designed using Primer Premier 5.0 software, avoiding the conserved Hsf domain, and soybean Actin (U60506) was used as an internal control for normalization of the template cDNA.
Subcellular localization in Arabidopsis protoplasts
The expression vectors with green fluorescent protein (GFP) tags were constructed for the subcellular localization analysis as described previously. The coding regions of GmHsf-04, GmHsf-33, and GmHsf-34 were amplified by PCR using specific primers and fused to the N-terminus of GFP under control of the CaMV 35S promoter. The subcellular localization of GFP expression in the Arabidopsis protoplasts was monitored by confocal microscopy 16 h after polyethylene glycol-mediated transformation as described.
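The text does not state which formula was used to convert qRT-PCR readings into relative expression. One commonly used approach for this kind of normalization against an internal control is the 2^-ΔΔCt method, sketched below as an assumption rather than as the authors' documented procedure. The function name `relative_expression` and the Ct values are hypothetical, and the sketch assumes roughly 100% amplification efficiency for both target and reference.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    Each Ct is first normalized to the reference gene (for example Actin),
    then the treated sample is compared with the untreated control.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)

# Hypothetical Ct values, for illustration only.
fold = relative_expression(22.0, 18.0, 24.0, 18.0)
print(fold)  # 4.0
```

In this example the normalized Ct of the target drops by 2 cycles after treatment, which corresponds to a 4-fold increase in transcript level.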
Tolerance assays under stress conditions
The GmHsf-34 gene, which is induced by drought and heat stresses, was selected for functional confirmation. The expression vector pBI121::GmHsf-34, containing GmHsf-34 under control of the CaMV 35S promoter, was constructed. Three Arabidopsis lines overexpressing GmHsf-34 were obtained by Agrobacterium tumefaciens-mediated transformation. For the germination assays, seeds of Col-0 and transgenic plants were placed on ½ MS medium containing 0 or 4% (w/v) PEG for 7 days. For the root growth assays, 4-day-old seedlings grown on ½ MS medium were transferred to ½ MS medium containing 0 or 4% (w/v) PEG for 4 days, after which the root lengths were measured with a root system scanner. For the heat stress tolerance assays, 21-day-old seedlings that had been transferred from ½ MS medium to soil were treated at 42°C for 6 h and then grown under normal conditions for several days.
AREB: ABRE binding protein
DREB: DRE binding protein
GFP: Green fluorescent protein
HSE: Heat stress element
Hsf: Heat shock transcription factor
Hsps: Heat shock proteins
LTRE: Low-temperature responsive element
NES: Nuclear export signal
NLS: Nuclear localization signal
This research was financially supported by the National High Technology Research and Development Program of China (2013AA102602) and the National Natural Science Foundation of China (31171546). We are grateful to Dr. Lijuan Qiu of the Institute of Crop Science, Chinese Academy of Agricultural Sciences for kindly providing soybean seeds.
The authors declare they have no competing interests.
Z-S X coordinated the project, conceived and designed the experiments, and edited the manuscript. P-S L conducted the bioinformatic work, generated and analyzed data, and wrote the manuscript. T-F Y and G-H H performed experiments and analyzed data. M C and Y-B Z provided reagents. S-C C and Y-Z M contributed valuable discussions. All authors read and approved the final manuscript.
Pan-Song Li and Tai-Fei Yu contributed equally to this work.
Electronic supplementary material
Additional file 1: Table S1: Normalized digital gene expression counts of the uniquely mappable reads of soybean Hsf genes. To collect gene expression information, the ID numbers of the soybean Hsf genes were submitted to SoyBase [http://soybase.org/soyseq/]. (DOC 98 KB)
About this article
Cite this article
Li, PS., Yu, TF., He, GH. et al. Genome-wide analysis of the Hsf family in soybean and functional identification of GmHsf-34 involvement in drought and heat stresses. BMC Genomics 15, 1009 (2014). https://doi.org/10.1186/1471-2164-15-1009
- Genome-wide identification
- Expression pattern
- Subcellular localization
- Functional identification