- Open Access
Genome-wide identification and analysis of bZIP gene family reveal their roles during development and drought stress in Wheel Wingnut (Cyclocarya paliurus)
BMC Genomics volume 23, Article number: 743 (2022)
The bZIP gene family has important roles in various biological processes, including development and stress responses. However, little information about this gene family is available for Wheel Wingnut (Cyclocarya paliurus).
In this study, we identified 58 bZIP genes in the C. paliurus genome and analyzed phylogenetic relationships, chromosomal locations, gene structure, collinearity, and gene expression profiles. The 58 bZIP genes could be divided into 11 groups and were unevenly distributed among 16 C. paliurus chromosomes. An analysis of cis-regulatory elements indicated that bZIP promoters were associated with phytohormones and stress responses. The expression patterns of bZIP genes in leaves differed among developmental stages. In addition, several bZIP members were differentially expressed under drought stress. These expression patterns were verified by RT-qPCR.
Our results provide insights into the evolutionary history of the bZIP gene family in C. paliurus and the function of these genes during leaf development and in the response to drought stress. In addition to basic genomic information, our results provide a theoretical basis for further studies aimed at improving growth and stress resistance in C. paliurus, an important medicinal plant.
The basic leucine zipper (bZIP) family, a supergene family encoding transcription factors (TFs), is evolutionarily conserved and widely distributed across eukaryotic organisms . bZIP TFs contain a bZIP domain, generally composed of 60–80 amino acids, with two functionally distinct parts, a highly conserved basic region and a variable leucine-zipper region (explaining the name bZIP) [2, 3]. The basic binding region has a nuclear localization signal (NLS) and a N-X7-R/K structural unit [4, 5]. The bZIP gene family has been studied extensively in plants. The number of bZIP genes varies considerably among species, with 78 in Arabidopsis , 92 in rice , 86 in poplar , 50 in Arachis duranensis , and 52 in Carthamus tinctorius L. . bZIP genes are involved in vital biological processes, including cell elongation, seed and flower development, and nitrogen/carbon and energy metabolism . In addition to the essential regulatory functions in plant growth and development, bZIP genes participate in the response to abiotic stress. For instance, bZIP17 and bZIP24 in Arabidopsis [11, 12], bZIP72 and ABF1 in rice [13, 14], and bZIP44, bZIP62, and bZIP78 in Glycine max  positively regulate plant responses to salt stress, either directly or indirectly. bZIP52, bZIP16, bZIP23, and bZIP45 in rice are involved in drought tolerance [16,17,18]. Moreover, bZIP52 in rice is a negative regulator in cold signaling . bZIP72 in rice positively regulates the ABA response , while bZIP44, bZIP62, and bZIP78 in G. max show negatively regulatory effects .
Cyclocarya paliurus (Batal.) Iljinskaja (Wheel Wingnut), belonging to the family Juglandaceae , is a deciduous tree and is widely distributed in the mountainous regions of sub-tropical China . In China, leaves of C. paliurus are used as a traditional medicine or nutraceutical tea . Its leaves contain abundant physiologically active compounds , such as triterpenoids, polysaccharides, and flavonoids. Furthermore, there is evidence for strong health-promoting effects of its leaves, including the ability to lower blood sugar, reduce blood lipids, protect against cancer, and enhance immunity . The growth and development of C. paliurus leaves are affected by environmental stress, such as drought, salt, cold, and heat , and various TFs contribute to the regulation of growth in C. paliurus leaves. For example, bZIP is involved in the regulation of amino acid biosynthesis , and MYB and bHLH are involved in the regulation of flavonoid biosynthesis . The analysis of transcriptome data of the leaves in C. paliurus revealed the bZIP gene family was one of the most abundant TFs in this organism that regulate leaf development . In addition to participate in leaf development, bZIP gene family is regarded as important regulators in signaling and responses to drought stress [16,17,18]. However, bZIP gene family characteristics have not been evaluated by integrative genome and transcriptomic analyses in C. paliurus.
The complete genome of C. paliurus has been sequenced, and 46,292 protein-coding genes have been identified . In this study, we performed the genome-wide identification of the bZIP gene family and explored the structural characteristics of bZIP genes. We also measured the differential expression of bZIP genes at four developmental stages and under four drought stress treatments. We explored the evolution of bZIP genes and its roles in leaf developmental process and under drought stress. Our results provide a basis for further analysis of the molecular basis of growth, development, and stress responses in C. paliurus leaves.
Genome-wide identification of bZIP family members in C. paliurus
We identified 58 bZIP genes in the C. paliurus genome, named CpbZIP1 to CpbZIP58 according to their localization on the chromosomes (Table 1). The lengths of CpbZIP mRNA transcripts and protein sequences ranged from 399 bp to 4,116 bp (CDS sequences) and 132 amino acids (CpbZIP8) to 1,371 amino acids (CpbZIP22) (translated protein sequences). The average molecular weight of CpbZIP family members was 43.39 kDa. The average isoelectric point (pI) of CpbZIP genes was 4.78 (CpbZIP11) to 9.53 (CpbZIP27). A plot of the molecular weight with pI for each CpbZIP gene revealed that the majority of CpbZIPs clustered together, indicating that they have a similar properties (Fig. S1). The grand average of hydropathy index (GRAVY) values for CpbZIP members ranged from -0.968 to -0.301, suggesting that these proteins are hydrophilic. All of the CpbZIP genes were predicted to be located in the nucleus, consistent with the biological function of TFs.
To explore evolutionary relationships, we constructed a maximum likelihood phylogenetic tree based on the full-length sequences of proteins encoded by bZIP genes in C. paliurus and Arabidopsis (Fig. 1). The bZIP family members in C. paliurus and Arabidopsis were assigned to 13 groups according to the classification system for Arabidopsis. Only the bZIP proteins of Arabidopsis were assigned to group J and M. The three largest groups in C. paliurus included 13 (group A), 10 (group D), 7 (group I) CpbZIP members (Fig. S1 and Fig. S2).
Chromosome localization, selective pressure, and collinearity analysis of CpbZIP genes
All CpbZIP genes were found on 14 chromosomes of C. paliurus (Fig. 2 and Table 1), with an uneven distribution and substantial variation. Apart from Chromosome 13 and 14, which had no CpbZIP genes, chromosome 3 harbored the largest number of CpbZIP genes (9, 15.5%), while the fewest CpbZIP genes were detected on chromosome 16 (1, 1.7%). In addition, most of the CpbZIP genes were located near the ends of chromosomes.
Furthermore, we examined duplication events of CpbZIP family members. Based on the phylogenetic tree constructed (Fig. S3), several duplication events were predicted. In a survey of CpbZIP genes in the C. paliurus genome, 15 segmental duplications and 5 tandem duplications were identified, as shown in Figure S4 and Table S1, indicating that segmental duplication might play an important role in bZIP gene family expansion. Duplications of CpbZIP genes may have occurred at two time points, approximately 0.25–38.29 Mya and 80.60–99.47 Mya (Table S1). The non-synonymous substitution rate (Ka), synonymous substitution rate (Ks), and Ka/Ks ratio for 21 duplicated gene pairs were calculated to evaluate selective pressure (Table S1). Values of Ka/Ks < 1, Ka/Ks = 1, and Ka/Ks > 1 suggest purifying selection, neutral selection, and positive selection, respectively . The Ka/Ks ratios for all bZIP genes in C. paliurus were 0.1121–1.1166, and only one pair had a Ka/Ks ratio exceeding 1.0, suggesting that most CpbZIP genes were under purifying selection.
The collinearity between C. paliurus bZIP genes and related genes from four other species (i.e., Oryza sativa, Arabidopsis thaliana, Fragaria vesca, and Juglans regia) was also evaluated using the Multiple Collinearity Scan toolkit. In total, 33 bZIP genes in C. paliurus showed collinear relationships with 5 O. sativa genes, 12 Arabidopsis genes, 15 F. vesca genes, and 17 J. regia genes (Fig. 3 and Table S2). The numbers of orthologous gene pairs were 18 between C. paliurus and O. sativa, 22 between C. paliurus and Arabidopsis, 30 between C. paliurus and F. vesca, and 38 between C. paliurus and J. regia. Less orthologous gene pairs were found between C. paliurus and O. sativa, which may be explained by the closer phylogenetic relationships between C. paliurus and other species .
Analyses of gene structure and conserved motifs
To understand the sequence structure of the bZIP family in C. paliurus, the intron–exon structure (Fig. 4) and motif composition of each member (Fig. 5) were analyzed. CpbZIP genes had 1 to 17 exons. Most CpbZIP genes contained 1–3 introns, and some members of the CpbZIP gene family were intron-less, such as CpbZIP2, CpbZIP3, CpbZIP8, CpbZIP15, CpbZIP21, CpbZIP24, CpbZIP34, CpbZIP50, CpbZIP54, and CpbZIP57. A maximum of 16 introns were found in CpbZIP22 (Fig. S5). Moreover, some CpbZIP members belonging to the same group shared similar gene structures (Fig. 4). For example, all members of group S and group H lacked introns. Out of six members in group E, five had four exons and three introns. Of four members in group C, three had six exons and five introns.
To discover conserved motifs of CpbZIP genes, we used MEME (Multiple Em for Motif Elicitation). A total of 20 conserved motifs were identified in 58 CpbZIP genes (Fig. 5), all of which had a bZIP domain (PF00170) represented by motif 1 (Table S3). Motif 6 and motif 14 were detected in the majority of CpbZIP members. In addition, motif 7, motif 8, and motif 15 occurred only in group A. Motif 12 was present only in group E and group I. Motif 2, motif 3, motif 4, motif 5, and motif 10 were located only in group A. Motif 18 was shared only by three members in group F. Many conserved motifs were found in specific groups and might be related to specific biological functions.
Promoter region analysis of CpbZIP genes
We analyzed the 2000 bp region upstream of CpbZIP genes to elucidate cis-acting regulatory elements (CAREs) involved in processes related to development and the stress response using the PlantCARE webserver (Fig. 6). We found 16 unique CAREs in the CpbZIP gene family, including elements related to light responsiveness, defense and stress responsiveness, drought response, flavonoid biosynthetic regulation, and phytohormone responsiveness, including methyl jasmonate (meJA), gibberellin, abscisic acid, auxin, and salicylic acid. CAREs involved in light, plant hormone, and stress responses were most frequent in the CpbZIP gene family (Table S4 and Fig. 6B), suggesting that these genes are important for the regulation of plant growth and stress responses. Moreover, CAREs in CpbZIP members were also related to seed-specific regulation, meristem expression, and endosperm expression, indicating that these genes may be involved in diverse developmental processes. These data provide useful insights into the regulatory effects of the CpbZIP gene family under stress and during development.
Gene ontology analysis of CpbZIP genes
To understand the functions of bZIP family members, we performed a Gene Ontology (GO) analysis [29,30,31,32]. CpbZIP genes were effectively annotated using eggNOG-Mapper (Table S5) . In the biological process category, CpbZIP genes were enriched for processes related to phytohormones and stress responses (Fig. S6 and Table S6). The GO terms related to hormone responses included response to abscisic acid (GO:0,009,737), cellular response to hormone stimulus (GO:0,032,870), and abscisic acid-activated signaling pathway (GO:0,009,738). The GO terms related to the stress response included response to stimulus (GO:0,050,896), response to osmotic stress (GO:0,006,970), and response to salt stress (GO:0,009,651). The results of the GO analysis also further supported the roles of CpbZIP genes in biological processes related to plant development and stress responses.
Expression of CpbZIP genes under drought stress and across developmental stages
To explore the expression pattern of CpbZIP genes at various leaf developmental stages and under drought stress, we retrieved fragments per kilobase million (FPKM) values for all CpbZIP genes from RNA-Seq data. We used FPKM values to build a principal component analysis (PCA) plot (Fig. S7) and heatmaps (Fig. 7). Four stages of leaf development and four drought treatments were evaluated (for details, please refer to the Materials and Methods section). Under drought treatment, Compared to the control C group of drought treatment, 361 different expressed genes (DEGs) were identified from W1 group, 427 DEGs were from W2 group, and 1,213 DEGs were from W3 group. Of 58 CpbZIP genes, 50 were expressed in the drought-treated samples (FPKM > 0) and showed differences in expression (Fig. 7A). For example, CpbZIP4, CpbZIP5, CpbZIP19, CpbZIP22, and CpbZIP41 showed higher expression levels under drought stress condition (W1, W2, and W3) than in the control group (C). Moreover, during leaf development, 53 CpbZIP genes were expressed at different developmental stages, some of which showed higher expression in the smallest fully expanded leaves (Y stage) and small leaves (X stage) than in intermediate-sized leaves (Z stage) and in the largest fully expanded leaves (D stage) (Fig. 7B). CpbZIP1, CpbZIP7, CpbZIP8, CpbZIP15, CpbZIP28, CpbZIP49, CpbZIP51, and CpbZIP55 were most highly expressed in the Y and X stages. These results indicated CpbZIP genes are important for drought tolerance and leaf development.
To confirm the RNA-Seq results, nine differentially expressed genes were selected for validation by qRT-PCR. As shown in Fig. 8A, all selected CpbZIP genes were up-regulated under drought stress. The expression levels of CpbZIP4, CpbZIP19, and CpbZIP41 were significantly higher in all three drought treatments than in the control, while CpbZIP5 expression was significantly higher in W2 and W3 conditions and CpbZIP21 expression was highest in W1 and W2 conditions. An increase in the expression level of CpbZIP22 was detected in W3. During leaf development, as shown in Fig. 8B, CpbZIP7 and CpbZIP55 were highly expressed in the Y developmental stage, while CpbZIP28 was highly up-regulated in the X developmental stage.
Co-expression analysis is a powerful approach to screen associated genes, which may be co-regulated or involved in the same signaling pathway or physiological process . Therefore, co-expression networks were constructed based on the differently expressed genes under developmental and drought stress conditions in C. paliurus. The nine genes with expression changes supported by both RNA-Seq and qRT-PCR (CpbZIP4, CpbZIP5, CpbZIP7, CpbZIP19, CpbZIP21, CpbZIP22, CpbZIP28, CpbZIP41, and CpbZIP55) and mRNAs from plant leaves were used to identify patterns of co-expression (Fig. 9). Nine co-expression networks were obtained, including 342 significantly correlated gene pairs. Among these, the network centered on CpbZIP22 was the largest (90 genes). The network centered on CpbZIP21 was the smallest, with only one co-expressed gene. In addition, with the annotation of 342 significantly correlated gene pairs, several genes were found to be involved in the responses to the water deprivation (Table S7).
We performed a gene set enrichment analysis of eight sets of co-expressed genes (the smallest network involving CpbZIP21 was excluded). The ten most significant GO terms were selected for each set (Fig. 10). CpbZIP4, CpbZIP5, CpbZIP19, CpbZIP22, and CpbZIP41, which were up-regulated under drought stress, were enriched for the response to abiotic stimulus (GO:0,009,607), response to external stimulus (GO:0,009,605), and response to stress (GO:0,006,950). In addition, CpbZIP7, CpbZIP28, and CpbZIP55, which were highly expressed in during leaf development (Y stage and X stage), were enriched for reproduction (GO:0,000,003), post-embryonic development (GO:0,009,791), and growth (GO:0,040,007). CpbZIP genes may therefore play important roles in the regulation of C. paliurus growth and development and stress responses.
C. paliurus is an endangered plant that only grows in China and is a very important medical plant; its leaves contain polysaccharides, triterpenoids, and other chemical components with numerous health benefits . In plants, bZIP TFs have been reported to contribute to developmental processes and abiotic stress tolerance . Members of the bZIP family have been comprehensively identified and analyzed in several plants, including Arabidopsis , rice , poplar , Arachis duranensis , and Carthamus tinctorius L. . Although a chromosome-scale genome assembly of C. paliurus has been reported, bZIP genes have not been comprehensively identified and their roles in leaf development and drought stress are unclear. In this study, 58 bZIP genes were identified in the C. paliurus genome by a homology search. A transcriptome analysis of C. paliurus revealed 60 differentially expressed bZIP genes among different developmental stages , which was higher than number of genes identified in our genome-wide homology-based search. This may explained by the transcriptomic data obtained from four sub-genomes in autotetraploid C. paliurus and the lack of bZIP domain validation. In addition, compared to the genes predicted from transcriptomic data, genome-wide identification combined with a transcriptomic analysis can provide more information on gene structures, functions, and expression patterns [36, 37]. Further chromosome-level assemblies of the four sub-genomes may facilitate more comprehensive functional studies of bZIP genes and their regulatory mechanisms in C. paliurus. The genomic survey revealed 58 members of the C. paliurus bZIP gene family, which was fewer than estimates in Arabidopsis (78 bZIPs), rice (92 bZIPs), maize (125 bZIPs), and poplar (86 bZIPs) [1, 6, 7, 38]. Similar to the C. paliurus family, the bZIP families in Arachis duranensis (50 bZIPs) and Carthamus tinctorius L (52 bZIPs) were relatively small [8, 9], indicating that the gene family in these taxa contracted during evolution.
In this study, all CpbZIPs were predicted to be located in the nucleus, consisting with the TF characteristics and experimental studies in other organisms, such as rice . Moreover, the 58 CpbZIP genes were not uniformly distributed across the 16 chromosomes in C. paliurus (Fig. 2) and were preferentially located near the ends of the chromosomes, similar to observations in sweet potato , Cucumis sativus , and wheat . Based on the phylogenetic reconstruction (Fig. 1), bZIP genes in this study could be categorized into 13 groups; C. paliurus lacked CpbZIP genes in group J and group M in Arabidopsis, suggesting that genes in these groups diverged or were lost in C. paliurus . Recent studies have proposed that gene duplication events are the main driving forces for gene family expansion and genome evolution, particularly segmental duplication and tandem duplication [43, 44]. In the expansion of the bZIP gene family, segmental duplications are more common than tandem duplications in many plants, such as Ipomoea trifida , Malus halliana , and wheat . We detected 15 gene pairs with evidence for segmental duplications and 5 pairs with evidence for tandem duplications (Table S1), consistent with these previous findings. Most CpbZIPs (95.24%) showed evidence for purifying selection (Ka/Ks > 1) , indicating that CpbZIP genes in C. paliurus are highly conserved. One gene pair with Ka/Ks above 1.0 may be under positive selection , with rapid recent evolution and potential functional importance . Furthermore, that there was greater collinearity between C. paliurus and J. regia than between C. paliurus and other plants due to the relatively closer evolutionary relationships . In C. paliurus, CpbZIP members showed similar gene structures in the majority of subfamilies (Fig. 4), especially in the number and length of exons, consistent with results reported in wheat . A motif analysis (Fig. 5) revealed 20 motifs in C. paliurus, named motif 1 to motif 20 (Fig. 5), consistent with results in wheat , Carthamus tinctorius , and cassava . In addition to the bZIP domain (motif 1) located in each CpbZIP gene, the overall compositions of motifs were similar within the same subgroup but different among groups, indicating that functional divergence of bZIP genes may be determined by group-specific motifs . This was consistent with results of studies of polar  and Malus halliana . Both gene structure and motif analyses support the classification of bZIP genes in the phylogenetic analysis.
Several studies have demonstrated the roles of plant bZIP proteins in numerous developmental processes and in responses to biotic and abiotic stresses [8, 49,50,51,52]. However, little is known about their functions in C. paliurus. In this study, we explored their expression patterns after drought stress treatment and during different stages of leaf development. A transcriptome analysis revealed that a large number of CpbZIP genes were up-regulated after drought treatment or in the Y stage and X stage (Figs. 7 and 8), such as CpbZIP4, CpbZIP5, CpbZIP7, CpbZIP19, CpbZIP21, CpbZIP22, CpbZIP28, CpbZIP41, and CpbZIP55, indicating CpbZIPs have vital functions in leaf development and responses to drought stress. Similarly, the cis-acting elements in promoter regions contained a variety of components involved in the stress response (drought response, low-temperature response, and defense and stress response) and phytohormone responses (gibberellin, auxin, abscisic acid, salicylic acid, and methyl jasmonate) (Fig. 6). These results supported the important roles of the CpbZIP gene family in environmental stress and plant development, consistent with previously reported functions of bZIP TFs [1, 4, 15,16,17, 19, 51]. In the present study, in addition to the up-regulated genes, some CpbZIPs were down-regulated in response to drought stress and during leaf development, indicating that CpbZIP TFs might act as positive or negative regulators. This phenomenon has been reported in other organisms. For example, AtbZIP17 and AtbZIP24 act as positive regulators in Arabidopsis under salt stress [11, 12], while OsbZIP52  in rice functions as a negative regulator in cold signaling. Moreover, OsbZIP72 in rice positively regulates the ABA response , while GmbZIP44 and GmbZIP62 in Glycine max show negatively regulatory effects . To understand bZIP gene functions in C. paliurus, co-expression network and gene set enrichment analyses were performed (Figs. 9 and 10). The differentially expressed genes at different developmental stages and their corresponding networks were mainly enriched in processes related to plant growth, while differentially expressed genes in drought stress were not only enriched in stress response-related biological processes but also in growth-related processes. These results suggested that CpbZIP genes are potentially involved in drought resistance and leaf development in C. paliurus. Nonetheless, further experimental analyses should be carried out to elucidate the precise regulatory mechanism by which CpbZIP genes contribute to the response to drought stress and development.
C. paliurus is an endangered medical plant distributed in the mountainous regions of sub-tropical China. Research has mainly focused on increasing yield, quality, and stress tolerance in C. paliurus. The bZIP gene family is involved in plant growth and development and plays important roles in the tolerance to environmental stress. In this study, we identified and characterized the bZIP gene family in C. paliurus. Expression profiling and functional enrichment analyses clearly demonstrated the role of CpbZIPs in leaf development and the response to drought stress. The results of this study improve our understanding of the role of bZIPs in developmental processes and in drought stress and provide a good foundation for further studies of the molecular regulatory mechanisms underlying C. paliurus stress resistance and growth.
Genome-wide identification of bZIP transcription factors in C. paliurus
The hidden Markov model of the bZIP domain (PF00170) was obtained from the PFAM database (http://pfam.xfam.org/, accessed on 19 November 2021) and the genome sequence and genome annotation of C. paliurus were downloaded from Genome Warehouse in National Genomics Data Center Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation (https://ngdc.cncb.ac.cn/gwh, under accession number GWHBEHY00000000, accessed on 18 December 2021). To identify CpbZIP genes in C. paliurus, two methods were applied. First, a local database of protein sequences was made for C. paliurus, and bZIP genes from Arabidopsis were utilized to discover putative bZIP genes in C. paliurus by BLASTp searches. A cutoff e-value of 10–5 and bit score of 100 were thresholds for the identification of putative bZIP genes. Second, another protein sequence database of bZIP genes from other plant species was built from Ensembl hosts (http://plants.ensembl.org/index.html, accessed on 21 February 2022). Then, BLASTp searches were performed against the proteome of C. paliurus with an e-value threshold of 10–5 and bit score threshold of 100. After removing redundancy, 72 putative bZIP candidates were obtained, which were further verified for the existence of the bZIP domain (PF00170) using HMM-scan (https://www.ebi.ac.uk/Tools/hmmer/search/hmmscan), NCBI CDD (https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml), interPro (https://www.ebi.ac.uk/interpro/), and SMART tools (https://smart.embl-heidelberg.de/). After removing sequences without bZIP domains, 58 bZIP genes were named according to the locations on the chromosomes.
Sequence analysis of CpbZIP genes in C. paliurus
The isoelectric point and molecular weight of CpbZIP proteins were characterized using the isoelectric point calculator (https://web.expasy.org/compute_pi/). CELLO [53, 54] was used to predict the subcellular localization of CpbZIP proteins. The annotation file was utilized to extract intron–exon distributions and gene structures were visualized using Gene Structure Display Server 2.0 . MEME  was used to elucidate conserved motifs. The maximum number of motifs was set to 10, motif width was 6–20, and other parameters were set to default values. For the identification of CAREs, the 2000 bp sequences upstream of the CpbZIP genes were analyzed by the PlantCARE online server (http://bioinformatics.psb.ugent.be/webtools/plantcare/html) and visualized using TBtools .
Chromosomal location, gene duplication, and synteny analysis
The genomic positions of CpbZIP genes and length of each chromosome were extracted from genome sequence and annotation files using local Perl scripts. TBtools was used to represent CpbZIP genes on C. paliurus chromosomes. MCScanX was used to investigate gene duplication events within C. paliurus species and similarity between bZIP genes in C. paliurus and four species, Oryza sativa, Arabidopsis thaliana, Fragaria vesca, and Juglans regia. Data for the first three species were downloaded from the Phytozome database  and data for Juglans regia were downloaded from the NCBI Nucleotide database (NC_049901–NC_049916). The nonsynonymous substitution rate and synonymous substitution rates were calculated using DnaSP . The time of each gene duplication event was calculated with formula T = Ks/2λ, assuming 6.5 × 10−9 synonymous substitutions per site per year [41, 60, 61].
Plant material and drought treatment
Leaf materials of C. paliurus were collected from ZhuZhang Village, Longquan City, Lishui City, Zhejiang province, China (E118°48’28”, N28°5’57”). Leaves were divided into four development stages, including the smallest fully expanded leaves (Y stage), small leaves (X stage), intermediate-sized leaves (Z stage), and the largest fully expanded leaves (D stage). The leaves of C. paliurus were sampled separately on the same tree at the same time of each developmental stage. The collected leaves were stored in a liquid nitrogen tank immediately after being collected from the branches. Then the leaves were transferred to -80℃ freezer for storage after returning to the laboratory. Three biological replicates were independently performed, and each developmental stage contained three plants in one biological replicate. To avoid experimental errors between repetitions, we collected leaves of four developmental stages on the same tree with different orientations at the same time. In addition, one replicate of each developmental stage mixed the leaves from three randomly selected trees. For each developmental stage, the whole leaves were used for further RNA-seq analysis.
For the drought treatment, 2-year-old C. paliurus seedlings were moved to greenhouse in Taizhou University with a ratio of peat soil to vermiculite of 2:1. After the seedlings were adapted to the growth environment and maintained stable growth, four drought treatments were applied for 100 days, including 22.5–25.5% soil water (control C group), 16.5–19.5% soil water (W1), 10.5–13.5% soil water (W2), and 4.5–7.5% soil water (W3). Similar to the developmental leaf materials, three biological replicates for each drought treatment were included for transcriptome analyses.
Transcriptomic data for C. paliurus leaves at four developmental stages were collected as described previously by Sheng et al.  and were downloaded from the NCBI database with accession no. PRJNA548403. For different drought treatment groups, total RNA was extracted from the leaves using a Total RNA Extractor (TRIzol) Kit (B51311; Sangon Biotechnology, Shanghai, China). Three biological replicates were performed for a total of 12 samples, which were used for mRNA library construction after the determination of the quality and concentration of extracted RNAs using the NanoDrop 2000 (Thermo Fisher, Waltham, MA, USA). mRNA libraries were constructed using the VAHTS mRNA-seq V2 Library Prep Kit for Illumina (NR60102; Vazyme Biotechnology, Nanjing, China). The T100TM thermal cycler (Bio-Rad, Hercules, CA, USA) was used to synthesize the first- and second-strand cDNAs, and the library fragments were further purified by AMPure XP System (Beckman Coulter Company, Beverly, MA, USA). After library amplification by PCR, the products were purified using the AMPure XP system and qualified using the Bioanalyzer 2100 system (Agilent Technologies Inc., Santa Clara, CA, USA). Finally, paired-end sequencing of these libraries was performed using HiSeq X Ten sequencers (Illumina, San Diego, CA, USA) by Novagen Co., Ltd. (Beijing, China). After removing the adapters and low-quality reads using Trimmomatic , the trimmed reads were aligned to the C. paliurus genome using HISAT2 with default parameters . The expression profiles including FPKM values and read counts for each CpbZIP gene were calculated using StringTie  with default parameters. Heatmaps and a principal component analysis (PCA) were performed using TBtools  and the FactoMineR R package .
Real-time PCR analysis
RNAs extracted from plants at different developmental stages and under drought stress were treated with DNase-I (Takara Bio. Inc., Shiga, Japan) at 37 °C for 30 min to remove genomic DNA contamination. RNAs were reverse transcribed to cDNA using the cDNA Synthesis SuperMix Kit (Applied Biosystems, Shanghai, China). Quantitative real-time PCR (qRT-PCR) was performed using SYBR qPCR Master MIX (Vazyme). Three biological replicates were included for each sample. Relative expression by qRT-PCR was normalized to beta actin (β-actin). The fold change values were calculated based on mean 2−∆∆CT values . Primers were designed using the Sangon Biotech online server (https://www.sangon.com/newPrimerDesign). The primers are listed in Table S8.
Gene co-expression and gene ontology analysis
Nine differentially expressed CpbZIP genes were evaluated. Co-expression between CpbZIP genes and non-CpbZIP genes was evaluated based on Pearson correlation coefficients (PCC). Gene pairs for which the absolute value of the PCC was higher than 0.99 (p < 0.01) were regarded as co-expressed. Cytoscape  was used for network visualization. A gene set enrichment analysis was performed using the clusterprofiler package in R .
Availability of data and materials
The raw RNA-Seq data of drought treatment groups in C. paliurus analyzed in this study have been deposited in the Nation Center for Biotechnology Information (NCBI) Sequence Read Archive (SRA) database under the accession number PRJNA870281. The transcriptomic data for C. paliurus leaves at four developmental stages that analyzed in this study were from the NCBI database with accession number PRJNA548403. The genome sequence and genome annotation of C. paliurus were from Genome Warehouse in National Genomics Data Center Beijing Institute of Genomics, Chinese Academy of Sciences/China National Center for Bioinformation (https://ngdc.cncb.ac.cn/gwh, accession number GWHBEHY00000000).
Basic leucine zipper
Real time quantitative polymerase chain reaction
Grand average of hydropathy index
Nuclear localization signal
Millions of Years Ago
- K a :
Non-synonymous substitution rate
- K s :
Synonymous substitution rate
Cis-acting regulatory elements
Principal component analysis
Dröge-Laser W, Snoek BL, Snel B, Weiste C. The Arabidopsis bZIP transcription factor family — an update. Curr Opin Plant Biol. 2018;45(Pt A):36–49.
Kouzarides T, Ziff E. Leucine zippers of fos, jun and GCN4 dictate dimerization specificity and thereby control DNA binding. Nature. 1989;340(6234):568–71.
Vinson CR, Sigler PB, McKnight SL. Scissors-grip model for DNA recognition by a family of leucine zipper proteins. Science. 1989;246(4932):911–6.
Jakoby M, Weisshaar B, Dröge-Laser W, Vicente-Carbajosa J, Tiedemann J, Kroj T, et al. bZIP transcription factors in Arabidopsis. Trends Plant Sci. 2002;7(3):106–11.
Yu Y, Qian Y, Jiang M, Xu J, Yang J, Zhang T, et al. Regulation Mechanisms of Plant Basic Leucine Zippers to Various Abiotic Stresses. Front Plant Sci. 2020;11:1258.
Corrêa LGG, Riaño-Pachón DM, Schrago CG, dos Santos RV, Mueller-Roeber B, Vincentz M. The role of bZIP transcription factors in green plant evolution: adaptive features emerging from four founder genes. PLoS One. 2008;3(8):e2944.
Zhao K, Chen S, Yao W, Cheng Z, Zhou B, Jiang T. Genome-wide analysis and expression profile of the bZIP gene family in poplar. BMC Plant Biol. 2021;21(1):122.
Wang Z, Yan L, Wan L, Huai D, Kang Y, Shi L, et al. Genome-wide systematic characterization of bZIP transcription factors and their expression profiles during seed development and in response to salt stress in peanut. BMC Genomics. 2019;20(1):51.
Li H, Li L, ShangGuan G, Jia C, Deng S, Noman M, et al. Genome-wide identification and expression analysis of bZIP gene family in Carthamus tinctorius L. Sci Rep. 2020;10(1):15521.
Yang Z, Sun J, Chen Y, Zhu P, Zhang L, Wu S, et al. Genome-wide identification, structural and gene expression analysis of the bZIP transcription factor family in sweet potato wild relative Ipomoea trifida. BMC Genet. 2019;20(1):41.
Liu J-X, Srivastava R, Howell SH. Stress-induced expression of an activated form of AtbZIP17 provides protection from salt stress in Arabidopsis. Plant Cell Environ. 2008;31(12):1735–43.
Yang O, Popova OV, Süthoff U, Lüking I, Dietz K-J, Golldack D. The Arabidopsis basic leucine zipper transcription factor AtbZIP24 regulates complex transcriptional networks involved in abiotic stress resistance. Gene. 2009;436(1–2):45–55.
Baoxiang W, Yan L, Yifeng W, Jingfang L, Zhiguang S, Ming C, et al. OsbZIP72 Is Involved in Transcriptional Gene-Regulation Pathway of Abscisic Acid Signal Transduction by Activating Rice High-Affinity Potassium Transporter OsHKT1;1. Rice Sci. 2021;28(3):257–67.
Hossain MA, Lee Y, Cho J-I, Ahn C-H, Lee S-K, Jeon J-S, et al. The bZIP transcription factor OsABF1 is an ABA responsive element binding factor that enhances abiotic stress signaling in rice. Plant Mol Biol. 2010;72(4–5):557–66.
Liao Y, Zou H-F, Wei W, Hao Y-J, Tian A-G, Huang J, et al. Soybean GmbZIP44, GmbZIP62 and GmbZIP78 genes function as negative regulator of ABA signaling and confer salt and freezing tolerance in transgenic Arabidopsis. Planta. 2008;228(2):225–40.
Liu C, Wu Y, Wang X. bZIP transcription factor OsbZIP52/RISBZ5: a potential negative regulator of cold and drought stress response in rice. Planta. 2012;235(6):1157–69.
Chen H, Chen W, Zhou J, He H, Chen L, Chen H, et al. Basic leucine zipper transcription factor OsbZIP16 positively regulates drought resistance in rice. Plant Sci. 2012;193–194:8–17.
Park S-H, Jeong JS, Lee KH, Kim YS, Choi YD, Kim J-K. OsbZIP23, and OsbZIP45, members of the rice basic leucine zipper transcription factor family, are involved in drought tolerance. Plant Biotechnology Reports. 2015;9:89–96.
Lu G, Gao C, Zheng X, Han B. Identification of OsbZIP72 as a positive regulator of ABA response and drought tolerance in rice. Planta. 2009;229(3):605–15.
Deng B, Li Y, Xu D, Ye Q, Liu G. Nitrogen availability alters flavonoid accumulation in Cyclocarya paliurus via the effects on the internal carbon/nitrogen balance. Sci Rep. 2019;9(1):2370.
Kakar MU, Naveed M, Saeed M, Zhao S, Rasheed M, Firdoos S, et al. A review on structure, extraction, and biological activities of polysaccharides isolated from Cyclocarya paliurus (Batalin) Iljinskaja. Int J Biol Macromol. 2020;156:420–9.
Liu Y, Fang S, Yang W, Shang X, Fu X. Light quality affects flavonoid production and related gene expression in Cyclocarya paliurus. J Photochem Photobiol, B. 2018;179:66–73.
Wang H, Tang C, Gao Z, Huang Y, Zhang B, Wei J, et al. Potential Role of Natural Plant Medicine Cyclocarya paliurus in the Treatment of Type 2 Diabetes Mellitus. J Diabetes Res. 2021;2021:1655336.
Zheng X, Xiao H, Su J, Chen D, Chen J, Chen B, et al. Insights into the evolution and hypoglycemic metabolite biosynthesis of autotetraploid Cyclocarya paliurus by combining genomic, transcriptomic and metabolomic analyses. Ind Crops Prod. 2021;173:114154.
Yang Z-T, Fan S-X, Li R, Huang T-M, An Y, Guo Z-Q, et al. The optimal reference gene validation in Cyclocarya paliurus (Batal) Iljinskaja under environmental stresses. Agron J. 2022;10:1–12.
Du Z, Lin W, Zhu J, Li J. Amino acids profiling and transcriptomic data integration demonstrates the dynamic regulation of amino acids synthesis in the leaves of Cyclocarya paliurus. PeerJ. 2022;10:e13689.
Sheng X, Chen H, Wang J, Zheng Y, Li Y, Jin Z, et al. Joint Transcriptomic and Metabolic Analysis of Flavonoids in Cyclocarya paliurus Leaves. ACS Omega. 2021;6(13):9028–38.
Hurst LD. The Ka/Ks ratio: diagnosing the form of sequence evolution. Trends Genet. 2002;18(9):486.
Tao Y-T, Ding X-B, Jin J, Zhang H-B, Guo W-P, Ruan L, et al. Predicted rat interactome database and gene set linkage analysis. Database (Oxford). 2020;2020:baaa086.
Jin J, Tao Y-T, Ding X-B, Guo W-P, Ruan L, Yang Q, et al. Predicted yeast interactome and network-based interpretation of transcriptionally changed genes. Yeast. 2020;37(11):573–83.
Ding X-B, Jin J, Tao Y-T, Guo W-P, Ruan L, Yang Q-L, et al. Predicted Drosophila Interactome Resource and web tool for functional interpretation of differentially expressed genes. Database (Oxford). 2020;2020:baaa005.
Guo W-P, Ding X-B, Jin J, Zhang H, Yang Q, Chen P-C, et al. HIR V2: a human interactome resource for the biological interpretation of differentially expressed genes via gene set linkage analysis. Database (Oxford). 2021;2021:baab009.
Huerta-Cepas J, Szklarczyk D, Heller D, Hernández-Plaza A, Forslund SK, Cook H, et al. eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Nucleic Acids Res. 2019;47(D1):D309-14.
van Dam S, Võsa U, van der Graaf A, Franke L, de Magalhães JP. Gene co-expression analysis for functional classification and gene–disease predictions. Brief Bioinform. 2018;19(4):575–92.
Yang Y, Li J, Li H, Yang Y, Guang Y, Zhou Y. The bZIP gene family in watermelon: genome-wide identification and expression analysis under cold stress and root-knot nematode infection. PeerJ. 2019;7:e7878.
Li X, Ke X, Qiao L, Sui Y, Chu J. Comparative genomic and transcriptomic analysis guides to further enhance the biosynthesis of erythromycin by an overproducer. Biotechnol Bioeng. 2022;119(6):1624–40.
Khodadadian A, Darzi S, Haghi-Daredeh S, Sadat Eshaghi F, Babakhanzadeh E, Mirabutalebi SH, et al. Genomics and Transcriptomics: The Powerful Technologies in Precision Medicine. IJGM. 2020;13:627–40.
Wei K, Chen J, Wang Y, Chen Y, Chen S, Lin Y, et al. Genome-wide analysis of bZIP-encoding genes in maize. DNA Res. 2012;19(6):463–76.
Zou M, Guan Y, Ren H, Zhang F, Chen F. A bZIP transcription factor, OsABI5, is involved in rice fertility and stress tolerance. Plant Mol Biol. 2008;66(6):675–83.
Baloglu MC, Eldem V, Hajyzadeh M, Unver T. Genome-wide analysis of the bZIP transcription factors in cucumber. PLoS One. 2014;9(4):e96014.
Liang Y, Xia J, Jiang Y. Genome-Wide Identification and Analysis of bZIP Gene Family and Resistance of TaABI5 (TabZIP96) under Freezing Stress in Wheat (Triticum aestivum). Int J Mol Sci. 2022;23(4):2351.
de Souza SJ, Long M, Gilbert W. introns and gene evolution. Genes Cells. 1996;1(6):493–505.
Lawton-Rauh A. Evolutionary dynamics of duplicated genes in plants. Mol Phylogenet Evol. 2003;29(3):396–409.
Moore RC, Purugganan MD. The early stages of duplicate gene evolution. Proc Natl Acad Sci USA. 2003;100(26):15682–7.
Wang S, Zhang R, Zhang Z, Zhao T, Zhang D, Sofkova S, et al. Genome-wide analysis of the bZIP gene lineage in apple and functional analysis of MhABF in Malus halliana. Planta. 2021;254:78.
Xing Y, Lee C. Can RNA selection pressure distort the measurement of Ka/Ks. Gene. 2006;370:1–5.
Parmley JL, Hurst LD. How common are intragene windows with KA > KS owing to purifying selection on synonymous mutations. J Mol Evol. 2007;64(6):646–55.
Hu W, Yang H, Yan Y, Wei Y, Tie W, Ding Z, et al. Genome-wide characterization and analysis of bZIP transcription factor gene family related to abiotic stress in cassava. Sci Rep. 2016;6:22783.
Joo H, Baek W, Lim CW, Lee SC. Post-translational Modifications of bZIP Transcription Factors in Abscisic Acid Signaling and Drought Responses. Curr Genomics. 2021;22(1):4–15.
Gangappa SN, Botto JF. The Multifaceted Roles of HY5 in Plant Growth and Development. Mol Plant. 2016;9(10):1353–65.
Ma H, Liu C, Li Z, Ran Q, Xie G, Wang B, et al. ZmbZIP4 Contributes to Stress Resistance in Maize by Regulating ABA Synthesis and Root Development. Plant Physiol. 2018;178(2):753–70.
Lorenzo O. bZIP edgetic mutations: at the frontier of plant metabolism, development and stress trade-off. J Exp Bot. 2019;70(20):5517–20.
Yu C-S, Lin C-J, Hwang J-K. Predicting subcellular localization of proteins for Gram-negative bacteria by support vector machines based on n -peptide compositions. Protein Sci. 2004;13(5):1402–6.
Yu C-S, Chen Y-C, Lu C-H, Hwang J-K. Prediction of protein subcellular localization. Proteins. 2006;64(3):643–51.
Hu B, Jin J, Guo A-Y, Zhang H, Luo J, Gao G. GSDS 2.0: an upgraded gene feature visualization server. Bioinformatics. 2015;31(8):1296–7.
Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 2009;37(Web Server issue):W202-8.
Chen C, Chen H, Zhang Y, Thomas HR, Frank MH, He Y, et al. TBtools: An Integrative Toolkit Developed for Interactive Analyses of Big Biological Data. Mol Plant. 2020;13(8):1194–202.
Goodstein DM, Shu S, Howson R, Neupane R, Hayes RD, Fazo J, et al. Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res. 2012;40(Database issue):D1178-86.
Librado P, Rozas J. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics. 2009;25(11):1451–2.
Lynch M, Conery JS. The evolutionary fate and consequence of duplicategenes. Science. 2000;290(5494):1151–5.
Yang Z, Gu S, Wang X, Li W, Tang Z, Xu C. Molecular evolution of the CPP-like gene family in plants: insights from comparative genomics of Arabidopsis and rice. J Mol Evol. 2008;67(3):266–77.
Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20.
Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 2015;12(4):357–60.
Pertea M, Kim D, Pertea GM, Leek JT, Salzberg SL. Transcript-level expression analysis of RNA-seq experiments with HISAT StringTie and Ballgown. Nat Protoc. 2016;11(9):1650–67.
Lê S, Josse J, Husson F. FactoMineR: An R Package for Multivariate Analysis. J Stat Soft. 2008;25(1):1–18.
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13(11):2498–504.
Yu G, Wang L-G, Han Y, He Q-Y. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS. 2012;16(5):284–7.
We thank Taizhou Bigdata AI Research Center for providing computing resources.
This work was supported by Jiebang Guashuai Program in Traditional Chinese Medicine Industry of Pan'an County, Zhejiang Provincial Key Research and Development Program (2018C02021), Ten Thousand Talent Program of Zhejiang Province (No. 2019R52043), and Taizhou Science and Technology Project (No. 21ywb76).
Ethics approval and consent to participate
Plant materials of wild C. paliurus were collected from ZhuZhang Village, Longquan City, Lishui City, Zhejiang province, China. All the required permissions have been obtained from Forest Research Institute of Longquan City. The wild C. paliurus was identified by Professor Zexin Jin in Taizhou University. The voucher specimen of C. paliurus was deposited in the herbarium of Zhejiang Province Laboratory of Plant Evolution Ecology and Conservation, Taizhou University. The plant materials don’t include any wild species at risk of extinction. We comply with relevant institutional, national, and international guidelines and legislation for plant study.
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Fig. S1. Molecular weight (kDa) vs. isoelectric point for CpbZIP genes.
Fig. S2. Distribution of CpbZIPs in different groups in the phylogenetic tree.
Fig. S3. Phylogenetic analysis of CpbZIP genes. The phylogenetic tree was constructed using IQ-tree with the maximum likelihood (ML) method and 1000 bootstrap replications. Black asterisks indicate putative duplicated genes.
Fig. S4. Chromosomal distribution and duplicated CpbZIP gene pairs. Duplicated bZIP gene pairs are connected by lines with distinct colors.
Fig. S5. Distribution of intron numbers in CpbZIP genes in different groups according to the phylogenetic tree.
Fig. S6. Gene Ontology term distribution in CpbZIP genes.
Fig. S7. PCA plots displaying differentiation with respect to developmental stages and drought stress conditions based on CpbZIP expression patterns.
Table S1. Information on duplicated bZIP gene pairs in C. paliurus. Table S2. Orthologous relationships between CpbZIP genes and bZIP genes in Oryza sativa, Arabidopsis thaliana, Fragaria vesca, and Juglans regia. Table S3. Domain organization of CpbZIP genes predicted using pfam. Table S4. Cis-regulatory elements in CpbZIP promoter regions. Table S5. Gene annotation using eggnog-mapper. Table S6. Gene Ontology analysis of CpbZIP genes. Table S7. Potential genes involved in drought stress responses according to 342 significantly correlated gene pairs. Table S8. qRT-PCR primers for CpbZIP genes.
About this article
Cite this article
Tao, YT., Chen, LX., Jin, J. et al. Genome-wide identification and analysis of bZIP gene family reveal their roles during development and drought stress in Wheel Wingnut (Cyclocarya paliurus). BMC Genomics 23, 743 (2022). https://doi.org/10.1186/s12864-022-08978-8
- Cyclocarya paliurus
- bZIP gene family
- Leaf development
- Drought stress