Genome-wide identification and characterization of members of the LEA gene family in Panax notoginseng and their transcriptional responses to dehydration of recalcitrant seeds
BMC Genomics volume 24, Article number: 126 (2023)
Late embryogenesis abundant (LEA) proteins play an important role in the dehydration that accompanies seed maturation. The seeds of Panax notoginseng (Burkill) F. H. Chen are recalcitrant and highly sensitive to dehydration. However, the role of LEA proteins in the response of P. notoginseng seeds to dehydration stress is poorly understood. We performed a genome-wide analysis of the LEA gene family and examined its transcriptional responses to dehydration stress in recalcitrant P. notoginseng seeds.
In this study, 61 LEA genes were identified in the P. notoginseng genome and named PnoLEA. The PnoLEA genes were classified into seven subfamilies based on phylogenetic relationships, gene structure and conserved domains. The PnoLEA gene family had relatively few introns and was highly conserved. Unexpectedly, the LEA_6 subfamily was not found, and the LEA_2 subfamily contained 46 (75.4%) members. Of the 19 pairs of fragment duplication events, 17 pairs belonged to the LEA_2 subfamily. In addition, the expression of the PnoLEA genes was clearly induced under dehydration stress, but the germination rate of P. notoginseng seeds decreased as the dehydration time was prolonged.
We found that the lack of the LEA_6 subfamily, the expansion of the LEA_2 subfamily and the low transcriptional levels of most PnoLEA genes might be implicated in the formation of recalcitrance in P. notoginseng seeds. LEA proteins are essential in the response to dehydration stress in recalcitrant seeds, but their protective effect is not efficient. These results could improve our understanding of the function of LEA proteins in the response to dehydration stress and of their contribution to the formation of seed recalcitrance.
To date, the LEA gene family has been identified in rice (Oryza sativa), maize (Zea mays), Brassica napus, wheat (Triticum aestivum) and Arabidopsis thaliana. LEA proteins are found in large numbers in plant species; for example, A. thaliana has 51 members, B. napus has 108 and wheat has 281. The number of LEA genes differs across species, and this diversity might be related to how plants respond to abiotic stresses. In cassava (Manihot esculenta Crantz), 26 MeLEA genes were identified and were observed to respond to multiple abiotic stresses and to H2O2 and ABA signaling. Transgenic A. thaliana and foxtail millet (Setaria italica) plants overexpressing SiLEA14 showed higher tolerance to salt and osmotic stress than the wild type (WT). In tea tree (Camellia sinensis), which has recalcitrant seeds, 33 CsLEA genes have been identified, and they are closely related to the response to low temperature and dehydration stresses. The LEA gene family has thus become a popular research topic in plant stress responses.
Late embryogenesis abundant (LEA) proteins were first found in cotton (Gossypium hirsutum) seeds. Cotton seeds accumulate large amounts of LEA proteins as they mature and dehydrate, which protects them from damage. The expression of LEA genes has been recorded in different tissues, including seeds, roots, stems and buds. For example, most LEA genes show tissue-specific expression in maize. The SmLEA genes of Salvia miltiorrhiza are specifically expressed in distinct tissues, and most of them are up-regulated under drought conditions. LEA proteins are generally classified into eight subfamilies based on sequence similarity and specific conserved domains: LEA_1, LEA_2, LEA_3, LEA_4, LEA_5, LEA_6, dehydrin (DHN) and seed maturation protein (SMP). The molecular weight of most LEA proteins ranges from 10 to 30 kDa. LEA proteins are rich in glycine and other hydrophilic amino acids; they are highly hydrophilic and heat stable, and they contribute to cell membrane stabilization, molecular barrier formation, ion binding and antioxidant activity in plants under stress. LEA proteins protect cell membranes and biomolecules, and they stabilize the structure of other proteins and cell membranes by forming dense hydrogen bonds [14, 15]. LEA proteins also redirect intracellular water molecules, bind salt ions and eliminate the reactive oxygen radicals that accumulate in cells during dehydration. In addition, LEA proteins can associate with misfolded proteins as molecular chaperones, stabilizing denatured proteins and promoting their refolding. Overexpression of a group 4 LEA protein from B. napus considerably improves tolerance to abiotic stresses, including salt and drought stress, in transgenic A. thaliana plants. Overexpression of the ShDHN gene enhances the tolerance of tomato (Solanum lycopersicum) to abiotic stresses. In transgenic A. thaliana plants overexpressing MdoDHN11, drought tolerance is enhanced by protecting the embryo and endosperm from water deficiency. Therefore, LEA proteins are essential for the acquisition of dehydration tolerance.
Seeds can be divided into orthodox and recalcitrant types according to their storage characteristics and desiccation tolerance [21, 22]. Recalcitrant seeds maintain a high water content when they mature and are shed, and they are sensitive to dehydration and low temperature during growth and development. The germination rate of recalcitrant Ginkgo biloba seeds decreases from 92% to 50% when their water content is reduced from 48% to 40.1%. Similarly, recalcitrant Saraca asoca seeds have an initial water content of 56.8% and lose viability completely when the water content is reduced to between 11% and 17%. The accumulation of LEA proteins plays an important role in the acquisition of dehydration tolerance [26,27,28]. LEA proteins are hydrophilic and form a protective membrane around internal cellular structures and macromolecules, thereby conferring dehydration tolerance on the seeds. High accumulation of LEA proteins has been observed in orthodox maize seeds during maturation drying. LEA proteins are lacking in dehydration-sensitive recalcitrant Avicennia marina seeds. A deficiency of LEA proteins may be an essential reason for the susceptibility to dehydration of the recalcitrant seeds of the chestnut bean tree (Castanospermum australe). The expression of genes encoding antioxidant enzymes and LEA proteins is down-regulated in recalcitrant tea tree seeds during dehydration. However, it is still unclear whether the lack of LEA proteins causes the dehydration sensitivity of recalcitrant seeds.
Panax notoginseng (Burk.) F. H. Chen (Sanqi in Chinese) is a perennial herb of the family Araliaceae. P. notoginseng seeds show morphophysiological dormancy (MPD) and are typically recalcitrant, with a high water content throughout the postharvest after-ripening process. The seeds need about 45 to 60 days of after-ripening before germination. This combination of dehydration sensitivity and dormancy is extremely unfavorable for the storage of P. notoginseng seeds. Slow dehydration is more harmful to P. notoginseng seeds than rapid dehydration. Membrane peroxidation and reduced antioxidant enzyme activity are among the important reasons for the dehydration sensitivity of P. notoginseng seeds. Recently, RNA-Seq analysis showed that LATE EMBRYOGENESIS ABUNDANT PROTEIN DC3 and DEHYDRIN9 may be involved in the dehydration sensitivity of P. notoginseng seeds at different after-ripening stages. Our previous study found that the lack of LEA proteins in embryos may be a key factor in the dehydration sensitivity of recalcitrant P. notoginseng seeds. However, the LEA gene family of P. notoginseng has not been identified at the whole-genome level, and the functions of the PnoLEA proteins, especially in the response to dehydration stress, therefore remain poorly understood.
In this study, we identified the LEA genes in the P. notoginseng genome and analyzed their gene structure, conserved domains, phylogenetic relationships, chromosomal locations and duplication events. In addition, we analyzed the expression of the PnoLEA genes in distinct tissues and in response to dehydration stress. These results improve our understanding of the PnoLEA gene family and provide new insight into the functions of LEA proteins in recalcitrant seeds under dehydration stress.
Identification of PnoLEA genes in P. notoginseng
We identified 61 LEA genes in the P. notoginseng genome by combining HMMER and local BLAST searches (Table 1). We named each PnoLEA gene according to its position on the P. notoginseng chromosomes. Based on conserved domains, the PnoLEA genes were divided into seven subfamilies; the LEA_6 subfamily was not identified in the P. notoginseng genome. The LEA_2 subfamily was the largest, with 46 (75.4%) members. The LEA_1, DHN and SMP subfamilies contained 2, 6 and 4 members, respectively. The LEA_3, LEA_4 and LEA_5 subfamilies each contained only one member. The 61 PnoLEA proteins showed different physicochemical properties (Table 1). They ranged from 106 to 865 amino acids in length, with predicted molecular weights from 10.96706 kDa (PnoLEA10) to 100.10171 kDa (PnoLEA32). Their predicted isoelectric points (pI) ranged from 4.65 (PnoLEA60) to 10.57 (PnoLEA18). The grand average of hydropathicity (GRAVY) values ranged from -1.35 to 0.389, and 42 PnoLEA proteins (68.8%) had GRAVY values below 0, suggesting that most PnoLEA proteins are hydrophilic. Subcellular localization prediction indicated that most PnoLEA proteins were located in the endomembrane system and nucleus, whereas a few were distributed in the organelle membrane, plasma membrane or chloroplast. All PnoLEA proteins of the DHN and SMP subfamilies were predicted to be located in the nucleus.
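The GRAVY criterion reported here can be reproduced directly: GRAVY is the mean Kyte-Doolittle hydropathy value over all residues of a sequence, and a negative value marks a protein as hydrophilic overall. The following Python sketch assumes the standard Kyte-Doolittle scale; the function names and the toy peptides are illustrative and are not taken from the PnoLEA data set.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the grand average of hydropathicity of a protein sequence."""
    residues = sequence.strip().upper()
    if not residues:
        raise ValueError("empty sequence")
    # Sum the per-residue hydropathy values and divide by sequence length.
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def is_hydrophilic(sequence: str) -> bool:
    """A negative GRAVY value indicates an overall hydrophilic protein."""
    return gravy(sequence) < 0


if __name__ == "__main__":
    # Hypothetical toy peptides, not real PnoLEA sequences.
    for name, seq in [("charged_toy", "KKEEGGSD"), ("apolar_toy", "ILVAFM")]:
        print(name, round(gravy(seq), 3), is_hydrophilic(seq))
```

Applied to a full protein set, counting the sequences for which `is_hydrophilic` returns true gives the kind of proportion quoted in the text.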
Gene structure and conserved structural domains of PnoLEA proteins
The structures of the 61 PnoLEA genes were analyzed to reveal their intron and exon characteristics (Fig. 1). Most genes had between 0 and 2 introns: 51% of the PnoLEA genes contained no intron, 75% contained 0 or 1 intron, and only 6% had more than 2 introns. PnoLEA25 had 3 introns, and PnoLEA16 and PnoLEA32 each had 4 introns.
The motif characteristics of 56 PnoLEA proteins were analyzed with the MEME tool (Fig. 2). The LEA_1 (PnoLEA10 and PnoLEA56), LEA_3 (PnoLEA48), LEA_4 (PnoLEA51) and LEA_5 (PnoLEA33) subfamilies had few members, and their motifs were rarely found in other subfamilies, so these five proteins were excluded from the joint analysis. Members of the same subfamily were similar in the type and number of motifs. Both motif_14 and motif_15 were found in members of the DHN subfamily, and all members of the SMP subfamily had motif_10.
Phylogenetic analyses of the PnoLEA genes
To classify the PnoLEA genes, a neighbor-joining (NJ) tree was constructed from the protein sequences of the 61 identified PnoLEA and 51 AtLEA proteins (Fig. 3). The LEA proteins clustered into nine subfamilies: LEA_1, LEA_2, LEA_3, LEA_4, LEA_5, LEA_6, DHN, SMP and ATM. The 61 PnoLEA genes fell into seven of these subfamilies; the LEA_6 subfamily was not present in the P. notoginseng genome, and the ATM subfamily was unique to A. thaliana. The LEA_1 subfamily had two members and the LEA_2 subfamily had 46 members, whereas the LEA_3, LEA_4 and LEA_5 subfamilies each had only one member. In addition, a phylogenetic tree constructed from the 61 PnoLEA protein sequences alone was also divided into seven subfamilies (Additional file 1: Figure S1).
Chromosomal distribution and expansion of the PnoLEA genes
The 61 PnoLEA genes were randomly distributed on 11 chromosomes of P. notoginseng (Fig. 4). Chromosome 1 carried the most, with 12 PnoLEA genes. Chromosomes 2 and 6 each carried 8 PnoLEA genes, and chromosome 5 carried 7. No PnoLEA gene was located on chromosome 11, and only one was located on chromosome 10.
To understand how the LEA gene family of P. notoginseng evolved, we compared the nucleotide sequences of the PnoLEA genes to determine their duplication patterns. We found 19 pairs of fragment duplication events involving 28 homologous genes (Fig. 5). The non-synonymous substitution (Ka) and synonymous substitution (Ks) values of the homologous gene pairs were calculated, and the Ka/Ks ratios ranged from 0.06 to 0.58 (Additional file 2: Table S1). Because all ratios were below 1, these homologous genes have probably experienced purifying selection during evolution. In addition, we compared the PnoLEA genes with related genes from four species (A. thaliana, Oryza sativa, Solanum lycopersicum and Zea mays) (Additional file 3: Figure S2). The PnoLEA genes had more homologues in the two dicotyledons (A. thaliana and Solanum lycopersicum) than in the two monocotyledons.
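The interpretation of the Ka/Ks ratio can be made explicit with a short Python sketch: a ratio below 1 is conventionally read as purifying selection, a ratio close to 1 as neutral evolution, and a ratio above 1 as positive selection. The Ka and Ks values below are hypothetical and were chosen only to reproduce the two ends of the reported range (0.06 and 0.58); the function name and the tolerance are likewise illustrative.

```python
def selection_mode(ka: float, ks: float, tol: float = 1e-6) -> str:
    """Classify a duplicated gene pair by its Ka/Ks ratio."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a Ka/Ks ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "purifying" if ratio < 1.0 else "positive"


if __name__ == "__main__":
    # Hypothetical (Ka, Ks) pairs giving ratios of about 0.06 and 0.58.
    for ka, ks in [(0.03, 0.5), (0.29, 0.5)]:
        print(round(ka / ks, 2), selection_mode(ka, ks))
```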
Gene expression analysis of the PnoLEA genes in different tissues
To reveal the tissue specificity of PnoLEA gene expression, we analyzed transcriptome data from five tissue types: roots, stems, leaves, flowers and seeds. Hierarchical clustering of the expression of the 61 PnoLEA genes divided the tissues into two clusters, with seeds forming a cluster of their own (Fig. 6). PnoLEA6, PnoLEA21, PnoLEA22, PnoLEA34, PnoLEA48 and PnoLEA61 were not expressed in any of the five tissues. PnoLEA5, PnoLEA12, PnoLEA46 and PnoLEA54 were highly expressed in all five tissues. PnoLEA33, PnoLEA39, PnoLEA43, PnoLEA58 and PnoLEA59 were highly expressed in seeds.
Changes in germination percentage of P. notoginseng seeds during dehydration stress
Freshly harvested mature P. notoginseng seeds had a high water content of about 64.52% (Fig. 7a). The water content decreased with increasing dehydration time and fell below 15% after 24 h of dehydration (Fig. 7a). The number of germinated seeds increased slightly after a short period (3 h) of dehydration stress, but the germination rate declined when dehydration lasted longer than 3 h (Fig. 7b). The germination rate was significantly reduced when the seed water content was below 15% (Fig. 7b).
Expression patterns of PnoLEA genes in response to dehydration stress
Under different levels of dehydration stress, the expression of the 61 PnoLEA genes fell into three clusters, and only a few genes responded strongly to dehydration stress (Fig. 8). PnoLEA54, PnoLEA33, PnoLEA43, PnoLEA5 and PnoLEA58 were highly expressed under dehydration stress. The second cluster of genes showed lower expression and did not respond to dehydration stress. The third cluster of genes responded only slightly, or not at all, to dehydration stress.
To verify the response of PnoLEA genes to dehydration stress, we selected genes from each LEA subfamily that were highly expressed in seeds and validated their expression changes using qRT-PCR (Fig. 9). These were genes of the LEA_1 subfamily (PnoLEA10), the LEA_2 subfamily (PnoLEA5, PnoLEA46, PnoLEA58, PnoLEA59), the LEA_4 subfamily (PnoLEA51), the LEA_5 subfamily (PnoLEA33), the SMP subfamily (PnoLEA16, PnoLEA39) and the DHN subfamily (PnoLEA12, PnoLEA43, PnoLEA54). PnoLEA48 of the LEA_3 subfamily was excluded because its expression was too low. PnoLEA12, PnoLEA33, PnoLEA39, PnoLEA43 and PnoLEA54 showed relatively high expression under dehydration stress.
Molecular characteristics of PnoLEA genes
The LEA gene family is widely important for plant growth and development. The family has been identified in many species, with, for example, 51 members in A. thaliana, 27 in tomato, 68 in Sorghum bicolor and 281 in wheat. In our study, HMM profiles and BLAST were used to search for the LEA gene family in the genome database of P. notoginseng, and 61 PnoLEA genes were identified (Table 1). The number of PnoLEA genes is similar to that in S. miltiorrhiza but lower than that in wheat and poplar (Populus simonii). The number of LEA gene family members differs markedly among species, which may be related to the ploidy of the species and the expansion of the gene family.
A low intron number can accelerate transcription, reducing its cost and allowing the cell to react quickly to abiotic stresses. In A. thaliana, 66.7% of the AtLEA genes contain only one intron, and in wheat, 62% of the LEA genes have no introns. Low intron numbers have also been observed in other stress-responsive genes; for example, most HSP20 genes (92.6%) in apple (Malus domestica) have no introns. Likewise, our results showed that 51% of the PnoLEA genes had no intron, 75% had 0 or 1 intron, and only 6% had more than 2 introns, which could contribute to the transcriptional regulation of PnoLEA genes in response to stress conditions (Fig. 1). Different groups of LEA proteins show low sequence similarity. Motif analyses in A. thaliana, flax (Linum usitatissimum) and wheat indicated that the members of each LEA subfamily contain specific conserved structures. In our study, the conserved motifs of the PnoLEA proteins differed among subfamilies (Fig. 2), suggesting that PnoLEA proteins probably have subfamily-specific functions.
In the Pfam database, the LEA gene family is divided into eight subfamilies. In our study, the LEA_6 subfamily was absent, and the 61 PnoLEA genes were classified into seven subfamilies (Table 1 and Fig. 3). Consistently, the LEA_6 subfamily has been reported to be absent from tea plants, which have recalcitrant seeds. The LEA_6 subfamily has also not been identified in the genomes of algae and rice [51, 52]. Similarly, the LEA_6 subfamily has not been identified in the whole genomes of tomato and Salvia miltiorrhiza [11, 41], suggesting that loss of the LEA_6 subfamily may have occurred during the evolution of these plants. Segmental duplication, tandem duplication and conversion events promote the evolutionary expansion of gene families. Among the PnoLEA subfamilies, the LEA_2 subfamily has the largest number of members, with 46 (75.4%) (Table 1 and Fig. 3). This is in line with previous studies that found large numbers of LEA_2 subfamily members in potato (Solanum tuberosum), Sorghum bicolor, wheat, rice and poplar. However, the number of LEA_2 subfamily members is low in A. thaliana and flax. These results indicate that the LEA gene family has expanded during the evolution of P. notoginseng. Similarly, fragment duplication events are present in the PtrLEA gene family and may have contributed to its expansion. Five pairs of homologous genes have been identified in cassava (Manihot esculenta Crantz), indicating that fragment duplication may be one of the reasons for the expansion of the MeLEA gene family during evolution. Of the 19 pairs of fragment duplication events in P. notoginseng, 17 pairs involve LEA_2 subfamily genes (Fig. 5). This result suggests that the expansion of the PnoLEA gene family might have been caused by fragment duplication. In addition, the Ka/Ks ratios of most homologous gene pairs were less than 1. This result suggests that the PnoLEA genes may have experienced purifying selection during evolution. Thus, we speculate that the lack of the LEA_6 subfamily and the expansion of the LEA_2 subfamily might be involved in the formation of seed recalcitrance.
Expression patterns of PnoLEA genes and their responses to dehydration stress
LEA genes are widely expressed in plant flowers, fruits, seeds, leaves, stems and roots. In cassava, seven MeLEA genes are expressed at low levels in different tissues, and eight, seven and five MeLEA genes are up-regulated in storage roots, stems and leaves, respectively. In potato, StASR-2, StLEA 1–14, StLEA 2–29, StLEA 3–3 and StDHN-3 are up-regulated in different tissues, whereas members of the StLEA1 and StSMP subfamilies are expressed at low levels. The expression of the PnoLEA genes in different tissues was analyzed by hierarchical clustering using publicly available RNA-seq data (Fig. 6). Four genes (PnoLEA5, PnoLEA12, PnoLEA46 and PnoLEA54) were highly expressed in all tissues (Fig. 6). Seeds formed a single cluster in the hierarchical cluster analysis of the different tissues, and five genes (PnoLEA33, PnoLEA39, PnoLEA43, PnoLEA58 and PnoLEA59) were specifically expressed in seeds. These results suggest that the expression of PnoLEA gene family members is tissue specific. In wheat, 93 of 121 TaLEA genes are highly expressed in the grain, including 35 of the 47 members of the DHN subfamily. The flax genome contains 50 LEA genes, of which 42 LuLEA genes are expressed at all stages in Heiya No.14. In contrast, few PnoLEA genes were expressed in seeds, and only 8 genes were expressed at high levels in the seeds (Fig. 6). We presume that the low transcriptional levels of most PnoLEA genes might be associated with the recalcitrance of P. notoginseng seeds.
LEA genes respond widely to abiotic stresses in plants, such as low temperature, drought and salt stress. Overexpression of the OsEm1 gene in rice increases osmotic tolerance under drought stress. In tomato, genes of the LEA_1, LEA_2, LEA_4 and DHN subfamilies are up-regulated in response to drought and salt stress treatments. In A. thaliana, genes of the LEA_1, LEA_3, LEA_4 and DHN subfamilies also respond to drought stress. Similarly, LEA proteins accumulate abundantly in maize seeds under dehydration stress. The CsLEA genes of tea tree are highly expressed in response to dehydration stress. In our study, only a few genes were up-regulated in response to dehydration stress (Fig. 8). PnoLEA46, PnoLEA58 and PnoLEA59 of the LEA_2 subfamily were highly expressed in seeds, but their expression decreased under dehydration stress (Fig. 8 and Fig. 9). The expression of PnoLEA43 and PnoLEA54 of the DHN subfamily in seeds increased under dehydration stress. Previous studies have shown that the alpha-helical structure of DHN proteins is amphiphilic and binds to the plasma membrane to prevent cell dehydration. These results suggest that the DHN subfamily plays an important role in the response to dehydration stress in P. notoginseng seeds and that the expansion of the LEA_2 subfamily may have led to its non-functionalization or functional divergence. The germination rate of P. notoginseng seeds increased after a short period of dehydration stress but decreased as the duration of dehydration stress increased (Fig. 7). This result is consistent with previous studies showing that appropriate dehydration stress can promote the germination of recalcitrant seeds [58, 59]. The germination rate of recalcitrant Quercus wutaishanica seeds likewise increased under light dehydration stress and decreased with increasing duration of dehydration stress. The expression of PnoLEA33 (SMP) increased significantly under appropriate dehydration stress (Fig. 8 and Fig. 9). We believe that the increased expression of PnoLEA33 might promote the accumulation of seed maturation proteins, thus increasing the germination rate of P. notoginseng seeds. We presume that the protective effect of the LEA proteins might be limited and that cells have been irreversibly damaged by the time P. notoginseng seeds reach the critical water content (15%). This evidence indicates that LEA proteins are essential in seed germination and in the response of recalcitrant seeds to dehydration stress, and it provides new insights into the handling of recalcitrant seeds in agricultural production and storage.
In summary, we identified 61 PnoLEA genes in the P. notoginseng genome. They were divided into seven subfamilies based on phylogenetic relationships, gene structures and conserved protein domains. Most members of the PnoLEA gene family have few introns, which could be related to a rapid response to dehydration stress. We believe that the lack of the LEA_6 subfamily, the expansion of the LEA_2 subfamily and the low transcriptional levels of most PnoLEA genes might be involved in the formation of seed recalcitrance in P. notoginseng. LEA proteins are critical in the response to dehydration stress in recalcitrant seeds, and the protective role of LEA proteins is closely related to the degree of recalcitrance of the seeds. These findings improve our understanding of the role of LEA proteins in the response of recalcitrant seeds to dehydration stress.
Genome-wide identification of the PnoLEA genes
The genomic data of P. notoginseng were obtained from the Herbal Medicine Omics Database (http://herbalplant.ynau.edu.cn). The Hidden Markov Model (HMM) profiles of the LEA domains (PF00257, PF00477, PF02987, PF03168, PF03242, PF03760, PF04927 and PF10714) were downloaded from the InterPro database (https://www.ebi.ac.uk/interpro). The local database of P. notoginseng proteins was scanned with these profiles using HMMER. LEA proteins of A. thaliana were downloaded from TAIR (https://www.arabidopsis.org), and the AtLEA protein sequences were used as queries in a BLAST search against the P. notoginseng protein sequences. The initial candidate LEA proteins of P. notoginseng were obtained by combining the BLAST and HMMER results. The SMART tool (http://smart.embl-heidelberg.de) was used to check the candidate sequences for conserved domains, and protein sequences lacking LEA domains were removed. The Expasy website (https://web.expasy.org) was used to calculate the isoelectric point (pI) and molecular weight of the LEA proteins. The BUSCA annotation system (https://busca.biocomp.unibo.it) was used to predict the subcellular localization of the LEA proteins.
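The candidate-selection logic described in this paragraph can be sketched in a few lines of Python. The sketch assumes that "combining" the BLAST and HMMER results means taking the union of the two hit lists, which the text does not state explicitly, and that the domain check is available as a simple lookup table. The function name and all protein identifiers are hypothetical.

```python
def select_lea_candidates(hmmer_hits, blast_hits, has_lea_domain):
    """Merge HMMER and BLAST hits, then keep proteins with a confirmed LEA domain.

    hmmer_hits, blast_hits: iterables of protein identifiers.
    has_lea_domain: dict mapping identifier -> bool (for example, the outcome
    of a manual domain check); identifiers missing from it are discarded.
    """
    # Assumption: candidates are the union of both search results.
    candidates = set(hmmer_hits) | set(blast_hits)
    return sorted(pid for pid in candidates if has_lea_domain.get(pid, False))


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    hmmer = ["prot_A", "prot_B", "prot_C"]
    blast = ["prot_B", "prot_D"]
    domain_check = {"prot_A": True, "prot_B": True, "prot_C": False, "prot_D": True}
    print(select_lea_candidates(hmmer, blast, domain_check))
```

If the intersection of the two searches were intended instead, the `|` operator would be replaced by `&`; the rest of the filtering stays the same.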
Phylogenetic and gene structure analysis of PnoLEA genes
The full-length amino acid sequences of the LEA proteins of P. notoginseng and A. thaliana were aligned using the MAFFT program. The phylogenetic tree was constructed in MEGA11 using the neighbor-joining method with 1000 bootstrap replicates. The exons and introns of the gene sequences were examined and visualized with R. The MEME Suite was used to predict the conserved motifs of the PnoLEA proteins.
Chromosomal distribution and gene duplication of the PnoLEA genes
We used TBtools to map the distribution of the PnoLEA genes on the chromosomes. MCScanX was used to identify duplication events among the PnoLEA genes. KaKs_Calculator was used to calculate the ratio of non-synonymous to synonymous substitution rates (Ka/Ks) for duplicated gene pairs in P. notoginseng. In addition, the genomic sequences of four species (A. thaliana, Oryza sativa, Solanum lycopersicum and Zea mays) were obtained from the Ensembl Plants database (https://plants.ensembl.org/index.html). The collinearity of LEA homologous genes between P. notoginseng and the four species was analyzed using MCScanX and visualized with the Dual Synteny Plotter of TBtools.
Plant material and dehydration stress
Three-year-old P. notoginseng plants grown in experimental fields in Wenshan Miao Xiang, Yunnan Province, China, were used as experimental material. Mature seeds were picked in November. The red peel was removed, and the seeds were disinfected with 5% CuSO4 for 30 min, washed twice with double-distilled water (ddH2O) and dried indoors in the shade. The seeds were then dehydrated with silica gel for 3 h, 6 h, 12 h or 24 h. The ratio of seed weight to silica gel weight was 1:10, and the silica gel was changed every 6 h. The seeds were placed in well-ventilated, fully meshed baskets and stratified in wet sand indoors at 15 ± 5 °C, with the seed water content kept at about 60%. After 50 days, the number of germinated seeds was counted.
Expression analysis of PnoLEA genes
The transcriptome data of different tissues of P. notoginseng were obtained from public databases. The RNA-Seq data of roots, stems, leaves and flowers of P. notoginseng were downloaded from the NCBI public database (accession numbers: SRR7427743, SRR7427744, SRR7427758, SRR7427747, SRR7427748, SRR7427750, SRR7427752, SRR7427755, SRR7427756, SRR7427751, SRR7427754, SRR7427757); the related BioProject is available at https://www.ncbi.nlm.nih.gov/bioproject/PRJNA477910. The RNA-Seq data of P. notoginseng seeds were obtained from the NGDC database (GSA: CRA008378, https://ngdc.cncb.ac.cn).
Seeds from the 3 h, 12 h and 24 h dehydration treatments and the control (CK) were used for RNA extraction. Total RNA was extracted using the Plant Plus Kit (Tiangen, Beijing, China) according to the manufacturer’s protocol, with three replicates. RNA samples were tested for degradation and impurities by 1% agarose gel electrophoresis. RNA quality was assessed on an Agilent 2100 Bioanalyzer (Agilent Technologies, Palo Alto, CA, USA) and checked using RNase-free agarose gel electrophoresis. Sequencing libraries were generated using the NEBNext® Ultra™ RNA Library Prep Kit for Illumina® (NEB, USA) following the manufacturer’s recommendations, and index codes were added to attribute sequences to each sample. The cDNA libraries were sequenced on the Illumina HiSeq platform. The raw data have been submitted to the NGDC database under the GSA accession number CRA010115.
The raw reads were cleaned using Trimmomatic (version 0.39) to remove low-quality reads and reads containing adapters. Gene expression levels were quantified using Salmon, and the mean of three biological replicates was transformed as log2(TPM + 1). Hierarchical clustering of the expression levels was visualized using the R package pheatmap.
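The expression transformation used for the heatmaps can be illustrated with a minimal Python sketch. It assumes that the three replicate TPM values are averaged first and the mean is then transformed as log2(TPM + 1); the pseudocount of 1 keeps genes with zero expression at a value of 0. The function name and the TPM values are hypothetical.

```python
import math


def log2_mean_tpm(replicate_tpms):
    """Average replicate TPM values, then apply the log2(TPM + 1) transform."""
    values = list(replicate_tpms)
    if not values:
        raise ValueError("at least one replicate is required")
    if any(v < 0 for v in values):
        raise ValueError("TPM values cannot be negative")
    mean_tpm = sum(values) / len(values)
    # The +1 pseudocount avoids log2(0) for unexpressed genes.
    return math.log2(mean_tpm + 1.0)


if __name__ == "__main__":
    # Hypothetical TPM values for one gene in three biological replicates.
    print(log2_mean_tpm([1.0, 7.0, 13.0]))
```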
Total RNA extraction and qRT-PCR analysis
To determine the expression of the PnoLEA genes under different levels of dehydration stress and to explore the relationship between LEA proteins and dehydration sensitivity, qRT-PCR analysis was performed on the PnoLEA genes. Total RNA of P. notoginseng seeds was extracted using a TaKaRa MiniBEST Plant RNA Extraction Kit. RNA was reverse transcribed into cDNA using the PrimeScript RT kit (Takara Bio, Kyoto, Japan). The Premier 3.0 software was used to design the qRT-PCR primers, which were synthesized by Shanghai Generay Biotech Co., Ltd. (Shanghai, China). The primer sequences are shown in Table S2. The reference gene for P. notoginseng seeds was GLYCERALDEHYDE-3-PHOSPHATE DEHYDROGENASE (GAPDH). qRT-PCR was run on the QuantStudio 12K Flex System (Thermo Fisher Scientific) with three technical replicates. The relative expression levels of the LEA genes were calculated using the 2^(−ΔΔCt) method.
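The relative-expression calculation can be written out as a short Python sketch. It assumes the usual form of the method: the target Ct is normalized to the reference gene (GAPDH here) within each sample, the control ΔCt is subtracted from the treated ΔCt, and the fold change is 2 raised to the negative of that difference. The function name and the Ct values are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^(-ddCt) method."""
    # Normalize the target gene to the reference gene within each sample.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical mean Ct values: the target amplifies two cycles earlier
    # after treatment while the reference gene is unchanged.
    print(relative_expression(22.0, 18.0, 24.0, 18.0))
```

A value above 1 indicates higher expression in the treated sample than in the control, and a value below 1 indicates lower expression.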
Availability of data and materials
The raw RNA-Seq data in different tissues (root, stem, leaf and flower) of P. notoginseng are available in the NCBI database under the Bioproject accession number PRJNA477910 (https://www.ncbi.nlm.nih.gov/bioproject/PRJNA477910). All data generated or analyzed during this study are included in this published article and its supplementary information files. The raw sequencing data of P. notoginseng seeds for this study have been deposited in the Genome Sequence Archive in BIG Data Center (https://bigd.big.ac.cn/), Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, under the accession number: CRA008378, CRA010115. Other data generated or analyzed during this study are included in this published article and its supplementary information files. Hoo & Tseng firstly undertook the formal identification of the plant material Panax notoginseng (Burkill) (Journal of Systematics and Evolution 11: 435, 1973) in Flora of China.
Abbreviations
Ka: Non-synonymous substitution rate
Ks: Synonymous substitution rate
LEA: Late embryogenesis abundant
qRT-PCR: Quantitative real-time PCR
SMP: Seed maturation protein
TPM: Transcripts per kilobase of exon model per million mapped reads
Yu J, Lai YM, Wu X, et al. Overexpression of OsEm1 encoding a group I LEA protein confers enhanced drought tolerance in rice. Biochem Biophys Res Commun. 2016;478(2):703–9.
Li X, Cao J. Late Embryogenesis Abundant (LEA) gene family in maize: identification, evolution, and expression profiles. Plant Mol Biol Report. 2016;34(1):15–28.
Dalal M, Tayal D, Chinnusamy V, et al. Abiotic stress and ABA-inducible Group 4 LEA from Brassica napus plays a key role in salt and drought tolerance. J Biotechnol. 2009;139(2):137–45.
Sasaki K, Christov NK, Tsuda S, et al. Identification of a Novel LEA Protein Involved in Freezing Tolerance in Wheat. Plant Cell Physiol. 2013;55(1):136–47.
Hundertmark M, Hincha DK. LEA (late embryogenesis abundant) proteins and their encoding genes in Arabidopsis thaliana. BMC Genomics. 2008;9(1):1–22.
Wu C, Hu W, Yan Y, et al. The late embryogenesis abundant protein family in cassava (Manihot esculenta Crantz): genome-wide characterization and expression during abiotic stress. Molecules. 2018;23(5):1196.
Wang M, Li P, Li C, et al. SiLEA14, a novel atypical LEA protein, confers abiotic stress resistance in foxtail millet. BMC Plant Biol. 2014;14:290.
Wang W, Gao T, Chen J, et al. The late embryogenesis abundant gene family in tea plant (Camellia sinensis): Genome-wide characterization and expression analysis in response to cold and dehydration stress. Plant Physiol Biochem. 2019;135:277–86.
Dure L, Greenway SC, Galau GA. Developmental biochemistry of cottonseed embryogenesis and germination: changing messenger ribonucleic acid populations as shown by in vitro and in vivo protein synthesis. Biochemistry. 1981;20(14):4162–8.
Shao HB, Liang ZS, Shao MA. LEA proteins in higher plants: structure, function, gene expression and regulation. Colloids Surf B Biointerfaces. 2005;45(3–4):131–5.
Chen J, Li N, Wang X, et al. Late embryogenesis abundant (LEA) gene family in Salvia miltiorrhiza: identification, expression analysis, and response to drought stress. Plant Signal Behav. 2021;16(5):1891769.
Hunault G, Jaspard E. LEAPdb: a database for the late embryogenesis abundant proteins. BMC Genomics. 2010;11(1):1–9.
Battaglia M, Olvera-Carrillo Y, Garciarrubio A, et al. The enigmatic LEA proteins and other hydrophilins. Plant Physiol. 2008;148(1):6–24.
Candat A, Paszkiewicz G, Neveu M, et al. The ubiquitous distribution of late embryogenesis abundant proteins across cell compartments in Arabidopsis offers tailored protection against abiotic stress. Plant Cell. 2014;26(7):3148–66.
Amara I, Zaidi I, Masmoudi K, et al. Insights into late embryogenesis abundant (LEA) proteins in plants: from structure to the functions. Am J Plant Sci. 2014;5(22):3440–55.
Tompa P, Bánki P, Bokor M, et al. Protein-water and protein-buffer interactions in the aqueous solution of an intrinsically unstructured plant dehydrin: NMR intensity and DSC aspects. Biophys J. 2006;91(6):2243–9.
Chakrabortee S, Tripathi R, Watson M, et al. Intrinsically disordered proteins as molecular shields. Mol BioSyst. 2012;8(1):210–9.
Dalal M, Tayal D, Chinnusamy V, et al. Abiotic stress and ABA-inducible Group 4 LEA from Brassica napus plays a key role in salt and drought tolerance. J Biotechnol. 2009;139(2):137–45.
Liu H, Yu C, Li H, et al. Overexpression of ShDHN, a dehydrin gene from Solanum habrochaites enhances tolerance to multiple abiotic stresses in tomato. Plant Sci. 2015;231:198–211.
Falavigna VdS, Malabarba J, Silveira CP, et al. Characterization of the nucellus-specific dehydrin MdoDHN11 demonstrates its involvement in the tolerance to water deficit. Plant Cell Reports. 2019;38(9):1099–107.
Roberts EH. Predicting the storage life of seeds. Seed Sci Tech. 1973;1(1):499–514.
Ellis RH, Hong TD, Roberts EH. An intermediate category of seed storage behaviour? I. COFFEE. J Exp Bot. 1990;41(9):1167–74.
Pammenter N, Berjak P. A review of recalcitrant seed physiology in relation to desiccation-tolerance mechanisms. Seed Sci Res. 1999;9(1):13–37.
Feng J, Shen Y, Shi F. Study on desiccation sensitivity of Ginkgo biloba seeds. J Nanjing Forestry University. 2019;43(6):193–200.
Kundu M, Tiwari S, Haldkar M. Collection, germination and storage of seeds of Saraca asoca (Roxb.) Willd. J Applied Res Med Aromatic Plants. 2020;16:100231.
Marques A, Buijs G, Ligterink W, et al. Evolutionary ecophysiology of seed desiccation sensitivity. Funct Plant Biol. 2018;45(11):1083–95.
Leprince O, Pellizzaro A, Berriri S, et al. Late seed maturation: drying without dying. J Exp Bot. 2017;68(4):827–41.
Oliver MJ, Farrant JM, Hilhorst HW, et al. Desiccation tolerance: avoiding cellular damage during drying and rehydration. Annu Rev Plant Biol. 2020;71:435–60.
Farrant JM, Pammenter NW, Berjak P, et al. Presence of dehydrin-like proteins and levels of abscisic acid in recalcitrant (desiccation sensitive) seeds may be related to habitat. Seed Sci Res. 1996;6(4):175–82.
Wang WQ, Ye JQ, Rogowska W, et al. Proteomic comparison between maturation drying and prematurely imposed drying of Zea mays seeds reveals a potential role of maturation drying in preparing proteins for seed germination, seedling vigor, and pathogen resistance. J Proteome Res. 2014;13(2):606–26.
Farrant JM, Pammenter N, Berjak P. Seed development in relation to desiccation tolerance: a comparison between desiccation-sensitive (recalcitrant) seeds of Avicennia marina and desiccation-tolerant types. Seed Sci Res. 1993;3(1):1–13.
Delahaie J, Hundertmark M, Bove J, et al. LEA polypeptide profiling of recalcitrant and orthodox legume seeds reveals ABI3-regulated LEA protein abundance linked to desiccation tolerance. J Exp Bot. 2013;64(14):4559–73.
Jin XF, Liu DD, Ma LL, et al. Transcriptome and expression profiling analysis of recalcitrant tea (Camellia sinensis L.) seeds sensitive to dehydration. Int J Genomics. 2018;2018:1–11.
Chen JW, Kuang SB, Long GQ, et al. Photosynthesis, light energy partitioning, and photoprotection in the shade-demanding species Panax notoginseng under high and low level of growth irradiance. Funct Plant Biol. 2016;43(6):479–91.
Yang K, Li L, Long GQ, et al. Changes of antioxidant enzyme and ultrastructure in recalcitrant seeds of Panax notoginseng during after-ripening process. Guihaia. 2016;36(12):1519–25.
Duan CL, Li ZT, Ding JL, et al. Physiologic characteristics of Panax notoginseng seeds during after-ripening process. China J Chin Materia Med. 2010;35(20):2652–6.
Li L, Sun XT, Zhang GH, et al. Effect of drying rates on the desiccation sensitivity and antioxidant enzyme activities of recalcitrant Panax notoginseng seeds. Seed. 2014;33(12):1–5.
Yang K, Yang L, Fan W, et al. Illumina-based transcriptomic analysis on recalcitrant seeds of Panax notoginseng for the dormancy release during the after-ripening process. Physiol Plant. 2019;167(4):597–612.
Ge N, Yang K, Yang L, et al. iTRAQ and RNA-seq analyses provide an insight into mechanisms of recalcitrance in a medicinal plant Panax notoginseng seeds during the after-ripening process. Funct Plant Biol. 2021;49(1):68–88.
Wang W, Vinocur B, Altman A. Plant responses to drought, salinity and extreme temperatures: towards genetic engineering for stress tolerance. Planta. 2003;218(1):1–14.
Cao J, Li X. Identification and phylogenetic analysis of late embryogenesis abundant proteins family in tomato (Solanum lycopersicum). Planta. 2015;241(3):757–72.
Nagaraju M, Kumar SA, Reddy PS, et al. Genome-scale identification, classification, and tissue specific expression analysis of late embryogenesis abundant (LEA) genes under abiotic stress conditions in Sorghum bicolor L. PLoS ONE. 2019;14(1):e0209980.
Zan T, Li LQ, Li JT, et al. Genome-wide identification and characterization of late embryogenesis abundant protein-encoding gene family in wheat: Evolution and expression profiles during development and stress. Gene. 2020;736:144422.
Cheng Z, Zhang X, Yao W, et al. Genome-wide search and structural and functional analyses for late embryogenesis abundant (LEA) gene family in poplar. BMC Plant Biol. 2021;21(1):110.
Jeffares DC, Penkett CJ, Bahler J. Rapidly regulated genes are intron poor. Trends Genet. 2008;24(8):375–8.
Yao F, Song C, Wang H, et al. Genome-wide characterization of the HSP20 gene family identifies potential members involved in temperature stress response in Apple. Frontiers in Genetics. 2020;11:609184.
Shih MD, Hoekstra FA, Hsing YIC. Late embryogenesis abundant proteins. Adv Bot Res. 2008;48:211–55.
Li Z, Chi H, Liu CY, et al. Genome-wide identification and functional characterization of LEA genes during seed development process in linseed flax (Linum usitatissimum L.). BMC Plant Biol. 2021;21(1):193.
Liu H, Xing MY, Yang WB, et al. Genome-wide identification of and functional insights into the late embryogenesis abundant (LEA) gene family in bread wheat (Triticum aestivum). Sci Rep. 2019;9(1):13375.
Jin XF, Cao D, Wang ZJ, et al. Genome-wide identification and expression analyses of the LEA protein gene family in tea plant reveal their involvement in seed development and abiotic stress responses. Sci Rep. 2019;9(1):14123.
Wang XS, Zhu HB, Jin GL, et al. Genome-scale identification and analysis of LEA genes in rice (Oryza sativa L.). Plant Sci. 2007;172(2):414–20.
Artur MAS, Zhao T, Ligterink W, et al. Dissecting the genomic diversification of late embryogenesis abundant (LEA) protein gene families in plants. Genome Biol Evol. 2019;11(2):459–71.
Freeling M. Bias in plant gene content following different sorts of duplication: tandem, whole-genome, segmental, or by transposition. Annu Rev Plant Biol. 2009;60:433–53.
Chen Y, Li C, Zhang B, et al. The role of the late embryogenesis-abundant (LEA) protein family in development and the abiotic stress response: a comprehensive expression analysis of potato (Solanum tuberosum). Genes. 2019;10(2):148.
Liu D, Sun J, Zhu D, et al. Genome-wide identification and expression profiles of late embryogenesis-abundant (LEA) genes during grain maturation in wheat (Triticum aestivum L.). Genes. 2019;10(9):696.
Hundertmark M, Buitink J, Leprince O, et al. The reduction of seed-specific dehydrins reduces seed longevity in Arabidopsis thaliana. Seed Sci Res. 2011;21(3):165–73.
Danyluk J, Perron A, Houde M, et al. Accumulation of an acidic dehydrin in the vicinity of the plasma membrane during cold acclimation of wheat. Plant Cell. 1998;10(4):623–38.
Konstantinidou E, Takos I, Merou T. Desiccation and storage behavior of bay laurel (Laurus nobilis L.) seeds. European J Forest Res. 2008;127(2):125–31.
Cheng JM, Yan XF. Recalcitrance of Quercus wutaishanica seeds—Sensitivity to desiccation and low temperature. Guihaia. 2019;39(12):1691–701.
Chen W, Kui L, Zhang GH, et al. Whole-genome sequencing and analysis of the Chinese herbal plant Panax notoginseng. Mol Plant. 2017;10(6):899–902.
Farrar M. Striped Smith-Waterman speeds database searches six times over other SIMD implementations. Bioinformatics (Oxford, England). 2007;23(2):156–61.
Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–80.
Tamura K, Stecher G, Kumar S. MEGA11: molecular evolutionary genetics analysis version 11. Mol Biol Evol. 2021;38(7):3022–7.
Bailey TL, Johnson J, Grant CE, et al. The MEME Suite. Nucleic Acids Res. 2015;43(W1):W39-49.
Chen C, Chen H, Zhang Y, et al. TBtools: an integrative toolkit developed for interactive analyses of big biological data. Mol Plant. 2020;13(8):1194–202.
Wang Y, Tang H, DeBarry JD, et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 2012;40(7):e49–e49.
Zhang Z. KaKs_calculator 3.0: Calculating selective pressure on coding and non-coding sequences. Genomics Proteomics Bioinformatics. 2022;20(3):536–40.
Wei G, Wei F, Yuan C, et al. Integrated chemical and transcriptomic analysis reveals the distribution of protopanaxadiol- and protopanaxatriol-type saponins in Panax notoginseng. Molecules. 2018;23(7):1773.
Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20.
Patro R, Duggal G, Love MI, et al. Salmon provides fast and bias-aware quantification of transcript expression. Nat Methods. 2017;14(4):417–9.
Untergasser A, Cutcutache I, Koressaar T, et al. Primer3–new capabilities and interfaces. Nucleic Acids Res. 2012;40(15):e115-e115.
Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2−ΔΔCT method. Methods. 2001;25(4):402–8.
We are grateful to GSX and ZC for their assistance with data analysis.
This work was funded by the National Natural Science Foundation of China (32160248 and 81860676), the Major Special Science and Technology Project of Yunnan Province (202102AA310048), the National Key Research and Development Plan of China (2021YFD1601003), and the Innovative Research Team of Science and Technology in Yunnan Province (202105AE160016).
Ethics approval and consent to participate
Not applicable. The authors declare that permission to collect Panax notoginseng material was obtained, and that the experimental research on the plants described in this paper complies with institutional, national and international guidelines.
Consent for publication
The authors report no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional file 1:
Figure S1. The neighbor-joining (NJ) phylogenetic tree of PnoLEA proteins. The PnoLEA subfamilies are shown in different colors. The tree was constructed from the amino acid sequences of the identified PnoLEA proteins with 1000 bootstrap replicates.
Additional file 2:
Table S1. The 19 pairs of duplication events among the LEA genes of P. notoginseng and their Ka/Ks ratios.
Additional file 3:
Figure S2. Collinearity map of the PnoLEA genes between P. notoginseng and four other species. The blue lines denote collinearity between the PnoLEA genes and the other species, while the gray lines represent collinearity between the P. notoginseng genome and the other species.
Additional file 4:
Table S2. Primers designed for quantitative real-time PCR (qRT-PCR) in P. notoginseng.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Jia, JS., Ge, N., Wang, QY. et al. Genome-wide identification and characterization of members of the LEA gene family in Panax notoginseng and their transcriptional responses to dehydration of recalcitrant seeds. BMC Genomics 24, 126 (2023). https://doi.org/10.1186/s12864-023-09229-0
- Expression patterns
- Dehydration stress
- Recalcitrant seeds
- Panax notoginseng