Genome-wide analysis of the P450 gene family in tea plant (Camellia sinensis) reveals functional diversity in abiotic stress
BMC Genomics volume 24, Article number: 535 (2023)
Cytochrome P450 (P450) genes encode enzymes that catalyze a wide range of reactions involved in growth, development, and secondary metabolite biosynthesis. However, little is known about the characteristics and functions of the P450 gene family in Camellia sinensis (C. sinensis).
To reveal how tea plant P450s respond to abiotic stresses, the tea plant P450 gene family was analyzed using bioinformatics-based methods. In total, 273 putative P450 genes were identified from the genome database of C. sinensis. The P450s were distributed across chromosomes I to XV of the genome, with amino acid lengths of 268–612 aa, molecular weights of 30.95–68.5 kDa, and isoelectric points of 4.93–10.17. Phylogenetic analysis divided CsP450s into 34 subfamilies, of which CYP71 was the most abundant. Predicted subcellular localization showed that the P450s were distributed in a variety of organelles, most frequently the chloroplasts, plasma membrane, and cytoplasm. The promoter regions of CsP450s contained various cis-acting elements related to phytohormones and stress responses. In addition, ten conserved motifs (Motif1-Motif10) were identified in the CsP450 family proteins, and 27 genes lacked introns and contained only one exon. Gene duplication analysis identified 37 pairs of tandemly duplicated genes. Interaction network analysis showed that CsP450s could interact with multiple types of target genes, and that protein interactions also occur within the family. Tissue expression analysis showed that P450s were highly expressed in roots and stems. Moreover, qPCR analysis of relative gene expression under drought and cold stress was consistent with the sequencing results.
This study lays the foundation for resolving the classification and functional study of P450 family genes and provides a reference for the molecular breeding of C. sinensis.
Cytochrome P450s (CYPs) are the largest enzyme family involved in NADPH- and/or O2-dependent hydroxylation reactions, and are ubiquitous across all domains of life. P450 enzymes are present in all plant species and play important roles in plant growth, development, and adaptation to the environment. In terrestrial environments, the conserved P450 families support chemical defence mechanisms, and a number of them participate in the synthesis and catabolism of hormones. Furthermore, by boosting the activity of compounds with higher antioxidant activity (such as flavonoids), CYPs are also implicated in protecting plants from harsh environmental conditions [4, 5]. Species-specific P450 families are required for the biosynthesis of species-specific metabolites. P450 enzymes are named with the prefix "CYP" followed by the family number and then a letter that designates the subfamily. Their amino acid sequences are extremely diverse, with similarities as low as 16% in some cases, but their structural fold has remained conserved throughout evolution.
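The naming convention described above can be illustrated with a short parser. This is a minimal sketch, not part of the original analysis; the function name `parse_cyp_name` and the optional trailing gene number (as in CYP94B3) are assumptions made for illustration.

```python
import re

# "CYP" + family number + optional subfamily letter(s) + optional gene number,
# e.g. "CYP71", "CYP707A", "CYP94B3".
CYP_NAME = re.compile(r"^CYP(\d+)([A-Z]+)?(\d+)?$")


def parse_cyp_name(name):
    """Split a CYP-style name into (family, subfamily, member).

    Parts that are absent from the name are returned as None.
    """
    match = CYP_NAME.match(name.strip().upper())
    if match is None:
        raise ValueError(f"not a CYP-style name: {name!r}")
    family, subfamily, member = match.groups()
    return int(family), subfamily, int(member) if member else None
```

For example, `parse_cyp_name("CYP94B3")` separates family 94, subfamily B, and gene number 3, whereas `parse_cyp_name("CYP71")` returns only the family.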
With the development of next-generation sequencing (NGS) technology, a large number of plant genomes have been published, which has facilitated the identification of gene families. As one of the largest gene superfamilies in plant genomes, P450s are represented by more than 300,000 gene sequences in databases, including more than 16,000 plant P450s. Nonetheless, identifying P450 gene family members remains a significant challenge because of their sheer number: they account for no less than 1% of the total annotated genes in plant genomes. Consequently, comparatively few P450 gene families have been identified. Research has shown that Arabidopsis thaliana (A. thaliana) has 246 P450 genes, making it the third-largest gene family in A. thaliana. The number of P450 genes in other plants is also relatively high, such as 457 in grape (Vitis vinifera), 332 in soybean (Glycine max), 312 in poplar (Populus trichocarpa), 356 in rice (Oryza sativa), 372 in sorghum (Sorghum bicolor), 233 in tomato (Solanum lycopersicum), 174 in mulberry (Morus notabilis), 334 in flax (Linum usitatissimum L.), 263 in tobacco (Nicotiana tabacum), and 258 in Chinese cabbage (Brassica rapa L.). Therefore, whole-genome analysis and co-expression networks of P450 gene families can help to determine the functions of P450s and understand the evolution of these multifunctional enzymes.
P450 enzymes are classified into different subfamilies based on their amino acid sequence and function. Plant P450s have been shown to participate in various biochemical pathways that produce primary and secondary metabolites, such as phenylpropanoids, alkaloids, terpenoids, lipids, cyanoglycosides, and polyols, as well as plant hormones. For example, the CYP90, CYP724, and CYP734 families are involved in the biosynthesis of steroidal saponins and glycoalkaloids. P450 enzymes can also participate in the regulation of plant growth and development through hormone metabolism: CYP735As are involved in the biosynthesis of cytokinins; CYP707A in the catabolism of abscisic acid; CYP701A, CYP88AC, CYP714A1, CYP714D1, and CYP714A2 in the synthesis and inactivation of gibberellins [23, 24]; CYP85A, CYP90A, CYP90B, CYP90C, CYP90D, CYP724B, and CYP734A in the biosynthesis of brassinosteroids [25,26,27]; and CYP74A, CYP74B, CYP94B3, and CYP94C1 in the synthesis and metabolism of jasmonic acid [28,29,30].
P450 enzymes have also been shown to play a role in plant stress responses, including responses to abiotic stress (such as drought and extreme temperatures) and biotic stress (such as insect and pathogen attacks) [31, 32]. For instance, after Xanthomonas axonopodis infection, the CYP gene CaCYP1 from Capsicum annuum was found to be implicated in the hypersensitive response. The Arabidopsis CYP gene AtCYP76C2 is linked to hypersensitive rapid cell death, a defensive mechanism against bacterial canker (Pseudomonas syringae) infection. Such CYP genes are excellent candidates for engineering agricultural species for resistance to biotic and abiotic stress. In addition, P450 genes have been found to be involved in the response to heavy metal stress. Overall, the P450 gene family plays a key role in the metabolism of various compounds in plants, and understanding the functions of these enzymes is important for studying plant biology and developing new plant-derived products.
Tea (Camellia sinensis) is one of the most important beverage crops in the world, with significant economic and health benefits. With the publication of the tea genome, over 80 tea gene families have been identified, such as HDAC, PMF, PLD, and MAPK, as well as the transcription factor families NAC, bZIP, TCP, and MYB [40,41,42,43]. However, few P450 genes from tea have been reported and functionally annotated. Moreover, to date, there have been no reports of a whole-genome study of these genes. Therefore, in this study, we identified the members of the P450 gene family in the whole genome of tea using bioinformatics methods, grouped P450 genes with important functions, and analyzed the physicochemical properties, structure, and expression patterns of all members to understand the molecular evolution of P450 genes and provide a reference for functional characterization of important candidate genes. Furthermore, this investigation has implications for the genetic improvement of tea growth, development, yield, and resistance to pests and diseases through the use of this gene family.
Materials and methods
Identification of P450 genes in tea plant genome
In this study, we aimed to identify and characterize P450 genes in the tea plant (C. sinensis) genome. To achieve this, we downloaded the HMM (Hidden Markov Model) file for the typical conserved domain of P450 genes (PF00067) from the Pfam 35.0 protein family database (http://pfam.xfam.org). We then used the HMMER3.0 software to perform a comparative search of all protein sequences in the tea plant genome database (http://tpia.teaplant.org).
To increase the accuracy of our search, we obtained 238 AtP450 protein sequences from the TAIR website (https://www.arabidopsis.org/) and used them as queries to perform a local BLAST search in the tea plant genome database (with an E-value cutoff of 10^-3). We then removed candidate protein sequences with incomplete domain structures using the NCBI-CDD (http://www.Ncbi.Nlm.Nih.Gov/Structure/cdd/wrpsb.cgi) and SMART (http://smart.embl-heidelberg.de/) domain detection tools, resulting in the final set of CsP450 protein sequences.
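The candidate-selection logic of this step can be sketched as follows. This is an illustrative sketch only: the text does not state whether the HMM and BLAST hit lists were merged or intersected, so a merge (union) is assumed here, and the function names, the `domain_calls` dictionary, and all protein IDs are hypothetical.

```python
E_VALUE_CUTOFF = 1e-3  # BLAST E-value cutoff stated in the text


def candidate_ids(hmm_hits, blast_hits, cutoff=E_VALUE_CUTOFF):
    """Merge protein IDs from both searches (assumed union).

    hmm_hits:   iterable of protein IDs reported by the HMM search.
    blast_hits: iterable of (subject_id, e_value) pairs from BLAST.
    """
    from_blast = {pid for pid, e_value in blast_hits if e_value <= cutoff}
    return sorted(set(hmm_hits) | from_blast)


def keep_with_domain(candidates, domain_calls, domain="PF00067"):
    """Keep candidates whose domain annotation includes the P450 domain.

    domain_calls: {protein_id: set of domain accessions}, a hypothetical
    summary of the domain-verification results.
    """
    return [pid for pid in candidates if domain in domain_calls.get(pid, ())]


# Hypothetical example data
hmm = ["prot_A", "prot_B"]
blast = [("prot_B", 1e-50), ("prot_C", 2e-5), ("prot_D", 0.5)]
domains = {"prot_A": {"PF00067"}, "prot_B": {"PF00067"}, "prot_C": set()}
verified = keep_with_domain(candidate_ids(hmm, blast), domains)
```

In this toy example, prot_D is dropped by the E-value cutoff and prot_C by the domain check, leaving prot_A and prot_B.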
To further characterize the identified CsP450 protein sequences, we submitted them to ProtParam (http://web.expasy.org/protparam/) and predicted their molecular weight, isoelectric point, and amino acid composition. Finally, we used TBtools (https://github.com/CJChen/TBtools/releases) to locate the CsP450 genes on the tea plant chromosomes and named them according to their positions on the chromosomes.
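Position-based naming of this kind can be sketched in a few lines. The sketch assumes that genes are ordered first by chromosome number and then by start coordinate; the exact ordering rule used by the authors is not stated, and the function name and gene IDs are hypothetical.

```python
def name_by_position(genes, prefix="CsP450"):
    """Assign sequential names by chromosomal position.

    genes: list of (gene_id, chromosome_number, start_coordinate).
    Returns {gene_id: new_name}, numbering from 1.
    """
    ordered = sorted(genes, key=lambda gene: (gene[1], gene[2]))
    return {
        gene_id: f"{prefix}{index}"
        for index, (gene_id, _chrom, _start) in enumerate(ordered, start=1)
    }


# Hypothetical example: three genes on two chromosomes
example = [("gene_x", 2, 500), ("gene_y", 1, 9000), ("gene_z", 1, 150)]
names = name_by_position(example)
```

Here gene_z (chromosome 1, position 150) receives the first name, followed by gene_y and then gene_x.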
Phylogenetic analysis of CsP450s
To classify the gene family members, the CsP450 protein sequences were extracted by ID and aligned with 238 P450 proteins from A. thaliana using Clustal W with default parameters. The resulting alignment was used to construct an unrooted evolutionary tree with the Neighbor-Joining method in MEGA 7 (https://www.megasoftware.net/). The bootstrap value was set to 1000 replicates to assess the robustness of the tree. The resulting tree was further annotated using EvolView (https://www.evolgenius.info/evolview/#login) to enhance its readability and visual presentation.
Analysis of CsP450s gene structure and cis-acting elements
In this study, the CDS and genomic annotation information of the CsP450 gene family was obtained from the tea plant genome database. The Gene Structure Display Server (GSDS, http://gsds.cbi.pku.edu.cn/) was used to generate a schematic representation of the exon–intron structure of the gene family. The MEME online software (https://meme-suite.org/meme/) was used to analyze the conserved motifs of the CsP450 proteins, with the following parameters: a maximum of 10 motifs and an optimum motif width of 6 to 200 amino acid residues. The evolutionary tree, gene structure, and motif analysis of the gene family were combined in a single figure using TBtools to show the gene structure and evolutionary relationships between family members.
To further explore the regulatory elements of the CsP450 gene family, the 2 kb region upstream of the ATG start codon of each CsP450 gene was downloaded from the tea plant database. The PlantCARE online tool (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) was used to predict cis-acting elements in the promoter sequences, and the results were visualized using TBtools.
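The coordinates of such a 2 kb promoter window depend on the strand of the gene. The following sketch shows one way to compute them; it assumes 1-based inclusive coordinates in which the feature begins at the A of the ATG, and the function names are hypothetical rather than taken from any tool used in the study.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def promoter_interval(start, end, strand, chrom_length, upstream=2000):
    """Return the 1-based inclusive (start, end) of the upstream window.

    For "+" genes the window lies before `start`; for "-" genes it lies
    after `end`. The window is clipped at the chromosome ends; None is
    returned if no upstream sequence exists.
    """
    if strand == "+":
        p_start, p_end = max(1, start - upstream), start - 1
    elif strand == "-":
        p_start, p_end = end + 1, min(chrom_length, end + upstream)
    else:
        raise ValueError("strand must be '+' or '-'")
    if p_start > p_end:
        return None
    return p_start, p_end


def reverse_complement(sequence):
    """Reverse-complement a DNA string (needed for minus-strand promoters)."""
    return sequence.translate(COMPLEMENT)[::-1]
```

For a minus-strand gene, the extracted window would be reverse-complemented so that the promoter reads 5' to 3' relative to the gene before it is submitted for cis-element prediction.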
Subcellular localization prediction of CsP450s gene
The WoLF PSORT tool (https://wolfpsort.hgc.jp/) was used to predict the subcellular localization of CsP450-encoded proteins. WoLF PSORT compares features of the input sequence, such as known localization signals and motifs, with those of proteins of known localization, and then assigns a score to each potential subcellular localization site.
Chromosomal localization and genome collinearity analysis of CsP450s gene
Chromosome localization of the gene family was visualized using MapChart (https://academic.oup.com/jhered/article/93/1/77/2187477). Genome-wide collinearity and gene duplication events were analyzed using MCScanX with default parameters. KaKs Calculator 2.0 was used to estimate the non-synonymous substitution rate (Ka), the synonymous substitution rate (Ks), and their ratio (Ka/Ks) for each pair of paralogs. In general, Ka/Ks = 1 reflects neutral selection (as expected for pseudogenes), Ka/Ks < 1 indicates purifying (negative) selection, and Ka/Ks > 1 indicates positive selection.
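The interpretation of Ka/Ks described above amounts to a simple three-way rule, sketched below. The function name is hypothetical, and the handling of Ks = 0 (ratio undefined) is an assumption added for completeness.

```python
def classify_ka_ks(ka, ks):
    """Return (ratio, label) for one paralog pair.

    label is "purifying" (Ka/Ks < 1), "neutral" (Ka/Ks = 1),
    "positive" (Ka/Ks > 1), or "undefined" when Ks is zero.
    """
    if ks == 0:
        return None, "undefined"
    ratio = ka / ks
    if ratio < 1:
        label = "purifying"
    elif ratio > 1:
        label = "positive"
    else:
        label = "neutral"
    return ratio, label
```

In practice, very few pairs yield a ratio of exactly 1, so the neutral class is often judged with a statistical test rather than strict equality; the strict rule is shown here only because it mirrors the definition in the text.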
Protein–protein interaction network analysis of CsP450s
The candidate P450 genes of tea plant were not found in the STRING database (https://string-db.org/). Therefore, we used OrthoVenn2 (https://orthovenn2.bioinfotoolkits.net/home) to search for Arabidopsis homologs of the tea plant P450 genes for further analysis. The protein–protein interaction network was visualized using the Cytoscape (https://cytoscape.org/) network visualization software, where nodes represent proteins and edges represent interactions.
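The ortholog-based transfer of interactions can be sketched as follows. This is an assumption-laden illustration: the data structures, the function name, and all identifiers are hypothetical, and interactions are treated as undirected.

```python
from collections import defaultdict


def transfer_interactions(ortholog_map, reference_edges):
    """Project reference-species interactions onto tea genes.

    ortholog_map:    {tea_gene: reference_gene}
    reference_edges: iterable of (gene_a, gene_b) interactions in the
                     reference species.
    Returns a sorted list of undirected edges in which reference genes
    with tea orthologs are replaced by those tea genes. Edges that touch
    no tea ortholog are discarded.
    """
    tea_by_reference = defaultdict(list)
    for tea_gene, reference_gene in ortholog_map.items():
        tea_by_reference[reference_gene].append(tea_gene)

    edges = set()
    for a, b in reference_edges:
        if a not in tea_by_reference and b not in tea_by_reference:
            continue  # neither partner has a tea ortholog
        for node_a in tea_by_reference.get(a, [a]):
            for node_b in tea_by_reference.get(b, [b]):
                if node_a != node_b:
                    edges.add(tuple(sorted((node_a, node_b))))
    return sorted(edges)


# Hypothetical example: two tea genes share one reference ortholog
orthologs = {"tea_1": "ath_1", "tea_2": "ath_2", "tea_3": "ath_1"}
reference = [("ath_1", "ath_partner"), ("ath_1", "ath_2"), ("ath_x", "ath_y")]
network = transfer_interactions(orthologs, reference)
```

The resulting edge list could then be exported as a two-column table for import into a network viewer.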
In-silico gene expression analysis of CsP450 genes
Illumina RNA-sequencing (RNA-seq) data of tea plant were downloaded from the tea plant genome database (http://tpdb.shengxin.ren/) to examine the relative expression patterns of CsP450s under abiotic stress at several time points (0 h, 24 h, 48 h, and 72 h for PEG treatment; 0 h, 6 h, and 7 d for cold treatment at 4℃) and in different tissues, including apical buds, flowers, fruits, young leaves, mature leaves, old leaves, roots, and stems. The clustering heatmap was drawn using the heatmap tool of the Biotech Cloud Platform (https://cloud.oebiotech.cn/task/detail/heatmap/), with row clustering enabled and FPKM selected as the data preprocessing method.
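Expression heatmaps are commonly drawn from row-scaled values so that genes with very different absolute FPKM levels can be compared. The sketch below shows one widespread convention, a log2(FPKM + 1) transform followed by per-gene z-scores. This is an assumption for illustration: the text does not specify which transformation the heatmap tool applied, and the function name is hypothetical.

```python
import math
import statistics


def row_zscores(fpkm_values):
    """Scale one gene's FPKM values across samples for heatmap display.

    Applies log2(FPKM + 1), then subtracts the row mean and divides by
    the row sample standard deviation. Rows with no variation are
    returned as zeros.
    """
    logged = [math.log2(value + 1) for value in fpkm_values]
    if len(logged) < 2:
        return [0.0] * len(logged)
    mean = statistics.mean(logged)
    sd = statistics.stdev(logged)
    if sd == 0:
        return [0.0] * len(logged)
    return [(value - mean) / sd for value in logged]
```

For example, FPKM values of 0, 1, and 3 become 0, 1, and 2 after the log transform and -1, 0, and 1 after scaling.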
Drought stress was induced in tea plants by treating them with 20% PEG6000 for 24 h, 48 h, and 72 h, while the control sample was collected at 0 h. To investigate the response of CsP450 genes to drought stress, ten CsP450 genes were selected and their expression levels were analyzed using qPCR. Total RNA was extracted from the tea plant samples using the RNAprep Pure Plant Kit (Tianjin, China), and cDNA was synthesized using the PrimeScript® RT reagent kit (Takara, China) according to the manufacturer's instructions. Gene-specific primers were designed using the NCBI database online toolkit (https://www.ncbi.nlm.nih.gov/tools/primer-blast/) and used to amplify the target fragments. The relative expression levels of the selected genes were calculated using the 2^(-ΔΔCt) method. Additionally, cold stress was imposed on the tea plants by treating them at 4℃ for 6 h and 7 d, with samples collected at 0 h as the control. The expression analysis of CsP450 genes was performed with three biological replicates and three technical replicates for all samples.
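The 2^(-ΔΔCt) calculation can be written out explicitly. In this sketch, ΔCt is the target Ct minus the reference-gene Ct within a sample, and ΔΔCt is the treated ΔCt minus the control ΔCt. The reference gene is not named in this section, so the arguments below are generic, and the Ct values in the example are invented for illustration.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the 2^(-ddCt) method."""
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2 ** (-delta_delta)


# Invented Ct values: the target amplifies 3 cycles earlier after
# treatment, while the reference gene is unchanged.
fold_change = relative_expression(22.0, 18.0, 25.0, 18.0)  # 8.0
```

A fold change of 1 means no difference from the control, and values below 1 indicate downregulation.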
Statistical analysis was performed using IBM SPSS Statistics 22 software (IBM, New York, USA) to compare the differences between treatments. All values presented in the figures are expressed as the mean ± standard deviation (SD) of biological triplicates, unless otherwise stated. Two-way analysis of variance (ANOVA) followed by the least significant difference (LSD) test was used, with a significance level of p < 0.05.
Identification and physicochemical analysis of CsP450 gene family
After screening the tea plant genome and verifying the candidates with NCBI-CDD and SMART, 273 candidate P450 genes were identified and designated CsP4501 to CsP450273 according to their chromosomal locations (Table 1). The chromosomal distribution of the P450 genes was found to be well-balanced, with genes located on chromosomes 1 to 15. The P450 protein sequences varied greatly in length, ranging from 268 to 612 amino acids, with molecular weights ranging from 30.95 to 68.5 kDa and isoelectric points ranging from 4.93 to 10.17. Subcellular localization analysis showed that these proteins were mainly localized to chloroplasts, plasma membranes, cytoplasm, endoplasmic reticulum, mitochondria, nuclei, and vacuoles.
Phylogenetic analysis of CsP450 gene family
To gain a deeper understanding of the evolutionary relationships among members of the tea P450 gene family, we conducted multiple sequence alignment of the 273 identified tea P450 proteins with 238 AtP450 protein sequences, followed by cluster analysis to generate a phylogenetic tree (Fig. 1). The phylogenetic tree comprised 34 subfamilies: CYP71, CYP72, CYP73, CYP76, CYP77, CYP78, CYP81, CYP82, CYP84, CYP85, CYP86, CYP87, CYP89, CYP90, CYP93, CYP97, CYP98, CYP94, CYP701, CYP702, CYP703, CYP704, CYP705, CYP706, CYP707, CYP708, CYP709, CYP710, CYP711, CYP714, CYP716, CYP734, CYP749, and MAH. The CYP71 subfamily had the most members, containing 31 tea P450 proteins, while the CYP711 subfamily had the fewest, with only one protein. The CYP702, CYP705, and CYP708 subfamilies had no tea P450 proteins, and there were no AtP450 proteins in the CYP749 subfamily. Notably, most subfamilies included both tea plant and A. thaliana P450 genes, indicating that the tea plant P450 family shares a common ancestry with the A. thaliana P450 family. This analysis provides insights into the evolutionary relationships among tea P450 genes and lays the foundation for further investigations into the functional characteristics of this gene family.
Gene structure and conserved motif analysis of CsP450s in tea plant
The majority of plant genes are interrupted by one or more introns. These configurations may be used to investigate the evolutionary link between different members of a gene family. Many earlier investigations have observed a correlation between exon/intron distribution patterns and the biological activities of the corresponding genes. The evolutionary relationships and gene structures of the tea P450 family members were further investigated by integrating phylogenetic trees, gene structure diagrams, and motif analysis (Fig. 2A and 2B). The analysis revealed that the number of exons in tea P450 family genes ranged from 1 to 14, with 27 genes lacking introns and containing only one exon. In addition, ten conserved motifs (Motif1-Motif10) were identified in the CsP450 family proteins using the MEME website (Figure S1). The number of conserved motifs in tea P450 family genes varied from 1 to 10, with Motif5 to Motif8 being the most frequently occurring motifs across all genes. Furthermore, there were significant differences in the patterns of conserved motifs and gene structures between type A and non-type A P450s. Type A comprises the CYP71 clan, which contains the following subfamilies: CYP71, CYP78, CYP82, CYP89 and CYP736, while the non-type A group includes all CYPs outside the CYP71 clan. However, similar patterns were observed within the same subfamily, which enhanced the credibility of the phylogenetic relationships and group classifications.
Analysis of cis-acting elements of CsP450 gene family
Cis regulatory elements (CREs) are a family of non-coding DNA components that regulate gene expression at various developmental stages by influencing the transcription of nearby genes . To investigate the potential response of CsP450 family members to growth and development, stress and other environmental cues, the promoter regions were analyzed using PlantCARE. The results showed that the major cis-acting elements included abscisic acid responsive elements (ABRE), jasmonic acid response elements (CGTCA-motif), low temperature responsive element (LTR), MYB binding site involved in drought-inducibility (MBS), gibberellin-responsive regulatory element (TATC-box), salicylic acid responsive element (TCA-element) and auxin-responsive element (TGA-element) (Figure S2). The predicted results further suggest that the tea CsP450 family plays an important role in regulating growth and development processes, hormone signal transduction, and response to environmental stress.
Chromosomal distribution analysis of CsP450s in tea plant
Based on the genome annotation of the tea plant, we investigated the physical locations of CsP450s on tea plant chromosomes, and the results are presented in Fig. 3. All 15 chromosomes of the tea plant contain P450 genes, although the number of genes per chromosome varies. Chromosomes 1, 2, 4, 7, and 12 have the most P450 genes, while chromosome 10 has the fewest. In addition, some CsP450 genes are closely linked, and 37 pairs of genes show tandem duplication.
Gene duplication relationship and collinearity analysis of CsP450 genes
The investigation of gene duplication is crucial for exploring the evolution and expansion of the P450 gene family in tea plant. To investigate gene duplication events in the CsP450 gene family, the MCScanX algorithm was used to analyze collinearity and gene duplication in the tea plant genome (Fig. 4). In Fig. 4, red lines linking two chromosomal regions represent syntenic regions. The analysis revealed that 37 pairs of genes arose from tandem duplication and 28 pairs of genes were collinear, providing a driving force for the evolution of the P450 family in tea plants. In addition, duplication was most frequent on chromosomes 2 and 3, which is also the main reason for the higher number of CsP450 genes on these chromosomes. These findings indicate that tandem duplication and segmental duplication both contributed to the expansion of the CsP450 family, with the former having the greater impact.
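The idea behind tandem-pair detection can be sketched as follows. The sketch assumes that a tandem pair is a homologous pair of genes on the same chromosome that are adjacent in gene order (optionally allowing a few intervening genes); this is a common working definition and not necessarily the exact rule applied by MCScanX. The function name and gene IDs are hypothetical.

```python
def tandem_pairs(gene_order, homologous_pairs, max_intervening=0):
    """Find homologous gene pairs that are neighbours on a chromosome.

    gene_order:       {chromosome: [gene IDs in positional order]}
    homologous_pairs: iterable of (gene_a, gene_b) homolog pairs.
    max_intervening:  number of other genes allowed between the pair.
    """
    position = {}
    for chromosome, genes in gene_order.items():
        for index, gene in enumerate(genes):
            position[gene] = (chromosome, index)

    found = []
    for gene_a, gene_b in homologous_pairs:
        if gene_a not in position or gene_b not in position:
            continue
        (chrom_a, idx_a), (chrom_b, idx_b) = position[gene_a], position[gene_b]
        distance = abs(idx_a - idx_b)
        if chrom_a == chrom_b and 0 < distance <= max_intervening + 1:
            found.append(tuple(sorted((gene_a, gene_b))))
    return sorted(found)


# Hypothetical example
order = {"chr1": ["g1", "g2", "g3", "g4"], "chr2": ["g5"]}
homologs = [("g1", "g2"), ("g1", "g4"), ("g3", "g5")]
adjacent_only = tandem_pairs(order, homologs)
```

Homologous pairs on different chromosomes, such as g3 and g5 here, are not tandem by this definition and would instead be candidates for segmental or whole-genome duplication if they lie in collinear blocks.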
Members of a gene family often evolve from a single ancestral gene. Therefore, collinearity analysis of the P450 gene families in the tea plant and A. thaliana genomes helps to clarify the origin and evolutionary relationships of P450 genes (Fig. 5). The results showed that 41 homologous P450 genes were collinear between tea plants and A. thaliana, with more homologous P450 genes found on chromosomes 1 and 2 of tea plants, while no homologous P450 genes were found on chromosomes 5, 8, and 9. In addition, multiple tea plant P450 genes were identified as homologous to a single AtP450 gene, and multiple AtP450 genes were also homologous to a single tea plant P450 gene. This collinearity relationship suggests that the expansion of this gene family may have occurred before the divergence of tea plants and A. thaliana.
Selection pressure analysis of CsP450 genes
Throughout the course of evolution, gene duplication events often lead to the divergence of duplicated genes from their initial specialized functions. This divergence may manifest as non-functionalization, sub-functionalization, or neo-functionalization. We calculated Ka/Ks values from inter- and intra-genomic/subgenomic combinations of the tea plant to study the influence of Darwinian positive selection and the magnitude of selection pressure on the divergence of duplicated P450 genes. As the majority of the Ka/Ks values were less than 1, we inferred that after segmental and whole-genome duplication, the CsP450 gene family underwent strong purifying selection with limited functional divergence (Table S1).
Protein–protein interaction network analysis of CsP450 genes
Analysis of protein–protein interactions (PPI) is a crucial way to understand protein function. Using the protein interaction network of Arabidopsis, we mapped and analyzed the protein interaction network of tea P450 proteins (Fig. 6). In total, 317 interactions were detected in the PPI network. The protein interaction map showed that multiple tea P450 proteins have interacting target proteins, such as phenylalanine ammonia-lyase PAL1, the flavonoid synthesis gene F3H, and the brassinosteroid synthesis pathway genes DWF5, DET2, STE1, and BR6OX1, among others. Additionally, there may also be protein–protein interactions among tea P450 proteins, such as CsP450107, CsP450108, CsP450116, CsP450145, CsP450231, and CsP450266, among others. Therefore, the protein interaction network analysis further supports the hypothesis that tea P450 proteins may participate in multiple physiological pathways through protein interactions.
Tissue-specific expression of CsP450 genes
Understanding the tissue-specific expression patterns of genes is crucial for elucidating their roles in plant growth, development, and responses to environmental stresses . The expression patterns of genes in different tissues are closely related to their biological functions. In this study, we analyzed RNA-Seq data from eight different tissues of tea plants (apical buds, flowers, fruits, young leaves, mature leaves, old leaves, roots, and stems) to analyze the tissue-specific expression profiles of the P450 gene family. Normalized FPKM expression values were used to construct a digital expression profile heatmap. The CsP450s exhibited a diverse expression pattern. The results showed that the P450 gene family had high expression levels in the roots and stems, while their expression levels were low in mature and old leaves in tea plants (Fig. 7–1, -2). The clustering results indicated that P450 genes in the same subfamily exhibited similar expression patterns.
Expression analysis of CsP450s in response to drought and cold stress
To investigate the response of the P450 gene family to drought and cold stress in tea plants, transcriptome sequencing data from tea plants subjected to PEG treatment (24 h, 48 h, and 72 h) and cold stress (6 h and 7 d) were analyzed. The results indicated that the expression of CsP450 genes in response to drought stress followed one of three trends: initial upregulation followed by downregulation, sustained upregulation, or continuous downregulation (Fig. 8). Similar expression patterns were also observed under cold stress (Fig. 9). Furthermore, the clustering analysis of the CsP450 gene family revealed that genes from the same subfamily displayed similar expression patterns. These findings demonstrate that the expression of the CsP450 gene family is modulated in response to drought and cold stress in tea plants. These results may provide valuable insights into the molecular mechanisms underlying stress tolerance in tea plants and could facilitate the development of stress-resistant tea cultivars in the future.
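The three expression trends described above can be distinguished with a simple rule applied to each gene's time series. The sketch below is illustrative only: it compares consecutive time points without any significance threshold, which is an assumption, since the criteria used to assign genes to trends are not stated. The function name and labels are hypothetical.

```python
def classify_trend(values):
    """Classify an expression time series (control first).

    Returns "sustained upregulation", "continuous downregulation",
    "upregulation then downregulation", or "other".
    """
    if len(values) < 2:
        raise ValueError("need at least two time points")
    steps = [later - earlier for earlier, later in zip(values, values[1:])]
    if all(step > 0 for step in steps):
        return "sustained upregulation"
    if all(step < 0 for step in steps):
        return "continuous downregulation"
    peak = values.index(max(values))
    rises_to_peak = all(step > 0 for step in steps[:peak])
    falls_after_peak = all(step < 0 for step in steps[peak:])
    if 0 < peak < len(values) - 1 and rises_to_peak and falls_after_peak:
        return "upregulation then downregulation"
    return "other"
```

With four time points (for example 0 h, 24 h, 48 h, and 72 h), a series such as 1, 4, 6, 2 is classified as upregulation followed by downregulation, whereas 1, 3, 2, 5 does not fit any of the three trends.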
qPCR validation of CsP450 expression under drought and cold stress
CsP450 genes play important roles in the response of tea plants to abiotic and biotic stresses. To further validate the expression patterns of the selected CsP450 genes under drought and cold stress, quantitative real-time polymerase chain reaction (qPCR) was performed on 12 different CsP450 genes. The results indicated that the qPCR data were generally consistent with the transcriptomic data (Fig. 10). Specifically, under drought stress, CsP450139, CsP450197, and CsP450252 exhibited a continuous upregulation trend, with approximately 8-, 5-, and 3-fold increases, respectively, while CsP450219 showed a continuous downregulation trend compared with the control. In addition, CsP45080, CsP450157 and CsP450181 showed an initial upregulation followed by a downregulation (Fig. 10A). However, the transcript level of CsP450240 did not differ significantly from the control at any time point, with a maximum relative expression of 1.6-fold at 48 h.
Under cold stress, CsP45080, CsP450157 and CsP450219 exhibited an increase followed by a decrease in expression levels, while CsP45022, CsP450197 and CsP450252 showed a continuous upregulation trend compared with the control, with approximately 2.5-, 2.4-, and 2.3-fold increases, respectively. Conversely, CsP450171 and CsP450181 showed a continuous downregulation trend compared with the control (Fig. 10B). In addition, the transcript levels of CsP4507, CsP450139, and CsP450240 did not differ significantly from the control at any time point. The findings from the qPCR analysis support the expression patterns observed in the transcriptomic data, thereby providing further evidence for the involvement of CsP450 genes in the response to drought and cold stress in tea plants.
Cytochrome P450 genes encode enzymes that catalyze various reactions involved in growth, development, and the biosynthesis of secondary metabolites. Gene identification and functional classification are essential for studying the function of gene families. As an important gene superfamily, cytochrome P450s have been identified at the genome level in various plants as whole-genome sequences have become available. However, little is known about how these P450 genes respond to biotic and abiotic stresses and how they participate in the growth and development of tea plants. In this study, 273 non-redundant P450 genes were identified from the tea plant genome, and these genes are similar to those found in Arabidopsis. A comprehensive study was then conducted on the phylogenetic relationships, conserved motifs, gene structures, gene duplication events, cis-acting elements, and tissue expression patterns of the members of this gene family in tea plant. In addition, we analyzed expression profiles from RNA-Seq data related to drought and cold stress. The study contributes detailed knowledge of the CsP450 gene family and will help in understanding the functional divergence of P450 genes in tea plants.
Recent genome sequencing revealed a genome size of approximately 3.0 Gb for two representative elite tea plant cultivars. The phylogenetic tree of tea plant and Arabidopsis P450s showed similar clustering, indicating a certain degree of conservation of the P450 multi-gene family in plants. In the current phylogenetic classification, plant P450s are divided into nine clans: CYP51, CYP71, CYP710, CYP711, CYP72, CYP74, CYP85, CYP86, and CYP97. Of these, CYP710, CYP711, CYP71, CYP72, CYP74, CYP85, CYP86, and CYP97 are present in tea. Many plant-specific enzymes encoded by P450 genes act in the metabolism of secondary products and belong to the largest group, CYP71, which also has the most members in tea plants. The CYP71 clan is classified as type A P450s, and the remaining eight clans are classified as non-A type. Most type A genes encode plant-specific enzymes that act on the metabolism of secondary products (such as phenylpropanoids and alkaloids), while non-A type genes are mainly involved in the synthesis of hormones and other compounds. These analyses provide critical information for studying the phylogeny of the cytochrome P450 gene family.
A recent study found that multiple cytochrome P450 (P450) genes induced by both biotic and abiotic stressors contain recognition sites for MYB and MYC transcription factors, ACGT core sequences, TGA-boxes, and W-boxes for WRKY transcription factors . These cis-acting elements are known to be involved in the regulation of plant defense, and the response of each P450 gene to various stressors is strictly controlled . In this study, numerous hormone-induced regulatory elements, such as TATC-box, TCA-element and TGA-element, and cis-acting elements involved in responses to abiotic stress, such as low temperature and drought, were identified in the promoter sequences of tea plant P450 genes.
Although the functions of multiple subfamilies of the P450 family have been extensively explored, research on the molecular basis for the transcriptional activation of many P450 genes by receptor-mediated signaling remains in its early stages. Furthermore, some P450 enzymes may localize to more than one organelle; for example, CsP45052 may function in the plasma membrane, mitochondrial membrane, or endoplasmic reticulum. In particular, many P450-catalyzed reactions in plants may produce toxic compounds if released into the cytoplasm.
Gene duplication is a major driver of the evolution of organisms. Tandem duplication (TD) and segmental or whole-genome duplication (S/WGD) are the two basic mechanisms by which gene duplication takes place. In our study, 28 segmentally duplicated P450 gene pairs were found in the tea plant. These pairs are presumed to derive from the ancient whole-genome triplication. Together with the segmental duplication events, 37 tandem duplication events were found, suggesting that tandem duplication played a major role in the proliferation of P450 genes in tea plants. These results are in line with observations in citrus and grapevine, where the majority of CYP genes arose through tandem duplication [64, 65].
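The distinction between the two duplication modes can be sketched as a simple positional rule. This is a hedged illustration rather than the criteria applied in this study: the `Gene` record, the gene names, and the `max_intervening` threshold of one gene are assumptions, and a genuine segmental call requires collinear-block detection (for example with MCScanX, listed in the references), which this sketch does not perform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    order: int  # rank of the gene along its chromosome


def classify_pair(a: Gene, b: Gene, max_intervening: int = 1) -> str:
    """Label a homologous gene pair by relative position (assumed rule).

    'tandem' means same chromosome with at most `max_intervening` genes
    between the two; anything else is only a candidate for segmental/WGD
    origin, pending collinearity analysis.
    """
    same_chrom = a.chrom == b.chrom
    if same_chrom and abs(a.order - b.order) - 1 <= max_intervening:
        return "tandem"
    return "segmental_candidate"


# Hypothetical homologous pairs; names and positions are made up.
pairs = [
    (Gene("CsP450_a", "Chr1", 10), Gene("CsP450_b", "Chr1", 11)),
    (Gene("CsP450_c", "Chr1", 10), Gene("CsP450_d", "Chr3", 52)),
    (Gene("CsP450_e", "Chr2", 5), Gene("CsP450_f", "Chr2", 40)),
]
labels = [classify_pair(a, b) for a, b in pairs]
print(labels)
```

Counting the pairs in each class gives tallies comparable in kind to the 37 tandem and 28 segmental events reported above.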
Previous studies have shown that plant P450s participate in diverse biochemical pathways and in multiple biological processes, including development and stress response [66, 67]. The phenylpropanoid pathway (PPP), a crucial secondary metabolic pathway, was identified in the CsP450 PPI network; it is implicated in numerous processes, including lignin formation, radical scavenging, the production of signalling molecules, and reproduction. In our study, the expression profiles of CsP450 genes were examined at various developmental stages as well as in response to drought and cold stress. The findings suggest that the CsP450 genes can be grouped by expression pattern and that the genes within each cluster may be involved in related functions. Additional research is necessary to uncover the specific roles of individual CsP450 genes in the stress response and to assess their potential for the genetic improvement of tea plants.
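Relative expression under stress is commonly quantified with the 2^(-ΔΔCT) method of Livak and Schmittgen, which appears in the reference list. The sketch below shows the arithmetic only; the function name and the Ct values are hypothetical, and the method assumes amplification efficiencies close to 100% for both the target and the reference gene.

```python
def relative_expression(
    ct_target_treated: float,
    ct_ref_treated: float,
    ct_target_control: float,
    ct_ref_control: float,
) -> float:
    """Fold change of a target gene (treated vs. control) by 2^(-ddCt)."""
    # Normalize the target gene to the reference gene in each condition.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # Compare the treated sample with the control sample.
    delta_delta = delta_treated - delta_control
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the target amplifies 3 cycles earlier under stress
# while the reference gene is unchanged.
fold = relative_expression(22.0, 18.0, 25.0, 18.0)
print(fold)
```

A fold value above 1 indicates induction under the treatment and a value below 1 indicates repression.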
In this study, we identified a total of 273 CsP450 family genes in the tea plant genome, which can be divided into A and non-A types comprising 34 subfamilies. We analyzed their structures and functions and found that subfamilies within the same type have similar exon–intron structures and motif compositions. In addition, we identified cis-acting elements related to secondary metabolism and stress response. The collinearity and synteny results suggest that WGD/segmental duplications may have been the main contributors to the expansion of the P450 gene family during evolution. Furthermore, our findings suggest that the CsP450 gene family is implicated in the response of tea plants to drought and cold stress. These results offer novel insights into the molecular mechanisms that underlie stress responses in tea plants and could have practical implications for breeding stress-tolerant tea cultivars.
Availability of data and materials
The datasets generated and/or analysed during the current study are available in the GenBank repository.
Mizutani M, Ohta D. Diversification of P450 Genes During Land Plant Evolution. Annu Rev Plant Biol. 2010;61:291–315.
Mizutani M. Impacts of Diversification of Cytochrome P450 on Plant Metabolism. Biol Pharm Bull. 2012;35:824–32.
Chapple C. Molecular-genetic analysis of plant cytochrome P450-Dependent monooxygenases. Annu Rev Plant Physiol Plant Mol Biol. 1998;49:311–43.
Yan Q, Cui X, Lin S, Gan S, Xing H, Dou D. GmCYP82A3, a Soybean Cytochrome P450 Family Gene Involved in the Jasmonic Acid and Ethylene Signaling Pathway, Enhances Plant Resistance to Biotic and Abiotic Stresses. PLoS ONE. 2016;11:e0162253.
Rao MJ, Xu Y, Tang X, Huang Y, Liu J, Deng X, et al. CsCYT75B1, a Citrus CYTOCHROME P450 Gene, Is Involved in Accumulation of Antioxidant Flavonoids and Induces Drought Tolerance in Transgenic Arabidopsis. Antioxidants. 2020;9:161.
Schuler MA, Werck-Reichhart D. Functional Genomics of P450S. Annu Rev Plant Biol. 2003;54:629–67.
Nelson DR. Cytochrome P450 and the Individuality of Species. Arch Biochem Biophys. 1999;369:1–10.
Werck-Reichhart D, Feyereisen R. Cytochromes P450: a success story. Genome Biol. 2000;1:reviews3003.1.
Nelson DR, Ming R, Alam M, Schuler MA. Comparison of Cytochrome P450 Genes from Six Plant Genomes. Tropical Plant Biology. 2008;1:216–35.
Nelson DR. Cytochrome P450 diversity in the tree of life. Biochim Biophys Acta Proteins Proteom. 2018;1866:141–54.
Nelson D, Werck-Reichhart D. A P450-centric view of plant evolution. Plant J. 2011;66:194–211.
Nelson DR, Schuler MA, Paquette SM, Werck-Reichhart D, Bak S. Comparative Genomics of Rice and Arabidopsis. Analysis of 727 Cytochrome P450 Genes and Pseudogenes from a Monocot and a Dicot. Plant Physiology. 2004;135:756–72.
Vasav AP, Barvkar VT. Phylogenomic analysis of cytochrome P450 multigene family and their differential expression analysis in Solanum lycopersicum L. suggested tissue specific promoters. BMC Genomics. 2019;20.
Ma B, Luo Y, Jia L, Qi X, Zeng Q, Xiang Z, et al. Genome-wide identification and expression analyses of cytochrome P450 genes in mulberry (Morus notabilis). J Integr Plant Biol. 2014;56:887–901.
Babu PR, Rao KV, Reddy VD. Structural organization and classification of cytochrome P450 genes in flax (Linum usitatissimum L.). Gene. 2013;513:156–62.
Xie MM, Gong DP, Li FX, Liu GS, Sun YH. Genome-wide analysis of cytochrome P450 monooxygenase genes in the tobacco. Hereditas (Beijing). 2013;35:379–87.
Zhang S, Wu QR, Zhang HM, Pei ZM, Gao JW. Genome-wide identification and transcriptomic data exploring of the cytochrome P450 family in Chinese cabbage (Brassica rapa L. ssp. pekinensis). J Plant Interact. 2021;16:136–55.
Werck-Reichhart D. Promiscuity, a Driver of Plant Cytochrome P450 Evolution? Biomolecules. 2023;13:394.
Ohnishi T, Yokota T, Mizutani M. Insights into the function and evolution of P450s in plant steroid metabolism. Phytochemistry. 2009;70:1918–29.
Schuler MA. Plant Cytochrome P450 Monooxygenases. Crit Rev Plant Sci. 1996;15:235–84.
Takei K, Yamaya T, Sakakibara H. Arabidopsis CYP735A1 and CYP735A2 Encode Cytokinin Hydroxylases That Catalyze the Biosynthesis of trans-Zeatin. J Biol Chem. 2004;279:41866–72.
Saito S, Hirai N, Matsumoto C, Ohigashi H, Ohta D, Sakata K, et al. Arabidopsis CYP707As Encode (+)-Abscisic Acid 8′-Hydroxylase, a Key Enzyme in the Oxidative Catabolism of Abscisic Acid. Plant Physiol. 2004;134:1439–49.
Helliwell CA, Chandler PM, Poole A, Dennis ES, Peacock WJ. The CYP88A cytochrome P450, ent-kaurenoic acid oxidase, catalyzes three steps of the gibberellin biosynthesis pathway. Proc Natl Acad Sci. 2001;98:2065–70.
Zhang Y, Zhang B, Yan D, Dong W, Yang W, Li Q, et al. Two Arabidopsis cytochrome P450 monooxygenases, CYP714A1 and CYP714A2, function redundantly in plant development through gibberellin deactivation. Plant J. 2011;67:342–53.
Nomura T, Kushiro T, Yokota T, Kamiya Y, Bishop GJ, Yamaguchi S. The Last Reaction Producing Brassinolide Is Catalyzed by Cytochrome P-450s, CYP85A3 in Tomato and CYP85A2 in Arabidopsis. J Biol Chem. 2005;280:17873–9.
Ohnishi T, Szatmari A-M, Watanabe B, Fujita S, Bancos S, Koncz C, et al. C-23 Hydroxylation by Arabidopsis CYP90C1 and CYP90D1 Reveals a Novel Shortcut in Brassinosteroid Biosynthesis. Plant Cell. 2006;18:3275–88.
Ohnishi T, Watanabe B, Sakata K, Mizutani M. CYP724B2 and CYP90B3 Function in the Early C-22 Hydroxylation Steps of Brassinosteroid Biosynthetic Pathway in Tomato. Biosci Biotechnol Biochem. 2006;70:2071–80.
Koo AJK, Cooke TF, Howe GA. Cytochrome P450 CYP94B3 mediates catabolism and inactivation of the plant hormone jasmonoyl-L-isoleucine. Proc Natl Acad Sci. 2011;108:9298–303.
Heitz T, Widemann E, Lugan R, Miesch L, Ullmann P, Désaubry L, et al. Cytochromes P450 CYP94C1 and CYP94B3 Catalyze Two Successive Oxidation Steps of Plant Hormone Jasmonoyl-isoleucine for Catabolic Turnover. J Biol Chem. 2012;287:6296–306.
Li L, Chang Z, Pan Z, Fu Z-Q, Wang X. Modes of heme binding and substrate access for cytochrome P450 CYP74A revealed by crystal structures of allene oxide synthase. Proc Natl Acad Sci. 2008;105:13883–8.
Yazaki K. Secondary metabolism in plant biotechnology. Plant Biotechnology. 2004;21:317–27.
Pandian BA, Sathishraj R, Djanaguiraman M, Prasad PVV, Jugulam M. Role of Cytochrome P450 Enzymes in Plant Stress Response. Antioxidants. 2020;9:454.
Kim Y-C, Kim S-Y, Paek K-H, Choi D, Park JM. Suppression of CaCYP1, a novel cytochrome P450 gene, compromises the basal pathogen defense response of pepper plants. Biochem Biophys Res Commun. 2006;345:638–45.
Godiard L, Sauviac L, Dalbin N, Liaubet L, Callard D, Czernic P, et al. CYP76C2, an Arabidopsis thaliana, cytochrome P450 gene expressed during hypersensitive and developmental cell death. FEBS Lett. 1998;438:245–9.
Goodwin SB, Sutter TR. Microarray analysis of Arabidopsis genome response to aluminum stress. Biol Plant. 2009;53:85–99.
Yuan L, Dai H, Zheng S, Huang R, Tong H. Genome-wide identification of the HDAC family proteins and functional characterization of CsHD2C, a HD2-type histone deacetylase gene in tea plant (Camellia sinensis L. O. Kuntze). Plant Physiol Biochem. 2020;155:898–913.
Huang D, Mao Y, Guo G, Ni D, Chen L. Genome-wide identification of PME gene family and expression of candidate genes associated with aluminum tolerance in tea plant (Camellia sinensis). BMC Plant Biology. 2022;22.
Roshan NM, Ashouri M, Sadeghi SM. Identification, evolution, expression analysis of phospholipase D (PLD) gene family in tea (Camellia sinensis). Physiol Mol Biol Plants. 2021;27:1219–32.
Chatterjee A, Paul A, Unnati GM, Rajput R, Biswas T, Kar T, et al. MAPK cascade gene family in Camellia sinensis: In-silico identification, expression profiles and regulatory network analysis. BMC Genomics. 2020;21.
Cao H, Wang L, Yue C, Hao X, Wang X, Yang Y. Isolation and expression analysis of 18 CsbZIP genes implicated in abiotic stress responses in the tea plant (Camellia sinensis). Plant Physiol Biochem. 2015;97:432–42.
Wang YX, Liu ZW, Wu ZJ, Li H, Zhuang J. Transcriptome-Wide Identification and Expression Analysis of the NAC Gene Family in Tea Plant [Camellia sinensis (L.) O. Kuntze]. PLOS ONE. 2016;11:e0166727.
Shang X, Han Z, Zhang D, Wang Y, Qin H, Zou Z, et al. Genome-Wide Analysis of the TCP Gene Family and Their Expression Pattern Analysis in Tea Plant (Camellia sinensis). Front Plant Sci. 2022;13:840350.
Chen X, Wang P, Gu M, Lin X, Hou B, Zheng Y, et al. R2R3-MYB transcription factor family in tea plant (Camellia sinensis): Genome-wide characterization, phylogeny, chromosome location, structure and expression patterns. Genomics. 2021;113:1565–78.
de Castro E, Sigrist CJA, Gattiker A, Bulliard V, Langendijk-Genevaux PS, Gasteiger E, et al. ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins. Nucleic Acids Res. 2006;34(Web Server):W362-5.
Chen C, Chen H, Zhang Y, Thomas HR, Frank MH, He Y, et al. TBtools: An Integrative Toolkit Developed for Interactive Analyses of Big Biological Data. Mol Plant. 2020;13:1194–202.
Larkin MA, Blackshields G, Brown NP, Chenna R, McGettigan PA, McWilliam H, et al. Clustal W and Clustal X version 2.0. Bioinformatics. 2007;23:2947–8.
Kumar S, Stecher G, Tamura K. MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Mol Biol Evol. 2016;33:1870–4.
Hu B, Jin J, Guo AY, Zhang H, Luo J, Gao G. GSDS 2.0: an upgraded gene feature visualization server. Bioinformatics. 2014;31:1296–7.
Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, et al. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 2009;37(Web Server):W202-8.
Higo K. PLACE: a database of plant cis-acting regulatory DNA elements. Nucleic Acids Res. 1998;26:358–9.
Wang Y, Tang H, DeBarry JD, Tan X, Li J, Wang X, et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 2012;40:e49–59.
Wang D, Zhang Y, Zhang Z, Zhu J, Yu J. KaKs_Calculator 2.0: A Toolkit Incorporating Gamma-Series Methods and Sliding Window Strategies. Genomics Proteomics Bioinformatics. 2010;8:77–80.
Livak KJ, Schmittgen TD. Analysis of Relative Gene Expression Data Using Real-Time Quantitative PCR and the 2−ΔΔCT Method. Methods. 2001;25:402–8.
Malik WA, Wang X, Wang X, Shu N, Cui R, Chen X, et al. Genome-wide expression analysis suggests glutaredoxin genes response to various stresses in cotton. Int J Biol Macromol. 2020;153:470–91.
Reilly SK, Gosai SJ, Gutierrez A, Mackay-Smith A, Ulirsch JC, Kanai M, et al. Direct characterization of cis-regulatory elements and functional dissection of complex genetic associations using HCR–FlowFISH. Nat Genet. 2021;53:1166–76.
He P, Zhang Y, Xiao G. Origin of a Subgenome and Genome Evolution of Allotetraploid Cotton Species. Mol Plant. 2020;13:1238–40.
Garg R, Jhanwar S, Tyagi AK, Jain M. Genome-Wide Survey and Expression Analysis Suggest Diverse Roles of Glutaredoxin Gene Family Members During Development and Response to Various Stimuli in Rice. DNA Res. 2010;17:353–67.
Wei C, Yang H, Wang S, Zhao J, Liu C, Gao L, et al. Draft genome sequence of Camellia sinensis var. sinensis provides insights into the evolution of the tea genome and tea quality. Proc Natl Acad Sci. 2018;115:E4151-8.
Yu J, Tehrim S, Wang L, Dossa K, Zhang X, Ke T, et al. Evolutionary history and functional divergence of the cytochrome P450 gene superfamily between Arabidopsis thaliana and Brassica species uncover effects of whole genome and tandem duplications. BMC Genomics. 2017;18.
Paquette SM, Bak S, Feyereisen R. Intron-Exon Organization and Phylogeny in a Large Superfamily, the Paralogous Cytochrome P450 Genes of Arabidopsis thaliana. DNA Cell Biol. 2000;19:307–17.
Rudolf JD, Chang C-Y, Ma M, Shen B. Cytochromes P450 for natural product biosynthesis in Streptomyces: sequence, structure, and function. Nat Prod Rep. 2017;34:1141–72.
Fang Y, Jiang J, Du Q, Luo L, Li X, Xie X. Cytochrome P450 Superfamily: Evolutionary and Functional Divergence in Sorghum (Sorghum bicolor) Stress Resistance. J Agric Food Chem. 2021;69:10952–61.
Zheng X, Li P, Lu X. Research advances in cytochrome P450-catalysed pharmaceutical terpenoid biosynthesis in plants. J Exp Bot. 2019;70:4619–30.
Jiu S, Xu Y, Wang J, Liu X, Sun W, et al. The Cytochrome P450 Monooxygenase Inventory of Grapevine (Vitis vinifera L.): Genome-Wide Identification, Evolutionary Characterization and Expression Analysis. Front Genet. 2020;11:44.
Liu X, Gong Q, Zhao C, Wang D, Ye X, Zheng G, et al. Genome-wide analysis of cytochrome P450 genes in Citrus clementina and characterization of a CYP gene encoding flavonoid 3’-hydroxylase. Horticulture Research. 2022;10.
Xu J, Wang X, Guo W. The cytochrome P450 superfamily: Key players in plant development and defense. J Integr Agric. 2015;14:1673–86.
Bathe U, Tissier A. Cytochrome P450 enzymes: A driving force of plant diterpene diversity. Phytochemistry. 2019;161:149–62.
We are grateful to the Shaanxi Provincial Department of Science and Technology and Shaanxi Provincial Department of Education for financial support.
This research was funded by the Shaanxi province natural science basic research program project (2023-JC-QN-0203) and the Shaanxi provincial department of education general project (22JK0240).
The authors declare no competing interests.
Shen, C., Li, X. Genome-wide analysis of the P450 gene family in tea plant (Camellia sinensis) reveals functional diversity in abiotic stress. BMC Genomics 24, 535 (2023). https://doi.org/10.1186/s12864-023-09619-4
- Camellia sinensis
- P450 gene family
- Synteny analysis
- Gene expression