- Open Access
Genome-wide identification and expression analysis of the R2R3-MYB gene family in tobacco (Nicotiana tabacum L.)
BMC Genomics volume 23, Article number: 432 (2022)
The R2R3-MYB transcription factor family is one of the largest gene families in plants and is involved in the regulation of plant development, hormone signal transduction, and responses to biotic and abiotic stresses. Tobacco is one of the most important model plants. It is therefore of great significance to investigate the R2R3-MYB gene family and its expression patterns under abiotic stress and senescence in tobacco.
A total of 174 R2R3-MYB genes were identified in the tobacco (Nicotiana tabacum L.) genome and were divided into 24 subgroups based on phylogenetic analysis. Gene structure (exon/intron) and protein motifs were highly conserved among the NtR2R3-MYB genes, especially among members of the same subgroup. The NtR2R3-MYB genes were distributed on 24 tobacco chromosomes. Analysis of gene duplication events identified 3 pairs of tandemly duplicated genes and 62 pairs of segmentally duplicated genes, suggesting that segmental duplication is the major pattern of R2R3-MYB gene family expansion in tobacco. Cis-regulatory elements in the NtR2R3-MYB promoters were related to cellular development, phytohormones, environmental stress and light responsiveness. Expression profile analysis showed that NtR2R3-MYB genes were widely expressed in tobacco leaves of different maturity; however, the expression patterns of different members were diverse. The qRT-PCR analysis of 15 NtR2R3-MYBs confirmed their differential expression under different abiotic stresses (cold, salt and drought), and notably, NtMYB46 was significantly up-regulated under all three treatments.
In summary, a genome-wide identification, evolutionary and expression analysis of the R2R3-MYB gene family in tobacco was conducted. Our results provide a solid foundation for further functional study of NtR2R3-MYB genes in tobacco.
MYB transcription factors form one of the largest transcription factor families in plants. MYB proteins share a highly conserved DNA-binding domain (the MYB domain), mostly located at the N-terminus of the protein. The MYB domain is characterized by one to four imperfect tandem repeats (R), termed R1, R2, R3 and R4. Each repeat is composed of 50 to 53 conserved amino acid residues and forms three α-helices. The second and third helices fold into a helix-turn-helix (HTH) structure, which is involved in DNA binding. Three regularly spaced tryptophan residues (or other hydrophobic residues) form a hydrophobic core at the center of the three-dimensional HTH structure, which is important for maintaining the HTH conformation. Sometimes the tryptophan residues are replaced by other amino acids, such as aromatic or hydrophobic amino acids, especially in the R3 domain. The C-terminus of a MYB transcription factor usually contains a transcriptional activation region rich in acidic amino acids, which is responsible for multiple protein regulatory activities. Based on the number and types of MYB repeats, the MYB family is subdivided into four major groups, namely 4R-MYB, 3R-MYB/R1R2R3-MYB, 1R-MYB/MYB-related and 2R-MYB/R2R3-MYB.
R2R3-MYB transcription factors, which contain the R2 and R3 domains, are found predominantly in plants. It has been reported that the R2R3-MYB genes probably evolved through loss of the R1 repeat from an R1R2R3-MYB gene or through duplication of the R1 repeat in a 1R-MYB gene [5, 6]. The R2R3-MYB gene family is widely involved in the regulation of various biological processes, including plant growth and development, hormone signaling, and primary and secondary metabolism [7,8,9]. For example, AtMYB106, AtMYB16 and AtMYB17 are involved in the regulation of trichome branching, petal epidermal cell morphogenesis and early inflorescence development, respectively [10,11,12]. In maize, the biosynthesis of flavonoids is directly or indirectly controlled by three genes (P, C1, Pl) encoding R2R3-MYB domain proteins [1, 13, 14]. The MYB7 gene in Actinidia deliciosa positively regulates the biosynthesis of carotenoids and chlorophyll. Knockdown of MYB305 in ornamental tobacco, which encodes a protein with a conserved R2R3-MYB DNA-binding domain, decreased the expression of genes related to nectarin and flavonoid biosynthesis. In addition, most R2R3-MYB genes have also been suggested to regulate plant responses to biotic and abiotic stress conditions. For instance, overexpression of AtMYB75 in Arabidopsis increases secondary metabolites (anthocyanins and flavonols), which can protect against pests. The rice gene OsMYB6 is a stress-responsive factor that acts as a positive regulator of drought and salt stress resistance. Increasing the expression of NtMYB4a may promote anthocyanin accumulation and thereby increase the antioxidant capability and low-temperature tolerance of tobacco plants. Overexpression of NtMYB12 in tobacco promotes adaptation to low-Pi stress by regulating flavonol and phosphorus contents.
Tobacco is one of the most important model plants. Like other plants, tobacco is often subjected to stresses such as low temperature, salt, strong light and drought, which reduce production. Leaf senescence is an important process in tobacco production: an active and orderly process accompanied by changes in leaf color, cell structure, biochemical metabolism and gene expression, along with a series of degradation events related to secondary metabolites such as flavonoids, carotenoids and chlorophyll [22, 23]. The regulatory role of R2R3-MYB genes in abiotic stress response and in plant growth and development has been assessed in several studies [12, 18, 24]. Investigating the R2R3-MYB gene family and its expression patterns under abiotic stress and senescence in tobacco is therefore of great significance for the study of plant physiology and development.
In this study, a comprehensive investigation of the R2R3-MYB gene family, including gene structures, chromosomal localization, phylogenetic relationships, motif composition, duplication events and cis-element composition, was performed using the current tobacco genome sequence data. Moreover, the gene expression profile at different senescence stages of tobacco leaves and the expression patterns under abiotic stresses (cold, salt and drought treatments) were analyzed. The objectives of this study were to systematically analyze the sequence structures of the tobacco R2R3-MYB gene family, to explore the evolutionary relationships of the R2R3-MYB gene family in plants, and thereby to reveal how the expression of R2R3-MYB gene family members is regulated under various stresses and during senescence. The information derived from this study lays a foundation for further functional investigation of the R2R3-MYB gene family in tobacco.
Characterization and distribution of R2R3-MYB genes in tobacco genome
In this study, a total of 174 NtR2R3-MYB genes were identified in tobacco and renamed NtMYB1 to NtMYB174 (Table 1). Information on these NtR2R3-MYB genes and their corresponding proteins is shown in Table 1 and Additional file 1: Table S1, including gene ID, location, number of exons, protein length (aa), molecular weight (MW), theoretical isoelectric point (pI) and subcellular location. The protein lengths varied greatly, from 192 aa (NtMYB37, NtMYB39) to 505 aa (NtMYB96). The molecular weights ranged from 22,003.55 Da (NtMYB37) to 55,710.67 Da (NtMYB96), and the theoretical isoelectric point (pI) ranged from 4.80 (NtMYB56) to 9.77 (NtMYB3). Subcellular localization prediction indicated that the majority of NtR2R3-MYB members were located in the nucleus. In addition, the chromosome positions of some NtR2R3-MYB genes could not yet be determined because the tobacco genome sequence is incomplete.
A total of 107 NtR2R3-MYB genes were unevenly distributed on 24 chromosomes of tobacco, while 67 genes were mapped to unattributed scaffolds (Fig. 1). Chromosome 4 contained the largest number of NtR2R3-MYBs (13 genes), while chromosome 17 contained 10 NtR2R3-MYBs and chromosome 22 contained 8 NtR2R3-MYBs. In contrast, chromosome 11 contained only 1 NtR2R3-MYB gene. In our study, 3 pairs of tandemly duplicated genes on chromosome 4 (NtMYB72/112, NtMYB72/138, NtMYB112/138) and 62 pairs of segmentally duplicated genes were identified in the tobacco R2R3-MYB gene family (Fig. 2, Additional file 2: Table S2).
Phylogenetics and gene structure of the NtR2R3-MYBs
To investigate the evolutionary relationships among NtR2R3-MYB genes, a phylogenetic tree was constructed from the amino acid sequences of the 174 NtR2R3-MYB proteins using the MEGA-X software (Fig. 3A). The NtR2R3-MYBs were classified into 24 subgroups (A to X) with at least 50% bootstrap support on the phylogenetic tree (Fig. 3A). However, 5 NtR2R3-MYBs (NtMYB139, NtMYB9, NtMYB61, NtMYB103 and NtMYB27) could not be assigned to any of the 24 subgroups because of low bootstrap values (< 50%). Among these subgroups, subgroup A (27 members) and subgroup X (21 members) were the two largest, together representing more than 27% of the total NtR2R3-MYB members. In contrast, subgroups B, C, F, J, K, N and T each contained only two members.
Gene structure analysis of the NtR2R3-MYBs (Fig. 3B) showed that the number of introns varied from 0 to 11. The highest number of introns (11) was found in NtMYB173 and NtMYB174. Notably, a large number of R2R3-MYB genes (119 members, 68.39%) had a conserved gene structure with two introns and three exons, while 10 NtR2R3-MYBs lacked introns completely. Similar exon-intron structural patterns were found among members of the same subgroup; in particular, exon number and exon length were relatively conserved (Fig. 3).
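The proportions quoted above can be rechecked with a few lines of arithmetic. The sketch below uses only counts stated in the text (174 genes in total, 119 genes with two introns, and 27 plus 21 members in subgroups A and X); the helper name percent is introduced here for illustration only.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole

TOTAL_GENES = 174          # NtR2R3-MYB genes identified
TWO_INTRON_GENES = 119     # genes with two introns and three exons
SUBGROUP_A, SUBGROUP_X = 27, 21

two_intron_share = percent(TWO_INTRON_GENES, TOTAL_GENES)
largest_subgroups_share = percent(SUBGROUP_A + SUBGROUP_X, TOTAL_GENES)

print(f"two-intron genes: {two_intron_share:.2f}%")
print(f"subgroups A and X: {largest_subgroups_share:.2f}%")
```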
Domain and motif analysis of the NtR2R3-MYBs
A total of 20 conserved motifs were predicted for the 174 NtR2R3-MYB proteins using the online MEME program (Fig. 4). The length and conserved sequence of each motif are listed in Additional file 3: Table S3. Motif composition and distribution were relatively conserved among members of the same subgroup (Fig. 4). Motif1, motif2 and motif3 were located at the N-terminus of the majority of NtR2R3-MYB protein sequences. These three motifs contained conserved tryptophan residues and corresponded to the R2 and R3 domains. The type and number of motifs were similar within a subgroup, which suggests that the motif pattern might be related to the function of the MYB protein. Different subgroups usually possessed specific motifs, most of which were located at the C-terminus. For example, motif7 and motif9 were specific to subgroup P; motif10 and motif13 appeared only in subgroup A; and motif8, motif14 and motif18 were present only in subgroups G, L and Q, respectively.
To further explore the conserved domain of the NtR2R3-MYB proteins, a multiple alignment of the 174 NtR2R3-MYB protein sequences was performed with the DNAMAN software, and sequence logos of the R2 and R3 repeats were generated with WebLogo (Fig. 5). The NtR2R3-MYB members possess the typical characteristics of MYB conserved domains, and the R2 and R3 repeats of NtR2R3-MYB each contain about 52 amino acid residues. The R2 repeat contains three highly conserved tryptophan residues (W), which form a hydrophobic core in the HTH structure. The first tryptophan residue (W) of the R3 repeat was often replaced by a phenylalanine (F), isoleucine (I) or leucine (L) residue, whereas the second and third tryptophan residues were highly conserved; this result is consistent with A. thaliana. We also observed that some other amino acid residues were highly conserved, such as G-2, E-8, D-9, L-12, G-20, L-33, R-35, K-38, S-39, C-40, R-41, L-42, R-43, N-46, L-48 and P-50 in the R2 repeat and E-10, G-22, N-23, I-28, A-29, P-33, G-34, R-35, T-36, D-37, N-38, K-41 and N-42 in the R3 repeat. These highly conserved amino acid residues may act together with the conserved tryptophan residues to maintain the helix-turn-helix (HTH) structure of the MYB transcription factor.
Promoter cis-elements analysis of NtR2R3-MYBs
Promoter cis-elements play critical roles in the initiation of gene expression. A total of 37 cis-regulatory elements were identified in the promoter regions of NtR2R3-MYB genes and were classified into four categories: cellular development, phytohormone, environmental stress and light-responsive elements (Fig. 6, Additional file 4: Table S4). Seven cis-acting elements were related to cell development, namely CAT-box, MSA-like, GCN4_motif, CCAAT-box, MBSI, HD-Zip 1 and RY-element. Twelve phytohormone-responsive elements were identified, namely CGTCA-motif, TGACG-motif, ABRE, P-box, TGA-element, TCA-element, AuxRR-core, TATC-box, GARE-motif, AuxRE, A-box and O2-site. These cis-elements are involved in JA/MeJA, abscisic acid, gibberellin, auxin and salicylic acid responsiveness and in the regulation of zein metabolism. Among them, ABRE elements were the most common in the NtR2R3-MYB gene promoters. Almost all NtR2R3-MYB genes contained at least one phytohormone-responsive element in their promoter regions. In addition, ten light-responsive elements were identified, including GT1-motif, G-Box, Box 4, MRE, ATC-motif, Sp1, ATCT-motif, ACE, 3-AF1 binding site and AAAC-motif. Eight cis-regulatory elements associated with responses to external or environmental stresses were also present. This category includes the low-temperature responsive element (LTR), anaerobic induction elements (ARE, GC-motif), drought-inducibility element (MBS), defense and stress responsive element (TC-rich repeats), circadian control element (circadian), wound-responsive element (WUN-motif) and the AT-rich element. G-Box, Box 4 and ARE elements appeared in most promoters of NtR2R3-MYB genes. The expression of these genes might therefore be regulated by phytohormones, light, defense signal transduction and abiotic stresses during tobacco growth.
Phylogenetic analysis of the NtR2R3-MYB gene family
To investigate the phylogenetic relationships of the R2R3-MYB gene family, a phylogenetic tree was generated from the 174 tobacco R2R3-MYB protein sequences and 126 Arabidopsis R2R3-MYB protein sequences using MEGA-X with the maximum likelihood (ML) method (Fig. 7). According to the bootstrap values (> 50%) of the phylogenetic tree, the R2R3-MYB family was clustered into 38 subfamilies. The tobacco R2R3-MYB members were distributed in 34 of these subgroups (named N1-N34). No NtR2R3-MYB member fell into subfamilies S3, S6, S12 or S15, while three subfamilies (N10, N20, N21) contained only R2R3-MYB members from tobacco. In addition, the NtR2R3-MYB members were mainly distributed in N9 (9), N11 (11), N22 (13), N25 (11), N27 (9) and N34 (10), and in these subfamilies tobacco had about two to three times as many R2R3-MYB members as Arabidopsis. These results indicate that tobacco and Arabidopsis R2R3-MYB genes share common ancestors, and that specific expansion and divergence also occurred after the two species separated during evolution.
Expression changes of NtR2R3-MYB genes in five senescence stages of tobacco leaves
To analyze the expression patterns of NtR2R3-MYBs in tobacco leaves, the FPKM values of NtR2R3-MYB genes at five senescence stages of tobacco leaves were obtained from the transcriptome data (Additional file 5: Table S5), and NtR2R3-MYB genes with no expression or a low expression level (FPKM < 0.5) were excluded. Finally, the expression profiles of 78 NtR2R3-MYB genes were generated (Fig. 8). The results showed that NtR2R3-MYB genes were differentially expressed in tobacco leaves at different senescence stages (Fig. 8), and these 78 NtR2R3-MYB genes were classified into three groups (Fig. 8 I to III). Group I included 9 NtR2R3-MYB members, which were highly expressed at the M5 stage but showed relatively low expression at the other four senescence stages (M1-M4). In group II, most NtR2R3-MYB genes were highly expressed at the M1 stage and their expression decreased steadily with increasing maturity. In group III, the expression of NtR2R3-MYB genes first increased and then decreased as senescence progressed. These results indicate that there may be functional diversity among NtR2R3-MYB members during tobacco growth and development.
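The low-expression filter described above can be illustrated with a minimal sketch. The text does not say whether the FPKM < 0.5 cut-off was applied per stage or across all stages; the sketch assumes a gene is kept when it reaches the threshold in at least one of the five stages. The function name filter_expressed, the gene labels and the FPKM values are hypothetical and are not taken from Table S5.

```python
def filter_expressed(fpkm_by_gene, threshold=0.5):
    """Keep genes whose FPKM reaches the threshold in at least one stage.

    Assumption: a gene is dropped only if it stays below the threshold
    in every stage (M1 to M5); the paper does not spell this rule out.
    """
    return {
        gene: values
        for gene, values in fpkm_by_gene.items()
        if max(values) >= threshold
    }

# Hypothetical FPKM values for stages M1..M5 (illustration only).
example = {
    "geneA": [0.0, 0.1, 0.2, 0.3, 0.4],   # always below 0.5, dropped
    "geneB": [0.2, 0.4, 1.5, 3.0, 12.0],  # rises towards M5, kept
    "geneC": [8.0, 5.0, 2.0, 0.6, 0.1],   # declines from M1, kept
}

kept = filter_expressed(example)
print(sorted(kept))
```

A stricter variant would require the threshold to be met in every stage; swapping max for min in the comprehension gives that behaviour.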
Expression of NtR2R3-MYB genes in response to abiotic stress
To analyze the expression patterns of NtR2R3-MYBs in response to abiotic stress, gene expression was investigated with the Genevestigator tools based on transcriptome data. NtR2R3-MYB genes with no expression or a low expression level (FPKM < 0.5) were excluded (Additional file 6: Table S6). Finally, the expression profiles of 69 NtR2R3-MYB genes were generated. The results showed that many genes were significantly up-regulated or down-regulated compared with the control group under cold and salt stress conditions (Fig. 9), and these genes, including NtMYB34, NtMYB38, NtMYB42, NtMYB44, NtMYB46, NtMYB63, NtMYB67, NtMYB73, NtMYB79, NtMYB82 and NtMYB104, clustered together with the S1 and S7 subfamilies of Arabidopsis. It has been reported that members of the Arabidopsis R2R3-MYB gene family in the S1 and S7 subgroups are related to various stress responses. Homologous genes in the same subgroup may have similar biological functions. To further explore the possible functions of the tobacco R2R3-MYB genes, 15 tobacco R2R3-MYB genes that clustered with Arabidopsis R2R3-MYB members in the S1 and S7 subgroups of the phylogenetic tree (Fig. 7) were selected for qRT-PCR analysis under cold, drought and salt stresses (Fig. 10). Compared with the control, four NtR2R3-MYB genes (NtMYB36, NtMYB45, NtMYB46 and NtMYB110) were significantly up-regulated and seven NtR2R3-MYB genes (NtMYB38/41/42/63/67/79/82) were significantly down-regulated under cold stress. Under salt stress, 13 NtR2R3-MYB genes were down-regulated, the exceptions being NtMYB38 and NtMYB46. Under drought stress, eight NtR2R3-MYB genes (NtMYB34/36/38/42/46/73/79/82) were significantly up-regulated. Interestingly, NtMYB46 was significantly up-regulated in response to all three stresses. In addition, the expression patterns of the 15 genes in Fig. 9 and Fig. 10 were not completely consistent, which may be due to experimental or sampling differences. These results imply functional divergence among the tobacco R2R3-MYB genes.
R2R3-MYB gene family members are widely distributed in eukaryotes. With the development of genome sequencing, whole-genome analysis of the R2R3-MYB gene family has been carried out in numerous species, identifying 126 genes in Arabidopsis thaliana, 89 in watermelon, 110 in rice, 157 in maize, 244 in soybean, and so on [7, 14, 27,28,29]. In this study, 174 R2R3-MYB genes were identified in tobacco, more than in Arabidopsis, maize, rice and watermelon, but fewer than in soybean. As an allotetraploid, Nicotiana tabacum has a genome size of 4.5 Gb, while the genomes of Arabidopsis, rice, maize and soybean are 125 Mb, 430 Mb, 2300 Mb and 1.025 Gb, respectively [30,31,32,33,34]. It therefore seems that there is no direct correlation between the number of R2R3-MYB genes and genome size in these plants. It has been reported that polyploidization and region-specific gene duplication (tandem and segmental duplication) are important mechanisms for the generation and expansion of gene families in plants. In our study, phylogenetic analysis showed that the majority of subfamilies contained R2R3-MYB members from both tobacco and Arabidopsis, whereas a few subfamilies contained R2R3-MYB members from only tobacco or only Arabidopsis. This suggests that the genes might be derived from common ancestors and that the R2R3-MYB gene family also underwent species-specific differentiation after the two species separated. A total of 3 pairs of tandemly duplicated genes and 62 pairs of segmentally duplicated genes were identified in the tobacco R2R3-MYB gene family, implying that segmental duplication events were the main source of R2R3-MYB gene family expansion in tobacco; this result is possibly due to the allotetraploid nature of tobacco.
The evolution of a gene family largely depends on the organization of gene structure. The varied lengths of the nucleotide sequences among the 174 NtR2R3-MYBs indicate the complexity of the Nicotiana tabacum L. genome. The molecular weights and isoelectric points of NtR2R3-MYB proteins also differed among family members, suggesting functional divergence. In addition, NtR2R3-MYB proteins contained 20 conserved motifs in different combinations, and members clustered in the same subfamily contained similar types and numbers of motifs, demonstrating both the conservation and the diversification of the R2R3-MYB gene family of tobacco. It has been reported that the ancestral MYB gene had no introns, that intron insertion events occurred only occasionally in the MYB domain region, and that the resulting intron pattern was conserved during long-term evolution [5, 36]. In our study, the majority of NtR2R3-MYB genes had the typical splicing pattern of three exons and two introns, with the introns located in the conserved R2 and R3 repeats, which is consistent with previous reports in other plants [14, 37]. In addition, the NtR2R3-MYB proteins contain the highly conserved MYB domain (R2 repeat and R3 repeat). The R2 repeat contained a conserved LRPD motif at its C-terminus and three highly conserved tryptophan residues (W), whereas the R3 repeat was variable at the first tryptophan residue (W), which could be substituted by other residues such as phenylalanine (F), isoleucine (I) and leucine (L). Meanwhile, amino acid substitutions occurred frequently at some sites of the MYB domain, and these regions may play important roles in the evolution and functional differentiation of tobacco R2R3-MYB proteins. Similar results have been reported in other species, including Arabidopsis and Zea mays (Fig. 5) [14, 25].
The cis-elements of a promoter play a key role in initiating gene expression. Differences in the cis-regulatory elements of promoter sequences may result in different gene expression patterns, as reported in pepper. In our study, a total of 37 cis-elements were identified in the promoter regions of NtR2R3-MYB genes: 7 related to cell development, 12 phytohormone-responsive elements, 10 light-responsive elements and 8 stress-related elements, suggesting that the regulatory elements of NtR2R3-MYB genes have different functions. These highly diverse cis-regulatory elements in the promoter regions of NtR2R3-MYB genes may also reflect functional divergence at the transcriptional level.
R2R3-MYB genes have been shown to be related to biotic and abiotic stress. For example, overexpression of wheat TaMYB344 enhanced the tolerance of transgenic tobacco to drought, heat and high salt stress. OsMYB2 was induced by cold, salt and dehydration stress in rice. In the present study, gene expression profiling data were used to dissect the functional roles of NtR2R3-MYB genes, and different expression patterns were identified among NtR2R3-MYB genes in response to various abiotic stresses (Fig. 9). This result indicates functional differentiation of NtR2R3-MYB genes. In addition, the expression of 15 NtR2R3-MYB genes was analyzed by qRT-PCR under three abiotic stress conditions: cold, salinity and drought. A total of 13 NtR2R3-MYB genes, all except NtMYB44 and NtMYB104, were significantly regulated by at least two treatments, implying that NtR2R3-MYBs may be involved in the cross-talk of different signaling pathways under stress. Generally, genes with similar structures cluster in the same subfamily and may have similar biological functions. It has been reported that two R2R3-MYB genes of Arabidopsis (AtMYB60 and AtMYB94) clustered in subfamily S1 (Fig. 7) are involved in physiological regulation under salt and drought treatments [41, 42], and therefore the orthologs clustered in S1 may have similar functions. In this study, two NtR2R3-MYB genes (NtMYB38 and NtMYB46) that clustered with AtMYB60 and AtMYB94 in the S1 subfamily showed similar response patterns under salt and drought stresses, suggesting that NtMYB38 and NtMYB46 may be involved in the response to salt and drought stresses. Further experiments should be conducted to validate these functions.
Tobacco leaf senescence is often regarded as a kind of adversity, and R2R3-MYB genes have been shown to be involved in senescence-associated changes in secondary metabolites, including flavonoids, carotenoids and chlorophyll [23, 43]. In this study, diverse expression patterns were found among the NtR2R3-MYB genes, suggesting functional differentiation among family members (Fig. 8). For example, NtMYB3/4 and NtMYB3/6 appeared to be segmentally duplicated gene pairs (Fig. 2, Additional file 2: Table S2), and our results showed that NtMYB4 and NtMYB6 were expressed at all five senescence stages, with especially high expression at the M5 stage, whereas NtMYB3 was not expressed at any of the investigated senescence stages (M1-M5) (Fig. 8). It has been reported that there are theoretically three outcomes in the evolution of duplicate genes: nonfunctionalization, neofunctionalization and subfunctionalization. Therefore, it was inferred that NtMYB3 degenerated and lost its original function during evolution. Notably, the expression levels of the majority of members in group II, such as NtMYB146, NtMYB108, NtMYB90, NtMYB159, NtMYB120 and NtMYB17 (Fig. 8), decreased in step with increasing senescence. These genes may be closely related to leaf senescence and could be further developed as markers of leaf senescence or maturity. Whether the differential expression of NtR2R3-MYB genes leads to changes in secondary metabolites such as flavonoids, carotenoids or chlorophyll still needs to be investigated. In short, our results provide useful information for further functional exploration of these genes.
In this study, a total of 174 R2R3-MYB genes were identified in the tobacco (Nicotiana tabacum L.) genome and divided into 24 subfamilies. The NtR2R3-MYB genes were distributed unevenly on 24 tobacco chromosomes. Three pairs of NtR2R3-MYB genes were found to have originated from tandem duplication and 62 pairs from segmental duplication. Cis-regulatory elements in the NtR2R3-MYB promoters were related to cellular development, phytohormones, environmental stress and light responsiveness. The NtR2R3-MYB genes showed differential expression patterns in tobacco leaves of different maturity, and 15 NtR2R3-MYBs also responded differentially to different abiotic stresses (cold, salt and drought). Our results provide valuable information for further functional study of NtR2R3-MYB genes in tobacco.
Identification of R2R3-MYB genes in tobacco
A total of 126 known Arabidopsis R2R3-MYB protein sequences were downloaded from the Arabidopsis Information Resource (TAIR, http://www.arabidopsis.org/) database. These sequences were used as queries in the online BLASTP tool (E ≤ 1e−5) to identify R2R3-MYB family members in the tobacco genome sequences of the Sol Genomics Network database (https://solgenomics.net/organism/Nicotiana_attenuata/genome) [45, 46]. Redundant tobacco protein sequences were removed manually, and candidate protein sequences containing complete MYB domains (PF00249) were then confirmed as the final R2R3-MYB protein sequences using the Conserved Domain Database (CDD) of NCBI (https://www.ncbi.nlm.nih.gov/cdd/). These final tobacco R2R3-MYB genes were renamed (NtMYBs). The physicochemical features and the subcellular localization of the NtR2R3-MYB protein sequences were analyzed with the online ExPASy tool (http://Web.ExPASY.Org/protparam/) and the Softberry service platform ProtComp 9.0 (http://linux1.softberry.com/berry.phtml), respectively.
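The E-value filter and redundancy removal described above can be sketched as follows. The study used the online BLASTP tool and manual curation, so this is only an illustration under stated assumptions: the hits are assumed to be available in the standard 12-column BLAST tabular layout (query ID in column 1, subject ID in column 2, E-value in column 11), and the function name candidate_hits and all sequence IDs in the sample are hypothetical.

```python
import csv
import io

def candidate_hits(tabular_text, max_evalue=1e-5):
    """Return the set of subject IDs with at least one hit at or below max_evalue.

    Assumes 12-column tab-separated BLAST output; using a set collapses
    subjects that are hit by several queries (redundancy removal).
    """
    hits = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        subject_id, evalue = row[1], float(row[10])
        if evalue <= max_evalue:
            hits.add(subject_id)
    return hits

# Hypothetical hits (illustration only, not data from the study).
sample = (
    "AtQuery1\tNtCandidate1\t62.5\t120\t45\t0\t1\t120\t5\t124\t2e-40\t150\n"
    "AtQuery2\tNtCandidate1\t58.0\t118\t49\t1\t3\t120\t7\t124\t1e-35\t138\n"
    "AtQuery1\tNtCandidate2\t30.1\t80\t50\t3\t10\t89\t40\t119\t0.002\t35.0\n"
)

print(sorted(candidate_hits(sample)))
```

Candidates that pass this step would still need the domain check described in the text before being accepted as R2R3-MYB members.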
Gene structure and conserved motif analysis
The GFF format file of tobacco gene structure was downloaded from the Solanaceae genome database (https://sol-genomics.net/), and the NtR2R3-MYB gene structure (exon-intron) was drawn using the online software Gene Structure Display Server (GSDS) (http://gsds.cbi.pku.edu.cn/). The motifs of the NtR2R3-MYB proteins were obtained from the online MEME program (http://meme-suite.org/, v5.1.1) with the following parameters: the minimum width, maximum width and maximum number of motifs were set to 6 bp, 100 bp and 20, respectively. The conserved NtR2R3-MYB domain was visualized using the WebLogo platform (http://weblogo.berkeley.edu/). The cis-regulatory elements in the promoter region (2000 bp upstream of the start codon) of the NtR2R3-MYBs were searched with the online program PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/).
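The promoter definition above (2000 bp upstream of the start codon) depends on the strand of the gene. The following sketch shows one way to compute the window from GFF-style coordinates; it is an assumption about how such a window could be derived, not the authors' script. Coordinates are 1-based and inclusive as in GFF, and the function name promoter_window and its parameters are hypothetical.

```python
def promoter_window(cds_start, cds_end, strand, seq_length, upstream=2000):
    """Return (begin, end) of the region upstream of the start codon.

    cds_start and cds_end are the 1-based inclusive limits of the coding
    region on the reference sequence. On the '+' strand the start codon
    sits at cds_start; on the '-' strand it sits at cds_end, so the
    promoter lies at higher coordinates. The window is clipped to the
    sequence ends; None is returned when no upstream sequence exists.
    """
    if strand == "+":
        begin = max(1, cds_start - upstream)
        end = cds_start - 1
    elif strand == "-":
        begin = cds_end + 1
        end = min(seq_length, cds_end + upstream)
    else:
        raise ValueError(f"unknown strand: {strand!r}")
    if begin > end:
        return None
    return begin, end

print(promoter_window(5001, 7000, "+", 100_000))
print(promoter_window(5001, 7000, "-", 100_000))
```

For a gene on the minus strand the extracted sequence would additionally need to be reverse-complemented before submission to a promoter analysis tool.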
Chromosome localization and gene duplication
The physical positions and chromosomal distribution of NtR2R3-MYB genes were obtained using the MapInspect software (http://mapinspect.software.informer.com/). Possible segmental duplication and tandem duplication events were defined according to the method reported by Wang et al. (2010). Both the chromosomal localization and the duplication events of the NtR2R3-MYB genes were displayed graphically using the TBtools software.
Multi-sequence alignment and phylogenetic classification
To explore the evolutionary relationships of the R2R3-MYB gene family, the full-length R2R3-MYB protein sequences from Arabidopsis and tobacco were used for phylogenetic tree construction. Multiple sequence alignment was performed using the ClustalW program in the MEGA-X software, and the phylogenetic tree was constructed using the maximum likelihood (ML) method with 1000 bootstrap replicates.
NtR2R3-MYB genes expression analysis
The tobacco variety Cuibi 1 (CB-1) was used in this study. To analyze the expression profile of NtR2R3-MYB genes at different senescence stages, middle leaves (8th to 10th) at five senescence stages (M1, M2, M3, M4 and M5), judged by their appearance, were collected for testing. The FPKM (Fragments Per Kilobase of transcript sequence per Millions base pairs sequenced) values of the NtR2R3-MYB genes at the five senescence stages of tobacco leaves were extracted from our recent RNA-Seq data. The expression profiles of NtR2R3-MYB genes at different senescence stages were measured by their FPKM values, and the heat map was generated using the Heatmap function of the R gplots package. Additionally, expression analysis of NtR2R3-MYBs under salt (SRP193166) and cold stress (SRP097876) in tobacco was performed using the Genevestigator software. Genes with a low expression level (FPKM < 0.5) were filtered out.
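For readers unfamiliar with the unit, the sketch below computes FPKM using the commonly used definition: fragments mapped to a transcript, divided by the transcript length in kilobases and by the total number of mapped fragments in millions. The pipeline that produced the values in this study is not described here, so the function and the example numbers are purely illustrative.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments.

    Equivalent to fragments / (length_kb * mapped_millions); the factor
    1e9 combines the 1e3 (bp to kb) and 1e6 (fragments to millions).
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments * 1e9 / (transcript_length_bp * total_mapped_fragments)

# Hypothetical example: 500 fragments on a 2000 bp transcript in a
# library of 25 million mapped fragments.
value = fpkm(500, 2000, 25_000_000)
print(value)
```

With this normalization, a gene below the FPKM < 0.5 cut-off in the example library would have fewer than 25 fragments on a 2000 bp transcript.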
Plant treatments and quantitative real-time PCR analysis
To further decipher the expression pattern of NtR2R3-MYB genes in response to various abiotic, tobacco seeds were sown in sterilized mixed soil (vermiculite: humus = 1:1) under the condition of 22 °C and 16 h light/8 h dark photoperiod for 60 days . The plantlets of 60 days were transplanted into a tray with a nutrient solution for 3 days in growth chamber, and then were exposed to the abiotic treatments, including cold (4 °C), drought (10% polyethylene glycol) and salt (200 mM NaCl), respectively. Untreated plantlets were used as control (CK). The samples for gene expression analysis were collected 6 h after treatment, and three biological replicates per treatment and 3 leaves for each sample from different plantlet were gathered and these samples were immediately stored at − 80 °C prior to RNA extraction.
Total RNA was extracted using the Hipure Plant RNA Mini Kit (Magen Biotech, Shanghai, China), and cDNA was synthesized using the SMART kit (Takara) according to the manufacturer’s protocol. The qRT-PCR primers for the NtR2R3-MYB genes were designed with the online software Primer3 (https://bioinfo.ut.ee/primer3-0.4.0/) and are listed in Additional file 7: Table S7. Real-time quantitative RT-PCR (qRT-PCR) was performed with SYBR Green qPCR Premix (Low ROX). Each 20 μl reaction mixture contained 1.5 μl cDNA, 1 × Taq SYBR Green qPCR Premix (Monad, China) and a primer pair at a concentration of 0.2 μM. The two-step thermal cycling profile was 95 °C for 5 min, followed by 40 cycles of 95 °C for 30 s and 60 °C for 60 s. Three technical replicates were performed for each sample. The relative expression level was calculated by the 2^(−ΔΔCt) method.
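The relative quantification step can be illustrated with a short Python sketch. It assumes amplification efficiencies close to 100% for both target and reference genes, which is the usual premise of this calculation. The Ct values are hypothetical, and the reference gene is left unnamed because this section does not specify it.

```python
def mean(values):
    """Arithmetic mean of technical replicate Ct values."""
    return sum(values) / len(values)


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene, treated vs. control.

    dCt  = Ct(target) - Ct(reference), computed within each condition
    ddCt = dCt(treated) - dCt(control)
    fold = 2 ** (-ddCt)
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    ddct = delta_treated - delta_control
    return 2.0 ** (-ddct)


# Hypothetical Ct values, three technical replicates each.
target_cold = [24.0, 24.1, 23.9]   # target gene, cold-treated sample
ref_cold = [18.0, 18.1, 17.9]      # reference gene, cold-treated sample
target_ck = [26.0, 26.2, 25.8]     # target gene, untreated control (CK)
ref_ck = [18.0, 17.9, 18.1]        # reference gene, untreated control (CK)

fold = relative_expression(mean(target_cold), mean(ref_cold),
                           mean(target_ck), mean(ref_ck))
# ddCt is about -2 here, so the fold change is about 4.
print(round(fold, 2))
```

Averaging the technical replicates before computing ΔCt is one common choice; per-replicate ΔCt values could be used instead when an estimate of variability is needed.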
Availability of data and materials
The datasets generated and/or analysed during the current study are available in the NCBI Sequence Read Archive repository, https://www.ncbi.nlm.nih.gov/sra/PRJNA772550.
- NtR2R3-MYB: R2R3-MYB gene of Nicotiana tabacum
- FPKM: Fragments Per Kilobase of transcript sequence per Millions base pairs sequenced
- GSDS: Gene Structure Display Server
- qRT-PCR: Quantitative real-time PCR
Paz-Ares J, Ghosal D, Wienand U, Peterson PA, Saedler H. The regulatory c1 locus of Zea mays encodes a protein with homology to myb proto-oncogene products and with structural similarities to transcriptional activators. EMBO J. 1987;6(12):3553–8.
Ogata K, Kanei-Ishii C, Sasaki M, Hatanaka H, Nagadoi A, Enari M, et al. The cavity in the hydrophobic core of Myb DNA-binding domain is reserved for DNA recognition and trans-activation. Nat Struct Biol. 1996;3(2):178–87.
Wilkins O, Nahal H, Foong J, Provart NJ, Campbell MM. Expansion and diversification of the Populus R2R3-MYB family of transcription factors. Plant Physiol. 2009;149(2):981–93.
Dubos C, Stracke R, Grotewold E, Weisshaar B, Martin C, Lepiniec L. MYB transcription factors in Arabidopsis. Trends Plant Sci. 2010;15(10):573–81.
Jiang C, Gu J, Chopra S, Gu X, Peterson T. Ordered origin of the typical two- and three-repeat Myb genes. Gene. 2004;326:13–22.
Rosinski JA, Atchley WR. Molecular evolution of the Myb family of transcription factors: evidence for polyphyletic origin. J Mol Evol. 1998;46(1):74–83.
Yanhui C, Xiaoyuan Y, Kun H, Meihua L, Jigang L, Zhaofeng G, et al. The MYB transcription factor superfamily of Arabidopsis: expression analysis and phylogenetic comparison with the rice MYB family. Plant Mol Biol. 2006;60(1):107–24.
Martin C, Paz-Ares J. MYB transcription factors in plants. Trends Genet. 1997;13(2):67–73.
Jin H, Martin C. Multifunctionality and diversity within the plant MYB-gene family. Plant Mol Biol. 1999;41(5):577–85.
Baumann K, Perez-Rodriguez M, Bradley D, Venail J, Bailey P, Jin H, et al. Control of cell and petal morphogenesis by R2R3-MYB transcription factors. Development. 2007;134(9):1691–701.
Jakoby MJ, Falkenhan D, Mader MT, Brininstool G, Wischnitzki E, Platz N, et al. Transcriptional profiling of mature Arabidopsis trichomes reveals that NOECK encodes the MIXTA-like transcriptional regulator MYB106. Plant Physiol. 2008;148(3):1583–602.
Zhang Y, Cao G, Qu LJ, Gu H. Characterization of Arabidopsis MYB transcription factor gene AtMYB17 and its possible regulation by LEAFY and AGL15. J Genet Genomics. 2009;36(2):99–107.
Cone KC, Cocciolone SM, Burr FA, Burr B. Maize anthocyanin regulatory gene pl is a duplicate of c1 that functions in the plant. Plant Cell. 1993;5(12):1795–805.
Du H, Feng BR, Yang SS, Huang YB, Tang YX. The R2R3-MYB transcription factor gene family in maize. PLoS One. 2012;7(6):e37463.
Ampomah-Dwamena C, Thrimawithana AH, Dejnoprat S, Lewis D, Espley RV, Allan AC. A kiwifruit (Actinidia deliciosa) R2R3-MYB transcription factor modulates chlorophyll and carotenoid accumulation. New Phytol. 2019;221(1):309–25.
Liu G, Ren G, Guirgis A, Thornburg RW. The MYB305 transcription factor regulates expression of nectarin genes in the ornamental tobacco floral nectary. Plant Cell. 2009;21(9):2672–87.
Onkokesung N, Reichelt M, van Doorn A, Schuurink RC, van Loon JJ, Dicke M. Modulation of flavonoid metabolites in Arabidopsis thaliana through overexpression of the MYB75 transcription factor: role of kaempferol-3,7-dirhamnoside in resistance to the specialist insect herbivore Pieris brassicae. J Exp Bot. 2014;65(8):2203–17.
Tang Y, Bao X, Zhi Y, Wu Q, Guo Y, Yin X, et al. Overexpression of a MYB family gene, OsMYB6, increases drought and salinity stress tolerance in transgenic rice. Front Plant Sci. 2019;10:168.
Luo Q, Liu R, Zeng L, Wu Y, Jiang Y, Yang Q, et al. Isolation and molecular characterization of NtMYB4a, a putative transcription activation factor involved in anthocyanin synthesis in tobacco. Gene. 2020;760:144990.
Song Z, Luo Y, Wang W, Fan N, Wang D, Yang C, et al. NtMYB12 positively regulates flavonol biosynthesis and enhances tolerance to low Pi stress in Nicotiana tabacum. Front Plant Sci. 2020;10:1683.
Chen WK, Yu KJ, Liu B, Lan YB, Sun RZ, Li Q, et al. Comparison of transcriptional expression patterns of carotenoid metabolism in 'Cabernet Sauvignon' grapes from two regions with distinct climate. J Plant Physiol. 2017;213:75–86.
Hörtensteiner S. Chlorophyll degradation during senescence. Annu Rev Plant Biol. 2006;57:55–77.
Liu Y, Wang L, Liu H, Zhao R, Liu B, Fu Q, et al. The antioxidative defense system is involved in the premature senescence in transgenic tobacco (Nicotiana tabacum NC89). Biol Res. 2016;49(1):30.
Zhang Q, Zhai J, Shao L, Lin W, Peng C. Accumulation of Anthocyanins: an adaptation strategy of Mikania micrantha to low temperature in winter. Front Plant Sci. 2019;10:1049.
Stracke R, Werber M, Weisshaar B. The R2R3-MYB gene family in Arabidopsis thaliana. Curr Opin Plant Biol. 2001;4(5):447–56.
Riechmann JL, Heard J, Martin G, Reuber L, Jiang C, Keddie J, et al. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes. Science. 2000;290(5499):2105–10.
Katiyar A, Smita S, Lenka SK, Rajwanshi R, Chinnusamy V, Bansal KC. Genome-wide classification and expression analysis of MYB transcription factor families in rice and Arabidopsis. BMC Genomics. 2012;13:544.
Du H, Yang SS, Liang Z, Feng BR, Liu L, Huang YB, et al. Genome-wide analysis of the MYB transcription factor superfamily in soybean. BMC Plant Biol. 2012;12:106.
Wang J, Liu Y, Chen XL, Kong QS. Characterization and divergence analysis of duplicated R2R3-MYB genes in watermelon. J Am Soc Hortic Sci. 2020;145(5):281.
Sierro N, Battey JN, Ouadi S, Bakaher N, Bovet L, Willig A, et al. The tobacco genome sequence and its comparison with those of tomato and potato. Nat Commun. 2014;5:3833.
Kaul S, Koo HL, Jenkins J, et al. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000;408(6814):796–815.
Burr B. Mapping and sequencing the rice genome. Plant Cell. 2002;14(3):521–3.
Schnable PS, Ware D, Fulton RS, et al. The B73 maize genome: complexity, diversity, and dynamics. Science. 2009;326(5956):1112–5.
Schmutz J, Cannon SB, Schlueter J, et al. Genome sequence of the palaeopolyploid soybean. Nature. 2010;463(7278):178–83.
Cannon SB, Mitra A, Baumgarten A, Young ND, May G. The roles of segmental and tandem gene duplication in the evolution of large gene families in Arabidopsis thaliana. BMC Plant Biol. 2004;4:10.
Fedorova L, Fedorov A. Introns in gene evolution. Genetica. 2003;118(2–3):123–31.
Jiang C, Gu X, Peterson T. Identification of conserved gene structures and carboxy-terminal motifs in the Myb gene family of Arabidopsis and Oryza sativa L. ssp. indica. Genome Biol. 2004;5(7):R46.
Islam S, Sajib SD, Jui ZS, Arabia S, Ghosh A. Genome-wide identification of glutathione S-transferase gene family in pepper, its classification, and expression profiling under different anatomical and environmental conditions. Sci Rep. 2019;9:15.
Wei Q, Chen R, Wei X, Liu Y, Zhao S, Yin X, et al. Genome-wide identification of R2R3-MYB family in wheat and functional characteristics of the abiotic stress responsive gene TaMYB344. BMC Genomics. 2020;21(1):792.
Yang A, Dai X, Zhang WH. A R2R3-type MYB gene, OsMYB2, is involved in salt, cold, and dehydration tolerance in rice. J Exp Bot. 2012;63(7):2541–56.
Cominelli E, Galbiati M, Vavasseur A, Conti L, Sala T, Vuylsteke M, et al. A guard-cell-specific MYB transcription factor regulates stomatal movements and plant drought tolerance. Curr Biol. 2005;15(13):1196–200.
Lee SB, Suh MC. Cuticular wax biosynthesis is up-regulated by the MYB94 transcription factor in Arabidopsis. Plant Cell Physiol. 2015;56(1):48–60.
Zhu F, Luo T, Liu C, Wang Y, Yang H, Yang W, et al. An R2R3-MYB transcription factor represses the transformation of α- and β-branch carotenoids by negatively regulating expression of CrBCH2 and CrNCED5 in flavedo of Citrus reticulate. New Phytol. 2017;216(1):178–92.
Lynch M, Conery JS. The evolutionary fate and consequences of duplicate genes. Science. 2000;290(5494):1151–5.
Edwards KD, Fernandez-Pozo N, Drake-Stowe K, Humphry M, Evans AD, Bombarely A, et al. A reference genome for Nicotiana tabacum enables map-based cloning of homeologous loci implicated in nitrogen utilization efficiency. BMC Genomics. 2017;18(1):448.
Fernandez-Pozo N, Menda N, Edwards JD, Saha S, Tecle IY, Strickler SR, et al. The Sol Genomics Network (SGN)--from genotype to phenotype to breeding. Nucleic Acids Res. 2015;43:D1036–41.
Marchler-Bauer A, Derbyshire MK, Gonzales NR, Lu S, Chitsaz F, Geer LY, et al. CDD: NCBI's conserved domain database. Nucleic Acids Res. 2015;43:D222–6.
Wilkins MR, Gasteiger E, Bairoch A, Sanchez JC, Williams KL, Appel RD, et al. Protein identification and analysis tools in the ExPASy server. Methods Mol Biol. 1999;112:531–52.
Hu B, Jin J, Guo AY, Zhang H, Luo J, Gao G. GSDS 2.0: an upgraded gene feature visualization server. Bioinformatics. 2015;31(8):1296–7.
Bailey TL, Johnson J, Grant CE, Noble WS. The MEME suite. Nucleic Acids Res. 2015;43:W39–49.
Crooks GE, Hon G, Chandonia JM, Brenner SE. WebLogo: a sequence logo generator. Genome Res. 2004;14(6):1188–90.
Lescot M, Déhais P, Thijs G, Marchal K, Moreau Y, Van de Peer Y, et al. PlantCARE, a database of plant cis-acting regulatory elements and a portal to tools for in silico analysis of promoter sequences. Nucleic Acids Res. 2002;30(1):325–7.
Wu GQ, Li ZQ, Cao H, Wang JL. Genome-wide identification and expression analysis of the WRKY genes in sugar beet (Beta vulgaris L.) under alkaline stress. PeerJ. 2019;7:e7817.
Wang L, Guo K, Li Y, Tu Y, Hu H, Wang B, et al. Expression profiling and integrative analysis of the CESA/CSL superfamily in rice. BMC Plant Biol. 2010;10:282.
Chen C, Chen H, Zhang Y, Thomas HR, Frank MH, He Y, et al. TBtools: an integrative toolkit developed for interactive analyses of big biological data. Mol Plant. 2020;13(8):1194–202.
Kumar S, Stecher G, Li M, Knyaz C, Tamura K. MEGA X: molecular evolutionary genetics analysis across computing platforms. Mol Biol Evol. 2018;35(6):1547–9.
Qin M, Zhang B, Gu G, Yuan J, Yang X, Yang J, et al. Genome-wide analysis of the G2-like transcription factor genes and their expression in different senescence stages of tobacco (Nicotiana tabacum L.). Front Genet. 2021;12:626352.
Zhang B, Yang J, Gu G, Jin L, Chen C, Lin Z, et al. Integrative analyses of biochemical properties and transcriptome reveal the dynamic changes in leaf senescence of tobacco (Nicotiana tabacum L.). Front Genet. 2021;12:790167.
Walter W, Sánchez-Cabo F, Ricote M. GOplot: an R package for visually combining expression data with functional analysis. Bioinformatics. 2015;31(17):2912–4.
Xu J, Chen Q, Liu P, Jia W, Chen Z, Xu Z. Integration of mRNA and miRNA analysis reveals the molecular mechanism underlying salt and alkali stress tolerance in tobacco. Int J Mol Sci. 2019;20:2391.
Jin J, Zhang H, Zhang J, Liu P, Chen X, Li Z, et al. Integrated transcriptomics and metabolomics analysis to characterize cold stress responses in Nicotiana tabacum. BMC Genomics. 2017;18:496.
Song Z, Pan F, Yang C, Jia H, Jiang H, He F, et al. Genome-wide identification and expression analysis of HSP90 gene family in Nicotiana tabacum. BMC Genet. 2019;20:35.
Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, et al. Primer3--new capabilities and interfaces. Nucleic Acids Res. 2012;40(15):e115.
Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2(−Delta Delta C(T)) method. Methods. 2001;25(4):402–8.
We thank the reviewers and editors for their patience with this work.
This research was financially supported by Fujian Tobacco Company (2019350000240137) and Nanping Tobacco Company (NYK2021-10-03). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional file 1.
Additional file 2.
Additional file 3.
Additional file 4.
Additional file 5.
Additional file 6.
Additional file 7.
Cite this article
Yang, J., Zhang, B., Gu, G. et al. Genome-wide identification and expression analysis of the R2R3-MYB gene family in tobacco (Nicotiana tabacum L.). BMC Genomics 23, 432 (2022). https://doi.org/10.1186/s12864-022-08658-7
- Nicotiana tabacum L.
- R2R3-MYB transcription factors
- Phylogenetic analysis
- Stress response
- Gene expression