Genome-wide analysis of bHLH transcription factor and involvement in the infection by yellow leaf curl virus in tomato (Solanum lycopersicum)
BMC Genomics volume 16, Article number: 39 (2015)
The basic helix-loop-helix (bHLH) proteins are a superfamily of transcription factors that bind to specific DNA target sites. They have been well characterized in model plants such as Arabidopsis and rice and have been shown to be important regulatory components in many different biological processes. However, no systematic analysis of the bHLH transcription factor family has yet been reported in tomato. Tomato yellow leaf curl virus (TYLCV) threatens tomato production worldwide by causing leaf yellowing, leaf curling, plant stunting and flower abscission.
A total of 152 bHLH transcription factors were identified in the tomato genome. Phylogenetic analysis of bHLH domain sequences from Arabidopsis and tomato facilitated classification of these genes into 26 subfamilies. The evolutionary and possible functional relationships revealed by this analysis are supported by other criteria, including the chromosomal distribution of these genes, the conservation of motifs and exon/intron structural patterns, and the predicted DNA-binding activities within subfamilies. Distribution mapping showed that bHLH genes were located on the 12 tomato chromosomes. Among the 152 bHLH genes in the tomato genome, 96 were detected in the TYLCV-susceptible and TYLCV-resistant tomato breeding lines before (0 dpi) and after (357 dpi, a mixed sample of 3, 5 and 7 days post infection) TYLCV infection. As anticipated, gene ontology (GO) analysis indicated that most bHLH genes are related to the regulation of macromolecule metabolic processes and gene expression. Only four bHLH genes were differentially expressed between 0 and 357 dpi. Virus-induced gene silencing (VIGS) of one bHLH gene, SlybHLH131, in resistant lines can lead to cell death.
In the present study, 152 bHLH transcription factor genes were identified. One of these genes, SlybHLH131, was found to be involved in TYLCV infection through qRT-PCR expression analysis and VIGS validation. The isolation and identification of these bHLH transcription factors help clarify the molecular genetic basis for the genetic improvement of tomatoes and the development of functional gene resources for transgenic research. In addition, these findings may aid in uncovering an unexplored mechanism during TYLCV infection in tomatoes.
The basic/helix-loop-helix (bHLH) proteins have DNA-binding and dimerization capabilities. They are a superfamily of transcription factors (TFs) that have many different functions in essential physiological and developmental processes in animals and plants [1-3]. The bHLH domain contains approximately 60 amino acids with two functionally distinct regions, the basic region and the HLH region. The basic region is 15 amino acids long and typically includes six basic residues. It is located at the N terminus of the domain and functions as a DNA-binding motif. The HLH region contains two amphipathic α helices separated by a loop region of variable length. The HLH region acts as a dimerization domain and allows the formation of homodimers or heterodimers [1,6]. Among all bHLH motifs, 19 amino acids have been found to be highly conserved in organisms ranging from yeast to mammals. Outside of these conserved bHLH domains, the proteins exhibit considerable sequence divergence. Some bHLH proteins have been shown to bind to sequences containing the core element known as the E-box (5′-CANNTG-3′), the most common form of which is the G-box (5′-CACGTG-3′). The nucleotides flanking the core element may also have a role in binding specificity [5,8].
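The E-box and G-box definitions given above can be made concrete with a short Python sketch. It scans a DNA string for CANNTG hexamers and flags those that match CACGTG. The function name and the example sequence are hypothetical and are not taken from this study.

```python
import re

# E-box core: CANNTG, where N is any nucleotide. A lookahead is used so that
# overlapping occurrences would also be reported.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
G_BOX = "CACGTG"


def find_e_boxes(seq):
    """Return (0-based position, hexamer, is_g_box) for every E-box in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1), m.group(1) == G_BOX)
            for m in E_BOX.finditer(seq)]


# Hypothetical promoter fragment, for illustration only.
hits = find_e_boxes("ttCACGTGaaCAGCTGcc")
for pos, hexamer, is_g in hits:
    print(pos, hexamer, "G-box" if is_g else "other E-box")
```

Only the forward strand is scanned here; because both CACGTG and CAGCTG are palindromic, a reverse-strand scan would report the same sites for these two motifs.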
Based on phylogenetic relationships, DNA-binding motifs, and functional properties, the bHLH TF family has been divided into six main groups in metazoans [2,9,10]. In brief, Group A bHLH proteins bind to the CAGCTG core sequence of E-boxes. Group B includes a large number of proteins (such as Max, Myc, MITF, and USF) that bind to the G-box sequence CACGTG [11,12]. Group C proteins contain an additional protein-protein interaction region (the PAS domain) and bind to ACGTG or GCGTG sequences. Group D proteins have the HLH region but lack the basic DNA-binding domain. Group E proteins have Pro or Gly residues within the basic region and bind preferentially to a typical sequence, CACGNG. Group F consists of the COE domain proteins; they have diverse sequences compared with other groups and another domain for dimerization and DNA binding [2,14,15].
Only a small number of plant bHLH proteins have been characterized functionally, far fewer than in animals. In Arabidopsis, 162 bHLH-encoding genes have been identified from the analysis of genome sequences and divided into 21 subfamilies according to their phylogenetic relationships [3,16]. A total of 167 and 230 bHLH TFs have been identified in the rice (Oryza sativa) and Chinese cabbage (Brassica rapa) genomes, respectively [17,18]. These have been divided into 22 and 24 subfamilies, respectively. Phylogenetic analysis showed that the plant bHLH proteins comprise 26 subfamilies, 20 of which were present in the common ancestors of extant mosses and vascular plants. Most bHLH proteins identified so far have been functionally characterized in Arabidopsis, and their roles have been shown to include regulation of fruit dehiscence, anther and epidermal cell development, hormone signaling, and stress responses.
Tomato (Solanum lycopersicum) is an economically important vegetable worldwide. The annual global production of tomato in 2012 was more than 160 million tons, including 50 million tons in China (http://faostat.fao.org/). The tomato genome has been sequenced and assembled by the International Tomato Genome Sequencing Project (http://solgenomics.net/organism/Solanum_lycopersicum/genome) because tomato is economically important and is a model species for the study of fruit ripening. A high-quality genome sequence for domesticated tomato and more than 30,000 proteins have been obtained. Tomato yellow leaf curl virus (TYLCV) is widespread and currently ranks third among the most economically and scientifically important plant viruses worldwide. The symptoms of TYLCV infection in young plants include stunted growth, upward curling of leaf margins, marked reduction in leaf size, mottling and yellowing of young leaves, and flower abscission, leading to severe yield loss. Currently, five major loci conferring resistance to TYLCV have been identified from different wild tomato relatives: Ty-1, Ty-3 and Ty-4 from S. chilense, Ty-2 from S. habrochaites, and Ty-5 from S. peruvianum [24-28]. Among them, Ty-1 and Ty-3 were found to be allelic and have been cloned. They encode RNA-dependent RNA polymerases (RDR) and may be involved in RNA silencing. In addition, Ty-2, Ty-4 and Ty-5 have been mapped to chromosomes 11, 3, and 4, respectively, using molecular markers [26,30-32]. cDNA library comparisons of susceptible and resistant tomato lines before and after TYLCV infection showed approximately 70 genes that are preferentially expressed in a tomato line with a resistance introgressed from S. habrochaites.
Using whole transcriptome sequencing of the TYLCV-resistant tomato breeding line CLN2777A (R) and the TYLCV-susceptible tomato breeding line TMXA48-4-0 (S), 209 and 809 genes were found to be differentially expressed in the R and S tomato lines, respectively.
In tomatoes, LeFER (SlybHLH083), a bHLH protein encoded by Solyc06g051550.2.1, was the first identified regulator of iron nutrition in plants. LeFER plays an important role in the Fe-deficiency response of tomatoes. Style2.1 (SlybHLH031), encoded by Solyc02g084880.2.1, is the major quantitative trait locus responsible for style length; this important floral attribute has been shown to be associated with the evolution of self-pollination and was cloned in cultivated tomatoes. However, the tomato bHLH protein family has not been analyzed at a genome-wide level, and the phylogenetic relationships within this protein family remain poorly understood. In this study, a total of 152 SlybHLH genes were identified in the tomato genomic sequence, and phylogenetic analyses were carried out to evaluate the relationships among these genes. Changes in the global expression pattern of SlybHLH genes in R and S lines infected by TYLCV were analyzed to provide insight into the regulation of the response to TYLCV. The SlybHLH genes exhibited a variety of expression patterns, suggesting a novel layer of regulation for the response to TYLCV in tomato.
Database search for bHLH genes
The Pfam database (http://pfam.sanger.ac.uk/) was used to screen the genomes of tomato (S. lycopersicum; http://solgenomics.net/organism/Solanum_lycopersicum/genome) and potato (S. tuberosum; http://phytozome.jgi.doe.gov/). The hidden Markov model (HMM) of the helix-loop-helix DNA-binding domain (PF00010.21) was used to identify the putative bHLH proteins in tomato and potato. The hmmsearch tool, with an expected value (e-value) cut-off of 1.0, was used to identify the proteins. These sequences were then verified using the SMART tool (http://smart.embl-heidelberg.de/). The Arabidopsis thaliana bHLH proteins were retrieved from the TAIR database (http://www.arabidopsis.org/) according to a previous report.
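The e-value filtering step can be sketched in Python as follows. The sketch assumes that hmmsearch was run with its tabular output option (for example `hmmsearch --tblout hits.tbl -E 1.0 HLH.hmm proteins.fasta`, where the file names are hypothetical) and that the full-sequence E-value is in the fifth whitespace-separated column of that table. The protein names and scores in the example are made up for illustration and are not results from this study.

```python
def filter_hmmsearch_tblout(text, max_evalue=1.0):
    """Return {target: E-value} for hits whose full-sequence E-value passes the cut-off."""
    hits = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment/header lines
        fields = line.split()
        target = fields[0]         # column 1: target (protein) name
        evalue = float(fields[4])  # column 5: full-sequence E-value
        if evalue <= max_evalue:
            # keep the lowest E-value if a target appears more than once
            hits[target] = min(evalue, hits.get(target, evalue))
    return hits


# Made-up table rows; protein names and scores are hypothetical.
example = """\
# target name  accession  query name  accession   E-value  score  bias
protA          -          HLH         PF00010.21  1.2e-15  55.1   0.3
protB          -          HLH         PF00010.21  0.4      12.0   0.1
protC          -          HLH         PF00010.21  3.5      8.0    0.0
"""
hits = filter_hmmsearch_tblout(example)
print(sorted(hits))
```

A cut-off of 1.0 is permissive, which is why a second verification step (SMART in this study) is needed to remove false positives.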
Phylogenetic analysis and identification of conserved motifs and gene structure
The complete amino acid sequences were screened against the Pfam database to identify the domains of bHLH transcription factors. MEGA6 software was used to construct neighbor-joining (NJ) distance trees from tomato bHLH protein domain sequences. Bootstrap analysis with 1,000 replicates was used to assess the statistical reliability of the trees. In addition, the NJ method of the PHYLIP software (version 3.6; http://evolution.genetics.washington.edu/phylip.html) was used with 1,000 bootstrap replicates to create another phylogenetic tree to validate the results of the NJ method in MEGA6. A phylogenetic tree of all the identified bHLH protein domains was also constructed. The identified bHLH domains were aligned using the ClustalX 2.0 program with default settings.
To identify the conserved motifs in tomato bHLH proteins, the Multiple Expectation-maximization for Motif Elicitation (MEME) program version 4.9.0 was used with default parameters, except for the following: (1) the optimum motif width was set to ≥10 and ≤100; (2) the maximum number of motifs was set to ten. MEME software (http://meme.sdsc.edu/meme/) was used to search for conserved motifs in the complete amino acid sequences of bHLH proteins.
Collinear correlations of bHLH genes in the tomato, potato, and Arabidopsis genomes
The OrthoMCL program (http://www.orthomcl.org/cgi-bin/OrthoMclWeb.cgi) was used to identify the orthologous and paralogous genes in tomatoes, potatoes and Arabidopsis. Briefly, BLASTP, with an e-value ≤ 1e−10, and orthomclPairs were used to find orthologs, inparalogs and coorthologs in these three species. The Circos tool was used to link these genes to chromosomes. In addition, the relationships of orthologous and paralogous genes in these three species were also shown using the Circos tool. The bHLH genes in tomato were searched for duplication events (e-value < 1e−10, identity > 90%).
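The duplication thresholds stated above (e-value < 1e−10 and identity > 90%) can be applied to an all-against-all BLASTP result with a few lines of Python. This sketch assumes the standard 12-column tabular BLAST output, in which columns 1, 2, 3 and 11 hold the query, subject, percent identity and e-value. The gene names and scores are hypothetical, and the sketch does not reproduce the OrthoMCL clustering itself.

```python
def duplicated_pairs(blast_tab, max_evalue=1e-10, min_identity=90.0):
    """Return sorted, unordered gene pairs passing the e-value and identity thresholds."""
    pairs = set()
    for line in blast_tab.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity = float(fields[2])   # column 3: percent identity
        evalue = float(fields[10])    # column 11: e-value
        if query == subject:
            continue  # ignore self-hits
        if evalue < max_evalue and identity > min_identity:
            # sort so that A-B and B-A are counted as one pair
            pairs.add(tuple(sorted((query, subject))))
    return sorted(pairs)


# Hypothetical hits, for illustration only.
rows = [
    ["geneA", "geneA", "100.0", "300", "0", "0", "1", "300", "1", "300", "0.0", "600"],
    ["geneA", "geneB", "95.2", "290", "14", "0", "1", "290", "1", "290", "1e-150", "520"],
    ["geneB", "geneA", "95.2", "290", "14", "0", "1", "290", "1", "290", "1e-150", "520"],
    ["geneA", "geneC", "62.0", "250", "95", "0", "1", "250", "1", "250", "1e-40", "180"],
]
example = "\n".join("\t".join(r) for r in rows)
print(duplicated_pairs(example))
```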
Chromosome distribution and gene duplications
To determine the physical locations of bHLH genes, the starting and ending positions of all bHLH genes on each chromosome were obtained from the tomato database. The MapInspect software was used to draw the locations of the tomato bHLH genes (http://mapinspect.software.informer.com/). We used the plant genome duplication database (PGDD, available at http://chibba.agtec.uga.edu/duplication/) to retrieve the duplicate chromosomal blocks and then identified the bHLH genes within these blocks, which allowed us to identify duplicate tomato bHLH genes. The PGDD is a public database used to identify and catalogue plant genes in terms of intra-genomic or cross-genomic syntenic relationships.
RNA data collection and data mining
Transcriptomic data for the TYLCV-resistant breeding line CLN2777A (R) and the susceptible breeding line TMXA48-4-0 (S), comprising uninfected samples (0 dpi) and mixed samples collected 3, 5, and 7 days post infection (357 dpi), were downloaded from the NCBI SRA database (SRA097118) and analyzed as described in a previous study. Enrichment of gene ontology (GO) categories was performed with the agriGO analysis toolkit (http://bioinfo.cau.edu.cn/agriGO/) using the TopGO ‘elim’ algorithm for the aspects ‘biological process’ and ‘subcellular localization’. The selected categories were sorted from the lowest to the highest P value (P < 0.01).
Validation of differentially expressed genes by quantitative RT-PCR
Four bHLH genes that were differentially expressed in R or S lines were selected and subjected to quantitative RT-PCR validation (Additional file 1: Table S1). Primers for quantitative RT-PCR were designed using Primer5 software, and primer specificity was evaluated by blasting primer sequences against the NCBI database. PCR amplifications were performed in a real-time thermal cycler qTOWER 2.0/2.2 (Analytik Jena, Germany) with 15 μl of final volumes containing 1.0 μl of cDNA, 0.5 μl each primer (10 μM), 6 μl of sterile water, and 7.5 μl (2×) SYBR Premix ExTaq™ II Kit (TaKaRa, Japan). The conditions for amplification were as follows: 5 min of denaturation at 95°C followed by 40 cycles of 95°C for 10 s, 60°C for 20 s, and 72°C for 10 s. The expression levels of selected genes were normalized to α-Tubulin (Solyc04g077020.2) expression. Relative gene expression was calculated using the 2^(−ΔΔCT) method. Three biological replicates were performed for each of the selected genes.
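For readers unfamiliar with the ΔΔCT calculation, a minimal Python sketch is given here. The target gene is normalized to the reference gene (α-Tubulin in this study) within each sample, and the treated sample is then compared with the control. The function name and the CT values are hypothetical, and the calculation assumes roughly 100% amplification efficiency for both primer pairs.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of the target gene (treated vs. control) by the 2^(-ddCT) method."""
    delta_treated = ct_target_treated - ct_ref_treated  # dCT in the treated sample
    delta_control = ct_target_control - ct_ref_control  # dCT in the control sample
    delta_delta = delta_treated - delta_control         # ddCT
    return 2.0 ** (-delta_delta)


# Hypothetical CT values: the target crosses the threshold two cycles earlier
# after treatment while the reference gene is unchanged.
fold = relative_expression(22.0, 18.0, 24.0, 18.0)
print(fold)
```

In this example ΔCT is 4 in the treated sample and 6 in the control, so ΔΔCT is −2 and the fold change is 2^2 = 4.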
Validation of candidate genes with virus-induced gene silencing (VIGS) and cell death analysis
The tobacco rattle virus (TRV)-mediated VIGS system was used to silence a bHLH gene (Solyc10g008270.2). Briefly, pTRV-containing Agrobacterium EHA105 was cultured in liquid LB medium, resuspended in infiltration medium at an O.D. value of 2.0, and cultured at room temperature for 4 h. Three-week-old seedlings were infiltrated by pressure inoculation in the leaves with a needleless syringe. For the VIGS experiments, agroinfiltration was performed two weeks after TRV inoculation.
Dead cells were identified by staining with lacto-phenol trypan blue and DAB as previously described [50,51]. To visualize cell death, stems were stained by boiling in lacto-phenol trypan blue (10 ml lactic acid, 10 ml glycerol, 10 g phenol, and 10 mg trypan blue, dissolved in 10 ml distilled water), followed by destaining with chloral hydrate (2.5 g ml−1). Dead cells were then examined with a Ni-U microscope (Nikon, Japan).
Identification and classification of bHLH proteins in tomato
A hidden Markov model search of the tomato genome identified 152 putative bHLH proteins (Additional file 1: Table S2). To verify the reliability of our criteria, we performed simple modular architecture research tool (SMART) analysis of the 152 putative SlybHLH protein sequences and found that all of them had a typical bHLH domain. The number of bHLH TFs in tomatoes exceeded that of many metazoans and fungi, but was less than that found in some plants, such as rice (170), Chinese cabbage (230), potatoes (127), soybeans (289) and maize (289). In addition, the density of bHLH proteins in the entire tomato (0.198) and potato (0.175) genomes was found to be less than that in most plant species, such as Arabidopsis (1.111) and rice (0.46). Most angiosperm lineages have experienced one or more rounds of ancient polyploidy. The genomes of tomato and potato have undergone recent triplication events, but few individual tomato/potato genes remain triplicated. Therefore, these events might have led to relatively fewer bHLH genes in tomato compared to other plant species.
Multiple sequence alignments, predicted DNA-binding ability and conserved residues
To examine the sequence features of these tomato bHLH domains, a multiple sequence alignment of the 152 bHLH amino acid sequences was performed. There were four conserved regions in the bHLH domain sequences, including one basic region, two helix regions and one loop region (Figure 1A, Additional file 1: Table S3). The basic regions have five basic residues, but five of these proteins did not have the basic region (Additional file 2: Figure S1). The loop was found to be the most divergent region in terms of both length and amino acid composition. From the alignment, 19 residues were identified that were identical in at least 50% of the 152 tomato bHLH domains (Figure 1B). Among these 19 residues, nine were present in more than 75% of the sequences (Glu-9, Arg-10, Arg-12, Arg-13, Leu-23, Leu-26, Lys-38, Leu-53 and Leu-64 in this alignment).
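The identification of conserved alignment positions can be illustrated with a minimal Python sketch. For each alignment column it finds the most frequent residue and reports the column if that residue occurs in at least a given fraction of all sequences. The function name and the four-sequence toy alignment are hypothetical. Computing the fraction over all sequences, with gaps counted as non-matching, is an assumption about how the percentages in the text were derived.

```python
from collections import Counter


def conserved_columns(aligned, threshold=0.5):
    """Return (1-based position, residue, fraction) for columns whose most
    common residue occurs in at least `threshold` of all sequences."""
    n = len(aligned)
    result = []
    for pos, column in enumerate(zip(*aligned), start=1):
        counts = Counter(c for c in column if c != "-")  # ignore gap characters
        if not counts:
            continue  # column contains only gaps
        residue, count = counts.most_common(1)[0]
        fraction = count / n
        if fraction >= threshold:
            result.append((pos, residue, fraction))
    return result


# Toy alignment of equal-length sequences, for illustration only.
toy = ["ERLK", "ERMK", "DR-K", "ERLA"]
print(conserved_columns(toy, threshold=0.5))
print(conserved_columns(toy, threshold=0.75))
```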
Five residues (His-5, Glu-9, Arg-10, Arg-12, and Arg-13), five residues (Ile-16, Leu-23, Leu-26, Val-27, and Pro-29), two residues (Lys-38 and Asp-40) and seven residues (Ala-50, Lys-53, Glu-55, Ala-56, Ile-57, Tyr-59, and Lys-64) made up the basic region, the first helix region, the loop region and the second helix region, respectively. All of these conserved residues were consistent with previous studies [3,17,18]. Leu-23 in the first helix region was conserved in all 152 bHLH proteins, suggesting that this residue is extremely important for promoting dimerization among bHLH proteins.
The basic region of the bHLH domain can bind to DNA and is critical for function. Using the criteria described by Massari and Murre, the SlybHLH proteins were divided into several categories based on sequence information in the N-terminal region of the bHLH domains (Figure 1C, Additional file 1: Table S2). As was done with Arabidopsis and Chinese cabbage, the SlybHLH proteins of tomato were divided into two major groups according to the 17 N-terminal amino acids within the bHLH protein domain: 119 DNA-binding and 33 non-DNA-binding proteins. The DNA-binding bHLHs were further divided into two groups with different predicted target sequences depending on the presence or absence of residues Glu-9 and Arg-12 in the basic region. Group 1A comprised 92 putative E-box-binding proteins with conserved Glu-9/Arg-12 residues, and Group 1B comprised 27 non-E-box-binding proteins lacking these residues (Figure 1C). Three residues in the basic region of the bHLH domain, His/Lys-5, Glu-9 and Arg-13, were found to constitute the classic G-box-binding region. Group 1A can therefore be subdivided further into two subgroups: 1A1, whose 89 proteins are predicted to bind G-boxes, and 1A2, whose three members are predicted to bind other types of E-boxes (non-G-box proteins).
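The classification logic described in this paragraph can be sketched as a small decision procedure in Python. Several details are assumptions rather than statements from the text: the threshold for calling a domain DNA-binding (at least five basic residues among the first 17 positions, counting Lys, Arg and His), and position 13 for the arginine of the G-box-binding triad, which is inferred from the conserved basic-region residues listed earlier (His-5, Glu-9, Arg-10, Arg-12, Arg-13). The function name and the test sequences are hypothetical.

```python
BASIC_RESIDUES = set("KRH")  # assumption: Lys, Arg and His are counted as basic


def classify_bhlh(domain, min_basic=5, basic_len=17):
    """Classify an aligned bHLH domain by its N-terminal basic region.

    `domain` is assumed to follow the alignment numbering used in the text,
    so that domain[i - 1] is the residue at position i.
    """
    def residue(position):
        return domain[position - 1]

    n_basic = sum(aa in BASIC_RESIDUES for aa in domain[:basic_len])
    if n_basic < min_basic:  # assumed threshold, not stated in the text
        return "non-DNA-binding"
    if not (residue(9) == "E" and residue(12) == "R"):
        return "non-E-box-binding"           # Group 1B
    if residue(5) in "HK" and residue(13) == "R":
        return "G-box-binding"               # Subgroup 1A1
    return "non-G-box E-box-binding"         # Subgroup 1A2


# Hypothetical 17-residue basic regions, for illustration only.
examples = {
    "AAAAHAAAERARRKAAA": "G-box-binding",
    "AAAAAAAAERKRRKAAA": "non-G-box E-box-binding",
    "AAAAHAAAARKRRKAAA": "non-E-box-binding",
    "AAAAAAAAEAAAAAAAA": "non-DNA-binding",
}
for seq in examples:
    print(seq, classify_bhlh(seq))
```

The group sizes reported in the text are internally consistent with this hierarchy: 89 + 3 = 92 E-box binders, 92 + 27 = 119 DNA binders, and 119 + 33 = 152 proteins in total.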
Phylogenetic analysis of the bHLH transcription factor family
To assess the evolutionary relationships of the SlybHLH genes, an NJ phylogenetic tree was generated from the multiple sequence alignment of the conserved bHLH TF domains in tomato and Arabidopsis with 1,000 bootstrap replicates. Twenty-six subfamilies were identified according to the clade support values, the topology of the tree, and the classification used for Arabidopsis [3,19]. No SlybHLH proteins fell into subfamilies XIII and II, which are present in Arabidopsis; therefore, tomato contained 24 bHLH subfamilies in our analysis (Figure 2). To further validate the reliability of the NJ tree built with MEGA 6.0, NJ and maximum parsimony analyses were also used to generate phylogenetic trees with PHYLIP software (Additional file 3: Figure S2 and Additional file 4: Figure S3). In the PHYLIP NJ tree, 96.7% (147/152) of the SlybHLH proteins were placed into the same subfamilies as in the MEGA 6.0 NJ tree, indicating that the two methods are in very good agreement.
In order to assess differences in protein structure, MEME was used to identify conserved motifs in the tomato bHLH proteins. Ten conserved motifs were identified and named motif 1 through motif 10. In general, bHLH proteins clustered in the same subfamily shared similar motif compositions, which indicated functional similarities among members of the same subfamily. The tomato bHLH proteins were found to have a similar structure within every subfamily (Additional file 4: Figure S3). The pattern of intron positions can also provide important evidence to support phylogenetic relationships in a gene family. Here, GSDS tools were used to show the gene structures of the SlybHLH genes (Additional file 5: Figure S4). Among the 152 tomato bHLH genes, the number of introns ranged from 0 to 10, and most members of the same subfamily had similar intron/exon structures. For example, the members of subfamilies III(d + e) and XV each have only one intron. These results demonstrated that proteins within the same subfamily share close evolutionary relationships.
Collinear correlations of bHLH genes in tomatoes, potatoes, and Arabidopsis
The Solanum lineage has experienced two consecutive genome triplications: one is ancient and shared with rosids, and the other is more recent. In this study, the correlation between tomato, potato, and Arabidopsis bHLH genes was analyzed using the OrthoMCL program. Here, 167 gene pairs were found to be orthologous between tomatoes and potatoes, but only 61 orthologous pairs were found between tomato and Arabidopsis (Additional file 1: Table S4). These results were consistent with the close relationship of tomatoes and potatoes. Among the orthologous gene pairs shared by tomatoes and potatoes, each tomato bHLH gene corresponded to one to four potato bHLH genes. These results suggest that bHLH TF genes in potato were duplicated during evolution. In addition, paralogous bHLH gene pairs were also analyzed. A total of 61, 72, and 81 bHLH gene pairs were identified in Arabidopsis, tomatoes, and potatoes, respectively (Additional file 1: Table S5). The relationships of orthologous and paralogous bHLH genes among these three species were visualized using the Circos software (Figure 3).
Chromosome distribution and gene duplication of the bHLH TF family
The physical map positions of the bHLH genes on tomato chromosomes were identified (Figure 4). Among the 152 bHLH TFs, 151 were mapped onto the twelve tomato chromosomes; the exception was SlybHLH001. The largest numbers of bHLH TFs were found on chromosomes 01 (21, 13.8%) and 02 (18, 11.1%). In contrast, there were only 3 (2.0%) and 7 (4.6%) bHLH TF genes on chromosomes 11 and 08, respectively. Furthermore, the bHLH genes were unevenly distributed along the chromosomes, and some bHLH genes clustered in particular chromosomal regions. Relatively high densities of bHLH genes were observed in some chromosomal regions, including the bottom of chromosomes 01, 02, 06, and 09. For example, 17 genes clustered at the ends of chromosomes 01 and 02, with densities of 0.8 and 0.86 genes per Mb, respectively. In contrast, several large chromosomal regions lacked bHLH genes, such as the top half of chromosomes 02 and 08 and the central section of chromosomes 03, 04, 05, and 11 (Figure 4).
Previous reports have analyzed duplication events in rice and Chinese cabbage [17,18]. In the current analysis, we first retrieved the chromosome blocks of the tomato genome from the PGDD database and identified 382 duplicate blocks. A total of 59 duplicate bHLH gene pairs were located in these blocks (Figure 4). The genes in each duplicated pair belong to the same subfamily, indicating that members of some bHLH subfamilies originated from duplication events.
Differential expression of SlybHLH genes in response to TYLCV infection
RNA-seq technology has been shown to provide precise digital information on gene expression and can discriminate between genes of high sequence identity. Using this technology, global gene expression changes in the leaves of two tomato breeding lines, TYLCV-resistant CLN2777A and TYLCV-susceptible TMXA48-4-0, have been analyzed before (0 dpi) and after (357 dpi) TYLCV infection with viruliferous whiteflies. A total of 34,831 transcripts were detected in the R and S lines by alignment to the tomato genome, including 1,386 novel transcripts predicted in tomatoes. The expression levels of mapped genes were normalized as fragments per kilobase of exon per million fragments mapped (FPKM). If the FPKM value of a gene was above zero, the gene was considered expressed. On the basis of this criterion, the expression of the 152 SlybHLH genes was examined in the S and R lines at 0 and 357 dpi. A subset of 96 SlybHLH genes (63.2%) was expressed under both conditions (Figure 5A). GO enrichment analysis revealed that the products of most SlybHLH genes were localized in the nucleus (Figure 5B). In addition, some biological processes such as ‘regulation of macromolecule metabolic process’ (GO: 0060255), ‘regulation of gene expression’ (GO: 0010468) and ‘regulation of metabolic process’ (GO: 0019222) were overrepresented among all SlybHLH genes, indicating that the bHLH genes were involved in transcription and metabolic regulation (Figure 5C).
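The FPKM normalization and the "FPKM above zero" expression criterion can be written out as a short Python sketch. The function names, gene names, fragment counts and exon lengths below are hypothetical and are used only to show the arithmetic.

```python
def fpkm(fragments, exon_length_bp, total_mapped_fragments):
    """Fragments per kilobase of exon per million fragments mapped."""
    # fragments / (exon length in kb) / (mapped fragments in millions)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def expressed_genes(fpkm_by_gene):
    """Genes considered expressed under the 'FPKM above zero' criterion."""
    return [gene for gene, value in fpkm_by_gene.items() if value > 0]


# Hypothetical counts: gene -> (mapped fragments, total exon length in bp).
counts = {"geneA": (500, 2000), "geneB": (0, 1500)}
total_mapped = 20_000_000

values = {gene: fpkm(n, length, total_mapped)
          for gene, (n, length) in counts.items()}
print(values)
print(expressed_genes(values))

# The proportion reported in the text: 96 of 152 genes.
print(round(96 / 152 * 100, 1))
```

For geneA, 500 fragments over 2 kb of exon in a library of 20 million mapped fragments gives 500 / 2 / 20 = 12.5 FPKM.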
Out of 152 SlybHLH genes in the current study, only four were differentially expressed before and after TYLCV infection (log2 fold change >1 and false discovery rate < 0.05) (Table 1). Among the four differentially expressed SlybHLH genes, SlybHLH131 (Solyc10g008270.2.1) was up-regulated in the R line and down-regulated in the S line after TYLCV infection. The expression level of four differentially expressed genes was determined using quantitative RT-PCR to validate the gene expression data from RNA-seq (Additional file 1: Table S1 lists the primers). The results demonstrated that all tested genes revealed a similar trend of transcript accumulation as in RNA-seq analysis (Figure 6).
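The two thresholds used to call differential expression can be combined in a simple filter, sketched here in Python. Applying the fold-change threshold to the absolute value of log2 fold change, so that both up- and down-regulated genes are retained, is an assumption. The gene names and statistics are hypothetical.

```python
def differentially_expressed(results, min_abs_log2fc=1.0, max_fdr=0.05):
    """Return genes with |log2 fold change| > min_abs_log2fc and FDR < max_fdr.

    `results` maps gene name -> (log2 fold change, false discovery rate).
    """
    return [gene for gene, (log2fc, fdr) in results.items()
            if abs(log2fc) > min_abs_log2fc and fdr < max_fdr]


# Hypothetical statistics, for illustration only.
results = {
    "geneA": (2.3, 0.001),    # up-regulated and significant
    "geneB": (-1.7, 0.01),    # down-regulated and significant
    "geneC": (0.4, 0.0001),   # significant, but fold change too small
    "geneD": (3.0, 0.2),      # large fold change, but not significant
}
print(differentially_expressed(results))
```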
VIGS validation of SlybHLH131 gene
To investigate the role of SlybHLH131 in TYLCV resistance in tomato, plants in which the SlybHLH131 gene had been silenced by VIGS at the cotyledon stage were further challenged with TYLCV. One month after agroinfiltration, the success of the TRV silencing system was confirmed by the appearance of cell death in the leaves of S plantlets treated with pTRV1 and pTRV2-SlybHLH131 (Figure 7A). We also observed the development of cell death under a Ni-U microscope and found, by trypan blue and DAB staining, that the SlybHLH131-silenced VIGS lines showed a rapid cell death response in comparison with empty vector lines (Figure 7B and C). A quantitative RT-PCR analysis showed that there were significantly fewer SlybHLH131 transcripts in SlybHLH131-silenced tomato leaves during TRV infection (Figure 7D), indicating that SlybHLH131 was effectively silenced in tomatoes.
The Solanaceae is one of the largest angiosperm families and includes many different vegetables consumed by humans. In recent years, the genomes of many plants in this family have been sequenced, including tomatoes, potatoes, peppers, and tobacco. With the rapid development of bioinformatic analyses, the information stored in various genomes may be explored to elucidate the mechanisms that regulate development and the response to biotic and abiotic stresses. Tomato is a major crop plant and a model system for fruit development. As of 2011, tomato production had doubled from 24 million tons in 2001. However, tomato yellow leaf curl virus disease causes huge losses in tomato production worldwide. This disease is caused by several related begomovirus species. A previous study extended our basic understanding of the response of tomato to TYLCV infection by comparing whole transcriptome expression changes between a TYLCV-resistant line and a TYLCV-susceptible line. In the present study, 152 bHLH transcription factor genes were identified in tomatoes and their differential expression was analyzed in R and S lines before and after TYLCV infection. Four differentially expressed genes were identified in R and S lines, of which SlybHLH077 and SlybHLH079 were located on chromosomes 5 and 6, and SlybHLH131 and SlybHLH132 were mapped to chromosome 10. These genes are not located in the regions of the five known loci (Ty-1 to Ty-5), so they are unlikely to be the major resistance genes but might be involved in the regulatory network of TYLCV resistance. Phylogenetic analysis of the bHLH domain allows division of the SlybHLH family into 24 subfamilies. The clustering of the members within these subfamilies was further supported by additional analysis with regard to other criteria, such as predicted DNA-binding capacity and sequence specificity, and exon/intron distribution pattern within the domain.
These data support the general conclusion that members within subfamilies may have recent common evolutionary origins resulting from various genomic duplication events. They may have related molecular functions. The bHLH subfamily IIId transcription factors (bHLH3, bHLH13, bHLH14, and bHLH17) function redundantly to negatively regulate jasmonate (JA) responses in Arabidopsis. However, the strong sequence diversity outside of the bHLH domain across the members of the SlybHLH family suggests that the expansion of this family in tomatoes involved extensive domain shuffling, as in other organisms. Nevertheless, the non-bHLH amino acid motifs are conserved within each of the bHLH subfamilies (Additional file 5: Figure S4), suggesting that the conservation of these extra domains during plant evolution may have been essential to the function of the bHLH proteins in their respective subfamilies.
The core DNA-binding domain of the bHLH proteins contains the basic region of the bHLH domain, and its residues were found to recognize and bind to the core hexanucleotide. The amino acid sequence in this region provides the major subdivision of the bHLH family, dividing these proteins into those that are predicted to bind DNA and those that are not (Figure 3). Key residues in this region confer the capacity to discriminate between variants of the hexanucleotide motif, namely the canonical E-box (CANNTG) and non-E-box motifs. Additional residues within the basic region confer further DNA-binding site sequence selectivity. These include G-box and non-G-box core motifs. The current analysis showed that there are 92 E-box-binding bHLH proteins in tomatoes, and this ratio was lower than in Arabidopsis and Chinese cabbage, indicating that many different binding motifs exist in the tomato bHLH protein family.
Most bHLH proteins identified so far were functionally characterized in Arabidopsis, and their roles include plant development, fruit dehiscence, phytochrome signaling, hormone signaling and stress responses, such as cold, heat, abscisic acid, jasmonic acid, and the light signaling pathway. Only two bHLH genes have been functionally characterized in tomato. One is the FER gene (SlybHLH083), which is involved in the response to iron acquisition and supply. The other is Style2.1 (SlybHLH031), which is associated with the evolution of self-pollination; natural genetic variation in its promoter is responsible for the evolution from allogamy to autogamy in cultivated tomatoes. In the current RNA-seq data set, the FER gene was not expressed in any of the four samples, and the Style2.1 gene was expressed but not differentially expressed in R or S lines after TYLCV infection.
Members of the same plant bHLH subfamily are frequently involved in the same biological processes. Usually the functions of these proteins overlap, causing them to be partially or totally redundant . The characterized function of bHLH in other species can help the user to predict the function of tomato bHLH in the same subfamily. Jasmonates (JA) are lipid-derived hormones that regulate plant responses to stresses such as wounding and pathogens invasion . JA negatively regulates plant growth and is considered to modulate the distribution of energy to defense responses . In Arabidopsis, the bHLH subfamily IIId members (bHLH3, bHLH13, bHLH14, and bHLH17) act as transcription repressors and function redundantly to negatively regulate JA response. In tomatoes, eight members of the III(d + e) subfamily have been identified, so these bHLH TFs might be involved in the JA response network and defense against TYLCV infection. SlybHLH131, a member of the Ib(2) subfamily was up-regulated in the R line, and down-regulated in the S line. In barley and rice, this subfamily is involved in Fe uptake and homeostasis [61,62]. This suggests that SlybHLH131 might perform some previously unknown function in TYLCV infection. Therefore, our results will pave the way for studies of new functions of bHLH genes in TYLCV infection and will further our understanding of this gene family under other biotic and abiotic stresses in tomato.
An extensive analysis of the tomato bHLH genes was performed, identifying 152 bHLH TFs in the entire tomato genome. These genes can be divided into 26 subfamilies on the basis of phylogeny, protein motifs, and gene structures. This phylogenetic analysis is consistent with previous results. The members of a subfamily may share conserved functions that are not shared with other species. The expression patterns of SlybHLH genes were examined in R and S lines infected with TYLCV, and VIGS results suggested that SlybHLH131 might be involved in TYLCV infection. In summary, this is the first comprehensive and systematic analysis of bHLH transcription factors in tomato, and the results indicate that bHLH genes play a role during TYLCV infection. They may also provide new opportunities for investigating previously unknown mechanisms by which tomatoes tolerate TYLCV infection. Furthermore, our results establish a foundation for future studies of other biotic and abiotic stresses using biochemical and physiological approaches, which will probably reveal the functional significance of this family in tomato.
Murre C, McCaw PS, Baltimore D. A new DNA binding and dimerization motif in immunoglobulin enhancer binding, daughterless, MyoD, and myc proteins. Cell. 1989;56(5):777–83.
Ledent V, Vervoort M. The basic helix-loop-helix protein family: comparative genomics and phylogenetic analysis. Genome Res. 2001;11(5):754–70.
Toledo-Ortiz G, Huq E, Quail PH. The arabidopsis basic/helix-loop-helix transcription factor family. Plant Cell Online. 2003;15(8):1749–70.
Jones S. An overview of the basic helix-loop-helix proteins. Genome Biol. 2004;5(6):226.
Atchley WR, Terhalle W, Dress A. Positional dependence, cliques, and predictive motifs in the bHLH protein domain. J Mol Evol. 1999;48(5):501–16.
Nesi N, Debeaujon I, Jond C, Pelletier G, Caboche M, Lepiniec L. The TT8 gene encodes a basic helix-loop-helix domain protein required for expression of DFR and BAN genes in arabidopsis siliques. Plant Cell Online. 2000;12(10):1863–78.
Zheng X, Wang Y, Yao Q, Yang Z, Chen K. A genome-wide survey on basic helix-loop-helix transcription factors in rat and mouse. Mamm Genome. 2009;20(4):236–46.
Robinson KA, Koepke JI, Kharodawala M, Lopes JM. A network of yeast basic helix–loop–helix interactions. Nucleic Acids Res. 2000;28(22):4460–6.
Atchley WR, Fitch WM. A natural classification of the basic helix–loop–helix class of transcription factors. Proc Natl Acad Sci. 1997;94(10):5172–6.
Simionato E, Ledent V, Richards G, Thomas-Chollier M, Kerner P, Coornaert D, et al. Origin and diversification of the basic helix-loop-helix gene family in metazoans: insights from comparative genomics. BMC Evol Biol. 2007;7(1):33.
Henriksson M, Lüscher B. Proteins of the Myc network: essential regulators of cell growth and differentiation. Adv Cancer Res. 1996;68:109–82.
Goding CR. Mitf from neural crest to melanoma: signal transduction and transcription in the melanocyte lineage. Genes Dev. 2000;14(14):1712–28.
Sun XH, Copeland NG, Jenkins NA, Baltimore D. Id proteins Id1 and Id2 selectively inhibit DNA binding by one class of helix-loop-helix proteins. Mol Cell Biol. 1991;11(11):5603–11.
Fisher A, Caudy M. The function of hairy-related bHLH repressor proteins in cell fate decisions. Bioessays. 1998;20:298–306.
Crozatier M, Valle D, Dubois L, Ibnsouda S, Vincent A. Collier, a novel regulator of Drosophila head development, is expressed in a single mitotic domain. Curr Biol. 1996;6(6):707–18.
Bailey PC, Martin C, Toledo-Ortiz G, Quail PH, Huq E, Heim MA, et al. Update on the basic helix-loop-helix transcription factor gene family in Arabidopsis thaliana. Plant Cell Online. 2003;15(11):2497–502.
Li X, Duan X, Jiang H, Sun Y, Tang Y, Yuan Z, et al. Genome-wide analysis of basic/helix-loop-helix transcription factor family in rice and Arabidopsis. Plant Physiol. 2006;141(4):1167–84.
Song X-M, Huang Z-N, Duan W-K, Ren J, Liu T-K, Li Y, et al. Genome-wide analysis of the bHLH transcription factor family in Chinese cabbage (Brassica rapa ssp. pekinensis). Mol Genet Genomics. 2014;289(1):77–91.
Pires N, Dolan L. Origin and diversification of basic-helix-loop-helix proteins in plants. Mol Biol Evol. 2010;27(4):862–74.
Feller A, Machemer K, Braun EL, Grotewold E. Evolutionary and comparative analysis of MYB and bHLH plant transcription factors. Plant J. 2011;66(1):94–116.
Consortium TTG. The tomato genome sequence provides insights into fleshy fruit evolution. Nature. 2012;485(7400):635–41.
Scholthof K-BG, Adkins S, Czosnek H, Palukaitis P, Jacquot E, Hohn T, et al. Top 10 plant viruses in molecular plant pathology. Mol Plant Pathol. 2011;12(9):938–54.
Glick E, Levy Y, Gafni Y. The viral etiology of tomato yellow leaf curl disease-a review. Plant Protect Sci. 2009;45:81–97.
Hanssen IM, Lapidot M, Thomma BPHJ. Emerging viral diseases of tomato crops. Mol Plant Microbe Interact. 2010;23(5):539–48.
Ji Y, Schuster D, Scott J. Ty-3, a begomovirus resistance locus near the tomato yellow leaf curl virus resistance locus Ty-1 on chromosome 6 of tomato. Mol Breed. 2007;20(3):271–84.
Ji Y, Scott JW, Schuster DJ, Maxwell DP. Molecular mapping of Ty-4, a new tomato yellow leaf curl virus resistance locus on chromosome 3 of tomato. J Am Soc Hortic Sci. 2009;134(2):281–8.
Hanson P, Green SK, Kuo G. Ty-2, a gene on chromosome 11 conditioning geminivirus resistance in tomato. Tomato Genet Coop Rep. 2006;56:17–8.
Anbinder I, Reuveni M, Azari R, Paran I, Nahon S, Shlomo H, et al. Molecular dissection of tomato leaf curl virus resistance in tomato line TY172 derived from Solanum peruvianum. Theor Appl Genet. 2009;119(3):519–30.
Verlaan MG, Hutton SF, Ibrahem RM, Kormelink R, Visser RGF, Scott JW, et al. The tomato yellow leaf curl virus resistance genes Ty-1 and Ty-3 are allelic and code for DFDGD-class RNA–dependent RNA polymerases. PLoS Genet. 2013;9(3):e1003399.
Ji Y, Scott JW, Schuster DJ. Toward fine mapping of the tomato yellow leaf curl virus resistance gene Ty-2 on chromosome 11 of tomato. HortScience. 2009;44(3):614–8.
Hutton SF, Scott JW, Schuster DJ. Recessive resistance to tomato yellow leaf curl virus from the tomato cultivar tyking is located in the same region as Ty-5 on chromosome 4. HortScience. 2012;47(3):324–7.
Yang X, Caro M, Hutton S, Scott J, Guo Y, Wang X, et al. Fine mapping of the tomato yellow leaf curl virus resistance gene Ty-2 on chromosome 11 of tomato. Mol Breed. 2014;34(2):749–60.
Chen T, Lv Y, Zhao T, Li N, Yang Y, Yu W, et al. Comparative transcriptome profiling of a resistant vs. susceptible tomato (Solanum lycopersicum) cultivar in response to infection by tomato yellow leaf curl virus. PLoS One. 2013;8(11):e80816.
Ling H-Q, Bauer P, Bereczky Z, Keller B, Ganal M. The tomato fer gene encoding a bHLH protein controls iron-uptake responses in roots. Proc Natl Acad Sci. 2002;99(21):13938–43.
Chen K-Y, Cong B, Wing R, Vrebalov J, Tanksley SD. Changes in regulation of a transcription factor lead to autogamy in cultivated tomatoes. Science. 2007;318(5850):643–5.
Punta M, Coggill PC, Eberhardt RY, Mistry J, Tate J, Boursnell C, et al. The Pfam protein families database. Nucleic Acids Res. 2012;40(D1):D290–301.
Letunic I, Doerks T, Bork P. SMART 7: recent updates to the protein domain annotation resource. Nucleic Acids Res. 2012;40(D1):D302–5.
Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 2013;30(12):2725–9.
Felsenstein J. PHYLIP - phylogeny inference package (Version 3.2). Cladistics. 1989;5:164–6.
Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997;25(24):4876–82.
Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, et al. MEME Suite: tools for motif discovery and searching. Nucleic Acids Res. 2009;37 suppl 2:W202–8.
Guo AY, Zhu QH, Chen X, Luo JC. GSDS: a gene structure display server. Yi Chuan. 2007;29(8):1023–6.
Li L, Stoeckert CJ, Roos DS. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 2003;13(9):2178–89.
Krzywinski M, Schein J, Birol İ, Connors J, Gascoyne R, Horsman D, et al. Circos: an information aesthetic for comparative genomics. Genome Res. 2009;19(9):1639–45.
Lee T-H, Tang H, Wang X, Paterson AH. PGDD: a database of gene and genome duplication in plants. Nucleic Acids Res. 2013;41(D1):D1152–8.
Du Z, Zhou X, Ling Y, Zhang Z, Su Z. agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res. 2010;38 suppl 2:W64–70.
Alexa A, Rahnenführer J, Lengauer T. Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics. 2006;22(13):1600–7.
Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2^(−ΔΔCT) method. Methods. 2001;25(4):402–8.
Liu Y, Schiff M, Dinesh-Kumar SP. Virus-induced gene silencing in tomato. Plant J. 2002;31(6):777–86.
Choi DS, Hwang BK. Proteomics and functional analyses of pepper Abscisic acid–responsive 1 (ABR1), which is involved in cell death and defense signaling. Plant Cell Online. 2011;23(2):823–42.
Yang H, Yang S, Li Y, Hua J. The arabidopsis BAP1 and BAP2 genes are general inhibitors of programmed cell death. Plant Physiol. 2007;145(1):135–46.
Massari ME, Murre C. Helix-loop-helix proteins: regulators of transcription in eucaryotic organisms. Mol Cell Biol. 2000;20(2):429–40.
Ozsolak F, Milos PM. RNA sequencing: advances, challenges and opportunities. Nat Rev Genet. 2011;12(2):87–98.
Consortium TPGS. Genome sequence and analysis of the tuber crop potato. Nature. 2011;475(7355):189–95.
Qin C, Yu C, Shen Y, Fang X, Chen L, Min J, et al. Whole-genome sequencing of cultivated and wild peppers provides insights into Capsicum domestication and specialization. Proc Natl Acad Sci. 2014;111(14):5135–40.
Sierro N, Battey JND, Ouadi S, Bakaher N, Bovet L, Willig A, et al. The tobacco genome sequence and its comparison with those of tomato and potato. Nat Commun. 2014;5:3833.
Song S, Qi T, Fan M, Zhang X, Gao H, Huang H, et al. The bHLH subgroup IIId factors negatively regulate jasmonate-mediated plant defense and development. PLoS Genet. 2013;9(7):e1003653.
Morgenstern B, Atchley WR. Evolution of bHLH transcription factors: modular evolution by domain shuffling? Mol Biol Evol. 1999;16(12):1654–63.
Balbi V, Devoto A. Jasmonate signalling network in Arabidopsis thaliana: crucial regulatory nodes and new physiological scenarios. New Phytol. 2008;177(2):301–18.
Yang D-L, Yao J, Mei C-S, Tong X-H, Zeng L-J, Li Q, et al. Plant hormone jasmonate prioritizes defense over growth by interfering with gibberellin signaling cascade. Proc Natl Acad Sci. 2012;109(19):E1192–200.
Ogo Y, Itai RN, Nakanishi H, Inoue H, Kobayashi T, Suzuki M, et al. Isolation and characterization of IRO2, a novel iron-regulated bHLH transcription factor in graminaceous plants. J Exp Bot. 2006;57(11):2867–78.
Ogo Y, Nakanishi Itai R, Nakanishi H, Kobayashi T, Takahashi M, Mori S, et al. The rice bHLH protein OsIRO2 is an essential regulator of the genes involved in Fe uptake under Fe-deficient conditions. Plant J. 2007;51(3):366–77.
This work was supported by Natural Science Foundation of Jiangsu Province (BK2012784) and National Natural Science Foundation of China (No. 31471873, No. 31301775).
The authors declare that they have no competing interests.
JYW, BLZ, WGY performed the data analysis and drafted the manuscript. TMZ, BLZ, YWY, and JYW participated in the analysis of the data. JYW, LXT, ZZH, MLY, and TZC performed the experiments. All authors approved the final version of the manuscript.
Jinyan Wang and Zhongze Hu contributed equally to this work.
Table S1. The primer sequences used for quantitative real-time PCR amplification of actin and SlybHLH131. Table S2. Summary of bHLH genes in the tomato genome. Table S3. The bHLH domain sequences of tomato. Table S4. The orthologous bHLH genes in the tomato, potato and Arabidopsis genomes. Table S5. The paralogous bHLH genes in the tomato, potato and Arabidopsis genomes. Table S6. The expression of 152 SlybHLH genes in TY-2 and 4840 after TYLCV infection.
Alignment of the bHLH domains of all tomato bHLH proteins. Shown at the top are the boundaries used in this study to distinguish the DNA-binding basic region, the two α-helices and the variable loop region.
The NJ phylogenetic tree of bHLH protein motifs constructed using PHYLIP software.
The NJ phylogenetic tree and conserved motif compositions of tomato. The neighbor-joining tree of tomato bHLH genes and their motif locations.
The NJ phylogenetic tree and SlybHLH gene structure of tomato. The yellow and green blocks indicate introns and exons, respectively.
Wang, J., Hu, Z., Zhao, T. et al. Genome-wide analysis of bHLH transcription factor and involvement in the infection by yellow leaf curl virus in tomato (Solanum lycopersicum). BMC Genomics 16, 39 (2015). https://doi.org/10.1186/s12864-015-1249-2
- Genome-wide analysis
- bHLH transcription factor