Whole-genome characterization of PbrLBD genes in Chinese white pear
In the present study, two strategies were independently performed to identify the candidate LBD gene set in the pear genome. First, a Hidden Markov Model search (HMMsearch) was performed against whole-genome protein sequences of pear using the HMM profile (PF03195) of the LOB domain. Second, a local BLASTP analysis was performed to scan the whole-genome protein sequences of pear using the Arabidopsis thaliana LBD proteins as query sequences [15]. Initially, we retained the common protein sequences identified by both methods. To further verify the reliability of candidate LBD genes, the SMART Tool, CDD tool and InterProScan tool were used to detect the completeness of the LOB domain of candidate proteins. As a result, four candidate genes with incomplete LOB domain were removed. Based on the location of candidate LBD genes, we also removed five genes that were anchored on scaffolds. Consequently, a total of 60 non-redundant and complete LBD genes were identified in the pear genome for further analysis. According to the presence/absence of the LX6LX3LX6L leucine zipper-like domain of LBD proteins, 60 PbrLBD genes were divided into two groups: 53 members belonged to class I and 7 members belonged to class II (Additional file 1: Figure S1). To distinguish members of the PbrLBD gene family, we named each of PbrLBD gene based on their order on the chromosomes (Additional file 2: Table S1), namely, PbrLBD1-PbrLBD60. In addition, we further analyzed the physical and chemical properties of 60 LBD proteins. The ExPASy Proteomics Server, an online proteomics and sequence analysis tool, was used to predict the molecular weights and isoelectric points of PbrLBD protein sequences. The length of PbrLBD-encoded protein sequences ranged from 150 (PbrLBD52) to 475 (PbrLBD7) amino acids. Protein masses ranged from 168.23 kD (PbrLBD52) to 50.74 kD (PbrLBD7), and protein pIs ranged from 4.56 (PbrLBD4) to 9.22 (PbrLBD13) (Additional file 2: Table S1).
Chromosome location of PbrLBD genes in the pear genome
To further investigate the distribution pattern of PbrLBD genes on chromosomes, we mapped all the members of the PbrLBD gene family in the pear genome. Mapchart software was used to show the location distribution of PbrLBD genes across the pear genome (Additional file 3: Figure S2). As a result, we found that 60 PbrLBD genes were unevenly distributed in the pear genome, with the exception of chromosome 4, chromosome 13 and chromosome 16. Chromosome 5 contained the largest number of PbrLBD genes (8 members), followed by chromosome 1 and chromosome 15, which each contained 7 PbrLBD genes. However, chromosome 6, chromosome 8 and chromosome 12 only contained one PbrLBD gene each. Three or more PbrLBD genes were mapped on chromosomes 2, 3, 7, 9, 10, 11, 14 and 17. Interestingly, we found that most PbrLBD genes existed in the form of gene clusters, suggesting that the PbrLBD family had undergone a WGD event/segmental duplication or tandem duplication.
Phylogenetic analysis of the PbrLBD genes
The phylogenetic tree is widely used to show the evolutionary relationship of gene families. A neighbour-joining (NJ) phylogenetic tree was constructed with the Mega-X program using the full-length LBD proteins of pear, apple and Arabidopsis (Fig. 1a). According to the phylogenetic tree results, a total of 168 LBD proteins from three species were phylogenetically categorized into seven subgroups, namely, class Ia, Ib, Ic, Id, Ie, and class IIa and IIb. In apple, 17, 6, 3, 8, 9, 2 and 12 MdLBD genes fell within the subgroups, respectively. In Arabidopsis, 7, 10, 3, 4, 9, 6 and 3 AtASL genes were clustered into the seven subgroups, respectively. In pear, class Ia and class Ib included the highest number of PbrLBD genes (14), followed by class Ie, which contained 11 PbrLBD genes (Fig. 1b). This result provides further evidence that class I contains a greater percentage of LBD genes than class II. It is important to note that most sister pairs of subclades consisted of PbrLBDs and MdLBDs (such as, PbrLBD35/MdLBD44, PbrLBD20/MdLBD26, PbrLBD14/MdLBD16, etc.), while most of the AtLBD genes were separate from PbrLBD genes and MdLBD genes (such as, ASL2, ASL25, ASL26 and ASL27). This result suggests that the PbrLBD genes and MdLBD genes had a closer evolutionary relationship, and consistent with the fact that apple and pear evolved from the same ancestor.
Conserved motif analysis and gene structural analysis of PbrLBD genes
To identify gene structures and evolutionary trajectories of LBD genes in pear, we investigated exon-intron compositions of the 60 PbrLBDs. The number of introns of the PbrLBD genes ranged from zero to four, suggesting that the introns of some PbrLBDs were lost during evolution. The PbrLBD7 gene contained five exons and four introns. In addition, the genes clustered into subclades had similar gene structures (Fig. 2a, c). Most members of class II had two exons and two UTR regions, whereas no UTR regions were detected for most members of class I.
The MEME (Multiple Em for Motif Elicitation) tool was used to predict the conserved motifs of PbrLBD proteins (Fig. 2b) [16]. A total of 20 conserved motifs were identified in our study, named motif 1–20. As a result, the numbers and types of conserved motifs in PbrLBD protein sequences were relatively conserved. Most members clustered into the same subclade had similar motif structures, suggesting that proteins from the same subclade may have similar functions. Motif 1 and motif 2 were basic regions of the LOB domain that were detected in all members of the PbrLBD gene family. Class I members had motif 1 (CX2CX6CX3C), motif 2 (GAS blocks) and motif 3 (LX6LX3LX6L), while Class II members lacked motif 3 (LX6LX3LX6L). This result provided further evidence to support division of the PbrLBD gene family into two clusters.
Cis-acting elements analysis in the putative promoter of PbrLBD genes
In general, the gene function was influenced by the distribution of cis-acting elements in the promoter region [17]. In this study, the region 2000 bp upstream (putative promoter sequences) of the transcription initiation site of PbrLBD genes was defined as the putative promoter region. The putative promoter sequences of PbrLBD genes were then submitted to the PlantCARE database to search for cis-acting elements (Fig. 3) [18]. As a result, a total of 266 cis-acting elements for plant growth and development were identified, and we selected 11 important cis-acting elements for further analysis (Fig. 3). In Fig. 3a, a diversity of distribution patterns of cis-acting elements was observed in the promoter region of PbrLBD genes, suggesting that the expression of PbrLBD genes may be regulated by various factors, such as, light, abscisic acid (ABA), methyl jasmonate (MeJA), gibberellin (GA) response element and myeloblastosis (MYB) binding site involved in drought-inducibility. For hormone-related elements, the ABA-responsive and MeJA-responsive elements were most frequently found in the pear LBD gene promotors. Previous studies had reported that four auxin-inducible LBD genes (LBD16/ASL18, LBD17/ASL15, LBD18/ASL20 and LBD29/ASL16) were regulated by AUXIN RESPONSE FACTOR7 (ARF7) and ARF19 to control the formation of lateral root [3, 7]. In our study, the IAA responsiveness element (IX) were predicated on the putative promotor of 32 PbrLBD genes, this result indicated that the functions of PbrLBD genes might be regulated by auxin. It is notable that the element related to MYB binding site involved in flavonoid biosynthetic gene regulation was predicted in five PbrLBD genes (PbrLBD5, PbrLBD17, PbrLBD49, PbrLBD51 and PbrLBD55), suggesting that they may play important functions in the process of flavonoid biosynthesis.
Synteny analysis reveal the expansion of the LBD gene family
Previous studies have reported that the expansion of gene families is driven by different gene duplication modes, including whole-genome duplication (WGD)/segmental duplication, dispersed duplications (DD), tandem duplication (TD), proximal duplication (PD), and transposed duplication (TRD). To explore which duplication types drove the expansion of the PbrLBD gene family, a synteny analysis was conducted using the MCScanX software package (Fig. 4). Then, the DupGen_finder tool was used to dissect gene duplication types with a priority of duplicate genes as follows: WGD > TD > PD > TRD > DD. As a result, 58 PbrLBD genes were assigned to one of five different duplication types. Consequently, 76.67% (44) genes in the PbrLBD gene family were duplicated and retained from WGD/segmental duplication types, followed by DD (6, 10%), TD (6, 10%), PD (1,1.67%) and TRD (1, 1.67%) (Additional file 4: Figure S3). This result shows that lineage-specific WGD/segmental duplication events were the major force in the expansion of the PbrLBD gene family.
To further investigate the evolutionary mechanisms of the PbrLBD gene family, we performed a method described on the Plant Genome Duplication Database (PGDD) to identify the synteny blocks in the pear genome. A total of 24 synteny blocks containing PbrLBD genes were identified. The 22 duplicated gene pairs from the PbrLBD gene family were mapped in the 19 synteny blocks, named synteny block 1–19 (Additional file 5: Table S2 and Fig. 4a). Among these synteny blocks, we found that several were strongly conserved. For example, synteny block 4 contained over 140 syntenic gene pairs. In order to further verify the reliability of synteny analysis, we investigated the distribution of the genes and the duplicated gene pairs in the 100-kb upstream and downstream regions of PbrLBD genes (Additional file 6: Table S3 and Fig. 4b). We found that most PbrLBD genes were located in 200-kb conserved synteny blocks. These results provided further evidence that expansion of the PbrLBD gene family was mainly driven by a WGD/segmental duplication event.
Estimating dates and driving forces for evolution of the PbrLBD gene family
The number of synonymous substitutions per site (Ks) is extensively used to calculate the approximate occurrence dates of WGD/segmental duplication events. Our previous studies indicated that two WGD events had occurred during pear genome evolution, including an ancient WGD event (Ks ~ 1.5–1.8, approximately 140 MYA) derived from a paleohexaploidization (γ) event and a relatively recent WGD event (Ks ~ 0.15–0.3, approximately 30–45 MYA) [19, 20]. To further investigate the evolution dates PbrLBD gene duplication, mean Ks values of PbrLBD duplicated gene pairs were estimated to trace the date of WGD/segmental duplication events. Table S3 shows the mean Ks values for 22 PbrLBD paralogs in the syntenic region. The values ranged from 0.01 to 3.99. The gene sequences of several paralogs were completely consistent, which indicated that no mutations occurred between these gene pairs (PbrBD27/PbrBD28, PbrBD11/PbrBD15). Most of the mean Ks values of the 22 paralogs fell in the range from 0.15 to 0.30 (Fig. 4), suggesting that the duplication of these gene pairs may have taken place during the relatively recent WGD event, approximately 30–45 MYA. However, the mean Ks value of PbrLBD paralogs were not mapped to the region with values from 1.5 to 1.8, suggesting that the PbrLBD gene family had not undergone the ancient WGD event (~ 140 MYA). Surprisingly, three paralogs (PbrBD18/PbrBD40, PbrBD33/PbrBD12 andPbrBD34/PbrBD24) exhibited relatively higher mean Ks values (3.61–3.98), suggesting that they duplicated during a more ancient duplication event.
Recently, Wu et al. reported that the pear genome underwent two significant groups of synteny blocks: a recent whole-genome duplication event (4DTv ([4-fold synonymous (degenerative) third-codon transversion]) ~ 0.08) and an ancient WGD event (4DTv ~ 0.5) [20]. To test these suppositions, we calculated the 4DTv values of 22 paralogs using our in-house scripts. The distribution of 22 paralogs ranged from 0 to 0.86 (Fig. 4c), and most members exhibited a strong peak with a 4DTv-value at approximately 0.08. This result further supported the conclusions of our mean Ks value analysis, which indicated the PbrLBD gene family had undergone only one WGD event occurring 30–45 MYA.
Three DNA mutation types have been identified in genome evolution, including beneficial, neutral and harmful mutations [21]. In nature, most neutral and harmful mutations are eliminated through purifying selection, while beneficial mutations are retained during the evolution process. The accumulation of beneficial mutations is important for plants and animals to respond to a complex and changing environmental conditions [22]. To identify which selection pressures were driven by the evolution of PbrLBD genes, the ka/ks ratios of 22 PbrLBD paralogs were calculated using the Kaks_calculator tool. A ka/ks ratio equal to one indicates neutral selection, less than one indicates negative selection (purifying selection) and higher than one indicates positive selection (Darwinian selection). Results showed that all ka/ks ratios of PbrLBD paralogs were less than 1, suggesting that negative selection (purifying selection) was the primary driving force for evolution of the PbrLBD gene family.
Expression analysis of PbrLBD genes by EST database
Expression sequence tags (EST) is the part of cDNA sequence from different tissues, the length ranged from 300 to 1000 bp. One EST represents an expression gene at one time in one cell or tissues. Expression sequence tags (EST) can provide further evidence to support the accuracy of the members of gene family from transcription level [23, 24]. To verify the accuracy of PbrLBD gene identification, a BLASTN search was performed to scan the ‘Dangshansuli’ pear expressed sequence tag (EST) database using all PbrLBD gene sequences as the query. A total of 54 EST hits were found for the PbrLBD gene family, with hits occurring for 33 of the genes (E-value<10e-10, identity> 95%, length > 200 bp). Three genes (PbrLBD21, PbrLBD35 and PbrLBD24) had the greatest numbers of EST hits, with four EST hits each. This result provided further evidence that the LBD gene set is reliable. However, there were 27 PbrLBD genes that did not hit any tags from the Chinese White pear EST database. This may be due to there has been little research on LBD genes in pear.
Expression patterns of PbrLBDs in different tissues of pear
Transcriptome sequencing analysis was conducted using transcriptome data from six different tissues of ‘Dangshansuli’ cultivar including stem, ovary, petal, sepal, fruit and leaf [25]. The RPKM (reads per kilobase per million) values were measured to represent the expression level of PbrLBD genes, and a value of RPKM equaling 0 represented that the gene had no expression in libraries. The expression patterns of 60 PbrLBD genes were shown in Fig. 5 using heatmap.2, a function of the gplots package in R. Among 60 PbrLBD genes, 11 PbrLBD genes were expressed in all six different tissues, indicating that they have various roles in the development of different tissues. 51.67, 60.00, 48.33, 73.33, 40.00 and 35.00% of all PbrLBDs were expressed in six different tissues corresponding to stem, ovary, petal, sepal, fruit and leaf, respectively. Thirty-nine genes were expressed in at least one of six different tissues, although the transcript abundance of several genes was relatively lower for certain tissues. However, 10 PbrLBD genes were not expressed in any of the six different tissues, suggesting that a portion of PbrLBDs were functional redundancies. As shown in Fig. 1b, the genes in class Ib (46.67%) and Ic (50%) have substantial expansion in the pear genome. According to transcriptome data, we found that seven members in class Ib were not expressed in anyone tissues, suggesting that these PbrLBDs in class Ib were functional redundancies. Among class Ic, three LBD transcription factors PbrLBD25, PbrLBD13 and PbrLBD5 exhibited relatively high expression level in stem, suggesting that these three genes might occur neofunctionalisation. In summary, the result of transcriptome sequencing analysis supported that PbrLBD genes played important roles in multiple biological processes.
To verify that the functional cluster of transcriptome sequences analysis was reliable, 11 PbrLBD genes were selected to conduct a qRT-PCR experiment to investigate the expression levels in six different tissues of the ‘Dangshansuli’ pear, including stem, leaf, petal, fruit, sepal and ovary (Fig. 6). Results showed that expression of all 11 PbrLBD genes was detected in the six tissue types. Moreover, the 11 PbrLBD genes exhibited a diversity of expression patterns in the six different tissues, suggesting that PbrLBD genes may function in different tissues and participate in diverse metabolic processes. Previous studies had reported that LBD genes strongly expressed in the micropylar side of the mature ovary [26]. In our study, seven genes (PbrLBD54, PbrLBD57, PbrLBD33, PbrLBD30, PbrLBD53, PbrLBD43 and PbrLBD50) exhibited a similar expression pattern with a high expression level in ovary tissues, suggesting that PbrLBD genes play critical functions during ovary development. The PbrLBD20 and PbrLBD25 genes were highly expressed in stem tissues, and PbrLBD10 genes were highly expressed in fruit. Overall, these PbrLBD genes may play critical roles in pear growth and development processes.
Potential functions of PbrLBD genes in anthocyanin biosynthetic pathway
Previous studies have revealed that some LBD genes can act as important transcription factors in regulating anthocyanin accumulation [2, 27]. To validate the potential functions of PbrLBDs in the anthocyanin biosynthetic pathway, we further analyzed our previous pear transcriptome data [28]. Compared to bagged fruits, the fruits removed from bags had significant higher anthocyanin concentrations in three comparisons (Fig. 7a). Differentially expressed genes (DEGs) analysis results indicated that four genes (PbrLBD20, PbrLBD33, PbrLBD35 and PbrLBD53) were differentially expressed in two or more comparisons of bagged fruit and fruit removed from bags (Fig. 7b). As shown in Fig. 7b, PbrLBD33 genes were found to be significantly up-regulated in pears with high anthocyanin concentrations in all three comparisons. Furthermore, MYB binding sites involved in flavonoid biosynthetic gene regulation were found in the promotors of the PbrLBD33 gene. These results suggested that PbrLBD33 may participate positively in anthocyanin biosynthesis. However, expression of the other four DEGs was found to be down-regulated in pears with high anthocyanin concentrations, suggesting that these genes may repress anthocyanin accumulation. These results provided strong evidence that these LBD genes act as critical regulators in the anthocyanin biosynthetic pathway.
Subcellular localization of PbrLBD20 protein
In our study, the LBD gene was a type of plant-specific transcription factor. PbrLBD20 was selected from four candidate anthocyanin-related genes for further subcellular localization, and its protein was predicted to be located in the nucleus. To observe the subcellular localization of PbrLBD20, a plasmid with the GFP gene fused to PbrLBD20 CDS and driven by the 35S promoter was translocated into tobacco leaves. A confocal microscope was used to determine the localization of recombinant PbrLBD20. In Fig. 8, we observe that CK (empty vector) is located at the membrane and the nucleus, and PbrLBD20-GFP shows strong signals in the nucleus (Fig. 8). This result validates that PbrLBD20 is a transcription factor.