Genome-wide identification of lysin motif-containing protein family genes in eight Rosaceae species, and expression analysis in response to pathogenic fungus Botryosphaeria dothidea in Chinese white pear

Background Lysin motif-containing proteins (LYPs), which act as pattern-recognition receptors, play central roles in growth, node formation, and responses to biotic stresses. The genome sequences of Chinese white pear (cv. ‘Dangshansuli’) and seven other Rosaceae species have already been reported. However, the LYP family genes and their evolutionary history in these fruit crops remain unclear. Results In the present study, eight Rosaceae species, i.e., Pyrus communis, Prunus persica, Fragaria vesca, Pyrus bretschneideri, Prunus avium, Prunus mume, Rubus occidentalis, and Malus × domestica, were evaluated. We identified a total of 124 LYP genes in these species, eighteen of which were from Chinese white pear and were named PbrLYPs. According to their structural characteristics and phylogenetic analysis, these genes were classified into eight groups (LYK1, LYK2, LYK3, LYK4/5, LYM1/3, LYM2, NFP, and WAKL). Dispersed duplication and whole-genome duplication (WGD) were found to be the main contributors to LYP family expansion in the Rosaceae species. More than half of the duplicated PbrLYP gene pairs dated back to the ancient WGD (~140 million years ago (MYA)), and PbrLYP genes have experienced long-term purifying selection. The transcriptomic results indicated that PbrLYP gene expression was tissue-specific. Most PbrLYP genes showed differential expression in leaves under fungal pathogen infection, with two of them located in the plasmalemma. Conclusion A comprehensive analysis identified 124 LYP genes in eight Rosaceae species. Our findings provide insights into the functions and characteristics of the Rosaceae LYP genes and a guide for the identification of other candidate LYPs for further genetic improvement of pathogen resistance in higher plants.


Background
In contrast to mammals, plants have no sophisticated mobile defender cells or somatic adaptive immune system. Plants have developed survival strategies that depend on innate immunity along with signals arising from the site of infection, shaped by coevolution with pathogens [1,2]. Similar to other organisms, plants can recognize pathogen-associated molecular patterns (PAMPs) by recruiting plasmalemma-localized pattern-recognition receptors (PRRs) to initiate immune reactions, such as PAMP-triggered immunity (PTI) responses [3]. Upon perception of the corresponding ligand by the ectodomain, the cytoplasmic kinase domain (KD) of PRRs transmits the signal downstream and activates defense responses, such as the production of reactive oxygen species (ROS) and phytoalexins, the accumulation of callose, as well as the stimulation of mitogen-activated protein kinase (MAPK) pathways and the expression of pathogenesis-related (PR) proteins.
Plant PRRs can be divided into two clusters: receptor-like kinases (RLKs), which contain an extracellular sensor domain, a transmembrane domain and an intracellular domain with homology to protein kinases that is involved in signal transduction; and receptor-like proteins (RLPs), which are similar to RLKs but lack the intracellular region [4]. RLK/Ps were first reported in animals, but their gene number is particularly expanded in plants [5]. As a plant-specific PRR family, lysin motif (LysM)-containing proteins (LYPs) have been well studied in rice and Arabidopsis for their functions in the perception of fungal and bacterial microbes. LYPs are common in land plants and may have evolved before land colonization and symbiosis with mycorrhiza as a signaling module [6,7]. Most LYPs characterized so far are related to the perception of N-acetylglucosamine-containing molecules and/or are involved in plant-microbe interaction pathways, including the activation of defense responses and the establishment of root endosymbioses. For example, the Arabidopsis genome encodes five LysM-RLKs, three of which (AtCERK1 or LYK1, LYK4 and LYK5) participate in chitin signaling with chitin affinity [7][8][9][10][11][12]. AtCERK1 is essential for the chitin signaling pathway in Arabidopsis, forming hetero-oligomeric complexes with LYK5 to initiate downstream PTI, and LYK4 is also involved in that pathway, with functions partly redundant with those of LYK5. In rice, OsCEBiP, the main chitin-binding protein, recruits OsCERK1 to activate chitin-triggered immune responses [13][14][15]. In addition to activating innate immunity, LYPs in legumes are essential receptors for the perception of nodulation factors (NFs) released by rhizobia and for the establishment of nitrogen-fixing symbiosis [16][17][18][19][20][21].
Few LYPs have previously been reported in fruit trees; those reported include MdCERK1 and MdCERK1-2 in apple. MdCERK1, the ortholog of AtCERK1, has been shown to bind chitin directly and to be involved in transcriptional responses to infection by the soilborne pathogen Rhizoctonia solani [22]. MdCERK1-2 is also involved in anti-fungal defense responses as a PRR and is significantly upregulated after Botryosphaeria dothidea infection [23]. However, for other therophyte and perennial species in the Rosaceae, the members of the LYP gene family involved in fungal pathogen perception and their evolutionary history are poorly defined.
In this study, we identified the Rosaceae LYP genes at the genome-wide scale by employing bioinformatics and publicly available data, and analyzed part of their functions in pear. We annotated full-length LYP genes in pear and other Rosaceae species, investigated their subcellular localization, and analyzed their expression patterns in different pear tissue types. We investigated the expression of PbrLYPs in response to the infection by a fungal pathogen Botryosphaeria dothidea, and provided a relatively complete profile of the LYP gene family in the Rosaceae. The genetic structure, evolutionary analysis, and experimental data of LYPs provide potential candidate LYPs for the future genetic modifications of pathogen-resistance in Rosaceae fruit crops and other higher plants.

Identification and classification of LYP genes in the Rosaceae
To identify the members of the LYP gene family in the Rosaceae, HMM searches were performed using both the Pfam HMM profile (PF01476) and a self-built HMM model against the whole-genome protein sequences of each species. A total of 141 LYP genes were initially identified from the eight investigated Rosaceae species. After removing redundant and incomplete gene sequences, the longest transcript of each gene was retained. Subsequently, the NCBI Batch CD-Search was used to further confirm the presence of a LysM domain (Table 1 and Fig. S3a). Finally, we identified 124 LYP genes in the eight Rosaceae species, including 18 genes in Chinese white pear, 14 in European pear, 21 in apple, 15 in peach, 13 in strawberry, 16 in Mei (Japanese apricot), 14 in sweet cherry and 13 in black raspberry. The PbrLYP genes showed a random distribution on eight of the 17 chromosomes and three unanchored scaffolds (scaffolds681.0, scaffolds831.0, and scaffolds897.0) in pear (Supplementary Fig. S1).
Phylogenetic analyses of the LYP protein sequences were performed to classify the LYP genes and investigate their evolutionary relationships. The phylogenetic tree showed that the LYP genes are separated into eight well-supported clades. According to the name of the best-hit gene in Arabidopsis, these subfamilies were named LYK1, LYK2, LYK3, LYK4/5, LYM1/3, LYM2, NFP (Nod factor perception protein), and WAKL (Wall associated kinase-like) (Fig. 1). The subfamily classification and corresponding names of the LYPs are shown in Table 1. Although the best local BLASTP hits of most LYPs in the NFP clade were AtLYK4 or AtLYK5, their best NCBI BLASTP results were categorized as NFP (data not shown). Notably, the LYK1, LYK3 and LYK4/5 clades contained more genes than the others, suggesting that these three subgroups may have undergone subfamily-specific expansion.
To explore the structural diversity of Rosaceae LYP genes, an exon distribution analysis was performed (Fig. 2). The results showed that the Rosaceae LYP subgroups displayed different exon abundances, while the numbers of exons per gene within the same subgroup were similar, supporting the phylogenetic classification of the LYP genes (Fig. 2a, c). Among the Rosaceae, the number of exons in subgroups LYK1 and LYK3 was much higher than in the others, about 11 on average, which was consistent with Arabidopsis. Exon numbers were relatively conserved in subgroups LYM1/3 and LYM2, at about 4 to 5 (Table 2 and Supplementary Table S1). These results agreed well with previous reports on the conserved exon numbers of different LYP types: type I LYP genes contained up to 10 exons (groups LYK1 and LYK3), type II contained approximately 5 (groups LYM1/3 and LYM2), and type III contained approximately 2 (groups LYK2, LYK4/5, NFP and WAKL) [10,24-26]. The conserved and specific exon numbers of each LYP type may be due to similar replication events, implying that the genes of each LYP type originated through an evolutionary path separate from those of the other types.

Features of the LYPs in the Rosaceae
The characteristics of the LYPs and their coding genes are shown in Table 2. The lengths of the LYP protein sequences ranged from 225 to 1255 amino acids and the molecular weights from 25.03 to 139.28 kD. Protein isoelectric points (pI) ranged from 4.43 to 8.75, with the majority lower than 7 (Table 2). The highest number of exons in pear LYPs was found in the LYK1 and LYK3 subgroups. A similar trend was also observed in five other Rosaceae species (European pear, apple, peach, sweet cherry and Mei) and in Arabidopsis (Table 2), confirming that the genes in the LYK1 and LYK3 groups have undergone specific evolutionary events as type I LYPs. However, the highest exon numbers in strawberry and black raspberry were detected in the LYM2 subgroup, suggesting that these species may have experienced an unknown evolutionary process or some specific selection forces. The grand average of hydropathy (GRAVY) for most LYK proteins in pear was positive, while that of the LYMs was negative. The GRAVY of the NFP and WAKL subgroups showed no consistent sign. A similar trend for all subgroups was also observed in the other Rosaceae species. These results indicated that, similar to Arabidopsis, most of the LYK proteins are hydrophobic and all LYM proteins are hydrophilic in the LYP gene family (Table 2).
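The GRAVY values discussed above can be illustrated with a short sketch. GRAVY is the mean Kyte-Doolittle hydropathy value over all residues of a protein; which tool was used to produce Table 2 is not stated here, so the function names below (`gravy`, `hydropathy_class`) and the example peptides are hypothetical and only show how the sign of the score maps to "hydrophobic" or "hydrophilic".

```python
# Kyte-Doolittle hydropathy scale (standard published values).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue.

    Characters outside the 20 standard amino acids are skipped
    (an assumption of this sketch, not a rule taken from the paper).
    """
    residues = [aa for aa in sequence.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def hydropathy_class(sequence: str) -> str:
    """Positive GRAVY is read as hydrophobic, otherwise hydrophilic."""
    return "hydrophobic" if gravy(sequence) > 0 else "hydrophilic"


if __name__ == "__main__":
    # Hypothetical toy peptides, not real LYP sequences.
    for name, seq in {"toy_LYK": "MILVAGLLFA", "toy_LYM": "MKDESRNQTE"}.items():
        print(name, round(gravy(seq), 3), hydropathy_class(seq))
```

Applied to full-length protein sequences, the same two functions would reproduce the positive/negative split described for the LYK and LYM groups, provided Table 2 was computed on the same scale.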

Synteny analysis of LYPs
Gene duplication events, such as tandem duplication, whole-genome duplication (WGD)/segmental duplication, and transposition events, are contributing factors in gene family development that shape the evolution of protein-coding gene families [27]. Using the MCScanX package, we detected the duplication events related to the LYP gene family and assigned each LYP gene to one of five duplication types: WGD/segmental, singleton, proximal, dispersed, or tandem. In Arabidopsis, only two LYP genes were duplicated during the WGD/segmental event, while the others originated from dispersed duplication. Unlike in Arabidopsis, all five duplication types were detected in the Rosaceae species (Table 3 and Supplementary Table S2). WGD occurred in all the Rosaceae species studied, with 38.9% of LYP genes in Chinese white pear and 57.2% in apple retained and duplicated from WGD/segmental events. However, the percentage of genes retained following dispersed duplication in peach (53.3%), strawberry (46.2%), Mei (56.3%), sweet cherry (57.1%), and black raspberry (69.2%) was higher than that in apple (19%). Peach, strawberry, Mei, sweet cherry, and black raspberry did not experience a WGD after their divergence from pear and apple. Hence, these species may have experienced more genome rearrangements and gene losses during long-term evolution in the absence of WGD, resulting in larger proportions of dispersed genes.
Fig. 2 Structural and motif analysis of LYP genes. a Subgroup classification. A neighbor-joining phylogenetic tree was generated from 132 LYP genes with MEGA7. The subgroup names are labeled accordingly. b Motif analysis. Fifteen distinct motifs were determined with the MEME suite, and each motif is represented by a different color. c Gene structural analysis. Exon sizes are drawn in proportion to their sequence lengths
Although pear and apple have undergone the same recent WGD event, Chinese white pear and European pear showed a higher percentage of dispersed LYP genes (38.9 and 50%, respectively) compared to apple. This may be due to differences in self-incompatibility ratios and in the domestication process between pear and apple. However, proximal duplication events of LYP genes were only detected in apple (14.3%), strawberry (23.2%), peach (6.7%), Mei (6.2%), and sweet cherry (14.3%), as shown in Table 3. These data suggest that WGD and dispersed gene duplication contributed substantially to the development of the LYP gene family in the Rosaceae. To trace the evolutionary routes that diversified the Rosaceae LYP genes, we performed both intra- and intergenomic synteny analyses to identify conserved chromosome blocks within Chinese white pear and among the eight Rosaceae species and Arabidopsis. The landscape of inter-species orthologous LYP gene pairs among the Rosaceae species and Arabidopsis is presented in Fig. 3, and their chromosomal distribution was random. In the Chinese white pear genome, 7 conserved syntenic blocks containing PbrLYPs were detected, including most of the WGD/segmental-type LYP gene pairs (PbrLYK1a-PbrLYK1b2, PbrLYM2-1-PbrLYM2-2, and PbrLYK3a1-PbrLYK3a2) (Fig. 4). The timing of WGD/segmental duplication events can be estimated from the Ks value (synonymous substitutions per site) [28]. Based on previous reports, Ks values show that the apple and pear genomes have undergone two genome-wide duplication events: in apple, the ancient WGD from the γ triplication (Ks ~1.6) and a recent WGD (Ks ~0.2) [29]; in pear, the ancient WGD (Ks ~1.5-1.8) that took place ~140 MYA [30] and the recent WGD (Ks ~0.15-0.3) that occurred 30-45 MYA [31]. Hence, Ks values were used to estimate the timing of the gene duplication events among the PbrLYP gene family members.
The Ks values suggest that most PbrLYP genes were duplicated around the time of the ancient WGD event, while some originated from the recent WGD (Table 4). The Ka/Ks ratio represents the intensity and direction of selection: a Ka/Ks value of one indicates neutral evolution, a value greater than one indicates positive selection, and a value lower than one indicates purifying selection [32]. All Ka/Ks ratios of the PbrLYP gene pairs were lower than one, demonstrating that PbrLYPs primarily evolved under purifying selection (Table 4).
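The two decision rules used in this section can be written down as a small sketch: the Ka/Ks threshold at one, and the pear Ks windows quoted from earlier reports (ancient WGD Ks ~1.5-1.8, recent WGD Ks ~0.15-0.3). The function names, the numerical tolerance, and the use of strict window boundaries are assumptions made for illustration; pairs whose Ks falls outside both windows are simply left unassigned here, which may be stricter than the assignment used for Table 4.

```python
# Pear Ks windows as quoted in the text; treating them as hard
# boundaries is an assumption of this sketch.
ANCIENT_WGD_KS = (1.5, 1.8)
RECENT_WGD_KS = (0.15, 0.3)


def selection_mode(ka: float, ks: float, tol: float = 1e-9) -> str:
    """Classify a duplicated gene pair by its Ka/Ks ratio."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a Ka/Ks ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return "neutral"
    return "positive" if ratio > 1.0 else "purifying"


def wgd_window(ks: float) -> str:
    """Assign a Ks value to the recent or ancient pear WGD window."""
    if RECENT_WGD_KS[0] <= ks <= RECENT_WGD_KS[1]:
        return "recent WGD"
    if ANCIENT_WGD_KS[0] <= ks <= ANCIENT_WGD_KS[1]:
        return "ancient WGD"
    return "unassigned"


if __name__ == "__main__":
    # Hypothetical (Ka, Ks) values, not taken from Table 4.
    for ka, ks in [(0.25, 1.6), (0.05, 0.2), (0.40, 0.9)]:
        print(ka, ks, selection_mode(ka, ks), wgd_window(ks))
```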

Conserved motif analysis of the LYP gene family in Rosaceae species
The types and composition of internal motifs primarily determine protein function. To further characterize the motif composition of the LYP gene family in the Rosaceae, the online MEME program was used to detect motif patterns of the LYPs. Fifteen conserved motifs with low E-values were recognized (Fig. 2b). The number of motifs varied among the LYPs, and there were distinctive [...] 12,9,6] for LYM2s. Each subfamily had its own relatively fixed motif composition, with significant differences between LYM and the other types of LYPs (Supplementary Fig. S3b), indicating that LYPs are relatively conserved in their evolutionary history and that the division among groups may have occurred at an early period. Previous reports have shown that AtLYM2 and AtLYK1/4/5 are all involved in the chitin signaling pathway and act as core participants or co-receptors that mediate signaling through the chitin-binding ability of the second LysM of the ectodomain [7,8,10-13]. In this work, 3 of the 7 conserved residues for chitin binding on that LysM domain were detected in motif #12 (Supplementary Figs. S2 and S3b). This conserved motif was only detected in the extracellular domains of the LYK1, LYK4/5, NFP, WAKL, and LYM-type groups, which points to a link between chitin affinity and the presence of motif #12 and suggests that the evolution of the subgroups may not be completely independent. Interestingly, motif #15, a unique motif in the LYP family, was only detected in the LYK3a subfamily (Supplementary Fig. S3b). This indicates that the conserved motif #15 may be related to the opposite function of AtLYK3 as a negative regulator of chitin-induced immune reactions. Our results suggest that the occurrence of motifs #12 and #15 in Rosaceae LYPs may be related to chitin affinity and to the negative regulation of defense responses to fungal pathogens, respectively.

Expression levels of the PbrLYPs
Previous transcriptome analysis of Chinese white pear revealed tissue-specific expression patterns in petal, sepal, ovary, stem, bud, leaves and fruit [33,34]. The results indicated that the background expression of most PbrLYP genes was barely detectable; however, other genes were primarily expressed in fruit and leaves (Fig. 5a). For example, PbrLYK3a1 and PbrLYK3a2 were mainly expressed in fruit, petal, sepal and ovary, while PbrLYK1b3, PbrLYK1b1, PbrLYK1b2, PbrLYK4/5a1 and PbrNFP1 were preferentially expressed in leaves. In contrast, PbrLYM1/3-1 showed its highest expression in fruit, stem, and bud, but relatively low expression in leaves.
To verify whether PbrLYPs participate in the defense response against Botryosphaeria dothidea (B. dothidea), a fungal pathogen that causes ring rot disease in apple and pear, we performed an infection experiment with 6-week-old pear seedlings. The qRT-PCR (quantitative real-time PCR) results indicated that most PbrLYPs were up-regulated by B. dothidea infection, with peak expression occurring at 4 or 6 days post-inoculation (dpi) (Fig. 5b). For example, the relative expression of PbrLYK1b2, PbrLYM2-1 and PbrLYM2-2 peaked at 4 dpi and was significantly higher than in controls. At 6 dpi, the expression levels of PbrLYK1b2, PbrLYK3b1, PbrLYK4/5b and PbrWAKL1 were still higher than in controls, and PbrNFP2, PbrLYK2, PbrNFP1 and PbrLYM1/3-2 reached their peak levels. The results indicated that these differentially expressed genes may participate in the defense reactions. However, the expression of PbrLYK3a1, PbrLYK3a2, PbrLYK1a, PbrLYK1b1, PbrLYK4/5a1 and PbrLYM1/3-3 showed no significant change following infection in Chinese white pear. Furthermore, the expression of PbrNFP1, PbrLYK1b2, PbrWAKL1 and PbrLYM2-2 was also significantly up-regulated by the pathogen infection in Qiuzi pear. In contrast, the expression levels of PbrLYM2-1, PbrLYK3a2 and PbrLYM1/3-1 were downregulated after infection (Supplementary Fig. S4).

Subcellular localization of the PbrLYPs
PRRs are primarily located in the plasma membrane and are in direct contact with the ligand. To verify whether LYP proteins in the Rosaceae are also present on the plasma membrane and have the potential to act as PRRs, we first performed structural analyses of the PbrLYP proteins using the TMHMM online software. The sequence analysis showed that, except for the LYM-type LYPs, all PbrLYPs contained a transmembrane (TM) region (Supplementary Additional file 3), suggesting that they can be localized in the membrane. Considering the effect of signal peptides (SP) on subcellular localization, we selected PbrLYK1b2 and PbrLYK4/5a1 to verify the localization of PbrLYPs. The open reading frame of each gene was cloned from pear branches, and PbrLYP-35S-GFP fusion constructs or the control (35S-GFP alone) were transformed separately into Nicotiana benthamiana (Nb) leaves. Under fluorescence microscopy, the green fluorescence from the control plasmid was scattered throughout the cell. However, the PbrLYK1b2-GFP and PbrLYK4/5a1-GFP vectors showed green fluorescence only in the cell membrane, as depicted in Fig. 6. Therefore, all PbrLYPs with an SP and a TM region seem to have the potential to act as PRRs.

Discussion
The LysM-containing proteins have been primarily implicated in the PTI immune processes including the [...] In other words, the expansion of LYP family genes in Rosaceae species appeared to be a common event. The larger gene numbers in the LYK1, LYK3 and LYK4/5 groups suggested that these groups may play diverse roles in the adaptive evolution of Rosaceae species to environmental stresses. The gene duplication analysis showed that the expansion of LYP genes in Chinese white pear and apple was primarily due to WGD/segmental events, whereas dispersed duplication was the major driving force of LYP expansion in the other six Rosaceae species. According to the widely spanning Ks values, many large-scale duplication events were detected at the ancient stage in Chinese white pear (the Ks values of 12 of 20 duplicated gene pairs were around 1.290~2.786) (Table 4). The results suggested that selection on the perception of and defense response to chitin began at a very early stage and continues to the present. These functions are fundamental and vital for plant survival. The LYPs in the LYK1/3/4/5 and LYM2 groups have been reported to be closely related to chitin signaling [7,8,10-13]. In this study, six out of seven WGD/segmental-type PbrLYPs were detected in the LYK1, LYK3 and LYM2 groups, suggesting that the chitin response in Chinese white pear evolved mainly through WGD/segmental events and has been retained. The Ka/Ks ratios of all duplicated PbrLYP pairs were less than one, which implied that the PbrLYPs have undergone purifying selection and seem to have been necessary for environmental adaptation throughout their evolutionary history.
Phylogenetic analysis classified the Rosaceae LYPs into eight subgroups, which suggested that the evolution of the different subfamilies was relatively independent. Analysis of gene structure and protein motifs showed high similarity in motif composition and exon-intron architecture within each subgroup, which also supports independent evolution (Fig. 2b, c). These results suggested that genes in the same clade may have similar evolutionary histories and may perform similar functions. As shown in the gene structure analyses, subfamilies LYK1 and LYK3 contained the highest number of exons among the LYKs, indicating that intronization of exons might have happened in the genes of these groups. The number of exons also contributes to their divergent functionality in various tissues, organs, or growth periods. A similar case was found among the LYMs for the LYM2 group in European pear, strawberry, and black raspberry.
According to previous work on the evolution of the plant LYP gene family, the LYPs have evolved through local and segmental duplications and can be grouped into three main types: LYP-I (about 10 exons per gene, containing a conserved KD), LYP-II (one to five exons per gene, lacking the KD), and LYP-III (one or two exons per gene, with a KD unlike that of LYP-I), likely arising from the fusion of LYP genes of other types [16,35-37]. The LYP-I type gene products are the main PRRs in each signaling pathway. LYP-II types are unlikely to function as core receptor kinases but form complexes with other LYPs; for example, AtLYK1 can interact with AtLYM1/3 and AtLYK4/5 to mediate bacterial and fungal pathogen perception in Arabidopsis, respectively [7,11,12,38]. The Rosaceae LYPs matched the protein and gene structure characteristics of the AtLYPs well, and may therefore have similar roles in signaling. With their higher numbers of genes and exons, the genes in the LYM2 group of sweet cherry and black raspberry and the genes in the LYK1 and LYK3 groups of the other species seem to have undergone stronger evolutionary selection and may be more diverse in function. In addition, we investigated the conserved motifs of the LYPs and determined their putative protein localization as well as their collinearity relationships. In total, 15 distinct conserved motifs among the various LYP proteins were predicted by the MEME analysis. As shown in Fig. 2b and Fig. S3b, the motif patterns [#6,14,12,9,6] and [#8,10,7,5,1,3,2,13,4] might represent the functional motifs of the ectodomain and the intracellular kinase domain of the LYPs, respectively. Meanwhile, a LYK3a-unique motif, #15, was detected in the juxtamembrane domain of the LYK3a protein group, where it could be regulated by phosphorylation to affect the activity of the kinase domain [39]. AtLYK3 was also placed in the LYK3a group and carries motif #15.
Therefore, it is reasonable to consider that motif #15 is related to the negative regulatory functions of the genes in LYK3a. However, this question requires further research. The conserved residues for chitin binding were detected in motif #12, and that motif was only found in the LYK1/4/5, NFP, WAKL and LYM-type groups (Fig. S3b), suggesting that the genes in those groups likely shared a common ancestor and had similar functions in response to chitin in ancient times. After a long period of evolution and selection, the duplicated Rosaceae LYPs have been retained in relatively large numbers, suggesting that the LYP genes were important for the adaptation of Rosaceae species to complex and changing environments.
The LYP gene family plays various important roles in growth and response to biotic stresses. For example, AtLYK1 encodes a plasma membrane-localized receptor kinase protein. AtCERK1 works as a receptor homodimer or as the core element of a hetero-tetramer with AtLYK4/5 or AtLYM1/3. These complexes are involved in initiating PTI responses against fungal or bacterial pathogen infection in Arabidopsis [7,8,10-12,38]. Transcriptome data showed that in the common target tissues for pathogen infection, such as leaves and fruit, some PbrLYPs had relatively higher expression than in other tissues, consistent with a role in host protection. Based on these expression patterns, these PbrLYPs may be regarded as putative defense-related genes at the background level. In China, fruit ring rot and stem wart diseases caused by B. dothidea occur in almost all pear-growing areas, and the target organs include pear fruit, stems, shoots and leaves [40]. Although the pathogenesis of B. dothidea is poorly understood, previous work in apple reported that a LysM-containing protein gene, MdCERK1-2, was involved in anti-fungal defense responses as a PRR and was significantly upregulated after B. dothidea infection [23]. In other words, the chitin signaling pathway was likely recruited during B. dothidea infection. To verify whether PbrLYPs were involved in the defense reaction, we performed an infection treatment and qRT-PCR analysis. Our qRT-PCR results indicated that some PbrLYPs participated in the immune response to B. dothidea infection (Fig. 5 and Fig. S4). In addition, after infection with B. dothidea, significantly increased relative expression of several putative defense-related genes, including PbrLYK1b2, PbrNFP1, PbrWAKL1 and PbrLYM2-2, was detected by qRT-PCR, which is consistent with the case of MdCERK1-2 in apple [23].
It is interesting to note that, although the expression of PbrLYK4/5b in Chinese white pear was strongly up-regulated by the fungal pathogen infection, it could not be detected in Qiuzi pear before or after infection. This may be due to the relatively high expression levels of PbrLYK1b1, PbrLYK1b2 and PbrWAKL1 in Qiuzi pear compared to Chinese white pear. If some of the LYPs are the core PRRs of the chitin perception complex and are able to form homodimers, as in Arabidopsis, then these highly expressed proteins may be fully functional on their own and independently activate the chitin signaling pathway. Hence, there may be no need to recruit co-receptors like PbrLYK4/5b to form a recognition complex, possibly accounting for the high pathogen resistance of Qiuzi pear. This question requires further investigation to reveal the infection strategy of B. dothidea and the resistance mechanism of the host pear plant. Furthermore, the subcellular localization analysis demonstrated that pathogen-inducible genes (PbrLYK1b2 and PbrLYK4/5b) were also located at the plasmalemma, suggesting a potential capacity for PbrLYKs to act as PRRs at the subcellular level. Together with the expression analysis, these results were consistent with previous studies that have implicated LYP genes in biotic stress tolerance via chitin binding and activation of the downstream immune response as plasmalemma-located PRRs [7,11,12,38].
However, further investigation will be required to determine whether the expansion of LYPs provides a more advanced pathogen detection model that increases the chances of survival under complex environmental changes, and whether the receptor complex in Chinese white pear or other Rosaceae species is similar to that in Arabidopsis or rice. The characterization of the key elements and the composition of these complexes will also be crucial to understanding the functional mechanisms of LYPs in the Rosaceae.

Conclusions
One hundred twenty-four full-length LYP genes were identified in the eight Rosaceae genomes, including 18 LYP genes in the Chinese white pear genome. Based on the protein sequences and CDS structural characteristics, comparison with Arabidopsis homologs, and phylogenetic analysis, the LYP genes were classified into eight groups, i.e., LYK1, LYK2, LYK3, LYK4/5, LYM1/3, LYM2, NFP, and WAKL, with groups LYK1 and LYK3 possibly having higher functional diversity. According to the collinearity analysis, the ancient and recent WGDs and dispersed duplication may have played a role in the evolution of the LYP gene family in apple and Chinese white pear. The LYP family genes were found to have been strongly shaped by negative (purifying) selection. qRT-PCR revealed that LYP genes might have a vital role in defense against fungal pathogenesis. The data collected here establish a foundation for further studies on the complexity of the LYP gene family in the Rosaceae.

Determination of LYP genes in Chinese white pear and other species of Rosaceae
To identify the LYP genes in pear and other species of Rosaceae, several databases were employed, using the following strategy. The genome sequences of the eight Rosaceae species were downloaded from each genome project (Supplementary Table S3). Subsequently, we built a Hidden Markov Model (HMM) from the extracellular domain sequences of 12 well-studied LYP proteins (AtLYK1-5, AtLYM1-3, OsCERK1, OsCEBiP, OsLYP4 and OsLYP6; the accession numbers and extracellular domain sequences are shown in Table S5) [41], using the HMMER3 software package [42,43], and downloaded the seed file of the Lysin Motif domain (PF01476) from the Pfam database (http://pfam.xfam.org/). The sequences of the eight Arabidopsis proteins and four rice proteins were acquired from TAIR (https://www.arabidopsis.org/) and NCBI (https://www.ncbi.nlm.nih.gov/), respectively. HMM searches with PF01476 and the self-built model were then conducted independently against the local protein databases of the eight Rosaceae species via HMMER3 with E-values < 1e-10. The two resulting gene lists were intersected, and the protein sequences were checked with the NCBI Batch CD-Search tool (Batch CD-Search: https://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi) based on the CDD v3.18 and SMART v6.0 databases to validate the presence of the LysM domain. Protein sequences with E-values greater than 1e-6 or without a LysM domain were removed. The relevant accession numbers of the LYP genes are shown in Table 1.
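The filtering and intersection step described above can be sketched as follows. The sketch assumes the two searches were saved in HMMER3's per-target tabular format (`hmmsearch --tblout`), where comment lines start with `#`, the first field is the target name and the fifth field is the full-sequence E-value; whether the authors used this output format is not stated. The function names and the gene identifiers in the example are hypothetical.

```python
from typing import Iterable, Set


def parse_tblout(lines: Iterable[str], max_evalue: float = 1e-10) -> Set[str]:
    """Collect target names whose full-sequence E-value is below the cutoff."""
    hits = set()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and tblout comment/header lines
        fields = line.split()
        target, evalue = fields[0], float(fields[4])
        if evalue < max_evalue:
            hits.add(target)
    return hits


def candidate_lyps(pfam_lines: Iterable[str],
                   custom_lines: Iterable[str],
                   max_evalue: float = 1e-10) -> Set[str]:
    """Keep only proteins recovered by BOTH the PF01476 and self-built models."""
    return parse_tblout(pfam_lines, max_evalue) & parse_tblout(custom_lines, max_evalue)


if __name__ == "__main__":
    # Hypothetical tblout rows (truncated to the first seven columns).
    pfam = [
        "# target name  accession  query name  accession  E-value  score  bias",
        "geneA - LysM PF01476.20 1e-20 70.1 0.1",
        "geneB - LysM PF01476.20 1e-05 20.0 0.0",
        "geneC - LysM PF01476.20 3e-15 55.0 0.0",
    ]
    custom = [
        "geneA - LYP_ecto - 2e-30 99.0 0.2",
        "geneB - LYP_ecto - 1e-12 40.0 0.0",
    ]
    print(sorted(candidate_lyps(pfam, custom)))
```

In practice the two line iterables would be open file handles for the two result tables; the surviving identifiers would then go on to the CD-Search validation step.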

Structure and conserved motif analysis of the LYP genes
The Gene Structure Display Server (GSDS 2.0) (http://gsds.cbi.pku.edu.cn/) was used to analyze the structures of the LYP genes by aligning the cDNA sequences with their corresponding genomic DNA sequences. Conserved motif analysis of the LYP proteins was performed with the online Multiple Expectation Maximization for Motif Elicitation (MEME) tool (http://meme.nbcr.net/meme/cgi-bin/meme.cgi) [44]. The maximum number of motifs was set to 15.

Phylogenetic analysis
Phylogenetic trees were constructed with the Neighbor-Joining (NJ) method and 1000 bootstrap replicates in MEGA7.0 (http://www.megasoftware.net/) [45]. The p-distance model was used with the pairwise deletion option.
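For readers unfamiliar with the distance measure, the p-distance is the proportion of aligned sites at which two sequences differ, and pairwise deletion means that a site with a gap or missing character is skipped only for the pair being compared. The sketch below illustrates this on already-aligned sequences. It is not MEGA's implementation, and the function name and the choice of `-` and `?` as gap and missing-data symbols are assumptions made for the example.

```python
GAP_CHARS = {"-", "?"}  # assumed symbols for alignment gaps and missing data


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Pairwise deletion: a site is ignored only if one of THESE two sequences
    has a gap or missing character there.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in GAP_CHARS or b in GAP_CHARS:
            continue  # skip this site for this pair only
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared
```

A matrix of such pairwise values is the input to the Neighbor-Joining step. Under complete deletion, by contrast, any column with a gap in any sequence would be dropped for all pairs.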

Chromosomal localization and synteny analysis
Genome annotation files of Arabidopsis and the eight Rosaceae species were downloaded from TAIR and the respective genome projects (Supplementary Table S3). Synteny among the LYPs was analyzed with the same procedure as used in the PGDD (http://chibba.agtec.uga.edu/duplication/) [46]. First, to identify candidate homologous gene pairs, local all-vs-all BLASTP searches were conducted among the genomes of Arabidopsis and the eight Rosaceae species (E < 1e-10). MCScanX was then used to identify syntenic gene pairs, with the BLASTP results and gene location information as input files [47]. The downstream analysis tool duplicate_gene_classifier in the MCScanX package was used to identify tandem, proximal, dispersed, and segmental/whole-genome duplications (WGD) of the LYP family genes. The results were visualized with Circos 0.69 [48]. Ka and Ks values were calculated with KaKs_Calculator 2.0 [49]. To estimate the dates of segmental duplication events, homologous gene pairs within 100 kb on either side of the LYP genes were used to calculate the mean Ks.
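The mean-Ks step can be illustrated with a short sketch. It assumes that gene positions are available as a dictionary keyed by gene ID and that Ks values for homologous pairs have already been computed. The data layout, the function names, and the use of gene start coordinates to measure the 100 kb distance are all assumptions for illustration, not details taken from the original analysis.

```python
from statistics import mean


def within_window(gene, anchor, window=100_000):
    """True if `gene` lies on the anchor's chromosome within `window` bp of its start."""
    return (gene["chrom"] == anchor["chrom"]
            and abs(gene["start"] - anchor["start"]) <= window)


def mean_flanking_ks(lyp_a, lyp_b, pairs, genes, window=100_000):
    """Mean Ks of homologous pairs flanking a duplicated LYP pair.

    pairs: iterable of (gene_id_1, gene_id_2, ks)
    genes: dict mapping gene_id -> {"chrom": str, "start": int}
    Returns None when no flanking pair falls inside both windows.
    """
    anchor_a, anchor_b = genes[lyp_a], genes[lyp_b]
    ks_values = []
    for g1, g2, ks in pairs:
        forward = (within_window(genes[g1], anchor_a, window)
                   and within_window(genes[g2], anchor_b, window))
        reverse = (within_window(genes[g2], anchor_a, window)
                   and within_window(genes[g1], anchor_b, window))
        if forward or reverse:
            ks_values.append(ks)
    return mean(ks_values) if ks_values else None
```

Averaging over the flanking pairs, rather than using the single LYP pair, smooths out gene-specific rate variation when a duplicated block is dated.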

Subcellular localization of the PbrLYPs
The full-length CDSs of PbrLYK1b2 and PbrLYK4/5a1 were each amplified by PCR. The purified products were directionally subcloned into the modified pCAMBIA1300-GFP vector (Clontech, Beijing, China), yielding PbrLYK1b2-GFP and PbrLYK4/5a1-GFP. The primers used for gene cloning and vector construction are listed in Table S4. Agrobacterium carrying each construct was introduced into 4-week-old Nicotiana benthamiana (Nb) leaves following a previously reported method with slight modifications [50]. Images were obtained with the Zeiss LSM Image Browser (Zeiss LSM 780, Germany). At least three independent assays were conducted for each gene. The empty pCAMBIA1300-GFP vector was used as the control.

Infection treatment and quantitative real-time PCR
Seeds of Chinese white pear (Dangshansuli, Pyrus bretschneideri Rehd.) and "Qiuzi" pear (Pyrus ussuriensis Maxim) were obtained from the pear germplasm orchard of the Center of Pear Engineering Technology Research at Hushu, Nanjing, and were grown in soil pots in a phytotron under controlled conditions (2:1 light/dark period, 25°C). Sixty leaves of each pear type were harvested from 15 six-week-old seedlings and placed overnight on filter paper wetted with sterile water in a petri dish. Fresh 5-day-old B. dothidea mycelia grown on PDA plates were then attached to the paraxial surface of the leaves for infection. The infected leaves were frozen in liquid nitrogen at 0 dpi (days post infection), 2 dpi, 4 dpi, and 6 dpi. Total RNA extraction and cDNA synthesis followed the instructions of the RNA kit (Tiangen, Beijing, China) and the PrimeScript RT reagent Kit (Trans Gen). Specific primers for the constitutive TUB gene and the PbrLYP genes were designed with the NCBI online tool Primer-BLAST (https://www.ncbi.nlm.nih.gov/tools/primer-blast/index.cgi?LINK_LOC=BlastHome), with the Organism option under Specificity Parameters set to Pyrus bretschneideri (taxid:225117) (Supplementary Table S4). The specificity of each primer pair was verified against the pear genome with the online program Primer search-Paired. The qRT-PCR assays were conducted with three technical replicates. qRT-PCR reactions (20 μl per well) were performed as previously reported [51]. Expression was evaluated for each sample with the 2^(−ΔΔCt) method, and Duncan's multiple range test was conducted. A P-value of less than 0.05 was considered significant and is indicated with asterisks. Published RNA-seq data (obtained from NCBI BioProjects PRJNA563942 and PRJNA498777) [33,34] were processed to evaluate the expression patterns of PbrLYPs, and differentially expressed genes were identified with |log2FC| > 1. The heatmaps were drawn in TBtools v0.666 [52].
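As a worked illustration of the 2^(−ΔΔCt) calculation, the sketch below averages the technical replicates, normalizes the target gene to the reference gene, and compares an infected time point with the 0 dpi control. The function names and the Ct values in the usage note are hypothetical. The |log2FC| > 1 helper mirrors the cutoff stated for the RNA-seq data, and applying the same helper to qRT-PCR fold changes is an assumption made only for the example.

```python
import math
from statistics import mean


def fold_change(target_ct, ref_ct, target_ct_ctrl, ref_ct_ctrl):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of Ct values from technical replicates:
    target/reference gene in the treated sample, then in the control sample.
    """
    delta_ct_sample = mean(target_ct) - mean(ref_ct)
    delta_ct_control = mean(target_ct_ctrl) - mean(ref_ct_ctrl)
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)


def log2_fold_change(fc):
    """Convert a linear fold change to log2 scale."""
    return math.log2(fc)


def is_differential(log2fc, threshold=1.0):
    """True when |log2FC| exceeds the threshold (1.0 corresponds to a 2-fold change)."""
    return abs(log2fc) > threshold
```

For example, if a target gene has a mean Ct of 24 at 2 dpi and 26 at 0 dpi while the reference gene stays at 18 in both, ΔΔCt = (24 − 18) − (26 − 18) = −2, which corresponds to a 4-fold increase in expression.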
Additional file 1: Table S1. Exon number statistical analysis of LYP genes. Table S2. Duplication type of LYP genes in Arabidopsis and eight Rosaceae species. Table S3. Genome information of eight Rosaceae species. Table S4. Primers of PbrLYPs for qRT-PCR and vector construction. Table S5. Extracellular domain sequences of 12 LYP proteins from Arabidopsis and rice.
Additional file 2: Fig. S1. Gene position of PbrLYPs. The characters in blue indicate the chromosome or scaffold number. The heights of the red columns indicate the lengths of the chromosomes. The numbers on the right side of each chromosome indicate the start position of each gene. Fig. S2. Fifteen MEME motifs of LYPs. Over-represented motifs in Arabidopsis and the eight Rosaceae species were identified using the MEME tool. The stack's height indicates the level of sequence conservation. The heights of the residues within the stack indicate the relative frequencies of each residue at that position. The star symbols under Motif 12 indicate the conserved positions for chitin binding. Fig. S3. Schematic diagram and distribution of conserved motifs among the LYP proteins of Arabidopsis and eight Rosaceae species. a: The schematic diagram of conserved motifs among the LYP proteins detected by NCBI Batch CD-Search. 4EBZ: the ectodomain of AtCERK1 in the PDB database (https://www.rcsb.org/). b: The distribution of conserved motifs among the LYP proteins detected by MEME. The red and black squares represent non-conserved and conserved MEME domains in the subfamily, respectively. The blue square represents the unique motif in the LYK3a subgroup. Fig. S4. qRT-PCR analyses of the PbrLYP genes in Qiuzi pear leaves after B. dothidea infection. The pear actin gene was used as the internal reference for normalization. Asterisks indicate statistically significant differences compared with 0 dpi at the indicated time points (* P < 0.05).