Genome-wide identification and classification of protein kinases in woodland strawberry
Using a hidden Markov model (HMM) approach, a total of 954 putative woodland strawberry PK genes were identified (Additional file 1: Table S1 and Additional file 2: Figure S1), all of which fell into one of nine groups: AGC, CAMK, CK1, CMGC, Plant-specific, RLK, STE, TKL, and “Others”. The RLK group was the largest, accounting for 67.0% of all PK genes. The PK genes were further classified into 124 families (Additional file 4: Table S3), 39 of which contained only one member. The RLK-Pelle_DLSV family was the largest, with 128 members.
Properties of the woodland strawberry kinome
To characterize the 954 strawberry PKs, the gene structures, kinase domains, and predicted subcellular localizations of their putative protein products were determined (Additional file 5: Table S4). Strikingly, 920 strawberry PK genes (96.4%) had two or more kinase domains, whereas the remaining PK genes had only one kinase domain; these genes were distributed across 18 different families (Additional file 6: Table S5).
Analysis of PK gene structure showed that the number of introns per gene varied widely, from 0 to 47, with an average of six. mrna23790 (RLK-Pelle_DLSV) was the PK gene with the most introns. Of the 954 PK genes, 144 (15.1%) lacked introns, 197 (20.6%) contained more than ten introns, and 34 (3.6%) contained more than 20 introns. At the kinase family level, the members of each of the CMGC_SRPK, RLK-Pelle_LRR-VII-1, RLK-Pelle_LRR-VII-2, RLK-Pelle_LRR-VII-3, and RLK-Pelle_RLCK-X families shared the same number of introns. In some other families, however, the exon/intron structure was highly variable. Among the 34 members of the STE_STE11 family, 11 were intronless, whereas each of the remaining 23 contained four to 30 introns. Based on the phylogenetic relationships within the STE_STE11 family, the members could be clearly divided into two clusters according to intron number: one without introns and one that was intron-rich (> 3 introns per gene; Additional file 2: Figure S1). These data suggest that the kinase families underwent their own evolutionary expansions after diverging from one another.
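The intron statistics reported above (range, mean, and the share of genes in each intron class) can be reproduced from a plain list of per-gene intron counts. The following is a minimal sketch only: the function name `summarize_introns` and the toy counts are hypothetical and are not taken from the study's data or pipeline.

```python
from statistics import mean


def summarize_introns(intron_counts):
    """Summarize per-gene intron counts (one integer per gene)."""
    n = len(intron_counts)
    if n == 0:
        raise ValueError("no genes supplied")

    def share(predicate):
        # Return (number of genes, percentage of all genes) matching predicate.
        k = sum(1 for c in intron_counts if predicate(c))
        return k, round(100.0 * k / n, 1)

    return {
        "genes": n,
        "min": min(intron_counts),
        "max": max(intron_counts),
        "mean": round(mean(intron_counts), 1),
        "intronless": share(lambda c: c == 0),
        # Note: this class also includes the genes with more than 20 introns.
        "more_than_10": share(lambda c: c > 10),
        "more_than_20": share(lambda c: c > 20),
    }


# Hypothetical toy data, NOT the strawberry kinome.
toy_counts = [0, 0, 3, 6, 11, 25, 47, 4]
summary = summarize_introns(toy_counts)
print(summary)
```

In this sketch the "more than ten" class is inclusive of the "more than 20" class; whether the reported classes overlap in the same way is not stated in the text.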
To gain further insight into the potential functions of the woodland strawberry PK proteins, the subcellular localization of each predicted protein was determined using Plant-mPLoc. The results indicated that 58.4% of the PKs were predicted to localize to the nucleus and 24.6% to the cell membrane (Fig. 1). The remaining PKs were predicted to localize to the chloroplast, cytoplasm, mitochondrion, peroxisome, or extracellular space (Additional file 3: Table S2). PKs in different kinase groups were predicted to localize to different cellular compartments. All CAMK members (59/59; 100%) and 97.0% (64/66) of CMGC members were predicted to localize to the nucleus, whereas 45.4% (290/639) of RLK members were predicted to localize to the cell membrane. For 23 kinase families, all members were predicted to share the same subcellular location.
Different duplication types among woodland strawberry PKs
Gene duplication plays a crucial role in the evolution of plant genomes and the diversification of protein function [20], and can occur via whole-genome duplication (WGD) or single-gene duplication events [21]. Single-gene duplication can be further divided into tandem duplication (TD), proximal duplication (PD), transposed duplication (TRD), and dispersed duplication (DSD) [20]. The woodland strawberry kinome contained 78 WGD events involving 145 PK genes, 90 of which were RLK genes (Additional file 7: Table S6). In addition, 141 strawberry PK genes underwent 80 TD events, 72 of which occurred in the RLK group. We also identified 58 PD events involving 105 PK genes, 193 TRD events involving 318 PK genes from 71 gene families, and 839 DSD events involving 918 PK genes from 119 gene families. These results indicate that different duplication patterns drove the expansion of woodland strawberry PK genes (Additional file 7: Table S6).
To estimate the relative timing of the different duplication types among the PK genes, the synonymous substitution rates (Ks) of the duplicated gene pairs were determined. The Ks distribution of WGD-derived kinase genes peaked at 1.4 to 1.5, much higher than the peak at 0.2 to 0.3 for TD genes (Fig. 2). For TRD events, the Ks distribution peaked at 1.8 to 1.9, the highest peak value among all duplication types. Because a larger Ks implies an older duplication, these peaks suggest that the TRD of PK genes preceded the WGD-derived duplications, whereas the tandem duplications occurred later than the other types of kinase duplication.
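A Ks peak of the kind described above can be located by binning the Ks values of the duplicated pairs and taking the most populated bin. The sketch below assumes a bin width of 0.1, which matches the ranges quoted in the text but is not confirmed as the study's setting; the function name `ks_peak_bin` and the Ks values are hypothetical.

```python
import math
from collections import Counter


def ks_peak_bin(ks_values, width=0.1):
    """Return the (lower, upper) edges of the most populated Ks bin.

    Ties are resolved in favor of the lowest bin.
    """
    counts = Counter()
    for ks in ks_values:
        if ks < 0:
            raise ValueError("Ks cannot be negative")
        # The small offset guards against floating-point error at bin edges
        # (for example, 0.3 / 0.1 evaluates to slightly less than 3).
        counts[math.floor(ks / width + 1e-9)] += 1
    if not counts:
        raise ValueError("no Ks values supplied")
    best = max(counts, key=lambda i: (counts[i], -i))
    return round(best * width, 10), round((best + 1) * width, 10)


# Hypothetical Ks values for two duplication types (illustration only).
toy_tandem = [0.21, 0.25, 0.28, 0.35, 0.90]
toy_wgd = [1.42, 1.45, 1.48, 0.80, 2.10]

tandem_peak = ks_peak_bin(toy_tandem)
wgd_peak = ks_peak_bin(toy_wgd)
print(tandem_peak, wgd_peak)
```

A histogram peak is a coarse summary; a kernel density estimate or a mixture model would be an alternative when the distribution is broad or multimodal.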
To estimate the selective pressure on strawberry PKs across duplication types, Ka/Ks values were calculated for each gene pair. A Ka/Ks ratio below 1 indicates purifying selection, a ratio equal to 1 implies neutral evolution, and a ratio above 1 indicates positive selection [22]. Almost all gene pairs, across all duplication types, had a Ka/Ks value below 1 (Fig. 3 and Additional file 8: Table S7). The WGD genes had significantly lower median, mean, and quartile Ka/Ks values than the TD and TRD genes (t-test, P < 0.01). These results suggest that WGD-derived gene pairs have a narrower distribution of Ka/Ks values, evolve more slowly, and are under stronger purifying selection than gene pairs derived from other duplication types.
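The interpretation rule for Ka/Ks stated above can be written as a small helper. This is an illustrative sketch rather than the tool used in the study: the function name `classify_selection` and the tolerance parameter `tol` are assumptions introduced here, and the Ka and Ks values in the example are invented.

```python
def classify_selection(ka, ks, tol=0.0):
    """Classify a gene pair by its Ka/Ks ratio.

    Returns (ratio, label), where label is "purifying" (ratio < 1),
    "neutral" (ratio within tol of 1), or "positive" (ratio > 1).
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to compute Ka/Ks")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        return ratio, "neutral"
    return ratio, ("purifying" if ratio < 1.0 else "positive")


# Hypothetical (Ka, Ks) pairs, for illustration only.
toy_pairs = [(0.10, 0.50), (0.60, 0.30), (0.50, 0.50)]
labels = [classify_selection(ka, ks)[1] for ka, ks in toy_pairs]
print(labels)
```

With real estimates an exact ratio of 1 is rare, so a nonzero `tol` or a formal likelihood ratio test would normally be needed before calling a pair neutral.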
Chromosomal distribution of woodland strawberry PKs
To determine the chromosomal distribution of woodland strawberry PKs, a total of 907 genes were mapped and found to be unevenly distributed across the seven chromosomes. Chromosome 6 and chromosome 3, which is the longest, harbored the largest numbers of kinase genes, with 197 and 191 genes, respectively. Chromosome 1 contained the fewest, with 81 PK genes (Fig. 4). PK members of the same group tended to be concentrated on particular chromosomes. For example, the largest numbers of CAMK and STE members were located on chromosome 6, whereas the greatest number of RLK members was located on chromosome 3 (Additional file 3: Table S2). Thus, although the number of PK genes was partly related to chromosome length, the kinase groups were also unevenly distributed among chromosomes.
Functional prediction of woodland strawberry PK genes
To infer the putative functions of woodland strawberry PKs, the GO annotations of all the genes were examined and classified into three main GO categories: biological process, molecular function, and cellular component (Fig. 5). Functional GO terms for the PK genes were also analyzed. The top three GO terms in molecular function were “protein kinase activity”, “ATP binding”, and “protein binding”. The woodland strawberry PKs were enriched in GO terms related to epigenetic processes, such as “protein phosphorylation”; to development, such as “recognition of pollen”; and to signaling cascades, such as “signal transduction”. In the cellular component category, all the PKs were enriched in the term membrane. Furthermore, the enriched biological process and molecular function terms were similar across kinase groups (Fig. 6). However, the PKs in the RLK group were enriched in the term “response to stress”.
Expression patterns of woodland strawberry PKs in different tissues during development
To explore the expression patterns of strawberry PK genes in different tissues, an in silico analysis was conducted on transcriptomic data from carpel, anther, cortex, embryo, ghost, leaf, ovule, pith, pollen, seedling, style, wall, microspores, flowers, perianth, and receptacle [23]. Based on the heatmap cluster analysis of PK expression, the 952 woodland strawberry PK genes were classified into eight clusters (Fig. 7 and Additional files 9, 10, 11, 12, 13, 14, 15 and 16: Figures S2-S9). Cluster 1 contained 204 PKs, with numerous genes exhibiting high expression in microspores, flower, perianth, and receptacle, and low expression in pollen (Additional file 9: Figure S2). In cluster 2, most PK genes also had high levels of expression in microspores, flower, perianth, and receptacle, but low levels of expression in embryo and pollen (Additional file 10: Figure S3). The PK genes in clusters 3, 4, and 5 showed significant down-regulation in pollen (Additional files 11, 12 and 13: Figures S4-S6). In cluster 6, however, most genes had high levels of expression in pollen (Additional file 14: Figure S7). The GO analysis of the PKs in each cluster supported these results: the woodland strawberry PKs in clusters 1 to 6 were all enriched in the GO term “recognition of pollen” (Additional file 17: Figure S10). Interestingly, the PK genes that had high expression levels in microspores, flower, perianth, and receptacle had low expression levels in pollen. To further explore the relationship between woodland strawberry PK gene families and expression patterns in pollen, a heatmap was constructed (Fig. 8). Whereas most PK families had low expression in pollen, the RLK-Pelle_RLCK-VIIa-1, RLK-Pelle_RLCK-VIIa-2, and RLK-Pelle_PERK-1 kinase families were significantly up-regulated in pollen. Taken together, these results suggest that PK families have distinct expression patterns with regard to tissue type.
RNA-seq analyses of woodland strawberry PK genes in response to gray mold infection
Botrytis cinerea is the causal agent of gray mold disease, which causes serious economic losses in fresh strawberry. To investigate whether strawberry PK genes are associated with the defense of mature strawberry fruits against this pathogen, we mined transcriptome data from mature fruits infected with B. cinerea. A total of 109 kinase genes (in clusters 1 and 2) were differentially expressed, showing significant up- or down-regulation in response to B. cinerea attack (Fig. 9). Interestingly, among the 46 down-regulated genes (cluster 1), 38 (82.6%) were from the RLK group (Additional file 18: Figure S11). Moreover, 50 (79.4%) of the 63 up-regulated kinase genes (cluster 2) were RLK genes (Additional file 19: Figure S12). In contrast, most woodland strawberry PK genes in cluster 3 showed little change compared with the control upon B. cinerea infection (Additional file 20: Figure S13). The heatmap indicates that the 109 strawberry kinase genes in clusters 1 and 2 may play important roles in the response to B. cinerea. In addition, genes in the RLK group were associated with strawberry gray mold disease responses.
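The RLK proportions quoted above follow directly from the reported gene counts, and the same arithmetic gives the RLK share among all 109 differentially expressed kinase genes. The sketch below uses only the counts stated in the text; the helper name `percent_share` and the dictionary layout are illustrative choices, not part of the original analysis.

```python
def percent_share(members, total):
    """Percentage of `total` represented by `members`, to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * members / total, 1)


# Counts as reported in the text: (RLK genes, all kinase genes) per cluster.
clusters = {
    "cluster 1 (down-regulated)": (38, 46),
    "cluster 2 (up-regulated)": (50, 63),
}

shares = {name: percent_share(rlk, all_pk) for name, (rlk, all_pk) in clusters.items()}

# Combined share across both differentially expressed clusters.
rlk_total = sum(rlk for rlk, _ in clusters.values())
pk_total = sum(all_pk for _, all_pk in clusters.values())
combined_share = percent_share(rlk_total, pk_total)

print(shares, rlk_total, pk_total, combined_share)
```

For comparison, the RLK group makes up 67.0% of the whole kinome, so a formal enrichment test (for example, a hypergeometric test) would be needed to judge whether these shares are unusually high.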