The Arabidopsis Kinome: phylogeny and evolutionary insights into functional diversification
- Monika Zulawski†1,
- Gunnar Schulze†1,
- Rostyslav Braginets1,
- Stefanie Hartmann2 and
- Waltraud X Schulze3
© Zulawski et al.; licensee BioMed Central Ltd. 2014
Received: 23 March 2014
Accepted: 25 June 2014
Published: 1 July 2014
Protein kinases constitute a particularly large protein family in Arabidopsis with important functions in cellular signal transduction networks. At the same time Arabidopsis is a model plant with high frequencies of gene duplications. Here, we have conducted a systematic analysis of the Arabidopsis kinase complement, the kinome, with particular focus on gene duplication events. We matched Arabidopsis proteins to a Hidden-Markov Model of eukaryotic kinases and computed a phylogeny of 942 Arabidopsis protein kinase domains and mapped their origin by gene duplication.
The phylogeny showed two major clades of receptor kinases and soluble kinases, each of which was divided into functional subclades. Based on this phylogeny, association of yet uncharacterized kinases to families was possible which extended functional annotation of unknowns. Classification of gene duplications within these protein kinases revealed that representatives of cytosolic subfamilies showed a tendency to maintain segmentally duplicated genes, while some subfamilies of the receptor kinases were enriched for tandem duplicates. Although functional diversification is observed throughout most subfamilies, some instances of functional conservation among genes transposed from the same ancestor were observed. In general, a significant enrichment of essential genes was found among genes encoding for protein kinases.
The inferred phylogeny allowed classification and annotation of yet uncharacterized kinases. The prediction and analysis of syntenic blocks and duplication events within gene families of interest can be used to link functional biology to insights from an evolutionary viewpoint. The approach undertaken here can be applied to any gene family in any organism with an annotated genome.
Protein kinases constitute a protein family with functions in cellular signal transduction pathways. In the model plant Arabidopsis thaliana, about 4% of the genes encode protein kinases, which can be referred to collectively as the kinome. These different protein kinases can be subdivided into several families according to their function, structure, and phylogenetic relationships. Roughly 60% of all protein kinases belong to the large superfamily of receptor kinases (RLK), including the large family of transmembrane leucine-rich-repeat (LRR) receptor kinases. The so-called receptor-like cytoplasmic kinases (RLCK), which lack extracellular and transmembrane domains, are also part of the receptor kinase clade. The clade of soluble kinases comprises the most prominent families, namely the cyclin-dependent kinases (CDK) involved in cell-cycle regulation, the mitogen-activated protein kinases (MAPK, MAPKK, MAPKKK), which constitute transmission cascades for responses to extracellular stimuli, the AGC kinases, and the kinases decoding calcium signals (CDPK-SnRK superfamily). Although mechanisms leading to expansion of the receptor-like kinases in Arabidopsis were suggested, a comprehensive analysis of the entire Arabidopsis kinome with respect to gene duplication patterns has not been carried out so far.
Gene duplication events present an important mechanism for the generation of evolutionary novelties, and at least six different types of duplications including tandem duplications, whole-genome and segmental duplications, as well as transpositions can be distinguished. For example, duplication events may affect local genes by single-gene duplications, blocks of genes on chromosomes by segmental duplications, or entire genomes through whole-genome duplications. Each mechanism of duplication results in a potentially changed expression context of a gene or leads to genetic and functional redundancies. However, due to the reduction of selective pressure on redundant gene copies in duplicated regions, duplicates may be lost or pseudogenized. The term ‘syntenic region’ is widely used in the context of gene duplication analysis and the evolutionary history of genes and genomes. In classical genetics, synteny refers to the colocalization of genes or genomic regions on the same chromosome. The concept of collinearity, on the other hand, refers to a conserved gene order between the ancestral and the potentially duplicated genomic region and may thus be used to infer synteny in the context of gene duplications [5–7].
Several models for the retention and loss of duplicated genes have been proposed, and it is believed that, at least in plants, genes are retained or lost in a biased manner with respect to their mode of duplication and functional context [8–10]. Plants, especially angiosperms, are known for their high frequency of chromosomal and whole-genome duplications. Arabidopsis thaliana alone has experienced at least two recent whole genome duplication events in the period between its divergence from Carica papaya (∼72 million years ago) and Arabidopsis lyrata (∼10 million years ago), commonly referred to as α and β duplication events, respectively. In addition, there was an ancient paleohexaploidy event shared between all rosids. As a consequence, the expansion and functional diversification of gene families was largely shaped by gene duplication events, and a number of studies have reported their impact on the evolution of resistance genes and various other large gene families [6, 9].
In this study, the freely available MCScanX toolkit was used to detect collinear regions in Arabidopsis thaliana and classify duplicated kinase genes according to their most likely mode of generation. Classifications were further refined by the MCScanX-transposed extension using Arabidopsis lyrata and Populus trichocarpa as outgroups. To gain insight into patterns of retention and loss of duplications within protein kinase families, inferred syntenic regions were mapped onto a phylogeny of 940 kinases and then linked to gene expression data, family gene annotations and loss-of-function phenotypes. Besides assembly and phylogenetic evaluation of the Arabidopsis kinome, our study provides insights into the functional diversification among the protein kinases in the context of gene duplications.
Phylogeny of Arabidopsis kinases
An alignment of 491 eukaryotic protein kinases was downloaded on Feb 9, 2012 from http://kinase.com/human/kinome/phylogeny.html, and this alignment was used to compute a profile Hidden Markov Model (HMM) using the software HMMer. All representative gene models from Arabidopsis thaliana (TAIR10_pep) were searched against the profile HMM using HMMer. In total, 1,045 sequences generated hits with an E-value lower than 0.01. These sequences were then aligned to the profile HMM. Two sequences (AT1G11300.1 and AT2G32800.1) each had two distinct kinase domains, and both domains per gene were therefore included as separate sequences in the alignment, resulting in an alignment of 1,047 distinct Arabidopsis kinase domains (Additional file 1). All alignment positions not part of the profile HMM were removed from the alignment. In addition, all sequences covering less than 70% of the profile HMM were removed using the software REAP. The cutoff value of 70% corresponded to a threshold in the distribution of sequence coverage, with a sharp decline in coverage for the 111 kinase domains below 70% (Additional file 2). The final alignment then consisted of 317 columns from 942 sequences (kinase domains) and was used to compute a maximum likelihood phylogeny and 100 bootstrap replicates using the PROTCATWAG model of the RAxML program.
The mapping of genes to kinase families was based on an extensive literature search. In the case of unknown or unreported family annotation, the gene phylogeny as well as domain structural information was used to infer the most likely annotation for genes according to their clade membership. This approach resulted in the (re)assignment of 115 previously lacking or ambiguous annotations. The original tree file of the phylogeny has been deposited at Dryad under the reference number pq7d7 (doi:10.5061/dryad.pq7d7).
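The coverage filter described above can be illustrated with a minimal sketch. It is not the REAP software used in the study; it only assumes that each aligned sequence has already been reduced to the columns of the profile HMM and that gaps are written as "-" or ".". The function names and the toy sequences are hypothetical.

```python
from typing import Dict

GAP_CHARS = set("-.")  # assumed gap symbols in the aligned sequences


def hmm_coverage(aligned_seq: str) -> float:
    """Fraction of alignment columns (HMM match positions) holding a residue."""
    if not aligned_seq:
        return 0.0
    residues = sum(1 for ch in aligned_seq if ch not in GAP_CHARS)
    return residues / len(aligned_seq)


def filter_by_coverage(alignment: Dict[str, str],
                       min_coverage: float = 0.70) -> Dict[str, str]:
    """Keep only sequences that cover at least min_coverage of the HMM."""
    return {name: seq for name, seq in alignment.items()
            if hmm_coverage(seq) >= min_coverage}


# Hypothetical toy alignment with 10 columns per sequence.
toy_alignment = {
    "kinase_A": "MKVLG-SGAF",  # 9 of 10 columns covered
    "kinase_B": "MK------AF",  # 4 of 10 columns covered
    "kinase_C": "MKVLGE--AF",  # 8 of 10 columns covered
}
kept = filter_by_coverage(toy_alignment)
print(sorted(kept))
```

With the 70% threshold from the text, only the two toy sequences with at least seven occupied columns are retained.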
Detection of syntenic blocks and classification of duplication types
To predict segmentally duplicated blocks in the Arabidopsis thaliana genome and further classify and count other types of gene duplications, a local installation of the MCScanX toolkit was obtained from the MCScan webpage (http://chibba.pgml.uga.edu/mcscan2/). To prepare sequences for analysis, a local installation of the BLAST+ suite (version 2.2.27) was obtained from NCBI. Protein sequences of representative gene models and associated annotation files were downloaded from TAIR v10 (ftp://ftp.arabidopsis.org/home/tair/Genes/TAIR10_genome_release/) for Arabidopsis thaliana and from phytozome.net (http://www.phytozome.net/) for Arabidopsis lyrata and Populus trichocarpa. Arabidopsis thaliana whole genome protein sequences were queried against databases of Arabidopsis thaliana, Arabidopsis lyrata and Populus trichocarpa using blastp with an E-value cutoff of 10⁻⁵, restricting the output to a maximum of five hits per gene. The resulting hits served as input for the MCScanX toolkit, which was used to detect and classify syntenic regions. Detection of collinear blocks and duplication classification were performed by the MCScanX algorithm and associated downstream tools using default parameters. To further enhance the duplication classification and allow for the detection of transposed genes, the MCScanX-transposed extension was employed using Arabidopsis lyrata and Populus trichocarpa as outgroups.
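The hit selection described above (E-value cutoff of 10⁻⁵, at most five hits per gene) can be sketched as a post-filter on tabular BLAST output. This is an illustration only, not the authors' pipeline: it assumes the standard 12-column tabular format, in which the query is column 1, the subject column 2 and the E-value column 11. The function name, helper and identifiers are hypothetical.

```python
import csv
import io
from collections import defaultdict


def top_hits(blast_tab, max_evalue=1e-5, max_hits=5):
    """Return up to max_hits best subjects per query from tabular BLAST rows."""
    best = defaultdict(dict)  # query -> {subject: lowest E-value seen}
    for row in csv.reader(blast_tab, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue > max_evalue:
            continue
        previous = best[query].get(subject)
        if previous is None or evalue < previous:
            best[query][subject] = evalue
    # Rank subjects by E-value (ties broken by name) and truncate.
    return {
        query: [s for s, _ in sorted(subjects.items(),
                                     key=lambda kv: (kv[1], kv[0]))[:max_hits]]
        for query, subjects in best.items()
    }


def make_row(query, subject, evalue):
    """Build one hypothetical 12-column tabular line for the toy example."""
    return "\t".join([query, subject, "90.0", "100", "10", "0",
                      "1", "100", "1", "100", str(evalue), "200"])


toy = io.StringIO("\n".join(
    [make_row("q1", "s%d" % i, 10.0 ** (-70 + 10 * i)) for i in range(1, 7)]
    + [make_row("q1", "s7", 1e-3),   # fails the E-value cutoff
       make_row("q2", "s1", 1e-20)]
))
hits = top_hits(toy)
print(hits)
```

In the toy input, six subjects pass the cutoff for `q1`, so the weakest of them is dropped by the five-hit limit, and `s7` is removed by the E-value cutoff.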
Enrichment analysis and calculation of expected counts
For each kinase family, the ratio of observed to expected counts per duplication event was calculated. This ratio for tandem duplications was plotted against the ratio for segmental duplications in a bidirectional boxplot. The expected duplication frequency of segmental and tandem duplication events in each family was calculated as follows. For tandem counts, a simulation was carried out, placing N genes of size 1 kb (where N is the size of the corresponding gene family) in a genome of approximately 100,000 kb and counting how many pairs of genes were within a 50 kb window. The gene family size (N) was varied between 10 and 300 in steps of 10. Each simulation was repeated 1000 times, and the results were averaged to yield the expected tandem counts for each size class. Relying on previous reports on the frequency of segmental duplications in Arabidopsis thaliana, the expected proportion of the genome present in at least one segmentally duplicated block was approximately 75%. Thus, assuming no bias, the average count of segmentally duplicated genes in each gene family can be estimated by the relation seg_exp = N × 0.75, where N is the number of genes in the respective gene family.
To evaluate significant differences in duplication types between families, an enrichment analysis was carried out by employing Fisher’s exact test under the null hypothesis of no association between a particular subfamily and the frequency of a particular duplication mechanism. Each combination of subfamily and duplication mechanism was tested separately, and the obtained p-values were corrected for multiple hypothesis testing by Benjamini-Hochberg correction. Additionally, Pearson residuals from Chi-squared tests were used to assess the direction (enrichment/depletion) and strength of deviation from the associations between subfamily and duplication mechanism expected under the null hypothesis.
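The simulation of expected tandem counts can be sketched as follows. The study's own simulation was run in R and its details are not given, so several points here are assumptions: genes are placed uniformly at random on a single linear genome, overlaps are allowed, and two genes count as a pair when their start positions lie at most 50 kb apart. All names are hypothetical.

```python
import random

GENOME_KB = 100_000  # approximate genome size from the text
GENE_KB = 1          # gene size from the text
WINDOW_KB = 50       # window for counting a gene pair


def count_close_pairs(starts, window_kb=WINDOW_KB):
    """Count gene pairs whose start positions are at most window_kb apart."""
    starts = sorted(starts)
    pairs = 0
    left = 0
    for right in range(len(starts)):
        # Advance the left pointer until it is within the window of `right`.
        while starts[right] - starts[left] > window_kb:
            left += 1
        pairs += right - left
    return pairs


def expected_tandem(n_genes, repeats=1000, rng=None):
    """Average number of close pairs over repeated random placements."""
    if rng is None:
        rng = random.Random(0)  # fixed seed, for reproducibility only
    total = 0
    for _ in range(repeats):
        starts = [rng.uniform(0, GENOME_KB - GENE_KB) for _ in range(n_genes)]
        total += count_close_pairs(starts)
    return total / repeats


def expected_segmental(n_genes, fraction=0.75):
    """Expected segmental count: seg_exp = N * 0.75."""
    return n_genes * fraction


# The text varies N from 10 to 300 in steps of 10; three sizes are shown here.
for n in (10, 100, 300):
    print(n, round(expected_tandem(n), 2), expected_segmental(n))
```

Under these assumptions the expected pair count grows roughly with N(N-1)/2 times the chance that two random positions fall within 50 kb of each other, which is why the expectation must be computed per family size rather than as a fixed proportion.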
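The enrichment statistics named above can be sketched in plain Python. This is a hedged illustration rather than the R code used in the study: it assumes a two-sided Fisher's exact test on a 2×2 table (subfamily versus all other kinases, duplication type versus all other types), and the counts in the example are invented.

```python
from math import comb, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more probable than the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def pearson_residual(a, b, c, d):
    """Pearson residual of the top-left cell: (observed - expected) / sqrt(expected)."""
    n = a + b + c + d
    expected = (a + b) * (a + c) / n
    return (a - expected) / sqrt(expected)


# Hypothetical counts: [in subfamily & tandem, in subfamily & other,
#                       other kinases & tandem, other kinases & other]
tables = {"family_X": (12, 8, 90, 830), "family_Y": (2, 38, 100, 800)}
raw = {name: fisher_exact_two_sided(*t) for name, t in tables.items()}
adj = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
for name, t in tables.items():
    print(name, raw[name], adj[name], pearson_residual(*t))
```

A positive residual indicates more genes of that duplication type in the subfamily than expected under independence (enrichment), and a negative residual indicates depletion.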
Data analysis and visualization
Visualization of phylogenetic trees, simulations and statistical analyses were conducted in R (http://www.r-project.org/) using the packages ape and phangorn. Customized Perl scripts were used to parse input and output files to and from the MCScanX utility. Results were stored and queried using the R package RSQLite (http://cran.r-project.org/web/packages/RSQLite/index.html) in combination with a SQLite3 database, which is available on request. Interaction networks were visualized using Cytoscape version 3.0.2. Phylogenetic trees were computed with RAxML and visualized with the program FigTree (Version 4.1, A. Rambaut; http://tree.bio.ed.ac.uk/software/figtree/). Gene expression samples from various developmental stages and tissues specific to the set of investigated kinases were downloaded from Genevestigator (https://www.genevestigator.com/gv/).
Public data sets
Subcellular locations were used based on the consensus location in SUBA3. Phenotypes of loss-of-function mutants were obtained from . Protein-protein interaction data were obtained from AI1. Information on myristoylation, phosphorylation and functional annotation was taken from supplementary materials of mentioned publications and/or from TAIR.
Phylogeny of the Arabidopsis kinome
The phylogeny of Arabidopsis kinases showed a clear division into two major clades of 561 membrane-located receptor kinases and 381 soluble kinases (Figure 1A). The clade of soluble kinases consisted of 21 distinct published kinase families, and most functionally characterized subfamilies formed separate clades. These included four kinase families of the mitogen-activated kinase cascades (MAPK, MAP2K, raf-like and ste-like MAP3K), which are known for transmission of various responses to changes in gene expression. Kinases annotated as MAP3 kinases were split into two separate clades, as noted earlier: the 37 members of the ste-like MAP3 kinases grouped together with MAP2 kinases, and the 48 members of raf-like MAP3 kinases formed a separate clade. To date, few phosphorylation targets for these different MAP3 kinases are known. For the ste-like MAP3 kinases, several MAP2 kinases were found among the target proteins, confirming the classic cascade model in which ste-like MAP3 kinases phosphorylate and activate MAP2 kinases as well as transcription factors [36, 37]. In contrast, raf-like MAP3 kinases contain well known kinases like VIK1 (At1g14000), STY-kinases (At2g17770, At4g35780, At4g38470) and CTR1 (At5g03730). VIK1 was found to be involved in regulation of tonoplast transporters, and CTR1 interacts with the ethylene receptor ETR1 (At1g66340) and phosphorylates the transcription factor EIN3 (At3g20770). Since so far no MAP2 kinases were found to be phosphorylated by raf-like MAP3 kinases, it is most likely that they are a mis-annotated kinase family and do not actually function in MAPK signaling.
Kinases decoding calcium signals were grouped into the families of calcium-dependent kinases (CDPK) and CBL-interacting kinases (CIPK/SnRK3). The latter group was located on the same clade as two other groups of Snf-related kinases (SnRK). The AGC kinases act as effectors of second messengers, are involved in many different processes from blue light perception to auxin signalling and, as expected, form a distinct family. Another soluble kinase family involved in hormone signalling and represented as a separate clade in the phylogeny is the family of Shaggy-like kinases (SLK/GSK3). These kinases act as signal transducers from plasma-membrane located processes to transcription factors or other kinases, and they are best characterized in the context of brassinosteroid signalling. Kinases involved in the regulation of cell organisation and cell division group into the families of cyclin-dependent kinases (CDK), the casein and casein-related kinases (CKII and CKL) [45, 46], the never-in-mitosis kinases (NIMA/NEK) as well as the three AURORA kinases. With-no-lysine kinases (WNK) contribute to the regulation of circadian rhythm.
In addition, 48 soluble kinases without known family annotation were found to form separated clades in the phylogenetic tree, but for most of these kinases no functional information is available yet. Based on their placement on the phylogeny, we were able to annotate 20 of these soluble kinases: one was defined as AGC kinase (PDK1;3, AT2G20050), one as CDK (CDKC1;2, AT3G01085), three as members of the Raf-like MAP3 kinases family and 15 as ste-like MAP3 kinases. All newly identified kinases were marked with one asterisk in the proposed annotation in Additional file 1.
The annotation of receptor-like kinases on the phylogeny confirmed earlier efforts of receptor kinase classification based on whole sequences and extracellular domain structures. The division of this large clade into receptor kinases (RLK) and receptor-like cytoplasmic kinases (RLCK) corresponds well with the information on subcellular location obtained from SUBA3. Sequences without annotation were assigned based on their sister-group relationship to known kinases (Additional file 3). These newly annotated receptor kinases were marked with one asterisk in Additional file 1. In those cases where existing protein annotations were in disagreement with the annotation of the majority of kinases in the same clade, these kinases were renamed and marked with two asterisks in Additional file 1 (see also Additional file 3). Branches in the phylogeny with non-annotated or mis-annotated kinases did not differ in their bootstrap values from the already annotated kinases. Thus, the newly annotated kinases had the same degree of support for family membership from the bootstrap values as already annotated family members.
In contrast to the family annotation for soluble kinases, the functional context of only a few RLKs is known. Therefore, the functional annotation of RLKs is mainly derived from domain structure and homologies to kinases in yeast or animals rather than from the activating substrate or the pathway in which they act. Further investigation of biological processes and targets for most of the RLKs is needed to provide a similar quality of functional annotation for the RLK families as is already available for the soluble kinase families.
Subcellular localizations for the soluble kinases according to SUBA3 ranged from nucleus to plasma membrane (Figure 1B) and were in good agreement with the division of the phylogeny into the cytoplasmic kinases, receptor kinases and membrane-located cytoplasmic kinases. Within the soluble kinase clade, membrane location was often achieved by posttranslational modifications. Myristoylation is a posttranslational modification of proteins allowing a reversible protein association with the plasma membrane. Currently, 437 proteins are known to be myristoylated (Additional file 1), among them 83 kinases from our analysis. The subfamilies RLCK_2 and RLCK_7 contained membrane-associated kinases with 10 and 12 members known to be myristoylated, respectively. Most calcium-dependent kinases, all CDPK-related kinases, and all CDK-like kinases are also soluble kinases with known reversible membrane interactions.
The origin of the Arabidopsis kinome: insights from gene duplication analysis
Table 1. Summary of duplications identified by MCScanX (Arabidopsis thaliana against itself). Table body not recovered; surviving column header: All genes (%).
Distribution of duplication patterns to kinase subfamilies
Pie charts in Figure 3B indicate the proportion of genes that are assigned to particular subcellular compartments. Some kinase families, such as RLCK_9, contained genes with very different consensus localizations. Treating the majority subcellular localization within each subfamily as representative, we observed a tendency of receptor kinase families with predominant plasma membrane localization to preserve genes arising from tandem duplications rather than those originating from segmental duplications. Conversely, most cytosolic kinase subfamilies exhibited moderate to low ratios of observed to expected counts for tandem duplications and relatively large ratios for segmental duplications. With the exception of CKII, soluble kinase subfamilies showed increased ratios in either tandem or segmental duplications but not in both. Tandem duplication was already proposed as one of the major mechanisms of expansion of the large receptor kinase family in Arabidopsis. Although tandem duplication events are not predicted to be the major mode of gene copy generation for the entire set of kinase families in this study (Table 1), a certain tendency to retain tandem duplicates within the receptor kinases could be confirmed.
An enrichment analysis of particular duplication events associated with specific kinase subfamilies was performed by Fisher’s exact test (Figure 3B), and the obtained p-values were corrected for multiple testing. The LRR clade 1 (containing the subfamilies LRR_6A, LRR_6B, LRR_1) and the mixed clade 1 (containing the subfamilies CRR, RLCK_1, LRR_8A, RLCK_3, RLCK_4, WAK) were found to be significantly depleted (p < 0.05) for WGD/segmental duplications, while the LRR clade 2 (containing the subfamilies LRR_9A, LRR_9) was significantly enriched (p < 0.05) for duplicates originating from segmental duplications. LRR clade 3 (containing the subfamilies LRR_4, LRR_8B, LRR_8C) was found to be depleted for dispersed duplicates (p < 0.001) and enriched for tandem duplicates (p < 0.001) and proximal gene copies (p < 0.05). LRR clade 1 and LRR clade 3 were also significantly enriched for proximal duplications (p < 0.001 and p < 0.05, respectively). All subfamilies, except the non-characterized soluble kinase subfamily, were found to be weakly depleted for singleton genes.
Functional divergence of duplications
To assess the significance of protein kinase functional diversification in the context of whole plant functions, we studied the distribution of phenotypes from loss-of-function kinase mutants. Currently, about 2,400 genes with detectable loss-of-function phenotypes in Arabidopsis have been identified, among them 76 kinases represented in our phylogeny. Thus, kinases constitute 3.25% of all genes with a detectable loss-of-function phenotype. Overall, 45% of these kinases show a conditional phenotype, 29% a morphological, 18% a lethal, and 8% a biochemical phenotype (Additional file 6). Most of these phenotypes were described for soluble kinases (soluble kinases: 11.5%; receptor kinases: 5.7%), particularly calcium-dependent kinases, all three SnRK families, AGC kinases as well as six of 48 kinases without family annotation. The loss-of-function mutants in kinases were generally significantly enriched for essential phenotypes (p = 0.020, Fisher’s exact test), cellular and biochemical phenotypes (p < 0.001, Fisher’s exact test) as well as for conditional phenotypes (p = 0.002, Fisher’s exact test) compared to loss-of-function mutants in non-kinase genes. There was no enrichment for morphological phenotypes among the kinase mutants. Kinases with essential phenotypes were observed to contain a high proportion of dispersed duplications (p = 0.031), while kinases resulting in morphological loss-of-function phenotypes were particularly enriched for segmental duplications (p = 6.48E-95), supporting the tendency for functional diversification in dispersed duplications and functional conservation in segmental duplications.
Kinases within the cellular interaction network
This study aimed at investigating the role of gene duplication events in the serine-threonine-tyrosine kinase complement of Arabidopsis thaliana. We constructed a phylogeny of eukaryotic kinase families and undertook efforts to link gene duplication events to functional diversification or conservation based on gene expression data.
The phylogeny of Arabidopsis kinase subfamilies has been intensely studied for individual kinase families, especially for CDPKs, MAP kinases, lectin receptor kinases, and members of the large family of receptor kinases. Recent genome-wide and species-wide approaches to the classification and annotation of eukaryotic protein kinases were based on Hidden Markov Model profiles [59, 60]. For plants, the most comprehensive classification can be found in the Eukaryotic Kinase and Phosphatase Database (EKPD). Overall, a total of 111 Arabidopsis kinases listed in EKPD were not part of our phylogeny (Additional file 7). In contrast to EKPD, kinase domains were here also filtered for a minimum of 70% sequence coverage of the HMM representing the kinase domain, assuming that kinases with less than 70% coverage of the whole HMM kinase domain would not function as kinases. On this basis, 94 proteins with an annotated kinase domain were excluded. Their kinase domains showed large gaps in the sequence covering conserved regions of the model kinase domain, occasionally with even half of the domain missing. Thus, especially the 50 excluded members of the receptor kinase group may have functionally degenerated kinase domains. In particular, 26 members of the atypical kinases were excluded, which show high similarities to prokaryotic kinases and are particularly abundant in plastid and mitochondrial locations. Another 17 kinases listed in EKPD did not match the HMM profile we used as template. Two as yet uncharacterized plastidial kinases matching our HMM profile of a eukaryotic kinase were not classified as kinases in EKPD. Therefore, family-specific HMM models, as already used in EKPD, will be valuable in the annotation of all kinases in Arabidopsis and in eukaryotes in general. Histidine-receptor kinases, which originate from bacterial two-component signalling, were considered neither in EKPD nor here.
In total we included 940 kinases with a eukaryotic kinase domain in the analysis, out of which only 553 kinases have been functionally characterized so far, and for 298 of these kinases we have some regulatory information. Based on the phylogeny, we were able to newly annotate 77 soluble kinases and 108 receptor-like kinases (Additional file 1) and assign them to an existing subfamily. Thus, the phylogeny in itself provided an important contribution to the definition and classification of protein kinases with unknown function.
Determination of syntenic regions and family based enrichment
Due to the complex history of duplication events, especially in plants, the identification of syntenic regions within and between genomes is a nontrivial task, and conclusions drawn from publicly available datasets may rest on controversial assumptions related to the particular organism under study. To aid such analyses, several recent efforts were made to automate and generalize the process of detecting and evaluating syntenic genome regions. In addition, integrative web-based resources, for example the comparative genomic system platform (CoGe), provide interactive frameworks for query and visualization of syntenic regions within and between genomes. Thus, we focused on the recently published MCScanX utility and based our analysis on the comparison of Arabidopsis thaliana against itself, as well as comparisons with close (Arabidopsis lyrata) and distant (poplar) relatives.
To determine enrichments for specific types of duplication events in different kinase families, a ratio-based approach (observed/expected) and an enrichment analysis were used. The ratio-based approach showed that tandem duplication ratios tended to be considerably higher than segmental duplication ratios. However, since the detection of subfamily characteristics depends on relative differences between subfamilies, this only marginally affected the conclusions drawn from the analysis. It is important to note that the ratio-based analysis is in general not redundant to the enrichment analysis, since the enrichment analysis incorporates the family size parameter. For example, the LRR_clade_3 shows only minor deviations from the median ratios in the boxplot but is found to be significantly enriched for tandem duplicates (p < 0.001) by Fisher’s exact test and Chi-squared residuals.
An interesting observation was that the kinases analyzed here constituted 3.4% of the total genome but made up 4.5% of all duplicated genes, suggesting higher frequencies of duplicated genes in this gene family. This is in line with the observation that kinases are significantly under-represented among single-copy genes. Thus, duplications among kinases showed a tendency to be retained and gave rise to functional diversification.
Functional diversification and conservation
A tendency for frequent dispersed duplications and associated diversification in gene expression was observed particularly for some of the receptor kinase families. This large kinase subclade was also found to be especially affected by strong diversification through single nucleotide polymorphisms [61, 62]. Since receptor kinases have roles in pathogen defense, self-incompatibility and various developmental processes, functional diversification of this gene family allows rapid adaptations to specific environmental conditions. In contrast, some of the soluble kinase families, such as the RLCK, MAP kinase and SnRK families, showed a tendency to be duplicated as a result of segmental duplications, with associated functional conservation based on gene coexpression. This is in line with findings that cytoplasmic proteins and proteins involved in cellular metabolism are also less frequently affected by phospho-specific nucleotide polymorphisms.
The prediction and analysis of syntenic blocks and duplication events within gene families of interest can be used to link knowledge from functional biology and proteomics to insights from an evolutionary viewpoint. In our study, the kinome of Arabidopsis thaliana was analysed with respect to characteristic patterns of various types of gene duplication modes in combination with subcellular localization, gene expression, and phenotypic data. In summary, cytosolic protein kinases and receptor-like protein kinases exhibit different frequencies in the retention of genes duplicated through segmental and tandem duplications, respectively, and this resulted in different degrees of functional diversification. The phylogeny allowed classification and annotation of yet uncharacterized kinases. The approach undertaken here can be applied to any gene family in any organism with an annotated genome.
Availability of supporting data
Supplementary material is available as additional files through BioMed Central. The original tree file of the phylogeny has been submitted to Dryad (http://datadryad.org) and is available under the reference number doi: 10.5061/dryad.pq7d7.
MZ, GS, RB and WXS were funded by the Max-Planck Society and SH was funded by the University of Potsdam.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.