- Research article
- Open Access
Plant protein peptidase inhibitors: an evolutionary overview based on comparative genomics
© Santamaría et al.; licensee BioMed Central Ltd. 2014
- Received: 17 September 2014
- Accepted: 18 September 2014
- Published: 25 September 2014
Peptidases are key proteins involved in essential plant physiological processes. Although protein peptidase inhibitors are essential molecules that modulate peptidase activity, their global presence in different plant species remains unknown. Comparative genomic analyses are powerful tools for gaining insight into the presence and evolution of both peptidases and their inhibitors across the Viridiplantae kingdom.
A comparative genomic analysis of peptidase inhibitors and several groups of peptidases in representative species of different plant taxonomic groups has been performed. The results indicate that: i) clade-specific presence is common to many families of peptidase inhibitors, although some families are present in most land plants; ii) variability is a widespread feature of peptidase inhibitory families, with abundant species-specific (or clade-specific) gene family proliferations; iii) peptidases are more conserved across plant clades, with the C1A papain and S8 subtilisin families present in all species analyzed; and iv) a moderate correlation between peptidases and their inhibitors suggests that inhibitors proliferated to control both endogenous and exogenous peptidases.
Comparative genomics has provided valuable insights into plant peptidase inhibitor families and could explain the evolutionary reasons that led to the current variable repertoire of peptidase inhibitors in specific plant clades.
- Comparative genomics
- Peptidase inhibitors
- Plant evolution
- Protein families
Proteolysis is a ubiquitous mechanism required to maintain the life cycle in all known organisms. The degradation and recycling of proteins are crucial for controlling protein functionality and for ensuring that proteins act at the correct place and time. In plants, peptidases are key players in numerous physiological processes [1, 2]. During plant development they are involved in the regulation of protein functionality and in the breakdown of storage compounds in the seed and other plant tissues [3–5]. Under biotic and abiotic stress, they take part in the regulation of both endogenous and exogenous proteins [6, 7]. As proteolysis is an irreversible mechanism, peptidases must be precisely controlled. Peptidase activity may be regulated at the transcriptional and translational levels, but the most important control is achieved at the protein level. Peptidase inhibitors are proteinaceous molecules that exert their action by regulating peptidase activity. In plant development, peptidase inhibitors are involved in the same physiological processes as the peptidases they control [8–10]. As defence proteins, they inhibit peptidases from the pests and pathogens that attack the plant [11, 12].
The MEROPS database is dedicated to the analysis of peptidases and peptidase inhibitors. In this database, both peptidases and their inhibitors are classified into clans based on structural similarity or sequence features. All members of a clan share a similar protein fold. Clans are divided into families based on common ancestry. All the members of a family are homologous proteins. At present, 75 peptidase inhibitor families are compiled in the database.
The only study focused on the different peptidase inhibitor families that exist in a specific clade of life has been performed on prokaryotes. In plants, several peptidase inhibitor families such as I4 Serpins (MEROPS family identifier and common name), I13 Potato type I (Pin-I), I25 cystatins or I20 Potato type II (Pin-II) have already been reviewed [4, 9, 15, 16]. However, a global evolutionary analysis of the different inhibitor families across plant species has not yet been performed. The field of genomics has developed considerably in recent years, and numerous tools have arisen to deal with the enormous number of sequences deposited in the databases. A great number of plant genomes have now been sequenced and annotated, including species from basal taxonomic groups. These genomic sequences have been included in several comparative genomics platforms, such as Phytozome, PLAZA or GreenPhylDB [18–20], simplifying the extraction and comparison of information on family members from different plant species [17, 21]. Using these latest-generation tools, the evolutionary features of the distribution of protein peptidase inhibitors in the plant kingdom have been analyzed in this work.
Protein peptidase inhibitor families in plants
To obtain the complete set of protein peptidase inhibitors in plants, several species were selected. The genomes of these species have been completely sequenced and annotated, and drafts of these sequences are available on the web. These species were: twelve eudicots (Ricinus communis, Populus trichocarpa, Medicago truncatula, Glycine max, Cucumis sativus, Prunus persica, Fragaria vesca, Arabidopsis thaliana [29, 30], Carica papaya, Theobroma cacao, Vitis vinifera, Mimulus guttatus), four monocots (Sorghum bicolor, Zea mays, Oryza sativa [37, 38], Brachypodium distachyon), one pseudofern (Selaginella moellendorffii), one moss (Physcomitrella patens [41, 42]), and five algae (Chlamydomonas reinhardtii, Volvox carteri, Coccomyxa subellipsoidea, Micromonas pusilla, Ostreococcus lucimarinus). All the genomes of these plant species are accessible at the Phytozome comparative genomics database, and most of them also at the GreenPhylDB comparative genomics database. Gene prediction quality varies with the annotation stage of the different genomes, so gene family distribution and size could change slightly when new annotation versions are released.
Distribution of protein peptidase inhibitor families in the Viridiplantae [table body not recovered; surviving family labels: I7 Squash serine, I9 Subtilisin propeptide, I29 Papain propeptide, I39 Alpha-2 macroglobulin, I51 SCPY Inhibitor, I83 ANT Inhibitor; clade values used: Chlorophyta, Bryophyta, Lycopodiophyta, Monocots, Eudicots]
Distribution of the restricted protein peptidase inhibitor families
The following families have a restricted distribution in plants, each being specific to particular clades, from algae to land plants:
Family I2: named Kunitz-A, includes mainly animal serine peptidase inhibitors. BLAST searches did not identify these inhibitors in land plants, only in the alga C. reinhardtii. The MEROPS database shows that they are also present in other algal species.
Family I7: the squash serine peptidase inhibitors are specific to plants and have only been described in Cucurbitales. BLAST searches indicate the existence of two members in C. sativus.
Family I18: the mustard family of serine peptidase inhibitors is specific to plants and has only been described in Brassicales. BLAST searches identified six different members in A. thaliana.
Family I37: the potato carboxypeptidase inhibitor family comprises inhibitors of metallopeptidases of the M14 family. Exclusively described in Solanales and not found by BLAST searches on the selected genomes.
Family I39: the alpha-2 macroglobulins are proteins that interact with peptidases regardless of catalytic type. They are abundant in bacteria and animals and, according to the MEROPS database, are also present in M. pusilla and P. trichocarpa. BLAST searches confirm their existence in the alga M. pusilla, but not in P. trichocarpa.
Family I55: the squash aspartic peptidase inhibitors are specific to plants and have mainly been described in Cucurbitales. BLAST searches support their specificity for Cucurbitales, with three members identified in C. sativus.
Family I67: the bromeins are inhibitors of the cysteine peptidase bromelain. Only described in the monocot Ananas comosus and not found by BLAST searches on the selected genomes.
Family I73: the Veronica trypsin inhibitor family has only been described in the eudicot Veronica hederifolia and was not found by BLAST searches on the selected genomes.
Family I83: inhibitors of serine endopeptidases present in insect species and also in the conifer Picea sitchensis. Not found by BLAST searches on the selected genomes.
Family I90: trypsin inhibitors only described in eudicot plants from the order Caryophyllales, and not found by BLAST searches on the selected genomes.
Evolution of the main protein peptidase inhibitor families
Gene content evolution of I1 Kazal in plants
To understand how the I1 Kazal lineage has evolved in the different plant clades, the individual Kazal domains from single-domain proteins were aligned (see Additional file 1A). Extensive amino acid differences prevented the construction of a robust phylogenetic tree using all the Kazal sequences. Thus, sequences contributing to extensive gaps in the conserved regions of the alignment were discarded and a phylogenetic tree was constructed (see Additional file 2A). The corresponding schematic cladogram is shown in Figure 2B. As highlighted, two main clades were found, one comprising algal sequences and the other land plant sequences. The evolutionary groups within the land plant sequences could not be clearly established in the tree. Eudicot sequences were mixed in different groups, with no evidence of species-specific proliferations. Monocot and moss sequences were grouped in separate clades supported by approximate likelihood-ratio test (aLRT) values higher than 65%, but within a monophyletic clade shared with eudicot sequences. This cladogram suggests that the Kazal family in plants has evolved differently in algae and land plants, and that extensive sequence variation has taken place in angiosperm species.
Gene content evolution of I3 Kunitz-P in plants
To avoid the difficulty of building and interpreting a phylogenetic tree from all 174 sequences, a subset was selected. The sequences from the eudicot species A. thaliana, M. truncatula and F. vesca and from all the monocot species were chosen. The individual Kunitz-P domains were aligned (see Additional file 1B). Sequences contributing to extensive gaps in the conserved regions of the alignment were discarded and a phylogenetic tree was constructed (see Additional file 2B). The corresponding schematic cladogram is shown in Figure 3B. As highlighted, monocot and eudicot clades are separated. In the eudicot clade, several species-specific proliferations are detected, comprising from 3 to 11 sequences and supported by aLRT values higher than 80%. These expansions suggest that the evolution of the Kunitz-P family in eudicots is the result of extensive duplications in specific species.
Gene content evolution of I4 Serpin in plants
As for the I3 family, a subset of sequences was chosen to build the phylogenetic tree. The algal, pseudofern and moss sequences, as well as the sequences from the eudicot species A. thaliana, M. truncatula and F. vesca and the monocot species S. bicolor and O. sativa, were selected. Proteins were aligned (see Additional file 1C), sequences contributing to extensive gaps in the conserved regions of the alignment were discarded and a phylogenetic tree was constructed (see Additional file 2C). The corresponding schematic cladogram is shown in Figure 4B. Two main clades were found, one comprising algal sequences and the other land plant sequences, including those from the moss and the pseudofern. As highlighted, different clade-specific proliferations were detected, supported by aLRT values higher than 80%. Three different monocot lineages were found, each including sequences from both S. bicolor and O. sativa. In eudicots, most of the sequences from A. thaliana, M. truncatula and F. vesca were found in separate groups, suggesting species-specific (or clade-specific) proliferations.
Gene content evolution of I6 Cereal in plants
To understand how the I6 Cereal lineage has evolved, the Cereal proteins were aligned (see Additional file 1D). Two sequences from rice with extensive gaps that disturbed the alignment were discarded and a phylogenetic tree was constructed (see Additional file 2D). The corresponding schematic cladogram is shown in Figure 5B. Two different lineages, one for monocot and another for eudicot species, were found, supported by aLRT values higher than 70%.
Gene content evolution of I12 Bowman-Birk in plants
The Bowman-Birk proteins were aligned (Additional file 1E) and a phylogenetic tree was constructed (Additional file 2E). As for the I6 Cereal family, the corresponding schematic cladogram shows two different lineages, one for monocot and another for eudicot species, supported by aLRT values higher than 95% (Figure 6B).
Gene content evolution of I13 Pin-I in plants
As for some other protein families, to avoid the difficulty of building and interpreting a phylogenetic tree from all 242 sequences, a subset was selected. The algal, pseudofern and moss sequences, as well as the sequences from the eudicot species A. thaliana, M. truncatula, F. vesca and V. vinifera, and the monocot species S. bicolor and O. sativa, were chosen. Proteins were aligned (see Additional file 1F) and a phylogenetic tree was constructed (see Additional file 2F). The corresponding schematic cladogram is shown in Figure 7B. As highlighted, two clade-specific proliferations were detected, supported by aLRT values higher than 85%. The monocot lineage included 36 sequences, and the eudicot lineage included 20 sequences. The most divergent monocot and eudicot sequences were not included in these clades.
Gene content evolution of I20 Pin-II in plants
To understand how the I20 Pin-II lineage has evolved in the different plant clades, all the individual Pin-II domains were aligned (see Additional file 1G) and a phylogenetic tree was constructed (see Additional file 2G). The corresponding schematic cladogram is shown in Figure 8B. Two different branches were found, one comprising the pseudofern sequences and the other the angiosperm sequences, supported by aLRT values higher than 90%. The monocot and eudicot sequences were not separated within their clade, suggesting a common evolution and, probably, a loss of this type of inhibitor in several species during evolution. The phylogram and the extensive variation in sequence also suggest a different origin for the pseudofern and angiosperm sequences.
Gene content evolution of I25 cystatin in plants
As for some other families, the algal, pseudofern and moss sequences, as well as the sequences from the eudicot species A. thaliana, M. truncatula, F. vesca and V. vinifera, and the monocot species S. bicolor and O. sativa, were selected. After discarding sequences contributing to extensive gaps in the conserved regions of the alignment, a phylogenetic tree was constructed (see Additional files 1H and 2H). The corresponding schematic cladogram is shown in Figure 9B. As highlighted, two different clade-specific proliferations are detected, supported by aLRT values higher than 80%. One is composed of sequences from all land plant species and the second is composed only of angiosperm sequences. This cladogram suggests that the evolution of the cystatin family in plants is the result of extensive duplications of ancestral genes, followed by the divergence of these sequences within single clades.
Gene content evolution of I51 Serine Carboxypeptidase Y Inhibitors in plants
The high number of sequences in this family prompted us to select the sequences from the same plant species that were used for the cystatin family. Proteins were aligned (see Additional file 1I), and a phylogenetic tree was constructed (see Additional file 2I). The corresponding schematic cladogram is shown in Figure 10B. As highlighted, different clades were detected, supported by aLRT values higher than 60%. Basal clades were composed of algal, moss and pseudofern sequences, and include a proliferation of S. moellendorffii sequences in a specific pseudofern clade. Angiosperm sequences were found in three different lineages. Two of them were formed only by monocot and eudicot sequences, and the third lineage also contained moss and pseudofern sequences.
Coevolution of peptidases and their inhibitors in plants
Analyzing the evolution of the peptidase families targeted by the peptidase inhibitor families is key to understanding the current gene content of these inhibitory families. From the wide range of peptidase families, the C1A Papain and S8 Subtilisin families were selected on the basis of their physiological importance in the plant and the capacity of most inhibitor families to inhibit them. C1A Papain members are inhibited by I25 Cystatin inhibitors and by some I4 Serpin inhibitors. S8 Subtilisin members are inhibited by I1 Kazal, I3 Kunitz-P, I4 Serpin, I6 Cereal, I12 Bowman-Birk, I13 Pin-I and I20 Pin-II inhibitors.
Identification of peptidase inhibitory families in plants provides a working definition of a basal core shared by most plant clades and a starting point for working out the evolutionary cues behind the expansion of peptidase inhibitory networks. Variability is the key word that defines this kind of protein. None of the peptidase inhibitory families is ubiquitously present in the genomes of all species analyzed, mainly because they are missing from some algal genomes. Thus, the genome of M. pusilla RCC299 does not have any member of the nine most conserved peptidase inhibitory families, the best-represented algal species have members of only three of these families, and four of these families are not present in any algal genome. In addition, conservation of peptidase inhibitory families is also partial in land plants. The basal moss and pseudofern species lack members of the I3, I6 and I12 families, and several eudicot species lack members of some of the most conserved peptidase inhibitor families. The number of members of each family is another feature that confirms this global variability. Although angiosperms generally have more members than basal plants, the highest number of I1 Kazal domains is found in the Chlorophyceae algae C. reinhardtii and V. carteri, and the highest number of I20 Pin-II proteins in the pseudofern S. moellendorffii. Among angiosperms, the number of members of different families varies strongly. In some families, such as I3 Kunitz-P, I4 Serpin or I13 Pin-I, there are eudicot species with more than 20 members and others with only 1 member of the same family. This strong variability has also been found in prokaryotes, where the occurrence of individual types of inhibitors is mostly limited to a few bacterial species scattered among phylogenetically distinct orders or even phyla of microbiota. Thus, variability has been confirmed as the main feature of peptidase inhibitory families.
At this point, it is desirable to know the evolutionary reasons that drive this variability. Peptidase inhibitors may have two different functions. They are inhibitors of endogenous peptidases, regulating the activity of the plant's own peptidases to avoid indiscriminate degradation when it is not convenient [8, 9]. They could also regulate the activity of exogenous peptidases, such as the peptidases that several pests and pathogens use to feed and to survive on the plant species they attack [11, 12, 48]. To understand the mechanisms behind this evolutionary variability, correlations between the number of peptidases and their inhibitors add some valuable information. The number of endogenous C1A or S8 inhibitors is positively correlated with the number of the peptidases they inhibit. Plant species with a high number of peptidases also contain a high number of inhibitors. This result is congruent with an evolutionary scenario in which endogenous peptidase proliferations are followed by peptidase inhibitor gene expansions. However, this correlation is not perfect, and some species have more or fewer inhibitors than expected from their peptidase repertoire. Two possible reasons may explain these discrepancies: i) several peptidases are not functional and, therefore, have not forced an increase in the number of inhibitors needed to regulate them; ii) several inhibitors do not regulate endogenous peptidases and have proliferated to inhibit the peptidases used by pests and pathogens to attack the plant. This second possibility has been pointed out previously. A diversity of mechanisms, such as the recruitment of additional protein-folding families as inhibitors, the combination of different inhibitor domains into a single molecule, the high rate of retention of gene duplication events and the hypervariation of contact residues, have been postulated.
In the case of plants, a combination of evolutionary forces, namely the increase of endogenous peptidases and the fight against exogenous peptidases, would explain the current repertoire of peptidase inhibitors present in land plants.
Another feature that supports the strong variability in the peptidase inhibitor repertoires, and the possibility of a quick evolution driven by pests and pathogens, is the existence of small peptidase inhibitory families that are restricted to single species/clades. Ten of the 21 peptidase inhibitor families identified in plants are restricted to a clade: I2 and I39 to some algae lineages, I7, I18, I37, I55 and I67 to a eudicot or monocot order, and I73, I83 and I90 to a single angiosperm species. New gene families typically originate either from duplicate copies of a gene that become so divergent that they are no longer recognized as members of the same family, from horizontally transferred genes, or from genes originated de novo from previously non-coding sequences. The small peptidase inhibitor families of plants are most probably derived from duplications followed by strong sequence divergence. For example, the I55 SQAPI family, only present in Cucurbitales, has a three-dimensional structure similar to that of the members of the I25 phytocystatin family, suggesting a common ancestral gene for both families [51, 52]. Likewise, the three-dimensional structure of the I18 MTI-2 family resembles the structure of the I13 Pin-I family [53, 54]. In this context, the selective loss of cysteine residues, and the conformational changes derived from it, have been postulated as a way to generate variability and be more effective against pathogen/pest attack. In contrast to the birth of new peptidase inhibitor gene families, the death of peptidase inhibitor families is a process that should be further investigated in plants. The loss of members of a family in some clades/species can be due to the loss of the physiological constraints that previously made the absence of this family deleterious. In the case of plants, the endogenous physiological activity of peptidases should be carefully regulated.
The existence of a statistically significant correlation between peptidases and inhibitors in the plant kingdom supports the idea that the loss of a specific physiological mechanism controlled by a peptidase could be correlated with the loss of some specific inhibitors of this peptidase. However, strong variations in the number of inhibitors within a specific peptidase inhibitor family point to a more active evolutionary mechanism based on the interaction with biotic stresses. Thus, the loss of peptidase inhibitor members is most probably related to the absence of the driving force, for example, the loss of the deleterious effects induced by a specific pathogen/pest species.
In conclusion, comparative genomics has allowed us to obtain further insights into the present repertoire of peptidase inhibitors in plants, and into the evolution of these peptidase inhibitor families. Variability in response to the endogenous and exogenous peptidases that have to be regulated by the inhibitors is the main feature of this kind of protein. While new families restricted to a specific species/clade will probably be found in the coming years, the evolutionary mechanisms that allow this strong diversity should be investigated in depth.
The MEROPS v9.10 database of peptidases and their inhibitors was used to establish the protein peptidase inhibitor families present in plants by looking at the distribution of each family in the different kingdoms. Then, BLAST searches for peptidases and peptidase inhibitors were performed in publicly available genome databases. Sequences were identified by searching the current genome releases at the Phytozome v9.1 comparative genomics database. BLAST searches were made iteratively. First, a complete plant amino acid sequence from the data banks corresponding to a protein of the family was used as the query. Then, the protein sequences retrieved from each plant species were used to search within the same species. Finally, after aligning the proteins found in plants, the conserved region surrounding the catalytic sites from the most closely related species was used for a final search in each plant species. To test the accuracy of the results, retrieved sequences were compared, when possible, with the sequences of the same family identified in each plant species in the GreenPhylDB v3.0 comparative genomics database.
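The hit-collection and cross-checking steps described above can be sketched in a few lines. This is only an illustration under stated assumptions: it assumes BLAST results were exported in the standard 12-column tabular format (BLAST+ `-outfmt 6`), and the e-value cutoff, function names and gene identifiers are hypothetical, since the text does not report the thresholds or export format that were actually used.

```python
def candidate_subjects(tabular_lines, max_evalue=1e-5):
    """Collect subject IDs from BLAST tabular output (12 default columns).

    Returns a dict mapping each subject ID to its best (lowest) e-value.
    The cutoff `max_evalue` is a hypothetical choice, not taken from the paper.
    """
    hits = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            raise ValueError("expected 12 tab-separated columns: %r" % line)
        subject_id = fields[1]       # column 2: sseqid
        evalue = float(fields[10])   # column 11: evalue
        if evalue <= max_evalue:
            hits[subject_id] = min(evalue, hits.get(subject_id, evalue))
    return hits


def cross_check(retrieved_ids, reference_ids):
    """Compare retrieved gene IDs with an independent family annotation.

    Returns (shared, only_retrieved, only_reference) as sorted lists.
    """
    retrieved, reference = set(retrieved_ids), set(reference_ids)
    return (sorted(retrieved & reference),
            sorted(retrieved - reference),
            sorted(reference - retrieved))
```

In an iterative search, the subject IDs returned by one round would supply the query sequences for the next round within the same species, and `cross_check` would be applied once at the end against the independent family listing.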
Domain architecture prediction
Amino acid sequences of plant proteins putatively including at least one peptidase or protein peptidase inhibitory domain were searched against the Pfam database v27.0 to determine the combination of domains within each protein.
Protein alignments and phylogenetic trees
Alignments of the amino acid sequences were performed using the default parameters of MUSCLE v3.8. Sequences with extensive gaps were manually excluded from phylogenetic studies. Phylogenetic and molecular evolutionary analyses were conducted using the programs PhyML v3.0 and MEGA v5.2 [58, 59]. The displayed protein peptidase inhibitor trees were constructed by the PhyML maximum likelihood method at the Phylogeny.fr site, using a BIONJ starting tree. The approximate likelihood-ratio test (aLRT) based on a Shimodaira-Hasegawa-like procedure was applied as the statistical test for non-parametric branch support. All families were also analysed with the maximum parsimony and neighbour-joining algorithms, and with different gap penalties. No significant differences in the tree topologies were detected. Information about gene models for all proteins used to construct the phylogenetic trees is compiled in Additional file 3.
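The exclusion of gap-rich sequences was done manually in this work. Purely as an illustration of what such a filter could look like if automated, the sketch below flags sequences with many gaps in well-occupied alignment columns. The thresholds (`min_occupancy`, `max_gap_fraction`), the function names and the toy alignment are assumptions, not values or data from the study.

```python
def conserved_columns(alignment, min_occupancy=0.8):
    """Return indices of columns where at least `min_occupancy` of the
    sequences carry a residue rather than a gap character ('-')."""
    seqs = list(alignment.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")
    columns = []
    for i in range(length):
        occupied = sum(1 for s in seqs if s[i] != "-")
        if occupied / len(seqs) >= min_occupancy:
            columns.append(i)
    return columns


def drop_gappy_sequences(alignment, max_gap_fraction=0.5, min_occupancy=0.8):
    """Keep sequences whose gap fraction over the conserved columns does not
    exceed `max_gap_fraction`. Both thresholds are hypothetical defaults."""
    columns = conserved_columns(alignment, min_occupancy)
    if not columns:
        raise ValueError("no column reaches the requested occupancy")
    kept = {}
    for name, seq in alignment.items():
        gaps = sum(1 for i in columns if seq[i] == "-")
        if gaps / len(columns) <= max_gap_fraction:
            kept[name] = seq
    return kept
```

A filter like this only proposes candidates for removal; visual inspection of the alignment, as done here, remains the reference procedure.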
A linear trend line was fitted to the number of peptidases and the number of their inhibitors in the different plant species. The R² value indicates how well the data fit the line. To test the statistical significance of the correlation between the number of peptidases and the number of their inhibitors in different plant species, a Pearson Product Moment Correlation test was performed using SigmaStat v3.5 software. A positive correlation coefficient (ρ) together with a p value lower than 0.05 means that the two variables tend to increase in a concerted manner.
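For readers who want to reproduce this kind of analysis without SigmaStat, the following is a minimal sketch of the same quantities: the least-squares trend line, its R², the Pearson coefficient and the associated t statistic. The function name is hypothetical, and any counts passed to it would have to come from the gene family tables; the p value itself is obtained from a t distribution with n − 2 degrees of freedom and is left to a statistics package.

```python
import math


def trend_and_correlation(x, y):
    """Least-squares line y = slope * x + intercept, with Pearson r,
    R^2 (equal to r^2 for a simple linear fit) and the t statistic
    t = r * sqrt((n - 2) / (1 - r^2)) used to derive the p value."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two series of equal length with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = r * r
    # A perfect correlation gives an unbounded t statistic.
    t_stat = math.inf if r_squared >= 1.0 else r * math.sqrt((n - 2) / (1.0 - r_squared))
    return {"slope": slope, "intercept": intercept, "r": r,
            "r_squared": r_squared, "t": t_stat, "df": n - 2}
```

Here `x` would hold the number of peptidases per species (for example C1A or S8 members) and `y` the number of corresponding inhibitors in the same species order.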
This work was supported by Ministerio de Educación y Ciencia (AGL2011-23650), Ministerio de Economía y Competitividad (Subprograma Juan de la Cierva 2012 to M.E.S.), and European Commission FP7 (Marie Curie action Co-Fund programme 2012 to M.D-M).
- van der Hoorn RA: Plant proteases: from phenotypes to molecular mechanisms. Annu Rev Plant Biol. 2008, 59: 191-223. 10.1146/annurev.arplant.59.032607.092835.
- Pesquet E: Plant proteases - from detection to function. Physiol Plant. 2012, 145 (1): 1-4. 10.1111/j.1399-3054.2012.01614.x.
- Schaller A: A cut above the rest: the regulatory function of plant proteases. Planta. 2004, 220 (2): 183-197. 10.1007/s00425-004-1407-2.
- Roberts IN, Caputo C, Criado MV, Funk C: Senescence-associated proteases in plants. Physiol Plant. 2012, 145 (1): 130-139. 10.1111/j.1399-3054.2012.01574.x.
- Tan-Wilson AL, Wilson KA: Mobilization of seed protein reserves. Physiol Plant. 2012, 145 (1): 140-153. 10.1111/j.1399-3054.2011.01535.x.
- Kohli A, Narciso JO, Miro B, Raorane M: Root proteases: reinforced links between nitrogen uptake and mobilization and drought tolerance. Physiol Plant. 2012, 145 (1): 165-179. 10.1111/j.1399-3054.2012.01573.x.
- van der Hoorn RA, Jones JD: The plant proteolytic machinery and its role in defence. Curr Opin Plant Biol. 2004, 7 (4): 400-407. 10.1016/j.pbi.2004.04.003.
- Martinez M, Cambra I, Gonzalez-Melendi P, Santamaria ME, Diaz I: C1A cysteine-proteases and their inhibitors in plants. Physiol Plant. 2012, 145 (1): 85-94. 10.1111/j.1399-3054.2012.01569.x.
- Volpicella M, Leoni C, Costanza A, De Leo F, Gallerani R, Ceci LR: Cystatins, serpins and other families of protease inhibitors in plants. Curr Protein Pept Sci. 2011, 12 (5): 386-398.
- Martinez M, Cambra I, Carrillo L, Diaz-Mendoza M, Diaz I: Characterization of the entire cystatin gene family in barley and their target cathepsin L-like cysteine-proteases, partners in the hordein mobilization during seed germination. Plant Physiol. 2009, 151 (3): 1531-1545. 10.1104/pp.109.146019.PubMed CentralPubMedView ArticleGoogle Scholar
- Horger AC, van der Hoorn RA: The structural basis of specific protease-inhibitor interactions at the plant-pathogen interface. Curr Opin Struct Biol. 2013, 23 (6): 842-850. 10.1016/j.sbi.2013.07.013.PubMedView ArticleGoogle Scholar
- Haq SK, Atif SM, Khan RH: Protein proteinase inhibitor genes in combat against insects, pests, and pathogens: natural and engineered phytoprotection. Arch Biochem Biophys. 2004, 431 (1): 145-159. 10.1016/j.abb.2004.07.022.PubMedView ArticleGoogle Scholar
- Rawlings ND, Waller M, Barrett AJ, Bateman A: MEROPS: the database of proteolytic enzymes, their substrates and inhibitors. Nucleic Acids Res. 2014, 42 ((Database issue)): D503-509.PubMed CentralPubMedView ArticleGoogle Scholar
- Kantyka T, Rawlings ND, Potempa J: Prokaryote-derived protein inhibitors of peptidases: A sketchy occurrence and mostly unknown function. Biochimie. 2010, 92 (11): 1644-1656. 10.1016/j.biochi.2010.06.004.PubMed CentralPubMedView ArticleGoogle Scholar
- Benchabane M, Schluter U, Vorster J, Goulet MC, Michaud D: Plant cystatins. Biochimie. 2010, 92 (11): 1657-1666. 10.1016/j.biochi.2010.06.006.PubMedView ArticleGoogle Scholar
- Turra D, Lorito M: Potato type I and II proteinase inhibitors: modulating plant physiology and host resistance. Curr Protein Pept Sci. 2011, 12 (5): 374-385. 10.2174/138920311796391151.PubMedView ArticleGoogle Scholar
- Martinez M: From plant genomes to protein families: computational tools. Comput Struct Biotechnol J. 2013, 8: e201307001-PubMed CentralPubMedView ArticleGoogle Scholar
- Goodstein DM, Shu S, Howson R, Neupane R, Hayes RD, Fazo J, Mitros T, Dirks W, Hellsten U, Putnam N, Rokhsar D: Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res. 2012, 40 (Database issue): D1178-1186.
- Rouard M, Guignon V, Aluome C, Laporte MA, Droc G, Walde C, Zmasek CM, Perin C, Conte MG: GreenPhylDB v2.0: comparative and functional genomics in plants. Nucleic Acids Res. 2011, 39 (Database issue): D1095-1102.
- Van Bel M, Proost S, Wischnitzki E, Movahedi S, Scheerlinck C, Van de Peer Y, Vandepoele K: Dissecting plant genomes with the PLAZA comparative genomics platform. Plant Physiol. 2012, 158 (2): 590-600. 10.1104/pp.111.189514.
- Martinez M: Plant protein-coding gene families: emerging bioinformatics approaches. Trends Plant Sci. 2011, 16 (10): 558-567. 10.1016/j.tplants.2011.06.003.
- Chan AP, Crabtree J, Zhao Q, Lorenzi H, Orvis J, Puiu D, Melake-Berhan A, Jones KM, Redman J, Chen G, Cahoon EB, Gedil M, Stanke M, Haas BJ, Wortman JR, Fraser-Liggett CM, Ravel J, Rabinowicz PD: Draft genome sequence of the oilseed species Ricinus communis. Nat Biotechnol. 2010, 28 (9): 951-956. 10.1038/nbt.1674.
- Tuskan GA, Difazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, Putnam N, Ralph S, Rombauts S, Salamov A, Schein J, Sterck L, Aerts A, Bhalerao RR, Bhalerao RP, Blaudez D, Boerjan W, Brun A, Brunner A, Busov V, Campbell M, Carlson J, Chalot M, Chapman J, Chen GL, Cooper D, Coutinho PM, Couturier J, Covert S, Cronk Q: The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science. 2006, 313 (5793): 1596-1604. 10.1126/science.1128691.
- Young ND, Debelle F, Oldroyd GE, Geurts R, Cannon SB, Udvardi MK, Benedito VA, Mayer KF, Gouzy J, Schoof H, Van de Peer Y, Proost S, Cook DR, Meyers BC, Spannagl M, Cheung F, De Mita S, Krishnakumar V, Gundlach H, Zhou S, Mudge J, Bharti AK, Murray JD, Naoumkina MA, Rosen B, Silverstein KA, Tang H, Rombauts S, Zhao PX, Zhou P, et al: The Medicago genome provides insight into the evolution of rhizobial symbioses. Nature. 2011, 480 (7378): 520-524.
- Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, et al: Genome sequence of the palaeopolyploid soybean. Nature. 2010, 463 (7278): 178-183. 10.1038/nature08670.
- The Cucumber Genome Project. [http://www.phytozome.net/cucumber.php],
- Verde I, Abbott AG, Scalabrin S, Jung S, Shu S, Marroni F, Zhebentyayeva T, Dettori MT, Grimwood J, Cattonaro F, Zuccolo A, Rossini L, Jenkins J, Vendramin E, Meisel LA, Decroocq V, Sosinski B, Prochnik S, Mitros T, Policriti A, Cipriani G, Dondini L, Ficklin S, Goodstein DM, Xuan P, Del Fabbro C, Aramini V, Copetti D, Gonzalez S, Horner DS, et al: The high-quality draft genome of peach (Prunus persica) identifies unique patterns of genetic diversity, domestication and genome evolution. Nat Genet. 2013, 45 (5): 487-494. 10.1038/ng.2586.
- Shulaev V, Sargent DJ, Crowhurst RN, Mockler TC, Folkerts O, Delcher AL, Jaiswal P, Mockaitis K, Liston A, Mane SP, Burns P, Davis TM, Slovin JP, Bassil N, Hellens RP, Evans C, Harkins T, Kodira C, Desany B, Crasta OR, Jensen RV, Allan AC, Michael TP, Setubal JC, Celton JM, Rees DJ, Williams KP, Holt SH, Ruiz Rojas JJ, Chatterjee M, et al: The genome of woodland strawberry (Fragaria vesca). Nat Genet. 2011, 43 (2): 109-116. 10.1038/ng.740.
- Arabidopsis_Genome_Initiative: Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000, 408 (6814): 815-
- Lamesch P, Berardini TZ, Li D, Swarbreck D, Wilks C, Sasidharan R, Muller R, Dreher K, Alexander DL, Garcia-Hernandez M, Karthikeyan AS, Lee CH, Nelson WD, Ploetz L, Singh S, Wensel A, Huala E: The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res. 2012, 40 (Database issue): D1202-1210.
- Ming R, Hou S, Feng Y, Yu Q, Dionne-Laporte A, Saw JH, Senin P, Wang W, Ly BV, Lewis KL, Salzberg SL, Feng L, Jones MR, Skelton RL, Murray JE, Chen C, Qian W, Shen J, Du P, Eustice M, Tong E, Tang H, Lyons E, Paull RE, Michael TP, Wall K, Rice DW, Albert H, Wang ML, Zhu YJ: The draft genome of the transgenic tropical fruit tree papaya (Carica papaya Linnaeus). Nature. 2008, 452 (7190): 991-996. 10.1038/nature06856.
- Motamayor JC, Mockaitis K, Schmutz J, Haiminen N, Iii DL, Cornejo O, Findley SD, Zheng P, Utro F, Royaert S, Saski C, Jenkins J, Podicheti R, Zhao M, Scheffler BE, Stack JC, Feltus FA, Mustiga GM, Amores F, Phillips W, Marelli JP, May GD, Shapiro H, Ma J, Bustamante CD, Schnell RJ, Main D, Gilbert D, Parida L, Kuhn DN: The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color. Genome Biol. 2013, 14 (6): r53-10.1186/gb-2013-14-6-r53.
- Jaillon O, Aury JM, Noel B, Policriti A, Clepet C, Casagrande A, Choisne N, Aubourg S, Vitulo N, Jubin C, Vezzi A, Legeai F, Hugueney P, Dasilva C, Horner D, Mica E, Jublot D, Poulain J, Bruyère C, Billault A, Segurens B, Gouyvenoux M, Ugarte E, Cattonaro F, Anthouard V, Vico V, Del Fabbro C, Alaux M, Di Gaspero G, Dumas V: The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature. 2007, 449 (7161): 463-467. 10.1038/nature06148.
- Hellsten U, Wright KM, Jenkins J, Shu S, Yuan Y, Wessler SR, Schmutz J, Willis JH, Rokhsar DS: Fine-scale variation in meiotic recombination in Mimulus inferred from population shotgun sequencing. Proc Natl Acad Sci U S A. 2013, 110 (48): 19478-19482. 10.1073/pnas.1319032110.
- Paterson AH, Bowers JE, Bruggmann R, Dubchak I, Grimwood J, Gundlach H, Haberer G, Hellsten U, Mitros T, Poliakov A, Schmutz J, Spannagl M, Tang H, Wang X, Wicker T, Bharti AK, Chapman J, Feltus FA, Gowik U, Grigoriev IV, Lyons E, Maher CA, Martis M, Narechania A, Otillar RP, Penning BW, Salamov AA, Wang Y, Zhang L, Carpita NC: The Sorghum bicolor genome and the diversification of grasses. Nature. 2009, 457 (7229): 551-556. 10.1038/nature07723.
- Schnable PS, Ware D, Fulton RS, Stein JC, Wei F, Pasternak S, Liang C, Zhang J, Fulton L, Graves TA, Minx P, Reily AD, Courtney L, Kruchowski SS, Tomlinson C, Strong C, Delehaunty K, Fronick C, Courtney B, Rock SM, Belter E, Du F, Kim K, Abbott RM, Cotton M, Levy A, Marchetto P, Ochoa K, Jackson SM, Gillam B: The B73 maize genome: complexity, diversity, and dynamics. Science. 2009, 326 (5956): 1112-1115. 10.1126/science.1178534.
- Goff SA, Ricke D, Lan TH, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H, Hadley D, Hutchison D, Martin C, Katagiri F, Lange BM, Moughamer T, Xia Y, Budworth P, Zhong J, Miguel T, Paszkowski U, Zhang S, Colbert M, Sun WL, Chen L, Cooper B, Park S, Wood TC, Mao L, Quail P: A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science. 2002, 296 (5565): 92-100. 10.1126/science.1068275.
- Ouyang S, Zhu W, Hamilton J, Lin H, Campbell M, Childs K, Thibaud-Nissen F, Malek RL, Lee Y, Zheng L, Orvis J, Haas B, Wortman J, Buell CR: The TIGR Rice Genome Annotation Resource: improvements and new features. Nucleic Acids Res. 2007, 35 (Database issue): D883-887.
- The_International_Brachypodium_Initiative: Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature. 2010, 463 (7282): 763-768. 10.1038/nature08747.
- Banks JA, Nishiyama T, Hasebe M, Bowman JL, Gribskov M, de Pamphilis C, Albert VA, Aono N, Aoyama T, Ambrose BA, Ashton NW, Axtell MJ, Barker E, Barker MS, Bennetzen JL, Bonawitz ND, Chapple C, Cheng C, Correa LG, Dacre M, DeBarry J, Dreyer I, Elias M, Engstrom EM, Estelle M, Feng L, Finet C, Floyd SK, Frommer WB, Fujita T, Gramzow L: The Selaginella genome identifies genetic changes associated with the evolution of vascular plants. Science. 2011, 332 (6032): 960-963. 10.1126/science.1203810.
- Rensing SA, Lang D, Zimmer AD, Terry A, Salamov A, Shapiro H, Nishiyama T, Perroud PF, Lindquist EA, Kamisugi Y, Tanahashi T, Sakakibara K, Fujita T, Oishi K, Shin-I T, Kuroki Y, Toyoda A, Suzuki Y, Hashimoto S, Yamaguchi K, Sugano S, Kohara Y, Fujiyama A, Anterola A, Aoki S, Ashton N, Barbazuk WB, Barker E, Bennetzen JL, Blankenship R: The Physcomitrella genome reveals evolutionary insights into the conquest of land by plants. Science. 2008, 319 (5859): 64-69. 10.1126/science.1150646.
- Zimmer AD, Lang D, Buchta K, Rombauts S, Nishiyama T, Hasebe M, Van de Peer Y, Rensing SA, Reski R: Reannotation and extended community resources for the genome of the non-seed plant Physcomitrella patens provide insights into the evolution of plant gene structures and functions. BMC Genomics. 2013, 14: 498-10.1186/1471-2164-14-498.
- Merchant SS, Prochnik SE, Vallon O, Harris EH, Karpowicz SJ, Witman GB, Terry A, Salamov A, Fritz-Laylin LK, Marechal-Drouard L, Marshall WF, Qu LH, Nelson DR, Sanderfoot AA, Spalding MH, Kapitonov VV, Ren Q, Ferris P, Lindquist E, Shapiro H, Lucas SM, Grimwood J, Schmutz J, Cardol P, Cerutti H, Chanfreau G, Chen CL, Cognat V, Croft MT, Dent R: The Chlamydomonas genome reveals the evolution of key animal and plant functions. Science. 2007, 318 (5848): 245-250. 10.1126/science.1143609.
- Prochnik SE, Umen J, Nedelcu AM, Hallmann A, Miller SM, Nishii I, Ferris P, Kuo A, Mitros T, Fritz-Laylin LK, Hellsten U, Chapman J, Simakov O, Rensing SA, Terry A, Pangilinan J, Kapitonov V, Jurka J, Salamov A, Shapiro H, Schmutz J, Grimwood J, Lindquist E, Lucas S, Grigoriev IV, Schmitt R, Kirk D, Rokhsar DS: Genomic analysis of organismal complexity in the multicellular green alga Volvox carteri. Science. 2010, 329 (5988): 223-226. 10.1126/science.1188800.
- Blanc G, Agarkova I, Grimwood J, Kuo A, Brueggeman A, Dunigan DD, Gurnon J, Ladunga I, Lindquist E, Lucas S, Pangilinan J, Pröschold T, Salamov A, Schmutz J, Weeks D, Yamada T, Lomsadze A, Borodovsky M, Claverie JM, Grigoriev IV, Van Etten JL: The genome of the polar eukaryotic microalga Coccomyxa subellipsoidea reveals traits of cold adaptation. Genome Biol. 2012, 13 (5): R39-10.1186/gb-2012-13-5-r39.
- Worden AZ, Lee JH, Mock T, Rouze P, Simmons MP, Aerts AL, Allen AE, Cuvelier ML, Derelle E, Everett MV, Foulon E, Grimwood J, Gundlach H, Henrissat B, Napoli C, McDonald SM, Parker MS, Rombauts S, Salamov A, Von Dassow P, Badger JH, Coutinho PM, Demir E, Dubchak I, Gentemann C, Eikrem W, Gready JE, John U, Lanier W, Lindquist EA: Green evolution and dynamic adaptations revealed by genomes of the marine picoeukaryotes Micromonas. Science. 2009, 324 (5924): 268-272. 10.1126/science.1167222.
- Palenik B, Grimwood J, Aerts A, Rouze P, Salamov A, Putnam N, Dupont C, Jorgensen R, Derelle E, Rombauts S, Zhou K, Otillar R, Merchant SS, Podell S, Gaasterland T, Napoli C, Gendler K, Manuell A, Tai V, Vallon O, Piganeau G, Jancek S, Heijde M, Jabbari K, Bowler C, Lohr M, Robbens S, Werner G, Dubchak I, Pazour GJ: The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation. Proc Natl Acad Sci U S A. 2007, 104 (18): 7705-7710. 10.1073/pnas.0611046104.
- Santamaria ME, Hernandez-Crespo P, Ortego F, Grbic V, Grbic M, Diaz I, Martinez M: Cysteine peptidases and their inhibitors in Tetranychus urticae: a comparative genomic approach. BMC Genomics. 2012, 13: 307-10.1186/1471-2164-13-307.
- Christeller JT: Evolutionary mechanisms acting on proteinase inhibitor variability. FEBS J. 2005, 272 (22): 5710-5722. 10.1111/j.1742-4658.2005.04975.x.
- Demuth JP, Hahn MW: The life and death of gene families. Bioessays. 2009, 31 (1): 29-39. 10.1002/bies.080085.
- Headey SJ, Macaskill UK, Wright MA, Claridge JK, Edwards PJ, Farley PC, Christeller JT, Laing WA, Pascal SM: Solution structure of the squash aspartic acid proteinase inhibitor (SQAPI) and mutational analysis of pepsin inhibition. J Biol Chem. 2010, 285 (35): 27019-27025. 10.1074/jbc.M110.137018.
- Nagata K, Kudo N, Abe K, Arai S, Tanokura M: Three-dimensional solution structure of oryzacystatin-I, a cysteine proteinase inhibitor of the rice, Oryza sativa L. japonica. Biochemistry. 2000, 39 (48): 14753-14760. 10.1021/bi0006971.
- Zhao Q, Chae YK, Markley JL: NMR solution structure of ATTp, an Arabidopsis thaliana trypsin inhibitor. Biochemistry. 2002, 41 (41): 12284-12296. 10.1021/bi025702a.
- McPhalen CA, James MN: Crystal and molecular structure of the serine proteinase inhibitor CI-2 from barley seeds. Biochemistry. 1987, 26 (1): 261-269. 10.1021/bi00375a036.
- Joshi RS, Mishra M, Suresh CG, Gupta VS, Giri AP: Complementation of intramolecular interactions for structural-functional stability of plant serine proteinase inhibitors. Biochim Biophys Acta. 2013, 1830 (11): 5087-5094. 10.1016/j.bbagen.2013.07.019.
- Finn RD, Bateman A, Clements J, Coggill P, Eberhardt RY, Eddy SR, Heger A, Hetherington K, Holm L, Mistry J, Sonnhammer EL, Tate J, Punta M: Pfam: the protein families database. Nucleic Acids Res. 2014, 42 (Database issue): D222-230.
- Edgar RC: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004, 32 (5): 1792-1797. 10.1093/nar/gkh340.
- Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O: New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010, 59 (3): 307-321. 10.1093/sysbio/syq010.
- Tamura K, Peterson D, Peterson N, Stecher G, Nei M, Kumar S: MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol Biol Evol. 2011, 28 (10): 2731-2739. 10.1093/molbev/msr121.
- Dereeper A, Guignon V, Blanc G, Audic S, Buffet S, Chevenet F, Dufayard JF, Guindon S, Lefort V, Lescot M, Claverie JM, Gascuel O: Phylogeny.fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 2008, 36 (Web Server issue): W465-469.
- Anisimova M, Gascuel O: Approximate likelihood-ratio test for branches: a fast, accurate, and powerful alternative. Syst Biol. 2006, 55 (4): 539-552. 10.1080/10635150600755453.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.