Protease-associated cellular networks in malaria parasite Plasmodium falciparum
© Lilburn et al. licensee BioMed Central Ltd 2011
Published: 23 December 2011
Malaria continues to be one of the most severe global infectious diseases, responsible for 1-2 million deaths yearly. The rapid evolution and spread of drug resistance in parasites has led to an urgent need for the development of novel antimalarial targets. Proteases are a group of enzymes that play essential roles in parasite growth and invasion. The possibility of designing specific inhibitors for proteases makes them promising drug targets. Previously, combining a comparative genomics approach and a machine learning approach, we identified the complement of proteases (degradome) in the malaria parasite Plasmodium falciparum and its sibling species [1–3], providing a catalog of targets for functional characterization and rational inhibitor design. Network analysis represents another route to revealing the role of proteins in the biology of parasites and we use this approach here to expand our understanding of the systems involving the proteases of P. falciparum.
We investigated the roles of proteases in the parasite life cycle by constructing a network from the protein-protein association data in the STRING database and analyzing it in conjunction with data from protein-protein interaction assays using the yeast 2-hybrid (Y2H) system, blood stage microarray experiments [6–8], proteomics [9–12], literature text mining, and sequence homology analysis. Seventy-seven (77) out of 124 predicted proteases were associated with at least one other protein, constituting 2,431 protein-protein interactions (PPIs). These proteases appear to play diverse roles in metabolism, cell cycle regulation, invasion and infection. Their degrees of connectivity (i.e., the number of connections to other proteins) range from one to 143. The largest protease-associated sub-network is the ubiquitin-proteasome system, which is crucial for protein recycling and stress response. Proteases are also implicated in heat shock response, signal peptide processing, cell cycle progression, transcriptional regulation, and signal transduction networks.
Our network analysis of proteases from P. falciparum uses a so-called guilt-by-association approach to extract sets of proteins from the proteome that are candidates for further study. Novel protease targets and previously unrecognized members of the protease-associated sub-systems provide new insights into the mechanisms underlying parasitism, pathogenesis and virulence.
Malaria remains a major threat to health and economic development in endemic countries, infecting 300-500 million people yearly and claiming 1-2 million deaths, primarily of young children. Symptoms of malaria include high fever, shaking chills, headache, vomiting, and anemia. If left untreated, malaria can quickly become life threatening by disrupting the blood supply to vital organs. Malaria is caused by a group of parasites from the genus Plasmodium. Five species, P. falciparum, P. vivax, P. malariae, P. ovale, and P. knowlesi, are known to cause the disease in humans. P. falciparum is the most devastating and widespread species.
No effective anti-malaria vaccines are available for use in humans. For decades, the management of malaria has relied heavily on chemotherapy, which uses a limited number of drugs. However, the rapid evolution and spread of drug resistance in parasites has led to an increase in morbidity and mortality rates in malaria endemic regions. The development of new drug/vaccine targets is urgently needed.
Thanks to the completion of the genome sequencing projects for P. falciparum and its sibling species [14–19], a novel array of proteins has been proposed as potential drug targets, including (1) proteins such as 1-deoxy-D-xylulose 5-phosphate (DOXP) reductoisomerase [20, 21] and apicoplast gyrase, which are located in the apicoplast, an organelle with its origin close to the chloroplast; (2) kinases such as cyclin-dependent protein kinases (Pfmrk) and the plant-like calcium-dependent protein kinase (PfCDPK5); (3) transporters involved in drug resistance and nutrient acquisition from the host [25–30]; and (4) proteases.
Proteases are a group of enzymes that degrade proteins by breaking peptide bonds. They are attractive antimalarial targets due to their indispensable roles in parasite development and invasion [31, 32]. Previously we predicted the protease complement (degradome) in the malaria parasite P. falciparum and its four sibling species using a comparative genomics approach and a support vector machine (SVM)-based, supervised machine learning approach [1–3]. This catalog revealed a new line of novel proteases for functional characterization. Studies on malarial proteases have focused on biochemical and molecular characterization [33–46], structural modeling and analysis [47, 48], and inhibitor design and screening [49–59]. Although significant progress has been made, much remains to be learned about the roles played by these proteins, including how they interact with other proteins in space and time to coordinate important aspects of growth, transmission, invasion, response to drug treatment and pathogenesis of this devastating pathogen.
One approach to gaining a wider view of the roles of proteins in biological systems relies on network biology. Known and inferred protein associations are used to build a network of proteins, thus establishing a map of all the associations in the organism and allowing deductions to be made as to the roles of proteins that are poorly understood and poorly annotated. Both proposed and demonstrated protein-protein associations could aid us in understanding the role of a protease in the parasite. Therefore, we constructed a network of P. falciparum proteins using the protein-protein association data from the STRING database, and analyzed these data in conjunction with data from protein-protein interaction assays using the yeast 2-hybrid (Y2H) system, blood stage microarray experiments [6–8], proteomics [9–12], literature text mining, and sequence homology analysis. The topology of the protein-protein association network was analyzed and the results examined for information as to how the proteases may function within the parasite. Sets of proteins associated with specific proteases or protease families were extracted from the whole-cell network to create protease-associated subnetworks, and five of these subnetworks were examined in detail. Novel protease targets and previously unrecognized members of some sub-systems could be postulated; these insights help us to better understand the mechanisms underlying parasite metabolism, cell cycle regulation, invasion and infection.
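The extraction of protease-associated subnetworks described above can be illustrated with a minimal sketch. It is not the pipeline used in this study (which relied on Cytoscape and its plugins); the function names, the placeholder protein identifiers and the toy scores are all assumptions made for illustration.

```python
from collections import defaultdict

def build_graph(associations):
    """Build an undirected weighted graph with one edge per protein pair.

    `associations` is an iterable of (protein_a, protein_b, score) records.
    """
    graph = defaultdict(dict)
    for a, b, score in associations:
        if a == b:
            continue  # ignore self-associations
        # If a pair is listed more than once, keep its highest score.
        best = max(score, graph[a].get(b, 0.0))
        graph[a][b] = best
        graph[b][a] = best
    return graph

def degree(graph, protein):
    """Connectivity of a protein: the number of distinct partners."""
    return len(graph.get(protein, {}))

def protease_subnetwork(graph, proteases):
    """Return every edge that touches at least one protease (first neighbors)."""
    edges = set()
    for p in proteases:
        for partner, score in graph.get(p, {}).items():
            edges.add((min(p, partner), max(p, partner), score))
    return edges

# Toy data: P1 and P2 stand for proteases, X1..X3 for partner proteins.
# Identifiers and scores are invented placeholders, not values from STRING.
toy = [
    ("P1", "X1", 0.80),
    ("P1", "X2", 0.65),
    ("X1", "X2", 0.30),   # partner-partner edge, excluded from the subnetwork
    ("P2", "X3", 0.45),
    ("X1", "P1", 0.50),   # duplicate of the first pair, lower score
]
graph = build_graph(toy)
subnet = protease_subnetwork(graph, {"P1", "P2"})
```

In this toy case `degree(graph, "P1")` is 2, and `subnet` holds the three edges that involve P1 or P2. Restricting the subnetwork to first neighbors is one possible design choice; a wider neighborhood would pull in proteins that are only indirectly linked to a protease.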
We downloaded the protein-protein association data from the STRING database and mined the associations involving proteins from P. falciparum. Seventy-seven (77) out of 124 predicted proteases were found in this set and were associated with at least one other protein, constituting 2,431 associations (Additional Files 1 and 2). Each association between a pair of proteins has a confidence score (S) ranging from 0.15 to 0.999 that was inferred from the evidence used to establish the association: 221 associations (9.1%) have high confidence scores (S>0.7), 432 associations (17.8%) have medium confidence scores (0.4≤S≤0.7), and strikingly, 1,778 associations (73.1%) have relatively low confidence scores (0.15≤S<0.4). The large proportion of low-scored associations arises from the paucity of annotation data. Before the genome of P. falciparum was sequenced, only about 20 proteins had been characterized; after genome sequencing this number increased by two orders of magnitude, but over 60% of the predicted gene products in the genome still had no functional assignment, and ten years of subsequent effort have reduced this number to roughly 45%. Consequently, information such as KEGG pathway assignments, PDB protein structures and reactome data, which tend to improve association scores, is scarce for P. falciparum. Therefore, our subsequent analysis does not exclude the associations with low confidence scores, as they may well represent associations that have not been previously recognized.
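The three confidence classes can be expressed as a short sketch that applies the thresholds quoted above and checks that the reported counts reproduce the reported percentages. The function names are illustrative assumptions; only the thresholds and the three counts come from the text.

```python
def confidence_class(score):
    """Classify a STRING-style association score S with the thresholds in the text."""
    if score > 0.7:
        return "high"
    if score >= 0.4:
        return "medium"
    return "low"

def tally(scores):
    """Count how many scores fall into each confidence class."""
    counts = {"high": 0, "medium": 0, "low": 0}
    for s in scores:
        counts[confidence_class(s)] += 1
    return counts

def percentages(counts):
    """Convert class counts to percentages rounded to one decimal place."""
    total = sum(counts.values())
    if total == 0:
        return {label: 0.0 for label in counts}
    return {label: round(100.0 * n / total, 1) for label, n in counts.items()}

# Counts reported in the text for the 2,431 protease associations.
reported = {"high": 221, "medium": 432, "low": 1778}
reported_total = sum(reported.values())   # 2431
reported_pct = percentages(reported)      # high 9.1, medium 17.8, low 73.1
```

Note that a score of exactly 0.7 falls in the medium class under these thresholds, since the high class is defined by the strict inequality S>0.7.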
The largest protease-associated network in P. falciparum is the ubiquitin-proteasome protein degradation system (UPS). The UPS is responsible for degrading unwanted or misfolded proteins and is believed to execute important roles in protein turnover and cell cycle regulation in a wide variety of organisms. We previously identified a group of threonine proteases that form the α- and β-subunits of the proteasome complex and two families of ubiquitin-specific hydrolases (C12 and C19) [1, 63] (Additional File 1). The UPS pathway in P. falciparum has been deduced by Dr. Hagai Ginsburg (http://sites.huji.ac.il/malaria/maps/proteaUbiqpath.html), and involves two consecutive steps: (1) tagging target proteins with ubiquitin molecules and (2) degradation of the tagged protein by the proteasome complex with release and recycling of ubiquitin. The major components of the UPS in P. falciparum are conserved with other eukaryotes. However, a growing body of evidence suggests that the UPS plays a critical role in the parasite-specific lifestyle, and it is therefore of interest to identify the proteins and pathways that are associated with or regulated by the UPS [64, 65], as they may carry out functions specific to pathogenesis or virulence. We identified 1,148 associations in P. falciparum that involved 11 threonine proteases in the T1 family, two proteases in the C12 ubiquitin C-terminal hydrolase family, and six proteases in the C19 ubiquitin-specific protease family. One hundred and twenty-four (124) associations are protease-protease associations, and the remaining 1,024 associations involve non-protease partners. One hundred and sixty-four (164) of these associations have high confidence scores (S>0.7), the majority of which involve the association between catalytic components and regulatory components in the proteasome complex (Additional File 4).
Table 1. Representative P. falciparum proteins that are associated with PF10_0111, a putative 20S proteasome beta subunit with the highest connectivity. Protein-protein interactions revealed by yeast 2-hybrid assays are italicized.
Protein accession number
tat-binding protein homolog
Cell cycle regulation
putative Skp1 family protein
putative multiprotein bridging factor type 1
CCAAT-box DNA binding protein subunit B
putative 60S ribosomal protein L40/UBI
putative 60S ribosomal protein L10
putative translation initiation factor eIF-1A
putative elongation factor 1 (EF-1)
60S ribosomal protein L4
putative apicoplast ribosomal protein L15 precursor
putative translation elongation factor EF-1, subunit alpha
rab specific GDP dissociation inhibitor
putative ubiquitin transferase
merozoite surface protein 3
merozoite surface protein 3
duffy binding-like merozoite surface protein
parasite-infected erythrocyte surface protein
Plasmodium exported protein, unknown function
Moreover, the yeast 2-hybrid assay using PF10_0111 as a bait revealed 15 PPI preys (Table 1), confirming that it is associated with (1) transcriptional regulation involving a CCAAT-box DNA binding protein subunit B (PF11_0477) containing a histone-like transcription factor domain, and (2) translation involving a putative translation elongation factor EF-1 subunit alpha (PF11_0245), a putative 60S ribosomal protein L4 (PFE0350c), and a putative ribosomal protein L15 precursor (PF14_0270) predicted to localize to the apicoplast, an organelle of prokaryotic origin specific to apicomplexan parasites. PF10_0111 may also be associated with protein modifications involving a putative ubiquitin transferase (MAL7P1.19) and chromatin fluidity involving a putative nucleosome assembly protein (PFI0930c).
Interestingly, PF10_0111 has PPIs with three predicted surface antigens: (1) merozoite surface protein 3 (PF10_0345), which was shown by global RNA decay and nuclear run-on assays to serve a role in transcriptional regulation and RNA stabilization [70, 71]; (2) a merozoite surface protein (PF10_0348), in which domain analysis revealed an N-terminal Duffy binding domain, which is present in the Duffy receptors expressing blood group surface determinants, and a C-terminal SPAM (secreted polymorphic antigen associated with merozoites) domain, both of which have been implicated in parasite immune evasion, cytoadherence and pathogenesis [72, 73]; and (3) a parasite-infected erythrocyte surface protein (PFE0060w). The microarray and proteomics assays show that these three surface proteins are expressed at the invasive merozoite stage [6, 8, 10, 11].
These results reflect much that is known about the UPS, but also suggest that it may be associated with a variety of other processes, including transcriptional regulation, translation, cell cycle progression, invasion, protein trafficking, and immune evasion. Not surprisingly, the UPS has become a promising antimalarial target. Various independent studies have shown that inhibition of proteasome activity can arrest parasite growth while showing limited toxicity to human cell lines [64, 74, 75].
The common belief that proteases cleave peptide bonds only in an aqueous environment was challenged by the discovery of a set of proteases that conduct hydrolysis in the hydrophobic environment of cellular membranes, a process termed regulated intramembrane proteolysis (RIP). During RIP, intramembrane proteases cleave transmembrane-spanning helical (TMH) segments of the substrates and release soluble effectors, many of which are signaling molecules, thereby triggering cascades of signal transduction pathways [85, 86]. RIP is now believed to be a ubiquitous signaling mechanism in a wide variety of organisms from bacteria to humans. The roles of RIP in the parasite life cycle have begun to be unraveled. Three families of membrane-tethered proteases involved in RIP have been identified in P. falciparum, including an aspartic signal peptide peptidase (PfAPP, PF14_0543) in the A22 presenilin family, eight rhomboid serine proteases (PfROMs) in the S54 family, and two putative Site-2 metalloproteases (S2Ps, PF13_0028 and PF10_0317) in the M50 family [1, 88–93].
Table 2. Representative P. falciparum proteins that are associated with the regulated intramembrane proteolysis (RIP) network.
Accession number of protease
Associated Protein accession number
A22 (presenilin family)
putative retrieval receptor for endoplasmic reticulum membrane proteins
ER lumen protein retaining receptor
Sec61 alpha subunit, PfSec61
secretory complex protein 61 gamma subunit
putative translation initiation factor eIF-1A
putative eukaryotic translation initiation factor 3 subunit 7
putative eukaryotic translation initiation factor 3 subunit 8
putative eukaryotic translation initiation factor 5
putative eukaryotic translation initiation factor 3, subunit 6
putative elongation factor 1 (EF-1)
putative splicing factor
putative peptide chain release factor subunit 1
chloroquine resistance associated protein Cg3 protein
S54 (Rhomboid family)
apical membrane antigen 1
merozoite adhesive erythrocytic binding protein
erythrocyte binding antigen-181
erythrocyte binding antigen-140
erythrocyte binding antigen-175
reticulocyte binding protein 2 homolog b
M50 (S2P protease family)
putative proteasome 26S regulatory subunit
putative cell division cycle protein 48 homolog
The second family, PfROM, includes a group of serine proteases with demonstrated roles in parasite invasion [90, 91, 95, 96]. Only one out of the ten rhomboid protease homologs in P. falciparum, PfRom1 (PF11_0150), was predicted to have protein-protein associations. Most interestingly, all six proteins associated with it are antigens that have been considered as vaccine candidates; they belong to three families of adhesins that are essential for parasite invasion (Figure 4 and Table 2): (1) the apical membrane antigen 1 (AMA1, PF11_0344) is an adhesin required for merozoite invasion, and it plays an indispensable role in the proliferation and survival of the malaria parasite. PfRom1 has been shown to cleave AMA1; (2) the erythrocyte binding-like (EBL) family is involved in binding to a host chemokine receptor, the Duffy antigen. Among the four EBAs with predicted association with PfRom1, EBA-175 (MAL7P1.176) is a proven natural substrate for PfRom1, but it remains unclear whether PfRom1 can cleave EBA-140 (MAL13P1.60), EBA-181 (MAL1P1.16), and a putative merozoite adhesive erythrocytic binding protein (PF11_0486); (3) a reticulocyte binding protein 2 homolog b protein (MAL13P1.176) in the reticulocyte binding-like (RBL) family. PfRom1 is able to cleave the RBL proteins. PfRom1 thus appears to play a central role in the RIP network that is tightly linked to the invasion process, and as such merits further investigation as a drug target.
S2Ps in the third family, PF10_0317 and PF13_0028, have two and one associations, respectively (Figure 4 and Table 2). PF10_0317 is associated with a proteasome 26S regulatory subunit and a cell division cycle (CDC) protein 48 homolog, which is implicated by GO analysis in ER localization and cell cycle regulation. Our previous domain analysis showed that PF10_0317 contains a Der-1 like domain, which was implicated in proteolysis associated with the ER [99–102]. PF13_0028 is associated with an adenylosuccinate synthetase AdsS (PF13_0287), which is important for the de novo biosynthesis of purine nucleotides. This association was predicted based on genome synteny analysis, which revealed that the homologs of S2P and AdsS are located in the same chromosomal neighborhood in a variety of Actinobacteria. The functions of these S2Ps in malaria parasites are yet to be defined.
Proteases in P. falciparum may play other roles important for parasite biology. We previously identified a single copy of calpain, PfCalp (MAL13P1.310), in the P. falciparum genome [3, 106, 107]. Calpain is crucial for signal transduction, cell cycle regulation, differentiation, development, and cell-cell communication in organisms from bacteria to humans. Very little is known about its role in P. falciparum. Only four proteins appear to be associated with calpain: a putative protein with a C3HC4 type zinc finger, a motif commonly present in transcriptional regulators; a ribosomal protein; and two proteins of unknown function. However, partial knockdown assays recently suggested that PfCalp is essential for the parasite's optimal growth and cell cycle progression. Phylogenetic analysis revealed that PfCalp is a unique type of calpain confined to alveolates (a group of protists) with distant relatedness to human calpains [63, 108], which makes it a promising new drug target. Another class of proteases that mediate cell cycle regulation and programmed cell death comprises the three metacaspases from the C14 protease family [63, 109]. Only one association partner each was identified for PF13_0289 and PF14_0363 (polyubiquitin and a hypothetical protein of unknown function, respectively), and no associations were found for PF14_0160, reflecting our limited knowledge about their functions in the malaria parasite.
Our network analysis of proteases from P. falciparum uses a so-called guilt-by-association approach to extract sets of proteins from the proteome that are candidates for further study. The network biology approach is readily adapted to any system for which a genome sequence exists and for which some type of protein-protein association is available, although there are limitations. Some of these stem from missing and/or noisy data, which lead to underestimation of the S value for a pair of associated proteins, but this problem becomes less significant with each release of data. A second problem is the lack of any dynamic element in evaluating the associations. A more formal integration of expression data, especially expression data sets gathered under different conditions, could help to ameliorate this situation. Despite these limitations, our results recovered known associations, such as those of the ubiquitin-proteasome system (UPS), which serve as positive controls. They also indicate that proteases play previously unrecognized roles in the biology of the parasite, such as the proteases that mediate the stress responses. Our results also imply that certain of these proteases, such as the proteases that mediate regulated intramembrane proteolysis, parasite egress, and signal peptide processing and protein secretion, may be good candidates for antimalarial targeting, as they are highly connected in the network. Furthermore, some of these candidates are known to have no or only distantly related homologs in humans, which reduces the probability of adverse effects resulting from their inactivation. Finally, our analysis has identified new components of previously recognized systems in the parasite, such as the protein(s) involved in transcriptional regulation, cell cycle progression, invasion, protein trafficking, and immune evasion in the UPS, or the antioxidant defense proteins associated with the heat shock response systems.
The proteases in P. falciparum were predicted using a comparative genomics approach and a support vector machine (SVM)-based, supervised machine learning approach [1–3]. The classification and annotation followed the MEROPS protease nomenclature, which is based on intrinsic evolutionary and structural relationships.
The complete set of protein-protein associations for P. falciparum was extracted from the downloaded STRING database; each association between a pair of proteins has a confidence score (S) ranging from 0.15 to 0.999 that was inferred from the evidence used to establish the association, such as homology transfer, KEGG pathway assignments, conserved chromosome synteny, phylogenetic co-occurrence, and literature co-occurrence. This set of associations was visualized in Cytoscape and converted to an undirected weighted graph, where there is a single edge between any pair of proteins and the S value is used as the weight. The network was characterized using NetworkAnalyzer, and significant modules were detected using MINE and MCODE. The default values were used for all three plugins. The set of proteins directly associated with the 77 proteases in the association set was screened using BiNGO to determine if any categories of proteins, as identified by their Gene Ontology terms, were over-represented. The hypergeometric test was used with the Benjamini and Hochberg false discovery rate correction. A significance level of 0.05 was selected.
We downloaded the P. falciparum genomic sequence and annotation data, transcriptomic microarray data [6–8], mass-spectrometry proteomic data [9–12], and protein-protein interactome data for network associated proteins from PlasmoDB, the Plasmodium Genome resource center (http://www.plasmodb.org). Conserved domains/motifs in P. falciparum sequences were identified by searching InterPro. Multiple alignments were obtained using the ClustalX program and T-coffee, followed by manual inspection and editing. Phylogenetic trees were inferred by the neighbor-joining, maximum-parsimony and maximum-likelihood methods, using MEGA5.
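The over-representation statistics named above can be sketched in a few lines. This is a generic illustration of a one-sided hypergeometric test followed by the Benjamini and Hochberg adjustment; it is not BiNGO's implementation, and the function names and example numbers are assumptions made for illustration.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """Upper-tail probability P(X >= k) for over-representation.

    N: proteins in the reference set
    K: reference proteins annotated with the GO term
    n: proteins in the test set (e.g. protease partners)
    k: test-set proteins annotated with the GO term
    """
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def significant_terms(term_pvalues, alpha=0.05):
    """Keep the terms whose adjusted p-value is below the significance level."""
    terms = list(term_pvalues)
    adjusted = benjamini_hochberg([term_pvalues[t] for t in terms])
    return {t: q for t, q in zip(terms, adjusted) if q < alpha}

# Invented toy numbers: 10 reference proteins, 3 carry the term,
# 4 proteins are tested and 2 of them carry the term.
p_example = hypergeom_pvalue(k=2, n=4, K=3, N=10)   # (3*21 + 1*7) / 210 = 1/3
```

The default `alpha=0.05` mirrors the significance level stated in the text. Exact integer binomial coefficients are used here for clarity; a production analysis would more likely rely on an established statistics library.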
apical membrane antigen
calcium-dependent protein kinase
heat shock protein
multiprotein bridging factor
open reading frame
parasitophorous vacuole membrane
regulated intramembrane proteolysis
Serine Repeat Antigen
S-phase kinase-associated protein
secreted polymorphic antigen associated with merozoites
signal peptidase complex
support vector machine
TATA box-binding protein
We thank PlasmoDB for providing an all-in-one portal for malaria omic data. This work is supported by NIH grants GM081068 and AI080579 to Y. Wang. ZZ is supported by the government scholarship from China Scholarship Council. YW is also supported by NIH grant RR013646. We thank the Computational Biology Initiative at UTSA for providing computational support. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of General Medical Sciences, National Institute of Allergy and Infectious Diseases, National Center for Research Resources, or the National Institutes of Health.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.