Sub-grouping and sub-functionalization of the RIFIN multi-copy protein family
© Joannin et al; licensee BioMed Central Ltd. 2008
Received: 16 July 2007
Accepted: 15 January 2008
Published: 15 January 2008
Parasitic protozoans possess many multicopy gene families which have central roles in parasite survival and virulence. The number and variability of members of these gene families often make it difficult to predict possible functions of the encoded proteins. The families of extra-cellular proteins that are exposed to a host immune response have been driven via immune selection to become antigenically variant, and thereby avoid immune recognition while maintaining protein function to establish a chronic infection.
We have combined phylogenetic and function shift analyses to study the evolution of the RIFIN proteins, which are antigenically variant and are encoded by the largest multicopy gene family in Plasmodium falciparum. We show that this family can be subdivided into two major groups that we named A- and B-RIFIN proteins. This suggested sub-grouping is supported by a recently published study that showed that, despite the presence of the Plasmodium export (PEXEL) motif in all RIFIN variants, proteins from each group have different cellular localizations during the intraerythrocytic life cycle of the parasite. In the present study we show that function shift analysis, a novel technique to predict functional divergence between sub-groups of a protein family, indicates that RIFINs have undergone neo- or sub-functionalization.
These results question the general trend of clustering large antigenically variant protein groups into homogenous families. Assigning functions to protein families requires their subdivision into meaningful groups such as we have shown for the RIFIN protein family. Using phylogenetic and function shift analysis methods, we identify new directions for the investigation of this broad and complex group of proteins.
Antigenic variants are proteins expressed by pathogenic organisms, which are usually exposed to immune pressure from a vertebrate host. The genes that encode these proteins can be single copy within the genome, as is the case for viruses, and the variability therefore exists between the gene copies of different individuals. This implies that the proteins they encode retain the same function. However, other organisms maintain several to many copies within the genome of each individual [1, 2]. In contrast to viral genes, these multicopy genes are not only under immune pressure but can also follow distinct evolutionary paths to differentiate into novel functional units.
The genomes of Plasmodium species contain numerous large multigene families that have been amplified via functional or immune pressures [2–6]. One important feature of these organisms is that they do not express the whole protein repertoire simultaneously [7–10]. These polymorphic families are predominantly situated in the sub-telomeric ends of chromosomes [2–6], where gene rearrangements are frequent [11, 12]. They encode proteins that presumably fulfill several functions, and immune pressure has driven them to vary antigenically at the surface of the infected erythrocyte. Empirical studies have shown that the Plasmodium falciparum Erythrocyte Membrane Protein 1 (PfEMP1) can mediate cytoadhesion by interacting with various host receptors, resulting for example in sequestration of the infected erythrocytes in the host tissue or rosette formation with uninfected red blood cells. The repertoire of PfEMP1 proteins is therefore shaped both by functional pressures for binding and by diversifying pressures to evade immunity. Yet, such an accumulation of experimental data is missing for protein families in most parasite species.
We have studied the RIFIN protein family, a group suggested to be under immune diversifying selection. Their genes, repetitive interspersed family (rif), are the largest family in P. falciparum with 150 to 200 copies per haploid genome. They are small two-exon genes (≈1000 base pairs), with a conserved domain architecture [15, 16]. Characteristically, RIFIN proteins are described as small polypeptides beginning with a putative signal sequence followed by a conserved domain, a variable region and a conserved C-terminal domain. Two transmembrane regions have been predicted, one on each side of the variable region, with this stretch predicted to be exposed to immune pressure [9, 15]. The proteins most closely related to RIFINs are of the Sub-Telomeric Variable Open Reading Frame (STEVOR) family, numbering 28 copies in the reference strain genome. Although primary sequence similarity is limited, this relationship is emphasized by the existence of a RIFIN_STEVOR family (PF02009) in the PFAM database.
RIFIN proteins have been detected throughout the intra-human life cycle of the parasite [8, 18–21]. Furthermore, RIFIN proteins are associated with a stable immune response over time and with rapid clearance of parasites from the circulation [22, 23]. However, as for most protein families, little more is known and their function(s) remain(s) to be discovered. In this study, we propose a novel approach to understand complex protein families for which little data is available. We demonstrate the division of the RIFIN family into two groups, which we associate with published differential cellular localization. Finally, we correlate these differences with the prediction of a function shift between these sub-groups.
Phylogenetic classification of the RIFIN family
In addition to the analysis of the 3D7 strain, we have aligned the 3D7 sequences with 59 of the DD2 and 65 of the HB3 strain sequences (selection criteria detailed in Methods). The tree resulting from the protein alignment confirmed the results obtained with the reference genome analyses. The sequences sorted into the same two major clades with no strain specific grouping (see Additional file 1). The B-RIFIN clade is split into three groups; however the B1 and B2 clades contain few sequences from the DD2 and HB3 genomes.
It is noteworthy that the two B-RIFIN sequences, which cluster with A-RIFINs (PFD0045c and PFI0050w), have homologous sequences in both DD2 and HB3 genomes (see Additional file 1, stars).
Based on the knowledge that non-coding regions may contain motifs of significance in gene regulation and expression, we also analyzed 500 base pairs of non-coding upstream and downstream untranslated regions (UTRs) from the 3D7 rif genes. The phylogenetic analyses of these regions segregated the sequences into the same major A- and B- groups as the coding regions, which we have termed A-rif and B-rif UTRs (see Additional file 2). For both 5' and 3' UTR analyses, B-rif UTRs could be further divided into two groups, one of which included B1 and B2 variant UTRs, the other mostly B3 variant UTRs. As in the above analysis, some sequences did not segregate into their expected sub-group, for example a few B3 sequences were found in the B1/B2 subdivision and vice versa. Additionally, some A-rif UTRs clustered with B-rif UTRs and in this case, mostly with the B3 sub-group. In contrast to the coding sequences, the A-rif UTRs appear to cluster into sub-groups. Despite overall similarities in observations between both 5' and 3' UTR analyses, there was only partial congruence between these UTR clusters, in particular as far as A-rif UTRs are concerned.
A previous study has identified two transcriptional repression sites (TATGCAATGATT and CGCACAACAC) upstream of 8 rif genes in a head to head orientation with UpsA var genes. An exhaustive search on all 14 chromosomes of the 3D7 strain shows that these two motifs are found in 20 and 19 copies, respectively. However, only 15 and 11 copies are upstream (either independently or in combination) of a total of 16 rif genes (see Additional file 2, indicated by #); the other copies are found up- or downstream, or sometimes in the coding region of other genes. Consistent with this analysis, 13 of the 5' UTRs of these genes cluster together in our phylogenetic tree.
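The exhaustive motif search described above can be illustrated with a minimal sketch. It assumes exact string matching on both strands, which the text does not state; the helper names `reverse_complement` and `find_motif` and the toy sequence are hypothetical, and only the two motif strings come from the text.

```python
# Hypothetical helpers: exact-match search for a motif on both DNA strands.
MOTIFS = ("TATGCAATGATT", "CGCACAACAC")  # the two motifs quoted in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(sequence: str, motif: str) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for every exact match, overlaps included.

    Matches on the minus strand are reported at their plus-strand coordinate.
    """
    sequence = sequence.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = sequence.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = sequence.find(pattern, start + 1)
    return sorted(hits)


# Toy sequence (not real genome data): first motif on the plus strand,
# second motif on the minus strand.
toy = "AAATATGCAATGATTCCCGTGTTGTGCGAAA"
plus_hits = find_motif(toy, MOTIFS[0])
minus_hits = find_motif(toy, MOTIFS[1])
```

Deciding whether a hit lies upstream of a rif gene would additionally require gene coordinates and orientation, which this sketch does not model.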
An analysis of chromosomal location reveals that only 6 of the 134 sequences (4.5%) used in this study are centrally located genes (data not shown). The other similarly positioned rif genes are annotated as pseudogenes or are truncated and none of these are grouped according to protein or UTR sequences (data not shown). The transcription of ≈70% of A-rif and all B-rif genes is telomere oriented. The A-rif genes with a centromeric transcription orientation (≈30%) do not cluster on the protein tree (data not shown), however they are mostly distributed within three sub-clades of the A-rif 5' UTR tree (see Additional file 2, crosses).
Function shift analysis of A- and B-RIFIN proteins
We sought indications of functional differences between the A- and B-RIFIN sub-groups by analyzing them for function shifts according to previously described methods. Function shift analysis calculates the number of rate and conservation shifting sites (RSS and CSS, respectively) that exist between two given protein groups. RSS is measured by U-values, which indicate the likelihood that the mutation rate changes for each alignment position between the subfamilies under consideration. A site is considered rate-shifting (at the 5% significance level) if its U-value is above a cut-off value of 4.0. CSS is measured by the Z-score, a normalized measure of the similarity between two distributions of amino acids. Smaller Z-score values are associated with similar amino acid distributions in both subfamilies, while larger Z-score values are associated with very different distributions. The total numbers of positions are counted for both RSS and CSS calculations.
The results are compared to enzymatic protein families that have undergone a change in function, which belong to several functional categories including immunity related functions. The function shift model was benchmarked using organisms from all three kingdoms of life, namely Archaea, Bacteria and Eukaryotes. This results in the estimation of the likelihood of sub-functionalization between the two groups. The function shift analysis of sub-group A against sub-group B (using standard cut-offs of 4 for RSS and 0.5 for CSS) resulted in the prediction of 81 rate shifting sites (RSS) (22% of all positions) and 60 conservation shifting sites (CSS) (17%) between them (see Additional file 3, rifins.html, for the full alignment). We computed the probability of the prediction as 83% based on RSS alone and 52% based on CSS alone. Considering comparable knowledge empirically gathered on the classification of shifts in function of known protein families, which combine the two measures, A- and B- sub-groups are predicted to have functionally diverged from each other.
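The counting step described above can be sketched as follows. This is only an illustration of applying the two quoted cut-offs to per-column scores; it does not compute U-values or Z-scores, and the function name `count_shifting_sites`, the use of `None` for columns without a score, and the toy values are assumptions rather than part of the published method.

```python
# Cut-offs quoted in the text; everything else in this sketch is hypothetical.
RSS_CUTOFF = 4.0  # U-value above this marks a rate shifting site
CSS_CUTOFF = 0.5  # Z-score above this marks a conservation shifting site (assumed strict)


def count_shifting_sites(u_values, z_scores):
    """Classify alignment columns from precomputed per-column scores.

    A score of None stands for a column with no computable value.
    Fractions are taken over all alignment positions.
    """
    if len(u_values) != len(z_scores):
        raise ValueError("score lists must cover the same alignment columns")
    if not u_values:
        raise ValueError("no alignment columns given")
    n = len(u_values)
    rss = [i for i, u in enumerate(u_values) if u is not None and u > RSS_CUTOFF]
    css = [i for i, z in enumerate(z_scores) if z is not None and z > CSS_CUTOFF]
    return {
        "rss": rss,
        "css": css,
        "rss_fraction": len(rss) / n,
        "css_fraction": len(css) / n,
    }


# Toy scores for a five-column alignment (not taken from the RIFIN data).
result = count_shifting_sites(
    u_values=[5.2, 1.0, None, 4.0, 7.9],
    z_scores=[0.1, 0.9, None, 0.5, 0.2],
)
```

Values exactly at a cut-off are not counted here, following the wording "above a cut-off value" for RSS; applying the same rule to CSS is an assumption.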
Table: Most significant Rate Shifting Sites. Columns: Position in the Alignment | Residues in A-RIFINs | Residues in B-RIFINs | Residue Conserved in family
Table: Most significant Conservation Shifting Sites. Columns: Position in the Alignment | Conserved Residue in A-RIFINs | Conserved Residue in B-RIFINs
Limitations of function shift analyses lie in regions for which one group has amino-acid stretches that the other group lacks. In this case, RSS and CSS calculations give a null value; however this does not equate to an absence of impact on functional divergence of the two groups. One particular way of viewing such a site is to acknowledge it as a shifted site from a conserved motif to an absence of residues. The 25 AA stretch present in A-RIFIN sequences and absent from B-RIFINs can be viewed in this way, specifically due to the conservation of many of its residues as seen in Fig. 1B. Additionally, most of this motif is predicted to be a loop region, which could be involved in a functional site.
Protein families with known functions have successfully been sorted into functionally different sub-groups using phylogenetic techniques [28, 29]. However, which approach should be used with proteins of unknown function? We have combined phylogenetic and function shift analyses to study the Plasmodium falciparum RIFIN protein family. Our results demonstrated that these proteins could be subdivided into two major groups that we named A- and B-RIFIN proteins. We correlate these groups with different localization studies [19, 21, 30] based on proteins from each of these groups. Moreover, our function shift analysis points to the probability that these two groups of proteins have undergone neo- or sub-functionalization.
The 3D7 rif cDNA tree we constructed by the Neighbor Joining method distinguished A- and B-type RIFIN variants, the latter being subdivided into three groups (B1, B2 and B3). The additional analysis of combined rif sequences from three different strains (3D7, DD2 and HB3) confirms this grouping (see Additional file 1). However, most DD2 and HB3 sequences clustered in the A and B3 groups, with only four sequences in the B1/B2 group. Our strict inclusion criteria have resulted in the removal of over 45% of the DD2 and HB3 RIFINs, mainly truncated sequences. We do not know whether these are simply pseudogenes within these genomes or if they appear as truncated due to the difficulties in sequencing and assembling subtelomeric regions of P. falciparum parasites. Considering this latter case, we prefer not to draw genome wide conclusions from possibly incomplete genomes.
Upon further investigation of the 3D7 RIFINs, B3-sequences proved to be hybrid variants that have B1/B2 features in their C1 domains but A-type features in their V2 domains. Conversely, two A-variant hybrids carrying A-specific C1 domains and B1/B2-specific V2 domains were also found (Fig. 3). Recombination events and gene conversion are likely explanations for the formation of such hybrid sequences. The former are essential for the generation of antigenic diversity and were previously proposed to be responsible for the diversity of the var gene family. These authors argue for recombination events restricted between genes grouped according to their chromosomal location and transcription orientation. In contrast to the var genes, there is no evidence for such specific recombination within the A- and B-rif gene groups: ≈70% of the A-rif and all B-rif genes have the same telomere-directed transcription orientation; the remaining ≈30% of A-rif genes do not cluster in our gene tree. Also, over 95% of all rif genes analyzed here are subtelomeric. Theoretically, recombination can thus occur between A- and B- types of the same orientation. DePristo et al. showed that low-complexity regions are preferred sites for recombination events to occur in var genes. Since low-complexity regions are commonly found within RIFIN sequences at the boundaries of the variable region, it is tempting to suggest that these sites have a role in the generation of such hybrid sequences. Gene conversion has been observed in P. falciparum [11, 33, 34] and is the other possible explanation for these sequences. However, gene conversion has a homogenizing effect that is not detected between B3-rif V2 regions and the sequences showing highest identity to them (66.6% average sequence identity). This might be an indication in favor of recombination events or, simply, that gene conversion is not as frequent as suggested for falcipain genes.
Whichever mechanism is involved, both recombination and gene conversion events are known to interfere with phylogenetic reconstruction. Another factor that influences the resolution of phylogenetic analysis is long branch attraction [36, 37]. We have seen that A- and B3-RIFIN sequences have long branches (Fig. 2), which could also interfere with our phylogeny. To further confirm our proposed sub-grouping, we constructed phylogenetic trees of the UTRs of rif genes. Previous analysis of gene families has shown that long-term survival of paralogous genes allows for changes in the regulatory regions of those genes. Our analysis of rif gene UTRs demonstrated a significant segregation of these non-coding regions into similar A- and B-rif UTR groups (see Additional file 2). Taking all these facts into consideration, we conclude that despite a seemingly low bootstrap value of 61%, RIFIN proteins can be divided into A- and B-RIFIN proteins.
One question arises at this point: could there be an alternative grouping of rif/RIFIN sequences? var, the other major family in P. falciparum, has been classified according to 5' UTR and genomic position [2, 39, 40]. Their classification into 3 major sub-groups (A ≈17%, B ≈42% and C ≈40%) mainly relies on the following features: (i) 5' UTR grouping (UPSA, B and C); (ii) gene position (A and B telomeric, C central); and (iii) transcription orientation (A and C towards the telomere, B towards the centromere). However, PfEMP1 proteins are more complicated than RIFINs in that they are modular. Recognizable signatures allow for the identification of each module but intra-module similarity is limited. The overall function of these proteins is accepted as adhesion to host receptors and is highly module dependent (reviewed in ).
A parallel analysis of rif genes shows that, on one hand, very few are not sub-telomeric and no obvious pattern regroups these sequences. In the absence of more conclusive evidence, we do not think this is a good criterion for sub-grouping rif genes. On the other hand, rif UTR sequences can be grouped into sub-clusters. Also, the 5' UTRs of A-rif genes transcribed towards the centromere are non-randomly distributed (see Additional file 2, crosses). These observations confirm previous reports of differential regulation of A-rif expression within the same parasite strain. However the clustering of these A-rif UTR sequences is not congruent with the clustering of the protein-derived cDNA sequences. A recent study of yir genes, the largest P. yoelii yoelii multigene family, shows that some yir genes undergo alternative splicing events, which implies regulatory signals in addition to those controlling gene activation and silencing. Therefore, although it is tempting to further the sub-grouping of A-rif genes, we believe additional experimental evidence of differential transcription is required to ascertain these sub-divisions.
A recent study has shown that the intracellular distribution of RIFIN molecules in the infected erythrocyte is more diverse than previously envisaged. In order to address the issue of cross reactivity of the antisera used in this study, Petter et al. tested recognition of the anti-RIF29 and anti-PFI0050c antisera against other recombinant proteins of each group. Also, their western blot analyses show that neither of the A-RIFIN antisera is cross-reactive. A-type RIFINs, detected by an antiserum directed against PFB1035w as well as an antiserum directed against RIF29 (both A-type RIFINs), are transported to Maurer's clefts and towards the surface of the infected cell [19, 21], while B-type RIFINs, detected by an antiserum directed against PFB1040w and an antiserum directed against PFI0050c (both B-type RIFINs), are expressed inside the parasite, which is consistent with this group's previous report. Additionally, both A- and B-RIFIN proteins were detected in merozoites, here again with different sub-cellular distributions. The localization of B-RIFINs is concordant with the lower variability they exhibit in their V2 region, at least for the B1- and B2- RIFIN proteins (shorter branch lengths in Fig. 2). This would be expected of sequences not exposed to the immune system for long periods of time, as they would be at the infected erythrocyte surface.
Although all RIFIN variants bear a motif for directing proteins onto the secretory route, out of the parasite and into the cytoplasm of the host cell, referred to as the Plasmodium Export Element (PEXEL) or Vacuolar Transport Signal [24, 42], additional factors not yet characterized might enhance or interfere with protein export. Bioinformatics analyses of biochemical properties of the PEXEL motif and surrounding amino acids suggest possible modulations of the role of this motif (J. Hiss, J. Przyborski, F. Schwarte, K. Lingelbach and G. Schneider, personal communication). Alternatively, presence or absence of conserved motifs distributed elsewhere in the protein, such as the 25 AA stretch present in A-RIFINs, and/or different native 3D conformations of A- and B-RIFIN variants due to the highly conserved subtype specific cysteine residues (possibly involved in disulfide bonding), could impose restrictions on the export signal carried by the PEXEL motif. A previous study of synthetic constructs of the gene PFI0050c (a B-RIFIN) fused to a green fluorescent protein shows that this protein is retained in the parasite when its full length is expressed. However truncated versions, notably when lacking the C-terminal conserved region, are exported to the Maurer's clefts. It is not clear whether this difference of localization is due to missing motifs in the C-terminus or to changes in 3D conformation due to the truncation of the C-terminus, including a transmembrane domain, of the protein. Whatever their respective transport mechanisms, A- and B-RIFIN proteins have a distinct pattern of distribution during the intraerythrocytic life cycle of the parasite, which in correlation with the divergence of their regulatory regions is suggestive of functional differences.
To test this hypothesis, we carried out a function shift analysis of our sub-groups. The evolution of protein families and the consequential evolution of their function are accompanied by the accumulation of mutations at individual sites throughout the protein sequence. These sites may incur different types of selective pressures. A specific site may become important for the maintenance of the function, and therefore a specific amino acid is fixed in that position. In contrast, a fixed site may lose its importance, and become prone to mutation (typical RSS sites). Alternatively, a switch of functional specificity of a site may result in the switch from one amino acid to another accompanied by strict conservation (no further mutations allowed) in both sub-groups (typical CSS site). Finally, the remaining mutations are thought to be randomly accumulated at selectively neutral sites. However, recent studies have shown that mutations in non-essential residues can greatly influence protein stability and aggregation. These types of mutations may build up a compensation mechanism for mutations in key functional sites. Our function shift analysis shows, between A- and B-RIFIN proteins, which sites are under strict or varying selective pressure (see Additional file 3, rifins.html). Although the function shift analysis does not take into consideration sites for which one of the groups has a full gap (as the 25 AA insertion/deletion in the C1 domain), the accumulation of these shifted sites throughout the RIFIN sequences resulted in the prediction of a function shift between A- and B-type RIFIN proteins. A more stringent analysis of these shifted sites (see Additional file 4, rigins_high.html) identified specific residues about 15 AA ahead of and within the PEXEL motif with significant physical and chemical property changes. This analysis confirms the observations made by Hiss et al. (J. Hiss, J. Przyborski, F. Schwarte, K. Lingelbach and G. Schneider, personal communication). Also, the changes in cysteine conservation between the two groups are potentially involved in the variation of their three dimensional structures. These changes are likely to modulate the trafficking properties of RIFIN proteins. These predicted RSS and CSS sites can be tested, in future studies, by experimental techniques like site directed mutagenesis for their ability to bring about function changes.
Although rif genes were initially discovered and subsequently studied in the blood stage of the parasite's life cycle [8, 9, 19, 21, 30, 45], recent large scale transcriptional and proteomic analyses show that rif gene transcripts and RIFIN proteins are most abundant in sporozoites (25 and 20 respectively) as well as being present in gametocytes and merozoites [18, 21, 46–49]. Recent work in other Plasmodium species has also put forward modulations of expression and function of multi-copy protein families such as VIR of P. vivax and both YIR and PY235 of P. yoelii yoelii [41, 50, 51]. In particular, the expression of these proteins in different stages of the parasite life cycle argues for a greater subdivision of these families and their specific functions.
So far, the RIFIN protein family has been considered to be one large family with an unknown function but our results argue for a cautious approach when studying such variable protein families. The RIFIN proteins have been long neglected, possibly in part because of the complexity involved in studying such a large group of proteins. Antigenic variation is mostly a secondary function, as seen with the PfEMP1 proteins, whose main function is in cytoadhesion. While physiological functions of RIFIN proteins remain obscure, it is expected that future focus on RIFIN sub-families, the 25 AA insertion/deletion and the predicted conservation-shifted sites between these sub-groups will help to simplify the quest for understanding their biological roles in the parasite. Finally, the lower variability of B-RIFIN molecules and their expression throughout the cycle of the parasite (multi-stage) suggest these proteins as candidate vaccine targets. Further analysis of this family in wild isolates may confirm this hypothesis.
Phylogenetic analysis and sequence representation
3D7 RIFIN sequences were retrieved from PlasmoDB v4.4; DD2 and HB3 sequence and annotation information was downloaded from the Broad Institute of Harvard and MIT. Protein multiple sequence alignments were generated using the Kalign software and manual refinement was carried out with the help of the BioEdit software. We chose as inclusion criterion for RIFIN sequences that they correspond to the described rif and RIFIN structures: a two-exon gene and a protein composed of a signal peptide followed by a conserved domain, a variable region and ending with a typical positively charged C-terminus. Out of the 159 RIFIN sequences from the 3D7 reference strain, 25 were either truncated sequences or lacked obvious similarity with the majority of RIFIN sequences and were thus eliminated from our analysis. Similarly, only 59 (of the 156 with a RIFIN_STEVOR PFAM annotation, 25 of which are STEVORs) and 65 (of the 131, 26 of which are STEVORs) sequences of DD2 and HB3, respectively, were retained for analysis.
Independent alignments and phylogenetic analyses were carried out, on one hand, for the 3D7 strain (134 sequences) and, on the other hand, for the combined 3D7, DD2 and HB3 strains (258 sequences).
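The sequence counts quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; treating the DD2 and HB3 RIFIN candidates as the annotated sequences minus the STEVORs is our reading of the sentence above, not a figure given by the authors.

```python
# Counts taken from the text; variable names are illustrative only.
retained_3d7 = 159 - 25  # 3D7 RIFINs kept after removing 25 sequences

# Assumed reading: RIFIN candidates = RIFIN_STEVOR-annotated sequences minus STEVORs.
candidates = {"DD2": 156 - 25, "HB3": 131 - 26}
retained = {"DD2": 59, "HB3": 65}

combined = retained_3d7 + sum(retained.values())
removed_fraction = 1 - sum(retained.values()) / sum(candidates.values())

print(retained_3d7)                    # 134
print(combined)                        # 258
print(round(100 * removed_fraction, 1))  # 47.5
```

Under that reading, the removed fraction for DD2 and HB3 agrees with the statement in the Discussion that over 45% of those sequences were excluded.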
Five hundred base pairs of upstream and downstream untranslated regions (UTR) as well as the cDNA sequences of the 3D7 rif genes were retrieved from GeneDB. The UTRs were aligned in the same manner as the protein sequences.
Protein sequences are easier to accurately align than cDNA, however the degeneracy of the genetic code makes cDNA more informative than the corresponding protein translation. We used cDNA alignments derived from our protein multiple sequence alignments in order to increase the precision of the phylogenetic analysis. The cDNA alignments were constructed by replacing the amino acids in the protein alignments with the corresponding P. falciparum gene specific codons using the PAL2NAL software. All the alignments are available upon request to the authors.
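The idea of deriving a cDNA alignment from a protein alignment can be sketched in a few lines. This is not the PAL2NAL implementation: the function `thread_codons` is a hypothetical, simplified stand-in that assumes the cDNA starts in frame at the first aligned residue and does not check that each codon actually translates to the aligned amino acid.

```python
def thread_codons(aligned_protein: str, cdna: str, gap: str = "-") -> str:
    """Map an aligned protein sequence back onto its coding sequence.

    Each residue is replaced by the next codon of the cDNA and each gap
    by three gap characters. Trailing nucleotides (for example a stop
    codon) are ignored.
    """
    residues = aligned_protein.replace(gap, "")
    if len(cdna) < 3 * len(residues):
        raise ValueError("cDNA is too short for the protein sequence")
    codons = (cdna[i:i + 3] for i in range(0, 3 * len(residues), 3))
    return "".join(gap * 3 if aa == gap else next(codons) for aa in aligned_protein)


# Toy example: a four-column protein alignment row with one gap.
codon_row = thread_codons("MK-L", "ATGAAATTATAA")
print(codon_row)  # ATGAAA---TTA
```

Because gaps are expanded in whole codons, every row of the resulting nucleotide alignment stays in frame and is exactly three times as long as the protein alignment row.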
The C1 domain starts at the PEXEL motif and ends 30 AA after the insertion/deletion. The V2 domain starts 31 AA after the insertion/deletion and ends 57 AA before the C-terminus of the protein alignment.
The alignments were used to construct distance trees using the Neighbor Joining method with the MEGA3.1 software. We used a p-distance model with gaps/missing data treated as pairwise deletion for the proteins and UTRs and complete deletion for cDNA alignments. No trees were cut down throughout the experiments. In order to estimate robustness, bootstrap proportions were computed after 1000 replications.
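The difference between the two gap treatments mentioned above can be made concrete with a minimal sketch of the p-distance (the proportion of differing sites). This is an illustration of the general definitions, not the MEGA implementation; function names and the toy alignment are hypothetical, and only "-" is treated as a gap.

```python
from itertools import combinations

GAP = "-"


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, skipping columns gapped in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != GAP and y != GAP]
    if not pairs:
        raise ValueError("no comparable sites between the two sequences")
    return sum(x != y for x, y in pairs) / len(pairs)


def complete_deletion(seqs: list[str]) -> list[str]:
    """Drop every alignment column that contains a gap in any sequence."""
    keep = [i for i in range(len(seqs[0])) if all(s[i] != GAP for s in seqs)]
    return ["".join(s[i] for i in keep) for s in seqs]


def distance_matrix(seqs: list[str], mode: str = "pairwise") -> list[list[float]]:
    """Symmetric p-distance matrix under pairwise or complete deletion."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    if mode == "complete":
        seqs = complete_deletion(seqs)
    elif mode != "pairwise":
        raise ValueError("mode must be 'pairwise' or 'complete'")
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = p_distance(seqs[i], seqs[j])
    return d


toy = ["ACGT-A", "ACGTTA", "A-GATA"]  # toy alignment, not RIFIN data
pairwise = distance_matrix(toy, mode="pairwise")
complete = distance_matrix(toy, mode="complete")
```

In the toy alignment the distance between the second and third sequences is 1/5 under pairwise deletion but 1/4 under complete deletion, because complete deletion also discards a column in which those two sequences agree.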
Secondary structure predictions were computed using PSIPRED [61, 62]. The predicted secondary structures were aligned according to the protein alignment and a consensus prediction was generated using the Jalview software.
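One simple way to derive a consensus from aligned secondary structure predictions is a per-column majority vote. The sketch below assumes that rule and a three-state alphabet (H, E, C); the exact consensus rule applied by Jalview is not described in the text, so the function `consensus_structure` and its tie handling are hypothetical.

```python
from collections import Counter


def consensus_structure(aligned_predictions: list[str], gap: str = "-") -> str:
    """Per-column majority vote over aligned secondary structure strings.

    Gaps are ignored when voting; a column with only gaps yields a gap,
    and a tie between states yields '?'.
    """
    if len({len(p) for p in aligned_predictions}) != 1:
        raise ValueError("predictions must be aligned to the same length")
    consensus = []
    for column in zip(*aligned_predictions):
        states = [s for s in column if s != gap]
        if not states:
            consensus.append(gap)
            continue
        counts = Counter(states)
        top = max(counts.values())
        winners = [s for s, c in counts.items() if c == top]
        consensus.append(winners[0] if len(winners) == 1 else "?")
    return "".join(consensus)


# Toy predictions threaded onto a six-column alignment (not RIFIN data).
toy_consensus = consensus_structure(["HHHC-E", "HHCC-E", "HEC--C"])
print(toy_consensus)  # HHCC-E
```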
Function shift analysis
The function shift analysis was carried out on each subfamily pair of the 3D7 genome sequences (after exclusion of two A-RIFIN and four B-RIFIN sequences which are hybrid A/B sequences; see Discussion for further details), using a previously described method. In this method, two types of sites, namely rate shifting sites and conservation shifting sites, are detected and a combined measure is calculated to assess the level of function shift between the sub-groups under consideration. In order for the algorithms to calculate shifting sites, the sequences need to segregate into their predicted groups. Six sequences (two A-RIFIN and four B-RIFIN proteins) clustered in the opposite sub-group, creating systematic errors in the algorithm. These sequences are all hybrids and were excluded from the function shift analysis.
CSS: Conservation Shifting Site
PEXEL: Plasmodium EXport ELement
PfEMP1: Plasmodium falciparum Erythrocyte Membrane Protein 1
RSS: Rate Shifting Site
rif: repetitive interspersed family
We thank Jane Thompson, Craig Wheelock and Ulf Ribacke for critical reading of this manuscript. We also thank the reviewers for helping us improve the quality and accuracy of our findings. This work was supported by European Community's Sixth Framework Program (MEST-CT-2004-8475), the BioMalPar consortium (LSHP-CT-2004-503578) and the Swedish Research Council.
- Stringer JR, Keely SP: Genetics of surface antigen expression in Pneumocystis carinii. Infect Immun. 2001, 69 (2): 627-639. 10.1128/IAI.69.2.627-639.2001.PubMedPubMed CentralView ArticleGoogle Scholar
- Gardner MJ, Hall N, Fung E, White O, Berriman M, Hyman RW, Carlton JM, Pain A, Nelson KE, Bowman S, Paulsen IT, James K, Eisen JA, Rutherford K, Salzberg SL, Craig A, Kyes S, Chan MS, Nene V, Shallom SJ, Suh B, Peterson J, Angiuoli S, Pertea M, Allen J, Selengut J, Haft D, Mather MW, Vaidya AB, Martin DM, Fairlamb AH, Fraunholz MJ, Roos DS, Ralph SA, McFadden GI, Cummings LM, Subramanian GM, Mungall C, Venter JC, Carucci DJ, Hoffman SL, Newbold C, Davis RW, Fraser CM, Barrell B: Genome sequence of the human malaria parasite Plasmodium falciparum. Nature. 2002, 419 (6906): 498-511. 10.1038/nature01097.PubMedView ArticleGoogle Scholar
- del Portillo HA, Fernandez-Becerra C, Bowman S, Oliver K, Preuss M, Sanchez CP, Schneider NK, Villalobos JM, Rajandream MA, Harris D, Pereira da Silva LH, Barrell B, Lanzer M: A superfamily of variant genes encoded in the subtelomeric region of Plasmodium vivax. Nature. 2001, 410 (6830): 839-842. 10.1038/35071118.PubMedView ArticleGoogle Scholar
- Fischer K, Chavchich M, Huestis R, Wilson DW, Kemp DJ, Saul A: Ten families of variant genes encoded in subtelomeric regions of multiple chromosomes of Plasmodium chabaudi, a malaria species that undergoes antigenic variation in the laboratory mouse. Mol Microbiol. 2003, 48 (5): 1209-1223. 10.1046/j.1365-2958.2003.03491.x.PubMedView ArticleGoogle Scholar
- Janssen CS, Phillips RS, Turner CM, Barrett MP: Plasmodium interspersed repeats: the major multigene superfamily of malaria parasites. Nucleic Acids Res. 2004, 32 (19): 5712-5720. 10.1093/nar/gkh907.PubMedPubMed CentralView ArticleGoogle Scholar
- Sam-Yellowe TY, Florens L, Johnson JR, Wang T, Drazba JA, Le Roch KG, Zhou Y, Batalov S, Carucci DJ, Winzeler EA, Yates JR: A Plasmodium gene family encoding Maurer's cleft membrane proteins: structural properties and expression profiling. Genome Res. 2004, 14 (6): 1052-1059. 10.1101/gr.2126104.PubMedPubMed CentralView ArticleGoogle Scholar
- Chen Q, Fernandez V, Sundstrom A, Schlichtherle M, Datta S, Hagblom P, Wahlgren M: Developmental selection of var gene expression in Plasmodium falciparum. Nature. 1998, 394 (6691): 392-395. 10.1038/28660.
- Fernandez V, Hommel M, Chen Q, Hagblom P, Wahlgren M: Small, clonally variant antigens expressed on the surface of the Plasmodium falciparum-infected erythrocyte are encoded by the rif gene family and are the target of human immune responses. J Exp Med. 1999, 190 (10): 1393-1404. 10.1084/jem.190.10.1393.
- Kyes SA, Rowe JA, Kriek N, Newbold CI: Rifins: a second family of clonally variant proteins expressed on the surface of red cells infected with Plasmodium falciparum. Proc Natl Acad Sci U S A. 1999, 96 (16): 9333-9338. 10.1073/pnas.96.16.9333.
- Scherf A, Hernandez-Rivas R, Buffet P, Bottius E, Benatar C, Pouvelle B, Gysin J, Lanzer M: Antigenic variation in malaria: in situ switching, relaxed and mutually exclusive transcription of var genes during intra-erythrocytic development in Plasmodium falciparum. EMBO J. 1998, 17 (18): 5418-5426. 10.1093/emboj/17.18.5418.
- Freitas-Junior LH, Bottius E, Pirrit LA, Deitsch KW, Scheidig C, Guinet F, Nehrbass U, Wellems TE, Scherf A: Frequent ectopic recombination of virulence factor genes in telomeric chromosome clusters of P. falciparum. Nature. 2000, 407 (6807): 1018-1022. 10.1038/35039531.
- Hernandez-Rivas R, Hinterberg K, Scherf A: Compartmentalization of genes coding for immunodominant antigens to fragile chromosome ends leads to dispersed subtelomeric gene families and rapid gene evolution in Plasmodium falciparum. Mol Biochem Parasitol. 1996, 78 (1-2): 137-148. 10.1016/S0166-6851(96)02618-7.
- Rasti N, Wahlgren M, Chen Q: Molecular aspects of malaria pathogenesis. FEMS Immunol Med Microbiol. 2004, 41 (1): 9-26. 10.1016/j.femsim.2004.01.010.
- Robinson BA, Welch TL, Smith JD: Widespread functional specialization of Plasmodium falciparum erythrocyte membrane protein 1 family members to bind CD36 analysed across a parasite genome. Mol Microbiol. 2003, 47 (5): 1265-1278. 10.1046/j.1365-2958.2003.03378.x.
- Cheng Q, Cloonan N, Fischer K, Thompson J, Waine G, Lanzer M, Saul A: stevor and rif are Plasmodium falciparum multicopy gene families which potentially encode variant antigens. Mol Biochem Parasitol. 1998, 97 (1-2): 161-176. 10.1016/S0166-6851(98)00144-3.
- Gardner MJ, Tettelin H, Carucci DJ, Cummings LM, Aravind L, Koonin EV, Shallom S, Mason T, Yu K, Fujii C, Pederson J, Shen K, Jing J, Aston C, Lai Z, Schwartz DC, Pertea M, Salzberg S, Zhou L, Sutton GG, Clayton R, White O, Smith HO, Fraser CM, Adams MD, Venter JC, Hoffman SL: Chromosome 2 sequence of the human malaria parasite Plasmodium falciparum. Science. 1998, 282 (5391): 1126-1132. 10.1126/science.282.5391.1126.
- Finn RD, Mistry J, Schuster-Bockler B, Griffiths-Jones S, Hollich V, Lassmann T, Moxon S, Marshall M, Khanna A, Durbin R, Eddy SR, Sonnhammer EL, Bateman A: Pfam: clans, web tools and services. Nucleic Acids Res. 2006, 34 (Database issue): D247-51. 10.1093/nar/gkj149.
- Florens L, Washburn MP, Raine JD, Anthony RM, Grainger M, Haynes JD, Moch JK, Muster N, Sacci JB, Tabb DL, Witney AA, Wolters D, Wu Y, Gardner MJ, Holder AA, Sinden RE, Yates JR, Carucci DJ: A proteomic view of the Plasmodium falciparum life cycle. Nature. 2002, 419 (6906): 520-526. 10.1038/nature01107.
- Haeggstrom M, Kironde F, Berzins K, Chen Q, Wahlgren M, Fernandez V: Common trafficking pathway for variant antigens destined for the surface of the Plasmodium falciparum-infected erythrocyte. Mol Biochem Parasitol. 2004, 133 (1): 1-14. 10.1016/j.molbiopara.2003.07.006.
- Helmby H, Cavelier L, Pettersson U, Wahlgren M: Rosetting Plasmodium falciparum-infected erythrocytes express unique strain-specific antigens on their surface. Infect Immun. 1993, 61 (1): 284-288.
- Petter M, Haeggstrom M, Khattab A, Fernandez V, Klinkert MQ, Wahlgren M: Variant proteins of the Plasmodium falciparum RIFIN family show distinct subcellular localization and developmental expression patterns. Mol Biochem Parasitol. 2007, 156 (1): 51-61. 10.1016/j.molbiopara.2007.07.011.
- Abdel-Latif MS, Dietz K, Issifou S, Kremsner PG, Klinkert MQ: Antibodies to Plasmodium falciparum rifin proteins are associated with rapid parasite clearance and asymptomatic infections. Infect Immun. 2003, 71 (11): 6229-6233. 10.1128/IAI.71.11.6229-6233.2003.
- Abdel-Latif MS, Khattab A, Lindenthal C, Kremsner PG, Klinkert MQ: Recognition of variant Rifin antigens by human antibodies induced during natural Plasmodium falciparum infections. Infect Immun. 2002, 70 (12): 7013-7021. 10.1128/IAI.70.12.7013-7021.2002.
- Marti M, Good RT, Rug M, Knuepfer E, Cowman AF: Targeting malaria virulence and remodeling proteins to the host erythrocyte. Science. 2004, 306 (5703): 1930-1933. 10.1126/science.1102452.
- Tham WH, Payne PD, Brown GV, Rogerson SJ: Identification of basic transcriptional elements required for rif gene transcription. Int J Parasitol. 2007, 37 (6): 605-615. 10.1016/j.ijpara.2006.11.006.
- Abhiman S, Sonnhammer EL: Large-scale prediction of function shift in protein families with a focus on enzymatic function. Proteins. 2005, 60 (4): 758-768. 10.1002/prot.20550.
- Knudsen B, Miyamoto MM: A likelihood ratio test for evolutionary rate shifts and functional divergence among proteins. Proc Natl Acad Sci U S A. 2001, 98 (25): 14512-14517. 10.1073/pnas.251526398.
- Prim N, Bofill C, Pastor FI, Diaz P: Esterase EstA6 from Pseudomonas sp. CR-611 is a novel member in the utmost conserved cluster of family VI bacterial lipolytic enzymes. Biochimie. 2006, 88 (7): 859-867. 10.1016/j.biochi.2006.02.011.
- Stam MR, Danchin EG, Rancurel C, Coutinho PM, Henrissat B: Dividing the large glycoside hydrolase family 13 into subfamilies: towards improved functional annotations of alpha-amylase-related proteins. Protein Eng Des Sel. 2006, 19 (12): 555-562. 10.1093/protein/gzl044.
- Khattab A, Klinkert MQ: Maurer's clefts-restricted localization, orientation and export of a Plasmodium falciparum RIFIN. Traffic. 2006, 7 (12): 1654-1665. 10.1111/j.1600-0854.2006.00494.x.
- Kraemer SM, Smith JD: Evidence for the importance of genetic structuring to the structural and functional specialization of the Plasmodium falciparum var gene family. Mol Microbiol. 2003, 50 (5): 1527-1538. 10.1046/j.1365-2958.2003.03814.x.
- DePristo MA, Zilversmit MM, Hartl DL: On the abundance, amino acid composition, and evolutionary dynamics of low-complexity regions in proteins. Gene. 2006, 378: 19-30. 10.1016/j.gene.2006.03.023.
- Enea V, Corredor V: The evolution of plasmodial stage-specific rRNA genes is dominated by gene conversion. J Mol Evol. 1991, 32 (2): 183-186. 10.1007/BF02515391.
- Nielsen KM, Kasper J, Choi M, Bedford T, Kristiansen K, Wirth DF, Volkman SK, Lozovsky ER, Hartl DL: Gene conversion as a source of nucleotide diversity in Plasmodium falciparum. Mol Biol Evol. 2003, 20 (5): 726-734. 10.1093/molbev/msg076.
- Posada D, Crandall KA: The effect of recombination on the accuracy of phylogeny estimation. J Mol Evol. 2002, 54 (3): 396-402.
- Kennedy M, Holland BR, Gray RD, Spencer HG: Untangling long branches: identifying conflicting phylogenetic signals using spectral analysis, neighbor-net, and consensus networks. Syst Biol. 2005, 54 (4): 620-633. 10.1080/106351591007462.
- Stiller JW, Hall BD: Long-branch attraction and the rDNA model of early eukaryotic evolution. Mol Biol Evol. 1999, 16 (9): 1270-1279.
- Shakhnovich BE, Koonin EV: Origins and impact of constraints in evolution of gene families. Genome Res. 2006, 16 (12): 1529-1536. 10.1101/gr.5346206.
- Lavstsen T, Salanti A, Jensen AT, Arnot DE, Theander TG: Sub-grouping of Plasmodium falciparum 3D7 var genes based on sequence analysis of coding and non-coding regions. Malar J. 2003, 2: 27. 10.1186/1475-2875-2-27.
- Voss TS, Thompson JK, Waterkeyn J, Felger I, Weiss N, Cowman AF, Beck HP: Genomic distribution and functional characterisation of two distinct and conserved Plasmodium falciparum var gene 5' flanking sequences. Mol Biochem Parasitol. 2000, 107 (1): 103-115. 10.1016/S0166-6851(00)00176-6.
- Fonager J, Cunningham D, Jarra W, Koernig S, Henneman AA, Langhorne J, Preiser P: Transcription and alternative splicing in the yir multigene family of the malaria parasite Plasmodium y. yoelii: identification of motifs suggesting epigenetic and post-transcriptional control of RNA expression. Mol Biochem Parasitol. 2007, 156 (1): 1-11. 10.1016/j.molbiopara.2007.06.006.
- Hiller NL, Bhattacharjee S, van Ooij C, Liolios K, Harrison T, Lopez-Estrano C, Haldar K: A host-targeting signal in virulence proteins reveals a secretome in malarial infection. Science. 2004, 306 (5703): 1934-1937. 10.1126/science.1102737.
- Golding GB, Dean AM: The structural basis of molecular adaptation. Mol Biol Evol. 1998, 15 (4): 355-369.
- DePristo MA, Weinreich DM, Hartl DL: Missense meanderings in sequence space: a biophysical view of protein evolution. Nat Rev Genet. 2005, 6 (9): 678-687. 10.1038/nrg1672.
- Weber JL: Interspersed repetitive DNA from Plasmodium falciparum. Mol Biochem Parasitol. 1988, 29 (2-3): 117-124. 10.1016/0166-6851(88)90066-7.
- Bozdech Z, Llinas M, Pulliam BL, Wong ED, Zhu J, DeRisi JL: The transcriptome of the intraerythrocytic developmental cycle of Plasmodium falciparum. PLoS Biol. 2003, 1 (1): E5. 10.1371/journal.pbio.0000005.
- Daily JP, Le Roch KG, Sarr O, Ndiaye D, Lukens A, Zhou Y, Ndir O, Mboup S, Sultan A, Winzeler EA, Wirth DF: In vivo transcriptome of Plasmodium falciparum reveals overexpression of transcripts that encode surface proteins. J Infect Dis. 2005, 191 (7): 1196-1203. 10.1086/428289.
- Le Roch KG, Zhou Y, Blair PL, Grainger M, Moch JK, Haynes JD, De La Vega P, Holder AA, Batalov S, Carucci DJ, Winzeler EA: Discovery of gene function by expression profiling of the malaria parasite life cycle. Science. 2003, 301 (5639): 1503-1508. 10.1126/science.1087025.
- Llinas M, Bozdech Z, Wong ED, Adai AT, DeRisi JL: Comparative whole genome transcriptome analysis of three Plasmodium falciparum strains. Nucleic Acids Res. 2006, 34 (4): 1166-1173. 10.1093/nar/gkj517.
- Fernandez-Becerra C, Pein O, de Oliveira TR, Yamamoto MM, Cassola AC, Rocha C, Soares IS, de Braganca Pereira CA, del Portillo HA: Variant proteins of Plasmodium vivax are not clonally expressed in natural infections. Mol Microbiol. 2005, 58 (3): 648-658. 10.1111/j.1365-2958.2005.04850.x.
- Preiser PR, Khan S, Costa FT, Jarra W, Belnoue E, Ogun S, Holder AA, Voza T, Landau I, Snounou G, Renia L: Stage-specific transcription of distinct repertoires of a multigene family during Plasmodium life cycle. Science. 2002, 295 (5553): 342-345. 10.1126/science.1064938.
- PlasmoDB v4.4. [http://v4-4.plasmodb.org/]
- Broad Institute of Harvard and M.I.T. [http://www.broad.mit.edu/]
- Lassmann T, Sonnhammer EL: Kalign--an accurate and fast multiple sequence alignment algorithm. BMC Bioinformatics. 2005, 6: 298. 10.1186/1471-2105-6-298.
- Hall T: BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl Acids Symp Ser. 1999, 41: 95-98.
- GeneDB. [http://www.genedb.org/]
- Suyama M, Torrents D, Bork P: PAL2NAL: robust conversion of protein sequence alignments into the corresponding codon alignments. Nucleic Acids Res. 2006, 34 (Web Server issue): W609-12. 10.1093/nar/gkl315.
- Kumar S, Tamura K, Nei M: MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief Bioinform. 2004, 5 (2): 150-163. 10.1093/bib/5.2.150.
- Protein Sequence Logos and Relative Entropy. [http://www.cbs.dtu.dk/~gorodkin/appl/plogo.html]
- Schneider TD, Stephens RM: Sequence logos: a new way to display consensus sequences. Nucleic Acids Res. 1990, 18 (20): 6097-6100. 10.1093/nar/18.20.6097.
- Bryson K, McGuffin LJ, Marsden RL, Ward JJ, Sodhi JS, Jones DT: Protein structure prediction servers at University College London. Nucleic Acids Res. 2005, 33 (Web Server issue): W36-8. 10.1093/nar/gki410.
- Jones DT: Protein secondary structure prediction based on position-specific scoring matrices. J Mol Biol. 1999, 292 (2): 195-202. 10.1006/jmbi.1999.3091.
- Clamp M, Cuff J, Searle SM, Barton GJ: The Jalview Java alignment editor. Bioinformatics. 2004, 20 (3): 426-427. 10.1093/bioinformatics/btg430.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.