The Entamoeba histolytica genome: primary structure and expression of proteolytic enzymes

Background A number of studies have shown that peptidases, and in particular cysteine peptidases, constitute major pathogenicity factors in Entamoeba histolytica. Recent studies have suggested that a considerable number of genes coding for proteolytic enzymes are present within the E. histolytica genome, and questions remain about the mode of expression of the various molecules. Results By homology search within the recently published amoeba genome, we identified a total of 86 E. histolytica genes coding for putative peptidases, including 46 recently described peptidase genes. In total these comprise (i) 50 cysteine peptidases of different families, most of which belong to the C1 papain superfamily, (ii) 22 different metallo peptidases from at least 11 different families, (iii) 10 serine peptidases belonging to 3 different families, and (iv) 4 aspartic peptidases of only one family. Using an oligonucleotide microarray, peptidase gene expression patterns of 7 different E. histolytica isolates as well as of heat-stressed cells were analysed. A total of 21 out of 79 amoeba peptidase genes analysed were found to be significantly expressed under standard axenic culture conditions, whereas the remaining genes are not expressed or are expressed at very low levels only. In heat-stressed cells the expression of 2 and 3 peptidase genes, respectively, was decreased or increased. Only minor differences were observed between the various isolates investigated, despite the fact that these isolates originated from asymptomatic individuals or from patients with various forms of amoebic disease. Conclusion Entamoeba histolytica possesses a large number of genes coding for proteolytic enzymes. Under standard culture conditions or upon heat stress only a relatively small number of these genes is significantly expressed, and only very few variations become apparent between various clinical E. histolytica isolates, calling into question the importance of these enzymes in E. histolytica pathogenicity. Further studies are required to define the precise role of most of the proteolytic enzymes for amoeba cell biology and in particular for E. histolytica virulence.


Background
The faecal-oral spread protozoan parasite Entamoeba histolytica is an important human pathogen. Normally, this parasite resides and multiplies in the large bowel and can persist there for months and years causing only an asymptomatic luminal gut infection. However, occasionally E. histolytica penetrates the intestinal mucosa, which leads to ulcerative colitis or it disseminates to other organs, most commonly to the liver, where it induces abscess formation. Cysteine peptidases are considered to play a major role for the pathogenicity of E. histolytica as suggested by a large number of in vitro and in vivo studies [1][2][3][4][5][6][7][8][9]. Most convincing are results from infections of laboratory animals indicating that E. histolytica trophozoites that have reduced cysteine peptidase activity are greatly impaired in their ability to induce amoebic liver abscesses [8,9]. In addition, overexpression of cysteine peptidases led to an increase in cytopathic activity, measured by in vitro monolayer disruption, as well as to a significant increase in amoebic liver abscess formation in laboratory animals in comparison to matching controls [10]. Furthermore, the discovery that amoeba cysteine peptidases possess interleukin-1β converting enzyme activity suggests a novel mechanism of these enzymes in amoebic virulence [11].
Homology searches based on the conservation of active site regions revealed that the E. histolytica genome contains at least 50 genes coding for cysteine peptidases (reviewed by Clark et al. [12]). Of these, the majority are structurally related to the C1 papain superfamily, whereas a few others are more similar to family C2 (calpain-like cysteine proteinases), C19 (ubiquitinyl hydrolase), C48 (Ulp1 peptidase), C54 (autophagin), and C65 (otubain), respectively [12].
Phylogenetic analyses of the 37 C1-family members revealed that they represent 3 distinct clades (A, B, C), consisting of 13, 11 and 13 members, respectively [12]. EhCP-A and EhCP-B family members are organised as classical pre-pro enzymes with an overall cathepsin L-like structure. They differ in the length of the pro regions as well as of the catalytic domains and have specific sequence motifs within the N-terminal regions of the mature enzymes. In addition, most members of the EhCP-B family contain hydrophobic stretches near or at the C-terminus [12,13]. The primary structure prediction of the 13 EhCP-C members indicated a hydrophobic region located 11 to 28 amino acid residues from the N-terminus, which is predicted to form a signal anchor. As there is no example of a structurally related cysteine peptidase corresponding to the EhCP-C subfamily, the function of this group of molecules remains to be determined.
In addition, two genes encoding putative cysteine peptidases of the family C2 (calpain-like proteases) were identified within the genome (EhCALP1 and EhCALP2). These molecules are involved in several cellular processes including signal transduction pathways, remodelling of the cytoskeleton and membranes, and apoptosis [14].
Another 4 genes were identified coding for enzymes with homology to the peptidase family C54 also termed autophagins (EhAUTO1-4). The process of autophagy has initially been described in other eukaryotic cells as a rescue mechanism that is induced upon starvation or oxidative stress. It is a process by which cells digest parts of their own cytosolic material. This allows the recycling of molecules under conditions of nutritional limitation and remodelling of intracellular structure for cell differentiation [15][16][17].
Four other genes putatively encoding cysteine peptidases in E. histolytica show homology to members of the C19 and C65 families. These two groups of enzymes are known to be involved in ubiquitin degradation. In addition, 3 genes with homology to Ulp1 peptidase (C48 family) were found. Ulp1 is a member of a family of peptidases that control the function of SUMO, a small ubiquitin-like modifier protein [18].
Only preliminary data are available for other peptidase family members in E. histolytica. So far, a collagenase [19], a high molecular weight proteinase [20], a serine-metallo proteinase [21], a tripeptidyl peptidase I [22] and a serine protease [23] have been reported.
In this study we have analysed the genome of E. histolytica for the presence of additional peptidases belonging to the aspartate, serine and metallo peptidase families. Furthermore, the expression profile of the amoeba genes for the various proteolytic enzymes was assessed in 7 different E. histolytica isolates as well as under heat stress conditions using an oligonucleotide-based microarray and quantitative real time PCR.

Peptidase genes in E. histolytica
Homology search within the E. histolytica genome revealed a total of 86 genes coding for putative peptidases. These comprise 50 cysteine peptidases of various families, all of which belong to clan CA. In addition, 4 aspartic, 10 serine and 22 metallo peptidase genes were identified (Figure 1, Table 1). Structural details of the various E. histolytica cysteine peptidases have been described recently [12].
Primary structure predictions for the other 3 groups of proteolytic enzymes are as follows:

Aspartic peptidases
The 4 aspartic peptidases (EhAsP22-1 to EhAsP22-4) share 30 to 40% sequence identity and are homologous to intramembrane-cleaving peptidases (clan AD, family A22). All of them have the specific active site residues TyrAsp and GlyLeuGlyAsp and contain 7 or 8 transmembrane domains, but only EhAsP22-1 and EhAsP22-2 have recognizable signal peptides, whereas EhAsP22-3 contains a predicted signal anchor motif. EhAsP22-1, -2 and -3 have significant homology to signal peptide peptidases of various organisms including Trypanosoma cruzi and Arabidopsis thaliana and, in addition, EhAsP22-1 and EhAsP22-3 contain the signal peptide peptidase-specific motif GlnProAlaLeuLeuTyr [24,25]. The primary structure of EhAsP22-4 revealed highest identity (35-40%) to putative presenilins of various organisms including Dictyostelium discoideum, Arabidopsis thaliana and Homo sapiens, but a signal peptide or signal anchor is absent.

Serine peptidases
Of the 10 E. histolytica genes coding for putative serine peptidases, 5 are predicted to belong to clan SC, family S9 ( Figure 1, Table 1), with the active site residues Ser, Asp, His. According to the amino acid residues adjacent to the active site Ser (GGSYGG), EhSP9-1, -2, and -3 can be grouped into subfamily C. The sequences of EhSP9-1 and EhSP9-2 are identical except for a 12 amino acid insertion present in EhSP9-2. In contrast, EhSP9-3, -4 or -5 share only 20% sequence identity with EhSP9-1 or EhSP9-2.
The active site residues of EhSP9-4 are not conserved and for EhSP9-5 only a partial sequence of 102 amino acid residues is available. Thus, a reliable assignment of these two enzymes to a specific S9 subfamily is not possible. Signal peptides were identified only for EhSP9-1, EhSP9-3 and EhSP9-4, respectively.
Figure 1 Summary of the peptidase gene families identified within the Entamoeba histolytica genome. The identified enzymes were grouped into the corresponding peptidase clans and families according to the MEROPS nomenclature.
Another 3 enzymes were classified into clan SC but most likely represent members of family S28 of serine peptidases (EhSP28-1, -2, -3), which are also known as lysosomal Pro-Xaa carboxypeptidases. All 3 molecules have a predicted signal peptide and are of similar size, comprising between 457 and 480 amino acid residues. EhSP28-1 (EhSp1) and EhSP28-2 (EhSp2) have been previously characterized [23]. The two are highly similar, sharing 89% sequence identity, but each shares only 35% identity with EhSP28-3.
Two other serine peptidases (EhSP26-1 and EhSP26-2) have homology to members of the signal peptidase family S26B (clan SF) containing the catalytic dyad Ser and His.
EhSP26-1 has a calculated molecular mass of approximately 20 kDa and contains 2 hydrophobic regions located near the N-and C-terminus, respectively. Sequence similarity to other members of this family is approximately 45%. In contrast, EhSP26-2 shares only 20% sequence identity with members of the S26 family. Moreover, it does not contain predicted transmembrane regions and the active site is not conserved.

Metallo peptidases
A total of 22 E. histolytica genes are predicted to encode putative metallo peptidases. These are relatively diverse and can be attributed to 7 different clans and 11 different families (Figure 1, Table 1). Six of the enzymes group into clan MA, with the characteristic zinc-binding motif HEXXH, in which the two histidine residues bind zinc. One member is assigned to family M1 (EhMP1-1) and two others to family M3 (EhMP3-1, EhMP3-2). The latter are known as gluzincins, with the third zinc-binding site being a glutamate residue.
Another two clan MA members (EhMP8-1, EhMP8-2) are homologous to metzincins, which are characterized by a C-terminal His residue being a third zinc-binding site. The two enzymes share 34% sequence identity and both contain a predicted C-terminal transmembrane region but only EhMP8-1 has a signal sequence.
A further clan MA member belongs to family M48 (EhMP48-1) and contains a predicted signal anchor and 6 additional transmembrane domains. The structure of EhMP48-1 is homologous to ste24, an endopeptidase from yeast. Like the yeast enzyme, the amoeba molecule contains the conserved HEXXH zinc-binding motif, located between the fourth and the fifth transmembrane domain.
Another putative metallo peptidase was assigned to clan ME, family M16C containing the characteristic zinc-binding motif HXXEH. Members of this family are falcilysin from Plasmodium falciparum, eupitrilysin from Homo sapiens and CYM1 peptidase from Saccharomyces cerevisiae.
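The zinc-binding motifs named above (HEXXH for clan MA, HXXEH for family M16) can be located in a deduced protein sequence with a simple pattern scan. The following is only a minimal sketch and not the search procedure used in this study; the function name and the example sequence are hypothetical.

```python
import re

# Zinc-binding motifs named in the text; X stands for any amino acid residue.
MOTIFS = {
    "HEXXH": re.compile(r"(?=(HE..H))"),  # clan MA
    "HXXEH": re.compile(r"(?=(H..EH))"),  # clan ME, family M16
}

def find_zinc_motifs(sequence):
    """Return {motif name: [(1-based position, matched residues), ...]}."""
    seq = sequence.upper()
    hits = {}
    for name, pattern in MOTIFS.items():
        # The lookahead pattern also reports overlapping occurrences.
        hits[name] = [(m.start() + 1, m.group(1)) for m in pattern.finditer(seq)]
    return hits

# Hypothetical sequence fragment, for illustration only.
example = "MKTAHEIGHALGLAHSWHFLEHQ"
print(find_zinc_motifs(example))
```

A motif hit alone does not establish that a protein is a functional metallo peptidase; it only flags candidates for closer comparison with known family members.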
A group of 6 enzymes (EhMP24-1 to EhMP24-6) was predicted to constitute metallo peptidases of clan MG, family M24, which usually represent cytosolic exopeptidases that require co-catalytic ions such as cobalt or manganese. Another 6 peptidases were identified with homology to metallo peptidases of clan MH. Of these, 2 most likely constitute aspartyl aminopeptidases belonging to family M18 (EhMP18-1, EhMP18-2). They share 40% identity with each other and approximately 35% with members of this family from other organisms. The other 4 amoeba enzymes of clan MH were attributed to family M20 (EhMP20-1 to EhMP20-4). In general, enzymes of this family hydrolyse the late products of protein degradation to complete the conversion of proteins into free amino acids.
The deduced amino acid sequence of a further amoeba gene revealed homology to clan MK, family M22 of metallo peptidases. The only enzyme belonging to this family known so far is the O-sialoglycoprotein endopeptidase from Pasteurella haemolytica. At present, the nature of the active site residues is unknown [26]. Like the amoeba homologue, the bacterial peptidase does not possess a signal peptide.
In addition, one E. histolytica enzyme was identified belonging to family M49, clan M. The mammalian homologues are cytosolic dipeptidyl peptidases, which sequentially release N-terminal dipeptides [27]. Moreover, an enzyme designated EhU48-1 was annotated, which is similar to EhMP48-1. Like EhMP48-1, it contains a signal anchor sequence and six transmembrane domains. Nevertheless, homology search grouped this peptidase into the U48 family. The specificities of the two families overlap but are not identical [28,29].
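As a consistency check, the per-family gene counts given in the text can be tallied against the stated totals. The sketch below rests on two assumptions that are not stated explicitly in the text: the C19 and C65 genes are kept as one combined entry because only their joint count (4) is given, and EhU48-1 is counted among the 22 metallo peptidase genes.

```python
# Gene counts per MEROPS family, as given in the text.
PEPTIDASE_FAMILIES = {
    "cysteine": {"C1": 37, "C2": 2, "C54": 4, "C19/C65": 4, "C48": 3},
    "aspartic": {"A22": 4},
    "serine": {"S9": 5, "S28": 3, "S26": 2},
    "metallo": {
        "M1": 1, "M3": 2, "M8": 2, "M48": 1, "M16": 1, "M24": 6,
        "M18": 2, "M20": 4, "M22": 1, "M49": 1,
        "U48": 1,  # assumption: EhU48-1 is included in the metallo total
    },
}

def class_totals(families):
    """Sum the family counts within each catalytic class."""
    return {cls: sum(counts.values()) for cls, counts in families.items()}

totals = class_totals(PEPTIDASE_FAMILIES)
print(totals)
print("all peptidase genes:", sum(totals.values()))
```

Under these assumptions the tally reproduces the 50 cysteine, 4 aspartic, 10 serine and 22 metallo peptidase genes, 86 in total, with 11 metallo peptidase families.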

Peptidase gene expression of various E. histolytica isolates under standard axenic culture conditions
To allow detailed expression analyses of the various E. histolytica peptidase genes, a small microarray was designed. This array contains 86 specific oligonucleotides representing 4 different E. histolytica housekeeping genes, 3 peptidase-inhibitor genes as well as 79 of the 86 identified peptidase genes. Genes coding for the serine peptidase EhSP9-4 and for the cysteine peptidases EhCP-A7, EhUBP, EhUCH, EhUlp-1, EhUlp-2 and EhUlp-3 were not included because either it was not possible to design a specific oligonucleotide or they were identified after the array had already been spotted. In a first step, labelled cDNA from the widely used laboratory strain HM-1:IMSS was hybridized to the array (Figure 2). The results from multiple experiments using RNA preparations from cells grown under standard axenic culture conditions were highly reproducible and indicated that only 3 peptidase genes were expressed at high levels (mean spot intensity >8000), all of them encoding cysteine proteinases (EhCP-A1, EhCP-A2, EhCP-A5). A set of 17 peptidase genes revealed intermediate expression levels (mean spot intensity 800 to 3000). This group comprised the genes for the cysteine peptidases EhCP-A6, -A10, -A11, -B2, -C4 and EhCALP1, the aspartic peptidase EhAsP22-1, the serine peptidase EhSP9-2 and the metallo peptidases EhMP1-1, 16-1, 18-1, 20-3, 20-4, 24-1, 24-2, 24-6 and 48-1, respectively. All other peptidase genes were expressed at levels below the detection limit of Northern blots (mean spot intensity <700). The reliability of the results obtained by array hybridization was confirmed by qRT-PCR using a set of 22 pairs of primers amplifying cDNAs of the 3 highly expressed genes as well as a representative number of the intermediate or low expressed peptidase genes (Table 2).
In order to determine the extent of inter-strain variation in the expression of peptidase genes, HM-1:IMSS was compared with 6 different E. histolytica isolates, all of them cultivated under axenic conditions. These isolates originated from different parts of the world and were obtained from patients with different forms of amoebic disease or, in at least one case, from an asymptomatic E. histolytica carrier. Pairwise comparison of the various isolates with HM-1:IMSS revealed only minor differences in the expression of the various peptidase genes (Table 3). Three isolates, including the one from an asymptomatic carrier, showed no differences and two isolates differed only in the expression of one gene. In isolate HK-9, expression of the gene for cysteine peptidase EhCP-A5 was decreased by 2.3 fold and in isolate DRP expression of the gene for the metallo peptidase EhMP48-1 was increased by 5.2 fold. The only exception was isolate EGG, which revealed differences in expression for 4 peptidase genes. This isolate was obtained from a patient who simultaneously developed amoebic colitis and liver abscess. Compared to HM-1:IMSS, isolate EGG showed decreased expression of the genes for the cysteine peptidase EhCP-A1 as well as for the metallo peptidase EhMP20-3 by about 2 fold, and an increase in the expression of the genes for serine peptidase EhSP9-2 and for the metallo peptidase EhMP20-1 by about 2.8 and 8.6 fold, respectively.
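The three expression classes described above can be written as a simple threshold rule on the mean spot intensity. This is a minimal sketch using only the cut-offs quoted in the text; intensities falling between the quoted ranges (700 to 800 and 3000 to 8000) are not covered by the text and are therefore reported as unassigned. The function name and the example values are hypothetical.

```python
def expression_class(mean_spot_intensity):
    """Assign an expression class from the mean spot intensity.

    Cut-offs follow the ranges quoted in the text: >8000 high,
    800 to 3000 intermediate, <700 low. Values between the quoted
    ranges are returned as 'unassigned' rather than guessed.
    """
    if mean_spot_intensity > 8000:
        return "high"
    if 800 <= mean_spot_intensity <= 3000:
        return "intermediate"
    if mean_spot_intensity < 700:
        return "low"
    return "unassigned"

# Hypothetical intensities, for illustration only (not measured values).
example = {"gene_a": 9500, "gene_b": 1200, "gene_c": 150, "gene_d": 750}
for gene, intensity in example.items():
    print(gene, expression_class(intensity))
```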
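The fold changes reported for the pairwise comparisons can be understood as ratios of signal intensities between an isolate and the reference strain HM-1:IMSS. The sketch below shows this arithmetic only; it does not reproduce the normalisation or the significance testing of the study, and the function name and intensity values are hypothetical.

```python
def fold_change(reference_intensity, isolate_intensity):
    """Return (direction, fold) for an isolate relative to the reference strain."""
    if reference_intensity <= 0 or isolate_intensity <= 0:
        raise ValueError("intensities must be positive")
    ratio = isolate_intensity / reference_intensity
    if ratio == 1:
        return ("unchanged", 1.0)
    if ratio > 1:
        return ("increased", ratio)
    # A decrease is reported as the inverse ratio, e.g. "decreased by 2.3 fold".
    return ("decreased", 1 / ratio)

# Hypothetical intensities chosen to give a 2.3-fold decrease and a 5.2-fold increase.
print(fold_change(4600, 2000))
print(fold_change(1000, 5200))
```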

Peptidase gene expression in response to heat stress
Previous studies have suggested that the level of expression of a number of cysteine peptidase genes is sensitive to heat shock [30,31]. To further characterize the influence of heat stress on the expression pattern of the various E. histolytica peptidase genes, amoebae were cultured at 42°C for 4 hours and compared with amoebae cultivated under standard culture conditions at 36°C. The results indicated that only 5 of the 79 peptidase genes investigated were differentially expressed upon heat shock. The amount of RNA for the highly expressed genes ehcp-a1 and ehcp-a2 was found to be decreased by about 6 and 4 fold, respectively, whereas the expression of ehcp-a5, ehcp-a6 and ehmp8-2 was increased by approximately 2 fold (Table 4). Similar results were obtained by qRT-PCR. There were no significant differences in expression for the remaining 74 peptidase genes.

Discussion
In an attempt to annotate all E. histolytica peptidase genes, a total of 86 putative or known proteolytic enzymes were identified within the E. histolytica genome. Such a great number of peptidase genes is not unusual for protozoans. So far, 110 annotated peptidases were found for Plasmodium falciparum and 70 for Giardia lamblia. Entamoeba, Plasmodium and Giardia contain aspartic peptidases of the A22 family. In addition, P. falciparum contains genes belonging to the A1 family known as plasmepsins. Of the various cysteine peptidases, the autophagin-like as well as the OTU-like enzymes are only present in E. histolytica. On the other hand, several cysteine peptidase families found in P. falciparum, such as C2, C12, C13, C14, C44 and C56, have no counterpart in Entamoeba. Regarding the serine and metallo peptidases, no striking differences between the families of Entamoeba, Plasmodium and Giardia became obvious, except for 10 additional peptidase families that are peculiar to P. falciparum. The leishmanolysin-like peptidases of the M8 family are specific for E. histolytica and absent in P. falciparum and G. lamblia.
Figure 2 Expression of E. histolytica peptidase genes of the E. histolytica isolate HM-1:IMSS as determined by microarrays. Error bars represent the standard error of the mean of nine hybridizations (biological replicates).
Since only a fraction of the 86 putative amoeba peptidases have been biochemically and functionally characterized so far, the function and localization of most of the molecules can only be predicted from the deduced primary structure.
All four aspartic peptidases identified within the Entamoeba histolytica genome may belong to the intramembrane-cleaving proteases, which usually perform downstream functions such as cell signalling, regulation and intercellular communication [32]. So far, three families of peptidases are known to promote intramembrane cleavage. These are metallo peptidases represented by the human site-2 protease [33], serine peptidases represented by Drosophila melanogaster rhomboid-1 [34], and aspartic proteases including human presenilins [35], as well as signal peptide peptidases [24]. Within the Entamoeba genome, only homologues of the presenilins and signal peptide peptidases have been found.
One of the amoebic aspartic peptidases (EhAsP22-4) shows highest identity to presenilins. So far, the physiological function of presenilins is not fully understood. Presenilin is one of the subunits that form a multiprotein complex called gamma-secretase [36]. Homologues are found in various organisms of different origin such as Caenorhabditis elegans, Drosophila melanogaster, and even plants [37,38]. It has been shown that mutations within this protein are associated with Alzheimer's disease [36]. However, homologues of other subunits of this complex, such as nicastrin, have not been identified within the Entamoeba genome. EhAsP22-4 contains seven putative transmembrane domains. Interestingly, the active site residues were found within one predicted outside loop of the protein. This differs from the other known presenilins [39].
The 10 identified serine peptidases can be grouped into 3 families. The two amoeba serine peptidases characterized so far belong to family S28; they were previously designated EhSP1 and EhSP2 and are now renamed EhSP28-1 and EhSP28-2 [23]. Biochemical analysis revealed that these peptidases prefer the substrate Suc-AAF-AMC. This enzymatic feature is identical to that of the E. histolytica tripeptidyl serine peptidase purified by Flockenhaus and colleagues [22]. Unfortunately, no sequence of the purified tripeptidyl peptidase is available. In addition to these two described serine peptidases, one more serine peptidase gene belonging to family S28 has been identified. Microarray analysis indicated that EhSP28-1, EhSP28-2 and EhSP28-3 are not expressed or are expressed at a very low level, which is in contrast to the results of Barrios-Ceballos and colleagues [23]. They postulated that the identified peptidase activity corresponds to EhSP28-2 and that this protein is associated with the trophozoite membrane. Using bioinformatic tools, no hydrophobic stretches or transmembrane domains could be deduced within EhSP28-2. Due to these conflicting results, the amoeba serine peptidases require further investigation.

The function of the two S26 family serine peptidases identified within E. histolytica is unknown. Peptidases of this family are usually membrane proteins and their function is the processing of newly synthesised secretory proteins. They remove the hydrophobic, N-terminal signal peptides as the proteins are translocated across membranes [40].
Four genes belong to the S9 family (homologous to dipeptidyl-peptidases). One peptidase (EhSP9-2) shows the highest expression of all serine peptidases analysed in this study. However, the function of these enzymes in E. histolytica remains to be determined.
It is likely that some of the E. histolytica serine peptidases play a role during the encystation process, as was shown for E. invadens, the in vitro model organism for en- and excystation [41]. Unfortunately, the serine peptidases involved in the E. invadens encystation process have not been identified so far.
A total of 22 genes were identified encoding metallo peptidases, which are predicted to belong to 11 different families. The members of two of the identified metallo peptidase families contain transmembrane domains. These are the leishmanolysin-like peptidases and the CAAX prenyl peptidases. EhMP8-1 and EhMP8-2 are homologous to leishmanolysin found in kinetoplastids. Leishmanolysin occurs mainly as a heavily glycosylated protein that is attached to the outer membrane of Leishmania promastigotes by a glycosylphosphatidylinositol anchor. It has been demonstrated that leishmanolysin plays a role in resistance of promastigotes to complement-mediated lysis and in receptor-mediated uptake of the parasite by phagocytic host cells [42]. Other eukaryotes, including Caenorhabditis elegans, Drosophila melanogaster and Homo sapiens, have homologues of this protein. Nevertheless, the highest degree of sequence similarity to the classical leishmanolysin is found for the enzymes of E. histolytica and Dictyostelium discoideum. However, the proteins of these two organisms have not been characterised so far.
Interestingly, under standard axenic culture conditions only a relatively small number of peptidase genes is significantly expressed. The results are in agreement with a recent study by Ehrenkaufer et al., in which the expression pattern of 38 of the 50 different cysteine peptidase genes was analysed in the standard laboratory E. histolytica isolate HM-1:IMSS [43]. However, in contrast to the results presented here, Ehrenkaufer et al. found differences in the expression of a considerable number of peptidase genes when recent clinical isolates were compared with strain HM-1:IMSS. The discrepancy between the two studies is most likely due to differences in the culture media used. In the study presented here, all E. histolytica isolates were grown under axenic conditions in a monophasic medium, whereas Ehrenkaufer et al. cultured their recent clinical isolates xenically using a diphasic medium and compared the results with HM-1:IMSS grown under axenic conditions in a monophasic medium. As proteolytic enzymes are considered to be involved in nutrient uptake and digestion, differences in the composition of the culture medium, and in particular the presence of microorganisms, should considerably influence expression of peptidase genes in Entamoeba. However, questions remain about possible functions of all the different peptidases present in E. histolytica. At least some of them may be involved in encystation or excystation processes, as described for E. invadens or for a cathepsin C-like peptidase of G. lamblia, which is involved in processing of cyst-wall specific proteins [44]. Aggressive and invading Entamoeba trophozoites should be endowed with adequate mechanisms that ensure their protection against host defence strategies. In this study, the trophozoites were exposed to a temporary heat stress, which partly mimics the situation during tissue invasion.
Heat-stressed amoebae revealed downregulation of the genes for EhCP-A1 and EhCP-A2 and elevated expression of the genes for EhCP-A5, EhCP-A6 and EhMP8-2, which is in accordance with a recent report by Weber and colleagues [31]. As EhCP-A6 and EhMP8-2 are expressed at very low levels during in vitro cultivation, these enzymes are apparently not essential for parasite growth, at least under standard culture conditions. It has been postulated that the upregulation of the gene for EhCP-A6 during heat stress is due to its potential role in the degradation of damaged proteins [31]. Recently, in a few other studies, regulation of peptidase expression in response to various conditions has been described. In HM-1:IMSS clone L6, which is deficient in virulence, phagocytosis and cysteine peptidase activity, expression of the genes for EhCP-A1, EhCP-A2 and EhCP-A5 was significantly decreased [45]. In contrast, during intestinal colonisation expression of the genes for EhCP-A1, EhCP-A4 and EhCP-A6 was found to be increased [46]. This further highlights the importance of peptidases for E. histolytica pathogenicity.

Conclusion
Under standard culture conditions only a relatively small number of the at least 86 identified peptidase genes is expressed, and only very few variations become apparent between various clinical E. histolytica isolates. Nevertheless, here and in a few other studies, it was shown that peptidase expression can be regulated in response to various conditions. Therefore, further studies are necessary to understand the role of all or at least most of the peptidases in the biochemistry and especially in the virulence of E. histolytica.

E. histolytica isolates and parasite culture
Seven E. histolytica isolates were used in this study. Strain HM-1:IMSS was isolated in 1967 from a patient with amoebic dysentery, strain NIH:200 was isolated in 1949 from a patient with colitis, strain HK-9 was isolated from a patient with amoebic dysentery (year unknown), strain DRP was isolated in 1985 from a patient with an amoeboma, strain EGG was isolated in 1988 from a patient with colitis and amoebic liver abscess, and strain 452 was isolated in 1983 from an asymptomatic carrier. The origin of strain 32 is unknown. HM-1:IMSS, NIH:200 and HK-9 are standard laboratory strains obtained from the American Type Culture Collection. They were originally isolated in Mexico, India and Korea, respectively. All other strains were isolated in Brazil and kindly provided by Prof. E. F. Silva, University of Minas Gerais, Belo Horizonte, Brazil. Different genotypes of the E. histolytica strains were confirmed by PCR-based genotyping based on variation in the numbers of short tandem repeats that are linked to E. histolytica tRNAs [47]. Trophozoites of the various isolates were cultured axenically in TYI-S-33 medium supplemented with 10% adult bovine serum [48]. Cells were harvested by chilling on ice and subsequent centrifugation at 430 × g at 4°C for 5 min. The resulting pellet was washed twice with phosphate-buffered saline (6.7 mM Na2HPO4, 3.3 mM NaH2PO4, 140 mM NaCl, pH 7.2). For heat shock experiments the incubation temperature of the cultures was shifted from 36°C to 42°C for 4 hours.

Identification of peptidase homologues of E. histolytica
Conserved domains of cysteine, serine, metallo or aspartic peptidases were used for homology search [49] against the E. histolytica genome as provided by The Sanger Centre and The Institute for Genomic Research [50][51][52].
With the help of MEROPS [53] the identified enzymes were grouped into the corresponding peptidase clans and families.

Microarray design
For microarray experiments a 60-base oligonucleotide array was designed containing probes for 79 of the 86 identified putative peptidase genes. The various oligonucleotides have similar GC contents of 35.5% and an average Tm of 71.6°C, with a standard deviation of 1.17 (range 66-74°C). The oligonucleotides were designed and synthesized by Eurogentec. Each oligonucleotide was printed in quadruplicate on glass slides (Advalytix Epoxy AD100) at a concentration of 50 μM. The spotting procedure was done in cooperation with the University of Marburg (Genomic Solutions OmniGrid). The oligonucleotide sequences are listed [see Additional file 1].
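The GC content quoted for the probes is the percentage of G and C bases in each oligonucleotide. A minimal sketch of that calculation follows; the probe sequence is a hypothetical 60-mer constructed for illustration and is not one of the probes on the array. The method used to estimate Tm is not stated in the text and is therefore not reproduced here.

```python
def gc_content(oligo):
    """Return the GC content of a DNA oligonucleotide in percent."""
    seq = oligo.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("expected a non-empty sequence of A, C, G and T")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

# Hypothetical 60-base probe (a 20-base unit repeated 3 times), illustration only.
probe = "ATGCATTGCAATTGACTTAC" * 3
print(len(probe), gc_content(probe))
```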

RNA isolation, microarray hybridization, sample labelling, and visualization
Total amoeba RNA was isolated using TRIZOL reagent (Invitrogen) according to standard protocols. For microarray analysis 5 μg of total RNA was used. Two biological replicates including dye swap experiments were performed. The reverse transcription of RNA into cDNA was performed according to the protocol of the Atlas Superscript Fluorescent Labeling Kit (TaKaRa), followed by indirect labelling.
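Dye swap experiments are commonly evaluated by averaging the log ratios of the two hybridizations so that a dye-specific bias cancels out. The sketch below illustrates this general idea only; how the dye swap data were actually combined in this study is not described in the text above, and the function name and signal values are hypothetical.

```python
import math

def dye_swap_log_ratio(forward, swapped):
    """Combine a dye-swap pair into one log2(sample/reference) value.

    forward: (sample_signal, reference_signal), sample labelled with dye 1
    swapped: (sample_signal, reference_signal), sample labelled with dye 2
    Averaging the two log ratios cancels a multiplicative dye bias,
    because the bias enters once in the numerator and once in the denominator.
    """
    m_forward = math.log2(forward[0] / forward[1])
    m_swapped = math.log2(swapped[0] / swapped[1])
    return (m_forward + m_swapped) / 2

# Hypothetical signals: true ratio 4, with dye 1 reading 1.5 times too bright.
forward = (4 * 100 * 1.5, 100)   # sample in dye 1, reference in dye 2
swapped = (4 * 100, 100 * 1.5)   # sample in dye 2, reference in dye 1
m = dye_swap_log_ratio(forward, swapped)
print(m, 2 ** m)
```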