Simultaneous gene expression profiling in human macrophages infected with Leishmania major parasites using SAGE

Background: Leishmania (L) are intracellular protozoan parasites that are able to survive and replicate within the harsh and potentially hostile phagolysosomal environment of mammalian mononuclear phagocytes. A complex interplay then takes place between the macrophage (MΦ) striving to eliminate the pathogen and the parasite struggling for its own survival. To investigate this host-parasite conflict at the transcriptional level, in the context of the infection of monocyte-derived human MΦs (MDM) by L. major metacyclic promastigotes, the quantitative technique of serial analysis of gene expression (SAGE) was used.
Results: After extracting mRNA from resting human MΦs, Leishmania-infected human MΦs and L. major parasites, three SAGE libraries were constructed and sequenced, generating up to 28,173; 57,514 and 33,906 tags respectively (corresponding to 12,946; 23,442 and 9,530 unique tags). Using computational data analysis and direct comparison to 357,888 publicly available experimental human tags, the parasite and the host cell transcriptomes were then simultaneously characterized from the mixed cellular extract, confidently discriminating host from parasite transcripts. This procedure led us to reliably assign 3,814 tags to MΦ transcripts and 3,666 tags to L. major parasite transcripts. We focused on those showing significant changes in their expression that are likely to be relevant to the pathogenesis of parasite infection: (i) human MΦ genes encoding key immune response proteins (e.g., IFNγ pathway, S100 and chemokine families) and (ii) a group of Leishmania genes showing preferential expression at the parasite's intracellular developing stage.
Conclusion: Dual SAGE transcriptome analysis provided a useful, powerful and accurate approach to discriminating genes of human or parasitic origin in Leishmania-infected human MΦs. The findings presented in this work suggest that the Leishmania parasite modulates key transcripts in human MΦs that may be beneficial for its establishment and survival. Furthermore, these results provide an overview of gene expression at two developmental stages of the parasite, namely metacyclic promastigotes and intracellular amastigotes, and indicate a broad difference between their transcriptomic profiles. Finally, our reported set of expressed genes will be useful in future rounds of data mining and gene annotation.



Background
Co-evolution of humans and pathogens has exerted a dual selective pressure on the immune system of the host that strives to control infection and on the pathogens, which have developed various strategies to circumvent the host's immune responses.
Leishmania (L) parasites are obligate intracellular pathogens that preferentially invade macrophages (MΦs) where they replicate, ultimately causing a heterogeneous group of diseases that affects millions of people, mainly in subarid, tropical and subtropical areas [1]. Despite their wide distribution, the leishmaniases remain embedded in impoverished populations and represent a paradigm of neglected diseases [2].
To establish infection, the flagellated metacyclic promastigotes must enter MΦs and avoid triggering host responses. Since MΦs play a dual function in infection, acting as a safe shelter for parasites but also as their ultimate killer, these cells are the alpha and the omega for host resistance or susceptibility to Leishmania infection. Cellular events occurring early during MΦ-parasite interactions are likely to influence the fate of infection. MΦs are able to secrete a remarkably diverse set of regulators known to influence the physiological functions and differentiation of neighboring cells. Thus, activation of innate immunity, through migrating parasitized dendritic cells, is required to trigger an adaptive immune response of the Th1 type. The latter induces interferon (IFN) γ-activated MΦs to kill Leishmania parasites, promote disease healing and regulate resistance to re-infection as well as vaccine-induced immunity [3].
Leishmania have developed a range of sophisticated mechanisms to subvert the leishmanicidal activities of MΦs, by altering gene expression for cytokines, chemokines, transcription factors, membrane receptors and molecules involved in signal transduction in infected cells [4,5]. Although a wealth of crucial information has already been reported on the matter, it generated only a segmented view that hardly recognizes the full value of the biological consequences of this host-parasite conflict on a more global scale.
There is therefore a need for a high-throughput approach that generates a global view, in order to identify the salient modifications of the biological pathways triggered by intracellular parasitism. Applying transcriptomics to study host-pathogen interactions has already contributed important insights to the understanding of the mechanisms of pathogenesis, and it is expanding further with the accumulation of genomic sequences of host organisms (e.g., human) and their pathogens [6]. Indeed, several studies analyzing the human MΦ transcriptome upon viral [7], bacterial [8] or fungal [9] infections have been published. However, to our knowledge, only one study, using the microarray technique, has described the effect of L. major infection on the transcriptome of human MΦs [10]. More recently, a paper has described, at the global scale, the abrogation of IFNγ gene expression by this parasite species in the human monocytic THP1 cell line [11].
Compared to other transcriptomic methods, Serial Analysis of Gene Expression (SAGE) technology has proved to be a powerful tool for the quantitative cataloguing and comparison of genes expressed in cells or tissues from various physiological and pathological conditions. Additionally, SAGE allows one to study the expression profiles of both known and unknown genes and, as a result, contributes to better genome annotation [21]. This technology was successfully applied to study the transcriptome of different parasites, e.g., Plasmodium falciparum [22], Schistosoma mansoni [23] and Trypanosoma congolense [24], among others [21].
As far as we know, this is the first study using the SAGE strategy that provides a high-throughput simultaneous analysis of gene expression in the context of the Leishmania-human MΦ encounter. Although the impact of the parasite on the human transcriptome appeared globally marginal, we identified several genes corresponding to diverse functional pathways that were differentially expressed upon infection, suggesting their likely involvement in the infectious process. Interestingly, we identified genes involved in complement or IFNγ pathways, and others belonging to the S100 protein, MHC molecule, apoptosis, cytokine and chemokine families. Concurrently, our SAGE analysis unveiled a deep variation in parasite transcript abundance; such characterized transcripts could contribute to understanding the dynamics of gene expression in the intracellular parasite stage.

Data analysis allows good discrimination between human and parasite tags generated in the same "MDM+Lm" mixed SAGE library
To identify the transcripts that were modulated upon infection, we compared the three libraries that were constructed. We found that the MDM and Lm libraries had 194 tags in common and that 3,857 tags were shared by the "MDM+Lm" and Lm libraries. In addition, 2,535 tags were common to the MDM and "MDM+Lm" libraries (Figure 1A). Unexpectedly, this initial analysis showed that a large number of tags were specifically present in the "MDM+Lm" library.
In order to more confidently assign these tags to a human origin, they were compared to an assembled composite matrix containing a total of 26,176 unique tags and built from (i) nine publicly available leukocyte SAGE libraries that were generated from freshly isolated monocytes, M-CSF-differentiated, GM-CSF-differentiated and LPS-activated cells, immature and mature monocyte-derived dendritic cells and unfractionated populations of leukocytes, (ii) the non-infected MDM library (this paper) and (iii) a second in-house-generated MDM-M library (Ottones et al., unpublished data; see Methods section for details).
As shown in the Venn diagram in Figure 1B, only 191 tags were shared by all libraries (ABC subset) and 196 tags (AC subset) were shared by the Lm library (referred to as C) and the non-infected human leukocyte libraries (referred to as A). This low level (1.4%) of synonymy between human and parasite tags indicated that we could accurately discriminate human transcripts from parasite transcripts in the infected "MDM+Lm" sample (referred to as B). It cannot be excluded that some of the 3,814 human tags sorted in AB might correspond to parasite tags specifically expressed by the amastigote stage and absent in the promastigote-derived library. However, in this case we assumed that their number was likely to be in the same range as in AC and ABC, so that most AB tags could be reasonably considered as MΦ-specific.
It must be stressed here that the apparently large number of "MDM+Lm"-specific tags must be interpreted with caution because most of them were observed only once and may result from sequencing errors, the major source of noise inflating the number of unique tags. Indeed, when we reanalyzed the data excluding tags that appeared only once (unless present at least twice in another library), we ended up with 3,506 human tags sorted in AB and 2,960 tags common to the Lm and "MDM+Lm" libraries (BC subset, Figure 1D). Interestingly, when tags observed only once were excluded, the number of tags present only in the "MDM+Lm" library dropped from 15,771 to 1,678. In fact, most of the tags occurring at high frequencies in the "MDM+Lm" library were sorted into the AB or BC subsets. Since these tags could be safely considered as identifying human (AB) or parasite (BC) transcripts, their respective frequencies could be taken as representative of the actual proportions of the two species' transcripts in the initial mRNA inputs. Thus, we estimated that human and parasite mRNAs contributed 51% and 49% (54% and 46% for tags > 1), respectively, of the "MDM+Lm" sample.
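The subset assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 10-letter toy tags and their counts are hypothetical, and the singleton rule is our reading of "excluding tags that appeared only once (unless present at least twice in another library)". The original analysis was carried out in a database management system, not with this code.

```python
def venn_subsets(human, infected, parasite):
    """Group tags by the libraries they occur in.

    human (A), infected (B) and parasite (C) map tag sequence -> count.
    Returns a dict such as {"AB": {...}, "BC": {...}, "ABC": {...}}.
    """
    libraries = (("A", human), ("B", infected), ("C", parasite))
    subsets = {}
    for tag in set(human) | set(infected) | set(parasite):
        label = "".join(name for name, lib in libraries if lib.get(tag, 0) > 0)
        subsets.setdefault(label, set()).add(tag)
    return subsets


def drop_singletons(*libraries):
    """Drop tags seen once, unless some library counts them at least twice."""
    keep = {tag for lib in libraries for tag, n in lib.items() if n >= 2}
    return [{tag: n for tag, n in lib.items() if tag in keep} for lib in libraries]


def species_shares(infected, subsets):
    """Human (AB) and parasite (BC) shares of the unambiguous infected-library counts."""
    human = sum(infected[tag] for tag in subsets.get("AB", ()))
    parasite = sum(infected[tag] for tag in subsets.get("BC", ()))
    total = human + parasite
    return human / total, parasite / total


# Hypothetical toy counts, for illustration only.
human_ref = {"AAAAAAAAAA": 5, "CCCCCCCCCC": 2, "TTTTTTTTTT": 1}
mixed = {"AAAAAAAAAA": 6, "GGGGGGGGGG": 4, "TTTTTTTTTT": 1, "ACGTACGTAC": 1}
promastigote = {"GGGGGGGGGG": 9, "TTTTTTTTTT": 3}

subsets = venn_subsets(human_ref, mixed, promastigote)
human_share, parasite_share = species_shares(mixed, subsets)
filtered = drop_singletons(human_ref, mixed, promastigote)
```

In this toy case the tag shared by all three libraries falls into ABC and is left out of the species shares, mirroring the way ambiguous tags are set aside in the text.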

Impact of L. major infection on MΦ transcriptome
To investigate MΦ tags that were modulated by Leishmania infection, we compared total tags present in the MDM library to those present in the "MDM+Lm" library, after withdrawing tags of parasitic origin. Most tags were expressed at similar levels between resting and Leishmania-infected MDM. A semi-logarithmic plot (Figure 2) showed that both up- and down-modulated tags were distributed within a bell-shaped symmetric curve, though with a tail for the tags up-regulated 12-16 times. This ratio profile
Figure 1. Venn diagram comparing the parasite-infected MDM and L. major SAGE libraries with the MDM non-infected library or with other publicly available leukocyte libraries.
Starting with the matrix registering initial SAGE data, we recalculated tag frequencies in each of the 11 libraries, normalizing them to 10,000 counts and rounding to the nearest integer. For every tag, the sum of normalized frequencies was calculated and tags were discarded for values less than 2. The resulting matrix (2,918 rows) was split into two parts: the first registering the 500 tags with the highest sum of frequencies (Top500) and the second registering tags with lower frequencies (2,418).
Using Principal Component Analysis, observable either on 2D (not illustrated) or 3D graphs (Figure 3), landscapes generated with the Top500 dataset showed that the closest relationship was between "MDM+Lm" and their
Taking these data sets as a whole, MDM infected with L. major parasites showed a transcriptional profile closer to that of non-infected cells but clearly different from that of LPS-activated MΦs. In addition, the expression profiles of MDM, whether infected or not, were the closest to those of GM-CSF- and M-CSF-elicited cells.
Figure 2. Comparison of gene expression modulation in the L. major-infected "MDM+Lm" library to MDM library. A semi-logarithmic plot shows that both up- and down-modulated tags were distributed within a bell-shaped curve except for a tail corresponding to tags upregulated 12 to 16 times. The relative expression of each transcript was determined by dividing the number of tags observed in the MDM library by the number of the same tags observed in the "MDM+Lm" library. To avoid division by 0, we used a tag value of 1 for any tag that was not detectable. These ratios are plotted on the abscissa. The number of tag species comprising each ratio is plotted on the ordinate.
These results globally indicate that the internalization of viable Leishmania parasites in macrophage and their intracellular multiplication appear to induce only minor changes in the basal transcriptome profile with no indication of an obvious inflammatory response. Nevertheless, a detailed comparison of MDM and "MDM+Lm" profiles revealed changes that might be biologically relevant to the infectious process.
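The normalization, filtering and ratio steps described in this section can be illustrated with the following Python sketch. The function names and toy libraries are hypothetical; only the scaling to 10,000 counts, the discard threshold of 2, the Top500 split and the substitution of 1 for undetected tags come from the text. Note that Python's `round` resolves exact halves to the nearest even integer, which may differ from the rounding used in the original analysis.

```python
def normalize_per_10k(library):
    """Rescale raw tag counts to a library size of 10,000 and round to an integer."""
    total = sum(library.values())
    return {tag: round(10_000 * n / total) for tag, n in library.items()}


def build_matrix(libraries, min_sum=2, top_n=500):
    """Return (rows, top_tags, other_tags) for a list of raw tag-count libraries.

    rows maps each retained tag to its normalized frequency in every library;
    tags whose summed normalized frequency is below min_sum are discarded.
    """
    normalized = [normalize_per_10k(lib) for lib in libraries]
    tags = set().union(*normalized)
    rows = {tag: [lib.get(tag, 0) for lib in normalized] for tag in tags}
    rows = {tag: freqs for tag, freqs in rows.items() if sum(freqs) >= min_sum}
    # Rank by decreasing summed frequency; ties are broken alphabetically.
    ranked = sorted(rows, key=lambda tag: (-sum(rows[tag]), tag))
    return rows, ranked[:top_n], ranked[top_n:]


def expression_ratio(tag, mdm, infected):
    """MDM count divided by "MDM+Lm" count, using 1 for undetected tags."""
    return max(mdm.get(tag, 0), 1) / max(infected.get(tag, 0), 1)


# Hypothetical toy libraries, for illustration only.
lib_a = {"AAAA": 6000, "CCCC": 3999, "GGGG": 1}
lib_b = {"AAAA": 50, "CCCC": 50}
rows, top_tags, other_tags = build_matrix([lib_a, lib_b], top_n=1)
ratio = expression_ratio("AAAA", {"AAAA": 8}, {})
```

Here the tag with a summed normalized frequency of 1 is discarded, and the remaining tags are split into the top-ranked set and the rest, as in the Top500 procedure.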

Quantitative PCR experiments confirmed the changes in gene expression detected in human SAGE libraries
Human tags were assigned to their corresponding genes using Preditag® software [26] and BLAST and then to their related biological processes using Gene Ontology [27]. Quantitative real-time PCR was then used to assess the accuracy of the generated data. Several candidate families of genes showing differential expression patterns in our human SAGE libraries were selected. To compare the Q-PCR and SAGE data, "MDM+Lm"/MDM expression ratios were calculated (Table 2).
On the whole, data generated by SAGE or Q-PCR showed a good concordance between the trends (up- or down-regulation) of expression ratios for 83% of the genes tested, although the response measured by the two techniques might differ in magnitude. The best correlations between SAGE and Q-PCR data were observed for the genes that were abundantly expressed.
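A trend concordance of this kind can be computed as the fraction of genes whose "MDM+Lm"/MDM ratio points in the same direction with both techniques. The sketch below is an assumption about how such a figure might be obtained, not the authors' procedure; the gene labels and ratios are placeholders, not values from Table 2.

```python
def direction(ratio):
    """Return +1 for up-regulation, -1 for down-regulation, 0 for no change."""
    return (ratio > 1) - (ratio < 1)


def trend_concordance(sage, qpcr):
    """Fraction of shared genes whose expression ratios agree in direction."""
    genes = sage.keys() & qpcr.keys()
    agree = sum(direction(sage[g]) == direction(qpcr[g]) for g in genes)
    return agree / len(genes)


# Placeholder ratios (infected / uninfected), for illustration only.
sage_ratios = {"g1": 2.3, "g2": 0.4, "g3": 5.0, "g4": 0.2, "g5": 1.8, "g6": 0.7}
qpcr_ratios = {"g1": 1.6, "g2": 0.8, "g3": 9.1, "g4": 0.5, "g5": 0.9, "g6": 0.6}
concordance = trend_concordance(sage_ratios, qpcr_ratios)
```

Because only the direction is compared, two techniques can agree on the trend while differing in magnitude, which is the situation described in the text.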

L. major infection induces a discreet but selective change in human MΦ transcripts
Following tag annotation, we used the STRIPE software [28] to screen for any spatial clustering across the human genome (Additional file 4: Spatial Clustering across the Human genome of tags extracted from MDM and "MDM+Lm" libraries). Statistical analysis of transcripts did not show any specific clustering of up- or down-modulated genes across the human chromosomes. Further analysis showed that the response of MDM to Leishmania infection is characterized by the expression of genes encoding proteins involved in several biological processes (Figure 4 and Additional file 5: Extended names of abbreviated genes).

Complement activation
We first focused our analysis on genes involved in innate immunity such as complement components. In vivo opsonization of Leishmania promastigotes by C3b and C3bi permits the interaction with the MΦ complement receptors 1 (CR1) and 3 (CR3), respectively. In addition, it is known that C1qA and C1qB molecules are highly upregulated by activation. Our results showed a drastic inhibition of gene transcription of these latter two proteins upon MDM infection. Several other transcripts, such as C5R1, C2, C1qG or RGC32, were also down-regulated after L. major infection.

S100 proteins
Recently, a novel group of calcium-binding molecules, namely the phagocytic S100 proteins, was described as pro-inflammatory factors. These endogenous damage-associated molecular pattern (DAMP) molecules, also called alarmins, play an important role in innate immunity. Our results showed that S100A6, S100A8 and S100A9 transcripts were repressed upon L. major infection. The S100A8/A9 complex has been shown to play an important role in phagocyte NADPH oxidase activation, which contributes to intracellular parasite killing. Their inhibition could be a way for L. major to avoid reactive oxygen intermediate (ROI) killing. Two other S100 family members were up-regulated (i.e., S100A10 and S100A11). These two proteins are described as interacting with the N-termini of annexins A1 and A2, forming a sophisticated Ca2+ sensing system. Annexin A2, which acts as a receptor of plasmin, a potent pro-inflammatory activator of human monocytes, was down-regulated two-fold. The identification of several transcripts of this family modulated by Leishmania suggests a novel mechanism of inflammation and tissue damage in infected MΦs.

MHC class I and class II molecules
We then investigated major histocompatibility complex (MHC) genes after L. major infection. Class II antigen-processing genes, including some cathepsins (i.e., cathepsin S or cathepsin C), and genes involved in class II presentation such as CD74, Human Leukocyte Antigen (HLA)-DP, HLA-DQ, HLA-DR and HLA-DM, were repressed in infected MDM relative to uninfected cells.
In contrast to the class II pathway, genes involved in antigen processing and presentation via the MHC class I pathway (i.e., β2-microglobulin, tapasin, HLA-C, HLA-F and HLA-G) were not altered by parasite infection at 24 h, except for HLA-B and calnexin genes, which were downregulated, and HLA-A, which was up-regulated.

Interferon (IFN)γ pathway
Inhibition of the IFNγ pathway appears to be a mechanism that is widely used by different pathogens to subvert the host responses [29]. IFNγ, a potent inducer of MHC class II expression in MΦs and hence of antigen presentation, once bound to its receptor, leads to STAT1 phosphorylation and translocation to the nucleus and to IFN regulatory factor (IRF) activation, which play a key role in the induction of a large set of MΦ effector molecules involved in host defense and inflammation. Our results showed a significant decrease in IFNGR2, STAT1 and IRF1 transcripts in parasite-infected MDM.

Apoptosis
Programmed cell death plays a pivotal role in normal tissue development and in pathological conditions [30]. Interestingly, Leishmania inhibits host cell apoptosis pathways in order to favor its own multiplication [31]. We annotated several tags as apoptotic and anti-apoptotic family members. Transcripts of caspase 3 (CASP3), Acyl coenzyme A-binding protein (DBI), death inducer-obliterator-1 (DIDO1) and Bcl2-related protein A1 (BCL2A1), which encode proteins that are pro-apoptotic or induced by apoptosis, were down-modulated upon infection. In addition, an anti-apoptotic gene transcript called defender against cell death-1 (DAD1) was slightly induced.

Cytokines and chemokines
We finally focused on cytokine and chemokine transcripts. Several were up-regulated upon infection, e.g., IL-8, CXCL2, CXCL3 or NFIL3. On the other hand, we noted that the mRNA expression levels of different chemokines and chemokine receptors, i.e., CCR2, CCL5, CCL17, CXCL9, CXCL10, CCL4L2 or CKLFSF3, were drastically inhibited upon infection. As expected, the transcripts of the pivotal cytokine IL10 were strongly up-regulated (from 101 to 233 occurrences), even though the assigned tag did not correspond to the tag directly following the poly-A signal. However, we were not able to unambiguously assign tags corresponding to other cytokines classically reported to be altered after Leishmania infection (i.e., IL12, IL18 or TNFα).

Transcriptome analysis of extra-and intracellular specific stages of L. major parasites
Our analysis also included the study of Leishmania transcriptome alterations once parasites were exposed to the phagolysosomal intracellular environment and transformed into amastigotes. We focused on (i) the most highly expressed transcripts at the metacyclic stage and (ii) differentially expressed tags between the intracellular and extracellular stages of the L. major parasite.
Figure caption: Examples of gene transcripts categorized into functional classes involved in defense MΦ programs.

Annotation of L. major tags from the metacyclic parasite SAGE library
A total of 33,906 tags corresponding to 9,530 unique tags were generated from the metacyclic stage of the L. major library (Table 1). The 106 most abundant ones represented 1.1% of the total number of unique tags (106/9,530) but totaled up to 40% of the entire collection of parasite tags (13,636/33,906).
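The skew toward a few abundant tags can be quantified with a small helper that returns both the share of tag species and the share of total counts covered by the n most abundant tags. The function name and the toy library are hypothetical; the last two lines simply recompute the percentages quoted above from the reported numbers.

```python
def top_share(library, n):
    """Return (share of tag species, share of total counts) for the n most abundant tags."""
    counts = sorted(library.values(), reverse=True)
    return n / len(counts), sum(counts[:n]) / sum(counts)


# Hypothetical toy library: one dominant tag among four species.
toy = {"AAAA": 7, "CCCC": 1, "GGGG": 1, "TTTT": 1}
species_share, count_share = top_share(toy, 1)

# Figures reported in the text for the metacyclic library.
reported_species_share = 106 / 9_530      # about 1.1% of unique tags
reported_count_share = 13_636 / 33_906    # about 40% of all tags
```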
Tag-to-gene mapping was done for these most abundant tags (Table 3) and for all tags of the L. major GLC94 library using BLAST against the Friedlin L. major genome. This snapshot of the major parasitic transcripts showed that 35 out of 106 tags (33%) mapped unambiguously to their genes with 100% sequence identity (downstream stop codon). Twenty-seven tags mapped to a unique gene and 8 mapped to two or more genes belonging to the same family. Among these assigned tags, 32 tags were located in the 3' region, downstream of the stop codon, and three matched inside the CDS. Finally, from the tags present at least twice (3,163 tags), we were able to assign 1,068 tags to their genes (Additional file 6: "Tag to gene assignation" of all parasitic tags present at least twice and extracted from Lm library).
Through this annotation, several transcripts were found to encode ribosomal proteins, including 40S, 60S, L1a, L27, I3, S25 and S27a ribosomal proteins. This analysis also revealed the abundant expression of mRNA encoding histones H1, H2A and H3, ubiquitin-related proteins, tubulin and microtubule-associated protein, among others (Table 3).

Preferential expression of transcripts in L. major parasites at the intracellular stage
When comparing libraries, we found 3,666 tags (2,960 of them with more than one copy) co-expressed in the "MDM+Lm" and Lm libraries but absent from the other human libraries (Figure 1B and 1D). Statistical analysis and fold increase levels revealed that 697 of these tags were differentially expressed between metacyclic promastigotes (Lm library) and intracellular parasites ("MDM+Lm" library), with 420 tags preferentially expressed by intramacrophagic parasites (p < 0.05 for 193, and p ≥ 0.05 but with a fold increase greater than 3.5 for 227 of them).
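The two-criterion selection described above (significance, or otherwise a fold increase above 3.5) can be expressed as a simple filter. This is a minimal sketch under stated assumptions: p-values are taken as precomputed inputs because the statistical test is not specified in this passage, undetected tags are given a frequency of 1 by analogy with the Figure 2 convention, and the function name and toy records are hypothetical.

```python
def preferentially_intracellular(records, alpha=0.05, min_fold=3.5):
    """Select tags that are higher in intracellular parasites.

    records: iterable of (tag, promastigote_freq, intracellular_freq, p_value),
    with frequencies already normalized to a common library size.
    A tag is kept if it increases and is either significant or above min_fold.
    """
    selected = []
    for tag, promastigote, intracellular, p_value in records:
        # Assumption: treat undetected tags as frequency 1 to avoid division by zero.
        fold = max(intracellular, 1) / max(promastigote, 1)
        if fold > 1 and (p_value < alpha or fold > min_fold):
            selected.append(tag)
    return selected


# Hypothetical toy records, for illustration only.
toy_records = [
    ("tag1", 2, 10, 0.20),   # not significant, fold 5: kept
    ("tag2", 10, 20, 0.01),  # significant, fold 2: kept
    ("tag3", 10, 20, 0.30),  # not significant, fold 2: dropped
    ("tag4", 20, 5, 0.001),  # significant but decreased: dropped
    ("tag5", 0, 4, 0.10),    # undetected in promastigotes, fold 4: kept
]
selected_tags = preferentially_intracellular(toy_records)
```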
Tag-to-gene mapping of these 420 tags showed that 113 (27%) of them were unambiguously mapped to their genes. Among them, only 105 tags mapped to a unique gene and eight mapped to genes within the same family.
This analysis also revealed differential expression of mRNA encoding different proteins, including an amastin-like protein, histones H1 and H2B, tubulin tyrosine ligase, Rieske iron-sulfur protein precursor and several ribosomal proteins (Table 4). Interestingly, 71 of the assigned tags corresponded to hypothetical proteins with conserved domains and/or unknown functions.

Stage-specific preferential expression of parasite transcripts is confirmed by quantitative PCR experiments
We used quantitative real-time PCR to validate the accuracy of the SAGE data generated. Q-PCR was also performed on cDNA obtained from amastigote-like axenic parasites of L. major.
By comparing "MDM+Lm"/MDM tag ratios, Q-PCR showed the same trend towards up-regulated expression for all but one of the selected transcripts in intracellular parasites compared to L. major metacyclic promastigotes (Table 4). Unexpectedly, 33% of the tested transcripts that were up-regulated in intracellular amastigotes, using SAGE and Q-PCR technologies, were down-regulated in the amastigote-like parasites obtained by culture in axenic conditions. This result suggests that the transcriptome profile of L. major amastigote-like axenic parasites may not reproduce the profile expressed by the naturally induced intracellular amastigote stage and that the biological results obtained with the former parasite form should be cautiously extrapolated to the latter.

Discussion
Genome-wide expression profiling offers new perspectives for studying host-pathogen interactions to decipher, at the transcriptional level, how host cells react to infection and how pathogens adapt to their host's microenvironment. In the present study, we took advantage of SAGE to analyze the transcriptomes of both the infected MΦ and the intracellular parasite Leishmania using a one-step approach. Our working hypothesis was that, having extracted the bulk of mRNA molecules from a co-culture of parasites and infected MΦs, it would be possible to separate, in the resulting SAGE library, the respective contributions of each organism to the mixed collection of tags. The proportion of ambiguous gene signatures was found to be lower than 1.5%, confirming the validity of this approach. Such unambiguous tag species identification would be more difficult to reach using alternative high-throughput transcriptomic methods, such as microarrays, due to the difficulties in assessing the extent of cross-hybridization between the human and the parasite transcripts.
Separating the contribution of both organisms in an infected MΦ SAGE library raised no technical problems and could be performed on a desktop computer using the functions of a commercial database management system (MS-Access). To distinguish tags according to their origin, we considered that merging all publicly available leukocyte libraries would generate a set of tags that are representative of human transcripts. Despite this extended coverage, it is clear that the deconvolution of both transcriptomes could not be complete, since unmatched tags that could not be ascribed to either of the two species (H. sapiens or L. major) may either correspond to very specific human transcripts expressed only in Leishmania-infected MΦs and never generated elsewhere or may reveal stage-specific parasite transcripts strictly specific to the intracellular stage. This problem was pointed out in a recent study [32], suggesting that the human genome might actually contain twice as many transcribed regions as currently annotated. Moreover, the ENCODE project consortium highlighted the number and complexity of the RNA transcripts generated compared to the small number of protein-coding genes (≈ 21,000) currently annotated on the human genome [33,34].
SAGE was used as a quantitative approach, to evaluate the expression levels of mRNAs and to calculate the respective amount of material from human or parasite origin. With the reasonable assumption that mRNAs originated only from living cells, our data demonstrated the importance (49%) of the parasitic load in infected cells.
In spite of this heavy parasitic burden, a salient feature emerged from multivariate statistics: that parasite infection has, at the global level, an apparent marginal impact (only 2.4% of the transcripts were found modulated) on the expression profile of infected MΦs. Thus, the mRNA profile of infected MΦs contrasted with that of monocytes exposed to LPS because it revealed many fewer alterations in gene expression.
However, although Leishmania parasites do not seem to induce dramatic changes in the transcriptional remodeling program of MΦs, a closer analysis detected physiologically significant alterations in gene transcription. Despite their discreetness, these alterations could harmfully weaken macrophages' microbicidal defense task and homing properties. Indeed, our analysis showed that several MΦ antiparasitic pathways were altered at the level of mRNA expression upon infection by L. major parasites. In particular, we were able to show that several members of the S100A family, among others, are up- or down-regulated by infection. This is in contrast to a previous study using microarray technology that reported almost stable signals between non-infected MΦs as compared to L. major-infected MΦs for this gene family [10]. Other differences in the expression levels of several chemokine family members were observed between the two studies, except for CXCL3 and IL8 transcripts, which were strongly upregulated.
Whether the discrepancies between the two approaches reflect differences in the experimental protocols used by the two studies (e.g., cell-parasite incubation time, parasite strains or human genetic variability) or are attributable to differences in the sensitivity of the two techniques to accurately quantitate the mRNA of expressed genes is unclear. It is noteworthy that our results concerning the IFNγ pathway are in agreement with those obtained recently by Dogra et al. in THP1-infected cells [11]. Indeed, transcripts of STAT-1, a key actor of this pathway, were drastically down-regulated at 24 h after infection, though there is no external activation by IFNγ. In addition, we found that several IFNγ-inducible chemokines (CXCL9 and CXCL10) were down-modulated. Since key proteins belonging to this pathway are also inhibited upon L. major infection (K Ben-Aissa, Personal communication), such effects render the MΦ refractory to any potential activation by IFNγ and obviously favor parasite survival. Other genes, among those involved in antigen presentation and implicated in the stabilization and the recycling of classical MHC class II and in the binding and the capture of antigens, were also down-modulated by L. major, as reported by Chaussabel et al. and Dogra et al. [10,11].
Our results also show that several genes encoding pro-inflammatory mediators were up-regulated, while other family members were down-modulated. This indicates that Leishmania have a remarkable capacity to specifically inhibit the transcription of several molecules associated with pro-inflammatory responses. It is notable that this peculiarity of L. major infection does not completely fit, in contrast to other pathogens (e.g., Mycobacterium tuberculosis, Listeria monocytogenes, Escherichia coli, Bordetella pertussis, Candida albicans), with the so-called "common host-transcriptional response" [35], stressing the particularity of this parasite. This is probably a survival mechanism whereby the parasites can inhibit a harmful inflammatory reaction in order to slip silently into the MΦ and successfully establish inside the host.
In addition to the analysis of the MΦ transcriptome, in the last 5 years, several studies have focused on the parasitic transcriptome taking advantage of the availability of L. major genome sequence [36]. Although this genome ( Hence, our tag-to-gene mapping for parasite transcripts was rather encouraging, compared to the number of sequenced tags. Indeed, we were able to list up to 900 tags expressed in at least two copies in the metacyclic promastigote stage but totally absent from the intracellular amastigote stage, generating useful data for better data mining. In addition, among the tags common to L. major promastigote and MΦ-infected libraries, 19% (697/3,666) were differentially expressed. This led us to estimate (without taking into account the tags that were specifically intracellular and present only in the infected MΦ library) the transcripts differentially expressed between the two parasitic stages at roughly 1,600 tags, representing approximately 20% of transcripts if one considers the 8,370 annotated Leishmania genes registered in the databases.
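The arithmetic behind these estimates can be retraced as follows. One step is an assumption on our part: the figure of roughly 1,600 tags is taken here to be the sum of the 697 differentially expressed shared tags and the approximately 900 promastigote-only tags, which the text does not state explicitly.

```python
shared_tags = 3_666            # tags common to the Lm and "MDM+Lm" libraries
differential_shared = 697      # of which differentially expressed
promastigote_only = 900        # "up to 900" tags absent from the intracellular stage
annotated_genes = 8_370        # annotated Leishmania genes cited in the text

# Share of the common tags that are differentially expressed (reported as 19%).
differential_share = differential_shared / shared_tags

# Assumed composition of the "roughly 1,600 tags" estimate.
stage_regulated = differential_shared + promastigote_only

# Fraction of annotated genes this would represent (reported as about 20%).
gene_fraction = stage_regulated / annotated_genes
```

Under this reading the estimate comes to 1,597 tags, or about 19% of the annotated genes, consistent with the rounded values given in the text.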
This figure is several-fold higher than those reported from a variety of Leishmania species (i.e., L. major [13][14][15]18], L. donovani [20], L. infantum [17] and L. mexicana [16]), which clearly show limited differences using microarrays (ranging from 0.2 to 5% of total genes) in stage-specific gene expression between the promastigote and amastigote life stages. These studies also show that the vast majority of genes are constitutively expressed [18,20,37]. One should note that these studies analyzed the amastigote transcripts, either using amastigote parasites derived from BALB/c lesions or axenic amastigotes obtained in vitro, whereas our study used the amastigotes derived from human MΦs.
However, while analyzing the functional significance of gene expression in Leishmania, we should consider that it is mainly regulated at the post-transcriptional level. As highlighted by Cohen-Freue et al. [37], the alteration in mRNA levels of regulated genes in Leishmania does not necessarily correlate with subsequent protein abundance. The functional significance is better manifested at the protein level, which is regulated by mechanisms such as stage-specific translational control, RNA stability, processing events and post-translational modifications. Nonetheless and despite these limitations, transcriptomic approaches for Leishmania could mainly help to better annotate its genome and to study the stability and translational regulation of its transcripts.

Conclusion
To our knowledge, we provide here for the first time a large-scale gene expression profile of both the infected human MΦ and the infective form of L. major using SAGE. This set of expressed genes deserves future rounds of data mining and experimental work, since it contains latent information about proteins likely to behave as antigens that could be evaluated as vaccine candidates. These data also provide the basis for studies in progress that aim to compare, at the molecular level, various strains of Leishmania known to differ in their physiopathological behavior. Thus, comparing viscerotropic strains, e.g., L. infantum or L. donovani, to strictly dermotropic strains, e.g., L. major, may reveal differences at the level of parasite-MΦ interactions that could indicate cellular targets of parasite virulence factors and help decipher the mechanisms of specific tissue tropism.

Methods
Parasite culture and preparation
An L. major isolate obtained from the field (MHOM/TN/95/GLC94) was used [38]. Parasites were cultured at 26°C without CO2 in endotoxin-free RPMI 1640 medium supplemented with 10% heat-inactivated fetal calf serum (HyClone Laboratories, Logan, UT, USA), 100 U/ml penicillin, 100 μg/ml streptomycin and 2 mM L-glutamine. Infective-stage metacyclic promastigotes were isolated from stationary culture (5-6 days old) by negative selection using peanut agglutinin (Sigma, Saint-Quentin Fallavier, France). Parasites were then harvested for RNA preparation or used to infect cells (5:1 parasite-to-cell ratio). Axenic amastigotes of L. major were obtained by shifting the incubation conditions of a saturated culture of L. major promastigotes from 26 to 37°C and pH 5.5 in a modified RPMI 1640 medium as described previously [39].

In vitro generation of human MΦs
Donors were selected as negative for any recent infection and with no history of leishmaniasis. Their peripheral blood mononuclear cells (PBMC) did not proliferate in vitro in response to Soluble Leishmania Antigens and they were not taking medication at the time of the study. Informed consent was obtained from all donors. The experimental protocol was approved by the institutional ethics committee of the Institute Pasteur of Tunis. Human PBMCs were isolated from leukopack peripheral blood mononuclear cells of four healthy volunteers using Ficoll-Paque (Pharmacia, Uppsala, Sweden) density gradient centrifugation. Cells were washed and resuspended at 10^6 cells/ml in RPMI 1640 medium supplemented with 2 mM L-glutamine, 100 U/ml penicillin, 100 μg/ml streptomycin and 10% autologous heat-inactivated serum. Monocytes were purified by fibronectin-mediated adhesion [40] using gelatin (Sigma) and autologous heat-inactivated serum substratum.

Human MΦs infection
The MDMs obtained were exposed to metacyclic parasites of L. major (MHOM/TN/95/GLC94 strain; parasite-to-cell ratio 5:1) for 24 h and then harvested for RNA preparation. To determine infection levels, an aliquot was taken from each culture, spun onto a glass microscope slide, and stained with Giemsa-May Grünwald. The percentage of infected cells was determined by microscopy, counting one hundred cells in triplicate for each slide.

RNA isolation
Cells or parasites were collected at the indicated time points by centrifugation, homogenized by Trizol reagent (Gibco BRL) and frozen at -70°C until RNA extraction. The RNA from each of the four donors was extracted independently then pooled, and used for library construction.
RNA was purified from contaminating genomic DNA using the standard protocol. Briefly, contaminating DNA was removed from total macrophage or parasite RNA using DNase I (Invitrogen, Carlsbad, CA, USA). The RNA samples were then ethanol-precipitated, washed once in 70% ethanol, and redissolved in water. RNA was quantified using a spectrophotometer. Examination of purified total RNA by gel electrophoresis revealed prominent 5S, 18S and 28S ribosomal bands for human samples and 18S, 24Sα and 24Sβ ribosomal bands for parasitic samples, indicating that the RNA was not degraded.

SAGE library construction
Libraries were constructed using the I-SAGE Kit (Invitrogen) according to the protocol developed by Velculescu et al. [41]. Briefly, a pool of mRNA samples was converted into cDNA using a biotinylated oligo(dT) primer linked to magnetic beads. The cDNAs were cleaved using the NlaIII anchoring enzyme. Digested cDNAs were split into two pools, each ligated with one of two adapters containing a restriction site for the BsmFI tagging enzyme. The two pools of tags obtained were ligated to one another and served as templates for PCR amplification. The PCR product, containing two tags linked tail to tail (a ditag), was then cleaved with the NlaIII anchoring enzyme, thus releasing the ditags (each tag being 14 bp long, CATG site included), which were then concatenated by ligation, cloned and sequenced.

Computer-based analysis of the SAGE libraries
The raw sequences obtained from concatemer clones were analyzed using PHRED [42] and trimmed for quality to eliminate erroneous tags as much as possible. Contaminating vector sequences or SAGE tags derived from linkers were then discarded using CROSS-MATCH software [43]. Experimental tag sequences were extracted using DIGITAG [26]. This software is written in PERL and runs on a UNIX workstation for automatic tag detection and counting. DIGITAG analyzes all concatemer sequences to discard ditags that are duplicated or whose length between the two CATG sites differs from 20 bp. Then, for each concatemer sequence, DIGITAG generates the reverse complement, adds it to the initial sequence, and extracts all CATGs plus the 10 following bases to obtain the tag sequences and determine their copy number in each library. P-value calculations and identification of differentially expressed genes were performed according to the procedure described by Piquemal et al. [26]. Tag levels were compared between the two MDM libraries generated in the absence or presence of parasites, or between the MDM-infected library and the metacyclic parasite library. Differentially expressed tags were selected for further analysis. Expression data were also analyzed using various modules of the TIGR MultiExperiment Viewer Package (MeV 4.0, 2006).
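The tag extraction step described above can be illustrated with a minimal Python sketch. It is not the DIGITAG code (which is written in PERL) and all function names are hypothetical. Two simplifications are assumed: duplicated ditags are detected across the whole library and their first occurrence is kept, and instead of appending the reverse complement of the whole concatemer, each retained ditag is read once on each strand, which yields the same tags for 20-bp ditags.

```python
from collections import Counter

ANCHOR = "CATG"   # NlaIII recognition site
TAG_LEN = 10      # bases kept after each anchor site
DITAG_LEN = 20    # expected length between two consecutive anchor sites

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_ditags(concatemer):
    """Return the sequences lying between consecutive CATG sites."""
    parts = concatemer.upper().split(ANCHOR)
    return parts[1:-1]  # drop the flanks outside the first and last site


def count_tags(concatemers):
    """Count 14-bp tags (CATG + 10 bases) over a list of concatemer reads."""
    counts = Counter()
    seen = set()
    for concatemer in concatemers:
        for ditag in split_ditags(concatemer):
            if len(ditag) != DITAG_LEN:
                continue  # wrong length: discarded
            # Strand-independent key, so a ditag and its reverse
            # complement are recognised as the same molecule.
            key = min(ditag, reverse_complement(ditag))
            if key in seen:
                continue  # duplicated ditag (assumption: keep the first copy)
            seen.add(key)
            # Left tag read on the forward strand, right tag on the reverse strand.
            counts[ANCHOR + ditag[:TAG_LEN]] += 1
            counts[ANCHOR + reverse_complement(ditag)[:TAG_LEN]] += 1
    return counts
```

Calling `count_tags` on the list of quality-trimmed concatemer sequences of one library returns a `Counter` mapping each tag to its copy number, which is the input needed for the library comparisons.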

Human leukocyte SAGE library collections
Following sequence analysis of SAGE libraries, data were assembled in a single matrix. We also collected up to 357,888 experimental tags from nine publicly available human leukocyte SAGE libraries (retrieved from [44][45][46]) and from a second in-house library generated from non-stimulated MΦs (raised in similar conditions but from a different pool of donors and noted MDM-M; Ottones et al., unpublished data). These SAGE libraries were generated from freshly isolated monocytes [47], M-CSF-differentiated [47], GM-CSF-differentiated [47] and LPS-activated [48] cells, immature [49] and mature [50] monocyte-derived dendritic cells and unfractionated populations of leukocytes [51], noted Mono, M-CSF, GM-CSF, LPS, IDC, MADC, leuk, WBC-N and WBC-Bc respectively. Data were assembled to build a matrix giving the expression levels of 26,176 unique tags.
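Assembling several libraries into one tag-by-library matrix can be sketched as follows. This is only an illustration: the function name, the input layout (one dictionary of tag counts per library) and the example counts are assumptions, and no normalization to library size is applied here.

```python
def build_matrix(libraries):
    """Assemble per-library tag counts into one matrix.

    libraries: dict mapping a library name to a {tag: count} dict.
    Returns (names, matrix), where matrix maps each tag to a list of
    counts ordered like names; tags absent from a library get 0.
    """
    names = list(libraries)
    all_tags = sorted(set().union(*(lib.keys() for lib in libraries.values())))
    matrix = {tag: [libraries[name].get(tag, 0) for name in names]
              for tag in all_tags}
    return names, matrix


# Hypothetical counts, for illustration only.
example = {
    "MDM": {"CATGAAAAAAAAAA": 12, "CATGCCCCCCCCCC": 3},
    "MDM+Lm": {"CATGAAAAAAAAAA": 5, "CATGGGGGGGGGGG": 7},
}
names, matrix = build_matrix(example)
```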

Human tag-to-gene mapping
Regular SAGE tags were identified as previously described [26]. Briefly, we constructed a reference database to compile tags predicted from collections of expressed sequences, including well-annotated sequences [52], reference sequences of UniGene clusters [53], SAGEmap tags [44] and the GenBank collection of human Alu sequences. The Preditag® software (Skuld-Tech, Montpellier, France) was also modified to register virtual tags matching the reverse complement of the sequences. We used its functions to automatically generate a table of results by matching experimental tags to virtual ones. Tags matching with 100% sequence identity were then ranked based on the fidelity of the source sequence. The first positions starting from the 3'-most end of the transcript were kept.
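The principle of matching experimental tags to virtual ones can be sketched in Python. This is a simplified illustration and not the Preditag® implementation: it only predicts the tag adjacent to the 3'-most CATG of each transcript, ignores reverse-complement tags and the ranking by source fidelity, and all function and gene names are hypothetical.

```python
ANCHOR = "CATG"  # NlaIII recognition site
TAG_LEN = 10     # bases kept after the anchor site


def virtual_tag(transcript):
    """Return CATG plus the 10 following bases at the 3'-most CATG, or None."""
    seq = transcript.upper()
    pos = seq.rfind(ANCHOR)
    if pos == -1 or pos + len(ANCHOR) + TAG_LEN > len(seq):
        return None  # no site, or fewer than 10 bases after it
    return seq[pos:pos + len(ANCHOR) + TAG_LEN]


def annotate(experimental_tags, transcripts):
    """Map each experimental tag to the transcripts sharing that virtual tag.

    transcripts: dict mapping a transcript name to its sequence (5' to 3').
    Only exact (100% identity) matches are reported.
    """
    index = {}
    for name, seq in transcripts.items():
        tag = virtual_tag(seq)
        if tag is not None:
            index.setdefault(tag, []).append(name)
    return {tag: index.get(tag, []) for tag in experimental_tags}
```

A tag that maps to an empty list has no exact virtual counterpart in the reference set, while a tag mapping to several names is ambiguous and would need the ranking step described above.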

Parasite tag-to-gene mapping
In order to assign gene identity to each parasite tag, the experimental tag list from the purified metacyclic parasite SAGE library was matched against the L. major Friedlin genome (version 5.2) downloaded from GeneDB [54].
Since SAGE tags are expected to lie in the untranslated regions of a given gene, tags were assigned when they had a unique match, with 100% sequence identity, located within 1 kb downstream of the stop codon of one gene (or of related genes within the same family) or, alternatively, in the CDS. Related genes were identified by BLAST, comparing the set of L. major proteins and selecting those sharing more than 85% identity.
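The assignment rule can be sketched in Python under strong simplifying assumptions: a single chromosome sequence, genes on the forward strand only, and no handling of gene families. The function names and the coordinate convention (0-based, end-exclusive, stop codon included in the CDS) are hypothetical choices made for this illustration.

```python
def find_all(genome, tag):
    """Return the start positions of every exact occurrence of tag."""
    hits = []
    start = genome.find(tag)
    while start != -1:
        hits.append(start)
        start = genome.find(tag, start + 1)
    return hits


def assign_tag(tag, genome, genes, window=1000):
    """Assign a tag to a gene if it has a unique, exact genomic match.

    genes: list of (name, cds_start, cds_end) tuples on the forward strand.
    A match within `window` bases downstream of the stop codon is preferred;
    a match inside the CDS is the alternative. Returns (name, location)
    or None when the tag is absent, repeated, or far from any gene.
    """
    hits = find_all(genome, tag)
    if len(hits) != 1:
        return None  # no match, or tag not unique in the genome
    pos = hits[0]
    for name, cds_start, cds_end in genes:
        if cds_end <= pos < cds_end + window:
            return name, "downstream"
    for name, cds_start, cds_end in genes:
        if cds_start <= pos < cds_end:
            return name, "CDS"
    return None
```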

Validation of SAGE libraries by Q-PCR
The same pooled RNAs used for SAGE libraries were used for real-time reverse transcriptase-polymerase chain reaction (RT-PCR). For human tag validation analyses, predeveloped assay reagent probes, reagents, and Real-time PCR ABI-7900HT equipment were used as recommended by the manufacturer (Applied Biosystems, Fullerton, CA, USA). For parasite tag validation studies, reverse transcription and real-time PCR were performed using SYBR Green I Universal PCR Master Mix (PE Applied Biosystems, Foster City, CA, USA), and primers (Additional file 7: Parasite primer sequences for quantitative RT-PCR) were designed for each sequence, including endogenous controls, using Primer Express software (Version 1.5, PE Applied Biosystems). All PCR reactions were performed using the ABI PRISM 7700 sequence detection system. This technique is based on measuring PCR products in the logarithmic phase of the reaction by determining the CT [55], CT being the threshold cycle at which the fluorescence emission reaches the log phase of product accumulation.
Briefly, after defrosting at room temperature, total RNA was extracted using the Qiagen RNeasy Mini kit as indicated by the manufacturer (Qiagen, Courtaboeuf, France). The quality of the total RNA was determined by capillary electrophoresis analysis using an Agilent 2100 Bioanalyser (Agilent, Palo Alto, CA, USA). cDNA was then synthesized using the High-Capacity cDNA Archive Kit (Applied Biosystems) according to the manufacturer's protocol. Samples were loaded on micro-fluidic plates and data were normalized by referring to the expression of an endogenous control, human glyceraldehyde-3-phosphate dehydrogenase (GAPDH), which was highly homogeneous between the samples used (Ct = 20.55 for the MDM library and 20.46 for the "MDM+Lm" library).
For parasite Q-PCR, each analysis was performed in triplicate and data were normalized by referring to the expression of an endogenous control (rRNA45; accession number: CC144545) described as equally expressed between procyclic, metacyclic, and amastigote stages of L. major using DNA microarrays, quantitative PCR and Northern blot experiments [14,56].
Finally, for each human target mRNA, results were expressed as a fold difference in MDM exposed to L. major metacyclic promastigotes vs. non-infected MDM by calculating 2^(-ΔΔCT). For parasite target mRNA, results were expressed as a fold difference in L. major metacyclic promastigotes vs. L. major-infected MDM.
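As an illustration of the fold-difference calculation, the following Python sketch applies the standard 2^(-ΔΔCT) formula, where ΔCT is the CT of the target minus the CT of the endogenous control in the same sample, and ΔΔCT is the ΔCT of the test sample minus that of the reference sample. The function name is hypothetical, and the target CT values in the example are invented for the demonstration; only the GAPDH values (20.55 and 20.46) come from the text.

```python
def fold_change(ct_target_test, ct_control_test, ct_target_ref, ct_control_ref):
    """Return the fold difference (test vs. reference) as 2^(-ΔΔCT)."""
    delta_test = ct_target_test - ct_control_test  # ΔCT in the test sample
    delta_ref = ct_target_ref - ct_control_ref     # ΔCT in the reference sample
    delta_delta = delta_test - delta_ref           # ΔΔCT
    return 2 ** (-delta_delta)


# Hypothetical target CTs; GAPDH CTs are those reported for the two libraries.
fc = fold_change(
    ct_target_test=24.46, ct_control_test=20.46,  # infected MDM ("MDM+Lm")
    ct_target_ref=25.55, ct_control_ref=20.55,    # non-infected MDM
)
print(f"fold difference: {fc:.2f}")
```

With these example values, the target reaches the threshold one normalized cycle earlier in infected cells, which corresponds to an approximately two-fold higher expression.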