The midgut transcriptome of Lutzomyia longipalpis: comparative analysis of cDNA libraries from sugar-fed, blood-fed, post-digested and Leishmania infantum chagasi-infected sand flies

Background In the life cycle of Leishmania within the alimentary canal of sand flies, the parasites have to survive the hostile environment of blood meal digestion, escape the blood bolus and attach to the midgut epithelium before differentiating into the infective metacyclic stages. The molecular interactions between the Leishmania parasites and the gut of the sand fly are poorly understood. In the present work we sequenced five cDNA libraries constructed from midgut tissue of the sand fly Lutzomyia longipalpis and analyzed the transcripts present following sugar feeding, blood feeding and after the blood meal has been processed and excreted, both in the presence and absence of Leishmania infantum chagasi. Results Comparative analysis of the transcripts from the sugar-fed and blood-fed cDNA libraries identified transcripts differentially expressed during blood feeding. Upregulated transcripts included four distinct microvillar-like proteins (LuloMVP1, 2, 4 and 5), two peritrophin-like proteins, a trypsin-like protein (Lltryp1), two chymotrypsin-like proteins (LuloChym1A and 2) and an unknown protein. Transcripts downregulated by blood feeding were a microvillar-like protein (LuloMVP3), a trypsin-like protein (Lltryp2) and an astacin-like metalloprotease (LuloAstacin). Furthermore, a comparative analysis between the blood-fed and Leishmania-infected midgut cDNA libraries identified transcripts that were differentially expressed due to the presence of Leishmania in the gut of the sand fly. Downregulated transcripts included four microvillar-like proteins (LuloMVP1, 2, 4 and 5), a chymotrypsin (LuloChym1A) and a carboxypeptidase (LuloCpepA1), among others. Midgut transcripts upregulated in the presence of Leishmania were a peritrophin-like protein (LuloPer1), a trypsin-like protein (Lltryp2) and an unknown protein.
Conclusion This transcriptome analysis represents the largest set of sequence data reported from a specific sand fly tissue and provides further information on the transcripts present in the sand fly Lutzomyia longipalpis. It provides detailed information on the molecules present in the midgut of this sand fly and on the transcripts potentially modulated by blood feeding and by the presence of the Leishmania parasite. More importantly, this analysis suggests that Leishmania infantum chagasi alters the expression profile of certain midgut transcripts in the sand fly during blood meal digestion and that this modulation may be relevant for the survival and establishment of the parasite in the gut of the fly. Moreover, this analysis suggests that these changes may occur during the digestion of the blood meal and not afterwards.


Background
Leishmaniasis is a spectrum of diseases caused by numerous species of the kinetoplastid parasite Leishmania which are transmitted by Phlebotomine sand flies. Different forms of disease presentation can be linked with the various species of Leishmania parasites, with the visceral form of the disease being caused mainly by the Old World Leishmania infantum or the New World variant Leishmania infantum (chagasi). Visceral leishmaniasis is a disease which is commonly fatal if left untreated. Currently, there is no licensed vaccine for the prevention of visceral disease in humans and current drug treatment with antimonials and other components is a lengthy and arduous procedure with undesirable secondary effects [1].
The sand fly Lutzomyia longipalpis, the principal vector of the parasite Leishmania infantum chagasi, is the most significant source of American visceral leishmaniasis. As with many other arthropod-borne diseases, transmission of the Leishmania parasite occurs during the act of vector blood feeding upon a vertebrate host. Upon blood meal ingestion a large number of events are induced, including digestion, metabolism, diuresis, and ultimately oogenesis. Unlike arboviruses, Plasmodium or Borrelia, Leishmania can complete the necessary developmental changes and propagate to numbers sufficient for transmission and infection solely within the confines of the midgut tissue of the sand fly [2]. Several sand fly proteases involved in blood meal digestion and implicated in the species specificity between Leishmania and the respective vectors have been characterized and include trypsins, chymotrypsins and chitinases from both L. longipalpis and Phlebotomus papatasi [3,4].
More global approaches to identifying and characterizing sand fly molecules have been pursued through the sequencing of whole sand fly-derived expressed sequence tags [5]. While that study contributes to the knowledge of the molecular components of the sand fly, it does not single out the molecules of the midgut tissue, which would interact with the developing parasites. The construction and sequencing of midgut tissue-specific cDNA libraries therefore aims to identify those molecules involved in blood meal digestion and metabolism, peritrophic matrix formation, and possible parasite associations. Here we generated and sequenced five cDNA libraries from the midgut tissue of L. longipalpis and investigated the molecules present during sugar and blood feeding as well as after the blood meal has been processed and excreted, both in the presence and absence of L. infantum chagasi. In addition to the identification of midgut-associated molecules, sequence analysis and phylogenetic comparison of the sequences of L. longipalpis allow a better understanding of blood meal processing in sand flies and of the differences between the sand fly vectors of visceral (Lutzomyia longipalpis) and cutaneous (Phlebotomus papatasi) leishmaniasis.

Results and discussion
As the midgut is the primary organ of the sand fly in which the Leishmania parasite develops, cDNA libraries of the midgut tissue were constructed, sequenced and analyzed to investigate the molecules present that may mediate important interactions between these two organisms. In total, five cDNA libraries were constructed from the midgut tissue of female L. longipalpis under different conditions of feeding and digestion. Three of these libraries came from uninfected flies: one combining the midguts from sand flies allowed to feed on a sucrose solution (SF), one from a pool of midgut tissue from sand flies fully engorged from an artificial blood meal 1, 2, and 3 days post blood meal ingestion (BF), and one from a pool of midguts from gravid sand flies 5, 6, and 7 days post blood meal digestion (PBMD). The conditions chosen and the pooling of those times after blood meal ingestion allow better coverage of the most abundant molecules transcribed in the midgut as well as a comparison of the molecules present prior to blood feeding, while the blood bolus is present, during digestion of the blood meal, and after the blood byproducts have been excreted. Two further cDNA libraries were constructed from the equivalent pools of time points after blood feeding, using L. longipalpis midgut tissue from sand flies which had ingested amastigote-infected macrophages in an artificial blood meal (BFi and PBMDi), a more natural presentation of parasites to the blood-feeding sand fly.
Once the libraries were constructed, approximately 2300 phage plaques were picked and ultimately sequenced for each of the five cDNA libraries, generating a total of 9601 high quality sequences from the midgut tissue of L. longipalpis. These sequences have been submitted to the NCBI EST database under accession numbers EW987149-EW996682. Table 1 summarizes the results of sequence quality and bioinformatics analysis for each library and for the combination of all libraries: the number of sequences analyzed, the number of high quality sequences used in the bioinformatics analysis, the number of contigs, the number of singletons and the average number of sequences per contig. Each library generated a similar number of sequences, and sequence recovery from the phage plaques ranged from 79-85%. After discarding low quality sequences, each library retained 71-80% of its sequences, with an average of 73% of the total 11,520 phage producing high quality sequence data. Clustering similar sequences into contigs, based on sequence homology, produced a comparable number of contigs for each library as well as a similar number of singletons. The comparable number of high quality sequences, contigs and singletons produced from each library allows for a better comparison between the sequence abundance of specific molecules of interest and the respective biological condition of the midgut under which they were recovered. The average number of sequences per cluster varied slightly between libraries. The BF, PBMD, and PBMDi cDNA libraries contained an average of 8.4, 8.10 and 8.76 sequences per cluster, respectively. The SF cDNA library had an average of 6.86 sequences per cluster and the BFi cDNA library an average of 6.34 sequences per cluster. Combining all cDNA library sequences produced 655 contigs, 2279 singletons and an average of 9.45 sequences per contig.
Each cluster was assigned a putative function and placed in a functional class based on the sequence homology to molecules identified by the BLAST results from the NCBI non-redundant protein, the Gene Ontology, the conserved domain, rRNA and mitochondrial databases. Figure 1 shows an overall view of sequence abundance of the functional classes that occurs during the processes of sugar feeding, blood feeding and after the digestion of the blood meal. The clusters of those three cDNA libraries, with an E-value less than 10E-5 result of the KOG BLAST, were grouped according to the general functional class. Although this is a summation of a large number of different clusters, the total number of sequences in each functional class can highlight overall trends that are potentially important in the processes of blood feeding and digestion.
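The grouping step described above can be illustrated with a minimal sketch. It is not the pipeline used in this study: the record layout, the cluster labels, the functional class names and the counts are hypothetical, and the cutoff written in the text as 10E-5 is assumed here to mean 1e-5.

```python
from collections import Counter

# Assumption: the cutoff written as "10E-5" in the text is read as 1e-5.
E_VALUE_CUTOFF = 1e-5


def tally_functional_classes(clusters, cutoff=E_VALUE_CUTOFF):
    """Sum the sequences per functional class, keeping only clusters
    whose best database hit has an E-value below the cutoff."""
    totals = Counter()
    for cluster in clusters:
        if cluster["e_value"] < cutoff:
            totals[cluster["functional_class"]] += cluster["n_sequences"]
    return totals


# Hypothetical records, for illustration only (not data from this study).
example_clusters = [
    {"cluster": "A", "functional_class": "Protein turnover",
     "e_value": 1e-40, "n_sequences": 12},
    {"cluster": "B", "functional_class": "Protein turnover",
     "e_value": 3e-8, "n_sequences": 5},
    # This cluster fails the cutoff and is left out of the tally.
    {"cluster": "C", "functional_class": "Energy production",
     "e_value": 2e-3, "n_sequences": 9},
    {"cluster": "D", "functional_class": "Energy production",
     "e_value": 1e-12, "n_sequences": 4},
]

class_totals = tally_functional_classes(example_clusters)
```

Summing sequences rather than counting clusters mirrors the way the functional classes are reported here, where the total number of sequences per class is used to highlight overall trends.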
Following is a more detailed description of the most abundant transcripts identified in this analysis:

Proteases
Proteases were among the most abundant transcripts captured in the random sequencing of the midgut cDNA libraries and included trypsin-like serine proteases, chymotrypsins, carboxypeptidases, and an astacin-like metalloprotease. Table 2 shows the putative proteases identified in the midgut transcriptome. The Sanger Institute's Lutzomyia longipalpis EST database was searched using BLAST to find the best matches and results are shown with the corresponding E value. The proteases described here are most similar to those described in the sand fly Phlebotomus papatasi and the mosquitoes Aedes aegypti and Anopheles gambiae, with the exception that cluster 91 encodes a putative carboxypeptidase that shares homology with a molecule from the beetle Tribolium castaneum. Table 3 shows the transcript producing a full length, high quality sequence for each cluster and the putative function of the identified transcripts. The number of sequences that each cluster contributed to each of the cDNA libraries is also shown, and from this it can be seen that most proteases are more abundant, as expected, in the blood fed (BF) and blood fed-Leishmania infected (BFi) libraries. An interesting observation is that cluster 18, which encodes a putative trypsin, is most abundant in the SF, PBMD and PBMDi cDNA libraries, indicating that this putative trypsin may have a role other than blood meal digestion or is produced and stored prior to the ingestion of a blood meal.
Figure 1. Histogram of the number of sequences grouped into functional classes from the sugar fed, blood fed and post blood meal digestion cDNA libraries. Sequences from clusters of those three cDNA libraries, with an E-value less than 10E-5 result of the COG BLAST, grouped into the general functional class as assigned by COG.
Table 4 describes the predicted localization, molecular weight and isoelectric point of these proteases. All of the identified proteases possess a potential signal peptide, and the molecular weight and isoelectric point given are those of the predicted mature and secreted protein.
A novel midgut-associated serine protease, LuloSerPro, was identified in the sequencing and annotation of these midgut cDNA libraries. LuloSerPro is predicted to be secreted and to have a mature molecular weight of 29.0 kDa, slightly larger than the other trypsin-like serine proteases in the midgut, and has an unusually high predicted pI of 8.26 (Table 4). This molecule, while found in low abundance, was present in the sugar fed, blood fed-Leishmania infected, and post blood meal digestion-Leishmania infected cDNA libraries (Table 3). Phylogenetic analysis and multiple sequence alignments of the midgut trypsin molecules and LuloSerPro show that while this molecule is very similar to other trypsin molecules and retains the catalytic residues, it is a distinctly different serine protease (Figure 2). Additionally, there is a difference in the residues that determine the substrate specificity (Lys to Val) between the other midgut trypsins and LuloSerPro (Figure 2B).
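As an illustration of how a molecular weight for the mature protein can be derived, the following minimal sketch removes a signal peptide of known length and sums average residue masses. It is not the tool used to produce Table 4: the function name, the example sequence and the cleavage position are hypothetical, and disulfide bonds and other modifications are ignored.

```python
# Average residue masses (Da) of the 20 standard amino acids, i.e. the
# mass of each amino acid minus one water lost on peptide bond formation.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free termini


def mature_molecular_weight(precursor, signal_peptide_length):
    """Return the average molecular weight (Da) of the mature protein,
    i.e. the precursor with its N-terminal signal peptide removed."""
    mature = precursor[signal_peptide_length:]
    if not mature:
        raise ValueError("signal peptide covers the whole sequence")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in mature) + WATER_MASS


# Hypothetical precursor with an assumed 6-residue signal peptide.
hypothetical_precursor = "MKLLVAIVGGTEC"
mature_mw = mature_molecular_weight(hypothetical_precursor, 6)
```

In practice the cleavage position would come from a signal peptide prediction, and the isoelectric point requires a separate calculation based on the pKa values of the charged residues, which is not shown here.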

Chymotrypsin
Chymotrypsin is another serine protease found in abundance in the midgut of this hematophage. This study identified five clusters with homology to chymotrypsin molecules described in P. papatasi and one cluster with homology to a putative larval chymotrypsin found in A. aegypti (Tables 2, 3, 4). Clusters 33, 32, 64, 87, 30 and 31 were named LuloChym1A, LuloChym1B, LuloChym2, LuloChym3, LuloChym4 and LuloChym5, respectively. LuloChym4 was found in higher abundance in the sugar fed cDNA library and LuloChym5 sequences were found in relatively equal numbers between the blood fed and sugar fed cDNA libraries. In contrast, the other chymotrypsin molecules appear in highest abundance in the blood fed and blood fed-Leishmania infected cDNA libraries (Table 3). According to sequence numbers between the cDNA libraries it appears that chymotrypsin transcription is quiescent after the blood meal has been digested and excreted. The L. longipalpis chymotrypsin sequences have a predicted  (Table 4).
Phylogenetic analysis of chymotrypsin amino acid sequences shows that sequence homology is conserved between the L. longipalpis and P. papatasi chymotrypsin molecules (Figure 3A). LuloChym1A, ated chymotrypsin molecules show that the cysteine and catalytic residues H/D/S are conserved (Figure 3B).

Carboxypeptidases
The three longest transcripts encoding putative proteases identified in the analysis are similar to zinc metallocarboxypeptidases found in other insects and show significant similarity to ESTs from the Sanger Institute database (Table 2). These transcripts, from clusters 104, 107 and 91, were named LuloCpepA1, LuloCpepA2 and LuloCpepB and have molecular weights of 45.8, 46.0 and 45.9 kDa and pI values of 5.36, 5.41 and 4.73, respectively (Table 4). Although LuloCpepA2 appears to be an incomplete transcript with a 5' truncation, a putative mature protein can be inferred from homology and predicted signal peptide sequences and used in further characterization and comparison. Most of the sequences grouped to produce the carboxypeptidase clusters were captured from the blood fed library, suggesting that these molecules are likely induced by the ingestion or presence of blood in the midgut of the sand fly (Table 3).
The classification of these molecules as members of the A or B class of metallocarboxypeptidases was determined from the phylogenetic analysis of the amino acid sequences (Figure 4A). The phylogenetic tree produced by this analysis shows distinct clades containing insect sequences nearly all annotated as either carboxypeptidase A or carboxypeptidase B molecules. The high node support values of the sand fly carboxypeptidases in the phylogenetic tree imply conservation of these molecules between the Old World sand fly P. papatasi and the New World sand fly L. longipalpis. Similarity between the two sand flies, with regard to the carboxypeptidase molecules, can be seen in the amino acid sequence alignments, which depict the high level of identity and the retention of the catalytic residues necessary for metallocarboxypeptidase activity (Figure 4B, 4C). Furthermore, the amino acid sequence alignment depicts the differences that separate LuloCpepA1 from LuloCpepA2 (Figure 4B).

Astacin
A putative zinc metalloprotease was identified as a likely astacin-like molecule based on the results of a search of the conserved domains database. This molecule was derived from clusters 58 and 59, both encoding the same putative protein but separated by the bioinformatics software due to differing lengths of the 5'- and 3'-UTRs. The astacin-like metalloprotease was named LuloAstacin and is predicted to have a molecular weight of 28 kDa once secreted and a pI of 5.36 (Table 4). LuloAstacin was most abundant in the sugar fed cDNA library, in contrast to PpAstacin, an astacin-like molecule identified in the P. papatasi midgut, which was most abundant in the blood fed cDNA library (Table 3). Phylogenetic analysis of other putative astacin amino acid sequences illustrates that one clade is an assemblage of the Dipteran sequences. LuloAstacin branches out of the subclade containing PpAstacin and away from the other Dipteran sequences (Figure 5A). Further differences in amino acid sequence can be visualized in the multiple sequence alignment of Dipteran astacins, and while LuloAstacin diverges from the other astacin molecules, the residues responsible for zinc-binding and activity are conserved (Figure 5B).

Peritrophin-like proteins
A number of molecules were identified as containing chitin binding domains based on results from the conserved domains database (Tables 5, 6, 7). Three of the transcripts resembled previously identified peritrophin molecules based on sequence homology with peritrophin-A domains. The most abundant of these putative peritrophin transcripts was named LuloPer1 (Cluster 77/78); it was overrepresented in the blood fed Leishmania-infected cDNA library and encodes a likely secreted protein of 27.8 kDa (Tables 6 and 7). LuloPer1 consists of four chitin-binding domains (Figure 6A), in contrast to the other two peritrophin molecules, LuloPer2 and LuloPer3, which each have a single chitin-binding domain (Figure 6). LuloPer2 and LuloPer3 sequences originated in higher numbers from the blood fed midgut cDNA libraries and were in relatively equal numbers between the infected and uninfected sand flies. These small putative peritrophins are predicted to have mature molecular weights of 9.2 and 7.5 kDa and isoelectric points of 4.38 and 3.8 for LuloPer2 and LuloPer3, respectively (Table 7). LuloPer1 is likely to have a role in cross linking the chitin fibrils that will form the peritrophic matrix around the ingested blood bolus. LuloPer2 and LuloPer3 may have roles in capping the ends of chitin fibrils or sequestering free chitinous molecules within the midgut lumen. However, the two sequences share only 39% identity and 44% similarity, conserving primarily the cysteine residues, suggesting they may have very different ligand specificities or roles in peritrophic matrix formation or chitin management within the midgut (data not shown). Phylogenetic analysis of the individual chitin-binding domains from several other insect peritrophin and mucin molecules demonstrates conservation of the LuloPer1 domain arrangement when compared with P. papatasi PpPer1, suggesting that, if the domains arose through gene duplication events, those events occurred prior to speciation (Figure 6).
Additionally, the chitin-binding domains of the small putative peritrophin molecules LuloPer2 and LuloPer3 form a clade containing another chitin-binding domain from a small peritrophin of P. papatasi (Figure 6).
In addition to the putative peritrophin molecules, a transcript with homology to a predicted chitin-binding domain was identified from the clustering of 6 sequences collected primarily from the blood fed Leishmania-infected cDNA library. This domain has homology to a much larger chitin-binding domain than those found in the putative peritrophin molecules. The identified transcript, LuloChiBi, has one of these domains and is predicted to have a mature molecular weight of 20.9 kDa (Table 7).
[Figure caption: Analysis of putative carboxypeptidase molecules]

Microvillar proteins
Among the most abundant sequences identified in the cDNA libraries were transcripts encoding putative microvillar-associated proteins with homology to insect allergens identified in Periplaneta americana and Blattella germanica (Table 9). In general the microvillar proteins were most abundant in the blood fed cDNA libraries, although LuloMVP3 (cluster 48) sequences were underrepresented in the blood fed cDNA libraries and were identified in relatively equal numbers in the sugar fed and post-blood meal ingestion cDNA libraries (Table 10).
[Figure captions: Astacin-like metalloprotease sequence comparison and analysis; Characterization of peritrophin sequences]
The L. longipalpis microvillar proteins share homology with the respective molecules identified in the midgut of P. papatasi, as demonstrated by amino acid phylogenetic analysis (Figure 7). The sand fly microvillar proteins are separated from the clade containing cockroaches. Additionally, LuloMVP2 and LuloMVP5 are in a subclade with the microvillar proteins of A. aegypti and A. gambiae while the other molecules pair with the P. papatasi microvillar proteins (Figure 7A). Sequence alignment of the L. longipalpis microvillar proteins shows little sequence homology, suggesting that the classification of microvillar proteins is rather broad and that these molecules may in fact have different functions altogether.

Oxidative stress molecules
The sand fly, being an obligate blood feeding insect, must cope with the physiological challenges posed by the digestion of blood, which include the generation of reactive oxygen species (ROS) released by free heme and metabolic radicals produced in abundance during the digestion of the blood meal [7]. Five molecules were identified in the midgut cDNA libraries which have putative roles as antioxidants, such as glutathione S-transferase (GST), catalase, copper-zinc superoxide dismutase (SOD) and peroxiredoxin (PRX) (Table 11). In addition to the protection these molecules may impart through the regulation of ROS generated by blood meal digestion, there is evidence that antioxidants interact with and can impact the outcomes of infection by bacterial and parasitic agents [8]. Two transcripts were identified with homology to GST molecules of the Class Sigma and the Class Delta and Epsilon subfamilies and were named LuloGST1 and LuloGST2, respectively. Phylogenetic analysis of the putative GST molecules supports their separation and classification into the subfamily classes of Sigma and Delta/Epsilon. Additionally, LuloGST1 is grouped in a subclade with other dipteran GST molecules while LuloGST2 diverges from the dipteran Delta/Epsilon GST molecules (Figure 8). The LuloGST1 cluster was generated from sequences from each of the cDNA libraries made and analyzed, while LuloGST2 consists of one sequence from the sugar fed library and two sequences from the blood fed Leishmania-infected cDNA library. Additional antioxidant molecules include a catalase (LuloCAT), a copper-zinc superoxide dismutase (LuloSOD), and a peroxiredoxin (LuloPRX), of which LuloSOD and LuloPRX are both predicted to be secreted based on the presence of a likely signal peptide sequence.
ROS and reactive nitrogen oxide species (RNOS) are important in host defenses against microorganisms, and LuloCAT, LuloSOD and LuloPRX are molecules which may serve to regulate and prevent damage of the sand fly midgut by the ROS and RNOS defenses, similar to the protective effect of a peroxiredoxin in Anopheles stephensi [9].
Upon the ingestion of a blood meal by a hematophagous insect a large amount of iron and heme is released during digestion. To combat the toxic effects of free iron and the generation of damaging reactive oxygen species, ferritin is produced to sequester the iron and hemoglobin that is liberated by the digestion of red blood cells. Ferritin molecules are commonly associated with iron metabolism and it is likely that the molecules identified in this transcriptome engage in metabolic function; however, given the relative size of the blood meal in comparison with the sand fly, ferritin molecules within the midgut likely serve a large role in preventing the generation of oxygen radicals by the Fenton reaction. Two transcripts from clusters 76 and 79 were identified with homology to ferritin light-chain and ferritin heavy-chain molecules and were named LuloFLC and LuloFHC, respectively (Tables 11 and 13). The expression of LuloFLC and LuloFHC appears to be constitutive based on the number of sequences generated in each cDNA library spanning the conditions of sugar fed, blood fed, and post blood meal digestion (Table 12).

Serine protease inhibitors
Two types of serine protease inhibitors were identified in the cDNA libraries; a single sequence with homology to SERPIN and a cluster of 17 sequences with homology to a Kazal-type serine protease inhibitor (Tables 14, 15, 16). SERPIN molecules within the midgut of the sand fly may serve to counteract damaging proteases produced by microorganisms; however LuloSRPN lacks a predicted signal peptide sequence and thus may serve an intracellular housekeeping function. LuloKZL, identified from cluster 112, is a small molecule of 6.3 kDa and is predicted to be secreted. Comparison of LuloKZL with Kazal-type serine protease inhibitors found in a transcriptome analysis of the midgut of P. papatasi identified PpKZL1 as a highly conserved homolog (data not shown). Kazal-type protease inhibitors, such as rhodniin and infestin identified in Rhodnius prolixus and Triatoma infestans, respectively, have been characterized as thrombin inhibitors; thereby these molecules would prevent coagulation of ingested blood to facilitate successful digestion of the blood meal [10,11]. LuloKZL sequences are more abundant prior to and during blood meal digestion based on the number of sequences in the sugar fed, blood fed and post blood meal digestion cDNA libraries. Additionally, LuloKZL was not identified in an EST analysis of whole sand fly L. longipalpis and is therefore more likely a midgut-specific molecule found in abundance only in the alimentary tissue [5]. Thus, a prudent hypothesis would be that LuloKZL serves a similar function, allowing the blood bolus to remain in a colloidal suspension within the gut to facilitate peristalsis and digestion.

Anti-bacterial molecules
Two molecules, originating from clusters 235 and 1960, encode a putative peptidoglycan recognition protein (LuloPGRP) and a defensin (LuloDEF), respectively. LuloPGRP is similar to other predicted peptidoglycan recognition proteins found in Glossina morsitans morsitans and mosquitoes and is phylogenetically distinct from lepidopteran molecules (Figure 9). This is the first report of a putative PGRP identified in sand flies, and a search of a midgut transcriptome database of P. papatasi identified a molecule with 87% identity. LuloPGRP may serve as a pattern recognition protein, specifically for the conserved structure of peptidoglycan, as indicated by the conservation of the amino acid sequence among insects, and thus as a component of the sand fly immune system defense against bacterial pathogens (Figure 9). PGRP molecules characterized in Bombyx mori and Trichoplusia ni have been shown to be expressed primarily in the fat body and hemocytes, and it is conceivable that the identification of LuloPGRP transcripts arose due to a contamination of the tissue sample [12,13]. However, it is also possible that the midgut tissue of sand flies expresses a PGRP for protection against microorganisms ingested during sugar and blood feeding, as a PGRP was identified as preferentially expressed in the midgut of Samia cynthia ricini [14].
Defensins are another type of innate immune defense that insects possess to ward off pathogenic bacteria. A single sequence, named LuloDEF, was identified in the post blood meal digestion midgut cDNA library with homology to a defensin molecule characterized in A. aegypti. Like other insect defensin molecules, LuloDEF has a predicted secretion signal peptide, and most of the homology lies in the carboxyl half of the sequence and in the conservation of cysteine residues (Figure 10). LuloDEF shares 47% identity and 61% similarity with a defensin characterized in Phlebotomus duboscqi which is induced by the presence of wild type Leishmania major [15]. Both immunity-associated genes, LuloPGRP and LuloDEF, may have an impact on the progression and result of a midgut infection by Leishmania parasites, either directly or by indirect effects if co-colonization of the midgut with bacteria is an intermediary confounding factor.
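Percent identity and similarity figures such as those quoted above are derived from a pairwise alignment. The following minimal sketch shows one way such values can be computed. It is not the method used in this study: the similarity groups are a simplified, hypothetical stand-in for a substitution matrix, the aligned sequences are invented, and the denominator (columns without gaps) is an assumption, since alignment tools differ on this point.

```python
# Hypothetical groups of residues treated as "similar"; alignment tools
# normally use a substitution matrix (e.g. positive BLOSUM scores).
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"), set("DE"),
    set("NQ"), set("ST"), set("AG"),
]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned
    sequences of equal length, ignoring columns that contain a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = similar = compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gapped columns are excluded (an assumption)
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Invented aligned fragments, for illustration only.
identity, similarity = identity_and_similarity("ACDK-LV", "ACEKGIT")
```

Because identical residues are counted within the similar ones, similarity is always at least as high as identity, which matches the way the two values are reported in the text.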

Transcripts differentially expressed by blood feeding and digestion
A comparison between the sugar fed and blood fed libraries and between the blood fed and post blood meal digestion libraries was conducted using Pearson's chi-square test to identify clusters whose transcripts were overrepresented in one library. As was previously seen in P. papatasi, a number of digestion-associated transcripts were overabundant in the blood fed cDNA library [6]. We envisioned similar results in the analysis of the L. longipalpis midgut cDNA libraries, with the added advantage of a cDNA library produced from midguts that had fully processed and excreted the blood meal byproducts. It was our hypothesis that the post blood meal midgut transcript abundance is most similar to the sugar fed midgut transcript abundance prior to a blood meal. Overall, the number of sequences per cluster in the sugar fed cDNA library was similar to that in the post blood meal digestion cDNA library, and most transcripts are overrepresented in the blood fed library (Table 17). Several exceptions to both overall observations do occur, however. Most of the microvillar protein transcripts are abundant in the blood fed cDNA library except for LuloMVP3, which is highly represented in the sugar fed and post blood meal digestion cDNA libraries. This reinforces the suggestion that the microvillar proteins are likely functionally different molecules grouped solely on homology to previously annotated sequences. In general, proteases appear to be induced by the act of blood feeding or the presence of a blood meal within the midgut. The exceptions are Lltryp2, which is significantly more abundant in the sugar fed and post blood meal digestion cDNA libraries, and LuloAstacin, which is more abundant in the sugar fed cDNA library (Tables 17 and 18). These molecules may be produced and stored prior to blood feeding for immediate use in digestion or perhaps have a role other than digestion altogether, such as immunity.
Other proteases such as LuloChym4 and LuloCpepA2 are present in higher or near equal numbers in the sugar fed library when compared with the blood fed library. Other molecules, such as the peritrophins LuloPer1 and LuloPer2, are also more plentiful in the blood fed cDNA library, suggesting that these molecules may be transcribed only in response to blood feeding. A transcript encoding a predicted protein of unknown function derived from cluster 40 was identified as being most abundant in the post blood meal digestion cDNA library, signifying it may play a role outside of blood meal digestion, such as oogenesis.
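The library comparisons in this section rest on a chi-square test applied to sequence counts. As a minimal sketch of such a test, the function below treats one cluster in two libraries as a 2x2 contingency table (sequences in the cluster versus all other sequences, for each library). The function name and the counts are hypothetical, no continuity correction is applied, and the exact procedure used for Tables 17 and 18 may differ.

```python
import math


def chi_square_2x2(count_a, total_a, count_b, total_b):
    """Pearson's chi-square test (1 degree of freedom) comparing the
    share of one cluster's sequences between two cDNA libraries.

    count_x: sequences of the cluster in library x
    total_x: all high quality sequences in library x
    Returns (chi-square statistic, p-value).
    """
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    row_totals = [total_a, total_b]
    col_totals = [observed[0][0] + observed[1][0],
                  observed[0][1] + observed[1][1]]
    grand_total = total_a + total_b
    if min(row_totals) <= 0 or min(col_totals) <= 0:
        raise ValueError("every row and column total must be positive")

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom the upper tail probability of the
    # chi-square distribution equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 30 of 2000 sequences in one library versus
# 5 of 2000 in the other (not data from this study).
chi2, p_value = chi_square_2x2(30, 2000, 5, 2000)
```

Because the libraries yielded comparable numbers of high quality sequences, raw counts can be compared fairly directly. For clusters with very few sequences the chi-square approximation becomes unreliable, and an exact test would be a more cautious choice.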

Transcripts differentially expressed by the presence of Leishmania infantum chagasi
To evaluate the effects of L. infantum chagasi parasites on transcript abundance in the midgut tissue of the sand fly, we compared the number of sequences in each cluster between the blood fed and blood fed Leishmania-infected cDNA libraries, and between the post blood meal digestion and post blood meal digestion Leishmania-infected cDNA libraries, using chi-square analysis (Tables 19, 20, 21 and 22). We hypothesized that the effects of the parasite's presence in the blood engorged sand fly would likely mirror what we had observed in a similar comparison of P. papatasi infected with L. major. Additionally, we hypothesized that the analysis of the post blood meal digestion midgut tissue would reveal a large number of differentially abundant transcripts, because during this period Leishmania parasites are interacting with the midgut epithelium, replicating, and differentiating to the metacyclic form.
In accordance with what we observed previously in blood engorged P. papatasi infected with L. major, the microvillar protein transcripts were underrepresented [6]. Similar trends in abundance between infected P. papatasi and infected L. longipalpis also occur for transcripts encoding the putative digestive enzymes trypsin (Lltryp2) and chymotrypsin (LuloChym1A). Two other digestive proteases, LuloAstacin and LuloCpepA1, were identified as differentially abundant in the presence of L. infantum chagasi, with fewer transcripts captured in the blood fed Leishmania-infected library; however, only the LuloCpepA1 difference was statistically significant. In contrast, the modulation of peritrophin transcript abundance differs strikingly between the two species. In the midgut of infected P. papatasi peritrophin transcripts decrease, whereas L. longipalpis infected with L. infantum chagasi shows a significant overrepresentation of peritrophin (LuloPer1) and an overrepresentation of the putative chitin-binding molecule (LuloChiBi). Actin transcripts appear to be downregulated by the presence of the L. infantum chagasi parasites in the midgut. We speculate that this could be a tactic of the parasite to reduce the cytoskeletal rearrangement that occurs after blood feeding and thereby decrease peristalsis, which may aid in the retention of the parasite within the gut of the sand fly.
In terms of transcript abundance, the post blood meal digestion midgut infected with L. infantum chagasi is relatively quiescent. Only one transcript, encoding a putative trypsin molecule, was identified as significantly different in abundance. Lltryp2 sequences were 1.54 times more abundant in the L. infantum chagasi-infected post blood meal digestion cDNA library, which corroborates the observed overrepresentation of Lltryp2 sequences in the blood fed infected cDNA library. It is possible that the increase in sand fly Lltryp2 occurs due to the presence of a perceived pathogen or as a consequence of a non-specific perception of contents within the midgut. Conversely, LuloTryp3 transcripts were captured at a lower frequency in the L. infantum chagasi-infected midgut after blood meal digestion.
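A fold difference such as the one quoted for Lltryp2 can be expressed as a ratio of a cluster's share of each library. The sketch below assumes the ratio is normalized by library size; whether the reported value was normalized in this way is not stated here, and the function name and counts are hypothetical.

```python
def relative_abundance(count_a, total_a, count_b, total_b):
    """Ratio of a cluster's share of library A to its share of library B."""
    if count_b == 0:
        raise ValueError("cluster absent from library B; ratio undefined")
    return (count_a / total_a) / (count_b / total_b)

# Hypothetical counts: the raw ratio is 30 / 20 = 1.5, but the libraries
# differ in size, so the normalized ratio is larger.
fold = relative_abundance(count_a=30, total_a=900, count_b=20, total_b=1200)
print(f"{fold:.2f}-fold")
```

Normalizing by library size matters whenever the libraries being compared contain different numbers of sequences.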

Conclusion
Leishmania parasites develop to a transmissible and infective form entirely within the confines of the alimentary tract of the sand fly, in contrast to numerous other arthropod-borne pathogens. We wished to further investigate the response of the sand fly midgut tissues to blood meal ingestion and to interactions with Leishmania parasites. The previously reported extensive sequencing of ESTs from whole Lutzomyia longipalpis sand flies provided a broad overview of the transcripts present in this vector; however, it did not provide information on tissue specific transcripts, particularly those from the sand fly midgut, or on the midgut molecules that may be transcribed in response to blood feeding and digestion or that may interact with the Leishmania parasite. In the present work, the production of five different cDNA libraries generated a large number of redundant tissue specific transcripts for analysis and made a comparative analysis between these cDNA libraries possible. Several molecules were identified in this midgut-specific transcriptome that were not identified in the EST database of whole sand fly sequences, including LuloKZL and LuloDEF.
The present analysis of midgut tissue from L. longipalpis further increases our knowledge of the molecular events that occur throughout the adult life cycle of the sand fly. In general, it appears that the midgut reverts, after complete digestion and excretion of the blood meal, to a state closely resembling the midgut of a sand fly that has only taken a sugar meal. Comparing data generated from the sugar fed and blood fed sand fly midguts revealed global changes comparable to those found in the same analysis of the midgut of P. papatasi [6]. Microvillar proteins, digestive proteases and peritrophin molecules are some of the transcripts identified as differentially represented between cDNA libraries when comparing unfed and blood fed sand flies. Interestingly, many molecules, such as microvillar proteins and digestive proteases, were found to be over or under represented when comparing the blood fed with the blood fed Leishmania-infected cDNA libraries. Similar results were observed in the midgut of P. papatasi when infected with L. major. This demonstrates not only the reproducibility of this technique of analyzing transcript abundance across cDNA libraries, but also the redundancy present in the biology of blood feeding and digestion in sand flies and in the Leishmania-vector interactions occurring in Old World and New World sand fly species. When comparing the uninfected and L. infantum chagasi-infected post blood meal digestion libraries, we were surprised by the scarcity of differentially abundant transcripts, considering the number and volume of Leishmania parasites present in the midgut at the time points encompassed by the cDNA library. These data suggest that the Leishmania parasite affects the midgut expression profile during the blood digestion process and not afterwards. It is likely that the Leishmania parasite modulates the expression of other molecules, but our approach was not able to detect them, probably because of their low abundance.
Further testing employing more direct techniques, such as real time PCR or other expression profiling approaches, is still required to test the hypothesis that L. infantum chagasi alters the expression of specific gut transcripts from the sand fly Lutzomyia longipalpis. However, the information presented in the current work and in previous work on P. papatasi and L. major strongly suggests that Leishmania parasites can alter the expression of midgut transcripts that may be relevant for the survival and establishment of the parasite in the gut of the fly, and that these changes may occur during the digestion of the blood meal and not afterwards.

Sand flies
Lutzomyia longipalpis sand flies (Jacobina strain) were maintained at the Laboratory of Malaria and Vector Research at the National Institute of Allergy and Infectious Diseases. Sand flies three to four days post eclosion were given access to a 20% sucrose solution (sugar fed/unfed) or fed blood on anesthetized BALB/c mice (blood fed). [...] and stored at 4°C prior to cDNA library construction. Libraries constructed from midguts at different time points included two midguts from each day on which midguts were dissected. L. longipalpis midgut mRNA was isolated from six midguts using the MicroFastTrack mRNA isolation kit (Invitrogen, San Diego, CA). The cDNA libraries were constructed using the SMART cDNA Library Construction Kit (Clontech, Mountain View, CA) as described previously [18].

DNA Sequencing
Phage plaques lacking β-galactosidase activity were picked from the soft top agar using a sterilized wooden stick and placed into 75 μl of ultrapure water in a 96-well v-bottom plate. PCR was used to amplify the cDNA insert from 3 μl of the phage in water using FastStart PCR Master premixed PCR reagent (Roche Applied Science, Indianapolis, IN) and primers PT2F1 (AAGTACTCTAGCAATTGTGAGC) and PT2R1 (CTCTTCGCTATTACGCCAGCTG). Reaction conditions were 75°C, 3 min; 94°C, 4 min; 33 cycles of 94°C, 1 min; 49°C, 1 min; 72°C, 2 min; and a final extension at 72°C for 7 min. The PCR products were cleaned of buffering salts, dNTPs, and primers using ExcelaPure 96-well UF PCR purification plates (Edge Biosystems, Gaithersburg, MD) with three washes of 100 μl of ultrapure water and recovery in 30 μl of ultrapure water. Cycle sequencing was carried out in a reaction containing BigDye Terminator v3.1 (Applied Biosystems, Foster City, CA), primer PT2F3 (TCTCGGGAAGCGCGCCATTGT), and 5 μl of the cleaned PCR product. The cycle sequencing products were prepared for sequencing by centrifugation through hydrated Sephadex G-50 (Amersham, Piscataway, NJ), desiccation, and rehydration with 10 μl of sequencing buffer. Sequencing was performed using a 3730xl DNA analyzer (Applied Biosystems, Foster City, CA).

Bioinformatics
Detailed descriptions of the bioinformatic analysis of the data have been reported previously [19,20]. Briefly, regions of high N (unidentified nucleotide) content were removed from the 5' and 3' ends of each sequence, and any primer and vector nucleotides were removed. Sequences with greater than 5% N's were discarded. Sequences from all five libraries were combined, and contigs were constructed by clustering homologous sequences based on 100% identity over 64 nucleotides. Three-frame translated sequences were sup[...], respectively. A custom program, Count Libraries, was used to identify the number of transcripts that each library contributed to the formation of a contig (JMC Ribeiro). The contigs, information regarding each contig, and the BLAST and SignalP results were combined in a hyperlinked Excel spreadsheet, and each contig was annotated by manually assigning the most likely predicted function based on BLAST results. Sequences were aligned using Clustal X, version 1.83, and converted to graphical aligned sequences using BioEdit, version 7.0.5.3 [27]. Phylogenetic analysis was conducted on amino acid alignments using TREE-PUZZLE, version 5.2, generating trees by maximum likelihood using quartet puzzling with 10,000 puzzling steps to calculate node support [28]. For each cluster, the statistical significance of differences between cDNA libraries in the number of transcripts was analyzed using Pearson's chi-square test.
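The read cleaning and per-library tallying steps can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: terminal N removal is approximated by stripping runs of N from both ends, the 5% threshold is applied to the trimmed read, and `count_by_library` is a hypothetical stand-in for the kind of tally produced by the Count Libraries program. The library labels are also hypothetical.

```python
from collections import Counter, defaultdict

MAX_N_FRACTION = 0.05  # reads with more than 5% N are discarded

def clean_read(seq):
    """Strip terminal runs of N; return None if the read is empty or >5% N.

    Assumption: the threshold is checked after trimming. The original
    pipeline may order these steps differently.
    """
    trimmed = seq.upper().strip("N")
    if not trimmed:
        return None
    if trimmed.count("N") / len(trimmed) > MAX_N_FRACTION:
        return None
    return trimmed

def count_by_library(assignments):
    """Tally reads per library for each contig.

    assignments: iterable of (contig_id, library_name) pairs, one per read.
    Returns a dict mapping contig_id to a Counter of library names.
    """
    counts = defaultdict(Counter)
    for contig_id, library in assignments:
        counts[contig_id][library] += 1
    return counts

# Hypothetical reads and library labels for illustration only.
print(clean_read("NNACGTACGTNN"))
reads = [("contig1", "sugar_fed"), ("contig1", "blood_fed"),
         ("contig1", "blood_fed"), ("contig2", "sugar_fed")]
print(dict(count_by_library(reads)["contig1"]))
```

The per-contig counts, together with the total number of sequences in each library, are the inputs a chi-square comparison between two libraries would need.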