  • Research article
  • Open access

A deep insight into the sialotranscriptome of the mosquito, Psorophora albipes

Abstract

Background

Psorophora mosquitoes are exclusively found in the Americas and have been associated with transmission of encephalitis and West Nile fever viruses, among other arboviruses. Mosquito salivary glands represent the final route of differentiation and transmission of many parasites. They also secrete molecules with powerful pharmacologic actions that modulate host hemostasis, inflammation, and immune response. Here, we employed next-generation sequencing and proteome approaches to investigate, for the first time, the salivary composition of a mosquito member of the Psorophora genus. We additionally discuss the evolutionary position of this mosquito genus within the Culicidae family by comparing the identity of its secreted salivary compounds to other mosquito salivary proteins identified so far.

Results

Illumina sequencing produced 135,651,020 reads; 13,535,229 of these mapped to the 3,247 coding sequences (CDS) extracted from the assembly. Annotation of these sequences allowed classification of their products, according to their in silico-predicted function/activity, into 83 salivary protein families, twenty (24.1%) of which were confirmed by our subsequent proteome analysis. Two protein families were deorphanized from Aedes and one from Ochlerotatus, while four protein families were described as novel to the Psorophora genus because they had no match with any other known mosquito salivary sequence. Several protein families described as exclusive to Culicines were present in Psorophora mosquitoes, while we did not identify any member of the protein families already known as unique to Anophelines. Also, the Psorophora salivary proteins most often had their best matches to homologs in Aedes (69.23%), followed by Ochlerotatus (8.15%), Culex (6.52%), and Anopheles (4.66%).

Conclusions

This is the first sialome (from the Greek sialo = saliva) catalog of salivary proteins from a Psorophora mosquito, which may be useful for better understanding the lifecycle of this mosquito and the role of its salivary secretion in arboviral transmission.

Background

Psorophora mosquitoes, commonly known as “giant mosquitoes”, belong to the subfamily Culicinae, which includes many genera with epidemiologic importance to humans and animals such as Aedes, Ochlerotatus, Haemagogus, and Culex. Notably, members of the Psorophora genus are found only in the New World. Psorophora mosquitoes are opportunistic, having mammals and birds as the main hosts of their blood-feeding [1, 2]. Psorophora females have been associated with transmission of equine encephalitis virus, West Nile fever virus, and other arboviruses [3–9].

The phylogeny of mosquitoes includes three subfamilies within the Culicidae: Anophelinae, Culicinae, and Toxorhynchitinae. Studies based on morphology, behavior, biogeographic distribution, and life history suggest that the Anophelinae subfamily is monophyletic and basal within the Culicidae family. The Culicinae subfamily, on the other hand, includes the majority of the remaining mosquito genera, distributed into ten tribes. Psorophora mosquitoes share the tribe Aedini with Aedes, Ochlerotatus, and other mosquito genera, while Culex mosquitoes belong to the Culicini tribe. Previous studies have supported the genera from the tribe Culicini as basal to genera of the tribe Aedini [10]. These results are in agreement with the phylogeny proposed by Besansky and Fahey [11]. The Psorophora genus contains 48 species divided into three subgenera: Grabhamia (15 species), Janthinosoma (23 species), and Psorophora (10 species) [12]. Recently, morphologic and molecular studies have supported Psorophora as a sister group to Aedes/Ochlerotatus [13–15]. In contrast, studies using 18S rDNA sequence have suggested Psorophora species as a sister group to Culex and/or to Aedes/Ochlerotatus species [12, 16].

The salivary glands (SGs) of hematophagous insects secrete a cocktail of biochemically active compounds [17] that interacts with the hemostasis [18–21], immunity, and inflammation of their hosts [22, 23]. Perhaps because of the continuous contact of mosquito salivary proteins with host immunity, salivary proteins evolve and diverge at a fast pace, even in closely related species [24]. In the past decade, continuous advances in the fields of transcriptome and proteome analysis led to the development of high-throughput sialotranscriptome studies (from the Greek sialo = saliva) [23, 25]. These studies resulted in a large database of secreted salivary proteins from different blood-feeding arthropod families, including members of the Culicidae family.

All mosquito sialotranscriptome studies so far have targeted members of the Aedes, Ochlerotatus, Anopheles, and Culex genera [24], which are important vectors of human and animal diseases. Although some Psorophora species are known to be vectors of several arboviruses, the molecular composition of their salivary secretion remains unknown. Our primary aim was to investigate the salivary transcriptome and proteome of a member of the Psorophora genus (Psorophora albipes) to ultimately better understand the evolution of SG composition within the Culicidae family. In addition, our work makes available the first platform of salivary proteins from this mosquito genus, relevant for improving our understanding of mosquito evolution, the evolving risks in public health due to the recent expansions of Psorophora mosquitoes to the North, and for development of exposure markers to mosquito bites and to vector-borne diseases transmitted by mosquitoes.

Methods

Mosquitoes

Psorophora mosquitoes were collected in fragments of unflooded rain forest in Manacapuru municipality, Amazonas state, Brazil, using modified CDC traps. The mosquitoes were maintained with water and sugar solution and transported to the Biodiversity Laboratory of the Leônidas and Maria Deane Institute (Fiocruz/Manaus). The mosquitoes were identified using the taxonomic keys proposed by Forattini [12] and Consoli and Lourenço de Oliveira [26].

Dissection and RNA extraction

SGs from P. albipes (50 pairs) were dissected in 150 mM sodium chloride pH 7.4, immediately transferred to 50 μl RNAlater® solution, and maintained at 4°C until RNA extraction. SG RNA was extracted and isolated using the Micro-FastTrack™ mRNA isolation kit (Invitrogen, San Diego, CA) per manufacturer’s instructions. The integrity of the total RNA was checked on a Bioanalyzer (Agilent Technologies Inc., Santa Clara, CA).

Next-Generation Sequencing (NGS) and bioinformatic analysis

The SG library was constructed using the TruSeq RNA sample prep kit, v2 (Illumina Inc., San Diego, CA). The resulting cDNA was fragmented using a Covaris E210™ focused ultrasonicator (Covaris, Woburn, MA). Library amplification was performed using eight cycles to minimize the risk of over-amplification. Sequencing was performed on a HiSeq 2000 (Illumina) with v3 flow cells and sequencing reagents, using a paired-end protocol. One lane of the HiSeq machine was used for this and two other libraries, distinguished by bar coding. A total of 135,651,020 sequences of 101 nt in length were obtained. Raw data were processed using RTA 1.12.4.2 and CASAVA 1.8.2. mRNA library construction and sequencing were done by the NIH Intramural Sequencing Center (NISC). Reads were trimmed of low-quality regions (< 10) and assembled with the assembly by short sequences (ABySS) software (Genome Sciences Centre, Vancouver, BC, Canada) [27, 28] using various kmer (k) values (every even number from 24 to 96). Because the ABySS assembler tends to miss highly expressed transcripts [29], the Trinity assembler [30] was also used. The resulting assemblies were joined by an iterative BLAST and cap3 assembler [31]. Sequence contamination between bar-coded libraries was identified and removed when sequence identities were over 98% but read abundance differed by more than 50-fold between libraries. Coding sequences (CDS) were extracted using an automated pipeline, based on similarities to known proteins or on the presence of a signal peptide [32]. Coding sequences and their protein sequences were mapped into a hyperlinked Excel spreadsheet (presented as Additional file 1, and also located at http://exon.niaid.nih.gov/transcriptome/Psorophora_albipes/Pso-s2-web.xlsx).
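The cross-library contamination rule described above can be sketched as a simple filter. This is only an illustration, not the pipeline actually used: the record layout, the function name, and the assumption that a contig is dropped from the library in which it is the less abundant copy are all hypothetical; only the 98% identity and 50-fold thresholds come from the text.

```python
from dataclasses import dataclass


@dataclass
class CrossLibraryHit:
    """Best match of one contig against a contig from another bar-coded library (hypothetical layout)."""
    contig_id: str
    identity_pct: float  # sequence identity to the contig from the other library
    reads_here: int      # reads mapped to this contig in the library of interest
    reads_other: int     # reads mapped to the matching contig in the other library


def is_likely_contaminant(hit, min_identity=98.0, min_fold=50.0):
    """Flag a contig that is near-identical to one in another library
    where it is more than `min_fold` times as abundant (assumed direction)."""
    if hit.identity_pct <= min_identity:
        return False
    if hit.reads_here == 0:
        return hit.reads_other > 0
    return hit.reads_other / hit.reads_here > min_fold


# Toy records, invented for illustration only
hits = [
    CrossLibraryHit("toy-1", 99.5, 10, 1000),  # rare here, abundant elsewhere
    CrossLibraryHit("toy-2", 99.5, 1000, 10),  # abundant here, so kept
    CrossLibraryHit("toy-3", 90.0, 10, 1000),  # too divergent to be carry-over
]
kept = [h.contig_id for h in hits if not is_likely_contaminant(h)]
print(kept)
```

Requiring both conditions keeps genuinely conserved transcripts with comparable expression in each library, which a filter on identity alone would discard.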
Signal peptides, transmembrane domains, furin cleavage sites, and mucin-type glycosylation were determined with software from the Center for Biological Sequence Analysis (Technical University of Denmark, Lyngby, Denmark) [32–35]. Reads were mapped into the contigs using blastn [36] with a word size of 25, masking homonucleotide decamers and allowing mapping to up to three different CDS if the BLAST results had the same score values. Mapping of the reads was also included in the Excel spreadsheet. Automated annotation of proteins was based on a vocabulary of nearly 250 words found in matches to various databases, including Swissprot, Gene Ontology, KOG, PFAM, and SMART, and a subset of the non-redundant protein database of the NCBI containing proteins from vertebrates. Further manual annotation was done as required. A detailed description of our bioinformatics pipeline can be found in our previous publication [31]. Sequence alignments were done with the ClustalX software package [37]. Phylogenetic analysis and statistical neighbor-joining bootstrap tests of the phylogenies were done with the Mega package [38]. Blast score ratios were calculated as indicated previously [39]. For visualization of synonymous and non-synonymous sites within coding sequences, the tool BWA aln [40] was used to map the reads to the CDS, producing SAI files that were joined by the BWA sampe module, converted to BAM format, and sorted. The sequence alignment/map tools (samtools) package [41] was used to do the mpileup of the reads (samtools mpileup), and the binary call format tools (bcftools) program from the same package was used to make the final vcf file containing the single-nucleotide polymorphic (SNP) sites, which were only taken if the site coverage was at least 100 (-D100), the quality was 13 or better, and the SNP frequency was 5 or higher (default).
Determination of whether the SNPs lead to a synonymous or non-synonymous codon change was achieved by a program written in Visual Basic by JMCR, the results of which are mapped into the Excel spreadsheet and color visualized in hyperlinked rtf files within Additional file 1.
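The synonymous versus non-synonymous decision can be illustrated with a short Python sketch. It is not the Visual Basic program mentioned above; the function names and the 0-based position convention are assumptions made for illustration, the SNP-frequency criterion is omitted, and only the coverage (100) and quality (13) thresholds are taken from the text. The standard genetic code is used.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def passes_filters(depth, quality, min_depth=100, min_quality=13):
    """Keep a site only if coverage and quality meet the stated thresholds."""
    return depth >= min_depth and quality >= min_quality


def classify_snp(cds, pos, alt):
    """Classify a substitution at 0-based position `pos` of an in-frame CDS."""
    codon_start = pos - pos % 3
    ref_codon = cds[codon_start:codon_start + 3].upper()
    if len(ref_codon) < 3:
        raise ValueError("position falls in an incomplete trailing codon")
    offset = pos - codon_start
    alt_codon = ref_codon[:offset] + alt.upper() + ref_codon[offset + 1:]
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "non-synonymous"


# Toy CDS (Met-Ala-Lys), invented for illustration
toy_cds = "ATGGCTAAA"
print(classify_snp(toy_cds, 5, "C"))  # GCT -> GCC, both alanine
print(classify_snp(toy_cds, 3, "A"))  # GCT -> ACT, alanine to threonine
```

The sketch treats each SNP independently; two SNPs in the same codon would need to be evaluated together to obtain the true amino acid change.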

Proteome analysis

Fifty SG pairs from female P. albipes were used in the proteome analysis. Briefly, the glands were sonicated and the supernatant was boiled for 10 min in reducing Laemmli gel-loading buffer and subsequently resolved on a NuPAGE 4-12% Bis-Tris precast gradient gel. Proteins were visualized with SimplyBlue stain (Invitrogen). The gel was arbitrarily sliced into 19 individual sections (coded as F1–19) that were destained and digested overnight with trypsin at 37°C. ZipTips® (Millipore, Bedford, MA) were used to extract and desalt the peptides, which were resuspended in 0.1% TFA before mass spectrometry (MS) analysis.

Nanoflow reverse-phase liquid chromatography coupled with tandem MS (MS/MS) was performed as described [42]. We obtained a database of the tryptic peptides identified by MS as a final product. This was used to search for matches from our transcriptome database of P. albipes. Additional details about the proteome procedure and analysis can be found in the methodology described in Chagas et al. [42].

Results and discussion

Exploring the sialotranscriptome of a Psorophora mosquito

Assembly of 135,651,020 reads into 43,466 contigs (see assembly details in Additional file 2) allowed the extraction of 3,247 CDS, to which 13,535,229 reads mapped. These CDS were classified according to their primary sequence (presence of homology to an already described sequence) into three categories: i) transcripts encoding for secreted (S) proteins, ii) transcripts encoding for housekeeping (H) proteins, and iii) transcripts encoding for proteins of unknown (U) function that lack homology with any functionally characterized protein from another organism (Table 1). Note that these 3,247 CDS contain 485 similar CDS divergent by a few amino acids, which can be verified in the clusterization column grouping protein sequences with 95% similarity on 50% of their lengths. These may represent allele products, recent gene duplications, or sequencing/assembly errors.

Table 1 Functional classification of transcripts from salivary glands of the mosquito Psorophora albipes

After annotation, 7,537,805 reads (55.69% of the reads mapped to CDS) were classified as originating from transcripts encoding putative S proteins, and these were assembled into 802 contigs (24.70% of the total contigs) (Table 1). A signal peptide was detected in these contig sequences, suggesting that these contigs encode for proteins secreted in the saliva. In addition, 5,473,151 reads (40.44% of the total reads) mapped to transcripts encoding H proteins, which were assembled into 1,973 contigs (60.76% of the total contigs). Another 85,213 reads (0.63% of total reads) correspond to transposable elements, and 439,060 reads (3.24% of total reads) were classified as originating from transcripts that encode for U products (Table 1).
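The percentages quoted above follow directly from the read and contig counts, as the short check below shows. The counts are those given in the text; the dictionary keys and function name are chosen here for illustration only.

```python
# Read counts per functional class, as reported in the text
READS_BY_CLASS = {
    "secreted": 7_537_805,
    "housekeeping": 5_473_151,
    "transposable element": 85_213,
    "unknown": 439_060,
}
TOTAL_MAPPED_READS = 13_535_229  # reads mapped to the 3,247 CDS
TOTAL_CDS = 3_247


def percent_share(counts):
    """Return each class as a percentage of the summed counts, to two decimals."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


read_shares = percent_share(READS_BY_CLASS)
secreted_contig_share = round(100 * 802 / TOTAL_CDS, 2)
housekeeping_contig_share = round(100 * 1_973 / TOTAL_CDS, 2)

for name, share in read_shares.items():
    print(f"{name}: {share}% of mapped reads")
```

The four read classes sum exactly to the 13,535,229 reads mapped to CDS, so the percentages are fractions of mapped reads rather than of all 135,651,020 sequenced reads.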

The sequences encoding for H proteins were further classified into 26 subgroups according to their predicted function or membership to previously described protein families (Table 2). The potentially highly expressed H proteins include those associated with protein synthesis machinery (14.92% of the reads classified as H products), signal transduction proteins (5.18% of the total reads), unknown conserved proteins, which are highly conserved proteins of unknown function most likely related to cellular function (3.25% of the total reads), transporters and channels (3.18% of total reads), and proteins with a potential role in lipid metabolism (2.84% of the total reads). Because SGs are specialized in secretion, high expression of transcripts encoding for constituents of protein synthesis machinery and energy metabolism is commonly observed in similar analyses of blood-feeding arthropod sialotranscriptomes. Here, however, energy metabolism represents only 0.86% of the total transcript reads encoding for H products.

Table 2 Functional classification of the housekeeping products expressed in the female Psorophora albipes salivary glands

The putative S proteins were further divided into 16 general categories (Table 3). Several families whose members had a classic secretion signal were abundantly expressed in P. albipes SGs at the transcriptome level: mucin I mosquito family (24.77% of total reads classified as S products), similar to OT-19 containing HH repeats family (10.25% of total reads), glycosidases (9.05% of total reads), HHH peptide family (peptides containing a His triad) (7.65% of total reads), 30.5-kDa family (4.35% of total reads), long-D7 mosquito family (3.95% of total reads), Antigen-5 family (3.42% of total reads), Aegyptin family (2.62% of total reads), Serpin family (2.24% of total reads), Culicine short-D7 protein family (2.08% of total reads), Aedes 5-kDa family (1.67% of total reads), 41-kDa canonical family (1.45% of total reads), and Hyp8.2 Culicine family (1.30% of total reads) (Table 3). Additionally, eight novel protein families were described in P. albipes with either no significant matches to any sequence deposited in the NCBI database, or matching mosquito hypothetical proteins not previously described in sialotranscriptomes; these were named Psor 4.7 kDa, Psor 4.2 kDa, Psor 12 kDa, Psor 6.3 kDa, Psor 4.01 kDa ultrashort-D7 family, Psor 12.8 kDa novel mosquito peptide family, Psor 4.69 kDa weakly similar to Aedes, and Psor 20.44 kDa weakly similar to Culicine. These new protein families account for 1% of all the transcript reads of the P. albipes SG transcriptome. A summary and details related to the annotation of transcripts encoding for S proteins can be found in Table 3 and in Additional file 1.

Table 3 Functional classification of transcripts coding for putative secreted proteins in female Psorophora albipes salivary glands

Proteomics analysis of P. albipes SGs

We employed a proteomics analysis to investigate protein expression in SGs of P. albipes. After Coomassie staining, five bands were revealed as strongly stained at approximate molecular weights (MW) near 191 kDa, 64 to 51 kDa, between 51 and 39 kDa, between 39 and 28 kDa, and one last band with an estimated MW of 28 kDa. Other bands with lesser stain intensity were also revealed in the gel (Figure 1). The NuPAGE gel was arbitrarily cut into 19 fractions and submitted to MS/MS analysis. Contigs matched by two or more tryptic peptides were identified using the P. albipes transcriptome database. Table 4 presents the details of all secreted contigs identified in the P. albipes SG proteome.
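The two-peptide criterion can be illustrated with a deliberately simplified Python sketch. Real MS/MS searches score spectra against candidate peptides and allow for missed cleavages and modifications; here, as an assumption made purely for illustration, observed peptides are matched as exact strings against an in silico tryptic digest (cleavage after K or R, except before P). All sequences and identifiers below are invented. Splitting on an empty regular-expression match requires Python 3.7 or later.

```python
import re


def tryptic_digest(protein, min_length=6):
    """Cleave after K or R unless the next residue is P; keep peptides of useful length."""
    pieces = re.split(r"(?<=[KR])(?!P)", protein.upper())
    return {p for p in pieces if len(p) >= min_length}


def confirmed_contigs(contig_proteins, observed_peptides, min_peptides=2):
    """Return contigs supported by at least `min_peptides` distinct observed peptides."""
    observed = {p.upper() for p in observed_peptides}
    confirmed = {}
    for contig_id, protein in contig_proteins.items():
        matched = tryptic_digest(protein) & observed
        if len(matched) >= min_peptides:
            confirmed[contig_id] = sorted(matched)
    return confirmed


# Toy translated contigs and toy peptide list, invented for illustration
toy_contigs = {
    "toy-1": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ",
    "toy-2": "MSSNNNPLLLKAAAAAAGR",
}
toy_peptides = ["TAYIAK", "LGLIEVQ", "AAAAAAGR"]

result = confirmed_contigs(toy_contigs, toy_peptides)
print(result)
```

In this toy case only the first contig is retained, because the second is supported by a single peptide.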

Figure 1

Salivary gland proteins from the mosquito Psorophora albipes. The left gel lane shows the protein standards with their molecular weights (kDa). The right gel lane shows the P. albipes salivary proteins (Coomassie stained). The grid at the right (F1–F19) shows the area of the gel slices submitted for tryptic digest and tandem mass spectrometry identification.

Table 4 Putative secreted proteins identified in the sialotranscriptome of Psorophora albipes and confirmed by our proteomic studies

We confirmed expression of 20 of 83 (24.1%) S protein families described in the sialotranscriptome. The three strongly stained bands of the gel apparently correspond to F9 (glycosidase family), F11 (apyrase, adenosine deaminase), and F15 (long-D7 mosquito family, 30-kDa Aegyptin family, Antigen-5). Six of ten protein families described as highly expressed in our P. albipes SG transcriptome (glycosidases, 30.5-kDa family, long-D7 mosquito family, 30-kDa Aegyptin-like family, Serpin family, and Culicine D7 mosquito family) were confirmed to be present in the salivary proteome of P. albipes by our subsequent proteomics analysis. Furthermore, seven families (35% of the total families confirmed by proteome) described in the transcriptome as specific for mosquitoes (basic tail mosquito, Aedes 62 kDa, 9.7-kDa family, Hyp8.2 Culicine, 30.5-kDa family, 23.5-kDa family, Aedes 34 kDa) were also confirmed by our proteome analysis. Additionally, the proteomics analysis confirmed the presence of the newly described protein family named “Psor-4 kDa ultrashort-D7 family” (contig Psor-9075). More details about contigs/families found in the proteome of Psorophora can be seen in Figure 1 and Table 4. Tryptic peptides were also assigned to several contigs encoding for H proteins (Additional file 1), such as a P. albipes sphingomyelin phosphodiesterase that shows 55% amino acid identity to the homolog/ortholog from Culex quinquefasciatus. Previous proteomic studies using mosquito SGs identified some abundant protein families in Aedes aegypti such as long-D7 protein, adenosine deaminase, serpin, and 30-kDa Aegyptin [43]. Members of all these families were similarly identified in our P. albipes proteome. Additionally, members of two mosquito-specific families, known as the 34-kDa and 32-kDa families, were identified in our Psorophora proteome; members of these families were described as immunogenic in the proteome study of Ae. aegypti saliva [43].
Also, the antigen-5 protein was confirmed in the Psorophora proteome, and members of this family have been previously described as an SG-secreted product in Culex [44]. Many of the identified proteins have homologs/orthologs in other mosquitoes that have been described as related to blood feeding.

Insight into the P. albipes secreted sialome

The following highlights are related to the secreted sialome of P. albipes compared with others from bloodsucking Nematocera.

Ubiquitous protein families

Enzymes

5ʹ-nucleotidase/apyrases, adenosine deaminase, ribonuclease, endonuclease, alkaline phosphatase, serine proteases, lipase, destabilase/lysozyme, hyaluronidase, and glycosidases were identified. Cathepsins and serine-type carboxypeptidase were also noted but could have H functions. These enzymes have all been found before in mosquito sialotranscriptomes, and their role in blood and sugar feeding has been reviewed [24]. Notable in the case of Psorophora, however, is the finding of both endonuclease (identified by MS/MS in band 12) and hyaluronidase, which were previously found only in C. quinquefasciatus [24] and sand flies, and not in Aedes or Anopheles sialotranscriptomes. This enzyme combination may help decrease skin-matrix viscosity and aid diffusion of salivary components, as well as break down neutrophil extracellular traps [45]. Apyrase, adenosine deaminase, and glycosidases were found by MS/MS in fraction 10, consistent with their expected sizes. The presence of transcripts encoding for sphingomyelin phosphodiesterases (SMases), some of which are highly transcribed with coverages higher than 500, is an unusual finding in mosquito sialotranscriptomes. Although lacking the initial methionine, Psor-15064 matches at position 6 a C. quinquefasciatus protein with 55% identity over 564 amino acids that has a predicted signal peptide. The SMases are members of the DNase I superfamily of enzymes and are responsible for breaking sphingomyelin into phosphocholine and ceramide. In addition, activation of SMase is suggested to play a role in production of ceramide in response to cellular stresses [46]. Tryptic peptides originating from SMase were found in fractions 11 and 12 of the NuPAGE gel in our proteomic analysis. The high expression of this enzyme suggests it may be secreted.

Protease inhibitor domains

Serpins were well expressed, accounting for 2.24% of the reads of the S class, and were identified by MS/MS in gel fraction 12. The protein encoded by Psor-18383 is 44% identical to the FXa-directed anticoagulant precursor of Aedes albopictus. Phylogenetic analysis indicated the presence of at least five distinct gene families (Roman numerals in Figure 2), of which clades I, II, and III are found in both culicines and anophelines (clade III also having a sand fly member), while clades IV and V are exclusively Aedine; clade IV includes the salivary Xase clotting inhibitor of Ae. aegypti [47]. The targets of serpins from clades I, II, III, and V remain to be identified. It is to be noted that the salivary anticlotting agent of anophelines is not a serpin but rather a member of a novel protein family of antithrombins [48, 49]. TIL and Kazal domain-containing peptides may be related to additional anticlotting proteins [50] or antimicrobials. A metalloproteinase inhibitor represents the first such finding in Nematocera sialomes; Psor-25577 is 85% and 78% identical to its Ae. aegypti and C. quinquefasciatus homologs, respectively. Psor-21372 codes for a pacifastin homolog, which may be an H protein. A poorly expressed cystatin may also be an H protein, but tick salivary cystatins are secreted and poorly expressed and could have immunosuppressive function [51, 52].

Figure 2

Phylogram of the salivary serpins of Psorophora albipes. The bootstrapped (1,000 iterations) phylogram was obtained using the serpins from P. albipes and their best-matching proteins from the non-redundant database from NCBI. P. albipes proteins are recognized by the prefix Psor- and a red marker (red circle symbol). Other proteins are represented by the first three letters of the genus name, followed by the first three letters of the species name, followed by their gi| accession number. The Aedes aegypti proteins identified as anti Xa anticlotting are marked with a blue symbol (blue reversed triangle symbol). The sole Lutzomyia longipalpis protein is identified by a green symbol (green circle symbol). The numbers at the nodes represent bootstrap support for 1,000 iterations using the neighbor-joining algorithm. The bar at the bottom indicates 50% amino acid substitution. Roman numerals indicate individual clades with strong bootstrap support.

Immunity-related proteins

Lysozyme, gambicin, cecropin, and defensins were found among antimicrobial agents. Pathogen recognition proteins of the ML domain, Fred/ficolin, Gram negative binding, peptidoglycan recognition, leucine-rich, galectin, and C-type lectin families were identified. Of these, lysozyme was identified in gel fraction 19 by MS/MS.

Yellow protein family

The yellow gene in Drosophila is responsible for tanning of the cuticle, and the mosquito homolog was shown to have a dopachrome oxidase function [53, 54]. This protein family is specific to insects, the royal jelly protein being a member of the superfamily [55]. Interestingly, sand fly sialotranscriptomes, but no other insect sialotranscriptomes, contain two members of this family, recently shown to be scavengers of serotonin [56–58]. The P. albipes sialotranscriptome revealed two members of this family, probably alleles, relatively well expressed, assembled with over 200× coverage. This is the first description of a yellow family member in mosquito sialotranscriptomes; however, these results derive from a high-coverage mosquito sialotranscriptome, and members of this family may yet be found in previously studied species if higher transcript coverage is attained.

Mosquito-specific protein families

A total of 1,319,744 reads (18% of the total reads classified as S products) mapped to transcripts encoding proteins that can be classified according to their sequence similarity to 18 different protein families (21.68% of the total S protein families described in this transcriptome) previously described as unique to mosquitoes, i.e., they are not recognized in any other organism apart from mosquitoes [24]. A total of 69.23% of these mosquito-specific contigs had their best matches originating from Aedes, followed by 8.15% best matching to Ochlerotatus, 6.52% to Culex, and 4.66% to Anopheles. A previous review of Nematocera sialomes [24] proposed that some of these mosquito-specific families appear to be spread across all mosquito genera studied so far, while others show distributions specific to a certain mosquito subfamily and/or genus. Accordingly, we conceptually divided our discussion regarding the mosquito-specific protein families present in Psorophora sialomes into four groups: i) mosquito-specific protein families common to Culicines and Anophelines, ii) mosquito-specific protein families thus far found only in Culicines, iii) mosquito-specific protein families unique to Aedes/Ochlerotatus, and iv) mosquito-specific protein families unique to Culex.

Mosquito-specific protein families common to Culicines and Anophelines

Nine of the 12 protein families previously known as common to Culicine and Anopheline were described in the P. albipes transcriptome: the HHH peptide family, the HHH peptide family 2, the mosquito basic tail family, the salivary protein 16 family, the Aedes/Anopheles darlingi 14-15 family, the gSG8 family, the Hyp6.2 family, the Aedes 62 kDa family, and the Anopheline SG1 family. Although commonly found in mosquito SG transcriptome analyses, no member of these families has been functionally characterized so far. Moreover, studies based on RT-PCR have assigned to some of these family members a tissue and/or sex specificity in their expression that suggests a role in the physiology of Ae. albopictus SGs [24].

Among them, the HHH peptide family was previously suggested to play a role in antimicrobial defense because its His-rich peptides may act as Zn ion chelators [24, 59]. Here, this family was revealed as the fourth most abundantly expressed, with 7.65% of the total reads (Table 3). This family appears to be expanded in Psorophora, with a possible total of at least six genes (Figure 3). The abundant expression of this protein family makes it a good candidate for an exposure marker to mosquito bites. Alignment shows that Psorophora transcripts encode two HHH repeats separated by the amino acids NGTS, while one repeat was seen in the homologs from Aedes, Ochlerotatus, and Culex (Figure 3A); 35% to 55% identity is observed to the Ae. aegypti and Ae. albopictus homologs. The phylogram obtained after alignment of all HHH peptide genes found in mosquitoes shows at least five distinct clades with strong bootstrap support (Figure 3C). Two clades contain solely Psorophora transcripts (in several subclades). The remaining clades are specific to Culex, An. darlingi, Ochlerotatus, and Aedes (Figure 3C).
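Locating His-triad repeats and the spacer between them is easy to automate, as in the minimal sketch below. The peptide used is a toy string invented for illustration, not a P. albipes sequence, and the function names are hypothetical; a run of three or more consecutive His residues is treated as one repeat, which is an assumption rather than the authors' definition.

```python
import re


def his_repeats(peptide):
    """Return (start, end) spans of runs of three or more consecutive His residues."""
    return [m.span() for m in re.finditer(r"H{3,}", peptide.upper())]


def spacers_between(peptide, spans):
    """Return the residues lying between consecutive His runs."""
    seq = peptide.upper()
    return [seq[a_end:b_start] for (_, a_end), (b_start, _) in zip(spans, spans[1:])]


toy_peptide = "AAHHHNGTSHHHKK"  # invented example with two triads around NGTS
spans = his_repeats(toy_peptide)
spacers = spacers_between(toy_peptide, spans)
print(spans, spacers)
```

Applied to translated contigs, a count of two spans would correspond to the Psorophora pattern and a count of one to the pattern reported for the Aedes, Ochlerotatus, and Culex homologs.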

Figure 3

Phylogram of salivary protein families derived from Psorophora albipes sialotranscriptome commonly found in Culicine and Anopheline mosquito sialotranscriptomes. A, C: HHH peptide family. B, D: Mosquito basic tail family. Clustal alignment (A and B) and dendrogram (C and D) of all HHH peptide family and mosquito basic tail family members, respectively, derived from the Psorophora sialotranscriptome. The symbols above the alignment indicate (*) identical sites, (:) conserved sites, and (.) less-conserved sites. The phylogram was derived from the alignment of Psorophora proteins (indicated by Psor- and their contig number) with their best matches in the non-redundant database. The first three letters indicate the genus name from which each protein originates, followed by the first three letters of the species name, followed by the NCBI accession number. The numbers on the tree bifurcations indicate the percentage of bootstrap support above 75%. The scale bar at the bottom represents 10% amino acid substitution. Sequences were aligned by the ClustalW program, and the phylogram was made with the Mega package after 10,000 bootstraps with the neighbor-joining algorithm.

Mosquito basic tail proteins contain a Lys dipeptide tail (Figure 3B) and have been suggested to bind negatively charged phospholipids found in cell membranes, such as those on the surface of platelets [60]. They may also be associated with plasminogen activation [61, 62]. In the Psorophora transcriptome, six contigs (0.43% of the total contigs classified as S products) match mosquito basic tail peptides with 50% identity to Ae. albopictus family members (Additional file 1). Three tryptic peptides in our proteome analysis match contig Psor-13880, which encodes for a member of this family. Phylogenetic analysis of the basic tail mosquito family supports divergence of Culicine salivary proteins from the Anopheline family members, with Anopheline and Culicine proteins grouped in distinct clades (Figure 3D). Although Anophelines lack the basic tail, they have a conserved backbone (Figure 3B). In the Culicine clade, we observe that all Psorophora proteins are isolated in a genus-specific branch, separated from the other Culicine proteins with strong bootstrap support (Figure 3D).

Family Hyp6.2, represented by three truncated sequences, is approximately 45% identical to the homologs from Ochlerotatus (Additional file 1). Additionally, all the contigs found in the P. albipes transcriptome from the mosquito-specific families HHH family-2, salivary protein 16 family, Aedes/An. darlingi family, gSG8 family, and Aedes 62-kDa family have as their best matches the homologs from Ae. aegypti, with identities varying from 42% to 80% (Additional file 1). Proteome analysis revealed tryptic peptides originating from Psorophora family members showing higher similarities to the Aedes 62-kDa family (Additional file 1).

Mosquito-specific protein families thus far found only in Culicines

To date, five protein families found in the P. albipes sialotranscriptome are unique to Culicines. Two of these (9.7-kDa and Hyp8.2 Culicine protein families) may play a role in blood feeding, as they are abundantly expressed in female Ae. albopictus SGs [63]. The 30.5-kDa and 23.5-kDa protein families appear to be involved in mosquito sugar feeding due to their reported expression in male and female SGs [63]; however, the tissue specificity of the fifth protein family, namely the GQ-rich Culicine family, is still unknown [63, 64]. So far, no member from these families has been functionally characterized.

Two abundantly expressed families in our transcriptome analysis are the 30.5-kDa (4.35% of total reads encoding S products) and Hyp8.2 Culicine families (1.30% of total reads encoding S products). The first family was also among the 50 most-expressed families in this transcriptome (Table 3). Expression of these two families in Psorophora SGs was confirmed by our proteome analysis (Figure 1 and Table 4). Overall, they share 53% amino acid identity with the family member from Ae. albopictus (Additional file 1). The Psorophora 9.7-kDa and 23.5-kDa families had Ae. aegypti proteins as their best BLAST similarity matches; tryptic peptides identifying these family members were found in our proteome analysis. In contrast, members of the GQ-rich Culicine family showed 58% identity to their homologs from C. quinquefasciatus (Additional file 1).

Here we performed phylogenetic analysis of the 9.7-kDa family (Figure 4A), the 30.5-kDa family (Figure 4B), the 23.5-kDa family (Figure 4C), and GQ-rich family (Figure 4D). Overall, all four phylograms show Psorophora proteins phylogenetically far from Culex proteins. The phylogenetic tree of the 9.7-kDa family (Figure 4A) shows at least four different transcript clusters in Psorophora (one cluster in the branch named Psorophora-I and three in Psorophora-II). Also, several gene duplications can be found in each cluster. This phylogeny shows that Aedes family members are closer to Psorophora family members, while Culex proteins appear as an outgroup (Figure 4A).

Figure 4

Phylogram of salivary protein families derived from Psorophora albipes sialotranscriptome that are exclusively found in Culicine mosquitoes. A: 9.7-kDa family. B: 30.5-kDa family. C: 23.5-kDa family. D: GQ-rich family. The phylogram derives from the alignment of Psorophora proteins (indicated by Psor- and their contig number) and their comparison with their best matches in the non-redundant database. The first three letters indicate the genus name, followed by the first three letters of the species name, followed by the NCBI accession numbers. Numbers on the tree bifurcations indicate the percentage of bootstrap support above 75%. The scale bar at the bottom represents 10% amino acid substitution. Sequences were aligned by the ClustalW program, and the phylogram was made with the Mega package after 10,000 bootstraps with the neighbor-joining algorithm.

The phylograms of the 30.5-kDa (Figure 4B) and 23.5-kDa (Figure 4C) families confirm the pattern seen for the 9.7-kDa family (Figure 4A), in that Psorophora proteins are grouped in the same clade with Aedes proteins (Figure 4A–C). The GQ-rich family shows Psorophora members grouped within the same clade containing Ochlerotatus proteins (Figure 4D). Although previous studies using 18S rDNA sequence suggested Psorophora species as a sister group to Culex and/or a sister group to the Aedes/Ochlerotatus species [12, 16], our results suggest, based on the composition of the salivary proteins, that Psorophora is much closer to Aedes than to Culex.

Mosquito-specific protein families unique to Aedes/Ochlerotatus mosquitoes

Three protein families previously described as exclusive to Aedes/Ochlerotatus were found in the Psorophora genus: the Aedes 6.5–8.5-kDa family (Figure 5A), the Aedes W-rich peptides family (Figure 5B), and the Aedes 34-kDa family (Figure 5C). Previous studies showed that these families are female- and SG-specific [63, 64], but their function remains unknown. Alignment of the transcripts found in Psorophora encoding the Aedes 6.5–8.5-kDa (Figure 5A) and Aedes W-rich peptides (Figure 5B) families reveals higher identity in their amino acid sequences to Ae. albopictus (55%) and to Ochlerotatus triseriatus (62%), respectively (Additional file 1). Here, only the 34-kDa family was confirmed as present in the Psorophora SG proteome (fraction F14; Figure 1, Table 4). Alignments of 34-kDa family members showed 29–37% identity to their homologous proteins from Aedes mosquitoes (Figure 5C and Additional file 1). The phylogram shows all Psorophora proteins of the 34-kDa family grouped in the same clade, while the second clade of the phylogram contains all the Aedes/Ochlerotatus gene products (Figure 5C).

Figure 5

Phylogenetic analyses of salivary Psorophora protein families previously found exclusively in Aedes/Ochlerotatus mosquitoes. Clustal alignment of all Aedes 6.5–8.5-kDa proteins (A) and Aedes W-rich peptides (B) derived from Psorophora sialotranscriptome. Symbols above the alignment indicate (*) identical sites, (:) conserved sites, and (.) less-conserved sites. Aedes 34-kDa family alignment and bootstrapped phylogram (C). The numbers on the tree bifurcations indicate the percentage bootstrap support above 75%. The scale bar at the bottom represents 10% amino acid substitution. Sequences were aligned by the ClustalW program, and the dendrogram was made with the Mega package after 10,000 bootstraps with the neighbor-joining algorithm.

Protein family so far found only in Culex

The Culex W-rich protein (WRP)/16-kDa family is a salivary protein family so far uniquely found in Culex sialotranscriptomes, where nearly 20 genes coding for this family are known, subdivided into different subfamilies varying in their number of cysteine residues [44, 65]. Although the family is highly expressed and specific to Culex, its function remains unclear. Here we report for the first time members of this family originating from a non-Culex mosquito. A total of 2,138 reads were found grouped into two contigs, Psor-32363 and Psor-32364, the latter being a truncated variant of the first with a few amino acid changes. The mature MW of the encoded polypeptides is approximately 24 kDa with an estimated isoelectric point of 7.1 and amino acid sequences that are W rich (Additional file 1). Interestingly, the Psorophora protein best matches two putative Ae. aegypti proteins never previously described in sialotranscriptomes. Alignment of the P. albipes sequence with Aedes and selected Culex sequences shows three conserved tryptophan residues among a total of 8 identities and 22 similarities, with a total similarity of only 14% (Figure 6A). Phylogenetic analysis (Figure 6B) groups the Psorophora and Aedes sequences with 100% bootstrap support within a clade of four Culex proteins having 99% bootstrap support (Clade III in Figure 6B). These results suggest that Culicines shared a common ancestor of a gene coding for this protein family that expanded in Culex but not in Aedes, indicating this family was not a Culex “invention.” The Psorophora member of the family thus helped us to partially understand the evolution of this family in Culex by providing a link between Culex and Aedes sequences.

Figure 6

Mosquito proteins of the W-rich peptides/16-kDa protein family of Culex. (A) Clustal alignment. Symbols above the alignment indicate (*) identical sites, (:) conserved sites, and (.) less-conserved sites. (B) Bootstrapped phylogram derived from the alignment in A. Numbers on the tree bifurcations indicate the percentage bootstrap support above 50%. The scale bar at the bottom represents 20% amino acid substitution. Sequences were aligned by the ClustalW program, and the dendrogram was made with the Mega package after 1,000 bootstraps with the neighbor-joining algorithm. The Psorophora albipes proteins are recognized by the prefix Psor- and a red marker (red circle symbol). Aedes proteins are indicated with a blue marker (blue circle symbol). Other proteins are represented by the first three letters of the genus name, followed by the first three letters of the species name, followed by the gi| accession number. Roman numerals indicate individual clades with strong bootstrap support.
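The identity and similarity counts quoted for this alignment can be tallied from the Clustal consensus line. What follows is a minimal Python sketch: the consensus string is a made-up example rather than the Figure 6A alignment, the function name `tally_consensus` is hypothetical, and using the number of alignment columns as the denominator is an assumption, since the text does not state how the 14% figure was derived.

```python
def tally_consensus(consensus: str) -> dict:
    """Count Clustal consensus symbols in a consensus line.

    '*' marks identical sites, ':' conserved sites, '.' less-conserved sites.
    Percentages use the number of alignment columns as denominator (assumption).
    """
    columns = len(consensus)
    if columns == 0:
        raise ValueError("consensus line is empty")
    identical = consensus.count("*")
    conserved = consensus.count(":")
    less_conserved = consensus.count(".")
    similar = identical + conserved + less_conserved
    return {
        "columns": columns,
        "identical": identical,
        "conserved": conserved,
        "less_conserved": less_conserved,
        "percent_identical": 100.0 * identical / columns,
        "percent_similar": 100.0 * similar / columns,
    }


# Hypothetical 10-column consensus line, for illustration only.
example_consensus = "*:. *  :*."
tally = tally_consensus(example_consensus)
print(tally)
```

For a real alignment, the consensus lines of all alignment blocks would first be concatenated, taking care to keep the spaces that mark non-conserved columns.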

Other putative secreted proteins

Two putative S protein sequences match black fly proteins previously thought to be unique to Simulium sialomes. Three protein families previously thought to be orphans of Aedes and Ochlerotatus (Aedes 7-kDa and 5-kDa families and Ochlerotatus OT-19 family) were deorphanized. Eight novel salivary protein families were found in the Psorophora sialotranscriptome, four of which appear unique to Psorophora, while the others have matches to mosquito hypothetical proteins not previously described in sialotranscriptomes.

We additionally identified 372 transcript sequences encoding secreted polypeptides, most of which have no relevant matches to any sequence deposited thus far in the NR database. Two of these were identified by proteome analysis. All details of these proteins are in the hyperlinked Excel spreadsheet available in Additional file 1.

P. albipes similarities to other mosquito species

The availability of the genomes of Ae. aegypti, C. quinquefasciatus, and Anopheles gambiae [66–68] allows for comparisons of the protein sequences of Psorophora to those deduced from the three mosquito genomes. We have determined (Additional file 1) the BLAST score ratios of each protein for the three genomes by dividing the BLAST score found for the blastp result against one of the three mosquito proteomes by the BLAST score of the Psorophora protein blasted against itself [39]. The comparisons indicate that Aedes is the mosquito most closely related to Psorophora, followed by Culex and Anopheles (Figure 7). They also show that the S class of proteins has the lowest ratio of all, while those for proteasome machinery, nuclear regulation, and cytoskeletal proteins are among the most conserved (Figure 7). In the S class computed above, those 372 proteins indicated as “Other putative secreted peptides” were not included, as they are of uncertain nature and would further decrease the score ratios. This divergence of salivary proteins in mosquitoes has been previously reported for other taxa [22, 24, 63–65, 69–71].
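The score-ratio calculation described above can be illustrated with a short Python sketch. The scores, class labels, and function names below are hypothetical placeholders and not values from Additional file 1; the sketch only assumes that each protein has a self-hit score and one best-hit score per reference proteome.

```python
from statistics import mean


def blast_score_ratio(hit_score: float, self_score: float) -> float:
    """Best-hit BLAST score divided by the query's score against itself."""
    if self_score <= 0:
        raise ValueError("self_score must be positive")
    return hit_score / self_score


def mean_ratio_by_class(records):
    """Average the score ratio per (functional class, reference species)."""
    grouped = {}
    for func_class, self_score, hits in records:
        for species, score in hits.items():
            ratio = blast_score_ratio(score, self_score)
            grouped.setdefault((func_class, species), []).append(ratio)
    return {key: mean(values) for key, values in grouped.items()}


# Hypothetical records: (functional class, self score, best score per proteome).
records = [
    ("secreted", 200.0, {"Aedes": 90.0, "Culex": 60.0, "Anopheles": 40.0}),
    ("secreted", 100.0, {"Aedes": 50.0, "Culex": 35.0, "Anopheles": 25.0}),
    ("cytoskeletal", 400.0, {"Aedes": 380.0, "Culex": 360.0, "Anopheles": 340.0}),
]

summary = mean_ratio_by_class(records)
for (func_class, species), ratio in sorted(summary.items()):
    print(f"{func_class:>12} vs {species:<9} mean ratio = {ratio:.3f}")
```

A ratio close to 1 means the best match in the reference proteome scores almost as well as the protein against itself; lower ratios indicate greater divergence. A protein with no hit in a proteome could be given a score of 0, which is again an assumption of this sketch.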

Figure 7

Comparison of Psorophora albipes protein sequences to those of Aedes aegypti (black square symbol), Culex quinquefasciatus (dark gray symbol), and Anopheles gambiae (slate gray square symbol) by blastp score ratio. The bars and lines represent the average ratio and standard error calculated by dividing the score of the best-matching mosquito protein by the self-score of the protein for the functional categories indicated in the abscissa. For details, see text.

Polymorphism of P. albipes coding sequences inferred from the RNAseq data

RNAseq data produce high contig coverage. In our CDS set, 1,822 had an average coverage of 100 or more per nucleotide site, allowing reliable identification of SNPs using the tools BWA and Samtools (see Methods). From this CDS set we excluded the 372 proteins indicated in the previous section, plus all of the unknown class, and all CDS sharing > 95% protein sequence similarity with another CDS (to avoid overrepresentation of alleles), producing a set of 1,100 CDS. When comparing the number of synonymous sites per 100 codons among different functional classes, the protein synthesis and secreted classes have the smallest values, while the proteasome and immune categories have the highest (Figure 8A and Table 5). When the numbers of non-synonymous SNPs are compared, the picture reverses (Figure 8B), with the secreted category having the highest value, close to 0.33 per 100 codons (Table 5). The overall non-synonymous to synonymous rate also shows the secreted class to have the highest ratio (Table 5). This increased non-synonymous polymorphism is not an artifact resulting from increased read coverage of the contigs of the secreted class, because the protein synthesis class of contigs has an even higher read coverage but has the second smallest non-synonymous polymorphism index. It is possible that this high value of non-synonymous polymorphism observed for the secreted class may result from chimeric assembly of coding sequences originating from multiple recently duplicated genes coding for very similar proteins. At any rate, this high polymorphism may underlie the mechanisms leading to accelerated evolution of salivary proteins observed in bloodsucking arthropods.
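The per-class polymorphism indices discussed above can be sketched as follows. This is an illustrative Python example under stated assumptions, not the pipeline used for Table 5: the CDS records and field names are hypothetical, SNP counts are taken as already classified into synonymous and non-synonymous, and the ratio is computed from the class means, which may differ from how the table was derived.

```python
from collections import defaultdict
from statistics import mean

MIN_COVERAGE = 100  # average per-site coverage threshold quoted in the text


def per_100_codons(snp_count: int, cds_length_nt: int) -> float:
    """Number of polymorphic sites normalized to 100 codons of coding sequence."""
    codons = cds_length_nt // 3
    if codons <= 0:
        raise ValueError("CDS must contain at least one codon")
    return 100.0 * snp_count / codons


def summarize_by_class(records, min_coverage=MIN_COVERAGE):
    """Average synonymous and non-synonymous sites per 100 codons per class."""
    syn = defaultdict(list)
    nonsyn = defaultdict(list)
    for rec in records:
        if rec["coverage"] < min_coverage:
            continue  # skip CDS whose coverage is too low for reliable SNP calls
        syn[rec["cls"]].append(per_100_codons(rec["syn"], rec["length_nt"]))
        nonsyn[rec["cls"]].append(per_100_codons(rec["nonsyn"], rec["length_nt"]))
    summary = {}
    for cls in syn:
        s, n = mean(syn[cls]), mean(nonsyn[cls])
        # Ratio of class means (assumption); undefined when no synonymous SNPs.
        summary[cls] = {"syn": s, "nonsyn": n, "ratio": n / s if s > 0 else None}
    return summary


# Hypothetical CDS records, for illustration only.
cds_records = [
    {"id": "cds1", "cls": "secreted", "length_nt": 600, "coverage": 250.0, "syn": 1, "nonsyn": 2},
    {"id": "cds2", "cls": "secreted", "length_nt": 300, "coverage": 120.0, "syn": 0, "nonsyn": 1},
    {"id": "cds3", "cls": "protein synthesis", "length_nt": 900, "coverage": 900.0, "syn": 3, "nonsyn": 0},
    {"id": "cds4", "cls": "secreted", "length_nt": 450, "coverage": 40.0, "syn": 5, "nonsyn": 5},
]

polymorphism = summarize_by_class(cds_records)
for cls, values in polymorphism.items():
    print(cls, values)
```

In this toy data set, `cds4` is dropped by the coverage filter, so its SNP counts do not influence the secreted-class averages.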

Figure 8

Number of polymorphic sites per 100 codons in Psorophora albipes proteins. Bars represent the average and standard errors of synonymous (A) and non-synonymous (B) sites per 100 codons in the indicated functional categories.

Table 5 Psorophora albipes polymorphisms detected on a set of 1,100 coding sequences of 16 functional classes

Conclusions

The sialotranscriptome of P. albipes as described here is the first, or among the first, to use solely Illumina sequences for its assembly in the absence of a reference genome. Over 3,000 coding sequences were recovered, 1,790 of which were submitted to GenBank. This is also the first transcriptome of a member of the Psorophora genus. As expected, the protein sequences presented more similarities to Aedes, followed by Culex and Anopheles proteins. Despite this more Aedine nature, P. albipes presented some Culex characters, such as the presence of endonuclease and hyaluronidase, enzymes common in sand flies and black flies but, among mosquitoes, so far found only in Culex. A Psorophora protein similar to the WRP/16-kDa family, also so far unique to Culex, allowed the discovery of a “missing link” between this Culex family and hypothetical Ae. aegypti proteins, indicating that this gene family is ancestral in all Culicines but poorly or not expressed in Aedes SGs. Orphan protein families from Aedes and Ochlerotatus were deorphanized, and several new families of proteins were identified, four of which appear unique to Psorophora, supporting the idea that sialotranscriptomes of new bloodsucking genera yield at least two novel protein families [72]. However, these novel sequences may result from misassemblies or chimeras. Further sequencing of other Psorophora species may clarify this area. Also unique to Psorophora is the finding of SMase, not previously found in mosquito sialomes. Because the sample was derived from 50 field-collected mosquitoes, we were also able to derive an estimate of SNPs and the rate of synonymous and non-synonymous mutations in this data set.

Availability of supporting data

All data from the transcriptome and proteome analysis of P. albipes SGs are disclosed in Additional file 1, a hyperlinked Excel spreadsheet available at http://exon.niaid.nih.gov/transcriptome/Psorophora_albipes/Pso-s2-web.xlsx. Raw reads were deposited in the SRA of the NCBI under bioproject numbers PRJNA208524 and 208958 and raw data file SRR908278. One thousand seven hundred and ninety coding sequences have been publicly deposited in the Transcriptome Shotgun Assembly project at DDBJ/EMBL/GenBank under accession GALA00000000. The version described in this paper is the first version, GALA01000000, ranging from GALA01000001 to GALA01001790.

Abbreviations

ABySS: Assembly by short sequences software

bcftools: Binary call format tools

CDS: Coding sequences

H: Housekeeping proteins

MS: Mass spectrometry

MS/MS: Tandem MS

MW: Molecular weight

NCBI: National Center for Biotechnology Information

NGS: Next-generation sequencing

NISC: NIH Intramural Sequencing Center

S: Secreted proteins

samtools: Sequence alignment/map tools

SG: Salivary gland

SMase: Sphingomyelin phosphodiesterase

SNP: Single-nucleotide polymorphism

SRA: Sequence read archives

U: Proteins of unknown function

WRP: W-rich protein.

References

  1. Molaei G, Andreadis T, Armstrong P, Diuk-Wasser M: Host-feeding patterns of potential mosquito vectors in Connecticut, USA: Molecular analysis of bloodmeals from 23 species of Aedes, Anopheles, Culex, Coquillettidia, Psorophora and Uranotaenia. J Med Entomol. 2008, 45 (6): 1143-1151. 10.1603/0022-2585(2008)45[1143:HPOPMV]2.0.CO;2.

  2. Dos Santos Silva J, Alencar J, JM C, Seixas-Lorosa E, Guimaraes A: Feeding patterns of mosquitoes (Diptera:Culicidae) in six Brazilian environmental preservation areas. J Vector Ecol. 2012, 37 (2): 342-350. 10.1111/j.1948-7134.2012.00237.x.

  3. Turell M, Barth J, Coleman R: Potential for Central American mosquitoes to transmit epizootic and enzootic strains of Venezuelan equine encephalitis virus. J Am Mosq Control Assoc. 1999, 15 (3): 295-298.

  4. Turell M, Jones J, Sardelis M, Dohm D, Coleman R, Watts D, Fernandez R, Calampa C, Klein T: Vector competence of Peruvian mosquitoes (Diptera:Culicidae) for epizootic and enzootic strains of Venezuelan equine encephalomyelitis virus. J Med Entomol. 2000, 37 (6): 835-839. 10.1603/0022-2585-37.6.835.

  5. Turell M, Dohm D, Fernandez R, Calampa C, O’Guinn ML: Vector competence of Peruvian mosquitoes (Diptera:Culicidae) for a subtype IIIC virus in the Venezuelan equine encephalomyelitis complex isolated from mosquitoes captured in Peru. J Am Mosq Control Assoc. 2006, 22 (1): 70-75. 10.2987/8756-971X(2006)22[70:VCOPMD]2.0.CO;2.

  6. Unlu I, Kramer W, Roy A, LD F: Detection of West Nile virus RNA in mosquitoes and identification of mosquito blood meals collected at alligator farms in Louisiana. J Med Entomol. 2010, 47 (4): 625-633. 10.1603/ME09087.

  7. Pitzer J, Byford R, Vuong H, Steiner R, RJ C, Caccamise D: Potential vectors of West Nile virus in a semiarid environment: Dona Ana County, New Mexico. J Med Entomol. 2009, 46 (6): 1474-1482. 10.1603/033.046.0634.

  8. Moreno E, Rocco I, Bergo E, Brasil R, Siciliano M, Suzuki A, Silveira V, Bisordi I, Souza R, Group YFW: Reemergence of yellow fever: detection of transmission in the State of Sao Paulo, Brazil, 2008. Rev Soc Bras Med Trop. 2011, 44 (3): 290-296. 10.1590/S0037-86822011005000041.

  9. Laporta G, Ribeiro M, Ramos D, Sallum M: Spatial distribution of arboviral mosquito vectors (Diptera, Culicidae) in Vale do Ribeira in the South-eastern Brazilian Atlantic Forest. Cad Saude Publica. 2012, 28 (2): 229-238. 10.1590/S0102-311X2012000200003.

  10. Ross H: Conflict with Culex. Mosq News. 1951, 11: 128-132.

  11. Besansky N, Fahey G: Utility of the white gene in estimating phylogenetic relationships among mosquitoes (Diptera: Culicidae). Mol Biol Evol. 1997, 14: 442-454. 10.1093/oxfordjournals.molbev.a025780.

  12. Forattini O: Culicidologia medica, vol. 2. 2002, São Paulo: Editora da Universidade de São Paulo – EDUSP

  13. Isoe J: Comparative analysis of the vitellogenin genes of the Culicidae. 2000, Tucson, AZ: University of Arizona

  14. Savage H, Strickman D: The genus and subgenus categories within Culicidae and placement of Ochlerotatus as a subgenus of Aedes. J Am Mosq Control Assoc. 2004, 20: 208-214.

  15. Reinert J, Harbach R, Kitching I: Phylogeny and classification of Aedini (Diptera:Culicidae), based on morphological characters of all life stages. Zool J Linn Soc. 2004, 142: 289-368. 10.1111/j.1096-3642.2004.00144.x.

  16. Shepard JJ, Andreadis TG, Vossbrinck CR: Molecular phylogeny and evolutionary relationships among mosquitoes (Diptera: Culicidae) from the northeastern United States based on small subunit ribosomal DNA (18S rDNA) sequences. J Med Entomol. 2006, 43 (3): 443-454. 10.1603/0022-2585(2006)43[443:MPAERA]2.0.CO;2.

  17. Ribeiro JMC: Role of arthropod saliva in blood feeding. Ann Rev Entomol. 1987, 32: 463-478. 10.1146/annurev.en.32.010187.002335.

  18. Camez A, Dupuy E, Bellucci S, Calvo F, Bryckaert MC: Human platelet-tumor cell interactions vary with the tumor cell lines. Invasion-Metastasis. 1986, 6: 321-334.

  19. Calvo E, Mans BJ, Ribeiro JM, Andersen JF: Multifunctionality and mechanism of ligand binding in a mosquito antiinflammatory protein. Proc Natl Acad Sci USA. 2009, 106 (10): 3728-3733. 10.1073/pnas.0813190106.

  20. Calvo E, Mizurini D, Sa-Nunes A, Ribeiro JM, Andersen J, Mans B, Monteiro R, Kotsyfakis M, Francischetti I: Alboserpin, a factor Xa inhibitor from the mosquito vector of yellow fever, binds heparin and membrane phospholipids and exhibits antithrombotic activity. J Biol Chem. 2011, 286: 27998-28010. 10.1074/jbc.M111.247924.

  21. Calvo E, Tokusamu F, Mizurini D, McPhie P, Narum D, Ribeiro JM, Monteiro R, Francischetti I: Aegyptin displays high-affinity for the von Willebrand factor binding site (QGQOGVMGF) in collagen and inhibits carotid thrombus formation in vivo. FEBS J. 2010, 277 (2): 413-427. 10.1111/j.1742-4658.2009.07494.x.

  22. Calvo E, Pham VM, Marinotti O, Andersen JF, Ribeiro JM: The salivary gland transcriptome of the neotropical malaria vector Anopheles darlingi reveals accelerated evolution of genes relevant to hematophagy. BMC Genomics. 2009, 10: 57. 10.1186/1471-2164-10-57.

  23. Ribeiro JM, Francischetti IM: Role of arthropod saliva in blood feeding: sialome and post-sialome perspectives. Annu Rev Entomol. 2003, 48: 73-88. 10.1146/annurev.ento.48.060402.102812.

  24. Ribeiro JM, Mans BJ, Arca B: An insight into the sialome of blood-feeding Nematocera. Insect Biochem Mol Biol. 2010, 40 (11): 767-784. 10.1016/j.ibmb.2010.08.002.

  25. Valenzuela JG: High-throughput approaches to study salivary proteins and genes from vectors of disease. Insect Biochem Mol Biol. 2002, 32 (10): 1199-1209. 10.1016/S0965-1748(02)00083-8.

  26. Consoli RAGB, de Oliveira RL: Principais mosquitos de importancia sanitaria no Brasil. 1994, Rio de Janeiro: Editora Fiocruz

  27. Birol I, Jackman SD, Nielsen CB, Qian JQ, Varhol R, Stazyk G, Morin RD, Zhao Y, Hirst M, Schein JE, et al: De novo transcriptome assembly with ABySS. Bioinformatics (Oxford, England). 2009, 25 (21): 2872-2877. 10.1093/bioinformatics/btp367.

  28. Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I: ABySS: a parallel assembler for short read sequence data. Genome Res. 2009, 19 (6): 1117-1123. 10.1101/gr.089532.108.

  29. Zhao QY, Wang Y, Kong YM, Luo D, Li X, Hao P: Optimizing de novo transcriptome assembly from short-read RNA-Seq data: a comparative study. BMC Bioinformatics. 2011, 12 (Suppl 14): S2. 10.1186/1471-2105-12-S14-S2.

  30. Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, et al: Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011, 29 (7): 644-652. 10.1038/nbt.1883.

  31. Karim S, Singh P, Ribeiro JM: A deep insight into the sialotranscriptome of the Gulf Coast tick, Amblyomma maculatum. PLoS ONE. 2011, 6 (12): e28525. 10.1371/journal.pone.0028525.

  32. Nielsen H, Brunak S, von Heijne G: Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng. 1999, 12 (1): 3-9. 10.1093/protein/12.1.3.

  33. Duckert P, Brunak S, Blom N: Prediction of proprotein convertase cleavage sites. Protein Eng Des Sel. 2004, 17 (1): 107-112. 10.1093/protein/gzh013.

  34. Sonnhammer EL, von Heijne G, Krogh A: A hidden Markov model for predicting transmembrane helices in protein sequences. Proc Int Conf Intell Syst Mol Biol. 1998, 6: 175-182.

  35. Julenius K, Molgaard A, Gupta R, Brunak S: Prediction, conservation analysis, and structural characterization of mammalian mucin-type O-glycosylation sites. Glycobiology. 2005, 15 (2): 153-164.

  36. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25 (17): 3389-3402. 10.1093/nar/25.17.3389.

  37. Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG: The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997, 25 (24): 4876-4882. 10.1093/nar/25.24.4876.

  38. Kumar S, Tamura K, Nei M: MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief Bioinform. 2004, 5 (2): 150-163. 10.1093/bib/5.2.150.

  39. Rasko DA, Myers GS, Ravel J: Visualization of comparative genomic analyses by BLAST score ratio. BMC Bioinformatics. 2005, 6: 2. 10.1186/1471-2105-6-2.

  40. Li H, Durbin R: Fast and accurate long-read alignment with Burrows-Wheeler transform. Bioinformatics (Oxford, England). 2010, 26 (5): 589-595. 10.1093/bioinformatics/btp698.

  41. Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, Genome Project Data Processing S: The sequence alignment/map format and SAMtools. Bioinformatics (Oxford, England). 2009, 25 (16): 2078-2079. 10.1093/bioinformatics/btp352.

  42. Chagas AC, Calvo E, Pimenta PF, Ribeiro JM: An insight into the sialome of Simulium guianense (Diptera: Simuliidae), the main vector of River Blindness Disease in Brazil. BMC Genomics. 2011, 12: 612. 10.1186/1471-2164-12-612.

  43. Wasinpiyamongkol L, Patramool S, Luplertlop N, Surasombatpattana P, Doucoure S, Mouchet F, Seveno M, Remoue F, Demettre E, Brizard J, et al: Blood-feeding and immunogenic Aedes aegypti saliva proteins. Proteomics. 2010, 10: 1906-1916. 10.1002/pmic.200900626.

  44. Ribeiro JM, Charlab R, Pham VM, Garfield M, Valenzuela JG: An insight into the salivary transcriptome and proteome of the adult female mosquito Culex pipiens quinquefasciatus. Insect Biochem Mol Biol. 2004, 34 (6): 543-563. 10.1016/j.ibmb.2004.02.008.

  45. Wartha F, Beiter K, Normark S, Henriques-Normark B: Neutrophil extracellular traps: casting the NET over pathogenesis. Curr Opin Microbiol. 2007, 10 (1): 52-56. 10.1016/j.mib.2006.12.005.

  46. Hannun Y, Obeid L: The Ceramide-centric universe of lipid-mediated cell regulation: stress encounters of the lipid kind. J Biol Chem. 2002, 277 (29): 25847-25850. 10.1074/jbc.R200008200.

  47. Stark KR, James AA: Isolation and characterization of the gene encoding a novel factor Xa-directed anticoagulant from the yellow fever mosquito, Aedes aegypti. J Biol Chem. 1998, 273 (33): 20802-20809. 10.1074/jbc.273.33.20802.

  48. Francischetti IM, Valenzuela JG, Ribeiro JM: Anophelin: kinetics and mechanism of thrombin inhibition. Biochemistry. 1999, 38 (50): 16678-16685. 10.1021/bi991231p.

  49. Valenzuela JG, Francischetti IM, Ribeiro JM: Purification, cloning, and synthesis of a novel salivary anti-thrombin from the mosquito Anopheles albimanus. Biochemistry. 1999, 38 (34): 11209-11215. 10.1021/bi990761i.

  50. Watanabe RM, Soares TS, Morais-Zani K, Tanaka-Azevedo AM, Maciel C, Capurro ML, Torquato RJ, Tanaka AS: A novel trypsin Kazal-type inhibitor from Aedes aegypti with thrombin coagulant inhibitory activity. Biochimie. 2010, 92 (8): 933-939. 10.1016/j.biochi.2010.03.024.

  51. Kotsyfakis M, Anderson JM, Andersen JF, Calvo E, Francischetti IM, Mather TN, Valenzuela JG, Ribeiro JM: Cutting edge: Immunity against a “silent” salivary antigen of the Lyme vector Ixodes scapularis impairs its ability to feed. J Immunol. 2008, 181 (8): 5209-5212.

  52. Kotsyfakis M, Sa-Nunes A, Francischetti IM, Mather TN, Andersen JF, Ribeiro JM: Antiinflammatory and immunosuppressive activity of sialostatin L, a salivary cystatin from the tick Ixodes scapularis. J Biol Chem. 2006, 281 (36): 26298-26307. 10.1074/jbc.M513010200.

  53. Han Q, Fang J, Ding H, Johnson JK, Christensen BM, Li J: Identification of Drosophila melanogaster yellow-f and yellow-f2 proteins as dopachrome-conversion enzymes. Biochem J. 2002, 368 (Pt 1): 333-340.

  54. Johnson JK, Li J, Christensen BM: Cloning and characterization of a dopachrome conversion enzyme from the yellow fever mosquito, Aedes aegypti. Insect Biochem Mol Biol. 2001, 31 (11): 1125-1135. 10.1016/S0965-1748(01)00072-8.

  55. Albert S, Bhattacharya D, Klaudiny J, Schmitzova J, Simuth J: The family of major royal jelly proteins and its evolution. J Mol Evol. 1999, 49 (2): 290-297. 10.1007/PL00006551.

  56. Charlab R, Valenzuela JG, Rowton ED, Ribeiro JM: Toward an understanding of the biochemical and pharmacological complexity of the saliva of a hematophagous sand fly Lutzomyia longipalpis. Proc Natl Acad Sci USA. 1999, 96 (26): 15155-15160. 10.1073/pnas.96.26.15155.

  57. Valenzuela JG, Garfield M, Rowton ED, Pham VM: Identification of the most abundant secreted proteins from the salivary glands of the sand fly Lutzomyia longipalpis, vector of Leishmania chagasi. J Exp Biol. 2004, 207 (Pt 21): 3717-3729.

  58. Xu X, Oliveira F, Chang BW, Collin N, Gomes R, Teixeira C, Reynoso D, My Pham V, Elnaiem DE, Kamhawi S, et al: Structure and function of a “yellow” protein from saliva of the sand fly Lutzomyia longipalpis that confers protective immunity against Leishmania major infection. J Biol Chem. 2011, 286 (37): 32383-32393. 10.1074/jbc.M111.268904.

  59. Loomans HJ, Hahn BL, Li QQ, Phadnis SH, Sohnle PG: Histidine-based zinc-binding sequences and the antimicrobial activity of calprotectin. J Infect Dis. 1998, 177 (3): 812-814. 10.1086/517816.

  60. Andersen JF, Gudderra NP, Francischetti IM, Valenzuela JG, Ribeiro JM: Recognition of anionic phospholipid membranes by an antihemostatic protein from a blood-feeding insect. Biochemistry. 2004, 43 (22): 6987-6994. 10.1021/bi049655t.

  61. Castellino FJ, McCance SG: The kringle domains of human plasminogen. Ciba Found Symp. 1997, 212: 46-60. discussion 60–45

  62. Mosesson MW, Siebenlist KR, Meh DA: The structure and biological features of fibrinogen and fibrin. Ann N Y Acad Sci. 2001, 936: 11-30.

  63. Arca B, Lombardo F, Francischetti IM, Pham VM, Mestres-Simon M, Andersen JF, Ribeiro JM: An insight into the sialome of the adult female mosquito Aedes albopictus. Insect Biochem Mol Biol. 2007, 37 (2): 107-127. 10.1016/j.ibmb.2006.10.007.

  64. Ribeiro JM, Arca B, Lombardo F, Calvo E, Pham VM, Chandra PK, Wikel SK: An annotated catalogue of salivary gland transcripts in the adult female mosquito, Aedes aegypti. BMC Genomics. 2007, 8 (1): 6. 10.1186/1471-2164-8-6.

  65. Calvo E, Sanchez-Vargas I, Favreau AJ, Barbian KD, Pham VM, Olson KE, Ribeiro JM: An insight into the sialotranscriptome of the West Nile mosquito vector, Culex tarsalis. BMC Genomics. 2010, 11 (1): 51. 10.1186/1471-2164-11-51.

  66. Arensburger P, Megy K, Waterhouse RM, Abrudan J, Amedeo P, Antelo B, Bartholomay L, Bidwell S, Caler E, Camara F, et al: Sequencing of Culex quinquefasciatus establishes a platform for mosquito comparative genomics. Science. 2010, 330 (6000): 86-88. 10.1126/science.1191864.

  67. Nene V, Wortman JR, Lawson D, Haas B, Kodira C, Tu ZJ, Loftus B, Xi Z, Megy K, Grabherr M, et al: Genome sequence of Aedes aegypti, a major arbovirus vector. Science. 2007, 316 (5832): 1718-1723. 10.1126/science.1138878.

  68. Holt RA, Subramanian GM, Halpern A, Sutton GG, Charlab R, Nusskern DR, Wincker P, Clark AG, Ribeiro JM, Wides R, et al: The genome sequence of the malaria mosquito Anopheles gambiae. Science. 2002, 298 (5591): 129-149. 10.1126/science.1076181.

  69. Calvo E, Sanchez-Vargas I, Kotsyfakis M, Favreau AJ, Barbian KD, Pham VM, Olson KE, Ribeiro JM: The salivary gland transcriptome of the eastern tree hole mosquito, Ochlerotatus triseriatus. J Med Entomol. 2010, 47 (3): 376-386. 10.1603/ME09226.

  70. Calvo E, Dao A, Pham VM, Ribeiro JM: An insight into the sialome of Anopheles funestus reveals an emerging pattern in anopheline salivary protein families. Insect Biochem Mol Biol. 2007, 37 (2): 164-175. 10.1016/j.ibmb.2006.11.005.

  71. Valenzuela JG, Francischetti IM, Pham VM, Garfield MK, Ribeiro JM: Exploring the salivary gland transcriptome and proteome of the Anopheles stephensi mosquito. Insect Biochem Mol Biol. 2003, 33 (7): 717-732. 10.1016/S0965-1748(03)00067-5.

  72. Ribeiro JMC, Arca B: From sialomes to the sialoverse: An insight into the salivary potion of blood feeding insects. Adv Insect Physiol. 2009, 37: 59-118.

Acknowledgments

This work was supported by the Intramural Research Program of the Division of Intramural Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, and by Fundação Oswaldo Cruz (Fiocruz), represented by Instituto Leônidas e Maria Deane (ILMD). We also thank the PAPES V program (FIOCRUZ/CNPq) for its support. We are grateful to Dr. Michalis Kotsyfakis for critical reading of the manuscript and to Brenda Rae Marshall, DPSS, NIAID, for editing assistance. In addition, we thank Dr. Roberto Rocha (Fiocruz/Amazonia/Brazil) for his support. Because JMCR, EC, and ACC are government employees and this is a government work, the work is in the public domain in the United States. Notwithstanding any other agreements, the NIH reserves the right to provide the work to PubMedCentral for display and use by the public, and PubMedCentral may tag or modify the work consistent with its customary practices. You can establish rights outside of the U.S. subject to a government use license.

Author information

Corresponding author

Correspondence to José MC Ribeiro.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

ACC, EC, and JMCR contributed to experimental design, bioinformatics analysis, and writing of the manuscript. CMRV, FACP, and JFM contributed to insect collection, dissections, and taxonomic identification of mosquitoes. All authors read and approved the final manuscript.

Electronic supplementary material

Authors’ original submitted files for images

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Chagas, A.C., Calvo, E., Rios-Velásquez, C.M. et al. A deep insight into the sialotranscriptome of the mosquito, Psorophora albipes. BMC Genomics 14, 875 (2013). https://doi.org/10.1186/1471-2164-14-875

  • DOI: https://doi.org/10.1186/1471-2164-14-875

Keywords