Antimicrobial peptide-like genes in Nasonia vitripennis: a genomic perspective
© Tian et al. 2010
Received: 25 August 2009
Accepted: 19 March 2010
Published: 19 March 2010
Antimicrobial peptides (AMPs) are an essential component of innate immunity that can respond rapidly to diverse microbial pathogens. Insects, as a rich source of AMPs, attract great attention from scientists, both for understanding the basic biology of the immune system and for finding molecular templates for anti-infective drug design. Although a large number of AMPs have been identified from different insect species, little information on these peptides is available from parasitic insects.
By using integrated computational approaches to systematically mine the genome of the hymenopteran parasitic wasp Nasonia vitripennis, we establish the first AMP repertoire of this species, whose members exhibit extensive sequence and structural diversity and can be divided into multiple molecular types, including insect and fungal defensin-like peptides (DLPs) with the cysteine-stabilized α-helical and β-sheet (CSαβ) fold; Pro- or Gly-rich abaecins and hymenoptaecins; horseshoe crab tachystatin-type AMPs with the inhibitor cystine knot (ICK) fold; and a linear α-helical peptide. Inducible expression patterns of seven N. vitripennis AMP genes were verified, and two representative peptides were synthesized and shown to be antibacterial. In comparison with Apis mellifera (Hymenoptera) and several non-hymenopteran model insects, N. vitripennis has evolved a complex antimicrobial immune system with more genes and larger protein precursors. Three classical strategies that are likely responsible for the increase in complexity have been recognized: 1) gene duplication; 2) exon duplication; and 3) exon-shuffling.
The present study established the N. vitripennis peptidome associated with antimicrobial immunity by using a combined computational and experimental strategy. As the first AMP repertoire of a parasitic wasp, our results offer a basic platform for further studying the immunological and evolutionary significances of these newly discovered AMP-like genes in this class of insects.
AMPs constitute essential components of innate immunity that respond rapidly to diverse microbial pathogens [1, 2]. As key effectors, AMPs directly kill invaders by acting either as pore-formers or metabolic inhibitors. Most AMPs are cationic polypeptides, usually smaller than 100 amino acids in length, with enormous sequence diversity. Based on structural characteristics, these molecules are roughly divided into three different groups: 1) Linear peptides free of cysteines, often forming an α-helical conformation with an amphiphilic surface, such as insect cecropins and melittins, amphibian magainins and arachnid meucins [5, 6]; 2) Cysteine-rich peptides. The most representative of this group are insect-derived defensins, which belong to the CSαβ polypeptide superfamily and are conserved across the kingdoms of eukaryotes [7, 8], even in bacteria [9, 10]; 3) Peptides with an unusual bias in certain amino acids, such as glycine-rich tenecin3, proline-rich apidaecin, histidine-rich histatin and tryptophan-rich indolicidin. AMPs in this group are unstructured and often enter bacterial cells to inhibit different metabolic targets [3, 15].
Insects, as a rich source of AMPs, attract great attention from scientists, both for understanding the basic biology of the immune system and for finding molecular templates for anti-infective drug design. So far, about 200 such peptides have been identified from insects. In Drosophila melanogaster, 20 AMPs have been characterized. In the past ten years, a number of genome sequencing projects have promoted the application of computational approaches to the discovery of components involved in the innate immunity of a given species. Some examples include D. melanogaster, Anopheles gambiae, A. mellifera, Tribolium castaneum and Bombyx mori [17–21]. This will undoubtedly extend our knowledge of the basic biology of the insect innate immune system. Furthermore, these data also provide new clues for elucidating the immunological adaptation of insects to environmental changes. For example, analysis of the complete immune system of A. gambiae revealed a marked deficit of orthologues due to gene loss and excessive expansion of some specific genes, which differs significantly from the situation in D. melanogaster. This possibly reflects differential selective pressures from different pathogens on these two species. These results also facilitate further in-depth analysis of the impact of the mosquito immune system on the malaria parasite [17, 22].
Parasitoids (Hymenoptera, Insecta) are a group of parasitic insects. Some of them attack significant vectors of human disease, such as house flies, roaches and ticks, and some are extremely important regulators of agricultural pests. To study the molecular mechanism of innate immunity in parasitic wasps, the basic biology of their AMPs needs to be established. As the first parasitic hymenopteran insect to have its genome sequenced, N. vitripennis provides a new resource for the identification of AMP genes in this class of insects on a genomic scale [25, 26]. Here, we report a systematic characterization of the AMP repertoire from the N. vitripennis genome, which provides a perspective for understanding a possible relationship between immunity and parasitism. In comparison with A. mellifera and several non-hymenopteran insects, N. vitripennis has developed a more complex antimicrobial immune system through genetic duplication and exon-shuffling.
To identify the N. vitripennis AMP repertoire, we employed three complementary approaches to search the N. vitripennis genome databases (see additional file 1). Firstly, we chose A. mellifera AMPs (e.g. melittin, apidaecin, apisimin, abaecin, hymenoptaecin, defensins) as queries to recognize their orthologues in N. vitripennis by BLASTP and TBLASTN. Secondly, we performed a pattern search with the ScanProsite program based on the cysteine arrangement pattern of CSαβ-type defensins. This strategy identified a large family of DLPs with typical CXXXC and CXC motifs and an amino-terminal signal peptide, which can be divided into three distinct subfamilies, including the known navidefensins and two new subfamilies named nasonins and navitricins, respectively. Thirdly, we screened putative secreted peptides of < 150 amino acids from the N. vitripennis proteome by recognizing an amino-terminal signal sequence, from which AMP-like peptides rich in specific amino acids were identified. All AMPs found by the above approaches were then used as queries in iterative BLASTP and TBLASTN searches until no new hits appeared. As a result, we identified a total of 44 AMPs in the genome of N. vitripennis, as shown in additional file 2, of which only nahymenoptaecin-1 and five defensins have been reported recently [25, 26].
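The pattern-based and size-based screening steps described above can be illustrated with a minimal sketch. It is not the ScanProsite pattern actually used in this study: the regular expression, its loop-length ranges, the function name and the toy sequences are all assumptions chosen for illustration, and signal peptide prediction is not reproduced here.

```python
import re

# Hypothetical, simplified CSαβ-like cysteine pattern: C...CXXXC...C...CXC.
# The loop-length ranges are illustrative assumptions, not the study's pattern.
CSAB_PATTERN = re.compile(r"C.{2,20}C.{3}C.{2,15}C.{3,15}C.C")

MAX_LENGTH = 150  # size cutoff for small secreted peptides mentioned in the text


def find_candidates(proteins):
    """Return names of proteins shorter than MAX_LENGTH that contain the motif."""
    hits = []
    for name, seq in proteins.items():
        seq = seq.upper()
        if len(seq) < MAX_LENGTH and CSAB_PATTERN.search(seq):
            hits.append(name)
    return hits


# Toy sequences for illustration only; they are not N. vitripennis proteins.
toy_proteome = {
    "toy_dlp": "MKLLAVFLLAATCDLLSGFGVNHSACAAHCLLRGNRGGYCNGKAVCVCRN",
    "toy_no_cys": "MKLLAVFLLAAGSGSGSGKKRRPPGG",
    "toy_too_long": "A" * 160 + "CAAAACAAACAAAACAAAACAC",
}

candidates = find_candidates(toy_proteome)
print(candidates)
```

In a real screen, hits from such a filter would still need signal peptide prediction and manual inspection, since a cysteine spacing pattern alone matches many unrelated proteins.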
All these newly discovered peptides display remarkable AMP characteristics, as reflected by their small size and net positive charges at pH 7.0. Most peptides described here are smaller than 150 amino acids in length, except for nahymenoptaecin-1, nahymenoptaecin-2, nasonin-2 and nasonin-6, which all have undergone internal duplication (see additional file 2). N. vitripennis AMPs exhibit extensive sequence and structural diversity and can be distinguished into multiple molecular types including insect and fungal DLPs with the CSαβ fold; Pro- or Gly-rich abaecins and hymenoptaecins; horseshoe crab tachystatin-type AMPs with the ICK fold; and a linear α-helical peptide.
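As a rough illustration of how a net charge at pH 7.0 can be estimated from sequence alone, the sketch below applies the Henderson-Hasselbalch relation to ionizable groups. The pKa values are an assumed textbook-style set and may differ from those used by the Protein Calculator server cited in the Methods; cysteines are ignored on the assumption that they are engaged in disulfide bridges.

```python
# Assumed pKa values (illustrative; different tools use slightly different sets).
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "Y": 10.1}
PKA_N_TERMINUS = 9.0
PKA_C_TERMINUS = 2.0


def net_charge(seq, ph=7.0):
    """Estimate the net charge of a peptide at a given pH."""
    seq = seq.upper()
    # Free amino terminus (protonated fraction carries +1).
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERMINUS))
    # Free carboxyl terminus (deprotonated fraction carries -1).
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERMINUS - ph))
    for aa in seq:
        if aa in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return charge


# Toy peptides, not sequences from this study.
print(round(net_charge("GKKRRG"), 2))
print(round(net_charge("GDDEEG"), 2))
```

A positive value at pH 7.0 corresponds to the cationic character that the text describes as typical of AMPs.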
To confirm the reliability of our computational prediction, we searched the N. vitripennis EST database and found that more than 60% of the peptides have corresponding transcripts, suggesting that the genes encoding these peptides are expressed at the transcriptional level. For the remaining 40% of peptides whose cDNA sequences were not found in the EST database, one possibility is that their expression depends upon suitable microbial challenges or is associated with different developmental stages of N. vitripennis, as observed for the D. melanogaster antifungal drosomycin, although pseudogenes or wrong predictions resulting from the computational methods cannot be completely excluded.
Defensins with the CSαβ structure are crucial effectors of innate immunity which have been found in many insect species, the majority of them coming from different orders of the subclass Neoptera (Diptera, Coleoptera, Lepidoptera, Hemiptera and Hymenoptera). These AMPs are primarily active against Gram-positive bacteria, likely by forming voltage-dependent channels in the bacterial membrane that lead to a loss of cytoplasmic potassium [29, 30]. Their protective roles have been well documented by in vivo targeted disruption of a defensin gene, which resulted in the death of A. gambiae after Gram-positive bacterial infection. Insect defensins consist of 30-50 amino acids linked by three or four disulfide bridges. They share a common CSαβ structural motif with some functionally related scorpion neurotoxins targeting various ion channels, protease inhibitors, and even a plant sweet taste peptide [8, 32].
In addition to the defensins mentioned above, we identified a new DLP subfamily without a propeptide, which contains fourteen members (named nasonin-1 to -14) (Figure 1). Sequence analysis revealed that, in precursor organization, n-loop size and sequence similarity, nasonins are more closely related to two known non-classical defensins than to CITDs, one being the antifungal termicin isolated from termites [39, 40] and the other being defensin1 from the Mexican scorpion Centruroides limpidus limpidus. Intriguingly, such similarity was also observed between nasonins and a scorpion K+ channel toxin, cobatoxin [42, 43] (Figure 1). These observations suggest that nasonins could have diverse functional features. In nasonin-2 and nasonin-6, the defensin unit has undergone 4 to 5 internal repeats, as described in fungal DLPs. The potential function of such repeats and how they arose are discussed further below.
We also predicted the structure of nasonin-1 by comparative modeling using the scorpion neurotoxin cobatoxin (PDB: 1PJV) as a template. Although they have a shorter n-loop than CITDs, nasonins can also adopt a typical CSαβ architecture (Figure 2B). The reliability of this model was evaluated by Verify3D, with a score of 0.248. Overall, nasonin-1 structurally resembles termicin (PDB: 1MM0) and cobatoxin (Figure 2B), but the three peptides differ in molecular surface charge distribution (data not shown), which could explain their functional diversity, as observed between human beta-defensin2 and the snake toxin crotamine.
Navidefensin3 is a unique non-classical defensin because it has identical precursor organization and gene structure to navidefensin1 and 2 but similar n-loop size and amino acid sequence to nasonins. In fact, several non-classical insect defensins with a short n-loop were also found in Lepidoptera insects (e.g. gallerimycin, spodoptericin and the B. mori defensins) [46–48]. In the tree presented in Figure 2A, navidefensin3 is clustered together with nasonins rather than navidefensins, suggesting that this DLP is an evolutionary link between these two multigene subfamilies of defensins. Its lack in A. mellifera is consistent with the absence of the nasonin subfamily in this species.
As mentioned previously, navitricins have two additional cysteines which could form a fourth disulfide bridge. To confirm this, comparative modeling was again used to build the structure of navitricin-1 by choosing different templates, e.g. CITDs (sapecin and phormicin), the scorpion toxin BmTX1, and the plant sweet taste protein brazzein. In all these cases no acceptable models were generated and we thus tried to use the I-TASSER algorithm for structure simulation which produced a suitable model, as evaluated by Verify 3D (Figure 3B). Overall, this molecule is similar to the defensins described above including three identical disulfide bridges. Importantly, this model correctly predicts the fourth disulfide bridge linking the n- and c-loops together which makes the whole structure more compact. In fact, computational prediction was also applied to reveal an additional disulfide bridge in several insulin-like proteins .
Subsequently, we evaluated the antibacterial and antifungal activities of nasonin-1 using classical inhibition zone assays and found that nasonin-1 displayed activity only against two Gram-negative bacteria at micromolar concentrations, with a lethal concentration (CL) of 1.52 μM for Stenotrophomonas sp. LZ-1 and of 15.4 μM for Escherichia coli ATCC 25922 (Figure 4C). No effect was observed on Gram-positive bacteria or fungi. This is the first example of a DLP mainly active on Gram-negative bacteria. Recombinant navidefensin2-2 was recently reported to be active on Gram-positive bacteria. Whether such a remarkable difference in target selectivity is associated with the change of n-loop length in these two Nasonia defensins needs further investigation.
AMPs rich in specific amino acids (e.g. glycine, proline, histidine, tryptophan or arginine) represent an additional class of AMPs acting on diverse microorganisms (e.g. bacteria, fungi, and viruses) [53, 54]. The majority of these molecules are unstructured, and they often inhibit microbial growth by entering cells and interacting with proteins involved in key metabolic processes. Eighteen such AMPs have been identified here, including two known AMP families (abaecin and hymenoptaecin) [55, 56].
Nahymenoptaecin-2 presents a more interesting structural feature in the mature peptide region, whose amino- and carboxyl-termini are respectively a Pro-rich peptide (named pronavicin) and a hymenoptaecin-like peptide with four internal repeats. Overall, these repeats can be well aligned with hymenoptaecin except a deletion of 31 amino acids in each repeat. Similarly, a conserved phase 0 intron is also present in the amino-termini of bee and parasitoid hymenoptaecins, suggesting that these peptides diverged from a common ancestor after speciation.
Although hymenoptaecins are believed to be present only in Hymenoptera, our study reveals that similar peptides have also evolved in dipteran Drosophila (Figure 6A). These fly-derived peptides display a chimeric character because they have acidic propeptides, like bee hymenoptaecins, but their mature peptides more closely resemble the repeat of nahymenoptaecin-2 in size. Moreover, Drosophila hymenoptaecins have lost their introns, consistent with the lineage-specific intron loss in these species. The clustering of all Drosophila hymenoptaecins in the tree (Figure 6B) indicates their monophyletic origin after the separation of Diptera from Hymenoptera.
Table. Antibacterial activity of pronavicin.
Microorganism | Lethal conc. (CL) (μM)
(name missing in source) | 3.11 (y = 1.0867x + 1.019, R² = 0.998)
(name missing in source) | 42.4 (y = 0.65x + 0.0175, R² = 0.9956)
(name missing in source) | 19.5 (y = 0.7125x - 0.2313, R² = 0.995)
(name missing in source) | 15.8 (y = 0.3458x + 0.2525, R² = 0.9993)
(name missing in source) | 61.8 (y = 0.2375x + 0.0712, R² = 0.9774)
Stenotrophomonas sp. YC-1 | 78.0 (y = 0.3x + 0.0292, R² = 0.9959)
Stenotrophomonas sp. LZ-1 | 12.0 (y = 0.3625x + 0.3, R² = 0.9542)*
Linear peptides with an α-helical conformation are a large class of AMPs involved in the immune response of arthropods, amphibians and vertebrates. Since cecropin, the first insect-derived linear AMP, was isolated from the Cecropia moth, the number of such peptides has increased dramatically in recent decades [72, 73]. Several linear α-helical AMPs have also been characterized in social wasps [74–76]. From the N. vitripennis genome, we identified one AMP-like peptide (called nahelixin) belonging to this class (Figure 7C). The precursor of nahelixin is 118 residues in length, with an amino-terminal signal peptide and a putative carboxyl-terminal propeptide. The predicted mature peptide of 22 amino acids shares about 40% sequence identity with four known antibacterial peptides from frogs. Secondary structure prediction indicated that it could adopt an α-helical structure (data not shown), which was further supported by ab initio structural prediction. An amphiphilic surface, with hydrophilic and hydrophobic residues arranged on opposite sides of the helical axis, is present in the model, as seen in the helical wheel projection (Figure 7C). Such a structural feature is a prerequisite for the antibacterial activity of linear AMPs.
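The amphiphilicity that a helical wheel shows visually can also be summarized numerically as a hydrophobic moment. The following is a minimal sketch of that idea, not the procedure used in this study: it assumes an ideal α-helix with 100° of rotation per residue, uses the Kyte-Doolittle hydropathy scale as an assumed choice of scale, and works on toy sequences rather than on nahelixin, whose sequence is not given here.

```python
import math

# Kyte-Doolittle hydropathy scale (an assumed choice; other scales are common).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq, angle_deg=100.0):
    """Mean hydrophobic moment of a peptide modeled as an ideal helix.

    Each residue is placed on the helical wheel at i * angle_deg and weighted
    by its hydropathy; the length of the resulting vector sum, divided by the
    number of residues, is returned. Only the 20 standard residues are handled.
    """
    angle = math.radians(angle_deg)
    sum_cos = sum(KYTE_DOOLITTLE[aa] * math.cos(i * angle) for i, aa in enumerate(seq))
    sum_sin = sum(KYTE_DOOLITTLE[aa] * math.sin(i * angle) for i, aa in enumerate(seq))
    return math.hypot(sum_cos, sum_sin) / len(seq)


# Toy sequences with identical composition (9 Leu, 9 Lys), for illustration only.
amphipathic = "LKKLLKKLLKLLKKLLKK"  # Leu residues fall on one face of the wheel
segregated = "LLLLLLLLLKKKKKKKKK"   # same residues, no helical periodicity

print(hydrophobic_moment(amphipathic))
print(hydrophobic_moment(segregated))
```

The two toy peptides share the same overall hydrophobicity, so the difference in their moments reflects only how the residues are distributed around the helical axis.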
With more and more genome sequences released, identifying new bioactive peptides and proteins by computational genomics approaches is becoming increasingly valuable. Some examples include hormones from nematodes and mosquitoes, odorant binding-like proteins from the honey bee, the opossum immune genome, and reptile venom genes [78–82]. However, most strategies used are primarily based on sequence similarity, which is useful for finding orthologues of known genes. An additional computational search strategy depends on the presence of conserved structural motifs within a peptide superfamily. Such examples include the identification of fungal DLPs and mammalian β-defensins [32, 83]. Here, we identified the N. vitripennis AMPs on a genomic scale by using an integrated strategy which combines similarity search, pattern recognition and AMP characteristics (see additional file 1).
In comparison with several non-hymenopteran insects whose genome sequences have been released (e.g. D. melanogaster and A. gambiae from Diptera, B. mori from Lepidoptera, and T. castaneum from Coleoptera) [17–21], N. vitripennis has evolved more AMP genes in both kind and number (Figure 9A). Based on the phylogenetic relationship of holometabolous insects, we identify several evolutionary events (e.g. gene expansion, terminal extension and tandem repeat) which might have occurred along the parasitoid branch (Figure 9B) and resulted in the lineage-specific increase in complexity of N. vitripennis AMPs for synergistic defense under parasitic conditions.
Three different strategies have been recognized to shape the complex antimicrobial immune system in N. vitripennis: gene duplication, exon duplication and exon-shuffling.
Table. Maximum likelihood estimates of parameters and sites inferred to be under positive selection in the amino-terminal abaecin unit.
Model | Parameter estimates | Positively selected sites
M0 | ω = 0.15 | -
M1a | p0 = 0.80, ω0 = 0.046; p1 = 0.20, ω1 = 1.00 | -
M2a | p0 = 0.80, ω0 = 0.05; p1 = 0.00, ω1 = 0.05315; p2 = 0.20, ω2 = 5.91067 | 4Y, 6P*, 8R, 11Q, 12K
M7 | p = 0.48, q = 1.62 | -
M8 | p0 = 0.80, p = 1.06, q = 14.00; (p1 = 0.20), ω = 6.69 | 4Y, 6P*, 8R, 11Q, 12K
By using a combined computational and experimental strategy, we established for the first time the N. vitripennis peptidome associated with innate immunity. Three basic evolutionary scenarios are recognized in the generation of a complex antimicrobial system in N. vitripennis. The work presented here offers a basic platform for further studying the immunological and evolutionary significance of these newly discovered AMPs in parasitic insects. Whether these AMP-like genes with a complicated structure or many more copies are involved in parasitism remains an open question, which constitutes our next research direction. In particular, our approaches for finding new AMPs on a genomic scale can readily be applied to other model organisms, including humans, which will accelerate our understanding of AMP-mediated innate immunity at the level of a defense network. Further functional characterization of these parasitic wasp-derived AMPs will also help explore new types of drug leads for anti-infective therapy.
Strategies for gene discovery used here are provided in the supplementary information (see additional file 1). The BLASTP and TBLASTN programs were used to identify orthologues of known A. mellifera AMPs in the N. vitripennis database. When searching for hymenoptaecin in flies, BLAST was carried out in FlyBase. The program WinGene was employed to predict and translate a complete open reading frame from a selected nucleotide sequence. The N. vitripennis protein sequences downloaded from GenBank and the FTP site were used to run ScanProsite on the ExPASy server.
All sequences we identified as AMP-like peptides were submitted to the SignalP 3.0 server for signal peptide prediction. The peptides identified here were further analyzed on the ExPASy server for estimation of molecular weights, and net charges were calculated with Protein Calculator. Secondary structure prediction was done on the ExPASy server. To predict the amphiphilic structure of peptides, helical wheel models were calculated on the server.
Amino acid sequences were aligned using Clustal X with fine adjustment by hand, and phylogenetic trees were constructed with MEGA 4.0. Structures were modeled ab initio by I-TASSER on-line [105–107] or by the comparative modeling method on the Structural Bioinformatics server. Structure evaluation was carried out with Verify 3D [109, 110]. The MultiProt server was used for structural superimposition and calculation of the root mean square deviation (RMSD). All model structures predicted here have been deposited in the Protein Model DataBase under ID numbers PM0075686-PM0075709.
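For readers unfamiliar with the RMSD measure mentioned above, the sketch below shows only the final arithmetic for two coordinate sets that are already superimposed and paired atom by atom. It does not reproduce the MultiProt alignment or superposition step, and the function name and coordinates are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Made-up Cα coordinates (in Å) for two already-superimposed toy structures.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
template = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0), (8.6, 0.0, 0.0)]

print(rmsd(model, template))
```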
Codon-substitution models were used to estimate the nonsynonymous-to-synonymous rate ratio (ω = dN/dS) with the CODEML program of the PAML software package. Four models recommended by Yang form two likelihood ratio tests (LRTs), M1a versus M2a and M7 versus M8. M1a (nearly neutral model) assumes a proportion p0 of conserved sites with 0 < ω0 < 1 and a proportion p1 = 1 - p0 of neutral sites with ω1 = 1; M2a (positive selection model) adds an extra class of sites with proportion p2 = 1 - p0 - p1 and with ω2 estimated from the data. M7 (β distribution model) does not allow for positively selected sites, and M8 (β and ω model) adds an extra class of sites to M7, allowing for ω > 1, which indicates the presence of positively selected sites. In addition, Model 0 (M0), which assumes one ω for all sites and does not allow positive selection, was used as a negative control [114, 115]. Upon detection of positive selection signals, posterior probabilities were calculated using the Bayes Empirical Bayes (BEB) method [115, 116].
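The likelihood ratio tests described above can be sketched as follows. The log-likelihood values are hypothetical placeholders, not results from this study, and the function name is ours. The sketch assumes that the test statistic is compared with a χ² distribution with 2 degrees of freedom, as is commonly done for the M1a/M2a and M7/M8 comparisons; for 2 degrees of freedom the χ² tail probability has the closed form exp(-x/2).

```python
import math


def likelihood_ratio_test(lnl_null, lnl_alt):
    """Return (2 * delta lnL, p-value) for a chi-square test with df = 2.

    lnl_null: log-likelihood of the null model (e.g. M1a or M7).
    lnl_alt:  log-likelihood of the alternative model (e.g. M2a or M8).
    """
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)
    # Survival function of the chi-square distribution with 2 degrees of freedom.
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical log-likelihoods, for illustration only.
stat, p = likelihood_ratio_test(lnl_null=-100.0, lnl_alt=-95.0)
print(stat, p)
```

A small p-value would favor the model that allows a class of sites with ω > 1, after which the BEB posterior probabilities identify the candidate sites.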
Pronavicin and reduced nasonin-1 were chemically synthesized by Xi'an Huachen Bio-Technology Co., Ltd. (Xi'an, China). The cyclization reaction to form disulfide bridges in the reduced nasonin-1 molecule was carried out in 0.1 M Tris-HCl (pH = 8.0) with 2 mM GSH and 0.2 mM GSSG. The refolded peptide was purified to homogeneity by C18-RP-HPLC. Its MW was determined by MALDI-TOF MS on a Kratos PC Axima CFR plus (Shimadzu Co. LTD, Kyoto).
Inhibition zone assays were carried out to evaluate the antimicrobial activities of peptides [28, 52, 117]. Briefly, 50 μl of bacteria or fungal spores with OD600 = 0.5 were inoculated into 6 ml of pre-heated Luria-Bertani medium (1% bactotryptone, 0.5% bactoyeast extract and 0.5% NaCl) for bacteria or MEA medium (1% Malt Extract, 0.1% Peptone, 2% Glucose) for fungi, each containing 0.8% agar. The mixture was then spread on a 9-cm Petri dish, giving a depth of 1 mm. After settling, 2-mm wells were punched in the plate and 2 μl of peptide at different concentrations was added to each well. Diameters of the inhibition zones were measured after 16 hours of incubation at 30°C. CL was calculated by Hultmark's method. The microorganisms used in this assay are listed in additional file 3.
N. vitripennis was maintained as a laboratory culture on pupae of the housefly Musca domestica. They were reared under a 14:10 light-dark cycle at 25°C in glass containers fed on 20% (v/v) honey solution. A standard method was applied to establish an infection model  of N. vitripennis. Briefly, a bacterial mixture of E. coli ATCC 25922 and M. luteus which had been separately incubated overnight was injected into abdomen of N. vitripennis adults with a micro-injector. About 50 wasps were challenged in total. Non-injected wasps were used as a control.
To isolate N. vitripennis total RNA, 50 wasps challenged with bacteria for 16 hours were ground into a fine powder in liquid nitrogen. The Trizol reagent (SBS Genetech, Beijing) was used to prepare total RNA according to the supplier's instructions. This method was also applied to prepare total RNA from non-challenged wasps.
Reverse transcription of total RNAs prepared from non-infected or infected N. vitripennis was performed using RT PreMix kit (TransGen, Beijing) and a universal oligo (dT) - containing adaptor primer dT3AP. The first-strand cDNA amplification in both non-infected and infected wasps was performed by using an AMP-like gene-specific primer and the universal primer 3AP according to the method described . PCR product was cloned into pGEM-T (TIANGEN, Beijing), and transformed into E. coli DH5α competent cells (TransGen, Beijing). Their sequences were confirmed by DNA sequencing. Primers used in this study are all listed in additional file 4.
For semi-quantitative RT-PCR, an AMP-like gene was amplified from non-infected or infected cDNAs by using a gene-specific primer and 3AP, and PCR products obtained after different numbers of cycles (i.e. 25, 30, 35, 40, 45) were compared by electrophoresis on 1.5% agarose gels. N. vitripennis ribosomal protein RP49 was chosen as an internal control and was amplified by using primers NvRp49-F2/3AP and the same cDNA templates.
We thank Human Genome Sequencing Center, Baylor College of Medicine http://www.hgsc.bcm.tmc.edu/ for providing N. vitripennis sequence data for public use and Prof. Chuanling Qiao (Institute of Zoology, Chinese Academy of Sciences, China), Prof. Fengyan Bai (Institute of Microbiology, Chinese Academy of Sciences, China) and Dr. Jianguo Zhou (Oak Ridge National Laboratory, USA) for providing microbial strains used in this study. We also thank Prof. Fengqin He (Institute of Zoology, Chinese Academy of Sciences, China) for providing M. domestica. This work was supported by grants from the National Natural Science Foundation of China (30730015, 30621003 and 90608009) and the 973 Program from the Ministry of Science and Technology of China (2010CB945304).
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.