The mining of toxin-like polypeptides from EST database by single residue distribution analysis
© Kozlov and Grishin; licensee BioMed Central Ltd. 2011
- Received: 16 August 2010
- Accepted: 31 January 2011
- Published: 31 January 2011
Novel high-throughput sequencing technologies require continual development of bioinformatics data processing methods. Among them, rapid and reliable identification of encoded proteins plays a pivotal role. To search for particular protein families, amino acid sequence motifs suitable for selective screening of nucleotide sequence databases may be used. In this work, we propose a novel method for simplified representation of protein amino acid sequences, named Single Residue Distribution Analysis (SRDA), which is applicable to both homology search and database screening.
Using the procedure developed, a search for amino acid sequence motifs in sea anemone polypeptides was performed, and 14 different motifs with broad and low specificity were distinguished. The adequacy of the motifs for mining toxin-like sequences was confirmed by their ability to identify 100% of the toxin-like anemone polypeptides in the reference polypeptide database. Applying the novel motifs to the search for polypeptide toxins in the Anemonia viridis EST dataset allowed us to identify 89 putative toxin precursors. The translated and modified ESTs were scanned using a special algorithm. In addition to direct comparison with the motifs developed, putative signal peptides were predicted and homology with known structures was examined.
The suggested method may be used to retrieve structures of interest from EST databases using simple amino acid sequence motifs as templates. The efficiency of the procedure for directed search of polypeptides is higher than that of most currently used methods. Analysis of 39939 ESTs of the sea anemone Anemonia viridis resulted in the identification of five protein precursors of previously described toxins, the discovery of 43 novel polypeptide toxins, and the prediction of 39 putative polypeptide toxin sequences. In addition, two precursors of novel peptides presumably displaying neuronal function were found.
- Database Screening
- Polypeptide Pattern
- Amino Acid Sequence Motif
- Toxin Sequence
- Polypeptide Toxin
Expressed sequence tag (EST) analysis is widely used in molecular biology. It captures the transcriptome of a given tissue at a given time. The resulting data are deposited in dbEST, a specialized resource at the National Center for Biotechnology Information (NCBI). EST databases are used to address a variety of problems [2–6].
EST database analysis requires the development of novel methods and software for data processing. The standard procedure includes processing of the biological material, production of clones, construction of libraries, and data analysis, from grouping into contigs to gene annotation and microarray design. Special program modules facilitating different stages of analysis have been developed, such as those for data preprocessing [8–10] and software for assembling sequences into contigs and annotating them [11–13]. To improve the quality of initial data processing, the results of different scanning methods can be combined: homology search with a nucleotide consensus sequence, homology search with deduced protein sequences, and comparison with reference databases of known organisms [14–17].
The bioinformatics strategy for database analysis remains the same: the diverse crude sequences are combined into contigs by cluster analysis and then subjected to alignment search tools and functional classification by gene ontologies. This gives good results but is not always optimal. Earlier analysis of the EST database from spider venom glands showed that the conventional approach, which includes preprocessing of the original data and formation of contigs, decreased the efficiency of identification of rare polypeptide toxins. The recommended search procedure, scanning translated sequences against characteristic toxin structural motifs, proved more effective. Another alternative is to screen the database with search queries created from alignments of known protein families. In this way, 83 peptides not previously detected in the EST databases of different aphid species were found. A family of new proteins from corals with a Cys-rich beta-defensin motif was identified as well.
Identification of short polypeptides in EST datasets is especially challenging since they can be aligned only with highly homologous proteins. They are synthesized as precursors, which are subsequently processed into mature polypeptides. The enzymes involved in maturation recognize specific regulatory amino acid motifs, which help to identify precursor proteins in EST databases [18, 19, 21].
Polypeptide toxins from natural venoms are of considerable scientific and practical interest. They may be used for designing a new generation of drugs. The venom of a single spider contains hundreds of polypeptides of similar three-dimensional structure but divergent biological activity. In toxins, the mature peptide domain is highly variable, while the signal peptide and the propeptide domain are conserved [23, 24]. The specificity of action on different cellular receptors depends on the unique combination of variable amino acid residues in the toxin molecule. Using a common scaffold, venomous animals actively change amino acid residues in the spatial loops of toxins, thus adjusting the structure of a novel toxin molecule to novel receptor types. This array of polypeptide toxins in venoms is called a natural combinatorial library [25–27].
Homologous polypeptides in a combinatorial library may differ by point mutations or deletions of single amino acid residues. During contig formation such mutations may be treated as sequencing errors and ignored. Our method is free of this limitation. Instead of annotating the whole EST dataset and searching for all possible homologous sequences, we suggest treating the database as a "black box" from which the necessary information may be recovered. The criterion for selecting the necessary sequences in each particular case depends on the aim of the research and the structural characteristics of the proteins of interest.
To make queries in the EST database and to search for structural homology, we suggest using single residue distribution analysis (SRDA), developed earlier for the classification of spider toxins. In this work, we demonstrate the simplicity and efficacy of SRDA for identifying polypeptide toxins in the EST database of the sea anemone Anemonia viridis.
For successful analysis, the choice of the key amino acid is of crucial importance. In polypeptide toxins, the structure-forming cysteine residues play this role; for other proteins, other residues, e.g. lysine, may be equally important (see Figure 1). Sometimes it is necessary to find a specific residue distribution not in whole protein sequences but in the most conserved or otherwise interesting sequence fragments. It is advisable to start key residue mining in training data sets of limited size. Several amino acids in the polypeptide sequence may be selected for polypeptide pattern construction; however, the polypeptide pattern then becomes more complicated. If more than three key amino acid residues are chosen, analysis of their arrangement becomes too complicated. It is also necessary to know the positions of breaks in the amino acid sequences corresponding to stop codons in protein-coding genes. Figure 1 clearly demonstrates that the distribution of Cys residues in a sequence analyzed by SRDA ("C") differs considerably from that obtained by SRDA ("C."), which takes termination symbols into account. For scanning the A. viridis EST database, the positions of termination codons were always taken into consideration.
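The conversion of a sequence into a simplified pattern can be illustrated with a short Python sketch. It is not the authors' VBA implementation, and the encoding is an assumption: key symbols are kept unchanged and every run of other residues is replaced by its length. The function name `srda` and the example sequence are hypothetical; the "." termination symbol follows the SRDA ("C.") notation used in the text.

```python
def srda(sequence: str, keys: str = "C.") -> str:
    """Assumed SRDA encoding: keep key symbols, replace each run of
    non-key residues with the run length written as a decimal number."""
    pattern = []
    run = 0  # length of the current run of non-key residues
    for residue in sequence:
        if residue in keys:
            if run:
                pattern.append(str(run))
                run = 0
            pattern.append(residue)
        else:
            run += 1
    if run:  # flush a trailing run of non-key residues
        pattern.append(str(run))
    return "".join(pattern)


# Hypothetical translated fragment containing one termination symbol.
example = "GSCKPACEC.MKC"
with_stops = srda(example, keys="C.")
without_stops = srda(example, keys="C")
```

With termination symbols treated as key symbols, the last cysteine is separated from the preceding ones by a break; when they are ignored, the same cysteine appears to belong to one continuous chain, which is the difference between SRDA ("C.") and SRDA ("C") discussed above.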
To search for sequences of interest, a correctly formulated query is necessary. Queries, also in pattern format (screening line in Figure 2), were derived from the amino acid sequences of anemone toxins after analysis of homology between their simplified structures.
At subsequent stages, amino acid sequences that satisfied each query were selected from the converted database. Using the identifiers, the corresponding clones and open reading frames in the original EST database were located. As a result, a set of amino acid sequences was formed. Identical sequences, namely those with identical mature peptide domains regardless of variations in the signal peptide and propeptide regions, were excluded from analysis. To identify the mature peptide domain, an earlier developed algorithm was used [21, 29]. The anemone toxins are secreted polypeptides; therefore only sequences with signal peptides were selected. Signal peptide cleavage sites were detected using both neural networks and Hidden Markov Models trained on eukaryotes with the online tool SignalP http://www.cbs.dtu.dk/services/SignalP. To ensure that the identified structures were new, a homology search in the non-redundant protein sequence database was carried out with blastp and PSI-BLAST http://blast.ncbi.nlm.nih.gov/Blast.
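The exclusion of identical sequences can be sketched as grouping clones by their mature peptide domain. The sketch assumes that the mature domain has already been determined by the maturation algorithm [21, 29], which is not reproduced here; the record layout (clone identifier, mature sequence), the function name and the identifiers are hypothetical.

```python
from collections import defaultdict


def group_by_mature_domain(records):
    """Group clone identifiers by mature peptide sequence.

    `records` is an iterable of (clone_id, mature_sequence) pairs.
    Clones that differ only in the signal peptide or propeptide share
    one mature sequence and therefore fall into the same group.
    """
    groups = defaultdict(list)
    for clone_id, mature in records:
        groups[mature].append(clone_id)
    return dict(groups)


# Hypothetical records: two clones encode the same mature peptide.
records = [
    ("clone_001", "GSCKPACEC"),
    ("clone_002", "GSCKPACEC"),
    ("clone_003", "AKKLCKKW"),
]
groups = group_by_mature_domain(records)
unique_mature = list(groups)                            # one entry per unique peptide
copies = {seq: len(ids) for seq, ids in groups.items()}  # number of clones per peptide
```

Keeping the clone identifiers in each group preserves the link back to the original ESTs, so the number of repeats of each mature sequence remains available after deduplication.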
Data for analyses
To search for toxin structures, the EST database created for the Mediterranean anemone A. viridis was used. The original data, containing 39939 ESTs, were obtained from the NCBI server and converted into a table format for Microsoft Excel.
To formulate queries, amino acid sequences of anemone toxins were retrieved from the NCBI database. As of February 1, 2010, 231 amino acid sequences had been deposited in the database. All precursor sequences were converted into the mature toxin forms; identical and hypothetical sequences were excluded from analysis. Anemone toxin sequences deduced from databases of A. viridis were also excluded. The final number of toxin sequences was 104.
The reference database for evaluating the developed algorithms and queries was formed from amino acid sequences deposited in the NCBI database. To retrieve toxin sequences, the query "toxin" was used. The search was restricted to the Animal Kingdom. As a result, 10903 sequences were retrieved.
EST database analysis was performed on a personal computer running Windows XP with MS Office 2003 installed. The analyzed sequences in FASTA format were exported into MS Excel with the security level set to allow execution of macro commands (see additional file 1). Translation, SRDA and homology search in the converted database were carried out using special functions written in VBA for use in MS Excel (see additional file 2). Multiple alignments of toxin sequences were carried out with the MegAlign program (DNASTAR Inc.).
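The translation step can be illustrated in Python as follows. This is a minimal sketch rather than the VBA functions from additional file 2: it assumes the standard genetic code, translation in all six reading frames, "." as the termination symbol, and "X" for codons containing ambiguous bases. The names `translate` and `six_frames` are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; "." marks a stop codon (assumed symbol).
AMINO = "FFLLSSSSYY..CC.WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna: str) -> str:
    """Translate one reading frame; incomplete trailing codons are dropped."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X")  # "X" for codons with ambiguous bases
        for i in range(0, len(dna) - 2, 3)
    )


def six_frames(dna: str) -> dict:
    """Return translations of the three forward and three reverse frames."""
    forward = dna.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]
    return {
        f"{sign}{offset + 1}": translate(strand[offset:])
        for sign, strand in (("+", forward), ("-", reverse))
        for offset in range(3)
    }


frames = six_frames("ATGTGCTAA")  # hypothetical nine-base read
```

Each translated frame keeps its termination symbols in place, so it can be passed directly to a pattern conversion that treats "." as a key symbol.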
Anemone toxin motifs
The development of appropriate queries is the most important part of the analysis. Their tolerance determines the accuracy of EST database screening and, ultimately, the number of retrieved sequences. The 104 retrieved sequences of mature anemone toxins were subjected to SRDA using a number of key amino acid residues. The best results, as expected, were obtained with structure patterns based on key cysteine residues. Enrichment in cysteine residues is a characteristic feature of many natural toxins, which makes it possible to use cysteine as the key amino acid residue in data conversion.
Table 1. Pattern motifs for converted structures of sea anemone toxins obtained with SRDA("C.").
Number of seq.
Example (sequence ID)
Cangitoxin (P82803), AETX-1 (P69943)
BDS-II (P59084), APETx2 (P61542)
kaliseptin (Q9TWG1), ShK (P29187)
AsKC1 (Q9TWG0), APHC1 (B2G331)
SHTX-5 (B1B5J0), Gigantoxin-1 (BAD01579)
Neurotoxin 3 (1ANS)
acrorhagin II (BAE46983)
AvTX-60A (BAD04943), PsTX-60B (P58912)
Equinatoxin II (P61915), Cytolysin-3 (Q9U6X1)
Up-1 (P0C1G1), bandaporin (BAH80315)
K >= 6 AND C <= 2
? - any single symbol,
# - any single digit (0-9),
* - gap in the search line from 0 to any number of symbols.
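The three wildcard symbols coincide with those of the VBA `Like` operator. A query written with them can be applied in Python by translating it into a regular expression, as in the sketch below. Matching against the whole pattern string is an assumption, and the motif shown is a hypothetical example, not one of the motifs from Table 1.

```python
import re


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Translate the wildcards ?, # and * into a regular expression."""
    parts = []
    for symbol in motif:
        if symbol == "?":
            parts.append(".")       # any single symbol
        elif symbol == "#":
            parts.append("[0-9]")   # any single digit
        elif symbol == "*":
            parts.append(".*")      # gap of zero or more symbols
        else:
            parts.append(re.escape(symbol))  # literal symbol, e.g. "C" or "."
    return re.compile("".join(parts))


def matches(motif: str, pattern: str) -> bool:
    """True if the whole converted pattern satisfies the motif (assumed semantics)."""
    return motif_to_regex(motif).fullmatch(pattern) is not None


# Hypothetical motif: three key cysteines with single-symbol spacers,
# embedded anywhere in the converted pattern.
hit = matches("*C#C?C*", "2C3C1C.2C")
miss = matches("C#C", "C12C")  # "#" stands for exactly one digit
```

Because "#" matches exactly one digit, a spacer of ten or more residues needs "##" in the query, which gives direct control over the tolerated loop lengths.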
Since the final goal of query motif development was to retrieve the maximum number of sequences from the database, we did not try to create universal motifs with broad specificity. Instead, many motifs were developed, each specific to a particular distribution of key residues in the patterns. The first four motifs cover the largest number of known sea anemone toxins and are the most discriminative. For motifs 5-9, we tried to achieve high identification capacity, while motifs 10-13 were made degenerate and partially overlapped the motifs developed earlier.
Among anemone toxins, large cysteine-free molecules exhibiting strong cytolytic activity are present. These toxins, named cytolysins, comprise a heterogeneous group of membrane-active molecules subdivided into several groups on the basis of primary structure homology and similarity of physical and chemical properties. For these molecules, the pattern motifs turned out to be too simple (0 and 14 in Table 1) and inadequate for analysis. To identify such structures in the database, a novel motif K was generated; it combines two search parameters: the presence of not more than 2 cysteine residues at SRDA ("C.") and not less than 6 lysine residues at SRDA ("K.").
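The two conditions of motif K can be written as a simple residue count. The sketch below assumes that the counts are taken within each fragment between termination symbols of a translated frame; this reading, the function name and the example sequence are assumptions.

```python
def motif_k_fragments(translated: str, stop: str = ".",
                      min_lys: int = 6, max_cys: int = 2) -> list:
    """Return stop-free fragments with at least `min_lys` lysines
    and at most `max_cys` cysteines (K >= 6 AND C <= 2 by default)."""
    return [
        fragment
        for fragment in translated.split(stop)
        if fragment.count("K") >= min_lys and fragment.count("C") <= max_cys
    ]


# Hypothetical translated frame with three fragments.
frame = "MKKAKKLKKC.CCCKKKKKK.MK"
selected = motif_k_fragments(frame)
```

Only the first fragment is kept here: the second has enough lysines but three cysteines, and the third is too short to contain six lysines.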
Table 2. Toxins retrieved from the reference database using pattern motifs.
Motif specificity to anemone seq.
In the database studied, using a total of 13 motifs, we were unable to identify 154 of the 374 available anemone toxin sequences; 108 of these were predicted structures or sequence fragments, and the remaining 46 were cytotoxins (motif K).
As shown in Table 2, motif specificity varies considerably, as was already noted during motif construction. For instance, only motifs 1 and 2 proved specific to anemone toxins. Motifs 3 and 4, initially expected to be specific to sea anemone toxins, were also found in toxins of other animals, mainly nematodes and snakes. Although motif 8 was the rarest, it was found in a spider toxin, a conotoxin and an anemone toxin; therefore it could not be considered specific either.
Data retrieval from EST database
Table 3. Results obtained from A. viridis EST database at each stage of analysis.
Nr clones SignalP approved
Known structures found
In the deduced amino acid sequences, the mature peptide chain was determined using a maturation algorithm [21, 29], and repeated mature sequences were discarded. Finally, 89 unique secreted sequences possessing homology to anemone polypeptide toxins were discovered in the A. viridis database (see Table 3). Duplicated clones were not numerous; the two most abundant sequences, revealed with motifs 3 and K, were repeated in the database 103 and 58 times, respectively. Detailed information on the correspondence of the deduced polypeptides to the EST nucleotide sequences is given in additional file 3. At the next processing stage, the deduced polypeptides were compared with the protein database, resulting in the identification of 7 known toxins.
Polypeptide toxins of A. viridis
The sea anemone A. viridis, earlier described as Anemonia sulcata, is an extensively studied Mediterranean species [34–37]. More than 20 polypeptide toxins of different structure and function have been isolated from this species. They include potassium channel blockers, such as kalicludines, kaliseptine and blood depressing substance (BDS) [38, 39], neurotoxins effectively blocking sodium channels, and Kunitz-type inhibitors of proteolytic enzymes [41, 42].
Thirty-nine sequences were retrieved from the EST database using motifs 11, 13 and K. All of them are presented in additional file 4. A homology search with the blastp algorithm failed to reveal related sequences; however, these structures possess correct signal peptides providing effective secretion. For some sequences, the sites of limited proteolysis and the location of the mature peptide domain may be predicted using earlier developed procedures [21, 29]. The sequences identified with motifs 11 and 13 were named toxin-like; however, their function remains unknown. The group of short sequences contains only two structural families; the other sequences are singletons (additional file 4 panel A). The homology search showed that two sequences, Tox-like av-1 and 5, matched earlier predicted structures. Polypeptides Tox-like av-4, 5 and 6 were repeated in the EST database (see additional file 3).
We also discovered long cysteine-containing sequences, named Tox-like av-9 to Tox-like av-16 (additional file 4 panel B). Their structural peculiarities include a long propeptide fragment following the signal peptide, which is enriched in negatively charged amino acid residues, and numerous arginine and lysine residues in the mature peptide chain. We assume that the propeptide can stabilize the precursor structure by compensating for the excess positive charge of the mature peptide and can prevent premature proteolytic degradation, as was demonstrated for precursors of antimicrobial peptides [46, 47]. The presence of a large number of positively charged amino acid residues points to possible cytotoxic functions of these peptides.
Several other cysteine-free cytotoxins enriched in lysine residues, the so-called cytolysin-like sequences, were retrieved from the EST database with motif K (additional file 4 panel C). These sequences were repeated in the database and formed a homologous family (additional file 3). We suppose that natural venom contains truncated variants of these sequences and suggest that two C-terminal fragments of about 40 residues in length represent the putative mature polypeptides.
With motif K, 12 short sequences were retrieved from the database. All of them except one fell into four homologous families. Since their functions remain obscure, they were called 'hypothetical peptides' (additional file 4 panel D).
There is no sequence similarity between the precursor proteins presented; however, the limited proteolysis motif between the generated neuropeptides is similar, and almost all of them retain a C-terminal amidation signal. Locating the N-terminal amino acid residue is problematic; therefore we suggest that the active neuropeptides consist of 4-6 amino acid residues. The peptides produced during maturation end with the sequence Arg-Pro-NH2; therefore they were called RPamide neuropeptides.
To summarize, the novel polypeptide sequences deduced from the A. viridis EST database were assembled into several families whose members differ by point mutations. This is a common feature of venomous animals, which produce a variety of toxins affecting different targets on the basis of a limited number of sequence patterns. Traditional sequence processing algorithms treat minor sequences as erroneous, but it cannot be ruled out that these structures are in fact correct. Subsequent proteomic research is necessary to test either possibility.
The efficiency of the method developed: a comparative study
The efficiency of SRDA compared to grouping nucleotide sequences into contigs was demonstrated earlier for the EST database of venomous spider glands. When substantial data on the amino acid sequences of homologous proteins are absent, the blast search fails to reveal homology with known proteins. This means that a good consensus sequence, and with it the entire contig, will be excluded from consideration. This is exemplified by the data presented in additional file 3, where no homology was revealed for some sequences.
It is more reasonable to compare the efficiency of mining polypeptide sequences using SRDA with that of other methods that also operate on amino acid sequence patterns, such as Pfam or GO [52, 53]. This check was done using the set of amino acid sequences of the predicted peptides. Eighty-nine sequences in FASTA format were uploaded to the UFO web server. In comparison with SRDA and blastp, the assignment of sequences to protein families by UFO was less successful. The search results are given for each analyzed sequence in additional file 3 together with the blastp data.
A similar approach was applied to the retrieval of polypeptides from the rodent EST database using the conserved Cys pattern of the transforming growth factor-β (TGF-β) family. A special search tool, Motifer, with a flexible query interface was used. Similarly to our algorithm, Motifer operates on sequences translated in several reading frames and takes termination signals into consideration. One of the weak points of the program was its low database scanning speed.
SRDA simplifies both the database itself and the search queries, thus considerably simplifying the comparison algorithm and consequently increasing the speed of analysis. Thus, the search of 12 queries in the reference database of 10489 sequences on a standard desktop PC required 30 sec. We suggest that the simplicity and high speed of analysis make SRDA attractive not only for the study of polypeptide toxins but for other polypeptides as well.
Since some procedures in the analysis are tedious and labor-consuming, it may be useful to combine SRDA with other advanced techniques, for example those based on the Hidden Markov Model. A consolidated algorithm would combine the best features of each component, yielding a precise and fast technology for EST processing.
The SRDA of the A. viridis EST database showed that this method is effective for rapid retrieval of sequences from bulk bioinformatics data. The correct formulation of the query plays the crucial role in the outcome of database screening and requires a small preliminary study. The key residues, whose arrangement is to be fixed in the polypeptide pattern, should be selected on the basis of their structural or functional significance. The introduction of termination signals considerably decreases the number of false positive results.
Using the procedure developed, we identified both new sequences and sequences showing high homology to already described toxins. For two known toxins, the precursor structures were determined. All retrieved sequences formed families of homologous peptides that differ by single or multiple amino acid substitutions, providing additional evidence for the combinatorial principle of natural venom formation. In addition to the 23 earlier reported polypeptide toxins of the sea anemone A. viridis, we discovered 43 novel sequences. Besides toxins, we also found short peptides with regulatory neuronal function, whose role is still to be investigated, and several groups of toxin-like polypeptides.
Simplification of the queries and of the database itself reduces the time of analysis compared to methods based on the search for complete amino acid sequences. The procedure developed may be used for scanning newly generated databases or as a complement to traditionally used approaches. It is suitable not only for retrieval of polypeptide toxins but for finding any type of amino acid sequence once its structural motifs have been established.
This work was financially supported by the Russian Foundation for Basic Research (antimicrobial peptides from natural venoms project) and the program of the Russian Academy of Sciences "Molecular and Cell Biology" (analgesic polypeptides project). SK and EG acknowledge Prof. Roman Efremov (Shemyakin-Ovchinnikov Institute) for reading and criticism of the manuscript.
- Boguski MS, Lowe TM, Tolstoshev CM: dbEST--database for "expressed sequence tags". Nat Genet. 1993, 4: 332-333. 10.1038/ng0893-332.
- Silva EC, Camargos TS, Maranhao AQ, Silva-Pereira I, Silva LP, Possani LD, Schwartz EF: Cloning and characterization of cDNA sequences encoding for new venom peptides of the Brazilian scorpion Opisthacanthus cayaporum. Toxicon. 2009, 54: 252-261. 10.1016/j.toxicon.2009.04.010.
- Vasemagi A, Gross R, Palm D, Paaver T, Primmer CR: Discovery and application of insertion-deletion (INDEL) polymorphisms for QTL mapping of early life-history traits in Atlantic salmon. BMC Genomics. 11: 156. 10.1186/1471-2164-11-156.
- Jongeneel CV: Searching the expressed sequence tag (EST) databases: panning for genes. Brief Bioinform. 2000, 1: 76-92. 10.1093/bib/1.1.76.
- Dong Q, Kroiss L, Oakley FD, Wang BB, Brendel V: Comparative EST analyses in plant systems. Methods Enzymol. 2005, 395: 400-418.
- Rudd S: Expressed sequence tags: alternative or complement to whole genome sequences?. Trends Plant Sci. 2003, 8: 321-329. 10.1016/S1360-1385(03)00131-6.
- Parkinson J, Blaxter M: Expressed sequence tags: an overview. Methods Mol Biol. 2009, 533: 1-12.
- Scheetz TE, Trivedi N, Roberts CA, Kucaba T, Berger B, Robinson NL, Birkett CL, Gavin AJ, O'Leary B, Braun TA, et al: ESTprep: preprocessing cDNA sequence reads. Bioinformatics. 2003, 19: 1318-1324. 10.1093/bioinformatics/btg159.
- Bonfield JK, Smith K, Staden R: A new DNA sequence assembly program. Nucleic Acids Res. 1995, 23: 4992-4999. 10.1093/nar/23.24.4992.
- Chou HH, Holmes MH: DNA sequence quality trimming and vector removal. Bioinformatics. 2001, 17: 1093-1104. 10.1093/bioinformatics/17.12.1093.
- Hotz-Wagenblatt A, Hankeln T, Ernst P, Glatting KH, Schmidt ER, Suhai S: ESTAnnotator: A tool for high throughput EST annotation. Nucleic Acids Res. 2003, 31: 3716-3719. 10.1093/nar/gkg566.
- Lee B, Hong T, Byun SJ, Woo T, Choi YJ: ESTpass: a web-based server for processing and annotating expressed sequence tag (EST) sequences. Nucleic Acids Res. 2007, 35: W159-162. 10.1093/nar/gkm369.
- Nagaraj SH, Deshpande N, Gasser RB, Ranganathan S: ESTExplorer: an expressed sequence tag (EST) assembly and annotation platform. Nucleic Acids Res. 2007, 35: W143-147. 10.1093/nar/gkm378.
- Nam SH, Kim DW, Jung TS, Choi YS, Kim DW, Choi HS, Choi SH, Park HS: PESTAS: a web server for EST analysis and sequence mining. Bioinformatics. 2009, 25: 1846-1848. 10.1093/bioinformatics/btp293.
- Waegele B, Schmidt T, Mewes HW, Ruepp A: OREST: the online resource for EST analysis. Nucleic Acids Res. 2008, 36: W140-144. 10.1093/nar/gkn253.
- Ranganathan S, Menon R, Gasser RB: Advanced in silico analysis of expressed sequence tag (EST) data for parasitic nematodes of major socio-economic importance--fundamental insights toward biotechnological outcomes. Biotechnol Adv. 2009, 27: 439-448. 10.1016/j.biotechadv.2009.03.005.
- Tang Z, Choi JH, Hemmerich C, Sarangi A, Colbourne JK, Dong Q: ESTPiper--a web-based analysis pipeline for expressed sequence tags. BMC Genomics. 2009, 10: 174. 10.1186/1471-2164-10-174.
- Kozlov S, Malyavka A, McCutchen B, Lu A, Schepers E, Herrmann R, Grishin E: A novel strategy for the identification of toxinlike structures in spider venom. Proteins. 2005, 59: 131-140. 10.1002/prot.20390.
- Christie AE: In silico analyses of peptide paracrines/hormones in Aphidoidea. Gen Comp Endocrinol. 2008, 159: 67-79. 10.1016/j.ygcen.2008.07.022.
- Sunagawa S, DeSalvo MK, Voolstra CR, Reyes-Bermudez A, Medina M: Identification and gene expression analysis of a taxonomically restricted cysteine-rich protein family in reef-building corals. PLoS One. 2009, 4: e4865. 10.1371/journal.pone.0004865.
- Kozlov SA, Vassilevski AA, Grishin EV: Secreted protein and peptide biosynthesis: precursor structures and processing mechanisms. Protein biosynthesis. Edited by: Esterhouse TE, Petrinos LB. 2009, New York: Nova Biomedical Books, 225-248.
- Escoubas P, King GF: Venomics as a drug discovery platform. Expert Rev Proteomics. 2009, 6: 221-224. 10.1586/epr.09.45.
- Conticello SG, Gilad Y, Avidan N, Ben-Asher E, Levy Z, Fainzilber M: Mechanisms for evolving hypervariability: the case of conopeptides. Mol Biol Evol. 2001, 18: 120-131.
- Duda TF, Palumbi SR: Molecular genetics of ecological diversification: duplication and rapid evolution of toxin genes of the venomous gastropod Conus. Proc Natl Acad Sci USA. 1999, 96: 6820-6823. 10.1073/pnas.96.12.6820.
- Sollod BL, Wilson D, Zhaxybayeva O, Gogarten JP, Drinkwater R, King GF: Were arachnids the first to use combinatorial peptide libraries?. Peptides. 2005, 26: 131-139. 10.1016/j.peptides.2004.07.016.
- Olivera BM, Hillyard DR, Marsh M, Yoshikami D: Combinatorial peptide libraries in drug design: lessons from venomous cone snails. Trends Biotechnol. 1995, 13: 422-426. 10.1016/S0167-7799(00)88996-9.
- Chen J, Deng M, He Q, Meng E, Jiang L, Liao Z, Rong M, Liang S: Molecular diversity and evolution of cystine knot toxins of the tarantula Chilobrachys jingzhao. Cell Mol Life Sci. 2008, 65: 2431-2444. 10.1007/s00018-008-8135-x.
- Kozlov S, Grishin E: Classification of spider neurotoxins using structural motifs by primary structure features. Single residue distribution analysis and pattern analysis techniques. Toxicon. 2005, 46: 672-686. 10.1016/j.toxicon.2005.07.009.
- Kozlov SA, Grishin EV: The universal algorithm of maturation for secretory and excretory protein precursors. Toxicon. 2007, 49: 721-726. 10.1016/j.toxicon.2006.11.007.
- Bendtsen JD, Nielsen H, von Heijne G, Brunak S: Improved prediction of signal peptides: SignalP 3.0. J Mol Biol. 2004, 340: 783-795. 10.1016/j.jmb.2004.05.028.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.
- Sabourault C, Ganot P, Deleury E, Allemand D, Furla P: Comprehensive EST analysis of the symbiotic sea anemone, Anemonia viridis. BMC Genomics. 2009, 10: 333. 10.1186/1471-2164-10-333.
- Anderluh G, Macek P: Cytolytic peptide and protein toxins from sea anemones (Anthozoa: Actiniaria). Toxicon. 2002, 40: 111-124. 10.1016/S0041-0101(01)00191-X.
- Moran Y, Weinberger H, Sullivan JC, Reitzel AM, Finnerty JR, Gurevitz M: Concerted evolution of sea anemone neurotoxin genes is revealed through analysis of the Nematostella vectensis genome. Mol Biol Evol. 2008, 25: 737-747. 10.1093/molbev/msn021.
- Venn AA, Tambutte E, Lotto S, Zoccola D, Allemand D, Tambutte S: Imaging intracellular pH in a reef coral and symbiotic anemone. Proc Natl Acad Sci USA. 2009, 106: 16574-16579. 10.1073/pnas.0902894106.
- Castaneda O, Harvey AL: Discovery and Characterization of Cnidarian Peptide Toxins that Affect Neuronal Potassium Ion Channels. Toxicon. 2009, 54: 1119-1124. 10.1016/j.toxicon.2009.02.032.
- Moran Y, Weinberger H, Lazarus N, Gur M, Kahn R, Gordon D, Gurevitz M: Fusion and retrotransposition events in the evolution of the sea anemone Anemonia viridis neurotoxin genes. J Mol Evol. 2009, 69: 115-124. 10.1007/s00239-009-9258-x.
- Schweitz H, Bruhn T, Guillemare E, Moinier D, Lancelin JM, Beress L, Lazdunski M: Kalicludines and kaliseptine. Two different classes of sea anemone toxins for voltage sensitive K+ channels. J Biol Chem. 1995, 270: 25121-25126. 10.1074/jbc.270.42.25121.
- Diochot S, Schweitz H, Beress L, Lazdunski M: Sea anemone peptides with a specific blocking activity against the fast inactivating potassium channel Kv3.4. J Biol Chem. 1998, 273: 6744-6749. 10.1074/jbc.273.12.6744.
- Alsen C: Biological significance of peptides from Anemonia sulcata. Fed Proc. 1983, 42: 101-108.
- Hemmi H, Kumazaki T, Yoshizawa-Kumagaye K, Nishiuchi Y, Yoshida T, Ohkubo T, Kobayashi Y: Structural and functional study of an Anemonia elastase inhibitor, a "nonclassical" Kazal-type inhibitor from Anemonia sulcata. Biochemistry. 2005, 44: 9626-9636. 10.1021/bi0472806.
- Beress L, Wunderer G, Wachter E: Amino acid sequence of toxin III from Anemonia sulcata. Hoppe Seylers Z Physiol Chem. 1977, 358: 985-988.
- Andreev YA, Kozlov SA, Koshelev SG, Ivanova EA, Monastyrnaya MM, Kozlovskaya EP, Grishin EV: Analgesic compound from sea anemone Heteractis crispa is the first polypeptide inhibitor of vanilloid receptor 1 (TRPV1). J Biol Chem. 2008, 283: 23914-23921. 10.1074/jbc.M800776200.
- Shiomi K, Honma T, Ide M, Nagashima Y, Ishida M, Chino M: An epidermal growth factor-like toxin and two sodium channel toxins from the sea anemone Stichodactyla gigantea. Toxicon. 2003, 41: 229-236. 10.1016/S0041-0101(02)00281-7.
- Honma T, Hasegawa Y, Ishida M, Nagai H, Nagashima Y, Shiomi K: Isolation and molecular cloning of novel peptide toxins from the sea anemone Antheopsis maculata. Toxicon. 2005, 45: 33-41. 10.1016/j.toxicon.2004.09.013.
- Vassilevski AA, Kozlov SA, Grishin EV: Antimicrobial peptide precursor structures suggest effective production strategies. Recent Pat Inflamm Allergy Drug Discov. 2008, 2: 58-63. 10.2174/187221308783399261.
- Kozlov SA, Vassilevski AA, Feofanov AV, Surovoy AY, Karpunin DV, Grishin EV: Latarcins, antimicrobial and cytolytic peptides from the venom of the spider Lachesana tarabaevi (Zodariidae) that exemplify biomolecular diversity. J Biol Chem. 2006, 281: 20983-20992. 10.1074/jbc.M602168200.
- Darmer D, Hauser F, Nothacker HP, Bosch TC, Williamson M, Grimmelikhuijzen CJ: Three different prohormones yield a variety of Hydra-RFamide (Arg-Phe-NH2) neuropeptides in Hydra magnipapillata. Biochem J. 1998, 332 (Pt 2): 403-412.View ArticlePubMedPubMed CentralGoogle Scholar
- Gajewski M, Leitz T, Schloßherr J, Plickert G, Elmasri H: LWamides from Cnidaria constitute a novel family of neuropeptides with morphogenetic activity. Dev Genes Evol. 1996, 205: 232-242.Google Scholar
- McFarlane ID, Anderson PA, Grimmelikhuijzen CJ: Effects of three anthozoan neuropeptides, Antho-RWamide I, Antho-RWamide II and Antho-RFamide, on slow muscles from sea anemones. J Exp Biol. 1991, 156: 419-431.PubMedGoogle Scholar
- Katsukura Y, Ando H, David CN, Grimmelikhuijzen CJ, Sugiyama T: Control of planula migration by LWamide and RFamide neuropeptides in Hydractinia echinata. J Exp Biol. 2004, 207: 1803-1810. 10.1242/jeb.00974.View ArticlePubMedGoogle Scholar
- Coggill P, Finn RD, Bateman A: Identifying protein domains with the Pfam database. Curr Protoc Bioinformatics. 2008, Chapter 2: Unit 25-Google Scholar
- Mistry J, Finn R: Pfam: a domain-centric method for analyzing proteins and proteomes. Methods Mol Biol. 2007, 396: 43-58. full_text.View ArticlePubMedGoogle Scholar
- Meinicke P: UFO: a web server for ultra-fast functional profiling of whole genome protein sequences. BMC Genomics. 2009, 10: 409-10.1186/1471-2164-10-409.View ArticlePubMedPubMed CentralGoogle Scholar
- Jornvall H: Motifer, a search tool for finding amino acid sequence patterns from nucleotide sequence databases. FEBS Lett. 1999, 456: 85-88. 10.1016/S0014-5793(99)00941-2.View ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.