Volume 10 Supplement 3
Phylogenetic reconstruction using secondary structures and sequence motifs of ITS2 rDNA of Paragonimus westermani (Kerbert, 1878) Braun, 1899 (Digenea: Paragonimidae) and related species
© Prasad et al; licensee BioMed Central Ltd. 2009
Published: 3 December 2009
Most phylogenetic studies using current methods have focused on primary DNA sequence information. However, RNA secondary structures are particularly useful in systematics because they include characteristics that give "morphological" information, not found in the primary sequence. In several mountainous regions of Northeastern India, foci of Paragonimus (lung fluke) infection reportedly involve species that are known to prevail in neighbouring countries. The present study was undertaken to demonstrate the sequence analysis of the ribosomal DNA (ITS2) of the infective (metacercarial) stage of the lung fluke collected from the edible crab hosts that are abundant in a mountain stream of the area (Miao, Changlang District in Arunachal Pradesh) and to construct its phylogeny. Using the approach of molecular morphometrics that is based on ITS2 secondary structure homologies, phylogenetic relationships of the various isolates of Paragonimus species that are prevalent in the neighbouring Near-eastern countries have been discussed.
Initially, ten predicted RNA secondary structures were reconstructed and the topology based only on the predicted RNA secondary structure of the ITS2 region resolved most relationships among the species studied. We obtained three similar topologies for seven species of the genus Paragonimus on the basis of traditional primary sequence analysis using MEGA and a Bayesian analysis of the combined data. The latter approach allowed us to include both primary sequence and RNA molecular morphometrics; each data partition was allowed to have a different evolution rate. Paragonimus westermani was found to group with P. siamensis of Thailand; this was best supported by both the molecular morphometrics and combined analyses. P. heterotremus, P. proliferus, P. skrjabini, P. bangkokensis and P. harinasutai formed a separate clade in the molecular phylogenies, and were reciprocally monophyletic with respect to other species. ITS2 sequence motifs allowed an accurate in-silico distinction of lung flukes.
Data indicate that ITS2 motifs (≤ 50 bp in size) can be considered a promising tool for trematode species identification. RNA secondary structure analysis could be a valuable tool for distinguishing new species and completing Paragonimus systematics, more so because ITS2 secondary structure contains more information than the usual primary sequence alignment.
The lung flukes of the genus Paragonimus have been known as one of the most important zoonotic parasites causing paragonimiasis, also known as endemic haemoptysis, in man. It is estimated that over 20 million people are infected worldwide due to several species of Paragonimus . Over 40 species are known to infect the lung of different mammalian hosts (representing as many as eleven families) throughout the world  and approximately 15 species are known to infect humans. The parasite can migrate to several other vital tissues including brain . The best known species is P. westermani  Braun, 1899 - a human parasite that can undergo development in as many as 16 different snail species and 50 crustacean species. Beside P. westermani, several other species namely, P. pulmonalis (Baelz, 1880) Miyazaki, 1978; P. ohirai Miyazaki, 1939; P. iloktsunensis Chen, 1940; P. skrjabini Chen, 1959; P. miyazaki Kamo et al., 1961 and P. heterotremus Chen and Hsia, 1964 - all reported to be occurring in Asia; P. africanus and P. uterobilateralis Voelker and Vogel, 1965 in Africa; and P. mexicanus Miyazaki and Ishii, 1968 in America are considered pathogenic to man. While P. westermani is distributed mostly in Asia, P. heterotremus is the predominant causative agent of paragonimiasis in Thailand .
In the context of India, the states of West Bengal, Assam and some other parts of the country are endemic foci of human paragonimiasis. This infection has been reported in a sizeable human population of Manipur, a north-eastern state of India [6, 7]. Very recently in Manipur and Arunachal Pradesh (Northeast India), the suspected foci of human infection where consumption of crustacean intermediate hosts is of regular practice, the Chinese species, P. hueitungensis and P. heterotremus, respectively were identified as etiological agents of paragonimiasis [8, 9]. However, no or scanty information is available about the prevalence of the parasite among its molluscan and crustacean intermediate hosts as even in the suspected foci of human infection.
Morphology of the encysted and excysted metacercariae, which occur as the infective stage in the muscle tissue of the crustacean second intermediate host, has been conventionally used in identification of species of Paragonimus. The taxonomy of Paragonimus spp has been based mainly on morphological data complemented with ecological, cytological and pathological results as well as clinical manifestations. Morphological differences found on stained and mounted adult specimens have been widely used to discriminate between platyhelminth species. Thus, identification of closely related metazoan parasite species based on morphological characters alone can be difficult. In recent times, the amplification of specific DNA regions via the polymerase chain reaction (PCR) and improved sequencing techniques have been employed to resolve taxonomic issues related to various helminth parasites by comparing their DNA. PCR-based techniques utilizing the rDNA ITS2 sequences, which occur between the 5.8S and 28S coding regions, have proven to be a reliable tool to identify the helminth species and their phylogenetic relationships [10–12]. Studies on phylogeny and/or intraspecific variation in Paragonimus species have also been done using ITS2 region [13, 14].
Most phylogenetic studies using current methods have focused on primary DNA sequence information. However, RNA secondary structures are particularly useful in systematics because they include characteristics, not found in the primary sequence, that give 'morphological' information. The molecular morphometrics approach has been employed in the present study for comparisons between primary and secondary structure information. It is an established fact that rRNA structure is highly conserved throughout evolution as most of the folding is functionally important despite primary sequence divergence . The novel approach of molecular morphometrics that relies both on traditional morphological comparison and on molecular sequence comparison by measuring the structural parameters of the ITS2 secondary structure homologies (geometrical features, bond energies, base composition etc.) is recently being used to study the phylogenetic relationships of various species . It is well known that rRNA is highly conserved throughout evolution. Thus, the secondary-structure elements of the RNA molecule, i.e., the helices, loops, bulges, and separating single-stranded portions, can be considered phylogenetic characters [17–19].
The present study was undertaken to demonstrate the sequence analysis of the ribosomal DNA (ITS2) of the infective metacercarial stages of the lung fluke, which abound in the muscle tissue of the edible crab hosts abundant in mountain streams of a suspected focus of infection and to construct its phylogeny using rRNA secondary structures supplementing the primary sequence analysis.
Parasite material and DNA isolation, amplification and sequencing
Naturally infected freshwater edible crabs (Barytelphusa lugubris) were collected from a mountain stream of Miao, Changlang District in Arunachal Pradesh (Altitude - 213 mASL, Longitude - 96°-15'N and Latitude - 27°-30'E). Metacercariae were isolated from the muscles of the crustacean host by digestion technique in artificial gastric juice. The sediments were examined for Paragonimus metacercariae under a dissecting stereoscopic microscope.
For the purpose of DNA extraction, metacercariae collected from the same part of stream were pooled. The 70% alcohol-fixed metacercariae were further processed for DNA extraction and PCR amplification following the earlier standardised procedure and protocol. The rDNA region spanning the ITS2 was amplified from metacercarial DNA by PCR; using trematode universal primers designed based on Schistosoma species .
3S (forward): 5'GGTACCGGTGGATCACTCGGCTCGTG-3'
A28 (reverse): 5'-GGGATCCTGGTTAGTTTCTTTTCCTCCGC-3'
The PCR amplification was performed following the standard protocol with minor modifications as described earlier .
Molecular phylogenetic analysis using bioinfomatic tools
The DNA sequences were put to further analysis with the usage of bioinfomatics tools including similarity search using BLAST (Basic Local Alignment Search Tool); http://www.ncbi.nlm.nih.gov/blast, and phylogenetic prediction using ClustalW provided at the http://www.ebi.ac.uk/clustalw for query DNA sequence. A total of 10 ITS2 sequences were selected (4 Indian isolates and 6 Isolates of Neighbouring countries) depending upon their BLAST hits and E-value. Only unique sequences were used in tree construction; ITS sequences arranged with MEGA format were entered in the MEGA  for construction of phylogenetic trees that were inferred using distance method like Neighbor Joining and character state method Maximum Parsimony. Test of phylogenetic accuracy was done by bootstrapping.
Predicted ITS2 RNA secondary structures and analyses
Secondary structures of ITS2 sequences of various paragonimid species were reconstructed by aligning their sequences using Bioedit . The acquired structures with restrictions and constrains were submitted in MFOLD . RNA was folded at a fixed temperature of 37°C and the structure chosen from different output files was the desired 6-helicoidal ring or the one with the highest negative free energy if various similar structures were obtained.
Motif identification, testing and validation
The ITS sequence motifs were identified from aligned sequences of the data set for the species using PRATT software http://www.ebi.ac.uk/Tools/pratt/index.html. The minimum percentage of sequences to match (C%) parameter was adjusted to report pattern matching at 100% of the sequence input. The motifs were expressed using the DNA alphabet (A, T, C, G) in PROSITE language . The validation of the motifs was performed for each species using a "PATTERN MATCHING" Web application http://genoweb.univ-rennes1.fr/Serveur-GPO/. A motif was considered highly specific to a Paragonimus species if it matched most or all the ITS sequences of this species but no other ITS of any other trematode species.
Bayesian phylogenetic analysis
DNA sequences were aligned using ClustalX 2.0.7. The interleaved NEXUS file was edited manually in order for it to be recognized by Mr. Bayes V3.1.2 programme. Phylogenetic analysis was carried out using the Bayesian approach with combined datasets (wherein each data partition is allowed to have a different evolution rate) using default parameters. The cladogram with the posterior probabilities for each split and a phylogram with mean branch lengths were generated and subsequently read by the tree drawing program Tree view V1.6.6.
Construction of phylogenetic trees
In-silico identification of Paragonimus species based on pattern matching ITS motifs
BLAST outputs of Paragonimus ITS sequence motifs against NCBI GenBank database (nr db)
Species motif patterns (5'-3' ends)
No. of best hits
Secondary structure analysis
The ITS2 RNA secondary structures yielded homologous models that grouped conserved species features in the Paragonimus genus (e.g. long helix IV), though it is important to note that these molecular structures can also produce homoplasies because of selective pressures acting on particular helix features such as the 6-helicoidal ring core, conserved among most eukaryotes. This approach resulted in a supported Bayesian inference phylogram very similar to the molecular morphometrics approach. The major difference between the topologies obtained by the Bayesian (e.g. molecular morphometrics using partitions) and primary sequence methodologies was the placement of P. siamensis of Thailand, which could be due to the lack of phylogenetic signal and the likely saturation of ITS2 sequences after multiple alignments. However, P. westermani, grouping with P. siamensis of Thailand, showed an overall similarity in the ITS2 rRNA folding and identical secondary structures, which in the remaining five isolates showed some variation. Generally RNA secondary structure prediction programs rely on free energy minimization using nearest-neighbor parameters for predicting the overall structural stability in terms of Gibbs free energy at 37°C. The observed similarities at the secondary structural level are further reflected at the energy level (-ΔG). The difference in their topology, however, is due to differences in nucleotide sequence lengths. Moreover, the observed phylogenetic trend was identified with respect to the target accessibility sites for the seven different isolates. Bayesian analysis of the alignment retained the same topology and supported the same branches of the trees derived from the primary sequence data.
In sequence analysis of the rDNA ITS2, comparing with the known sequences of the other lung fluke species, the present study revealed that the sequence of ITS2 (plus flanking regions) show close resemblance with that of Paragonimus westermani. In phylogenetic analysis, as a general rule, if the bootstrap value for a given interior branch of a phylogenetic tree is 70% or higher, then the topology at that branch is considered reliable. Our results showed a bootstrap value to be >70% among the trees obtained and the ITS2 sequence resembled Paragonimus westermani, another Indian isolate [GenBank: DQ336246]. The comparison of ITS sequences from the parasites of different hosts and of different countries indicates that there exists a high species-specific homogeneity. In the present study, primary sequence analysis revealed a close relationship between the query sequence and other isolates from NER of India and those from neighbouring countries. Thus, on the basis of absolute matching of ITS2 sequence that could be used as one of the species markers, it can be concluded that Paragonimus species prevailing in Miao region of Arunachal Pradesh is indeed P. westermani and not P. heterotremus that was earlier reported from the same host and locality .
In phylogenetic studies involving secondary structure analysis as a tool, RNA folding is used for refining the alignment. Although the correct alignment with high precision cannot be determined, in extensively folded structure, yet this procedure improves multiple alignment and hence phylogeny construction. The measurable structural parameters of the molecules are directly used as specific characters to construct a phylogenetic tree. These structures are inferred from the sequence of the nucleotides, often using energy minimization . Molecular morphometrics appears to be complimentary to classical primary sequence analysis in phylogenetic studies as it takes into consideration only the size variations of homologous structural segments and this choice implies that the overall architecture of the molecule remains same among the observed taxa. This method helps in taking into account the regions where multiple alignments are barely reliable because of large number of insertion/deletion operation. Besides, the secondary structures are built on each sequence separately thereby making it unnecessary for a sharp computational sequence analysis. Thus homologous recognizable characters on the secondary structures are easily traced out contrary to finding right counterpart for each nucleotide in every other sequence.
Several patterns of predicted secondary structures of RNA that were constructed from unique ITS sequences from different geographical isolates of Paragonimus, provided us with the additional information for correct identification of the species prevalent in the region. The secondary structure analysis of the same data also confirmed the results mentioned for primary sequence analysis. Differences in their topology are only due to the fact that there are variable lengths of the sequences. However, there are difficulties in de fining a distance between two related structures with variable topologies . Following earlier studies P. skrjabini, P. miyazaki, P. szechuanensis, P. hueitungensis and P. veocularis are considered synonyms; the name of P. hokuoensis was proposed for two individual metacercaria of distinctive appearance from southern Yunnan and a number of questions remained unresolved . Nevertheless, because there were inconsistencies in the placement of a few Paragonimus species, this study needs to be extended, in order to gain a better understanding of the systematics of this group as well as the evolution of their predicted ITS2 RNA secondary structures.
The molecular study of the genus Paragonimus, which is the most common lung fluke throughout the globe, has gained impetus in the recent times. ITS2 motifs (≤ 50 bp in size) can be considered a promising tool for trematode species identification. RNA secondary structure analysis could be a valuable tool for distinguishing new species and completing Paragonimus systematic, more so because ITS2 secondary structure contains more information than the usual primary sequence alignment.
Other papers from the meeting have been published as part of BMC Bioinformatics Volume 10 Supplement 15, 2009: Eighth International Conference on Bioinformatics (InCoB2009): Bioinformatics, available online at http://www.biomedcentral.com/1471-2105/10?issue=S15.
This study was carried out under the 'All India Co-ordinated Project on Capacity Building in Taxonomy: Research on Helminths', awarded to VT by Ministry of Environment & Forests, GOI; 'North East Parasite Information and Analysis Centre (NEPIAC): an in silico approach' sanctioned to VT and DKB by DIT, GOI; DBT Project to VT and AC; DSA (UGC-SAP) in Zoology and UGC's UPE (Biosciences) programme in the School of Life Sciences at NEHU, Shillong.
This article has been published as part of BMC Genomics Volume 10 Supplement 3, 2009: Eighth International Conference on Bioinformatics (InCoB2009): Computational Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2164/10?issue=S3.
- Toscano C, Yu SH, Nunn P, Mott KE: Paragonimiasis and tuberculosis, diagnostic confusion: a review of the literature. Trop Disease Bulletin. 1995, 92: R1-R26.Google Scholar
- Bunnag D, Harinasuta T: Opisthorchiasis, clonorchiasis and paragonimiasis. Tropical and Geographical Medicine. Edited by: Warren KS, Mahmoud AAF. 1985, New York: McGraw-Hill Iac, 461-470.Google Scholar
- Kusner DJ, King CH: Cerebral paragonimiasis. Seminars in Neurology. 1993, 13: 201-208. 10.1055/s-2008-1041126.View ArticlePubMedGoogle Scholar
- Kerbert C: Zur Trematoden-Kenntnis. Zoologischer Anzeiger. 1878, 1: 271-273.Google Scholar
- Blair D, Xu ZB, Agatsuma T: Paragonimiasis and the Genus Paragonimus (Review). Adv Parasitol. 1999, 42: 113-222. full_text.View ArticlePubMedGoogle Scholar
- Razaque MA, Mutum SS, Singh TS: Recurrent haemoptysis? Think of Paragonimiasis. Trop Doct. 1991, 21: 153-155.PubMedGoogle Scholar
- Singh TS, Mutum S, Razaque MA, Singh YI, Singh EY: Paragonimiasis in Manipur. Indian J Med Res. 1993, 97: 247-252.PubMedGoogle Scholar
- Singh TS: Occurrence of the lung fluke Paragonimus hueitungensis in Manipur, India. Zhonghua Yi Xue Za Zhi (Taipei). 2002, 65: 426-429.Google Scholar
- Narain K, Devi KR, Mahanta J: Paragonimus and paragonimiasis - A new focus in Arunachal Pradesh, India. Curr Sci. 2003, 84: 985-987.Google Scholar
- Blair D, Wu B, Chang ZS, Gong X, Agatsuma T, Zhang YN, Chen SH, Lin JX, Chen MG, Waikagui J, Guevara AG, Feng Z, David GM: A molecular perspective on the genera Paragonimus Braun, Euparagonimus Chen and Pagumogonimus Chen. J Helminthol. 1999, 73: 295-299.PubMedGoogle Scholar
- Leon-Regagnon V, Brooks DR, Perez-Ponce de Leon G: Differentiation of Mexican species of Haematoloechus Looss, 1899 (Digenea: Plagiorchiformes): Molecular and morphological evidence. J Parasitol. 1999, 85: 935-946. 10.2307/3285832.View ArticlePubMedGoogle Scholar
- Iwagami M, Lo Y, Su K, Lai PF, Fukushima M, Nakano M, Blair D, Kawashima K, Agatsuma T: Molecular phylogeographic studies on Paragonimus westermani in Asia. J Helminthol. 2000, 74: 315-322.PubMedGoogle Scholar
- Blair D, Campos A, Cummings MP, Laclette JP: Evolutionary biology of parasitic platyhelminths: the role of molecular phylogenetics. Parasitol Today. 1996, 12: 66-71. 10.1016/0169-4758(96)80657-0.View ArticlePubMedGoogle Scholar
- Blair D, Agatsuma T, Watanobe T, Okamoto M, Ito A: Geographical genetic structure within the human lung fluke, Paragonimus westermani, detected from DNA sequences. Parasitology. 1997, 115: 411-417. 10.1017/S0031182097001534.View ArticlePubMedGoogle Scholar
- Caetano-Anolles G: Tracing the evolution of RNA structure in ribosomes. Nucleic Acids Res. 2002, 30: 2575-2587. 10.1093/nar/30.11.2575.PubMed CentralView ArticlePubMedGoogle Scholar
- Billoud B, Guerrucci MA, Masselot M, Deutsch JS: Cirripede phylogeny using a novel approach: molecular morphometrics. Mol Biol Evol. 2000, 17: 1435-1445.View ArticlePubMedGoogle Scholar
- Zwieb C, Glotz C, Brimacombe R: Secondary structure comparisons between small subunit ribosomal RNA molecules from six different species. Nucleic Acids Res. 1981, 9: 3621-3640. 10.1093/nar/9.15.3621.PubMed CentralView ArticlePubMedGoogle Scholar
- Schultz J, Maisel S, Gerlach D, Müller T, Wolf M: A common core of secondary structure of the internal transcribed spacer 2(ITS2) throughout the Eukaryota. RNA. 2005, 11: 361-364. 10.1261/rna.7204505.PubMed CentralView ArticlePubMedGoogle Scholar
- Grajales A, Aguilar C, Sánchez JA: Phylogenetic reconstruction using secondary structures of Internal Transcribed Spacer 2 (ITS2, rDNA): finding the molecular and morphological gap in Caribbean gorgonian corals. BMC Evol Biol. 2007, 7: 90-10.1186/1471-2148-7-90.PubMed CentralView ArticlePubMedGoogle Scholar
- Bowles J, Blair D, McManus DP: A molecular phylogeny of the human schistosomes. Mol Phylogenet Evol. 1995, 4: 103-109. 10.1006/mpev.1995.1011.View ArticlePubMedGoogle Scholar
- Prasad PK, Tandon V, Chatterjee A, Bandyopadhyay S: PCR-based determination of internal transcribed spacer (ITS) regions of ribosomal DNA of giant intestinal fluke, Fasciolopsis buski (Lankester, 1857) Looss, 1899. Parasitol Res. 2007, 101: 1581-1587. 10.1007/s00436-007-0680-y.View ArticlePubMedGoogle Scholar
- Tamura K, Dudley J, Nei M, Kumar S: MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Molecular Biology and Evolution. 2007, 24: 1596-1599. 10.1093/molbev/msm092.View ArticlePubMedGoogle Scholar
- Hall TA: BioEdit: a user-friendly biological sequence alignment editor and analsysis program for Windows 95/98/NT. Nucl Acids Symp. 1999, 41: 95-98.Google Scholar
- Zuker M: Mfold web server for nucleic acid folding and hybridization prediction. Nucl Acids Res. 2003, 31: 3406-3415. 10.1093/nar/gkg595.PubMed CentralView ArticlePubMedGoogle Scholar
- Bailey TL, Elkan C: Fitting a mixure model by expectation maximization to discover motifs in biopolymers. Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology. 1994, AAAI Press, Menlo Park, California, 28-36.Google Scholar
- Zuker M: Prediction of RNA secondary structure by energy minimization. Methods Mol Biol. 1994, 25: 267-294.PubMedGoogle Scholar
- Shapiro BA: An algorithm for comparing multiple RNA secondary structures. Comput Appl Biosci. 1998, 4: 387-393.Google Scholar
- Blair D, Zhengshan C, Minggang C, Aili C, Bo W, Agatsuma T, Iwagami M, Corlis D, Chengbin F, Ximei Zhan: Paragonimus skrjabini Chen, 1959 (Digenea: Paragonimidae) and related species in eastern Asia: a combined molecular and morphological approach to identification and taxonomy. Syst Parasitol. 2005, 60: 1-21. 10.1007/s11230-004-1378-5.View ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.