- Methodology article
- Open Access
Acquisition of genome information from single-celled unculturable organisms (radiolaria) by exploiting genome profiling (GP)
BMC Genomics volume 7, Article number: 135 (2006)
There is no effective method to obtain genome information from single-celled unculturable organisms such as radiolarians. Even worse, such organisms are often very difficult to collect. Sequence analysis of 18S rDNA has been carried out, but obtaining the data has been difficult and it has provided a rather limited amount of genome information. In this paper, we have developed a method which provides a sufficient amount of data from an unculturable organism. The effectiveness of this method was demonstrated by applying it to the provisional classification of a set of unculturable organisms (radiolarians).
A dendrogram of the single-celled unculturable species was drawn from the similarity score termed PaSS, and it was consistent with the conventional, phenotype-based taxonomy of these species. This shows that the genome profiling-based technology developed here can obtain, from a single-celled organism, genome information sufficient for identifying and classifying species.
Because this method is simple, general, and yet powerful, it can be applied to the genome analysis of various organisms and cells, especially single-celled, unculturable ones.
Even in the post-genomic era, genome information is often difficult to obtain. This is especially true for organisms that are difficult to collect due to the extremes of their living environment. Radiolaria are protozoa used in the study of geology and environmental and ecological sciences because they leave fossils of hard skeletal material that provide a record spanning more than 500 million years. Studies on Radiolaria have progressed since the early work by Haeckel [1, 2], which was based on phenotypic analysis. The diverse shapes of their skeletons have been useful characteristics for the identification of species. However, there is a need to obtain genomic information about these organisms in order to more fully understand their diversity and phylogeny. Recently, the gene encoding the small subunit ribosomal RNA (18S rDNA) has been used for classifying species of Radiolaria, which are taxonomically controversial [3–5]. This is by no means an easy task, not only because of the difficulty in collecting samples but also because each organism consists of only a single cell and methods for culturing Radiolaria in the laboratory are not available. This means that only a single set of genomic DNA molecules can be obtained for experiments such as PCR. Species identification and/or classification from a single cell is valuable for various purposes, such as genome analysis of precious specimens using only a minute fraction of the available sample, or tracing alterations in the genome of somatic cells (e.g., differences between the genomes of cells from the brain and the heart). It will also be useful for settling controversial classification problems regarding Radiolaria. The sequence information obtained from 18S rDNA is often insufficient to identify and classify species in detail, so a method that provides a larger amount of information is needed.
This challenge is addressed in this paper: the authors developed a general way to obtain genome information from a single cell. For this purpose, three successive sessions of random PCR were successfully employed. The experimental results obtained by genome profiling (GP) [7–10] of a group of Radiolaria gave taxonomically consistent data. The authenticity of the results, including sequence analysis data, is extensively discussed.
Results and discussion
Experimental materials and genome profiling
Three sessions of random PCR (70 cycles in all) enabled us to obtain DNA fragments from single-celled organisms. The DNA fragments were subjected to micro-TGGE, providing genome profiles as shown in Fig. 1b. In this series of experiments, currently unculturable and difficult-to-collect species of Haeckel's Radiolaria and related organisms (C and E in Table 1) were collected in the field and examined (Fig. 1). The genome profiles of these organisms were processed to generate species identification dots (spiddos) as shown in Fig. 1c. Since these dots were obtained after standardization and normalization using the internal reference (see Materials and Methods), the coordinates of these spiddos can be regarded as sufficiently reproducible, intrinsic properties of each genome that are not influenced by environmental factors [12, 13]. The visual pattern of the spiddos of an organism on the plane indicates roughly which family it belongs to. In contrast, phenotypes of organisms such as those shown in Fig. 1a carry abundant information about the environment, which must have affected the formation of the detailed shape and size. Therefore, spiddos can be used to assign the species and to measure the difference (distance) between two organisms at the genome level.
PaSS and clustering analyses
Quantitative analysis of spiddos has been established as PaSS (pattern similarity score) analysis. In preceding studies, the reproducibility of this method was thoroughly examined and found to be within 1% error in PaSS values for a moderately trained operator. This was determined by measuring the self-PaSS, i.e., the PaSS value for the same genome (operationally, it is obtained from two genome profiles produced in two independent trials beginning with the same culture (clone) or the same tissue of an organism). In other words, two profiles of the same sample usually give a PaSS value of 0.986 ± 0.003. As shown in Fig. 2, PaSS values for each pair of Polycystinea are significantly larger than those of the other pairs (cf. a PaSS value of 1 indicates a complete match, and two unrelated genomes give 0.435 ± 0.091). In contrast, pairs composed of a radiolarian and a bacterium or a bacterium and a fungus gave significantly lower PaSS values, as is clearly shown in the lower left triangle of Fig. 2. To evaluate the PaSS values obtained here, we can refer to a list of PaSS values compiled for all organisms examined so far (Table 2). Although these values were obtained statistically from GP data of more than 100 species and may change depending on the number and the species of organisms analyzed, they can be used, with limited reliability, as a reference for estimating the difference between two genomes at a particular taxonomic level. According to Table 2, the values presented in Fig. 2 (~0.85) seem relatively high (almost at the level of genus), and this result may indicate the degree of similarity of the genomes in Haeckel's Radiolaria.
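To put the quoted reference values in perspective, the short Python sketch below expresses a PaSS value as a number of spreads above the unrelated-genome mean and checks whether it falls inside the self-PaSS range. It assumes that each quoted ± figure behaves like a standard deviation, which the text does not state; the function names and the three-spread tolerance are illustrative choices.

```python
# Reference values quoted in the text (PaSS is dimensionless, between 0 and 1).
SELF_MEAN, SELF_SPREAD = 0.986, 0.003            # same genome, independent trials
UNRELATED_MEAN, UNRELATED_SPREAD = 0.435, 0.091  # two unrelated genomes


def spreads_above_unrelated(pass_value: float) -> float:
    """Distance of a PaSS value from the unrelated-genome mean, in units of
    the quoted spread (ASSUMED here to act like a standard deviation)."""
    return (pass_value - UNRELATED_MEAN) / UNRELATED_SPREAD


def within_self_range(pass_value: float, n_spreads: float = 3.0) -> bool:
    """True if the value lies within n_spreads of the self-PaSS mean.
    The default tolerance of 3 is an illustrative choice, not from the paper."""
    return abs(pass_value - SELF_MEAN) <= n_spreads * SELF_SPREAD


# The ~0.85 level reported for the Polycystinea pairs in Fig. 2.
z_polycystinea = spreads_above_unrelated(0.85)
print(f"0.85 lies {z_polycystinea:.2f} spreads above the unrelated mean")
print("0.85 indistinguishable from self-PaSS?", within_self_range(0.85))
```

Under this assumption, a value near 0.85 lies well above the unrelated-genome level while remaining clearly distinguishable from a self-comparison.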
The phenotype tree is drawn in Fig. 3a. For comparison, we applied a clustering analysis (group average method) to the PaSS scores and obtained the results shown in Fig. 3b. A reasonable result was obtained at a clustering distance of 0.149, where all of the members of Polycystinea (A1~A5) were integrated into the same cluster, although the cluster also includes Foraminiferida (C) as a lone outsider and there are some rearrangements of branches among the species of Polycystinea. At the same time, mutually distant species such as the diatom, bacteria, and fungus (E~H) remained separate. At a distance of 0.163, all of the Rhizaria (A1~A5, B~D) were combined into one cluster while the other mutually distant species (E~H) were still isolated, conforming to traditional taxonomy. Three points should be considered: the classification of Polycystinea is still under construction; there is generally an experimental error of approximately 1% in the PaSS values; and some Radiolaria, such as the Polycystinea groups A1 and A3, are well known to have symbiotic algae within the cell, while other Polycystinea (A2, A4, A5) do not. In this light, the apparent minor differences may be insignificant (in other words, beyond the resolution of the current approach), and as a whole the clustering seems rational. Therefore, these results strongly indicate that the method applied here, i.e., random PCR-based genome profiling, can provide a sufficient amount of genomic information from a single cell. The GP method supported Haeckel's classical view of Radiolaria as a monophyletic group and stands in opposition to the polyphyly of Radiolaria suggested by rRNA and actin phylogenies [6, 16], although the results obtained here are provisional.
Reliability tested by sequencing
Since there are no published genome sequences for any radiolarians or related organisms, the results obtained here must be critically evaluated based on other available information. Currently, the most common approach is to sequence the 18S rDNA; however, it is very difficult to carry out both GP and 18S rDNA sequencing experiments with single-celled and unculturable organisms (only a single copy of each DNA molecule is available). Nevertheless, we tried to amplify the 18S rDNA genes with specific PCR primers (see Materials and Methods), using as the template the same DNA used for GP (which naturally also contains random PCR products). We were not always able to amplify the target molecule (18S rDNA). This may be due to a stochastic or inevitable loss of the template DNA when dealing with a single copy. Although there was little expectation of recovering 18S rDNA from the mixture of DNAs obtained after genome profiling, DNA molecules amplified with the primers designed for 18S rDNA from the genome of one sample (Dictyocoryne profunda) happened to have approximately the expected molecular size (200 bp) and were sequenced. A FASTA search of the database (DDBJ) showed that the sequence obtained was not identical to any of those reported to date (data not shown). Sequences currently in the database included those of Didymocyrtis tetrathalamus, Dictyocoryne profunda, and Dictyocoryne truncatum. In addition to these, sequences of Acantharea (11 entries), Phaeodarea (3 entries), and Polycystinea (20 entries) have been reported. The results of our analysis are as follows: i) the sequence obtained was novel and had not been reported previously; ii) it had the highest similarity (66.7%) with a partial 18S rDNA of a protozoan, Euglena spathirhyncha; and iii) none of the small subunit ribosomal DNA sequences of organisms that might be expected as contaminants, such as algae, planktonic organisms, fungi, or bacteria, ranked within the top 200 sequences.
Therefore, although the result could not be tested for reproducibility, as the same genome cannot be obtained again, it is very plausible that the DNA sequence we obtained from the material used for GP analysis was that of a radiolarian 18S rDNA that had never been sequenced. At the same time, we also sequenced the random PCR products of samples A2, A3-1, 2, B, D-1 and others (70 reads in total, 305 nucleotides on average; data not shown) and found that: i) the sequences obtained were quite novel, with no registered sequences having more than 70% similarity except one E. coli sequence (99% similarity) and two algal sequences (81% and 77% similarity), and ii) according to the FASTA search, the sequences with the highest similarity to the query sequences were not those of probable contaminants but those of protozoa, which are related to Radiolaria. The first finding implies that accidental contamination with E. coli DNA must have occurred during the experiment. It may also mean that some of our samples were contaminated with algae. However, since these sequences are not major representatives among the sequencing results for the 70 clones, they are unlikely to have contributed significantly to the spiddos generated in the GP experiments. It is already known that a minor population of DNAs cannot be represented in GP unless they have exceptionally strong binding sites for a primer or highly repetitive sequences that can be amplified by random PCR. Therefore, these findings provide supporting evidence that the random PCR products were derived from the genomes of Radiolaria.
In conclusion, the results presented here, including the data indicating that GPs of radiolarians provided a taxonomically consistent clustering result (Figs. 2 and 3), indicate that genome profiles for radiolarians can be obtained from a single cell. Here, we need to recall that the GPs obtained may consist of composite genomes, e.g., those of the host and of parasitic/symbiotic organisms that cannot be observed by conventional microscopic techniques (though more sophisticated technologies such as SEM would make this possible). In other words, some of the GPs reported here may represent a metagenome, or 'an ensemble of genomes' (originally, GP was developed based on the idea that the genome is the whole ensemble of DNAs contained in a cell). If so, it is reasonable to register such GPs as being representative of Radiolaria, since we cannot separate the symbiotic genomes effectively and this type of partnership is often permanent. We think this fact does not devalue the methodology established here.
In this paper we developed a potent method to obtain genome information from unculturable and single-celled organisms. This was demonstrated by experiments applied to radiolarians sampled in the field. Final confirmation can only be obtained by repeating similar experiments using other fresh experimental materials. Therefore, the classification reported here should be considered tentative, but the GP method shows promise for solving controversial classification problems. We believe this is the first success in which the genomes of unculturable, single-celled organisms were treated quantitatively and systematically to extract genomic information for comparison.
DNA of radiolarians and the other organisms
The samples of protozoa were collected from surface seawater during 2003–2005. Protozoa, such as species belonging to Phaeodaria and Acantharia, Foraminifera, and Diatomophyceae, were obtained off the shores of Sado Island, Japan. Radiolarians, including Dictyocoryne truncatum, Dictyocoryne profunda, Didymocyrtis tetrathalamus, Spongaster tetras, Dictyocoryne truncatum, Eucyrtidium hexasticum, and Hymeniastrum euclidis, were obtained off the shores of Okinawa Island, Japan. The collected organisms are listed in Table 1 with their taxonomical classification and sampling date and location. Photomicrographs of some of the organisms are shown in Fig. 1a. The microbes collected with a plankton net were suspended in sterile water. We then isolated each organism as a single cell by sucking it up together with a small aliquot (less than 1 μl including the cell) under the microscope, without carrying over other visible organisms, and diluted it with more than 1 ml of pure alcohol. An aliquot (1 μl) of this solution containing the cell was used for PCR. Thus, the sample cells used here were microscopically pure, but we cannot rule out the possibility of contamination with intracellular organisms (symbionts) or externally invisible virus-like organisms (though, owing to the repeated dilution, the probability is less than a millionth of that of finding them in the original seawater). The other organisms dealt with here (Saccharomyces cerevisiae, Escherichia coli, and Bacillus subtilis) were reported previously [10, 13].
Each sample was purified by picking up a single cell in an aliquot. Preparation of DNA was carried out by the alkaline extraction method. Briefly, the procedure was as follows: 1) an aliquot containing a single cell, which had been identified by phenotype under the microscope, was transferred into an Eppendorf tube; 2) after adding 3 μl of 0.5 M NaOH, the sample solution was incubated at 94°C for 5 min and then at 64°C for 60 min; 3) the sample solution was neutralized with 5 μl of 200 mM Tris-HCl (pH 8.0) buffer and incubated at 65°C.
Genome profiling (GP)
GP is composed of two major steps: random PCR and temperature gradient gel electrophoresis (TGGE) (Fig. 4). Random PCR was performed using a single dodecanucleotide primer (pfM12, dAGAACGCGCCTG) labeled with Cy3 at the 5'-end. This primer sequence has been recommended for general use, including application to animal cells. As one round of PCR amplification of DNA from a single cell does not produce a sufficient amount of DNA, three rounds were employed. The first PCR reaction (50 μl) contained 200 μM dNTPs (N = G, A, T, C), 0.5 μM primer, 10 mM Tris-HCl (pH 9.0), 50 mM KCl, 2.5 mM MgCl2, 0.02 unit/μl Taq DNA polymerase (Takara Bio, Shiga, Japan) and a particular amount of template DNA. In the random PCR experiments, special care was taken as follows: i) the reaction mixtures were UV-irradiated just before adding the template DNA; ii) the third session of random PCR was stopped well before the primers were consumed so as to keep the products in the double-stranded state. Random PCR was carried out with 30 cycles of denaturation (94°C, 30 s), annealing (26°C, 2 min) and extension (47°C, 2 min) using a PTC-100™ PCR machine (MJ Research, Inc., Massachusetts, USA). The second random PCR (50 μl) contained 5 μl of the first PCR products as template and the same concentrations of the other constituents, and the reaction was run for 30 cycles under the same conditions. The third round of random PCR (25 μl) contained 1 μl of the second-round products as template and was stopped after 10 cycles in order to produce as much double-stranded DNA as possible. The DNA samples were subjected to μ-TGGE, which uses a tiny slab gel of 24 × 16 × 1 mm^3 for electrophoresis with a temperature-gradient generator, μ-TG (Taitec, Saitama, Japan). In each run of electrophoresis, an internal reference DNA was co-migrated.
The 200-bp reference DNA (the 191-bp segment of bacteriophage fd gene VIII, sites 1350~1540, with a 9-bp sequence, CTACGTCTC, attached at the 3'-end) was experimentally determined to have a melting temperature of 60°C under standard conditions. The gel used was 6% acrylamide (acrylamide:bis = 19:1) containing 90 mM Tris-HCl (pH 8.0), 90 mM boric acid, 2 mM EDTA and 8 M urea. The linear temperature gradient ran from 15°C to 60°C. After electrophoresis, DNA bands were detected with a fluorescence imager (Molecular Imager FX, Bio-Rad, Hercules, CA) or by silver staining.
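The three-round amplification scheme can be summarized as a small data structure. The Python sketch below records the round volumes, cycle counts and templates given above and confirms that the rounds add up to the 70 cycles quoted in the Results. The class and function names are illustrative. The hold-time estimate assumes that the third round uses the same per-cycle temperatures and durations as the first two, which the text does not state explicitly, and it ignores ramp times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrRound:
    name: str
    volume_ul: float
    cycles: int
    template: str


# Per-cycle steps: (label, temperature in deg C, duration in seconds).
CYCLE_STEPS = (
    ("denaturation", 94, 30),
    ("annealing", 26, 120),
    ("extension", 47, 120),
)

ROUNDS = (
    PcrRound("first", 50, 30, "single-cell DNA extract"),
    PcrRound("second", 50, 30, "5 ul of first-round product"),
    PcrRound("third", 25, 10, "1 ul of second-round product"),
)


def total_cycles(rounds) -> int:
    """Sum of thermal cycles over all rounds."""
    return sum(r.cycles for r in rounds)


def hold_time_minutes(rounds) -> float:
    """Time spent at the programmed temperatures, excluding ramping.
    ASSUMES every round uses CYCLE_STEPS unchanged."""
    seconds_per_cycle = sum(duration for _, _, duration in CYCLE_STEPS)
    return total_cycles(rounds) * seconds_per_cycle / 60


print(total_cycles(ROUNDS), "cycles in total")
print(hold_time_minutes(ROUNDS), "min at programmed temperatures")
```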
Spiddos and PaSS
Genome profiles obtained by GP are highly informative but difficult to manage due to their complexity. This inconvenience was overcome by introducing feature points, designated spiddos (species identification dots), which represent genome profiles compactly. The spiddos correspond to the points where structural transitions of DNA occur, such as the transition from double-stranded to single-stranded DNA. Spiddos provide a sufficient amount of information for identifying species. One can easily increase the amount of information about the genome by adding spiddos obtained from another GP run with a different primer. Because the number of possible primers is so large (4^12, or about 1.7 × 10^7, for dodecanucleotides), many GPs (and thus spiddos) can be obtained if necessary. Using spiddos, we can define the pattern similarity score (PaSS) of the genomes between two species as follows:
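The size of the primer space follows from simple counting: each of the 12 positions of a dodecanucleotide can be any of four bases. A minimal Python check (the function name is illustrative):

```python
def primer_space(length: int, alphabet_size: int = 4) -> int:
    """Number of distinct primers of a given length over the DNA alphabet."""
    return alphabet_size ** length


n_dodecamers = primer_space(12)
print(n_dodecamers)  # 4**12 distinct dodecanucleotide sequences
```

This gives 16,777,216 possible sequences, i.e., on the order of 10^7, although not every sequence would necessarily make a useful random PCR primer.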
\[ \mathrm{PaSS} = 1 - \frac{1}{n} \sum_{i=1}^{n} \frac{|\vec{P}_i - \vec{P}_i'|}{|\vec{P}_i| + |\vec{P}_i'|} \]
where $\vec{P}_i$ and $\vec{P}_i'$ correspond to the normalized positional vectors (composed of two elements, mobility and temperature) for spiddos $P_i$ and $P_i'$ collected from two genome profiles (discriminated with or without a prime), respectively, $i$ denotes the serial number of the spiddos, and $n$ is the number of spiddos compared. A database site has been constructed (On-web GP) in order to provide semi-automatic data processing. The PaSS value thus obtained is empirically known to be a good measure of the closeness between two species (or cells).
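To make the score concrete, the following minimal Python sketch computes a PaSS value for two matched lists of spiddos. It assumes the form PaSS = 1 − (1/n) Σ |P_i − P_i'| / (|P_i| + |P_i'|), with each spiddo given as a (mobility, temperature) pair that has already been normalized against the internal reference. The function name `pass_score` and the example coordinates are illustrative assumptions, not taken from this study or from the On-web GP software.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (normalized mobility, normalized temperature)


def pass_score(spiddos_a: Sequence[Point], spiddos_b: Sequence[Point]) -> float:
    """Pattern similarity score for two equally long, index-matched spiddos lists.

    ASSUMED form: 1 - mean(|Pi - Pi'| / (|Pi| + |Pi'|)).
    """
    if not spiddos_a or len(spiddos_a) != len(spiddos_b):
        raise ValueError("need two non-empty spiddos lists of equal length")
    total = 0.0
    for (ma, ta), (mb, tb) in zip(spiddos_a, spiddos_b):
        difference = math.hypot(ma - mb, ta - tb)
        scale = math.hypot(ma, ta) + math.hypot(mb, tb)
        # Two points at the origin are identical, so they add no penalty.
        total += difference / scale if scale > 0 else 0.0
    return 1.0 - total / len(spiddos_a)


# Hypothetical coordinates, for illustration only.
profile_1 = [(0.40, 0.55), (0.62, 0.71), (0.85, 0.90)]
profile_2 = [(0.42, 0.53), (0.60, 0.75), (0.80, 0.88)]
print(round(pass_score(profile_1, profile_2), 3))
```

With this form, identical profiles score exactly 1, and the score decreases as matched spiddos move apart relative to their distance from the origin.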
After calculating the average PaSS value, a representative PaSS value for a particular level of taxon, T, was obtained over the constituents at that level of taxon. In these calculations, pass(i, j; A, B) denotes the elementary PaSS value between the i-th element of the genome profile of organism A and the j-th element of the genome profile of organism B; PaSS_{P,Q} denotes the PaSS value between the spiddos of organisms P and Q; N_A and N_B stand for the numbers of spiddos belonging to A and B, respectively; and ν stands for the number of species (Fig. 5).
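One reading that is consistent with the symbols defined above is given here as a hedged sketch. The normalization factors, and the choice to average over unordered species pairs, are assumptions and are not taken from the source.

```latex
% ASSUMED reconstruction: plain averages over the index ranges defined in the text.
\overline{\mathrm{PaSS}}_{A,B}
  = \frac{1}{N_A N_B} \sum_{i=1}^{N_A} \sum_{j=1}^{N_B} \mathrm{pass}(i, j; A, B)

% ASSUMED: the taxon-level value is the mean over all unordered pairs (P, Q)
% among the \nu species belonging to taxon level T.
\mathrm{PaSS}_T
  = \frac{2}{\nu(\nu - 1)} \sum_{P < Q} \mathrm{PaSS}_{P,Q}
```

Under this reading, PaSS_T is simply the mean pairwise score within the taxon, which is how reference values of the kind listed in Table 2 could be tabulated.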
The group average method was used for the clustering analysis of the genomes of the organisms examined here. With this method, a separate element is integrated into a cluster if the distance between the element and at least one member of the cluster is smaller than or equal to a given threshold value.
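As an illustration of threshold-based clustering on PaSS data, the Python sketch below implements standard group-average (UPGMA) linkage, in which the distance between two clusters is the mean of all pairwise distances between their members, and merging stops once the closest pair of clusters exceeds a cutoff. It assumes that the distance is 1 − PaSS, which the text does not state explicitly. The sample labels and PaSS values are hypothetical; only the two cutoffs, 0.149 and 0.163, come from the Results.

```python
from itertools import combinations


def group_average_clusters(labels, dist, cutoff):
    """Agglomerative clustering with group-average linkage.

    dist maps frozenset({a, b}) to the distance between labels a and b.
    Merging stops when the closest pair of clusters is farther than cutoff.
    """
    clusters = [frozenset([label]) for label in labels]

    def linkage(c1, c2):
        values = [dist[frozenset((a, b))] for a in c1 for b in c2]
        return sum(values) / len(values)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        if linkage(c1, c2) > cutoff:
            break
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


# Hypothetical PaSS values for four made-up samples (not data from this study).
EXAMPLE_PASS = {
    ("X", "Y"): 0.90, ("X", "Z"): 0.86, ("Y", "Z"): 0.84,
    ("X", "W"): 0.45, ("Y", "W"): 0.40, ("Z", "W"): 0.50,
}
LABELS = ["X", "Y", "Z", "W"]
# ASSUMPTION: distance = 1 - PaSS.
DIST = {frozenset(pair): 1.0 - score for pair, score in EXAMPLE_PASS.items()}

tight = group_average_clusters(LABELS, DIST, cutoff=0.149)
loose = group_average_clusters(LABELS, DIST, cutoff=0.163)
print(sorted(sorted(c) for c in tight))
print(sorted(sorted(c) for c in loose))
```

In this toy example the stricter cutoff joins only X and Y, while the looser one also absorbs Z and leaves the distant sample W isolated, mirroring how raising the clustering distance progressively merges related groups.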
Sequencing and FASTA analysis
Some of the random PCR products (DNA) were cloned and analyzed by sequencing. In addition, a part of the 18S rDNA was subjected to sequencing. This product was obtained by PCR amplification using the mixture of random PCR products plus the template genomic DNA as the template source, with the primers F (GAGGGCAAGTCTGGT) and R (ACGACGGTATCTGAT), which were designed as in the preceding study. Sequencing was carried out using the TA cloning method (Invitrogen, Carlsbad, California, USA) and a DSQ-2000L DNA sequencer (Shimadzu, Kyoto, Japan). Experimentally obtained DNA sequences were analyzed with FASTA against all of the genomes and genes in the database. DNA segments with more than 55% similarity were carefully analyzed based on their annotations so as to determine the source of the DNA sequences.
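The screening step can be pictured as a simple filter over search hits. The Python sketch below keeps hits above the 55% similarity threshold and ranks them for manual review of their annotations. The `Hit` record, its field names, and the example entries are hypothetical. The sketch does not parse real FASTA output, whose format is not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    subject: str           # annotation of the database entry (hypothetical field)
    similarity_pct: float  # percent similarity reported by the search


def hits_to_review(hits, threshold_pct: float = 55.0):
    """Return hits with similarity strictly above the threshold, best first."""
    kept = [h for h in hits if h.similarity_pct > threshold_pct]
    return sorted(kept, key=lambda h: h.similarity_pct, reverse=True)


# Made-up entries for illustration; not results from this study.
EXAMPLE_HITS = [
    Hit("entry_A", 48.2),
    Hit("entry_B", 66.7),
    Hit("entry_C", 99.0),
    Hit("entry_D", 55.0),
]

reviewed = hits_to_review(EXAMPLE_HITS)
for hit in reviewed:
    print(f"{hit.subject}: {hit.similarity_pct:.1f}%")
```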
Haeckel E: Report on the Radiolaria collected by HMS Challenger during the years 1873–1876. Rep Sci Res Voyage HMS Challenger 1873–1876 Zool. 1887, 18: 1-1803.
Yasugi R, Koseki O, Hurutani M, Hidaka T: Iwanami Biology Encyclopedia. 1996, Tokyo, Japan: Iwanami Press, 4
Gast RJ, Caron DA: Molecular Phylogeny of Symbiotic Dinoflagellates from Planktonic Foraminifera and Radiolaria. Mol Biol Evol. 1996, 13: 1192-1197.
Gutell RR, Larsen N, Woese CR: Lessons from an evolving rRNA: 16S and 23S rRNA structures from a comparative perspective. Microbiol Rev. 1994, 58: 10-26.
Zettler LA, Sogin ML, Caron DA: Phylogenetic relationships between the Acantharea and the Polycystinea: A molecular perspective on Haeckel's Radiolaria. Proc Natl Acad Sci USA. 1997, 94: 11411-11416. 10.1073/pnas.94.21.11411.
Polet S, Berney C, Fahrni J, Pawlowski J: Small-Subunit Ribosomal RNA Gene Sequences of Phaeodarea Challenge the Monophyly of Haeckel's Radiolaria. Protist. 2004, 155: 53-63. 10.1078/1434461000164.
Hamano K, Takasawa T, Kurazono T, Okuyama Y, Nishigaki K: Genome Profiling-Establishment and practical evaluation of its methodology. Nikkashi. 1996, 1996: 54-61.
Nishigaki K, Amano N, Takasawa T: DNA Profiling: An approach of systematic characterization, classification, and comparison of genomic DNAs. Chem Lett. 1991, 1991: 1097-1100. 10.1246/cl.1991.1097.
Nishigaki K, Naimuddin M, Hamano K: Genome profiling: a realistic solution for genotype-based identification of species. J Biochem. 2000, 128: 107-112.
Nishigaki K, Saito A, Hasegawa T, Naimuddin M: Whole genome sequence-enabled prediction of sequences performed for random PCR products of Escherichia coli. Nucleic Acids Res. 2000, 28: 1879-1884. 10.1093/nar/28.9.1879.
Biyani M, Nishigaki K: Hundredfold productivity of genome analysis by introduction of microtemperature-gradient gel electrophoresis. Electrophoresis. 2001, 22: 23-28. 10.1002/1522-2683(200101)22:1<23::AID-ELPS23>3.0.CO;2-Z.
Naimuddin M, Kurazono T, Nishigaki K: Commonly conserved genetic fragments revealed by genome profiling can serve as tracers of evolution. Nucleic Acids Res. 2002, 30 (10): e42-10.1093/nar/30.10.e42.
Naimuddin M, Kurazono T, Zhang Y, Watanabe T, Yamaguchi M, Nishigaki K: Species-identification dots: a potent tool for developing genome microbiology. Gene. 2000, 261: 243-250. 10.1016/S0378-1119(00)00502-3.
Downs GM, Barnard JM: Clustering Methods and Their Uses in Computational Chemistry. Reviews in Computational Chemistry. 2002, Wiley: Hoboken, NJ, 18: 11-
Anderson OR: Radiolaria. 1983, New York: Springer-Verlag
Nikolaev SI, Berney C, Fahrni JF, Bolivar I, Polet S, Mylnikov AP, Aleshin VV, Petrov NB, Pawlowski J: The twilight of Heliozoa and rise of Rhizaria, an emerging supergroup of amoeboid eukaryotes. Proc Natl Acad Sci USA. 2004, 101 (21): 8066-10.1073/pnas.0308602101.
Naimuddin M, Nishigaki K: Genome analysis technologies toward species identification by genotype. Brief Funct Genomic Proteomic. 2003, 1: 356-71. 10.1093/bfgp/1.4.356.
Wang H, Qin M, Cutler AJ: A simple method of preparing plant samples for PCR. Nucleic Acids Res. 1993, 21: 4153-4154.
Nishigaki K, Husimi Y, Masuda M, Kaneko K, Tanaka T: Strand dissociation and cooperative melting of double-stranded DNAs detected by denaturant gradient gel electrophoresis. J Biochem. 1984, 95: 627-35.
On-web GP. [http://gp.fms.saitama-u.ac.jp/]
Watanabe T, Saito A, Takeuchi Y, Naimuddin M, Nishigaki K: A database for the provisional identification of species using only genotypes: web-based genome profiling. Genome Biol. 2002, 3: 00101.1-10.1186/gb-2002-3-2-research0010.
Voet D, Voet J: Biochemistry. 1995, J Wiley & Sons, New York: NY, 2
Madigan MT, Martinko JM, Parker J: Brock biology of microorganisms. 1997, Prentice-Hall, Inc., Upper Saddle River, NJ, 8
NCBI PubMed – Entrez Taxonomy. [http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=Limits&DB=taxonomy]
The authors are grateful for the technical assistance of Mohammed Naimuddin, Takehiro Watanabe, Ayumu Saito and Daisuke Sato. This study was supported in part by the grant REDS (Project of the Saitama Prefecture Collaboration of Regional Entities for the Advancement of Technological Excellence supported by JST) and also by a Grant-in-Aid for Scientific Research from the Ministry of Education, Culture, Sports, Science and Technology, Japan (no. 17654100).
The author(s) declare that they have no competing interests.
MK performed the experiments and the genome analysis, interpreted the results and prepared a substantial part of the manuscript. AM suggested the essence of this study at its outset, collected the radiolarian specimens, and contributed to the discussion of this project. KN initiated and organized this study, helped interpret the results and helped draft the manuscript. Both AM and KN contributed to supporting this study financially.
Mariko Kouduka and Atsushi Matsuoka contributed equally to this work.