Collembase: a repository for springtail genomics and soil quality assessment
© Timmermans et al; licensee BioMed Central Ltd. 2007
Received: 08 May 2007
Accepted: 27 September 2007
Published: 27 September 2007
Environmental quality assessment is traditionally based on responses of reproduction and survival of indicator organisms. For soil assessment the springtail Folsomia candida (Collembola) is an accepted standard test organism. We argue that environmental quality assessment using gene expression profiles of indicator organisms exposed to test substrates is more sensitive, more toxicant specific and significantly faster than current risk assessment methods. To apply this species as a genomic model for soil quality testing we conducted an EST sequencing project and developed an online database.
Collembase is a web-accessible database comprising springtail (F. candida) genomic data. Presently, the database contains information on 8686 ESTs that are assembled into 5952 unique gene objects. Of those gene objects ~40% showed homology to other protein sequences available in GenBank (blastx analysis; non-redundant (nr) database; expect-value < 10^-5). Software was applied to infer protein sequences. The putative peptides, which had an average length of 115 amino acids (ranging between 23 and 440), were annotated with Gene Ontology (GO) terms. In total 1025 peptides (~17% of the gene objects) were assigned at least one GO term (expect-value < 10^-25). Within Collembase searches can be conducted based on BLAST and GO annotation, cluster name or using a BLAST server. The system furthermore enables easy sequence retrieval for functional genomic and Quantitative-PCR experiments. Sequences are submitted to GenBank (Accession numbers: EV473060 – EV481745).
Collembase (http://www.collembase.org) is a resource of sequence data on the springtail F. candida. The information within the database will be linked to a custom made microarray, based on the Agilent platform, which can be applied for soil quality testing. In addition, Collembase supplies information that is valuable for related scientific disciplines such as molecular ecology, ecogenomics, molecular evolution and phylogenetics.
Organisms are able to maintain homeostasis in changing environments by regulating their metabolic machinery. To accomplish this, organisms continuously have to adjust the expression of their genes. This is particularly evident when environmental challenges drive organisms to the boundaries of their ecological niche and induce stress responses (e.g. [1]). In recent years, significant understanding has been gained of the signal transduction pathways by which stress affects gene transcription [2]. The question arises whether it is possible to sense aspects of the environment by investigating transcriptional profiles of exposed organisms.
Recent advances in the field of toxicogenomics suggest that environmental quality can indeed be diagnosed by transcriptional profiling [3], and it is generally acknowledged that genomic techniques, and more specifically transcriptomics, have the potential to revolutionize environmental risk assessment [4–9]. The prospects are that gene expression studies will enable a fast and sensitive detection and evaluation of environmental stressors and toxicants. This is strengthened by several recent studies showing that transcription profiling can be applied as an early indicator of toxicity [10, 11] in a dose-dependent manner [12].
We started a project that aims to develop a microarray-based methodology for soil quality assessment using the parthenogenetic springtail Folsomia candida (Collembola). This species, which is easy to culture and has a short generation time, was chosen because it is already a standard test organism in ecotoxicology [13]. It lives in direct contact with the soil and toxicological data are already widely available (e.g. ECOTOX database from U.S. EPA [14]). Furthermore, a standard test looking at survival and reproduction after a 28-day exposure is in place that follows OECD (Organisation for Economic Co-operation and Development) and ISO (International Organization for Standardization) guidelines. Although the latter test is conducted in a standardized laboratory setting, it has been shown that the outcomes are predictive of natural situations [15]. However, there are several shortcomings to the current test. First, it does not provide information about the nature of the stressor. Second, the mode of action of toxicants cannot be verified. Third, the test is time-consuming as it lasts for at least 28 days. Finally, the test is rather labor intensive.
By extending the ISO standard test with genomic technologies, these shortcomings may be circumvented. However, genomic information on F. candida is very poor: a search for sequences yields only 52 hits in the National Center for Biotechnology Information (NCBI; [16]) nucleotide database (July 5th 2007), mainly consisting of 18S rRNA, 28S rRNA and cytochrome c oxidase sequences used as phylogenetic markers.
A time- and cost-effective way to retrieve sequence information on the functional part of the genome is to set up an Expressed Sequence Tag (EST) project, which we conducted for the F. candida transcriptome. Here we report on the sequencing and annotation of ~9000 ESTs, which form the starting point for the construction of an oligo array that can be applied in soil quality testing. The sequences were processed, assembled, annotated using BLAST and stored in a web-accessible database [17]. The database can be searched for BLAST-based annotations and Gene Ontology terms [18] and by using a stand-alone BLAST server. Collembase furthermore enables retrieval of sequence information on (differentially) expressed genes, which can then be applied in functional genomic and Quantitative-Polymerase Chain Reaction (Q-PCR) validation experiments.
Although Collembase was primarily created for the development of a microarray, we expect that it is of interest for researchers outside the field of ecotoxicology as well. Due to its short generation time, F. candida is often used in ecological studies. In addition, Collembola have a crucial position in the phylogeny of the arthropods and thus also have the attention of evolutionary biologists (e.g. [19]). The retrieved genome data will significantly enhance molecular ecological and evolutionary studies on F. candida.
Construction and content
Construction of cDNA libraries
To restrict redundant sequencing we chose to start our EST project with a normalized cDNA pool. RNA extraction from the parthenogenetic, clonally reproducing collembolan Folsomia candida (laboratory strain 'Berlin'; Vrije Universiteit Amsterdam) was carried out using the Spin Vacuum (SV) Total RNA isolation system (Promega). Animals (eggs, juveniles and adult females) were taken from a culture of mixed age with a more or less even age distribution. All animals (~100 mg) were pooled before RNA extraction. Concentration and purity of the total RNA pool were checked by UV absorption (260 and 280 nm). Quality of total RNA was evaluated on a 1% agarose gel (stained with SYBR Gold stain; Invitrogen) and on an Agilent BioAnalyzer (Agilent Technologies). Afterwards 0.1 volumes of 3 M sodium acetate and 3 volumes of 96% ethanol were added and total RNA was shipped at room temperature to Evrogen (Moscow, Russia).
Double-stranded cDNA synthesis (SMART technology [20]), normalization and library construction were performed by Evrogen. The reaction was started with 0.3 μg total RNA and cDNA was SMART amplified (18 PCR cycles) and normalized by the procedure described by [21], which consists of cDNA denaturation/reassociation, a duplex-specific nuclease (DSN) treatment [22] and PCR amplification. The cDNA thus obtained was used for library construction as follows. The cDNA was incubated with restriction enzymes Sbf1 and Not1, and ligated into Sbf1 and Not1 digested pAL17.2 vector (Evrogen). The resulting plasmids were subsequently transformed into E. coli (Evrogen). Finally, glycerol stocks were made (17% glycerol), which were transferred to the Vrije Universiteit (Amsterdam) on dry ice and stored at -80°C until further use.
Efficiency of the procedure was examined by determining the abundance of several transcripts before and after normalization using Q-PCR. Primers were developed based on five available GenBank accessions and β-actin. Genes amplified were β-actin (GenBank:EU037094), USP-RXR (GenBank:AY157930), Ultrabithorax (GenBank:AF435789), Kruppel (GenBank:AF395109), RNA helicase Dead1 (GenBank:AY043229) and 28S rDNA (GenBank:AF483424). Primer sequences are given in Additional file 1. Primers were developed using Primer Express version 1.5 (Applied Biosystems Inc., Foster City, USA), using the following parameters: Minimum Tm: 59–60°C, Maximum Tm difference between primers: 1°C, Oligo length: 20–25 bp, Amplicon length: 90–120 bp.
De Boer et al. (unpublished data) constructed cDNA libraries enriched for stress responsive genes as described by [23, 24]. In short, 960 clones were isolated from each of two subtracted cDNA libraries enriched for 1) cadmium- and 2) phenanthrene-responsive genes. Both libraries were built using the suppression subtractive hybridization procedure (SSH) [25], making use of poly (A)+ RNA isolated from ~150 exposed unsynchronized adult individuals (whole body; laboratory strain 'Berlin'; Vrije Universiteit Amsterdam). Exposure to cadmium was performed by placing animals on cellulose filters wetted to approximately 50% water-holding capacity with a 267 μmol/l CdCl2 solution for 48 h. Animals were exposed to phenanthrene by placing them on a compressed layer of LUFA 2.2 soil spiked with 840 μmol/kg phenanthrene according to the standard ISO 11267 [26] protocol for 6 days.
EST sequencing, bioinformatics and construction of the database
In total, 9984 cDNA clones were picked and sequenced (Greenomics; Wageningen University and Research Center) using the M13 forward primer. Clones originating from the normalized library were sequenced from the 5' end of the gene (8064 total). The cDNA fragments from the SSH procedure were not ligated directionally, and therefore not sequenced from a predefined orientation (960 clones from each of the two libraries).
[Table: Remaining sequences after the Trace2dbest process. Columns: # clones sequenced; # passed (%).]
CLOBB [30] and Phrap (P. Green, personal communication [31]) were applied, as part of the PartiGene script [27], to cluster and assemble the ESTs into unique gene objects. This procedure resulted in 6092 unique sequences. There were 4686 singletons and 1406 clusters with more than one sequence. Of those 1406 clusters 920 consisted of two sequences only. The redundancy (defined as the total number of sequences divided by the number of clusters) was 1.45, 1.32 and 1.62 for the total dataset, the normalized library and the cadmium library respectively, but appeared considerably higher in the phenanthrene enriched library (3.18). The highest sequence depth also occurred in the phenanthrene enriched library with 98 ESTs in one cluster, compared to a maximum of 31 and 16 ESTs per cluster for the normalized and cadmium library respectively.
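The summary statistics in this paragraph (singletons, multi-sequence clusters, maximum depth and redundancy) all follow from the EST-to-cluster assignment. The following is a minimal sketch of that bookkeeping, not the PartiGene implementation; the function name `cluster_stats` and the toy identifiers are hypothetical and chosen only for illustration.

```python
from collections import Counter


def cluster_stats(est_to_cluster):
    """Summarize an EST-to-cluster assignment (dict: EST id -> cluster id)."""
    sizes = Counter(est_to_cluster.values())  # number of ESTs per cluster
    n_ests = len(est_to_cluster)
    n_clusters = len(sizes)
    singletons = sum(1 for size in sizes.values() if size == 1)
    return {
        "ests": n_ests,
        "clusters": n_clusters,
        "singletons": singletons,
        "multi_est_clusters": n_clusters - singletons,
        "max_depth": max(sizes.values()),
        # Redundancy as defined in the text: sequences per cluster.
        "redundancy": n_ests / n_clusters,
    }


# Toy assignment (invented identifiers, not Collembase data).
toy = {
    "est1": "toy_cluster_1",
    "est2": "toy_cluster_1",
    "est3": "toy_cluster_2",
    "est4": "toy_cluster_3",
    "est5": "toy_cluster_1",
    "est6": "toy_cluster_4",
}
stats = cluster_stats(toy)
print(stats)

# Consistency check on the counts reported in the text.
assert 4686 + 1406 == 6092
```

In the toy input, six ESTs fall into four clusters, three of which are singletons, giving a redundancy of 1.5.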
[Table: Contigs per cluster, as generated by CLOBB and Phrap. Column: total number of contigs.]
[Table 3: A) Clusters that contain sequences from all three libraries and B) the most abundantly sequenced transcripts for each of the three F. candida cDNA libraries. n = the number of sequences found in a cluster that originate from the library specified; e-values are from BLAST analyses against 'nr' databases. Column "Overview of related sequences (blastx)", in source order: No Significant Hit; No Significant Hit; Haloacid dehalogenase-like hydrolase; Cytochrome c oxidase s.u. II; No Significant Hit; Alpha-aminoadipyl-cysteinyl-valine synthetase; No Significant Hit; No Significant Hit; No Significant Hit; No Significant Hit; Monooxygenase, DBH-like 1; Monooxygenase, DBH-like 1; Endo-1,3-1,4-beta-D-glucanase precursor; Alpha-aminoadipyl-cysteinyl-valine synthetase; 16S ribosomal RNA gene; No Significant Hit; No Significant Hit.]
[Table: Percentages of contigs showing sequence similarity (e-value < 10^-5) with sequences stored in GenBank (nr, est databases and nr database restricted to the Insecta) and proteins of Caenorhabditis elegans, Drosophila melanogaster and Mus musculus (April 2007). Columns: significant hits for the total dataset; significant hits excl. 140 clusters*. Row label preserved: nr – Insecta**.]
F. candida harbors intracellular bacteria of the genus Wolbachia [34] and its gut contains many bacterial species as well [35]. Those might turn up as contaminating sequences in the EST dataset. To pinpoint contaminating sequences of bacterial origin, the clusters were compared to all protein encoding sequences found in the genome of Escherichia coli (GenBank: U00096) and in the Wolbachia endosymbiont of Drosophila melanogaster (GenBank: AE017196). Sequences showing significant homology to E. coli or Wolbachia (blastx; e-value < 10^-5), but not to D. melanogaster, C. elegans or M. musculus, were marked as putative contaminants. In total 70 such clusters were retrieved (56 E. coli and 32 Wolbachia clusters, of which 18 appeared in both analyses; see Additional file 3). Those putative 'bacterial clusters' were not excluded from further analysis, as our procedure cannot establish whether a sequence is a contaminant or not.
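The flagging rule described above (a significant bacterial hit and no significant hit to any of the three reference eukaryotes) can be written as a small set operation. This is a sketch under stated assumptions: the input layout (a dict of best blastx e-values per reference proteome), the function names and the toy clusters are hypothetical, and only the 10^-5 cutoff and the organism lists come from the text.

```python
E_VALUE_CUTOFF = 1e-5  # blastx significance threshold used in the text

BACTERIAL = {"E. coli", "Wolbachia"}
EUKARYOTE = {"D. melanogaster", "C. elegans", "M. musculus"}


def significant_hits(best_evalues):
    """Return the reference proteomes with a best hit below the cutoff."""
    return {db for db, evalue in best_evalues.items() if evalue < E_VALUE_CUTOFF}


def is_putative_contaminant(best_evalues):
    """Bacterial hit present and no eukaryote hit present."""
    hits = significant_hits(best_evalues)
    return bool(hits & BACTERIAL) and not (hits & EUKARYOTE)


# Toy clusters with invented e-values (not Collembase data).
toy_clusters = {
    "clusterA": {"E. coli": 1e-40, "Wolbachia": 1e-12},
    "clusterB": {"E. coli": 1e-30, "D. melanogaster": 1e-50},
    "clusterC": {"Wolbachia": 1e-3},
    "clusterD": {"Wolbachia": 1e-20, "M. musculus": 0.5},
}
flagged = sorted(
    name for name, hits in toy_clusters.items() if is_putative_contaminant(hits)
)
print(flagged)

# The reported counts are consistent with a union of two overlapping sets.
assert 56 + 32 - 18 == 70
```

Here clusterB is kept because it also matches a eukaryote, and clusterC is kept because its only hit is not significant.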
Table 3 shows the five most abundant transcripts for each of the three libraries. The SSH procedure conducted on phenanthrene exposed animals appeared efficient. Of the top five phenanthrene clusters three show high similarities to monooxygenases of the cytochrome P450 enzyme family, which are known to be involved in phase I biotransformation of lipophilic substances such as phenanthrene . The two other clusters show homology to other monooxygenases, and might be involved in phase I metabolism as well. The results for the cadmium library are less straightforward. Two of the five most abundant clusters remain un-annotated, and two clusters show resemblance to accessions that are not from animal origin. Note that one of those two latter clusters (cluster Fcc00170) occurred in all three libraries (Table 3). As with the 'bacterial clusters', those clusters are currently not discarded from the database and are submitted to GenBank. Supplementary experiments will be conducted to determine the exact origin of those clusters, and whether or not they represent contaminants.
The absence of highly expressed house-keeping genes among the five most abundant transcripts in the normalized library suggests that the normalization procedure was successful. Without normalization more highly abundant transcripts, like tubulins, ribosomal proteins and actins, would have been sequenced. Although these sequences are present in the dataset, they are not among the most abundantly sequenced transcripts. For example, more than 40 ribosomal protein sequences were obtained (e.g. cluster Fcc02740), but most of these were represented by only one or two ESTs.
[Table: GO slim terms for F. candida genes based on a BLAST search (e-value < 10^-25) against the GO annotated UniProt database as generated by Annot8r_blast2GO. Column: Gene Ontology ID. Terms listed, in source order: Response to stimulus; Amino acid and derivative metabolism; Regulation of biological process; Nucleobase, nucleoside, nucleotide and nucleic acid metabolism; Biological process unknown; Transcription regulator activity; Signal transducer activity; Enzyme regulator activity; Nucleic acid binding; Molecular function unknown; Structural molecule activity; Unlocalized protein complex; Cellular component unknown.]
Utility and discussion
Current contents of the database
Currently, Collembase comprises data on 8686 ESTs, which are structured in 5952 clusters (the 6092 unique sequences minus the 140 clusters of yeast and human origin). To enable easy access to the sequence dataset, the information gathered was stored in a relational database and a web-interface was created. For all clusters data are offered on (1) the ESTs within a cluster and their clone names, (2) the cDNA library from which the ESTs originated, (3) blastx and blastn hits against GenBank 'nr' databases and tblastx hits against dbEST, which all will be updated regularly, (4) the consensus sequences as generated by Phrap, and (5) the GO terms when available. Furthermore, for each cluster the BLAST results and the processed ESTs can be downloaded.
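To illustrate how the five kinds of per-cluster data listed above could sit in a relational layout, here is a minimal sketch using an in-memory SQLite database. The table names, column names, the `find_clusters` helper and all rows are assumptions made for this example; the actual Collembase schema is not described in the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cluster (
    cluster_id TEXT PRIMARY KEY,
    consensus  TEXT NOT NULL            -- consensus sequence from assembly
);
CREATE TABLE est (
    est_id     TEXT PRIMARY KEY,
    clone_name TEXT,
    library    TEXT NOT NULL,           -- cDNA library of origin
    cluster_id TEXT NOT NULL REFERENCES cluster(cluster_id)
);
CREATE TABLE blast_hit (
    cluster_id  TEXT NOT NULL REFERENCES cluster(cluster_id),
    program     TEXT NOT NULL,          -- blastx, blastn or tblastx
    description TEXT,
    evalue      REAL
);
CREATE TABLE go_annotation (
    cluster_id TEXT NOT NULL REFERENCES cluster(cluster_id),
    go_term    TEXT NOT NULL
);
""")

# Invented rows for illustration only.
conn.executemany("INSERT INTO cluster VALUES (?, ?)", [
    ("toy_1", "ACGTACGT"), ("toy_2", "GGCCTTAA"), ("toy_3", "TTGACCAA"),
])
conn.executemany("INSERT INTO est VALUES (?, ?, ?, ?)", [
    ("e1", "clone_a", "phenanthrene", "toy_1"),
    ("e2", "clone_b", "normalized", "toy_2"),
    ("e3", "clone_c", "normalized", "toy_3"),
    ("e4", "clone_d", "phenanthrene", "toy_3"),
])
conn.executemany("INSERT INTO blast_hit VALUES (?, ?, ?, ?)", [
    ("toy_1", "blastx", "cytochrome P450 monooxygenase", 1e-40),
    ("toy_2", "blastx", "ribosomal protein", 1e-30),
    ("toy_3", "blastx", "monooxygenase, DBH-like", 1e-20),
])


def find_clusters(conn, keyword, library=None):
    """Text query on BLAST annotation, optionally restricted to one library."""
    sql = (
        "SELECT DISTINCT c.cluster_id FROM cluster c "
        "JOIN blast_hit b ON b.cluster_id = c.cluster_id "
        "JOIN est e ON e.cluster_id = c.cluster_id "
        "WHERE b.description LIKE ?"
    )
    params = [f"%{keyword}%"]
    if library is not None:
        sql += " AND e.library = ?"
        params.append(library)
    sql += " ORDER BY c.cluster_id"
    return [row[0] for row in conn.execute(sql, params)]


all_hits = find_clusters(conn, "monooxygenase")
normalized_hits = find_clusters(conn, "monooxygenase", library="normalized")
print(all_hits, normalized_hits)
```

Keeping ESTs, BLAST hits and GO terms in separate tables keyed on the cluster identifier lets annotations be refreshed without touching the sequence records, which fits the stated intention to update BLAST results regularly.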
Collembase can be explored per library using text queries (e.g. cluster name or BLAST annotation) and by sequence similarity using a local BLAST [33] server. Furthermore, a Primer3 web-server [40] was implemented to enable PCR primer design on the assembled sequences.
Future application and intended uses of the database
Soil quality and risk assessment
The dataset presented here was generated mainly to obtain the required genomic information to construct a microarray for soil quality assessment. The array, which is based on the Agilent microarray technology, is linked to Collembase: the 60-mer oligos printed on the chip follow the nomenclature of the clusters from which they were derived. This linkage enables straightforward sequence retrieval. Sequences of differentially expressed genes can be downloaded from Collembase and used in validation experiments (e.g. Q-PCR). Furthermore, in the near future we intend to store microarray and Q-PCR gene expression data as well. This freely accessible online repository will allow evaluation and analysis of the data by the scientific community (sensu [41]).
The small overlap between the toxicant enriched libraries and the normalized library (Figure 2), in combination with the higher redundancy of the toxicant enriched libraries (especially the phenanthrene library), suggests that metal and PAH exposure trigger different genes in F. candida. Although our expression data on F. candida still have to be verified by actual gene expression assays, such specificity would imply that transcription profiles contain a signature of the nature of the stress, and that different stresses can be distinguished by transcription profiling. This view is strengthened by a recent ecotoxicogenomic study [42], which showed that in the crustacean Daphnia magna different substances belonging to one chemical class (metals) can be discriminated on the basis of their characteristic expression profiles. Finally, we believe that transcription profiling will enable mechanistic insight into responses to mixtures of toxicants, a relatively new and unknown field in (eco)toxicology.
The collembolan F. candida is frequently used in experimental studies (for a recent review see [13]); Collembase could therefore be useful outside the field of ecotoxicology as well. We expect applicability in the following research areas:
To fully disentangle the molecular mechanisms by which organisms deal with ecological challenges and environmental stress, additional ecologically relevant model organisms are needed [36, 43]. Together with a few others (e.g. free-living nematodes and earthworms [37, 44]), F. candida is one of the first soil organisms to be subjected to EST sequencing. Collembase could form the basis of F. candida becoming a model organism in the research field of ecogenomics. F. candida has this potential as the species is easy to rear in the laboratory, reproduces parthenogenetically, has a short generation time, has a well-defined ecology and is traceable in (mesocosm) field experiments. The sequence information stored in Collembase can be exploited to answer ecological questions, e.g. related to drought tolerance, starvation and microbial resistance in soil ecosystems.
Molecular ecology and population genetics
The EST dataset presented holds information applicable in molecular ecological and population genetic studies. For example, within the dataset 184 contigs showing one or more tandem repeats (microsatellites) with a minimum of five repeats were discovered using the MISA PERL script [45] (Additional file 4). Within some of the clusters up to three different alleles were observed. However, due to the limited redundancy in our dataset and the fact that the libraries were constructed from animals from one parthenogenetic strain, it is impossible to determine their degree of polymorphism. Still, in theory those loci are molecular markers that can be applied to unravel the forces that maintain genetic diversity and generate population genetic structure in this soil and cave inhabiting species. Furthermore, the dataset and its accompanying microarray could be helpful in finding out whether transcriptional regulation is an important driver of adaptive evolution in this species.
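The kind of search described above, tandem repeats with at least five copies of a motif, can be approximated with a back-referencing regular expression. This is a simplified stand-in for illustration and not the MISA script: the motif length range of 2 to 6 bp, the exclusion of single-base runs and the function names are assumptions of this sketch.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, unit, copies) for perfect tandem repeats in seq."""
    # A lazily matched unit followed by at least (min_repeats - 1) more copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip single-base runs such as AAAAAAAAAA
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


def count_contigs_with_repeats(contigs):
    """Count contigs (dict: name -> sequence) with one or more repeats."""
    return sum(1 for seq in contigs.values() if find_microsatellites(seq))


# Toy sequences (invented, not Collembase contigs).
toy_contigs = {
    "with_repeat": "TTG" + "AC" * 6 + "GGT",
    "too_short": "GG" + "AC" * 4 + "TT",
    "homopolymer": "A" * 12,
}
hits = find_microsatellites(toy_contigs["with_repeat"])
n_positive = count_contigs_with_repeats(toy_contigs)
print(hits, n_positive)
```

Only the first toy contig qualifies: it carries six copies of AC, whereas the second has four copies and the third is a single-base run.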
Phylogenetics and comparative genomics
Collembola take an exceptional and fascinating position in the tree of life. Together with other basal hexapods (e.g. Protura, Diplura) they are positioned in-between the insects and crustaceans. However, recently some authors suggested that the six-legged body plan found among basal hexapods and insects evolved minimally twice (e.g. [46, 47]). The dataset presented here might add the sequence information that is needed to gain a more detailed insight into the evolution of these groups, and the relationship between insects and crustaceans. Using the BLAST tool, Collembase can be queried for genes valuable for phylogenetic inference. Degenerate PCR primers can be developed on the retrieved sequences to obtain information on other basal hexapod groups.
Collembase provides EST and related data on the springtail F. candida. In the near future this database will be supplemented with microarray expression data. We expect that our strategy will impact soil quality testing. In addition, it is clear that Collembase holds information applicable to many fields of ecological sciences (e.g. molecular ecology and ecogenomics, molecular evolution and phylogenetics).
Availability and requirements
Collembase can be accessed from URL: http://www.collembase.org
We thank Marleen Henkens for her dedicated practical assistance at PRI Greenomics during sequencing and data management. We thank Bart van Houte, Michel Vorenhout and Sander Peters for (bio)informatics support. Furthermore, we would like to thank the three anonymous reviewers for their comments and valuable suggestions. This project was partly financed through funding from a Bsik Research grant (BSIK03011) from the Dutch Government.
1. Burnaford JL: Habitat modification and refuge from sublethal stress drive a marine plant-herbivore association. Ecology. 2004, 85 (10): 2837-2849. 10.1890/03-0113.
2. Roelofs D, Aarts MGM, Schat H, van Straalen NM: Functional ecological genomics to demonstrate general and specific responses to abiotic stress. Functional Ecology. 2007.
3. Lettieri T: Recent applications of DNA microarray technology to toxicology and ecotoxicology. Environmental Health Perspectives. 2006, 114 (1): 4-9.
4. Bishop WE, Clarke DP, Travis CC: The genomic revolution: What does it mean for risk assessment? Risk Analysis. 2001, 21 (6): 983-987. 10.1111/0272-4332.216167.
5. Neumann NF, Galvez F: DNA microarrays and toxicogenomics: applications for ecotoxicology? Biotechnology Advances. 2002, 20 (5-6): 391-419. 10.1016/S0734-9750(02)00025-3.
6. van Straalen NM: Ecotoxicology becomes stress ecology. Environmental Science & Technology. 2003, 37 (17): 324A-330A.
7. Klaper R, Thomas MA: At the crossroads of genomics and ecology: The promise of a canary on a chip. BioScience. 2004, 54 (5): 403-412. 10.1641/0006-3568(2004)054[0403:ATCOGA]2.0.CO;2.
8. Snell TW, Brogdon SE, Morgan MB: Gene expression profiling in ecotoxicology. Ecotoxicology. 2003, 12 (6): 475-483. 10.1023/B:ECTX.0000003033.09923.a8.
9. Snape JR, Maund SJ, Pickford DB, Hutchinson TH: Ecotoxicogenomics: the challenge of integrating genomics into aquatic and terrestrial ecotoxicology. Aquatic Toxicology. 2004, 67 (2): 143-154. 10.1016/j.aquatox.2003.11.011.
10. Volz DC, Hinton DE, Law JM, Kullman SW: Dynamic gene expression changes precede dioxin-induced liver pathogenesis in medaka fish. Toxicological Sciences. 2006, 89 (2): 524-534. 10.1093/toxsci/kfj033.
11. Wintz H, Yoo LJ, Loguinov A, Wu YY, Steevens JA, Holland RD, Beger RD, Perkins EJ, Hughes O, Vulpe CD: Gene expression profiles in fathead minnow exposed to 2,4-DNT: Correlation with toxicity in mammals. Toxicological Sciences. 2006, 94 (1): 71-82. 10.1093/toxsci/kfl080.
12. Ezendam J, Staedtler F, Pennings J, Vandebriel RJ, Pieters R, Harleman JH, Vos JG: Toxicogenomics of subchronic hexachlorobenzene exposure in Brown Norway rats. Environmental Health Perspectives. 2004, 112 (7): 782-791.
13. Fountain MT, Hopkin SP: Folsomia candida (Collembola): A "standard" soil arthropod. Annual Review of Entomology. 2005, 50: 201-222. 10.1146/annurev.ento.50.071803.130331.
14. ECOTOX database: US EPA, Mid-Continent Ecology Division, Duluth, MN (MED-Duluth). [http://www.epa.gov/ecotox]
15. Smit CE, van Beelen P, Van Gestel CAM: Development of zinc bioavailability and toxicity for the springtail Folsomia candida in an experimentally contaminated field plot. Environmental Pollution. 1997, 98 (1): 73-80. 10.1016/S0269-7491(97)00104-8.
16. National Center for Biotechnology Information. [http://www.ncbi.nlm.nih.gov/]
17. Collembase. [http://www.collembase.org]
18. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene Ontology: tool for the unification of biology. Nature Genetics. 2000, 25 (1): 25-29. 10.1038/75556.
19. Panfilio KA, Akam M: A comparison of Hox3 and Zen protein coding sequences in taxa that span the Hox3/zen divergence. Development Genes and Evolution. 2007, 217 (4): 323-329. 10.1007/s00427-007-0133-8.
20. Zhu YY, Machleder EM, Chenchik A, Li R, Siebert PD: Reverse transcriptase template switching: A SMART (TM) approach for full-length cDNA library construction. BioTechniques. 2001, 30 (4): 892-897.
21. Zhulidov PA, Bogdanova EA, Shcheglov AS, Vagner LL, Khaspekov GL, Kozhemyako VB, Matz MV, Meleshkevitch E, Moroz LL, Lukyanov SA, Shagin DA: Simple cDNA normalization using kamchatka crab duplex-specific nuclease. Nucleic Acids Research. 2004, 32 (3).
22. Shagin DA, Rebrikov DV, Kozhemyako VB, Altshuler IM, Shcheglov AS, Zhulidov PA, Bogdanova EA, Staroverov DB, Rasskazov VA, Lukyanov S: A novel method for SNP detection using a new duplex-specific nuclease from crab hepatopancreas. Genome Research. 2002, 12 (12): 1935-1942. 10.1101/gr.547002.
23. Roelofs D, Overhein L, de Boer ME, Janssens TKS, van Straalen NM: Additive genetic variation of transcriptional regulation: metallothionein expression in the soil insect Orchesella cincta. Heredity. 2006, 96 (1): 85-92.
24. Roelofs D, Marien J, van Straalen NM: Differential gene expression profiles associated with heavy metal tolerance in the soil insect Orchesella cincta. Insect Biochemistry and Molecular Biology. 2007, 37: 287-295. 10.1016/j.ibmb.2006.11.013.
25. Diatchenko L, Lau YFC, Campbell AP, Chenchik A, Moqadam F, Huang B, Lukyanov S, Lukyanov K, Gurskaya N, Sverdlov ED, Siebert PD: Suppression subtractive hybridization: A method for generating differentially regulated or tissue-specific cDNA probes and libraries. Proceedings of the National Academy of Sciences of the United States of America. 1996, 93 (12): 6025-6030. 10.1073/pnas.93.12.6025.
26. International Organization for Standardization: Soil quality - inhibition of reproduction of Collembola (Folsomia candida) by soil pollutants. ISO 11267. 1999.
27. Parkinson J, Anthony A, Wasmuth J, Schmid R, Hedley A, Blaxter M: PartiGene - constructing partial genomes. Bioinformatics. 2004, 20 (9): 1398-1404. 10.1093/bioinformatics/bth101.
28. Ewing B, Hillier L, Wendl MC, Green P: Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Research. 1998, 8 (3): 175-185.
29. Ewing B, Green P: Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Research. 1998, 8 (3): 186-194.
30. Parkinson J, Guiliano DB, Blaxter M: Making sense of EST sequences by CLOBBing them. BMC Bioinformatics. 2002, 3.
31. Phrap. [http://www.phrap.com/]
32. Beldade P, Rudd S, Gruber JD, Long AD: A wing expressed sequence tag resource for Bicyclus anynana butterflies, an evo-devo model. BMC Genomics. 2006, 7.
33. McGinnis S, Madden TL: BLAST: at the core of a powerful and diverse set of sequence analysis tools. Nucleic Acids Research. 2004, 32: W20-W25. 10.1093/nar/gkh435.
34. Vandekerckhove TTM, Watteyne S, Willems A, Swing JG, Mertens J, Gillis M: Phylogenetic analysis of the 16S rDNA of the cytoplasmic bacterium Wolbachia from the novel host Folsomia candida (Hexapoda, Collembola) and its implications for wolbachial taxonomy. FEMS Microbiology Letters. 1999, 180 (2): 279-286.
35. Czarnetzki AB, Tebbe CC: Diversity of bacteria associated with Collembola - a cultivation-independent survey based on PCR-amplified 16S rRNA genes. FEMS Microbiology Ecology. 2004, 49 (2): 217-227. 10.1016/j.femsec.2004.03.007.
36. van Straalen NM, Roelofs D: An Introduction to Ecological Genomics. 2006, Oxford, Oxford University Press.
37. Sturzenbaum SR, Parkinson J, Blaxter M, Morgan AJ, Kille P, Georgiev O: The earthworm Expressed Sequence Tag project. Pedobiologia. 2003, 47 (5-6): 447-451.
38. Wasmuth JD, Blaxter ML: Prot4EST: Translating Expressed Sequence Tags from neglected genomes. BMC Bioinformatics. 2004, 5.
39. BaNG Nematode and Neglected Genomics. [http://www.nematodes.org/bioinformatics/]
40. Rozen S, Skaletsky H: Primer3 on the WWW for general users and for biologist programmers. Bioinformatics Methods and Protocols: Methods in Molecular Biology. Edited by: Krawetz S, Misener S. 2000, Totowa, NJ, Humana Press, 365-386.
41. Pennie W, Pettit SD, Lord PG: Toxicogenomics in risk assessment: An overview of an HESI collaborative research program. Environmental Health Perspectives. 2004, 112 (4): 417-419.
42. Poynton HC, Varshavsky JR, Chang B, Cavigiolio G, Chan S, Holman PS, Loguinov AV, Bauer DJ, Komachi K, Theil EC, Perkins EJ, Hughes O, Vulpe CD: Daphnia magna ecotoxicogenomics provides mechanistic insights into metal toxicity. Environmental Science & Technology. 2007, 41 (3): 1044-1050. 10.1021/es0615573.
43. Feder ME, Mitchell-Olds T: Evolutionary and ecological functional genomics. Nature Reviews Genetics. 2003, 4 (8): 651-657. 10.1038/nrg1128.
44. Parkinson J, Whitton C, Schmid R, Thomson M, Blaxter M: NEMBASE: a resource for parasitic nematode ESTs. Nucleic Acids Research. 2004, 32: D427-D430. 10.1093/nar/gkh018.
45. MISA. [http://pgrc.ipk-gatersleben.de/misa/]
46. Nardi F, Spinsanti G, Boore JL, Carapelli A, Dallai R, Frati F: Hexapod origins: Monophyletic or paraphyletic? Science. 2003, 299 (5614): 1887-1889. 10.1126/science.1078607.
47. Carapelli A, Lio P, Nardi F, van der Wath E, Frati F: Phylogenetic analysis of mitochondrial protein coding genes confirms the reciprocal paraphyly of Hexapoda and Crustacea. BMC Evolutionary Biology. 2007, 7 (Suppl 2): S8. 10.1186/1471-2148-7-S2-S8.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.