Comprehensive EST analysis of the symbiotic sea anemone, Anemonia viridis
© Sabourault et al. 2009
Received: 26 March 2009
Accepted: 23 July 2009
Published: 23 July 2009
Coral reef ecosystems are renowned for their diversity and beauty. Their immense ecological success is due to a symbiotic association between cnidarian hosts and unicellular dinoflagellate algae, known as zooxanthellae. These algae are photosynthetic and the cnidarian-zooxanthellae association is based on nutritional exchanges. Maintenance of such an intimate cellular partnership requires extensive crosstalk between the partners. To better characterize symbiotic relationships between a cnidarian host and its dinoflagellate symbionts, we conducted a large-scale EST study on a symbiotic sea anemone, Anemonia viridis, in which the two tissue layers (epiderm and gastroderm) can be easily separated.
A single cDNA library was constructed from symbiotic tissue of the sea anemone A. viridis sampled under various environmental conditions (both normal and stressed). We generated 39,939 high quality ESTs, which were assembled into 14,504 unique sequences (UniSeqs). Sequences were analysed and sorted according to their putative origin (animal, algal or bacterial). We identified many new repeated elements in the 3'UTR of most animal genes, suggesting that these elements potentially have a biological role, especially with respect to gene expression regulation. We identified genes of animal origin that have no homolog in the genome of the non-symbiotic starlet sea anemone Nematostella vectensis but do have homologs in other symbiotic cnidarians, and that may therefore be involved in the symbiotic relationship in A. viridis. Comparison of protein domain occurrence in A. viridis with that in N. vectensis demonstrated an increase in abundance of some molecular functions, such as protein binding or antioxidant activity, suggesting that these functions are essential for the symbiotic state and may be specific adaptations.
This large dataset of sequences provides a valuable resource for future studies on symbiotic interactions in Cnidaria. The comparison with the closest available genome, the sea anemone N. vectensis, as well as with EST datasets from other symbiotic cnidarians provided a set of candidate genes involved in symbiosis-related molecular crosstalks. Altogether, these results provide new molecular insights that could be used as a starting-point for further functional genomics studies.
Sea anemones, together with corals, jellyfish and hydras, belong to the Cnidaria, which are basal to the eumetazoa and ancestral to the bilateria. Cnidaria are characterized by a sac-like body plan with a single oral opening surrounded by numerous tentacles. As diploblastic animals, they are composed of only two embryonic tissue layers, the epiderm and the gastroderm (Additional file 1).
Many cnidarians harbour photosynthetically active unicellular algae within their gastrodermal cells. In most cases, such symbiont algae are dinoflagellates from the genus Symbiodinium, commonly referred to as zooxanthellae. This association is a trophic endosymbiosis and is considered to be mutualistic because the zooxanthellae provide their cnidarian host with reduced organic carbon resulting from their photosynthetic activity, while the host provides the zooxanthellae with inorganic carbon, inorganic nitrogen [3, 4] and inorganic phosphate, as well as a refuge from herbivory. This simple mutual partnership has recently been revealed to be more complex, however, since the holobiont was found to be a dynamic assemblage of animal, zooxanthellae, endolithic algae and fungi, prokaryotes (Bacteria and Archaea) and viruses [6, 7]. Endosymbioses are thus highly complex associations, implying intimate interactions between host and symbionts as well as constraints, such as hyperoxic conditions generated by symbiont photosynthesis, and transfer of inorganic carbon to the symbiont.
In recent decades, biochemical and physiological studies have highlighted numerous adaptations in cnidarian host tissues, such as the presence of natural sunscreens (UV-absorbing mycosporine-like amino acids), remarkable antioxidant defences [10, 11], specific mechanisms of inorganic carbon absorption and concentration, and mechanisms of inorganic nitrogen absorption. However, despite increasing knowledge about their physiological inter-relationship, very little is known about the molecular adaptations that have permitted this successful partnership.
The cnidarian-dinoflagellate endosymbiotic association is the very foundation of the highly productive and diversified coral reef ecosystem. Coral reefs are considered to host at least 30% of all known marine fauna, like "oases" within marine nutrient-deprived deserts, and play a crucial role in shaping tropical ecosystems. Coral reefs are, however, now also experiencing high levels of anthropogenically-induced stress (global climate change, pollution). Such environmental perturbations, in addition to pathogens, contribute to the breakdown of symbiosis known as "coral bleaching", and even to mortality. Bleaching results in whitening of cnidarian symbiotic tissues, due to a direct loss of dinoflagellates and/or a decrease in photosynthetic pigment concentration. Mass bleaching events have been increasing in both frequency and severity since the 1980s.
The most significant contributions to cnidarian molecular biology are the complete genome analysis of the starlet sea anemone Nematostella vectensis [17–19] and the Hydra magnipapillata genome project. However, neither of these species is symbiotic. In addition, phylogenetic studies suggest that N. vectensis and A. viridis might belong to different suborders (Additional file 2), although both belong to the Hexacorallia Actiniaria. Very few studies using high-throughput techniques have been published to date on symbiotic cnidarians. Most cDNA libraries have been constructed from non-symbiotic cells and only limited EST datasets were generated. Only three studies have been conducted on symbiotic anthozoans: Acropora millepora, a reef building coral (10,247 ESTs; [19, 22]); Aiptasia pulchella, a tropical sea anemone (870 ESTs); and a comparative study on two scleractinian corals, Montastrea faveolata (3,854 ESTs) and Acropora palmata (14,588 ESTs). All have shown that cnidarians are more similar to vertebrates, in terms of gene repertoire, composition and intron/exon structure, than the well known ecdysozoan model organisms, Drosophila melanogaster and Caenorhabditis elegans, where extensive gene loss has occurred [19, 22]. These EST datasets have been used to develop small-scale microarrays for gene expression analysis in several coral-algal symbioses: Acropora millepora, Acropora palmata and Montastrea faveolata [25–27].
Genomic information on zooxanthellae has been obtained by Leggat et al., who analysed 2,682 ESTs of tropical Symbiodinium (clade C3) extracted from the coral Acropora aspera. Another cDNA library (1,484 UniSeqs) was constructed from a cultured Symbiodinium of clade A (CassKB8), originally isolated from the upside-down jellyfish Cassiopea sp., and sequences were compared to those of clade C3. Additional metagenomic analyses were performed on the microbial community associated with the coral Porites astreoides [7, 30], extending our knowledge of the organismal diversity of the holobiont, although no transcriptomic analyses were carried out.
To better characterize symbiotic relationships between a cnidarian host and its associated symbionts, we conducted a large-scale EST study on a symbiotic sea anemone. We chose the sea anemone Anemonia viridis as our study species because its two tissue layers (epiderm and gastroderm) can be easily separated. A. viridis is the most abundant sea anemone of the Mediterranean coasts and hosts the temperate Symbiodinium sp. of clade A, which has been suggested to be the dominant clade of zooxanthellae in the Mediterranean Sea. We prepared a single cDNA library from whole specimens under several stress conditions, in order to maximize the presence of genes required for symbiosis. A total of 39,939 ESTs were generated, and 14,504 UniSeqs were identified, assigned a putative origin and annotated. This large collection of UniSeqs provides the best characterized transcriptomic knowledge of symbiosis in cnidarians. The data will further our comprehension of such relationships and contribute to functional genomic surveys.
The 19 most abundant transcripts (>100 ESTs)
Top hit Swissprot (2008.03)
N. vectensis top hit
CCAAT/enhancer-binding protein beta
Elongation factor 1-alpha
lipoprotein PA4545 precursor
No hits found
60S ribosomal protein L8
No hits found
No hits found
Translation elongation factor 2
Pancreatic secretory granule membrane major glycoprotein GP2
Canis lupus familiaris
Polyadenylate-binding protein 4
40S ribosomal protein S2
Actin, cytoskeletal 1A
ADP, ATP carrier protein, mitochondrial precursor
No hits found
No hits found
60S acidic ribosomal protein P0
40S ribosomal protein SA
40S ribosomal protein S3a
Heat shock 70 kDa protein cognate 4
40S ribosomal protein S4
Based on a reverse transcriptase domain search, 46 transposable elements (retrotransposons/retroposons) were identified (data not shown). Among these, 4 were of prokaryotic origin (without any similarity to N. vectensis sequences) and 42 were of metazoan origin. While 2 of them were almost identical to N. vectensis transposable elements, 21 only had slight similarity to N. vectensis sequences (BlastX or BlastN with E-values of 1 × 10⁻²⁵ to 1 × 10⁻¹⁰), and 19 showed no similarity to N. vectensis sequences but had been previously identified in other Metazoa.
Because the cDNA library was made from symbiotic tissue, we expected to find ESTs related both to the cnidarian host and to its dinoflagellate symbionts. The analysis pipeline used in this study is presented in Figure 1. Sequences were compared with the SwissProt (2008.03) and UniProt KB (TrEMBL+SwissProt) (2008.03) databases using BlastX with a cutoff E-value of < 1 × 10⁻¹⁰ to retrieve functional annotations. Of the assembled dataset, 6,238 UniSeqs had putative similarities while 8,266 had no similarity to any sequences in the chosen databases. Sequences were also compared with NCBI-indexed prokaryotic nucleic sequences (2007.08 release), using BlastN with a stringent E-value of < 1 × 10⁻¹⁵ to assess the proportion of prokaryotic sequences. Finally, a specific search of the UniSeqs for virus proteins returned 7 hits. The relative contribution of each origin is shown in Figure 3. A relatively high proportion of sequences (57%) showed no significant similarity to previously described genes and were therefore considered as 'unknown'. This is somewhat comparable with results obtained from other cnidarians (Acropora palmata, Acropora millepora, Montastrea faveolata and Nematostella vectensis) [19, 24]. Most of the UniSeqs identified in Symbiodinium sp. were also of unknown origin. Metazoan hits were found for 32.3% of UniSeqs (75% of annotated sequences). Among these, most of the annotated UniSeqs (4,266 out of 4,685) matched Nematostella vectensis predicted proteins. However, we also identified sequences that were clearly from the host (first Blast hits of metazoan origin with cutoff E-value < 1 × 10⁻⁵⁰) but had no significant similarity to predicted proteins of N. vectensis. Three of these (a glycoprotein, a ferroxidase and an amine-oxidase) were studied in more detail (Figure 4). PCR and sequencing were first performed on genomic DNA from both A. viridis epiderm and in vitro cultured Symbiodinium, which confirmed the animal origin of these sequences (Figure 4A).
The first sequence studied is related to the ependymin glycoprotein family (more specifically to the Mammalian Ependymin-Related Proteins group or MERP1), which has been well described in vertebrates due to its involvement in regeneration processes. Ependymins are secretory proteins that can bind calcium and that were found predominantly in the cerebrospinal fluid of teleost fish. A bound form, associated with the extracellular matrix, has also been described. Recent data demonstrated that these proteins are also present in non-vertebrate deuterostomes and protostomes, and that positive selection may have shaped their evolution. Figure 4B illustrates an amino acid alignment of A. viridis MERP sequences with homologous proteins found in publicly available databases (Bayesian analysis of the MERP sequence confirmed the phylogenetic relationship, not shown). The presence of ependymin proteins in basal organisms indicates for the first time that this protein family is far older than previously thought (first described as chordate-specific, then deuterostome-specific, and finally also found in protostomes).
Quite a large proportion of UniSeqs (9%) were uniquely shared with prokaryotes, of which Proteobacteria was the most prominent bacterial group (Pseudomonas, Bordetella, and Burkholderia). The holobiont has already been described as a dynamic assemblage, made up of the animal host, zooxanthellae, endolithic algae and fungi, Bacteria, and Archaea. Such "prokaryotic sequences" could therefore be assigned to the sea anemone-associated flora. In addition, a small but significant number of A. viridis sequences, recognized as being of prokaryotic origin based on Blast analysis, had already been identified in the N. vectensis genome. Such genes are clearly similar in sequence to prokaryotic homologs, although they contain introns. In N. vectensis and A. millepora they were proposed as "ancient genes", conserved in cnidarians but lost in other animal genomes. Although our results are in line with this interpretation, comparative genomic studies among cnidarians, such as A. viridis, could help to identify the most probable evolutionary scenario: either maintenance of "non-metazoan" genes in cnidarians, or lateral gene transfer events followed by rapid intron acquisition.
Surprisingly, only a small fraction of our dataset could be assigned to unicellular eukaryote sequences (putative Symbiodinium sp., 3.6% of annotated sequences). Two well accepted explanations have been proposed: i) poor representation of dinoflagellate sequences in databases, leading to wrong assignment after Blast analysis; and ii) technical bias due to the Symbiodinium cell wall impairing complete RNA extraction with standard methods, thus leading to under-representation of symbiont cDNAs in our library.
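The origin assignment described above can be illustrated with a minimal sketch. Only the two E-value cutoffs (BlastX < 1 × 10⁻¹⁰ against protein databases, BlastN < 1 × 10⁻¹⁵ against prokaryotic sequences) come from the text; the `BestHit` record, the taxon labels and the order in which the two searches are consulted are assumptions made for illustration and do not reproduce the actual pipeline of Figure 1.

```python
from dataclasses import dataclass
from typing import Optional

# Cutoffs taken from the text; everything else below is illustrative.
PROTEIN_EVALUE_CUTOFF = 1e-10     # BlastX vs SwissProt / UniProt KB
PROKARYOTE_EVALUE_CUTOFF = 1e-15  # BlastN vs prokaryotic nucleic sequences


@dataclass
class BestHit:
    """Hypothetical record for the best Blast hit of one UniSeq."""
    evalue: float
    taxon: str  # assumed label, e.g. "metazoa", "unicellular eukaryote"


def assign_origin(protein_hit: Optional[BestHit],
                  prokaryote_hit: Optional[BestHit]) -> str:
    """Return a putative origin for one UniSeq.

    Assumption: a significant protein hit takes precedence over a
    significant prokaryotic nucleotide hit.
    """
    if protein_hit is not None and protein_hit.evalue < PROTEIN_EVALUE_CUTOFF:
        return protein_hit.taxon
    if (prokaryote_hit is not None
            and prokaryote_hit.evalue < PROKARYOTE_EVALUE_CUTOFF):
        return "prokaryote"
    return "unknown"
```

Under these assumptions, a UniSeq whose only hits are weaker than both cutoffs falls into the 'unknown' class.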
Distribution of positive Blast hits among Cnidaria (table columns: number of ESTs, number of hits)
All these differences between the present A. viridis dataset and the N. vectensis genome may reflect the crucial role of trophic exchanges between the sea anemone and its dinoflagellate symbionts, as well as specific host adaptations.
This large EST collection has provided high quality data on all aspects of a temperate symbiotic cnidarian, particularly with regard to coding sequences and regulation features. For example, we identified many novel repeated elements (RE) in 3'UTRs, suggesting an invasion of most animal sequences by some specific RE families. It will be interesting to further investigate their potential biological role, particularly in gene regulation. Phylogenetic origin and functional classification of the holobiont sequences allowed the identification of several symbiotic candidate genes. These data are now being used to develop a dedicated microarray that will provide a valuable resource for future studies on symbiotic interactions in A. viridis. Furthermore, these data have also shown the importance of Symbiodinium symbionts as well as the associated flora of the three major prokaryotic species. Some sequences will be further analysed from the new perspective of gene transfer between host and symbiont. The relatively low abundance of sequences from Symbiodinium was attributed to experimental bias; a new ongoing sequencing project should fill this gap. Finally, these data from a temperate zone cnidarian provide novel molecular insights that will complement those obtained from tropical anthozoans. This dataset is a valuable resource that will be of great help for comparative genomics and evolutionary studies.
To maximize the diversity of genes expressed in the symbiotic association under both normal and stress conditions, sea anemones were subjected to different controlled stress conditions before RNA extraction and cDNA library construction.
Specimens of the Mediterranean sea anemone, Anemonia viridis (Forskål, 1775), were collected close to Villefranche-sur-mer (France). During an initial acclimation period of at least 4 weeks, animals were maintained in tanks of running seawater at 17 ± 1°C with a light intensity of 250 μmol quanta m⁻² s⁻¹ (overhead metal halide lamps Philips HQI TS 400W), on a 12 h light/12 h dark cycle (starting at 8 am). Animals were fed once a week. After the acclimation period, animals were treated and sampled (five tentacles from each specimen, four specimens from each experiment except for the hyperoxia stress) as follows:
Light/dark cycle: sampling was performed at different times of the day (10 am and 7 pm) and night (7 am and 10 pm).
Thermal stress: sea water was heated from 17°C to 24°C over 2 h and maintained at this maximal temperature for 5 days. Five A. viridis tentacles were sampled after 1, 2 and 5 days of continuous thermal stress.
Hyperoxia condition: three specimens of A. viridis were subjected to 10 h of 100% O2 at 17.0 ± 1°C under a constant irradiance of 250 μmol m⁻² s⁻¹. Oxygen saturation of the medium was achieved by bubbling pure O2 through seawater and was monitored using a gas analyser (Radiometer Copenhagen ABL 30; Copenhagen, Denmark).
Bleached specimens, resulting from symbiosis disruption and maintained in the bleached state for several years (at 17°C in dark conditions), were also sampled.
The mRNAs were then extracted using Trizol Reagent (Invitrogen), as described in Richier et al., and equal amounts of mRNA from the different conditions described above were combined before cDNA library construction. The A. viridis cDNA library was generated at the RZPD (Deutsches Ressourcenzentrum für Genomforschung GmbH, Berlin, Germany) by oligo(dT) and random priming, followed by directional cloning into pSPORT1. From this library, 50,304 clones were picked, replicated and subsequently sequenced at the Genoscope (French national sequencing centre). High-throughput sequencing of the 5' end was performed using a Big-Dye terminator cycle sequencing kit and M13 reverse primer on an ABI-3730 Genetic Analyzer (Applied Biosystems) following the manufacturer's protocol (Genoscope, Evry, France). A total of 41,247 chromatograms were thus generated and further analyzed.
EST sequences were processed using SURF analysis pipeline tools (SURF: SeqUence Repository and Feature detection, developed by the SIGENAE team, Dehais Patrice and Eddie Iannucelli, INRA, Toulouse). SURF provides an integrated solution, from chromatogram data storage to cloned insert detection, by integrating several dedicated bioinformatic software programs (sequence base calling, vector detection, etc.) in order to produce relevant nucleotide sequences according to base quality and feature detection. The chromatogram files were exported to PHRED for base calling [40, 41]. Cloned insert detection was based on the detected features (vector, adaptor, poly(A) or poly(T) tails and repeats) and their respective positions, using third party programs (Crossmatch and RepeatMasker). Only inserts longer than 100 bp, with a Phred score >20, and not belonging to a low complexity area were exported in fasta format, each with its corresponding quality file. Additional extremity trimming was done using the "trimseq" command (EMBOSS package). Low complexity regions and repeats were masked using the RepeatMasker program. For this purpose, two different libraries were used: the RepeatMaskerLib (RepBase Update of 2007.09.24) and a custom library of A. viridis. This custom library was made both by using CENSOR to retrieve publicly available repeats, and by running a BlastN of all ESTs against themselves to identify the most abundant repeat regions.
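The export criteria for cloned inserts can be sketched as a simple predicate. This is not the SURF implementation: the function name is hypothetical, the low complexity flag is assumed to be computed elsewhere, and the Phred criterion is interpreted here as a mean quality above 20 over the insert, which the text does not specify.

```python
from statistics import mean
from typing import Sequence

MIN_LENGTH = 100  # bp; the text keeps inserts with "more than 100 bp"
MIN_PHRED = 20    # the text requires a Phred score > 20


def passes_export_filter(sequence: str,
                         qualities: Sequence[int],
                         low_complexity: bool = False) -> bool:
    """Return True if an insert meets the three export criteria.

    Assumption: the Phred criterion is applied to the mean per-base
    quality of the insert.
    """
    if len(sequence) != len(qualities):
        raise ValueError("sequence and quality arrays differ in length")
    if low_complexity:
        return False
    if len(sequence) <= MIN_LENGTH:
        return False
    return mean(qualities) > MIN_PHRED
```

A per-base or sliding-window criterion would be an equally plausible reading and would only change the last line.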
High quality ESTs (39,939) were then assembled into contigs using the TIGR-TGICL tool.
Putative transposable elements (46 sequences) were first identified based on homology search after BlastX analysis against UniProt KB (E-value of < 1 × 10⁻²⁰). An additional local BlastN search was performed using the EST dataset both as query sequence file and target database. Repeated motif sequences, i.e. repeats occurring more than twice in non-overlapping ESTs, were selected as a first-screen repeats dataset. These were used, together with the Repbase repeats library, to mask our EST sequences before clustering and assembling (TGICL). The same first-screen dataset was then compared by BlastN with the assembled database, and repeated motif sequences occurring on more than two different UniSeqs (E-value of < 1 × 10⁻²⁰) were considered as A. viridis repeat sequences.
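The final selection step (motifs found on more than two different UniSeqs at an E-value below 1 × 10⁻²⁰) can be sketched as follows. The input format, a list of `(motif_id, uniseq_id, evalue)` tuples, and the function name are assumptions for illustration; parsing of the actual BlastN output is not shown.

```python
from collections import defaultdict
from typing import Iterable, Set, Tuple

EVALUE_CUTOFF = 1e-20  # from the text
MIN_UNISEQS = 3        # "more than two different UniSeqs"


def select_repeats(hits: Iterable[Tuple[str, str, float]]) -> Set[str]:
    """Return motif ids that hit at least MIN_UNISEQS distinct UniSeqs.

    `hits` is an assumed, pre-parsed list of BlastN results given as
    (motif_id, uniseq_id, evalue) tuples.
    """
    uniseqs_per_motif = defaultdict(set)
    for motif_id, uniseq_id, evalue in hits:
        if evalue < EVALUE_CUTOFF:
            # A set counts each UniSeq once, however many HSPs it has.
            uniseqs_per_motif[motif_id].add(uniseq_id)
    return {motif for motif, uniseqs in uniseqs_per_motif.items()
            if len(uniseqs) >= MIN_UNISEQS}
```

Collecting targets in a set means that several hits of one motif on the same UniSeq count only once, which matches the requirement of "different UniSeqs".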
Gene functions were automatically assigned to 39% of the predicted proteins (5,652 UniSeqs). This assignment was based on the identification of InterPro (IPR) domains using InterproScan and the following command line: iprscan -cli -i unisequences.fa -o unisequences.ipr.raw -seqtype n -goterms -iprlookup -format raw. For comparative analysis of IPR domains found in the A. viridis dataset, we also ran the program InterproScan on the predicted N. vectensis proteome dataset. To homogenize the granularity level of annotation between organisms for each non-overlapping set of domains found, we only kept the root domain. We used the hierarchical organization of domains proposed in the "Parent-Child" description available on the EBI public ftp server ftp://ftp.ebi.ac.uk/pub/databases/interpro/ParentChildTreeFile.txt. For example, all CYP proteins which have a P450 domain ([InterPro:IPR002949], [InterPro:IPR002397], [InterPro:IPR008070]) were counted at their root domain (IPR001128).
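Collapsing each domain to its root can be sketched as a walk up a child-to-parent map. The map below is a hand-written, flattened illustration built from the P450 example in the text; it is not parsed from ParentChildTreeFile.txt, and the real hierarchy may contain intermediate levels. Function names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def root_of(domain: str, parent_of: Dict[str, str]) -> str:
    """Follow child -> parent links until a domain with no parent is reached."""
    seen = set()
    while domain in parent_of:
        if domain in seen:
            raise ValueError(f"cycle detected at {domain}")
        seen.add(domain)
        domain = parent_of[domain]
    return domain


def count_root_domains(domains: Iterable[str],
                       parent_of: Dict[str, str]) -> Counter:
    """Count domain occurrences after mapping each one to its root."""
    return Counter(root_of(d, parent_of) for d in domains)


# Illustrative map only: the three P450 entries cited in the text,
# attached directly to the root IPR001128.
PARENT_OF = {
    "IPR002949": "IPR001128",
    "IPR002397": "IPR001128",
    "IPR008070": "IPR001128",
}

counts = count_root_domains(
    ["IPR002949", "IPR002397", "IPR008070", "IPR001128"], PARENT_OF)
```

With this map, all four occurrences are counted under IPR001128, so the same counting can be applied to the A. viridis and N. vectensis domain lists before comparing them.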
All contig and singleton sequences were compared with several databases, using Blast: the Nematostella vectensis draft genome (Predicted proteins, http://www.jgi.doe.gov/), SwissProt (2008.03), TrEMBL (2008.03) and other ESTs from symbiotic cnidarian species (Acropora millepora, Acropora palmata, Aiptasia pallida, Montastrea faveolata) or from non symbiotic species (Metridium senile).
To confirm the origin of some selected genes, amplifications were performed on 10 ng of genomic DNA from A. viridis epidermal cells (non-symbiotic cells, animal origin), cultured Symbiodinium cells extracted from A. viridis tentacles (non-symbiotic cells, symbiont origin), and whole tentacle extracts (symbiotic cells, both animal and symbionts). Primers designed in the experiment are presented in Additional file 6 and were used in 40-cycle PCR reactions. Elongation factor 1 alpha and Elongation factor 2 were used as positive controls for nuclear-encoded genes from A. viridis and Symbiodinium spp., respectively, while psbA (photosystem II protein D1) was used as a positive control for chloroplast-encoded Symbiodinium spp. genes. Sequence alignment was done using Multalin. Signal peptide prediction was performed using SignalP. Phylogenetic analyses were done using both MEGA 4.0 and PHYML software.
The ESTs generated in this study were submitted to dbEST ([GenBank:FK719875–FK759813]). Accession numbers of sequences used in MERP alignment: Am_MERP, Acropora millepora, [GenBank:EZ013381.1]; Mm_MERP, Mus musculus, [REFSEQ:NP_598826]; Hs_MERP, Homo sapiens, [REFSEQ:NP_060019]; Ca_MERP1, Carassius auratus, [GenBank: X14134.1]; Ca_MERP2, Carassius auratus, [GenBank: J04986.1].
This work was supported by CNRS (GIS Génomique Marine) and ANR (JCJC05-AGeSyMar) grants. PG was funded through the ANR grant. We thank P. Wincker and the Genoscope team for high-throughput sequencing. We are extremely grateful to H. McCombie-Boudry for her writing contribution.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.