Lichens are symbiotic organisms consisting of two components: a fungal partner, or mycobiont, and an algal partner, or photobiont. Lichens are remarkable in their ability to tolerate extreme environmental conditions, including even outer space [6, 25, 26]. They resume photosynthesis rapidly even after long periods of desiccation. The molecular mechanisms underlying the survival adaptations of lichens are uncharacterised, and genomic resources for lichens are limited. To gain a glimpse of the molecular nature of these neglected organisms, we have generated EST sequences from the grey reindeer lichen, Cladonia rangiferina, using both high-throughput next-generation sequencing and traditional Sanger sequencing. The sequences were assembled de novo, with 79.7% of the reads assembling into contigs and only 20.3% remaining as singletons. These values are similar to those reported for other de novo transcriptome assemblies of non-model organisms [27, 28].
As the grey reindeer lichen is a symbiotic organism comprising two distinct genomes, the Asterochloris genome and the Cladonia rangiferina genome, the sequences were classified to identify their genome of origin and to estimate the ratio of fungal to algal sequences in the lichen transcriptome. More than half of the sequences were classified as fungal, although the proportion varied depending on whether contigs or singletons were classified. Contigs and singletons differ in both their length and cis-substantiation, suggesting that sequence length affects classification. BLASTX best-match taxonomic assignment shows a ratio of fungal to algal/plant sequences similar to that of the predictive classification performed by Eclat. This suggests that the taxonomic assignment is consistent and robust. Although probable non-target sequences, as evidenced by BLASTX analysis, may be present within our sequence collection, the amount of contamination is modest and should not affect the classification. The ratio of the two organisms has been estimated previously, with 7% of cells being of algal origin. Similar values were obtained in an analysis of Lobaria pulmonaria protein spectra, where 10% of the spectra were assigned to green algal proteins. Our results suggest a higher percentage of algal transcripts expressed in wetted lichen tissue. Transcript abundance likely correlates most strongly with transcriptional activity and, to a lesser extent, with thallus cell abundance. Nevertheless, our results appear to confirm the mycobiont as the dominant partner in the symbiosis, even in the context of gene expression.
Since no lichen reference genome has yet been published, annotation was performed by comparing homologous protein sequences with BLASTX. Of all assembled contig sequences, 57.2% had a BLAST hit when run against the non-redundant (nr) protein sequence database. A considerable fraction of the sequences therefore remain unidentified and are apparently novel. The percentage of sequences with a BLAST hit was considerably lower for the singleton sequences. Similar homology results, with lower BLAST match percentages for singletons, have been reported for other de novo transcriptome assemblies of non-model organisms. The numbers are also concordant with those published by Joneson et al., who found significant homology in the nr database for 50% of Cladonia grayi sequences. Some of the sequences without a BLAST match are likely UTRs, but novel, lichen-specific sequences are also likely present in this sequence collection. The cDNA libraries used for sequencing were also un-normalized, and there may therefore be significant redundancy among the sequenced ESTs. In addition, as no reference genome is yet available for any lichen species, the reads could not be mapped to a reference, which would have yielded an ideal assignment.
A large majority of the sequences had either an alga, a fungus, or a lichen species as the best match in the BLAST search (Figure 3). However, only 0.6% of the sequences had their best match to a lichen species, which illustrates the current lack of lichen sequences in the public databases. The largest non-target taxonomic groups were the bacteria (2.5% of sequences), the protists (2.6% of sequences), and the category "other" (3.8% of sequences). Since lichen thalli are also known to contain internal bacterial communities, the presence of bacterial sequences from the lichen microbiome is not unexpected.
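The kind of tally behind such a taxonomic breakdown can be sketched as follows. This is a minimal illustration only, not the pipeline used in this study: the function names, the group labels and the toy input are all hypothetical, and it assumes that each sequence has already been reduced to the taxonomic group of its best BLASTX match (or to no match), and that group percentages are expressed relative to sequences with a match.

```python
from collections import Counter


def hit_rate(best_hits):
    """Percentage of sequences that have any BLASTX match.

    best_hits maps a sequence ID to the taxonomic group of its best
    match, or to None when the sequence had no match.
    """
    if not best_hits:
        return 0.0
    matched = sum(1 for group in best_hits.values() if group is not None)
    return 100.0 * matched / len(best_hits)


def taxonomic_breakdown(best_hits):
    """Percentage of matched sequences falling in each taxonomic group.

    Assumption: percentages are relative to sequences with a match,
    not to all sequences.
    """
    groups = [group for group in best_hits.values() if group is not None]
    if not groups:
        return {}
    counts = Counter(groups)
    return {group: 100.0 * n / len(groups) for group, n in counts.items()}


# Toy input, invented purely for illustration (not data from this study).
toy_hits = {
    "contig_1": "fungi",
    "contig_2": "fungi",
    "contig_3": "algae/plants",
    "contig_4": "bacteria",
    "contig_5": None,  # no BLASTX match
}

toy_rate = hit_rate(toy_hits)
toy_breakdown = taxonomic_breakdown(toy_hits)
```

Whether the group percentages are taken relative to matched sequences or to all sequences changes the reported values, so the denominator should be stated alongside any such breakdown.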
To decipher the biological meaning of the BLAST-annotated sequences, the GO and KEGG databases were used for functional annotation, while an InterPro search was performed to identify recognisable protein motifs within our sequence collection. Lichens have been found to protect themselves from the damage caused by ROS during desiccation by using antioxidants [11, 32], but the enzymatic antioxidants are also involved in removing ROS produced during normal metabolism. This could be reflected by the oxidation-related terms among the most enriched GO terms (Figure 4). Likewise, among the identified KEGG pathways (Table 4, Additional file 2), the glutathione metabolism pathway potentially indicates that constitutive protection mechanisms against ROS are active in the wetted lichen thallus, consistent with earlier measurements of high amounts of reduced glutathione in undesiccated lichens. These results support the hypothesis that highly desiccation-tolerant lichens rely mainly on constitutive protection mechanisms, which require constant levels of gene expression.
Several enriched GO terms and most of the identified KEGG pathways were involved in energy, nucleotide and amino acid metabolism. These findings are consistent with earlier results, in which spectra assigned to proteins involved in post-translational modifications and in energy production and conversion were highly abundant in the mycobiont. The same study found that proteins involved in energy production and conversion strongly dominate the protein fraction of the green alga. Similarly, pathways involved in photosynthesis (carbon fixation in photosynthetic organisms, porphyrin and chlorophyll metabolism) are among the KEGG pathways with the highest number of sequences in our results.
The carbohydrate produced by the photobiont is leaked and taken up by the mycobiont, which then converts it to arabitol and mannitol through the pentose phosphate pathway. The transport-related enriched GO terms and the pentose phosphate pathway among the identified KEGG pathways potentially indicate that this mechanism is active in the studied lichen thallus. Surprisingly, methane metabolism had the second highest number of sequences among the KEGG pathways. The sequences associated with this pathway could potentially be novel, lichen-specific sequences that have high homology to proteins associated with methane metabolism but are in reality associated with an uncharacterised pathway, e.g. the production of a lichen-specific secondary metabolite. Of the sequences, 27.9% had a match in the InterPro database; this suggests that although a reasonable proportion of the sequences contain recognisable protein motifs, there are many unrecognisable sequences, some of which may contain novel protein structures.