Insights into the innate immunome of actiniarians using a comparative genomic approach

van der Burg, Chloé A.; Prentis, Peter J.; Surm, Joachim M.; Pavasovic, Ana

doi:10.1186/s12864-016-3204-2

Research article
Open access
Published: 02 November 2016

Insights into the innate immunome of actiniarians using a comparative genomic approach

Chloé A. van der Burg^1,2,
Peter J. Prentis^3,4,
Joachim M. Surm^1,2 &
…
Ana Pavasovic^1,2

BMC Genomics volume 17, Article number: 850 (2016) Cite this article

3262 Accesses
36 Citations
9 Altmetric
Metrics details

Abstract

Background

Innate immune genes tend to be highly conserved in metazoans, even in early divergent lineages such as Cnidaria (jellyfish, corals, hydroids and sea anemones) and Porifera (sponges). However, constant and diverse selection pressures on the immune system have driven the expansion and diversification of different immune gene families in a lineage-specific manner. To investigate how the innate immune system has evolved in a subset of sea anemone species (Order: Actiniaria), we performed a comprehensive and comparative study using 10 newly sequenced transcriptomes, as well as three publically available transcriptomes, to identify the origins, expansions and contractions of candidate and novel immune gene families.

Results

We characterised five conserved genes and gene families, as well as multiple novel innate immune genes, including the newly recognised putative pattern recognition receptor CniFL. Single copies of TLR, MyD88 and NF-κB were found in most species, and several copies of IL-1R-like, NLR and CniFL were found in almost all species. Multiple novel immune genes were identified with domain architectures including the Toll/interleukin-1 receptor (TIR) homology domain, which is well documented as functioning in protein-protein interactions and signal transduction in immune pathways. We hypothesise that these genes may interact as novel proteins in immune pathways of cnidarian species. Novelty in the actiniarian immunome is not restricted to only TIR-domain-containing proteins, as we identify a subset of NLRs which have undergone neofunctionalisation and contain 3–5 N-terminal transmembrane domains, which have so far only been identified in two anthozoan species.

Conclusions

This research has significance in understanding the evolution and origin of the core eumetazoan gene set, including how novel innate immune genes evolve. For example, the evolution of transmembrane domain containing NLRs indicates that these NLRs may be membrane-bound, while all other metazoan and plant NLRs are exclusively cytosolic receptors. This is one example of how species without an adaptive immune system may evolve innovative solutions to detect pathogens or interact with native microbiota. Overall, these results provide an insight into the evolution of the innate immune system, and show that early divergent lineages, such as actiniarians, have a diverse repertoire of conserved and novel innate immune genes.

Background

The immune system is an ancient and complex system that works to detect and defend against pathogens, as well as to regulate the interaction between microbes and hosts [1–4]. Defence against pathogens in vertebrates is a two-fold mechanism. It consists of the innate immune system, which provides non-specific protection to the host, and the adaptive immune system, which mounts a specific attack against foreign bodies (e.g., microbes) and displays immunological memory [1–4]. Invertebrates, such as cnidarians (corals, jellyfish, hydroids and sea anemones), however, only possess the innate immune system as their primary mode of pathogen defence [5].

Innate immune genes tend to be highly conserved in metazoans, even in early divergent lineages, such as Cnidaria and Porifera [5–9]. However, constant and diverse selection pressures on the immune system have driven the expansion and diversification of different immune gene families in a lineage-specific manner [10–13]. One particular example is the NOD-like receptor family (renamed by [14] as Nucleotide-binding and Leucine-rich Repeat containing gene family (NLR)), of which expansions have occurred independently in multiple early divergent lineages [7, 12, 15, 16].

The evolution of immune genes can be complex, as not all functionally conserved immune genes are homologous. Protein domains may evolve independently and subsequently converge on a conserved function or architecture [12, 17]. For example, the functional Toll-like receptor pathway in Hydra magnipapillata (freshwater polyp) is formed using a scaffold assembly of TIR-only proteins (Toll/interleukin-1 receptor homology domain) and LRR-only proteins (Leucine-rich repeat) [18–20]. Similarly, it appears that the immunoglobulin-TIR domain combination found in the interleukin receptor family has evolved independently in cnidarians and vertebrates, as these genes share limited sequence similarity [5, 11, 16]. For these reasons, investigating conserved domain architectures can be highly informative in identification and characterisation of the immune repertoire in cnidarians and other early divergent metazoan taxa. Using this approach, the evolution of novel immune genes can also be investigated, through interrogation of novel architectures with domains that have known immune functions. The TIR domain is a well-characterised example; it functions in protein-protein interactions and signal transduction in immune pathways. Such an approach has previously [11] been successfully applied to identify novel immune genes by interrogating TIR-domain-containing architectures.

Like other members of phylum Cnidaria, actiniarians are anatomically simple and develop from only two germ layers (diploblastic). These animals are typically sedentary with no physical protective barriers such as a shell, cuticle or exoskeleton and are therefore directly exposed to the abiotic and biotic conditions surrounding them. Consequently, these organisms have evolved immune defence mechanisms that tend to rely on mucous secretions which consist of antimicrobial peptides, as well as a diverse range of pattern recognition receptors (PRRs) which work in concert with the glycocalyx, to provide a physicochemical barrier [21, 22]. As with other eumetazoans, pathogen recognition in actiniarians is thought to occur primarily via the detection of pathogen associated molecular patterns (PAMPs), using a diverse array of PRRs. Cnidarian immune genes, in particular PRRs, also have a major role in maintaining homeostasis between the host and the beneficial native microbiota, which primarily reside on the epithelium [19, 23], although many cnidarians also undergo endosymbiosis with dinoflagellates [24].

Current genomic resources for cnidarians have been limited to a few key model species, including Nematostella vectensis (starlet sea anemone) [6], Acropora digitifera (coral) [25], H. magnipapillata. [26] and Aiptasia sp. (sea anemone) [27]. Interrogation of these genomic resources has revealed that the cnidarian genome is surprisingly complex. In fact, cnidarians have maintained both eumetazoan and early divergent metazoan gene families, some of which have been lost in invertebrate models Drosophila melanogaster and Caenorhabditis elegans [5, 9, 28–30]. In particular, the genome of the sea anemone Nematostella, has been shown to be more similar to vertebrates than other invertebrate model species, with gross genomic structure and exon/intron arrangement similar to vertebrates [6]. While these few cnidarian species have been crucial in elucidating the complexity of the cnidarian genome, in this highly speciose phylum, more resources are needed to facilitate further in-depth comparative evolutionary and phylogenetic analyses.

Actiniarians are uniquely suited for the study of the origin and evolution of the innate immune system. This is principally attributed to their phylogenetic placement as members of Cnidaria, the sister phylum to Bilateria. Their simple anatomical features and complex, vertebrate-like genomes make Actiniarians an interesting candidate for understanding how novel innate immune genes evolve within eumetazoan species that have limited physical defences. Furthermore, using Actiniarians as a model for understanding anthozoan immunity will help provide insights into other issues facing anthozoan species, such as coral bleaching and disease. To better understand the evolution of the innate immune system, here we performed a comprehensive and comparative investigation of the actiniarian innate immune gene repertoire with the aim to provide preliminary insights into the key processes that have shaped the cnidarian immunome. Specifically, we have examined gene family evolution of candidate immune genes, to better understand the role of gene gain and loss events, the generation of new genes and the acquisition of genes from other species, as key drivers of the innate immune gene repertoire in actiniarians.

Results

Sequencing, assembly and annotation

Whole organism extractions of total RNA were performed from Actinia tenebrosa (red, brown, green and blue colourmorphs, designated 1, 2, 3 and 4 respectively) Anthopleura buddemeieri, Aulactinia veratra, Calliactis polypus (designated 1 and 2), Telmatactis sp. and Nemanthus annamensis. Sequencing was performed on an Illumina NextSeq 500, and raw reads per sample ranged between 51,620,970 and 220,632,160. These reads were assembled into 116,120–212,774 contiguous sequences (contigs) for each transcriptome. All raw reads were submitted to the NCBI Sequence Read Archive (SRA) under one BioProject (PRJNA313244); accession numbers for individual transcriptomes can be viewed in Additional file 1: Table S1. Assembly statistics as well as completeness scores can be found in Additional file 1: Tables S2 and S3, which also includes assembly metrics for raw reads downloaded from the SRA for Aiptasia pallida, Anthopleura elegantissima and Nematostella vectensis. A total of 13 transcriptomes were used in this study, from 9 actiniarian species.

Gene ontology

Gene ontology (GO) terms were assigned to between 21,750 and 43,791 transcripts using Trinotate (Additional file 1: Table S4). WEGO analysis indicated a similar distribution of GO terms among assemblies (Additional file 1: Figure S1). CateGOrizer analysis showed that the number of genes assigned immune class GOs ranged from 7239 to 12,515 across species, with the exception of A. pallida which had only 1500 genes (Additional file 1: Table S5 and Figure S2). Stress response and protein metabolism were the most frequently assigned immune class GO terms for all species, while the ‘immunology, immune response’ term corresponded to approximately 2.7 % of the total number of genes in this GO category, except A. pallida which had 0.5 %. The results from CateGOrizer showed that most assemblies have a similar distribution of GO terms associated with the immune class. A. pallida showed the most variation from the other transcriptomes, with a much lower fraction of genes in all categories except categories relating to metabolism (i.e., catabolism and protein, lipid and carbohydrate metabolism).

RSEM

RNA-Seq by Expectation-Maximization (RSEM) analysis of transcriptomes was performed to verify candidate transcripts are expressed and as a guide for choosing the most highly expressed candidate genes for PCR validation i.e., transcripts with the highest FPKM (Fragments Per Kilobase of transcript per Million mapped reads) score were chosen in most instances for PCR validation. The most highly expressed immune transcripts in all transcriptomes for which RSEM was performed (A. tenebrosa (1), A. buddemeieri, A. veratra, C. polypus (2)) were all NLR transcripts, with FPKMs of 84.69 for A. tenebrosa, 22.76 for A. buddemeieri, 65.74 for A. veratra and 370.7 for C. polypus.

Candidate gene counts and architecture

Candidate gene analysis revealed that for most species, a single gene copy of TLR, MyD88 and NF-κB was identified (Table 1), with very few species having multiple isoforms of these genes (Additional file 2: Table S6). In contrast, multiple copies of NLR and IL-1R-like were found in all species. Complete copies of NLR genes were found in most species (with the exception of A. elegantissima and Telmatactis sp., where only partial sequences were found) ranging from 2 to 4 complete and 1–8 partial copies of NLR genes, with total (complete and partial) gene copy numbers ranging from 3 to 10 (Table 1 and Additional file 2: Table S7). Complete copies of IL-1R-like genes were found in all species, ranging from 1 to 7 complete gene copies and 3–7 total gene copies when considering partial sequences as well (Table 1 and Additional file 2: Table S7).

Table 1 Candidate gene counts in each transcriptome

Full size table

All TLR and IL-1R-like predicted peptides had transmembrane domains (TMDs) and all MyD88 and NF-κB predicted peptides did not. Transmembrane domain analysis of NLRs, however, revealed 8 out of 35 total (including complete and partial) NLRs had TMDs. Multiple (3–5) TMDs were identified in TMD containing NLRs and were clustered at the N-terminus. TMD containing NLRs were found in A. pallida, A. tenebrosa, A. buddemeieri and C. polypus. The most common protein domain architecture of all candidate genes and their hypothesised location in the cell are shown in Fig. 1.

Candidate and novel gene validation

To validate candidate and novel genes, PCR amplification from A. tenebrosa, A. buddemeieri, A. veratra and C. polypus was performed. PCR products were sequenced and realigned to the contig from which primers were designed for each transcript. Sequences were considered complete when they aligned to the transcript with at least 94 % pairwise sequence similarity to the ORF, following trimming. A total of 15 complete and 3 partial sequences were validated, with at least one representative sequence from each of the 5 candidate genes, and one representative sequence from each of the 3 novel genes. All validated sequences were submitted to GenBank® (NCBI) and accession numbers can be viewed in Additional file 3: Table S14.

Presence/absence of other innate immune genes

Additional innate immune genes were investigated in this study for their presence/absence in actiniarian species. These included key receptors and proteins in the complement pathway (i.e., CniFL, which is hypothesised to function in the lectin-complement pathway and complement proteins MASP, C3, Factor B, C6 and Factor I), as well as genes encoding protein domains known to have immune function (i.e., SRCR and C-type Lectin domains). The results are presented in Table 2.

Table 2 Gene counts and presence/absence of other innate immune genes

Full size table

Identification of novel immune genes

Taxonomically-restricted transcripts with novel domain architectures were identified in all species, but not all novel transcripts were found in all species examined (Additional file 2: Tables S8–S10). Novel genes with potential immune function were identified based on novel architectures that included the TIR (or TIR_2) domain. A representation of the full repertoire of these genes is shown in Fig. 2. Three of the most common transcripts were investigated further and designated as Novel Genes 1, 2 and 3.

Novel Gene 1 (NG1) was the most commonly observed transcript with novel domain architecture and was found in all species examined. NG1 includes the Pfam domains Leucine-rich repeat (LRR), Miro-like protein (Miro), Ras family (Ras) and Toll/Interleukin-1 receptor homology domain (TIR_2). SMART domain annotation represented the Miro and Ras domains as either Ras family, Ras of complex (Roc) or small GTPase, all of which are GTPase family domains. A low threshold C-terminal of Roc (COR) domain was also found ~300 aa downstream in all genes, which may be essential to the function of the GTPase domain and is always found downstream of Ras [31]. NG1 is represented in Fig. 2a.

BLASTx analysis of NG1 against the TrEMBL database (Additional file 2: Table S9) identified the only significant hit as a Nematostella predicted protein. Swiss-Prot BLASTx searches (Additional file 2: Table S9) identified this gene as an F-box/LRR (FBXL) protein from multiple vertebrate species. The FBXL protein, however, only has the LRR region in common with NG1, indicating there is no ortholog of NG1 in vertebrates. Further investigation of this gene with known TIR and TIR_2 architectures from the EMBL-EBI [32] domain database identified 362 unique architectures with the TIR_2 domain (PF13676) and 381 unique architectures with the TIR domain (PF01582), but the TIR domain was not found in combination with any GTPase (Roc, Ras or COR). The TIR_2 domain was identified in combination with one or more of the GTPase domains in multiple bacteria species, including cyanobacteria, filamentous bacteria, purple sulfur bacteria and flavobacteria, and with an additional C2 domain in the pacific oyster, Crassostrea gigas (C2 domain – targets proteins to cell membranes), and a MBT domain in the placozoan, Trichoplax adhaerens (MBT – repeat of unknown function). Table S10 (Additional file 2) provides details of the variable domain architectures found that could represent orthologous genes to NG1.

Novel Gene 2 (NG2) was identified in six species, A. tenebrosa (2, 3 and 4), A. pallida, A. elegantissima, C. polypus (2), N. annamensis and N. vectensis, and included one or two TIR_2 domains upstream of a BTK motif (Bruton's tyrosine kinase-type zinc-finger motif). Only a single predicted Nematostella protein was identified in TrEMBL and no proteins were identified in Swiss-Prot (Additional file 2: Table S9). No sequences in EMBL-EBI were identified with a BTK motif and TIR domain and only two sequences containing both a TIR_2 and BTK motif were found in EMBL-EBI. These predicted proteins were found in Phytophthora ramorum (oomycete) and Guillardia theta (crytpomonad algae) (UniProt IDs in Additional file 2: Table S10). NG2 is represented in Fig. 2b.

Novel Gene 3 (NG3) was identified in only three species, A. pallida, C. polypus and N. vectensis. NG3 includes the Pfam domains protein tyrosine kinase or protein kinase (Pkinase_Tyr or Pkinase; always overlapping), a death domain and a TIR (or TIR_2) domain. Additional domains are seen in this gene when represented by SMART; these included central Roc/COR domains and a SH3 domain (SRC Homology 3 domain). Only a single Nematostella predicted protein was found to match this transcript in the TrEMBL database (Additional file 2: Table S9). BLASTx analyses against the Swiss-Prot database (Additional file 2: Table S9) identified two slime mould predicted proteins with similarity; however, neither contained either the DEATH or TIR domains, and were therefore disregarded. EMBL-EBI database search returned no TIR domain architectures that were similar to NG3. A few sequences were found in the EMBL-EBI database with TIR_2 domain combinations similar to NG3. The most similar domain combination found was with a ‘neuralized’ domain in place of the ‘Pkinase’ and ‘SH3’ domains in Strigamia maritima (centipede) and Crassostrea gigas (pacific oyster). The ‘Pkinase’ and ‘SH3’ domains are also missing in a less similar domain architecture found in Strongylocentrotus purpuratus (purple sea urchin) (Additional file 2: Table S10). NG3 is represented in Fig. 2c.

Evolutionary and Phylogenetic analyses

Non-synonymous vs synonymous substitution rates

Non-synonymous vs synonymous substitution rates (d_N/d_S) were calculated using PAML for all TLR, MyD88, NF-κB and TMD containing NLRs genes, as well as for Novel Genes 1, 2 and 3. Table 3 shows a summary of the results and all PAML results for the Yang and Nielsen [33] method can be viewed in Additional file 4. Conserved and novel genes where all found to have patterns of synonymous and non-synonymous mutations consistent with the action of purifying selection (i.e., dN/dS ratio < 1) across all genes.

Table 3 Non-synonymous vs synonymous substitution rates

Full size table

NLR family phylogenetic analysis

Maximum likelihood analysis was used to infer phylogenetic relationships of NLRs between actiniarian species in this study with a selection of additional metazoan NLRs. Figure 3 shows a radial maximum likelihood tree, generated using a LG + G protein model. Highlighted on the tree are two clades, which correspond to NLRs with certain classes of N-terminal domains. All known metazoan TMD containing NLRs cluster into one clade, which only consists of anthozoan species (Actiniaria and Scleractinia (A. digitifera)). Most of the other anthozoan NLRs all cluster into another clade. Interestingly the DED sequences from anemone species from this study cluster more closely with the Chordata, Annelida and Mollusca BIR, CARD and DD clade, than with other anthozoan species (i.e., N. vectensis and A. digitifera), although this is lowly supported. N. vectensis NLRs consistently cluster outside of other actiniarian species with high bootstrap support.

CniFL family phylogenetic analysis

A total of 42 protein sequences from all species except A. elegantissima and Telmatactis sp, were identified with a complete ORF. Removing duplicate sequences resulted in a final alignment of 30 sequences, which was trimmed to remove poorly aligned regions to a final 132 aa alignment. The phylogeny represented in Fig. 4 was generated using the JTT + G protein model of evolution and the alignment used can be found in Additional file 5. Most CniFL genes were identified with an additional TMD which is represented in the domain architecture illustrations in Fig. 4. One clade also contains CniFLs with an additional domain WAP (whey acidic protein) found in N. vectensis, A. tenebrosa and C. polypus. Mapping back the N. vectensis CniFL transcript to the genome (v1.0) identified the N. vectensis CniFL gene at scaffold_24:467371-486150 (gene model: e_gw.24.49.1). Multiple Ig domains were annotated by Pfam in the genome; however, they are annotated in a region that has been annotated as intronic.

Discussion

To better understand the evolution of the innate immune system we have performed a comprehensive and comparative investigation of the actiniarian innate immune gene repertoire, including TLR, NLR,IL-1R-like, MyD88, NF-κB, CniFL, SRCR, MASP, C-type lectin domains, complement control proteins and multiple novel innate immune genes. By generating multiple high-quality transcriptome assemblies we were able to undertake a detailed study of innate immune genes in actiniarians. Here we note that in some cases specific gene families appear to have undergone lineage-specific expansions and potential neofunctionalisation (e.g. NLR), while other genes such as TLRs have fewer copies than observed in other eumetazoan phyla. Although these observations need further functional validation, they provide an insight into the core eumetazoan immune gene set and show that the majority of actiniarian innate immune genes show evidence of purifying selection.

Candidate and novel gene counts and architectures

The identification of candidate and novel genes in this study is mostly consistent with previously reported information on gene copy number in cnidarians. For example [11], reported similar counts for some TIR domain containing proteins identified in nine anthozoan species. Their study included three of the species interrogated in our study: A. pallida, A. elegantissima and N. vectensis and confirmed single copies of TLR and MyD88. For IL-1R-like, of which there are multiple copies, fewer complete genes were identified in our study for the three species, whilst numbers of all TIR domain containing proteins were higher. The study [11] also identified an expansion of TIR only proteins restricted to coral species. Here we present evidence for an anthozoan-specific expansion of TIR only proteins, as we identify similar counts of TIR only proteins in all anemone species (Additional file 2: Table S8).

In this study, the TIR domain, not TIR_2, was always found in proteins with evolutionarily conserved architectures (i.e., TLR, MyD88, IL-1R-like). Notably, in most instances TIR_2, a common bacterial TIR domain [34] also found in some marine invertebrates [35–37] was found in novel genes identified here. This highlights that horizontal gene transfer (HGT) may have played a role in the introduction of the TIR_2 domain to the genome of actiniarian species, which then underwent domain shuffling to form the observed novel genes. Whilst the possibility of contamination cannot conclusively be ruled out, in most instances no similar domain architectures to the three novel genes were identified in any other species. This study identifies two previously uncharacterised genes (NG1 and NG2) and extends on the previously reported novel genes for Actiniarians [11], including NG3 and other novel genes shown in Fig. 2, which have been identified by [11]. Interestingly, NG3 identified here was found with additional domains (SH3, Roc and COR domains) not identified by [11]. These novel domain architectures found in lineage-specific actiniarian species are likely the result of domain shuffling.

Expanded repertoire and neofunctionalisation of NLRs

The diversity in domain architectures observed in the NLR family across metazoans suggests potential evidence for neofunctionalisation in this gene family. The expansion and diversification of actiniarian NLRs, such as those with transmembrane domains, observed in our study is significant as it indicates that some of these duplicated copies have likely undergone independent neofunctionalisation within Anthozoa. This is supported by phylogenetic analysis of NLRs across Metazoa (Fig. 3) where all 15 TMD containing NLRs clustered into one clade, but other actiniarian NLRs clustered in separate clades. The presence of multiple transmembrane domains in NLRs identified in our study indicates that they may be membrane-bound. This appears to be an anthozoan specific innovation as all other metazoan NLRs are exclusively localised to the cytoplasm and do not have these domains [38]. Interestingly, only two other anthozoan species (N. vectensis and A. digitifera) have so far been reported with TMD containing NLRs [7], but our results indicate a wide distribution of TMD containing NLRs across multiple actiniarian species. This highlights that novel innovations and neofunctionalisation have occurred in innate immune gene families of Actiniaria. Amphimedon queenslandica (sponge) NLRs may also contain TMDs, however, no anthozoan TMD containing NLRs cluster with these sequences and do not currently support this hypothesis [7]. Phylogenetic analysis of the metazoan NLRs resulted in a similar tree to the one obtained by [7] in their study on sponge NLRs, providing further evidence that the TMD containing NLR clade are correctly clustered into a single clade, indicating this novel structure has likely evolved from a single origin within Anthozoa. Potential functional implications of the neofunctionalisation of NLRs include allowing cooperation between other pattern recognition receptors or a modified or novel response to an immune signal.

Conservation of innate immune pathways and the interaction of novel genes

A functioning innate immune system may depend on the conservation of signalling pathways, more so than the conservation of particular immune genes. For example, in Hydra the TLR pathway is conserved through a scaffold of TIR-only and LRR-only proteins despite the lack of a full length TLR gene [18–20]. In our study, however, we found canonical TLR genes in most species (Table 1), with the exception of A. pallida and Telmatactis sp. In fact, no canonical TLR gene was identified in the Aiptasia genome [27]. Whether A. pallida and Telmatactis sp. also use scaffold TIR-only and LRR-only proteins remains unexamined, however, the expansion of TIR-only and TIR-domain-containing proteins in all species in this study may contribute to the building of scaffold-like immune pathways. Interestingly, the presence of 4 full length canonical TLR genes in coral (i.e., A. digitifera) [25] suggests that the canonical TLR was present in the common ancestor of anthozoans.

This principle of pathway conservation is demonstrated by the evolution of the interleukin-1 receptor gene family. The Ig-TIR domain combination (found in interleukin receptors such as IL-1R) is hypothesised to have evolved independently in both cnidarians and vertebrates [11, 12]. No IL-1R-like identified here were annotated as bona fide IL-1R protein by Swiss-Prot, rather they were all annotated as either TIR-domain-containing proteins or Ig containing genes. Further domain analysis demonstrated that these genes do indeed possess the canonical IL-1R-like architecture, with a C-terminal TIR, transmembrane domain and multiple Ig domains (Fig. 1). Previous studies [5, 11] have reported the independent evolution of the Ig-TIR domain combination and thus concluded that cnidarian IL-1R (or rather IL-1R-like,) are not orthologs of the vertebrate IL-1R family. In particular, one study [12] argued that either the LRR-TIR (as seen in TLRs) or the Ig-TIR domain combination must have evolved independently in more than one metazoan lineage. Further phylogenetic analyses may clarify whether these gene families or domain combinations have evolved independently in multiple metazoan lineages.

Expansions in genes encoding TIR-domain-containing proteins, including both novel and ‘conserved’ (i.e., IL-1R-like) genes, may compensate for the lack of more than one TLR (or no TLR, in the case of Aiptasia, Hydra and Telmatactis sp. [19, 27]). The TLR and IL-1R-like pathways both interact with MyD88 via their TIR domain during signal transduction, and therefore, may also interact with novel TIR-domain-containing proteins in a similar way. It is plausible to hypothesise that Novel Gene 3 (Fig. 2c) may interact in these pathways via its C-terminal death and TIR domains, which together form the canonical MyD88 domain architecture. Other novel TIR-domain-containing proteins (Fig. 2) may also interact in known or new pathways via the interactions of the TIR domain. Functional validation is required to determine the extent to which these novel genes contribute to cellular processes underpinning the innate immune response.

Distribution of CniFL: a putative complement pathway receptor

The initiation of known immune pathways though novel or lineage-specific immune genes is demonstrated by the discovery of the novel putative immune receptor CniFL. First reported in the recently sequenced Aiptasia genome [27], CniFL is hypothesised to function through the lectin-complement pathway. Here we identify a copy of the CniFL gene (Table 2) in all species (except A. elegantissima, however, this transcriptome was derived from a single tissue source (acrorhagi)), as well as multiple components of the complement pathway (Table 2). We also identify three species (N. vectensis, A. tenebrosa and C. polypus) with CniFLs that contain an additional WAP domain and also identify a transmembrane domain CniFL in most species (Fig. 4). Interestingly, the single CniFL gene copy in N. vectensis was not identified previously [27] due to an incorrectly annotated ab initio gene model prediction in the Nematostella genome (v1.0). In addition, CniFL appears to have extensive copy number variation within hexacorallians (1 in N. vectensis to 7 in A. tenebrosa), but whether this variation in copy number is adaptive remains unexplored. The CniFL gene was hypothesised to have a role in symbiont uptake [27], which was partly due to its absence in the non-symbiotic N. vectensis. However, we observe that the CniFL gene is present in both symbiotic and non-symbiotic (A. tenebrosa, A. buddemeieri, C. polypus, N. vectensis) species investigated in our study (see Additional file 1: Table S2) with highest copy number reported in a non-symbiotic species (A. tenebrosa), therefore we recommend further functional validation to support this hypothesis.

Conclusion

In conclusion, actiniarians have a diverse repertoire of innate immune genes. These include novel domain architectures and potentially novel innovations, such as membrane-bound NLRs and novel putative complement pathway PRRs (i.e., CniFL). Further novelties may have been introduced to the actiniarian immune gene set through HGT followed by domain shuffling. Together, the conservation, expansion and diversification of different gene families and genes encoding TIR-domain-containing proteins have shaped the evolution of the actiniarian innate immune system. The conservation of specific protein domain architectures, immune genes and pathways provides an insight into the core gene set required for a functioning eumetazoan innate immune system. Further functional validation of these genes may elucidate conserved and novel biological functions, including how these genes may contribute to lineage-specific processes and morphological novelties.

Methods

To perform a comparative analysis of the innate immune gene repertoire, across multiple actiniarian taxa, we interrogated a suite of genomic resources, both publically available and newly generated in this study. Specifically, genomic datasets used in this study consisted of newly sequenced transcriptomes for Actinia tenebrosa (total n = 5; red colourmorph n = 2, brown n = 1, green n = 1, blue n = 1), Anthopleura buddemeieri (n = 1), Aulactinia veratra (n = 2), Calliactis polypus (n = 2), Nemanthus annamensis (n = 1) and Telmatactis sp. (n = 1), and as well as publically available transcriptomes for Anthopleura elegantissima (n = 1; SRX754678), Aiptasia pallida (n = 1; SRX231866, run SRR696721) and Nematostella vectensis (n = 1; SRX315372). These species were selected based on their phylogenetic distribution across most superfamilies within Actiniaria, availability of animals and general lack of sequence information for these species. Multiple transcriptomes were generated for A. tenebrosa and A. veratra which represent colourmorphs (red, brown, green and blue for A. tenebrosa) and aerial exposure treatments (one control and one treatment each for A. tenebrosa (red only) and A. veratra). All raw sequence reads were submitted to NCBI Sequence Read Archive under BioProject accession PRJNA313244 (see Additional file 1: Table S1 for all accession numbers).

Animal acquisition

Individual specimens of A. tenebrosa, A. buddemeieri and A. veratra were collected from the intertidal zone at Point Cartwright, (QLD, Australia) in 2014. C. polypus individuals were collected in 2014 from pumice washed onto rock pools at Snapper Rocks (QLD, Australia). Telmatactis sp. was bought from Cairns Marine (QLD, Australia) in 2014. N. annamensis was bought from Great Barrier Reef Marine Pty Ltd in 2015. All animals were housed in holding tanks, in the marine laboratory at QUT, until experimental use. Tanks were maintained at between 33 and 37 ppt salinity and 20–28 °C.