KEGG orthology-based annotation of the predicted proteome of Acropora digitifera: ZoophyteBase - an open access and searchable database of a coral genome

Background Contemporary coral reef research has firmly established that a genomic approach is urgently needed to better understand the effects of anthropogenic environmental stress and global climate change on coral holobiont interactions. Here we present KEGG orthology-based annotation of the complete genome sequence of the scleractinian coral Acropora digitifera and provide the first comprehensive view of the genome of a reef-building coral by applying advanced bioinformatics. Description Sequences from the KEGG database of protein function were used to construct hidden Markov models. These models were used to search the predicted proteome of A. digitifera to establish complete genomic annotation. The annotated dataset is published in ZoophyteBase, an open access format with different options for searching the data. A particularly useful feature is the ability to use a Google-like search engine that links query words to protein attributes. We present features of the annotation that underpin the molecular structure of key processes of coral physiology that include (1) regulatory proteins of symbiosis, (2) planula and early developmental proteins, (3) neural messengers, receptors and sensory proteins, (4) calcification and Ca2+-signalling proteins, (5) plant-derived proteins, (6) proteins of nitrogen metabolism, (7) DNA repair proteins, (8) stress response proteins, (9) antioxidant and redox-protective proteins, (10) proteins of cellular apoptosis, (11) microbial symbioses and pathogenicity proteins, (12) proteins of viral pathogenicity, (13) toxins and venom, (14) proteins of the chemical defensome and (15) coral epigenetics. 
Conclusions We advocate that providing annotation in an open-access searchable database available to the public domain will give an unprecedented foundation to interrogate the fundamental molecular structure and interactions of coral symbiosis and allow critical questions to be addressed at the genomic level based on combined aspects of evolutionary, developmental, metabolic, and environmental perspectives.


Background
All of the reef-building corals (Scleractinia; phylum Cnidaria) that create the vast calcium carbonate deposits of coral reefs have evolved an endosymbiotic partnership with photosynthetic dinoflagellates of the genus Symbiodinium (Dinophyceae), commonly known as zooxanthellae, which reside within the gastrodermal cells of their scleractinian host [1][2][3]. Coral-algal symbiosis is a cooperative metabolic adaptation necessary for survival in the shallow oligotrophic (nutrient-poor) waters of tropical and subtropical marine environments [4,5] that drives the productivity of coral reefs [6]. Coral reefs provide habitat and trophic support for many thousands of marine species, the richness of which rivals the biodiversity of tropical rainforests [7]. Underlying the basic requirements of corals for growth, reproduction and survival are special needs to accommodate symbiont-specific host recognition and to control innate and responsive immune systems; what is likely to emerge from future research is the extent to which the host is involved in direct regulation of its endosymbiont populations. Much is understood about the cellular biology of cnidarian-dinoflagellate symbiosis (reviewed in [8]), but less is known at the molecular level of coral symbiology. There is little opposition to the contention that environmental and anthropogenic disturbances are causing alarming losses to coral reefs ([9] and references therein). Threats to productivity are being imposed by the disruption of coral symbiosis (apparent as "coral bleaching") caused in response to increasing thermal stress attributed to global warming [10,11], from an increase in stress-related coral disease [12][13][14], from the discharge of domestic and industrial wastes, pollutants from agricultural development and the transport of sediments in terrestrial runoff [15,16], and potentially from imminent declines in coral calcification owing to rising ocean acidification [17][18][19].
Accordingly, we require a better understanding of the molecular stress responses and adaptive potential of corals. Such information is necessary to predict bleaching events and so better inform effective management policies for the conservation of coral reef ecosystems [20][21][22][23][24].
To understand how coral holobionts respond to environmental change at the molecular level, the identification of genes that may respond by transcription to stress is of primary importance [25]. Thus, the use of transcriptomic methodologies to identify stress-responsive genes has been highly successful [26][27][28][29][30][31][32]. Transcriptome high-throughput profiling has allowed changes in gene expression across thousands of genes to be measured simultaneously. Fuelled by data-generating power, the number of coral based studies utilising transcriptomics to investigate molecular responses to environmental stressors has expanded greatly by the acquisition of expressed sequence tag (EST) gene libraries, the fabrication of microarray biochips used to estimate levels of mRNA expression, and by direct analysis using next-generation, high-throughput sequencing. However, much of this work has been conducted using the aposymbiotic state of pre-settlement coral larvae, so transcribed genes relevant to metamorphosis and the cytobiology of the adult polyp are limited to a few recent studies [33][34][35][36]. The transcriptome additionally does not provide the structural framework and essential regulatory elements of the functional genome for comprehensive evaluation. Recently, deep metatranscriptomic sequencing of two adult coral holobiomes has been made available on searchable databases: PocilloporaBase for Pocillopora damicornis [36] and PcarnBase for Platygyra carnosus [37]. In contrast, high-throughput metaproteomic analyses to quantify the product yield of stress-response genes of the coral holobiome are yet to be widely adopted by the coral reef scientific community, despite the proteome being the ultimate measure of the coral phenotype [38,39].
The early accumulation of transcriptomic data revealed that a small proportion of coral ESTs matched genes known previously only from other kingdoms of life, implying that the ancestral animal genome contained many genes traditionally regarded as 'non-animal' that have been lost from most animal genomes [40]. Furthermore, an unexpected revelation from EST data is the greater extent to which coral sequences resemble human genes than those of the Drosophila and Caenorhabditis model invertebrate genomes [41,42]. Comparative genomic analysis has revealed higher genetic divergence and massive gene loss within the ecdysozoan lineages. Hence, many genes assumed to have much later evolutionary origins are likely to have been present in an ancestral or early-diverged metazoan [43]. While much of the animal kingdom remains yet to be explored, examples of the metazoan phylum Cnidaria provide a unique insight into the deep evolutionary origins of at least some vertebrate gene families [42]. Thus, the complete genomic sequence of a coral is likely to reveal many genes previously assumed to be strictly vertebrate innovations. To date, cnidarian genomes have been published for the sea anemone Nematostella vectensis [42] and the hydroid Hydra magnipapillata [44]. Only the coral genome of Acropora digitifera is available without restriction on use of its published sequence [45], but the compiled sequence has not been fully annotated. At the time of this writing, the genome assembly of Acropora millepora has been released to the public domain [46], also without full annotation, but an embargo is imposed on use of this data that is highly restrictive to the progress of further studies. Understanding how genomic variation affects molecular and organismal biology is the ultimate justification of genome sequencing, and annotation is an essential step in this process. We envisage that unrestricted access to annotation of the A. digitifera genome will provide an unprecedented foundation to freely interrogate the generic molecular structure, possible endobiotic interactions and the response of coral to environmental stress. Accordingly, we offer annotation of the predicted proteome of A. digitifera on the open access and searchable database, ZoophyteBase [47]. Use of the ZoophyteBase search engines will allow genes of encoded proteins to be identified that can be examined in context of the cellular physiology, processes of ecological significance, the evolutionary and developmental biology of corals and the functional metabolism of the holobiont that collectively underpin the health of coral reefs.

Construction and content
ZoophyteBase is an open access and searchable database of complete annotation of the predicted proteome of the coral A. digitifera [48]. It was constructed using the MEGGASENSE system, which is a general system for constructing annotation databases with different sorts of input data (DNA reads, assembled genomes, predicted proteomes) and the possibility of using different combinations of analysis tools to create the annotation (Gacesa et al, in preparation). In the case of ZoophyteBase, hidden Markov model (HMM) profiles [49] were chosen as the annotation tool rather than the more common BLAST searches [50]. HMM profiles are constructed from multiple alignments of protein families and contain information about conserved differences in amino acid residues as well as deletions and insertions [49]. This is particularly important for a coral database, as corals are evolutionarily distant from most other organisms. As a result, known homologous sequences present in the databases will usually have relatively low similarity to coral sequences, making BLAST searches inaccurate. The statistical information in an HMM profile gives more sensitive and accurate detection of sequence homology. An additional advantage of HMM profiles is that the statistical significance of hits (the expected value) is much more accurate than that calculated by BLAST programs.
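The idea that a profile captures position-specific residue conservation from a multiple alignment can be illustrated with a deliberately simplified sketch. It is not the method used to build ZoophyteBase: it models only per-column residue preferences as log-odds scores against a uniform background, and omits the insertion and deletion states that a real HMM profile contains. The function names, the pseudocount and the toy alignment are assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_profile(alignment, pseudocount=1.0):
    """Build per-column log-odds scores from an ungapped alignment.

    Simplification (assumption): uniform background frequencies and
    no insert/delete states, unlike a full profile HMM.
    """
    background = 1.0 / len(AMINO_ACIDS)
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    profile = []
    for col in range(len(alignment[0])):
        counts = Counter(seq[col] for seq in alignment)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile

def score(profile, seq):
    """Sum the column scores for a sequence of the same length as the profile."""
    return sum(column[aa] for column, aa in zip(profile, seq))

# Toy alignment of a hypothetical protein family (not real data).
toy_profile = build_profile(["ACDE", "ACDE", "ACDF"])
family_like = score(toy_profile, "ACDE")
unrelated = score(toy_profile, "WWWW")
```

In this toy case a sequence matching the conserved columns scores above zero and an unrelated sequence scores below zero, which is the property that lets a profile detect distant homologues that a single pairwise comparison might miss.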
The quality of sequence annotation is limited by the accuracy of information provided in any database used. It is well known that there are many problems with annotation in the large uncurated databases such as the NCBI GenBank nr sequences. Widely accepted, the most accurate database for functional annotation is the KEGG database [51]. The KEGG database organises sequences as groups of KEGG orthologues. These are sets of homologous sequences from as wide a range of organisms as possible having an assigned molecular function. These functions are arranged in a hierarchical fashion and grouped in biological pathways. The sequences belonging to KEGG orthologues were used to construct HMM profiles for annotating the coral sequences. Accordingly, the 23,524 predicted proteins encoded in the coral genome were analysed using HMM profiles. If a protein showed a highly significant correlation ("hit") to a single HMM profile, this was used to create a "trusted" annotation of the sequence. Choosing a cut-off for this criterion is not trivial, because longer sequences tend to have more significant e-values. For construction of ZoophyteBase the criterion 1e-5 was used. This resulted in 19,044 predicted proteins giving "trusted" sequence annotation. For many of these proteins there were two or more highly significant hits to established HMM profiles. In these cases, the most significant correlation was used to construct our "best-fit" annotation file, but other hits can be viewed by the database user so that expert knowledge can be employed to override the automatic annotation function. In 8,004 out of 19,044 predicted proteins which were annotated, more than one annotation was assigned based on nonoverlapping regions within the protein which were used to construct the "best-fit" annotation file. We interpreted these as "fusion" events generated by the in silico protein prediction method used, and these proteins were treated as multiple instead of single encoded proteins. 
Hence, this analysis resulted in the annotation of 33,195 proteins in total, generated from the original 23,524 predicted coral proteins. This is a very conservative annotation scheme, so it can be assumed that most of the annotations are biologically meaningful. Almost 81% (19,044 out of 23,524) of the predicted proteome was assigned using this method.
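The selection logic described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the MEGGASENSE implementation: the `Hit` record, the protein and profile identifiers, the inclusive comparison against the cut-off, and the greedy choice of non-overlapping regions in order of significance are all hypothetical. Only the 1e-5 criterion and the protein counts come from the text.

```python
from collections import defaultdict
from typing import NamedTuple

class Hit(NamedTuple):
    protein: str    # predicted protein identifier (hypothetical)
    ko: str         # KEGG orthologue profile identifier
    evalue: float   # expected value of the HMM hit
    start: int      # first matched residue, 1-based inclusive
    end: int        # last matched residue, inclusive

E_CUTOFF = 1e-5  # criterion stated in the text

def overlaps(a, b):
    """True if two hits cover at least one common residue."""
    return a.start <= b.end and b.start <= a.end

def best_fit(hits, cutoff=E_CUTOFF):
    """Keep trusted hits; per protein, take the most significant hit and
    any further hits on non-overlapping regions (assumed greedy rule)."""
    trusted = defaultdict(list)
    for hit in hits:
        if hit.evalue <= cutoff:
            trusted[hit.protein].append(hit)
    annotation = {}
    for protein, protein_hits in trusted.items():
        chosen = []
        for hit in sorted(protein_hits, key=lambda h: h.evalue):
            if not any(overlaps(hit, kept) for kept in chosen):
                chosen.append(hit)
        annotation[protein] = sorted(chosen, key=lambda h: h.start)
    return annotation

def coverage_percent(annotated, total):
    """Percentage of the predicted proteome that received an annotation."""
    return 100.0 * annotated / total

# Counts reported in the text: 19,044 of 23,524 predicted proteins.
reported_coverage = coverage_percent(19044, 23524)
```

A protein that ends up with more than one entry in its list corresponds to the putative "fusion" case, in which each non-overlapping region is counted as a separate annotated protein. Applied to the reported counts, `coverage_percent` gives a value just under 81%, in agreement with the figure quoted above.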

Utility
The MEGGASENSE system was used to generate a web interface for ZoophyteBase. The home page (Figure 1A) allows the use of several functions. A text version of the entire annotation can be downloaded for manual inspection. There is a proteome overview that gives statistics about the database and a breakdown of the annotated functions into different categories of genes. A particularly useful feature of ZoophyteBase is the ability to use text queries employing a search engine that returns relevant results even in the absence of an exact match between the key words of a search and those described for a functional protein. The search engine uses text from the KEGG database, PubMed and other sources to establish links between query words and protein data using an intelligent Google-like search engine implemented by the search platform Lucene/Solr [52]. This helps to overcome the common problem that different terminology is used by different groups of researchers. The use of this search function is illustrated by using the query "phagocytosis" (Figure 1B). This inquiry finds 42 hits to KEGG orthologue profiles. One of the hits corresponds to amphiphysin (a synaptic vesicle protein) with annotation of two protein homologues encoded in the coral genome. On the data page there is a brief description of the function of amphiphysin together with a PubMed literature reference. The sequences of the predicted coral proteins (Figure 1C) can be retrieved, and it is also possible to analyse such data with computer-aided drug design methods [53] to look for conserved domains. There are also two tools for the user to examine matches to protein sequences. The user can carry out a BLAST search against the coral protein sequence or analyse the predicted sequence against HMM profiles used to annotate the coral proteome. These tools require the user only to paste their query into the sequence window.

Regulatory proteins of symbiosis
Metabolic cooperation is a key feature of coral-algal symbiosis that allows reef-building corals to inhabit the often nutrient-poor waters of tropical oceans [54]. In this phototropic symbiosis, fixed carbon produced by resident algae is released to the host for nutrition, and the algal symbionts benefit by acquiring the inorganic nutrient wastes of host metabolism [2,55]. The symbiotic dinoflagellates reside and proliferate within a specialised phagosome (the symbiosome) maintained within host gastrodermal cells. This arrangement requires complex biochemical coordination by the coral at various metabolic stages that includes endocytosis (phagocytosis) by post-settlement polyps to acquire algal symbionts, symbiosome recognition to arrest phagosomal maturation for sustained organelle homeostasis, activation of symbiophagy or exocytosis to eliminate damaged symbionts [56,57], and regulation of apoptotic or exocytotic pathways to remove excess or impaired populations, all of which have long been recognised as essential to preserve the stability of coral symbiosis [58]. Although these processes are poorly understood in corals, it has been realised from studies of the sea anemone Aiptasia pulchella, a related anthozoan also containing Symbiodinium sp. endosymbionts, that the persistence of algal-containing symbiosomes in Cnidaria relies on the exclusion or retention of small Rab GTPase family proteins that are key regulatory components of vesicular trafficking and membrane fusion in eukaryotic cells [59]. Significantly, ApRab3 and ApRab4 accumulate in the biogenesis of maturing symbiosomes of A. pulchella [60,61], and mature symbiosomes enveloping healthy dinoflagellates have tethered ApRab5 [62], a checkpoint antagonist of downstream ApRab7 and ApRab11 proteins that would otherwise direct autophagy of the symbiont cargo [63,64].

Figure 1 Graphical overview of the user-web interface for ZoophyteBase during a typical search. The home page allows several search functions (A). A text query using an intelligent Google-like search engine is illustrated by using the query "phagocytosis" (B). This finds 42 hits to KEGG orthologue profiles. One of the hits corresponds to amphiphysin with annotation of two protein homologues encoded in the coral genome. On the data page there is a brief description of the function of amphiphysin together with a PubMed literature reference. The sequences of the predicted coral proteins can be retrieved (C).
Our annotation of the A. digitifera genome reveals sequences encoding putative Rab homologues of the Ras superfamily of proteins (Table 1). In a comparison of cnidarian Rab proteins, eight proteins of A. digitifera matched homologues of Aiptasia pulchella, twenty-nine matched proteins encoded by the aposymbiotic freshwater H. magnipapillata and the aposymbiotic anemone N. vectensis genomes, while seven Rab and Rab-interacting proteins of A. digitifera did not match other cnidarian proteins (Table 2). Significantly, the eight homologues of A. digitifera that matched exclusively Rab proteins of A. pulchella included homologues of the aforementioned ApRab3, ApRab4 and ApRab5 proteins attributed to the maintenance of healthy symbiosomes in Aiptasia, while homologues of the autophagic ApRab7 and ApRab11 proteins are found also in N. vectensis. While Rab GTPase proteins and their effector proteins coordinate consecutive stages of endocytic vesicular transport [65,66], soluble N-ethylmaleimide-sensitive factor attachment receptor (SNARE) proteins are essential for Rab assembly to complete endosomal fusion of vesicle membranes [67], a process by which Rab proteins impart specificity by binding distinct Rab and SNARE partner proteins prior to membrane fusion [68]. Genes encoding syntaxin-like SNARE proteins have been unambiguously identified [69] from coral EST database libraries constructed from expressed mRNA isolated from various early life stages of Acropora aspera, A. millepora, A. palmata and Orbicella faveolata (= Montastraea faveolata), as well as from the genome of the sea anemone N. vectensis [70]. In metazoans, vacuolar r-SNARE receptor proteins comprise the syntaxin, synaptobrevin and VAMP family proteins, of which there are eight syntaxin and syntaxin-binding proteins (plus two plant-like syntaxins).
Additionally, there is one t-SNARE target protein to direct vacuolar morphogenesis, two synaptosomal proteins, one synaptosomal complex ZIP1 protein (yeast homologue), one synaptobrevin membrane protein of secretory vesicles, ten vesicle-associated membrane proteins (VAMPs), a vacuolar protein-8 regulator of autophagy, four vacuolar-sorting proteins and two SEC22 vesicle trafficking proteins encoded in the genome of A. digitifera (Table 1), many of which may interact to provide metabolic transport between the endoplasmic reticulum and Golgi apparatus [71]. Included in this vast but as yet unexplored repertoire of vacuolar-acting proteins are the syntaxin-binding amisyn and tomosyn regulators of SNARE complex assembly and disassembly [72,73], which may control membrane fusion in the phagocytic establishment and dissociation of coral symbiosis.
In the final step of exocytosis there is a cytosolic influx of calcium which binds to synaptotagmin to actuate completion of membrane SNARE protein assembly with exocytic docking to form the conducting channel for trans-membrane vesicular transport on activation by vesicle-fusing ATPase [74]. As synaptotagmin proteins are not included in the KEGG database, ZoophyteBase was used for BLAST searches with all known synaptotagmin sequences [27]. Synaptotagmin proteins from A. digitifera were found having similarity to homologues from diverse invertebrate and vertebrate organisms, including one from the human genome (Table 3). Other Ca2+-sensing proteins of A. digitifera, such as calmodulin and the calcium binding protein CML, are given with calcification and Ca2+-signalling proteins.
Intriguingly, annotation of the A. digitifera genome reveals a host cell factor (K14966), but this is not related to the elusive "host factor" of symbiosis demonstrated to be present in tissue homogenates of corals and other marine invertebrates that harbor Symbiodinium spp. endosymbionts [75][76][77]. Instead, this mammalian transcriptional coactivator host cell factor (HCF-1) is known to mediate the enhancer-promoter assemblies of herpes simplex (HSV) and varicella zoster (VZV) viruses for activation of the latent state for replication [78], such that the coral HCF homologue may have similar relevance as a viral checkpoint transcriptional coactivator of virulence in A. digitifera. HCF-1 expression is coupled also to chromatin modification [79,80] suggesting that the coral protein homologue may have an additional role in

Table 1 Regulatory proteins of symbiosis in the predicted proteome of A. digitifera

Planula and early developmental proteins
In this section we discuss predicted proteins encoded in the A. digitifera genome having functional homology to known proteins that are specific to early embryonic development, planula larvae function and morphogenesis, which are given in Table 4. Annotation of the coral genome reveals a large set of homeobox proteins involved in the regulation of anatomical development during morphogenesis. The homeobox is a highly conserved DNA sequence within genes that encodes a protein domain (the homeodomain) that binds to DNA in a sequence-specific manner [81], often at the promoter region of the target gene, to affect transcription in the developing embryo. Amongst these transcriptional regulators, Hox genes are essential to metazoan development as their expressed proteins differentiate embryonic regions along the anterior-posterior axis (the Hox code) and are recognised for their contribution to the evolution of morphological diversity [82]. Hox genes are well characterised in cnidarians and, given their importance in embryonic development, it is not surprising that molecular evidence from the Cnidaria reveals that the genetic origins of Hox genes predate the cnidarian-bilaterian divergence [83][84][85] yet had evolved after divergence of the sponge and eumetazoan lineages [86]. Hox genes of cnidarians are typically located in a conserved genomic collinear cluster, which is apparent also for A. digitifera, whereby the order of the genes on the chromosome is the same as that of gene expression in the developing embryo. Included in our annotation are genes encoding two LIM homeobox proteins and a LIM homeobox transcription factor (Lhx) having conserved roles in neuronal development [87], which in N. vectensis are responsible for the development of neural networks in developing larvae and juvenile polyps [88]. Unlike N. vectensis [89], the coral genome expresses a homeobox BarH-like protein that in vertebrates directs neurogenesis [90].
Distinct from homeodomain proteins, but serving similar functions, are various protein activators, regulators and receptors of cellular morphogenesis. Annotation of the coral genome has revealed multiple sequence alignments to a protein homologue of the dishevelled-associated activator of [92], and in mammals Daam1 is highly expressed in multiple developing organs and is deemed essential for cardiac morphogenesis [93]. Similar morphogenetic genes express regulatory proteins that are necessary for vacuole biogenesis in yeasts [94]. Others express bone morphogenetic proteins (and their BMP receptors), which are potent multi-functional growth activators that belong to the transforming growth factor beta (TGFbeta) cytokine superfamily of proteins that in humans have various functions during embryogenesis, skeletal formation, neurogenesis and haematopoiesis [95]. However, since many of the homeobox and morphogenetic proteins (Table 4) are homologues of proteins with functions ascribed to higher organisms, their precise function in A. digitifera cannot be ascertained by KEGG orthology alone. Another protein encoded in the A. digitifera genome is a retina and anterior neural fold homeobox-like (RAX) protein that may activate the development of primitive coral photoreceptors [96,97], including a blue light-sensing, cryptochrome photoreceptor that in A. millepora is implicated in the detection of light from the lunar cycle of night time illumination to signal synchronous coral spawning [98,99]. Photosensitive behaviours and the circadian rhythms of corals are well described, and diurnal cycles of gene transcription that regulate circadian biological processes in the coral A. millepora have been reported [100]. Such traits in A. millepora appear regulated by an endogenous biological clock entrained to daily cycles of solar illumination [101]. Annotation of the A. digitifera genome reveals a circadian timekeeper protein KaiC [102] that in cyanobacteria is activated during the diurnal phosphorylation rhythm [103,104]. In Synechococcus elongatus, KaiC regulates the rhythmic expression of all other proteins encoded in the genome [105], yet no homologue of any of the prokaryotic clustered circadian kaiABC genes has been identified in eukaryotes [106]. In Drosophila, KaiC together with a homologue of the eukaryotic period (Per) circadian protein drives circadian rhythms in eclosion (hatching) and locomotor activity [107]. Nevertheless, a circadian locomotor output cycles kaput (CLOCK) homologue (Table 4) was found in our annotation. Since CLOCK proteins serve as an essential activator of downstream elements in pathways critical to the regulation of circadian rhythms in eukaryotes [108], it would be worthwhile to examine how transcription of the RAX-like homeobox protein in this coral contributes to the development of circadian functions by activation of kaiC, per and Clock genes. Such a study might reveal that components of the animal circadian clock are more ancient than data previously suggested [109].
Broadcast-spawning corals, such as A. digitifera, release gametes, and the fertilised eggs develop into planula larvae within the water column until they have reached settlement competency, find a suitable hard substrate, attach and develop into the polyp on metamorphosis. Coral sperm and planula larvae achieve motility using flagella (sperm) or cilia (larvae) as their locomotor organelles. The eukaryotic axonemal proteins of cilia and flagella are composed of a dynein ATPase protein to provide mechanochemical energy transduction together with the principal structural proteins of the ciliary/flagellar microtubules [110]. The flagellar/ciliary microtubules consist of filaments composed of α- and β-tubulins, microtubule-stabilising tektins and kinesin motor proteins [111][112][113]. The coral genome encodes members of the dynein axonemal (flagella and cilia) proteins (Table 4) and many of the dynein cytoplasmic proteins (not tabulated), the latter being involved in intracellular organelle transport and centrosome assembly. The coral genome encodes α- and β-tubulins and members of the eukaryotic kinesin superfamily proteins (not tabulated). Amongst the many kinesin proteins encoded in the coral genome is the kinesin family member 3/17 protein, which is a direct homologue of the kinesin-II intraflagellar transport protein FLA10 essential for flagella assembly in the alga Chlamydomonas [114]. The microtubule-stabilising tektin protein, which is required for cilia and flagella assembly [113], is also encoded in the coral genome [note: there is no KEGG orthology identifier assigned to this protein]. It was a surprise, however, to find a large complement of (Table 4). Included also are the prokaryote homologues FlgN and FlbB that regulate transcriptional activation of flagellar assembly [115,116] and FlhB which controls the substrate specificity of the entire prokaryotic flagellar apparatus [117].
Encoded in the coral genome is a flagella-independent Type IV twitching motility protein PilT that affords social gliding translocation in many prokaryotic organisms controlled by complex signal transduction systems that include two-component sensor regulators [118]. It is unlikely that these genes are derived from contamination by bacterial DNA. Such contamination would manifest itself by the random occurrence of bacterial genes from the whole genome, including many housekeeping genes. In this case, the genes occur as members of groups with specialised functions, suggesting that multiple horizontal gene transfers between bacteria and the coral genome have occurred [119]. Their precise function in A. digitifera remains unknown; homologues of these prokaryotic genes have not been described previously in any other eukaryote genome. Linked closely with flagellar/ciliary proteins are the sensory receptors that signal chemoattraction or avoidance to direct cellular motility. The coral genome reveals a variety of genes that encode chemoreceptor and chemotaxis proteins (Table 4). The chemoreceptor proteins of A. digitifera include an oxygen-sensing aerotaxis receptor that in bacteria invokes an avoidance response to anoxic micro-environments [120]. Encoded also are a nematode sensory chemoreceptor homologue [121], two homologous pheromone factor receptor proteins that in fungi activate a species-specific mating response [122], three chemotaxis protein sensor receptors belonging to the methyl-accepting chemotaxis family of proteins (MCPs) in bacteria and archaea [123], and two proteins (CheZ and CheR) and two regulators (PixG and WspE) of the two-component signal transduction (TCST) system for activation of gene expression.
In bacteria and archaea, as well as some plants, fungi and protozoa [124], TCST systems mediate many cellular processes that respond to a broad range of environmental stimuli via activation of a specific histidine (or serine) kinase sensor and its cognate response regulator [125]. There are 77 sequence matches to various elements of the TCST family of proteins in the A. digitifera genome (data not tabulated). Included also are genes encoding members of the chemotactic cytokine (chemokine) family of sensory proteins that on secretion direct chemotaxis in nearby responsive cells by stimulating target chemokine receptors; both chemokine and chemokine receptor proteins are encoded in the coral genome. Significantly, sensory chemokines/chemokine receptors are found in all vertebrates, some viruses and some groups of bacteria, but none have been described previously for invertebrates [126].

Neural messengers, receptors and sensory proteins
Corals and other cnidarians are the earliest extant group of organisms to have a primitive nervous system network [127] thought to be evolved from a eumetazoan ancestor prior to the divergence of Cnidaria and the Bilateria [128,129]. Unlike marine sponges (Porifera) that predate synaptic innovation [130], cnidarians possess a homogenous nerve net that, although lacking any form of cephalization, accommodates fundamental neurosensory transmission across the nerve net to end in a motoneural junction to coordinate tentacle movement required for feeding and predator avoidance [131]. The nervous systems of cnidarians consist of both ectodermal sensory cells and their effector cells and endodermal multipolar ganglions capable of neurotransmission [132]. At the functional level, synaptic transmission in cnidarians relies on fast neurotransmitters (glutamate, GABA, glycine) and slow neurotransmitters (catecholamine, serotonin, neuropeptides) for sensory-signal conduction [133]. At the ultrastructural level, many cnidarian neurons have multifunctional traits of sensory, neurosecretory and stimulatory attributes [134]. Significantly, the genome of A. digitifera encodes the expression of a ciliary neurotrophic factor, which is a polypeptide hormone and nerve growth factor that promotes neurotransmitter synthesis, neurite outgrowth and regeneration [135]. Additionally, the coral genome encodes nerve growth factor and neurotrophic kinase receptors, a survival motor neuron protein, a survival neuron splicing factor, the neural outgrowth protein neurotrimin, and a neurotrophin growth factor attributed to signalling neuron survival, differentiation and growth (Table 5). 
Encoded for neuron regulation and development are several neuron cation-gated channels, a neuronal guanine nucleotide exchange factor, a neurotransmitter Na+ symporter, several neurogenic differentiation proteins, a neuronal PAS domain transcription factor for activation of neurogenesis, the axon guidance protein neuropilin-2, a neural crest protein of embryonic neural development, neural ELAV-like transcription proteins of neurogenesis, a Notch protein (79 sequence domain matches) and a neuralized protein subset of the Notch signalling pathway that promotes neuron proliferation in early neurogenic development. Structural elements of the coral nerve net include neurofilament polypeptides and neuronal adhesion proteins.
Cnidarians differentiate highly specialised sensory and mechanoreceptor cells involved in the capture of prey and for defence against predators. Their stinging cells, termed nematocysts or cnidocytes, are stimulated by adjacent chemosensory cells. Nematocysts trigger the release of a stinging barb (cnidae tubule) via ultra-fast exocytosis on physical contact with ciliary mechanoreceptors of the cnidocyte to deliver the discharge of its venom [136]. Despite considerable advances in the sensory biology of cnidarians, knowledge of the specific receptor genes that regulate cnidocyte function remains incomplete. In Hydra, and perhaps other cnidarians, cnidocyte discharge is controlled by an ancient light-activated, opsin-mediated phototransduction pathway [137] that precedes the evolution of cubozoan (box jellyfish) eyes [138]; cubozoans are the most basal of animals to have eyes containing a lens and ciliary-type visual cells similar to those of vertebrate eyes [139]. The G protein-coupled opsin photoreceptors of the retinylidene-forming protein family encoded in the genome of A. digitifera include rhodopsin, bacteriorhodopsin, c-opsin, r-opsin and Go-opsin (Table 5), but not the Gs-subfamily of opsin receptors reported to be present in sea anemones, hydra and jellyfish [140], that together with cyclic nucleotide-gated (CNG) ion channel proteins, arrestin (β-adrenergic receptor inhibitor) and other retino-protein receptors, are usual components of the bilaterian phototransduction cascade. Present also are genes to express rhodopsin kinase and β-adrenergic receptor kinase, which are related members of the serine/threonine kinase family of proteins that specifically initiate deactivation of G protein-coupled receptors.
Additional proteins of retinol metabolism of the phototransduction pathway encoded in the A. digitifera genome are retinol dehydrogenase, all-trans-retinol 13,14-reductase and phosphatidylcholine (lecithin)-retinol O-acyltransferase, together with a neural retina-specific leucine zipper protein that is an intrinsic regulator of photoreceptor development and function, and a retina and anterior neural fold homeobox-like protein that modulates the expression of photoreceptor genes within the rhodopsin promoter. The genome of A. digitifera encodes also a blue light-sensing cryptochrome photoreceptor thought to signal synchronous coral spawning by detecting illumination from the lunar cycle [98,99].
The A. digitifera genome reveals genes to express a broad array of neurotransmitter receptor proteins (Table 5), including glycine and glutamate neuroreceptors, adrenergic receptors that target non-dopamine catecholamines (i.e., epinephrine and norepinephrine), dopamine, muscarinic and nicotinic acetylcholine receptors, sensory G protein-coupled receptors and γ-aminobutyric acid (GABA) ligand-gated ion channel and G protein-coupled receptors (and inhibitors), several of which are encoded in high copy numbers. Cellular trafficking of neurotransmitters to presynaptic terminals is essential for neurotransmission, and significantly the genome of A. digitifera encodes a wide range of solute carrier neurotransmitter transporters, including a high-affinity choline transporter and an acetylcholine-specific protein belonging to the major facilitator superfamily (MFS) of secondary transporters. Encoded also is dopamine β-monooxygenase, which catalyses the conversion of dopamine to norepinephrine in the catecholamine biosynthetic pathway and is necessary for cross-activation of adrenergic neuroreceptors [141]. Notably, the A. digitifera genome encodes acetylcholinesterase, which is expressed at neuromuscular junctions and cholinergic synapses where its esterase activity serves to terminate synaptic transmission.
The primitive nervous networks of cnidarians are strongly peptidergic, with at least 35 neuropeptides identified from different cnidarian classes [142]. Our annotation of the sequenced A. digitifera genome, however, revealed only the neuropeptide FF-amide neurotransmitter, an RFamide-related peptide, and its neuropeptide FF and Y receptors (Table 5). Neuropeptides are usually expressed as large precursor proteins which comprise multiple copies of "immature" neuropeptides. Our annotation did not readily reveal these precursor neuropeptide proteins, but we did find enzymes required for their processing, for example, a variety of carboxypeptidase enzymes (not tabulated) that remove propeptide carboxyl residues at basic peptidase sites, and the enzymes that finish the mature peptide neurotransmitters by consecutive modification, namely peptidylglycine (α-hydroxylating) monooxygenase (PHM) and peptidyl α-hydroxyglycine α-amidating lyase (PAL), both of which are commonly expressed in mammals as a single bifunctional peptidylglycine monooxygenase (K00504/EC 1.14.17.3) [143]. Our extensive catalogue of animal-like neural and sensory proteins revealed by genome annotation is testament that essential neurobiological features were developed in the primitive neural networks of early eumetazoan evolution.

Calcification and Ca2+-signalling proteins
The massive structures of coral reefs evident today are a construction of aggregated calcium carbonate deposited over long geological time by scleractinian corals and other calcifying organisms, yet our understanding of the molecular mechanisms that regulate coral calcification is limited [144]. Ca2+ transfer from seawater to the calicoblastic site of coral calcification occurs by passive diffusion through the gastrovascular cavity [145] and by active calcium transport [146]. Active entry of Ca2+ through the oral epithelial layer is regulated by voltage-dependent calcium channels, such as demonstrated by the L-type alpha protein cloned from the reef-building coral Stylophora pistillata [147]. Ca2+ transport across the calicoblastic ectoderm to the extracellular calcifying site is facilitated by the plasma membrane ATP-dependent calcium pump that in S. pistillata resembles the Ca2+-ATPase family of mammalian proteins [148]. By 2H+/Ca2+-exchange at the calicoblastic membrane, Ca2+-ATPase removes H+ (from the net reaction Ca2+ + CO2 + H2O ⇒ CaCO3 + 2H+), thereby increasing the saturation state of CaCO3 to sustain calcium precipitation [146]. Importantly, located also at the calicoblastic membrane is carbonic anhydrase [149], which is required to catalyse the intermediate step of calcification by the reversible hydration of carbon dioxide (CO2 + H2O ⇌ HCO3- + H+). In coral phototrophic symbiosis, despite numerous studies describing the well-known phenomenon of light-enhanced calcification, the relationship linking symbiont photosynthesis to coral calcification has been elusive [150,151]. Nonetheless, efforts to better understand the calcifying response of scleractinian corals to environmental change and ocean acidification are gaining traction [149,152,153].
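The proton bookkeeping in the paragraph above can be written out step by step. The block below is a sketch, not a mechanism taken from the cited studies: the second line (precipitation from bicarbonate) is an assumed intermediate chosen so that the steps sum to the net reaction quoted in the text, and the saturation state Ω with solubility product K_sp is the conventional textbook definition rather than a quantity defined in this article.

```latex
\begin{align}
\mathrm{CO_2 + H_2O} &\rightleftharpoons \mathrm{HCO_3^- + H^+}
  && \text{(carbonic anhydrase)} \\
\mathrm{Ca^{2+} + HCO_3^-} &\rightarrow \mathrm{CaCO_3 + H^+}
  && \text{(assumed precipitation step)} \\
\mathrm{Ca^{2+} + CO_2 + H_2O} &\rightarrow \mathrm{CaCO_3 + 2\,H^+}
  && \text{(net reaction)} \\
\Omega &= \frac{[\mathrm{Ca^{2+}}]\,[\mathrm{CO_3^{2-}}]}{K_{sp}}
  && \text{(saturation state)}
\end{align}
```

Read this way, each CaCO3 deposited releases two protons, which matches the 2H+/Ca2+ exchange attributed to Ca2+-ATPase: exporting those protons while importing Ca2+ raises both terms in the numerator of Ω at the calcifying site.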
Voltage-gated calcium channels (VGCCs) have been examined extensively in mammalian physiology for converting membrane potential into intracellular Ca2+ transients for signalling transduction pathways (reviewed in [154]). VGCC signalling affects cellular processes that include muscle contraction, neuronal excitation, gene transcription, fertilisation, cell differentiation and development, proliferation, hormone release, activation of calcium-dependent protein kinases, cell death via necrosis and apoptosis pathways, phagocytosis and endo/exocytosis. Remarkably, annotation of the genome of A. digitifera reveals sequences encoding homologues of all the VGCC (α, αδ, β, and γ) subunits of the molecular (L,  (Table 6). There are multiple sequences encoding three variants of Ca2+-transporting ATPase, of which at least one is necessary for coral calcification. There is only one sequence match for expressing carbonic anhydrase in the genome of A. digitifera, which may reflect the high catalytic efficiency of this calcifying enzyme [155], although a BLAST search of ZoophyteBase does reveal scaffolds with low e-values which on future experimental inspection might uncover multiple copies of this enzyme essential for calcification. There are multiple sequences that express solute carrier Na+/Ca2+- and Na+/K+/Ca2+-exchange families of transport proteins that with expression of the coral Ca2+/H+-antiporter may regulate cellular pH and Ca2+ homeostasis. Implicit to coral calcification is Ca2+ regulation that affects signalling of other vital cellular functions. Cellular Ca2+ is mediated by the calcium-sensing receptor calmodulin (18 sequence matches) and other messenger calcium-binding effectors (Table 6), including the calcium-binding protein CML (40 protein domain sequence matches).
Calcium/calmodulin-protein kinase proteins are arguably key to Ca2+-signalling in coral symbiosis but, with the exception of activation of sperm flagellar motility [156], their precise role has not been elaborated.
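The sequence-match counts quoted in this section (for example, 18 matches for calmodulin, or a single match for carbonic anhydrase) are tallies of search hits per annotation. The Python sketch below illustrates one way such a tally could be made from a table of hits. It is not the pipeline used to build ZoophyteBase: the `Hit` record, the `count_matches` function, the placeholder orthology labels and the e-value cut-off of 1e-5 are all assumptions introduced for illustration.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Hit(NamedTuple):
    """One search hit (hypothetical record layout, not a ZoophyteBase schema)."""
    protein_id: str  # identifier of the predicted protein
    ko_label: str    # orthology label assigned to the hit (placeholder strings here)
    evalue: float    # e-value reported by the search tool


def count_matches(hits: Iterable[Hit], max_evalue: float = 1e-5) -> Counter:
    """Count distinct proteins per orthology label among hits passing the cut-off.

    The default cut-off is an assumed, illustrative value.
    """
    seen = set()
    counts = Counter()
    for hit in hits:
        if hit.evalue > max_evalue:
            continue  # discard weak hits
        key = (hit.protein_id, hit.ko_label)
        if key in seen:
            continue  # count each protein once per label
        seen.add(key)
        counts[hit.ko_label] += 1
    return counts


# Made-up example data; identifiers and e-values are not taken from the paper.
example_hits = [
    Hit("prot_001", "calmodulin", 1e-40),
    Hit("prot_002", "calmodulin", 3e-12),
    Hit("prot_002", "calmodulin", 5e-9),          # second domain hit, same protein
    Hit("prot_003", "carbonic_anhydrase", 2e-3),  # fails the cut-off
    Hit("prot_004", "carbonic_anhydrase", 1e-20),
]

match_counts = count_matches(example_hits)
```

Relaxing `max_evalue` would admit weaker hits, the kind of borderline candidates that the text suggests may hide additional carbonic anhydrase copies pending experimental inspection.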

Plant-derived proteins
Endosymbiosis has contributed greatly to eukaryotic evolution, most notably to the genesis of plastids and mitochondria derived from prokaryotic antecedents. Genetic integration by endosymbiont-to-host transfer (EGT) or replacement (EGR) has been a significant force in early metazoan innovation, whereby nuclear transferred genes may even adopt novel functions in the host cell or replace existing versions of the protein that they encode [157]. Prokaryote-to-eukaryote gene transfer has been widespread in evolution, but examples of genetic exchange between unrelated eukaryotes, such as between algal symbionts and their multicellular eukaryote host, are considered rare (reviewed by [158,159]). One such example is aroB (3-dehydroquinate synthase) transferred to the genome of the sea anemone N. vectensis, whose sequence best fits that of the dinoflagellate Oxyrrhis marina [119]. Close inspection of the amino acid sequence of the aroB gene product, as reported by Shinzato et al. [45], clearly shows this protein to be 2-epi-5-epi-valiolone synthase (EVS), a sugar phosphate cyclase orthologue that catalyses the conversion of sedoheptulose 7-phosphate to 2-epi-5-epi-valiolone, found to be a precursor of the mycosporine-like amino acid (MAA) sunscreen shinorine in the cyanobacterium Anabaena variabilis [160]. Additionally, the EVS gene of N. vectensis has a distinctive O-methyltransferase fusion that is identical in O. marina [161]. The shikimate pathway is essential to apicomplexan parasites of the genera Plasmodium, Toxoplasma and Cryptosporidium and to Tetrahymena ciliates, which express a pentafunctional aroM gene similar to that of Ascomycetes, thought to have been conveyed by fungal gene transfer to a common ancestral progenitor [162].
In a separate example, H. viridis expresses a plant-like ascorbate peroxidase gene (HvAPX1) during oogenesis in both symbiotic and aposymbiotic individuals [163], whereby peroxidase activity is coincident with oogenesis and embryogenesis and in Hydra acts as a ROS scavenger to protect the oocyte from apoptotic degradation [164]. The sacoglossan (sea slug) molluscs Elysia chlorotica and E. viridis (Plakobranchidae) acquire plastids on ingestion of the siphonaceous alga Vaucheria litorea (termed "kleptoplasty") and, by maintaining sequestered plastids in an active photosynthetic state, have emerged as model organisms for the transfer of nuclear-encoded plant genes from algal symbiont to its animal host [165]. In this symbiosis, the family of light-harvesting genes psbO, prk (phosphoribulokinase) and chlorophyll synthase (chlG) are entrained in the genome of Elysia chlorotica (reviewed in [166,167]), although there is debate whether these genes are transcriptionally expressed (compare [168] and [169]). Also, phylogenomic analysis of the predicted proteins of the aposymbiotic unicellular choanoflagellate Monosiga brevicollis, considered to be a stem progenitor of the animal kingdom [170,171], reveals 103 genes having strong algal affiliations arising from multiple phototrophic donors [172]. Such notable examples illustrate the transfer of algal genes to animal recipients.
KEGG orthology-based annotation of the predicted proteome of A. digitifera reveals a plethora of sequences presumed to be of algal origin (Table 7). Like E. chlorotica, the coral genome has encoded the photosystem II (PSII) protein PsbO of the oxygen-evolving complex of photosynthesis, as well as the PSII light-harvesting complex protein PsbL that is important in protecting PSII from photo-inactivation [173]. Encoded also are the photosystem I subunit proteins PsaI and PsaO. Additionally encoded are the photosystem P840 reaction center cytochrome c551 (PscC) protein and the photosynthetic reaction center M subunit protein, the light-harvesting proteins complex 1 alpha (PufA), the complex II chlorophyll a/b binding protein 6 (LHCB6), the cyanobacterial phycobilisome proteins AcpF and AcpG, the phycocyanin-associated antenna protein CpcD, the phycocyanobilin lyase protein CpcF and the phycoerythrin-associated linker protein CpeS. Like E. chlorotica, the coral genome encodes chlorophyll synthase (ChlG), a chlorophyll transporter protein PucC, a light-independent nitrogenase-like protochlorophyllide reductase enzyme that is sensitive to oxygen [174] and a red chlorophyll reductase essential to the detoxification of photodynamic chlorophyll catabolites arising from plant/algal senescence [175]. Three chlorosome proteins of the photosynthetic antenna complex of green sulphur bacteria, a bacteriochlorophyll methyltransferase involved in BChl c biosynthesis [176] and the retinylidene bacteriorhodopsin of phototrophic Archaea are also encoded in the coral genome. Present are genes encoding subunit 6 of the cytochrome b6f complex that links PSII and PSI via the plastoquinone pool, together with chloroplast ferredoxin-like NapH and NapG proteins and their 2Fe-2S cluster protein.
The coral genome, however, encodes sequences for NAD+-ferredoxin  digitifera has yet to be determined, although it has been suggested from the transcriptome of Acropora microphthalma that MAA biosynthesis proceeds from a branch point at 3-dehydroquinate of the shikimic acid pathway as a shared metabolic adaptation between the coral host and its symbiotic zooxanthellae [40]. The 3-dehydroquinate synthase enzyme of the shikimic acid pathway, thought to be a key intermediate in an alternative MAA biosynthetic pathway in A. variabilis [178], is instead encoded by the fused aroKB gene of A. digitifera (Table 7). Additional shikimate proteins of the predicted proteome, although not limited to phototrophs, are shikimate kinase (AroK), quinate dehydrogenase (QuiA) and the conjoined p-aminobenzoate synthase and 4-amino-4-deoxychorismate lyase (PabBC) enzyme necessary for folate biosynthesis [179]. Other plant-related gene homologues include the phytohormone abscisic acid receptor protein (PabBC) and its cytochrome P450 monooxygenase abscisic acid 8′-hydroxylase, L-ascorbate oxidase and PTS system degrading enzymes, the unique SYP6 and SYP7 syntaxins of plant vesicular transport, tocopherol cyclase and a tocopherol O-methyltransferase enzyme that converts γ-tocopherol to α-tocopherol. Essential for carotene biosynthesis are phytoene synthase (CrtB) and phytoene dehydrogenase (CrtI) enzymes. Significantly, encoded within the coral genome is zeaxanthin epoxidase, which is essential for abscisic acid biosynthesis and is a key enzyme in the xanthophyll cycle of plants and algae to impart oxidative stress tolerance.
Given that viruses often mediate gene transfer processes, it is intriguing that certain bacteriophages of marine Synechococcus and Prochlorococcus cyanobacteria are reported to carry genes encoding the photosynthesis D1 (psbA) and D2 (psbD) proteins, a high-light inducible protein (HLIP) [180,181] and the photosynthetic electron transport plastocyanin (petE) and ferredoxin (petF) proteins thought to enhance the photosynthetic fitness of their host [182][183][184]. Accordingly, it has been suggested that the transfer of psbA by viruses associated with Symbiodinium could lessen the severity of thermal impairment to PSII and the response of corals to thermal bleaching [185]. It is not yet known whether phages or dinoflagellate-infecting viruses [186], particularly those of Symbiodinium [187], may effect gene transfer leading to complementary (or "shared") metabolic adaptations of symbiosis [119,188].

Proteins of nitrogen metabolism
It is well accepted that intracellular Symbiodinium spp. provide reduced carbon for coral heterotrophic metabolism by photosynthetic carbon fixation. Because of this metabolic relationship, light is a critical feature in the bioenergetics of coral symbiosis [189]. The algal photosynthate translocated to corals, however, is deficient in nitrogen at levels necessary to sustain autotrophic growth. While corals can assimilate fixed nitrogen from surrounding seawater [190], "recycled" nitrogen within the symbiosis may account for as much as 90% of the photosynthetic nitrogen demand [191]. It would not be surprising then that light would have a strong influence on the uptake and retention of ammonium by symbiotic corals. Consequently, corals excrete excess ammonium in darkness [192], and in light excretion is induced by treatment with the photosynthetic electron transport inhibitor 3-(3,4-dichlorophenyl)-1,1-dimethylurea (DCMU) [193]. Since ammonia is the product of nitrogen fixation, these observations suggest that the coral holobiont may fix nitrogen in the dark, or when photosynthesis is repressed, during which coral tissues are hypoxic [194] and nitrogenase activity is not inactivated by molecular oxygen [195].
Tropical coral reefs are typically surrounded by low-nutrient oceanic waters of low productivity but, paradoxically, the waters of coral reefs often have elevated levels of inorganic nitrogen [196,197] attributed to high rates of nitrogen fixation. While nitrogen fixation from diazotrophic epiphytes of the coral reef substrata and sediments [197,198] and diazotrophic bacterioplankton of the coral reef lagoon [199] provides substantial quantities of fixed nitrogen for assimilation by the coral reef, mass-balance estimates show this input to be less than the community's annual nitrogen demand [200]. Endolithic nitrogen-fixing bacteria are abundant in the skeleton of living corals, where they benefit from organic carbon excreted by overlying coral tissues to provide a ready source of energy for dinitrogen reduction [201]. Additionally, intracellular nitrogen-fixing cyanobacteria are reported to coexist with dinoflagellate symbionts in the tissues of Montastraea cavernosa and to functionally express nitrogenase activity [202]. Corals also harbour a diverse assemblage of heterotrophic microorganisms in their skeleton, tissues and lipid-rich mucus (reviewed in [203]), and these communities include large populations of diazotrophic bacteria [204,205] and archaea [206]. Apart from nitrogen fixation, the coral microbiota contributes to other nitrogen-cycling processes, such as nitrification, ammonification and denitrification [207,208]. We were surprised to find several nitrogen fixation and cycling proteins encoded in the genome of A. digitifera (Table 8), notably a nitrogen fixation NifU-like protein, the Nif-specific regulatory protein (NifA), the regulatory NAD(+)-dinitrogen-reductase ADP-D-ribosyltransferase protein, a nitrifying ammonia monooxygenase enzyme and nitrate reductase, which are usually expressed only by prokaryotic microorganisms.
The presence of genes encoding proteins involved in nitrogen fixation raises speculation that corals may contribute directly to, or perhaps co-regulate, certain processes that catalyse the reduction of dinitrogen (N2) to ammonia (NH3) by the enzyme nitrogenase reductase (NifH). The functional NifH enzyme is a binary protein composed of a molybdenum-iron (MoFe) protein (NifB/NifDK), or its NifEN homologue, fused with a FeMo-cofactor (FeMoco) protein [209]. While genes encoding NifB, NifDK (or NifEN) and their FeMo-cofactor do not appear in the genome of A. digitifera, a gene encoding the NifEN-like protein protochlorophyllide oxidoreductase (POR) is present (Table 8). POR has all three subunits with high similarity to the assembled MoFe nitrogenase [210], but this homologue is unlikely to be effective in nitrogen reduction [211,212] since its activity is light dependent [213], occurring when tissues are highly oxic [193]. The NifU protein encoded in the coral genome preassembles the metallocatalytic Fe-S clusters for maturation of nitrogenase [214], but its assemblage without NifS, a cysteine desulfurase needed for [Fe-S] cluster assembly [215], would be incomplete, and its pre-nitrogenase receptor is also missing. Yet, the coral does have the nifJ gene that encodes pyruvate:flavodoxin oxidoreductase required for electron transport in nitrogenase reduction [216]. The regulatory NifA protein encoded in the coral genome might activate, on stimulation by the integration host factor (IHF), transcription of nitrogen fixation (nif) operons of RNA polymerase [217], and both of these proteins are encoded in the coral genome. Additional to this transcriptional control, post-translational nitrogenase activity is controlled by reversible ADP-ribosylation of a specific arginine residue in the nitrogenase complex [218].
NAD(+)-dinitrogen-reductase ADP-D-ribosyltransferase (DraT) inactivates the nitrogenase complex while ADP-ribosylglycohydrolase (DraG) removes the ADP-ribose moiety to restore nitrogenase activity, and both of these enzymes are encoded in the coral genome. Given that the set of genes encoding essential constituent proteins of nitrogenase assembly appears incomplete, corals are unlikely to fix nitrogen per se, but the possibility that co-opted elements of the coral genome regulate nitrogen fixation by its diazotrophic consortia is a prospect worthy of exploration [219].
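For orientation, the reactions discussed in this passage can be summarised as follows. The block is a sketch based on commonly quoted textbook stoichiometry for molybdenum nitrogenase and for the DraT/DraG switch; none of it is derived from the A. digitifera annotation, and the symbols E (active dinitrogenase reductase) and E-ADPR (its ADP-ribosylated, inactive form) are labels introduced here for illustration.

```latex
\begin{align}
\mathrm{N_2 + 8\,H^+ + 8\,e^- + 16\,ATP}
  &\rightarrow \mathrm{2\,NH_3 + H_2 + 16\,ADP + 16\,P_i}
  && \text{(nitrogenase)} \\
\mathrm{E + NAD^+}
  &\rightarrow \mathrm{E\text{-}ADPR + nicotinamide}
  && \text{(DraT, inactivation)} \\
\mathrm{E\text{-}ADPR + H_2O}
  &\rightarrow \mathrm{E + ADP\text{-}ribose}
  && \text{(DraG, reactivation)}
\end{align}
```

The first line shows why a ready supply of reductant and ATP matters for dinitrogen reduction, and the last two show how a host that encodes only the regulatory pair could, in principle, switch nitrogenase activity off and on without encoding nitrogenase itself.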
Nitrifying bacteria and archaea express the enzyme ammonia monooxygenase, which converts fixed ammonia to nitrite (via hydroxylamine), and the enzyme nitrite (oxido)reductase completes the oxidation of nitrite to nitrate; both of these enzymes are entrained in the genome of A. digitifera (Table 8). The ammonia monooxygenase subunit A (amoA) of archaeal consorts has been described in nine species of coral from four reef locations [220], but the presence of amoA in the coral genome, together with encoded ammonium transport proteins, was not anticipated. Another protein of prokaryotic origin encoded in the coral genome is nitrate reductase (periplasmic, assimilatory and respiratory), the latter being required for anaerobic respiration by bacteria [221], and unlike the nitrate reductase family of sulphite oxidase enzymes in eukaryotes, the nitrate reductases of prokaryotes (K00363) belong to the DMSO reductase family of enzymes. Also encoded in the coral genome are a nitrite transporter (NirC) and a formate-dependent nitrite reductase (NrfA) required for nitrite ammonification [222]. In addition to nitrite reduction, NrfA reduces nitric oxide, hydroxylamine, nitrous oxide and sulphite, the last providing a metabolic link between nitrogen and sulphur cycling in coral metabolism. Other enzymes of nitrogen metabolism encoded in the coral genome are the carbamoyl-phosphate synthase family of enzymes [223] that catalyse the ATP-dependent synthesis of carbamoyl phosphate used for the production of urea (ornithine cycle) to provide a ready store of fixed-N in the urea-nitrogen metabolism of corals [224].
Another nitrogen source comes from glutamate dehydrogenase (GDH), which reversibly converts glutamate to α-ketoglutarate with liberation of ammonia, and as expected [225], this enzyme is encoded in the coral genome, together with the prokaryotic nitrogen regulatory protein PII of glutamine synthetase, which in bacteria is activated in response to nitrogen availability. Encoded also is histidine ammonia-lyase (histidase), which liberates ammonia (and urocanic acid) from cytosolic stores of histidine. It is now accepted that uric acid deposits accumulated by symbiotic algae provide a significant store of nitrogen for the coral holobiont [226], so it is noteworthy that the coral genome encodes urate oxidase (uricase) to catalyse uric acid oxidation to allantoin, from which urea and ureidoglycolate are produced in a reaction catalysed by allantoicase (allantoate amidinohydrolase), both of which known isoforms are present in the coral genome. Encoded in the coral genome is also urease to catalyse the hydrolysis of urea, presumably excreted by its algal symbionts, with the release of carbon dioxide and ammonia to meet the nitrogen demand of the coral holobiont during periods of low nitrogen availability. Similarly, xanthine dehydrogenase (xanthine:NAD+-oxidoreductase) acts by oxidation on a variety of purines, including hypoxanthine, to yield urate for the recycling of nitrogen in coral nutrition. Many of the aforementioned proteins of nitrogen metabolism, including Nif proteins, have been detected in the proteome of an endosymbiont-enriched fraction of the coral S. pistillata [39]. Notwithstanding consideration of the rapid diffusion rate of nitric oxide (NO) or its apparent short biological half-life [227], there is debate about the provenance of endogenously produced NO in signalling the bleaching of corals in response to environmental stress.
Elevated nitric oxide synthase (NOS) activity and NO production in algal symbionts has been attributed to the thermal stress response of corals [228,229], whereas the host is ascribed to be the major source of NO during exposure to elevated temperature [230,231]. While our annotation may not resolve this dispute, we show (Table 8) that nitric oxide synthase enzymes (Nor D, Nor E, Nor Q and an invertebrate NOS protein) are encoded in the genome of A. digitifera, together with a nitric oxide-interacting protein (NOIP) that in higher animals regulates neuronal NOS activity [232]. Nitric oxide is an intermediate of nitrite reduction catalysed by nitrite reductase (NIR), which by further reduction produces ammonia. The coral genome also encodes nitric oxide dioxygenase (NOD) that converts nitric oxide to nitrate. Accordingly, enhanced expression of NIR (NO reduction) or NOD (NO oxidation) could ameliorate the NO-signalling response of coral bleaching presumed activated by environmental stress.

DNA repair
Cellular DNA is prone to damage caused by the products of normal metabolism and by exogenous agents. Damage to DNA from metabolic processes includes the oxidation of nucleobases and strand interruptions by the production of reactive oxygen species (ROS), alkylation of nucleotide bases, hydrolysis of bases causing deamination, depurination and depyrimidination, and the mismatch of base pairs from errors in DNA replication. Damage effected by external agents includes exposure to UV light causing pyrimidine dimerization and free radical-induced damage, exposure to ionising radiation causing DNA strand breaks, thermal disruption causing hydrolytic depurination and single-strand breaks, and xenobiotic contamination causing DNA adduct formation, nucleobase oxidation and DNA crosslinking. Most of these lesions effect structural changes to DNA that alter or prevent replication and gene transcription at the site of DNA damage. Thus, recognition and repair of DNA abnormalities are vital processes essential to maintain the genetic integrity of the coral genome. Since there are multiple pathways causing DNA damage at diverse molecular sites, there are likewise diverse and overlapping processes available to repair cellular DNA damage. Of the many nuclear repair processes, photoreactivation (photolyase), base excision repair and nucleotide excision repair are the main elements for the repair of cellular DNA damage.
Exposure to sunlight is an absolute requirement for phototrophic symbiosis, but excessive exposure of corals to solar ultraviolet radiation can inflict direct damage to DNA by pyrimidine dimerization and 6-4 photoadduct formation and cause indirect damage by the production of ROS to initiate free-radical damage. While there have been abundant studies on the sensitivity of corals to solar ultraviolet radiation, only a few have examined the effects of solar UV to cause DNA damage. Photoreactivation has been shown to be an important repair pathway for reversing UV-activated DNA damage in adult coral [233] and coral planulae [234]. UV damage to DNA was first demonstrated by the detection of unrepaired cyclobutane pyrimidine dimers (CPDs) in the host tissues and algal symbionts of the coral Porites porites, in which CPDs had increased in a UV dose-dependent manner [235], whereas CPDs and 6-4 pyrimidine-pyrimidone photoadducts in the coral Montipora verrucosa holobiont were correlated inversely with levels of coral "sunscreen" protection [236]. The effects of solar UV radiation causing DNA lesions in coral have been determined by use of the comet assay [237], and UV-induced DNA damage and repair has been examined in the symbiotic anemone Aiptasia pallida [238]. The comet assay showed also that DNA lesions in coral planulae had increased on acquiring algal symbionts, presumably from greater ROS production resulting as a by-product of photosynthesis [239]. Iron-induced oxidative stress was found likewise to enhance DNA damage in the coral Pocillopora damicornis as determined by the occurrence of DNA apurinic/apyrimidinic sites caused by hydrolytic lesions [240]. Significantly, DNA damage in the host and algal symbionts of the coral Montastraea faveolata was found to occur simultaneously during thermal "bleaching" stress, and DNA damage is further enhanced on exposure to greater irradiances of solar radiation [241].
Nevertheless, despite the serious risk of unrepaired DNA damage to coral survival, the DNA repair processes of corals to mitigate the detrimental effects of environmental stress have not been adequately characterised at the transcriptome level of expression [29,242].
Our annotation of the sequenced genome of A. digitifera has revealed genes encoding a large repertoire of DNA repairing enzymes and their adaptor proteins (Table 9). Given the strong evidence reported for DNA photoreactivation in corals [233,234], it was surprising to find only one gene in single copy that encodes a sole photolyase enzyme for reversing pyrimidine dimer and 6-4 photoadduct formation. Notably, we found genes encoding 6 members of the ERCC family of nucleotide excision repair enzymes, together with the UV excision repair protein RAD23, for the repair of UV-induced DNA damage. More abundant are the DNA mismatch repair enzymes from the MLH, MSH, Mut and PMS protein families and related glycosylase/lyase proteins for repairing erroneous insertion, deletion and mis-incorporation of bases that arise during DNA replication and recombination. There is additionally a specific gene that encodes a 3′-endonuclease protein that has a preference to correct mispaired nucleotide sequences. Abundant also are other members of the RAD-family of DNA repair proteins, including 28 sequence copies of a gene encoding the RAD50 protein for DNA double-strand break repair that, together with members of the MRE, Rec, REV, Swi5/Sae3, XRCC and XRS families of recombination and polymerase proteins, have complementary roles in DNA repair. Apparent also in the genome are the DNA helicase proteins, including RuvB-like proteins, which are primarily involved in DNA replication and transcription, but assist also in the repair of DNA damage by separating double strands at affected sites to facilitate repair. Of the multiple families of ATP-dependent DNA helicase proteins encoded in the coral genome, RecQ and helicase Q predominate.
Encoded in the coral genome are 5 homologues of the DNA repair alkB proteins that reverse damage to DNA from alkylation caused by chemical agents by removing methyl groups from 1-methyl adenine and 3-methyl cytosine products in single-stand DNA. Annotated also are genes encoding DNA ligase 3 for repairing single-strand breaks, DNA ligase 4 to repair double-strand breaks, and a DNA cross-link repair 1C protein with single-strand specific endonuclease activity that may serve in a proofreading function for DNA polymerase. Taken together, expressing this arsenal of DNA protection may provide corals with limited ability to transcribe gene-encoded adaptation to a changing global environment.

Stress response proteins
Annotation of the A. digitifera genome reveals a wide assortment of thermal shock proteins, molecular chaperones and other stress response elements that are given in Table 10, excluding antioxidant and redox-protective proteins, which are described in the next section. Heat shock proteins 70 kDa, 90 kDa, 110 kDa, HspQ and HspX (the last two proteins being homologues of the bacterial heat shock factor sigma32 and α-crystallin, respectively) are encoded in the coral genome, together with several HSP gene transcription factors. HSPs play a role in various cellular functions such as protein folding, intracellular protein trafficking and resistance to protein denaturation. HSP expression is usually increased on exposure to elevated temperatures and other conditions of biotic and abiotic stress that include infection, inflammation, metabolic hyperactivity, exposure to environmental toxicants, ultraviolet light exposure, starvation, hypoxia and desiccation [243]. HSPs and chaperones are transcriptionally regulated and are induced by heat shock transcription factors [244], of which there are several encoded in the coral genome. Since HSPs are found in virtually all living organisms, it is not surprising that cnidarian hsp transcription and protein expression (HSP60, HSP70 and HSP90) have been profiled as a stress determinant [245][246][247][248][249][250] and early warning indicator of coral bleaching [251][252][253][254]. The coral genome reveals also a cold shock protein encoded by the cspA gene family, but profiling its expression with other stress response proteins activated by sub-optimum cold temperatures [255] has not been reported.
Additionally, the coral genome encodes a homologue of the universal stress protein A (UspA), a member of an ancient and conserved group of stress-response proteins [256,257], which have been studied mostly in bacteria [258] but have been described also in several plants [259] and animals, including members of the Cnidaria [260]. Usp transcripts have been quantified in the thermal stress response of the coral Montastraea faveolata [261] and its aposymbiotic embryos [262]. Another gene product of potential interest is a homologue of the oxidative-stress responsive protein 1 (OXSR1) that belongs to the Ser/Thr kinase family of proteins, as do other mitogen-stress activated protein kinases (MAPKs), that regulate downstream kinases in response to environmental stress [263] by interacting with the Hsp70 subfamily of proteins [264]. Another significant response protein encoded in the coral genome (Table 10) is a homologue of the stress-induced phosphoprotein 1 (30 domain sequence alignments), known also as the Hsp70-Hsp90 organising protein (HOP) belonging to the stress inducible (STI1) family of proteins, which is a principal adaptor protein that mediates the functional cooperation of molecular chaperones Hsp70 and Hsp90 [265,266]. It is yet to be determined if Hop1 transcription may serve as a primary indicator of environmental stress in corals. Molecular chaperones are a diverse family of proteins expressed by both prokaryotic and eukaryotic organisms that serve to maintain correct protein folding in a three-dimensional functional state, assist in multiprotein complex assembly and protect proteins from irreversible aggregation at synthesis and during conditions of cellular stress [267]. Additionally, heat shock proteins and their co-chaperones may regulate cell death pathways by inhibition of apoptosis [268].
The coral genome encodes a large number of DnaJ subfamily (J-domain) chaperones (Hsp40) that, with co-chaperone GrpE (Table 10), regulate the ATPase activity of Hsp70 (DnaK in bacteria) to enable correct protein folding [269]. The coral genome encodes homologues of the molecular chaperones HscA (specialised Hsp70), the redox-regulated chaperone Hsp33, HtpG (high temperature protein G), members of the calnexin/calreticulin chaperone system of the endoplasmic reticulum, a mitochondrial chaperone BCS1 protein necessary for the assembly of the respiratory chain complex III and a specific chaperone of trimethyl N-oxide reductase (TorA). The coral genome also encodes hypoxia-inducible factors (HIFs) that moderate the deleterious effects of hypoxia on cellular metabolism (reviewed in [270]). In the HIF signalling cascade, the alpha subunits of HIF are hydroxylated at conserved proline residues by HIF prolyl-hydroxylases, allowing their recognition for proteasomal degradation, which occurs during normoxic conditions but is repressed by oxygen depletion. Hypoxia-stabilised HIF1 upregulates the expression of enzymes principally of the oxygen-independent glycolysis pathway, and in higher animals promotes vascularisation, whereas the mammalian HIF2 paralogue regulates erythropoietin control of hepatic erythrocyte production in response to hypoxic stress [271]. The roles of HIF1 and HIF2 homologues in corals have been established, with HIF1 regulation of glycolysis critical to metabolic function during the dark diurnal anoxic state of coral respiration [193,272]. Heat shock proteins that repair unfolded or misfolded protein have a complementary function to the ubiquitin-proteasome system (ubiquitins not tabulated) that selects damaged protein for degradation [273], such that HSP chaperones and the proteasome act jointly to preserve cellular proteostasis [274,275]. Thus, several proteasome chaperones and assembly chaperones are encoded in the A. digitifera genome (Table 10).
While proteasome chaperones serve to target aberrant proteins for ubiquitination, the proteasome assembly chaperones facilitate 20S assembly for biogenesis of the multiunit 26S proteasome that is activated in response to stress [276,277], possibly by FtsJ (aka RrmJ), a well-conserved heat shock protein having novel ribosomal methyltransferase activity that targets methylation of 26S rRNA under heat shock control [278,279]. The HspQ protein encoded in the coral genome, although studied almost exclusively in bacteria, is known to stimulate degradation of denatured proteins caused by hyperthermal stress, particularly DnaA that initiates DNA replication in prokaryotes [280]. Specifically, HspQ (heat shock factor sigma32) regulates the expression of Clp ATPase-dependent protease family enzymes [281,282], of which ClpA, ClpB, ClpE, the protease adaptor protein ClpS [283] and the unfoldase ClpX protein [284] are encoded in the coral genome (Table 10). HspX is a small 16 kDa α-crystallin chaperone (Acr) protein belonging to the Hsp20 family of proteins [285] that suppresses thermal denaturation and aggregation of proteins [285]. Significantly, Acr proteins are known to bind with carbonic anhydrase [286] and may have importance in moderating stress-induced loss of calcium deposition. Thus, HspX/Acr expression may account for differences in the thermal sensitivity of calcification that vary among coral genera [287]. In a different context, HspX is attracting considerable attention for its potential to elicit long-term protective immunity against human Mycobacterium tuberculosis infection by chaperoning a host-protective antigen [288] that by extension, although as yet untested, may likewise repress virulence in the initiation and progression of microbial coral disease [289,290]. The coral genome encodes complete membership of the human sirtuin (SIRT1-7) family of NAD(+)-dependent protein deacetylases and ADP-ribosyltransferases.
Mammalian SIRT1 (a homologue of yeast Sir2) is an important regulator of metabolism, cell differentiation, stress response transcription and pathways of cellular senescence (reviewed in [291]). SIRT proteins regulate chromatin function through deacetylation of histones that promote subsequent alterations in the methylation of histones and DNA to affect, via deactivation of nuclear transcription factors and co-regulators, epigenetic control of nuclear transcription. As an NAD+-dependent enzyme, SIRT1 can regulate gene expression in response to cellular NAD+/NADH redox status, providing a metabolic template for epigenetic transcriptome reprogramming [292,293]. In the human genome repertoire, SIRT1 modulates cellular responses to hypoxia by deacetylation of HIF1α [294] and inhibits nitric oxide synthesis by suppression of the nuclear factor-kappaB (NF-κB) signalling pathway [295], SIRT2 promotes oxidative stress resistance by deacetylation of forkhead box O (FOXO) proteins [296], SIRT3 decreases ROS production in adipocytes [297], SIRT4 regulates fatty acid metabolism and stress-response elements of mitochondrial gene expression [298], SIRT5 is a protein lysine desuccinylase and demalonylase of unknown function [299], SIRT6 activates base-excision repair [300] and SIRT7 inhibits apoptosis induced by oxidative stress by deacylation of p53 [301,302]. The significance of coral SIRT proteins, by analogy, to exert stress tolerance is yet to be examined.
Metallochaperones are an important class of enzymes that transport co-factor metal ions to specific proteins [303]. The copper chaperone protein ATX1 (human ATOX1) delivers cytosolic copper to Cu-ATPase proteins and serves as a metal homeostasis factor to prevent Fenton-type production of highly reactive hydroxyl radicals. ATX1, which is strongly induced by molecular oxygen, functions additionally as an antioxidant to protect cells against the toxicity of both the superoxide anion and hydrogen peroxide [304]. Encoded also is a specific copper chaperone essential to the activation of Cu/Zn superoxide dismutase [305,306] that is enhanced by photooxidative stress in scleractinian corals [307], although reported to be less pronounced in the host than in symbiotic algae [308]. In addition to high light exposure, reef-building corals of shallow reef flats are occasionally exposed to the atmosphere for periods that can last several hours during extreme low tides. Hence, species that are adapted to withstand acute desiccation (anhydrobiosis) have a better chance of surviving such conditions. The disaccharide trehalose is an osmolyte that in some plants and animals allows them to survive prolonged periods of desiccation [309]. The hydrated sugar has high water retention and forms a gel phase when cells dehydrate, which on rehydration allows normal cellular activity to resume without the damage that would otherwise follow a dehydration/rehydration cycle. Furthermore, trehalose is highly effective in protecting enzymes in their native state from inactivation by thermal denaturation [310]. Given that A. digitifera is endemic on shallow reef flats prone to exposure at low tides [311], it is not surprising that the coral genome encodes trehalose synthase and a facilitated trehalose transporter for protection against dehydration.

Antioxidant and redox-protective proteins
Oxygen is vital for life, but it can also cause damage to cells, particularly at elevated levels. In coral symbiosis, the photosynthetic endosymbionts of corals typically produce more oxygen than the holobiont is able to consume by respiration, so that coral tissues are hyperoxic with tissue pO2 levels often exceeding 250% of air saturation during daylight illumination [193]. Furthermore, because algal symbionts reside within the endodermal cells of their host, coral tissues must be transparent to facilitate the penetration of downwelling light required for photosynthesis by their algal consorts. In clear shallow waters this entails concurrent exposure of vulnerable molecular sites of both partners to damaging wavelengths of ultraviolet radiation. The synergistic effects of tissue hyperoxia and UV exposure can cause oxidative damage to the symbiosis via the photochemical production of cytotoxic oxygen species [312] that are produced also during normal mitochondrial function [313]. Consequently, protective proteins (antioxidant enzymes) are expressed to maintain the fine balance between oxygen metabolism and the production of potentially toxic reactive oxygen species (ROS). If this balance is not maintained by regulation of oxidative and reductive processes (redox regulation), oxidative stress occurs by the generation of excess ROS, causing damage to DNA, proteins, and lipids. Corals elaborate a variety of molecular defences that include the production of UV-protective sunscreens (MAAs), antioxidants, antioxidant enzymes, chaperones and heat shock proteins, which are often inducible under conditions of enhanced oxidative stress [307], including conditions that elicit coral bleaching [314,315]. An excellent review on the formation of ROS and the role of antioxidants and antioxidant enzymes in the field of redox biology is given by Halliwell [316].
Annotation of the A. digitifera genome reveals sequences encoding two isoforms of the antioxidant enzyme superoxide dismutase (SOD) from both the Cu/Zn and Fe/Mn families of SOD (Table 11). These metalloprotein enzymes catalyse the dismutation of superoxide to yield molecular oxygen and hydrogen peroxide, the latter being less harmful than superoxide. Superoxide can oxidize proteins, denature enzymes, oxidize lipids and fragment DNA. By removing superoxide, SOD protects also against the production of reactive peroxynitrite formed by the combination of superoxide and nitric oxide, which is a precursor reactant for production of the supra-reactive hydroxyl radical. Hydrogen peroxide per se is a mild oxidant, but it readily oxidises free cellular ferrous iron to ferric iron with production of hydroxyl radicals via the Fenton reaction. Accordingly, both the removal of hydrogen peroxide and the expression of proteins, such as transferrin, (bacterio)ferritins and metallothioneins, that bind reactive (transition) metal ions are important to protect cellular components from acute oxidative damage. Oddly, only a metallothionein expression activator was found encoded in the coral genome without finding a sequence to activate transcription of the actual metallothionein protein gene.
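For reference, the reactions described in this paragraph can be written out as below. These are the standard textbook stoichiometries from general redox chemistry, given here as an illustrative summary and not as results of the annotation; the notation (radical dots, the SOD label over the arrow) is our own choice.

```latex
\begin{align*}
% Dismutation of superoxide catalysed by superoxide dismutase (SOD)
2\,\mathrm{O_2^{\bullet-}} + 2\,\mathrm{H^+}
  &\xrightarrow{\ \text{SOD}\ } \mathrm{H_2O_2} + \mathrm{O_2} \\
% Peroxynitrite formation from superoxide and nitric oxide
\mathrm{O_2^{\bullet-}} + \mathrm{NO^{\bullet}}
  &\longrightarrow \mathrm{ONOO^-} \\
% Fenton reaction: ferrous iron oxidised by hydrogen peroxide
\mathrm{Fe^{2+}} + \mathrm{H_2O_2}
  &\longrightarrow \mathrm{Fe^{3+}} + \mathrm{OH^-} + \mathrm{HO^{\bullet}}
\end{align*}
```

The first reaction consumes superoxide before it can enter the second, and the third shows why sequestering free ferrous iron limits hydroxyl radical formation even when hydrogen peroxide is present.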
As expected from the foregoing, the genome of A. digitifera encodes the antioxidant enzyme catalase (CAT) that is highly efficient in decomposing hydrogen peroxide to yield molecular oxygen and water. Two isoforms of CAT are encoded at multiple sites. One is a peroxisomal eukaryotic CAT enzyme that targets the removal of hydrogen peroxide formed as a by-product of oxidase enzymes, and the other is a related catalase domain-containing protein presumed also to decompose hydrogen peroxide.
Glutathione peroxidase (GPx) reduces both hydrogen peroxide and lipid hydroperoxides, the latter of which are formed by radical-induced lipid autoxidation. Phototrophic organisms, including higher plants, utilise ascorbate peroxidase (APx) as a primary catalyst for the reduction of hydrogen peroxide and lipid hydroperoxides. However, unlike the freshwater cnidarian H. viridis [164], there is no evidence for transfer of APx-encoding genes to A. digitifera. The antioxidant enzymes SOD, CAT, GPx and APx are well characterised in the algal and animal partners of coral symbiosis (reviewed in [317]). Additionally, the coral genome has sequences encoding alkyl hydroperoxide reductase, hydroperoxide lyase, phospholipid-hydroperoxide glutathione peroxidase, thiol peroxidase and multiple isoforms of peroxiredoxin, all of which function in the detoxification of organo-hydroperoxides that are produced as a by-product of aerobic metabolism. Additionally, sulfiredoxin (Table 11) repairs peroxiredoxins when these enzymes are inhibited by over-oxidation [318].
Thioredoxins and glutaredoxins have important secondary roles in regulating multiple pathways in many biological processes, including redox signalling of apoptotic pathways, which have been attributed to processes involved in coral bleaching [56]. Other enzymes that regulate cellular thiol-disulfide homeostasis in this coral are monothiol glutaredoxin and protein-disulfide reductase. The coral genome encodes the ubiquitous thioredoxin system of antioxidant proteins (Table 11) that act as electron donors to peroxidases and ribonucleotide reductase (the latter not tabulated). By cysteine thiol-disulfide exchange, thioredoxins function as protein thiol-disulfide oxidoreductases [319]. In the thioredoxin system, thioredoxins are maintained in their reduced state by the NADPH-dependent flavoenzyme thioredoxin reductase [320]. Peptide-methionine (R)-S-oxide reductase can additionally rescue thioredoxin from oxidative inactivation by disulfide reduction. Related glutaredoxins share many of the functions of thioredoxins but are reduced directly by glutathione, rather than by a specific reducing enzyme, while in turn glutathione is kept in its native state by NADPH: glutathione reductase.
In recent years there has been a particular focus on the role of ROS in coral bleaching, fuelled by dire predictions of future catastrophic episodes caused by environmental change driven by global warming [321]. Early predictions of coral bleaching were based principally on physical environmental parameters, rather than on determination of the physiological response of coral populations to such conditions. While gene expression markers are being developed to monitor sub-bleaching levels of stress in situ (e.g., [261]), Kenkel et al. [322] opined that the current challenge for implementing expression-based methods lies in identifying coral genes demonstrating the most pronounced and consistent stress response, preferably with a large dynamic range to enable reliable quantification. To this end, we offer in Table 11 the annotation of novel redox-related genes for examination as potential candidate biomarkers to monitor the physiological response of A. digitifera to environmental stress.

Proteins of cellular apoptosis
Apoptosis is the signalling of programmed cell death (PCD) that occurs in multicellular organisms in response to cellular injury. A key feature of apoptosis is the activation of endogenous endonucleases causing nuclear fragmentation, chromatin condensation and chromosomal DNA fragmentation, which typically presents in affected cells by the morphological appearance of plasma membrane blebbing and cell shrinkage. Caspases and related family member proteases are described as "executioners" of apoptosis that on post-translational activation degrade the regulatory proteins that prevent DNA degradation. Fragmentation of nuclear DNA is one of the hallmarks of apoptotic cell death that occurs by PCD stimuli in a wide variety of proliferating cells. NF-κB is a protein complex that controls the transcription of DNA and can induce the expression of nitric oxide synthase (NOS) to produce NO, a well-known promoter of the pro-apoptotic transcription factor p53, the cell-cycle gatekeeper of the caspase cascade. In contrast to necrosis, which is the outcome of PCD, apoptosis mediates the fragmentation of damaged cells, which by phagocytosis are removed or degraded in phagolysosomes to spare surviving cells from the uncontrolled release of cytotoxic agents. Proteins of the caspase-mediated apoptotic cascade are regarded as products of constituent housekeeping genes that are necessary to maintain healthy multicellular function [323]. In the progression of cnidarian bleaching, apoptotic pathways are activated [322][323][324][325], but not all corals that suffer bleaching are destined to die [326,327]. Coral survival has been attributed to having a high level of apoptotic protection at the onset of coral bleaching [328] and during post-bleaching recovery [329] by specific activation of anti-apoptotic Bcl-2 proteins in surviving cells [330].
Cnidarians have a complex apoptotic protein network that has exceptional ancestral complexity and is comparable to that of higher vertebrates [331,332]. Cnidarian metamorphosis is tightly coupled with caspase-dependent apoptosis [333] and subsequent host-symbiont selection by post-phagocytic winnowing of Symbiodinium genotypes during the establishment of coral-dinoflagellate mutualism [334]. As expected, the coral genome of A. digitifera encodes multiple isoforms of genes that transcribe the caspase family of apoptotic effectors (Table 12). Included in this signalling pathway are the pro- and anti-apoptotic Bax/Bcl regulators and Bcl-2 athanogene (DNA-binding) activators of apoptosis. Notable in our annotation dataset are multiple genes that encode the protein domains of the apoptotic protease-activating factor (Apaf) that triggers assembly of the apoptosome leading to caspase activation [335]. Additional to this arsenal of cell cycle regulators are the death associated protein-6 (DAXX), a Fas-binding adaptor of c-Jun N-terminal kinase (JNK) activation [336], death-associated protein kinase (DAPK), a mediator of calcium/calmodulin-regulated Ser/Thr kinase [337], and the programmed cell death 6-interacting protein (PDCD6IP), which binds to PDCD-6 for execution of apoptosis via the caspase-3 pathway [338]. PDCD6IP activation of apoptosis is an enigma since PDCD-6 is not encoded in the coral genome, nor is caspase-3. Other cell cycle regulators are the p53 binding and p53-associated parkin-like proteins, and the activating TP53 regulating kinase protein and TP53 apoptosis effector of TP53 gene expression.
Our genome annotation reveals 73 sequence matches for expressing the Apaf protein domain that, in conjunction with a high copy number for expressing caspase-8 (28 protein sequence matches), may enhance coral survival during embryogenesis by suppressing receptor-interacting protein kinase (45 sequence matches) during early development [339]. The most conserved function of the CASP2/RIPK adaptor (45 sequence matches) encoded in the coral genome is its essential regulation of apoptosis [340]. We find a wide repertoire of genes that additionally encode proteins that mediate apoptosis (Table 12). Amongst these are the calpain Ca2+-sensing family of proteins that initiate the signalling of apoptotic pathways [341]. There are 79 matches to sequences that encode the tumor necrosis factor receptor superfamily member 6 (TNFRSF6; Fas) receptor, which coupled with the Fas-associated death domain (FADD) protein is a cell signalling mediator for recruitment of caspase-8 that activates the apoptotic cysteine protease cascade. Coincident in the genome are 67 sequences encoding the leucine-rich repeat and death domain-containing (LRDD) adaptor that, by interacting with other p53-inducible death domain-containing (PIDD) proteins such as FADD, induces the caspase-2 pathway of apoptosis in response to DNA damage [342]. Elements of the NF-κB signalling pathway of cnidarians are highly conserved traits [343], which include the caspase cascade and the pro-apoptotic and anti-apoptotic Bcl-2 family of proteins [344]. The coral genome of A. digitifera encodes the pleiotropic nuclear factor NF-κB p105 subunit, and astonishingly there are 212 sequence matches to the NF-κB inhibitor-like protein 2 domain with fewer matches to the NF-κB inhibitor-like protein 1 and NF-κB family inhibitors alpha, delta and epsilon. Evident in our genome annotation is the tumor necrosis factor-alpha induced protein 3 (TNFAIP3), a cytokine produced by activated (inflammatory) macrophages.
Although TNF cytokines are a major extrinsic mediator of cellular apoptotic pathways, the precise function of the superfamily members of TNF ligands and receptors (Table 12) remains elusive in coral symbiology.

Microbial symbiosis and pathogenicity
It is well established that corals associate with a vast consortium of microbes, including phototrophic symbionts (Symbiodinium spp.) and other eukaryotic microbionts, cyanophytes, heterotrophic bacteria, archaea and viruses [345]. Corals harbour diverse and abundant prokaryotic communities with distinct populations residing in separate habitats of the host skeleton, tissues and surface mucus layer (reviewed in [203]). Microbial populations are dominated by a few coral-specific taxonomic traits [346], but the majority of the population comprises a high number of taxonomically diverse, low-abundance ribotypes [347] with much of the diversity within the coral microbiome belonging to the "rare" biosphere [348,349]. The coral microbiome is vital to the nutrition and health of the holobiont [350] and contributes significantly to the protection of coral reef ecosystems against the detrimental effects of organic enrichment [351,352]. One emerging threat to coral reefs is the outbreak of infectious diseases (reviewed in [353]). Although highly subjective and with little experimental evidence to date, the coral probiotic hypothesis [354] suggests that the coral prokaryotic microbiome can adapt to changing environmental conditions by selective microbial reorganisation to impart greater resistance to disease and pathogen-mediated bleaching [355]. Whether the coral microbiome can respond to changing environmental conditions more rapidly than by host genetic mutation and selection based on contemporary phenotypic evolution on ecological time-scales [356] is a topic of current debate [357]. Corals, like other invertebrates, have an innate immune system based on self-histocompatibility recognition (reviewed in [358]), but to date few adaptive components have been identified [359]. Corals do not produce antibodies and thus lack a true adaptive immune system.
Nonetheless, corals once susceptible to infection and bleaching caused by a specific bacterial agent can become immune to the invading pathogen by a phenomenon termed "experience-mediated tolerance", a precept of the hologenome theory of evolution [360], although how this process occurs is largely unknown. In our annotation of the genome sequence of A. digitifera we uncovered genes encoding the expression of disease resistance proteins (Table 13), two of which match the plant RPM1 and RPS2 pathogen resistance proteins that guard against disease by binding with pathogen avirulence receptors [360,361]. Significant also is a gene to express the pathogenesis-related protein PR-1 (29 sequence domain matches) that is inducible in plants for systemic acquired resistance to pathogenic invasion [362]. We uncovered also multiple genes encoding the expression of myeloperoxidase (MPO) enzymes. MPOs produce hypochlorous acid from hydrogen peroxide and chloride ion (requiring heme as a cofactor), and they oxidize tyrosine to the tyrosyl radical using hydrogen peroxide as an oxidizing agent. Hypochlorous acid and tyrosyl radicals are strong cytotoxic agents that in higher organisms are used as a primary defence by neutrophils to protect against invading pathogens. Phenoloxidase (tyrosinase) activity is reported to contribute to the innate defence system of A. millepora and Porites sp. [363] via activation of the melanin-signalling pathway that is induced in response to coral bleaching and localised disease [364,365]. Three genes of A. digitifera encode tyrosinase enzymes (data not tabulated) to account for the phenoloxidase activity reported in corals. The genome of A. digitifera also reveals homologues of genes that promote bacterial pathogenicity (Table 13), including virulence factors that are expressed and excreted by invading pathogens (bacteria, viruses, fungi and protozoa) to inhibit certain protective functions of the host.
Such are the bacterial Type III cytotoxic effector protein and multiple Type IV Cag pathogenicity island proteins encoded in the coral genome. Many Gram-negative bacteria utilize Type III secretion proteins, which are regulated by quorum sensing, to deliver cytotoxic effector proteins into eukaryote host cells during infection. Cag (cytotoxin-associated) pathogenicity island (PAI) proteins are encoded by mobile genetic elements of the Type IV system secreting both proteins and large nucleoprotein complexes [366] that may be transferred between prokaryotes to enhance selected traits of virulence [367]. Our annotation reveals genes encoding six pathogenicity island proteins (Table 13) with similarity to the Cag PAI proteins of the human pathogen Helicobacter pylori, an infectious bacterium causing peptic ulcers that may lead to the development of stomach cancer. While many properties of Type III and IV secretion system proteins have been well characterized in bacteria, the functional purpose of homologous genes in A. digitifera, if expressed, is unknown.
The genome of A. digitifera contains genes of bacterial origin that encode the motility quorum-sensing regulator of the GCU-specific mRNA interferase toxin and acyl homoserine lactone synthesis used for the communication of quorum sensing between bacteria to enable the coordination of group behaviour based on collective population density. Apparent in our annotation (Table 13) is a wide array of microbial penicillin-binding proteins (PBPs) that have an affinity for β-lactam antibiotics, which by binding to PBPs prevent bacteria from constructing a cell wall. There are genes also to enhance antibiotic resistance, including potential expression of a penicillinase repressor, a methicillin resistance protein and bleomycin hydrolase (cysteine peptidase). Additionally, isopenicillin-N synthase and an isopenicillin-N epimerase, both of which catalyse key steps in the biosynthesis of penicillin and cephalosporin antibiotics, are encoded in the coral genome. Taken as a whole, we demonstrate an extensive presence of ancient non-metazoan genes that are maintained in the genome of A. digitifera, as is reported in the genomes of A. millepora and the anemone N. vectensis [368]. Recent thought on genome evolution places these ancestral conserved domains as 'orphan' or 'taxonomically restricted' genes [352,369,370], rather than as genes acquired later by horizontal gene transfer. There is, of course, little knowledge of how or when, if at all, these non-metazoan genes are expressed or even their function to mediate pathogenicity in the coral holobiont.

Proteins of viral pathogenicity
Marine viruses were of minor interest until 1989, when it was realised that virus-like particles (VLPs) are the most abundant biological entities to occupy aquatic environments with variable numbers reaching~10 8 VLPs ml -1 [371]. Typically, VLPs surpass the number of marine bacteria by an order of magnitude in coastal waters [372]; their diversity is extremely high and many are specific to the marine environment [373,374]. Significant VLP numbers are reported from the surrounding waters of oceanic coral reef atolls [375], in waters flowing across the reef substratum [376] and in samples taken within the close vicinity of coral colonies [377,378]. The viral load within the surface microlayer of scleractinian corals is enumerated as being 10 7 -10 8 VLPs mL -1 [379] and, based on VLP morphological diversity, is attributed to infecting various microbial hosts (bacteria, archaea, cyanobacteria, fungi and algae) residing within the coral mucus [380]. VLPs have been observed in the epidermal and gastrodermal tissues of corals and occasionally occur in the mesogloea [381]. Latent viruses were found to infect Symbiodinium isolated from several scleractinian corals [382][383][384] with a preponderance of eukaryotic algaeinfecting phycodnaviruses suggested [385]. A wide range of bacteriophage and eukaryotic virus families have been identified within scleractinians using metagenomic analyses [207,[386][387][388], with bacteriophages being by far the most abundant entities (Wood-Charlson EM, Weynberg KD, Suttle CA, Roux S, van Oppen MJH: Methodological biases in coral viromics, submitted).
The importance of the coral-virus interactome in bleaching and disease (reviewed in [185,389]) is founded on reports showing that VLP abundances are higher in the seawater immediately surrounding diseased corals than in that surrounding healthy corals [378], that latent viruses are induced by heat stress in symbiotic dinoflagellates of the sea anemone Anemonia viridis [382] and the coral Pavona danai [383], and that UV exposure induces a latent virus-like infection in cultured Symbiodinium [187]. Quantitative 454 pyrosequence analysis of the coral Porites compressa on exposure to reduced pH, elevated nutrients or thermal stress showed that the abundance of its viral consortia varied across treatments, but notably a novel herpes-like virus increased by up to 6 orders of magnitude on exposure to abiotic stress [387], although some caution may be warranted in assessing the reliability of such determinations [Wood-Charlson et al., submitted]. Unexpectedly, the proteome of an endosymbiont-enriched fraction of the coral Stylophora pistillata showed a significant 114-fold increase in a viral replication protein on thermal bleaching [39], which is consistent with the finding of VLP induction in P. compressa by similar treatment [387].
General aspects of histocompatibility [390][391][392][393] and the genetic structure of innate immune receptors of the Cnidaria [363,394-401], including the immune response effected by coral disease and bleaching [364,402], have been examined extensively, hence further elaboration here is unnecessary. Instead, we focus on proteins that directly regulate the pathogenicity of coral-associated microbes and viruses. The A. digitifera genome encodes protein homologues with either putative antiviral or virus-promoting activities (Table 14). These homologues include the antiviral "superkiller" helicase SKI2 protein that acts by blocking viral mRNA translation [403] and, together with the superkiller proteins SKI3 (69 sequence alignments) and SKI8 of the exosome complex, functions in a 3′-mRNA degradation pathway [404]. The coral genome also encodes three exoribonuclease (RNase) enzymes (XRN, XRN2 and RNB) with antiviral RNA-degrading properties [405,406]. Annotation of the coral genome reveals homologues to four interferon proteins (IFNB, IFNG, IFNW1 and IFNT1). Interferons are potent and selective antiviral cytokines [407], which are induced by viral infection or by sensing dsRNA, a byproduct of viral replication, leading to the transcription of interferon-stimulated genes, some of whose products have antiviral activities and others antimicrobial, antiproliferative/antitumor or immunomodulatory effects [408,409]. Included in the coral antivirus defence system are three members of the interferon regulatory transcription factor family (IRF1, IRF2 and IRF8). IRF1 and IRF2 are transcriptional activators of cytokines and other target genes [410]; IRF1 is known to trans-activate the tumor suppressor protein p53 [411] while IRF2 regulates post-transcriptional induction of NO synthase [412].
Conversely, IRF8 is an interferon consensus sequence-binding protein that is a negative (interference) regulator of enhancer elements common to interferon-inducible genes [413]. The coral genome additionally includes an interferon-stimulated 20 kDa protein (ISG20) RNase specific to deactivation of single-stranded RNA viruses [414]. The coral genome encodes several interferon-inducible proteins, notably interferon gamma induced GTPase (IGTP) that accumulates in response to IFNB [415], the interferon-induced GTP-binding protein Mx1 that is a key element of host antiviral defence [416], the interferon-induced helicase C domain-containing protein 1 (aka MDA-5), which is an immune receptor that senses viral dsRNA to activate the interferon antiviral-response cascade [417] and the interferon-induced transmembrane protein (IFITM1) that suppresses cell growth [418]. The coral genome encodes the interferon-gamma receptor 2 (IFNGR2) transmembrane protein that activates downstream signal transduction cascades that control cell proliferation and apoptosis [419]. Encoded also is a homologue of the human bone marrow stromal cell antigen 2 (BST2) that inhibits retrovirus infection by preventing VLP release from infected cells [420]. Additionally encoded is a mitochondrial antiviral-signalling protein (MAVS) that triggers the host immune response by activation of the nuclear transcription factor NF-kB and the interferon regulatory transcription factor IRF3 which coordinates the expression of type-1 interferons such as IFNB [421].
The coral genome encodes a full set of baculoviral IAP repeat-containing proteins BIRC1-6 (Table 14). The IAP (inhibitor of apoptosis) family proteins were first identified as products of baculoviruses that protect infected cells from death during the progression of viral replication [422]. Expressed by most eukaryotic organisms (reviewed in [423]), their IAP function is presumably conserved in corals. The coral genome encodes a full set of poliovirus receptor-related proteins (PVRL1-4) of the immunoglobulin superfamily, which bind and transport herpesviruses at the cellular membrane in the establishment of latent infections (reviewed in [424]). Encoded also is a complement component (3d/Epstein-Barr virus) receptor 2 (CR2) protein that binds to the Epstein-Barr virus (Herpesviridae) with antigenic activity for disease prevention [425]. Another encoded protein is a homologue of the human immunodeficiency virus type 1 (HIV-1) enhancer-binding protein (HIVEP; aka EBP1) that attaches to the HIV long terminal repeat (LTR) region to activate transcription via the HIV LTR [426]. Also present in the coral genome is a homologue of the influenza virus non-structural binding protein NS1A-BP that interacts with the NS1 virulence factor of the influenza A virus (Orthomyxoviridae) to interfere with NS1-inhibition of pre-mRNA splicing within the host nucleosome [427]. NS1A-BP inhibits NS1A-mediated disruption of the host immune response caused by restricting interferon production and the antiviral effects of IFN-induced proteins [428]. The genome of A. digitifera encodes an integration host factor subunit beta (IHFB), first discovered as a host factor for bacteriophage λ integration of mobile genetic elements, that in E. coli is involved in multiple processes of DNA replication, site-specific recombination and gene expression [429].
A homologue of the MFS transporter feline leukemia virus subgroup C receptor (FLVCR) cell surface protein is encoded in the coral genome, which in cats confers susceptibility to FeLV-C infection [430]. Encoded also is a viral integration site 1 (EVI1) that in humans is an oncogenic transcription factor, often activated by viral infection, to cause proliferation of invasive tumours [431]. Arguably, these homologue proteins typically expressed in such distantly related species may have similar relevance in viral interactions of the coral holobiome.
How these regulatory proteins and viral receptors interact and respond to viral infection in corals is yet to be realised. The absence of virion-specific sequences (e.g. for nucleic acid replication or capsid structure) suggests that proviral DNA is absent from the coral genome, or it may be an artefact of the limited number of marine viral sequences deposited in public databases. Discovery of viral activity through proteomics [39] may, therefore, suggest that viral proteins are synthesised from a lytic infection, but this requires confirmation.

Toxins and venom
A review of protein sequences deposited in the UniProt database in October 2012 shows that there are 150 known cnidarian toxins. These toxins have diverse biological activities (neurotoxins, pore-forming cytolysins and venom phospholipases) used to capture prey and for protection against predators [432], and are best characterised in sea anemones (Actiniaria) with 141 sequences deposited [433,434]. The cytotoxin MCTx-1 isolated from the Net Fire Coral Millepora dichotoma is the only toxin from a coral deposited in UniProt (accession number A8QZJ5). However, our initial examination of the predicted proteome of A. digitifera shows 18 proteins with similarity to bacterial toxins and associated regulatory proteins (Table 15). Unlike reports from proteomic examination of the coral S. pistillata [39], of nematocysts (stinging organelles) of the jellyfish Olindias sambaquiensis [435], Tamoya haplonema, Chiropsalmus quadrumanus and Chrysaora lactea (PF Long et al., pers comm), of sea anemones [434] and of the highly dangerous box jellyfish Chironex fleckeri [436,437], no venoms typical of higher animals were found in the A. digitifera genome. This is because our annotation was carried out using the KEGG database (release v58 [53]) to relate A. digitifera protein sequences to KEGG orthologues. The KEGG database is a collection of proteins from well characterised and ubiquitous biochemical pathways. Animal venoms, however, are highly specialised proteins for which this release of the KEGG database does not contain any described orthologues.
KEGG orthology-based annotation of the A. digitifera genome reveals genes encoding protein homologues of 10 bacterial toxins, 7 regulatory toxin proteins and a botulinum protein substrate (Table 15). Of the 10 toxin homologues, one has similarity to anthrax edema factor (EF) adenylate cyclase (CyaA), which is one of three proteins that comprise the anthrax toxin of Bacillus anthracis, the other two being a protective antigen (PA) and lethal factor (LF). Without the LF protein, anthrax CyaA has no known toxic effects in animals [438], although the EF protein does play an important role in disabling cellular functions vital for microbial host defences [439]. The A. digitifera genome encodes a secretion virulence factor exotoxin A-like protein produced by Pseudomonas aeruginosa, which for this bacterium effects local tissue damage, bacterial invasion and immunosuppression within its eukaryote host [440] with pathogenicity similar to that of the diphtheria toxin [441]. Another encoded protein is a murine-like toxin (Ymt) produced by the enterobacterium Yersinia pestis, which is the causative agent of the notorious bubonic plague [442]. Additionally, two hemolytic enterotoxins similar to NheA and NheBC produced by Bacillus cereus [443], an enterotoxin (EntA) similar to that of Staphylococcus aureus [444], a Shiga-like enterotoxin (StxB) produced by Shigella dysenteriae, the diarrhoea-causing toxin A/B (TcdAB) such as that secreted by Clostridium difficile [445], and a protein similar to the zonula occludens (tight junction) enterotoxin (Zot) secreted by Vibrio cholerae [446] are encoded in the A. digitifera genome. Within the predicted proteome is also a homologue of the vacuolating cytotoxin (VacA) produced by Helicobacter pylori, which colonises the gastric mucosa of the human stomach epithelium [447].
Although a direct homologue of the cholera toxin (CT) was not found encoded in the A. digitifera genome (Table 15), a protein similar to its transcriptional activator ToxR was. ToxR controls the expression not only of CT in Vibrio cholerae [448], but also of a co-regulated pilin (TcpA) protein that is under control of the ToxR regulon cascade [449]. Bacterial TcpA protein is assembled into toxin-coregulated pili that induce the transfer of DNA by horizontal exchange of genetic material during conjugation [450]. TcpA and two toxin co-regulated biosynthetic proteins (TcpI and Tcps) of the bacterial virulence-associated pilus appendage [451] are encoded in the coral genome. Encoded also are the motility quorum-sensing interference regulator MqsR and its transcriptional regulator MqsA that in Escherichia coli control biofilm formation by inhibiting quorum-sensing motility, and together the MqsR/MqsA complex represses the lethal cold shock-like protein cspD gene [452] that on expression impairs DNA replication [453]. The A. digitifera genome likewise encodes a Type III secretion system (T3SS) cytotoxic effector (BteA) protein [454] that in Gram-negative invasive bacteria is translocated into host cells to suppress innate immunity and enhance virulence [455,456]. However, the ecophysiological significance of these toxigenic proteins and allied regulators, if indeed expressed by the coral genome, is unknown.
In addition to using the KEGG database, we undertook a BLAST search of the predicted proteome of A. digitifera against peptide sequences for all animal venoms using the annotated UniProtKB/Swiss-Prot Tox-Prot program [457]. This search revealed a large number of accession hits from the predicted proteome, although these are unlikely to be true multiple copies given that the genome sequence has yet to be completely assembled. However, just taking a single accession number from each annotation reveals a complex array of 83 toxins that represents the predicted venom of A. digitifera (Table 16); UniProt BLAST E-values are given in Additional file 1: Table S16b. These venoms are highly diverse and are significantly homologous to toxins from a wide variety of venomous marine and terrestrial creatures such as fish, reptiles, other cnidarians, cone snails, stinging insects and even a venomous mammal (shrew), covering the complete range of pharmacological properties known in venoms, including cytolytic, neurotoxic, haemotoxic, phospholipase, proteinase and proteinase inhibitor activities. Both the number of toxins predicted in the venom of A. digitifera and the degree of homology to such widely divergent phyla are remarkable. Accordingly, cnidarian venoms may possess unique biological properties that might generate new leads in the discovery of novel pharmacologically active drugs. Gene duplication followed by mutation and natural selection is widely held as the key mechanism whereby the large diversity of toxins found within a single venom could have evolved [458,459]. Conversely, primary mRNA splicing patterns have been shown to account for the diversity of metalloproteinases in the pit viper Bothrops neuwiedi [460]. Variations in peptide processing have also been shown by proteomics and transcriptomics to explain how a limited set of gene transcripts could generate thousands of toxins in a single species of cone snail [461].
Despite these various processes that could account for the evolution of toxin diversity, it has never been demonstrated how gene duplications or variations in transcript or peptide processing could have radiated across the very different poisonous creatures found on Earth. Our data (Table 16) reveal that the predicted toxins of A. digitifera venom are orthologues to all of the most important superfamilies of peptide/protein venoms found in diverse taxa. We posit that the origins of toxins in the venoms of higher organisms may have arisen from deep eumetazoan innovations and that the molecular evolution of these venom super gene families can now be addressed taking an integrated venomics approach using Cnidaria such as the jellyfish as model systems [462].
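The reduction of many BLAST accession hits to a single representative per toxin annotation, as described above, can be illustrated with a minimal sketch. It assumes BLAST results in the standard 12-column tabular format (query, subject, percent identity, alignment length, mismatches, gap opens, query start/end, subject start/end, E-value, bit score), with coral proteins as queries and Tox-Prot entries as subjects. The function name, the E-value cut-off of 1e-5, the choice of bit score as the tie-breaker and the identifiers in the example are all hypothetical and are not taken from our pipeline.

```python
from typing import Dict, Iterable, Tuple

# (coral protein ID, E-value, bit score) kept for each toxin
Hit = Tuple[str, float, float]


def one_hit_per_toxin(blast_rows: Iterable[str],
                      max_evalue: float = 1e-5) -> Dict[str, Hit]:
    """Keep the highest-scoring coral protein for each toxin subject.

    Assumes tab-separated BLAST tabular output with 12 columns;
    the E-value threshold is an illustrative assumption.
    """
    best: Dict[str, Hit] = {}
    for row in blast_rows:
        row = row.strip()
        if not row or row.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = row.split("\t")
        if len(cols) < 12:
            continue  # skip malformed rows
        query, subject = cols[0], cols[1]
        evalue, bitscore = float(cols[10]), float(cols[11])
        if evalue > max_evalue:
            continue  # discard weak similarities
        current = best.get(subject)
        if current is None or bitscore > current[2]:
            best[subject] = (query, evalue, bitscore)
    return best


if __name__ == "__main__":
    # Hypothetical rows: two coral proteins hit the same toxin entry.
    example = [
        "coral_A\ttoxin_X\t45.0\t200\t100\t2\t1\t200\t5\t204\t1e-40\t150.0",
        "coral_B\ttoxin_X\t38.0\t180\t105\t3\t1\t180\t9\t188\t1e-20\t90.0",
        "coral_C\ttoxin_Y\t30.0\t90\t60\t1\t1\t90\t1\t90\t0.5\t25.0",
    ]
    for toxin, (protein, evalue, score) in one_hit_per_toxin(example).items():
        print(toxin, protein, evalue, score)
```

Counting the keys of the returned dictionary gives the number of distinct toxin annotations supported by at least one hit, which is the kind of tally reported in Table 16.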

Detoxification proteins of the chemical defensome
There have been considerable advancements made to better understand the effects of pollution on coral reef habitats. The three main categories of environmental pollutants from anthropogenic sources are nutrient enrichment (eutrophication), hydrocarbon pollution and heavy metal contamination. Eutrophication from terrestrial inputs is a significant threat to coral reefs, stemming from the discharge of treated sewage, the runoff of agricultural fertilizers (plus herbicides and pesticides), and sedimentation caused by the erosion of organic-rich soils [463].
Notwithstanding that eutrophication can shift coral reef communities towards macroalgae domination [19], nitrogen and phosphorus enrichment can diminish coral growth and affect the photosynthetic performance of their algal symbionts [464]. Nutrient enhancement alters multiple pathways of primary metabolism that in coral is complicated by the photosynthetic demands of its symbiotic partners. While corals respond to hypertrophic levels of nutrients by activating general stress-response proteins [465], there are no specific proteins known to mitigate the cellular effects of nutrient enrichment on corals per se, and we have not attempted to identify such in this study. Gene families and their regulators that defend against chemical stressors comprise the chemical defensome encoding a network of detoxifying proteins that allows an organism to sense, transform and eliminate potentially toxic endogenous metabolites and xenobiotic contaminants [466]. Expressed proteins of the chemical defensome include the biotransformation cytochrome P450 (CYP) family of enzymes, conjugating enzymes, efflux transporters, heavy metal membrane pump exporters and their transcriptional activators. Annotation of the genome of A. digitifera reveals multiple genes encoding 20 hemoproteins belonging to the Phase I cytochrome P450 superfamily of monooxygenase enzymes that catalyse the oxidation of diverse organic substances (Table 17). The substrates of CYP enzymes include intermediates of lipid metabolism and sterol/steroid biosynthesis, as well as exogenous xenobiotics that are detoxified. Of significance are the CYP1A-type (aryl hydrocarbon hydroxylase) enzymes that have been studied widely in the hepatic response of fishes to polycyclic aromatic hydrocarbon (PAH) contamination (from crude or fuel oil) and exposure to polychlorinated biphenyl and [467]).
CYP450 activity has been detected in the corals Favia fragum [468], Siderastrea siderea [469], Montastraea faveolata [470] and Pocillopora damicornis [471]. Furthermore, CYP encoding sequences have been extracted from the genome of N. vectensis [472] and the transcriptome of A. millepora [29]. As well as providing chemical defence, mixed-function CYPs perform multiple endogenous tasks that are often taxon-specific. Hence, the orthology and substrate specificity of coral CYP enzymes cannot be predicted solely on homology to CYPs of known function assigned to higher metazoans. Similar to the function of CYP enzymes, there are genes encoding p-hydroxybenzoate 3-monooxygenase, an oxidoreductase catalyzing aryl oxidation, and the soluble and microsomal forms of epoxide hydrolase that converts epoxides, formed by the degradation of aromatic compounds, to trans-diols that by conjugation are readily excreted. Conjugating enzymes to eliminate hydroxylated substrates are the detoxifying UDP-glucuronosyltransferase and sulfotransferase families of enzymes. Estrone sulfotransferase is significant for inactivation of exogenous (contraceptive) estrogens [473] and similar endocrine-disruptive contaminants released from treated wastewater [474]; their occurrence in marine waters is known to disrupt the reproduction and development of fish [475] and corals [476]. Glutathione S-transferase (GST) enzymes catalyse the addition of reduced glutathione to the reactive sites of electrophilic toxins [477]. Surprisingly, only two isoforms of GST were detected in the A. digitifera genome (Table 17), whereas 18 distinct GST-encoding genes (6 classes + 1 fungal-type) were classified from genome sequences of N. vectensis [472]. This unexpected genome reduction of GST elaboration in A. digitifera merits further examination.
Many toxicological studies on the effects of pollution on cnidarian fitness have focused on their response to heavy metal contamination, including copper, cadmium, mercury and zinc [478,479]. In scleractinian corals the uptake and toxic effects of copper [480][481][482][483], cadmium [482] and mercury [484,485] have been studied at the metabolic level with specific studies to examine the effects of heavy metal toxicity on coral fertilisation [486][487][488], settlement [487], metamorphosis [486] and in coral bleaching [489]. Yet, the identification of molecular markers to monitor the response of Cnidaria to sub-lethal levels of heavy metal exposure has been elusive [490]. We were delighted to uncover in our annotation a wide range of genes to express metal-specific (arsenic, copper, mercury, nickel/cobalt and tellurium) resistance, transportation and membrane pump exporting proteins that, together with non-specific heavy metal ion export proteins (Table 17), might prove useful for monitoring the environmental response of A. digitifera to heavy metal contamination. Included in the heavy metal defensome are the Mer-family of transcriptional regulators of Hg- and Zn-resistance proteins and a periplasmic ion-binding protein attributed to the Hg detoxification system of bacteria [491]. Enzymes specific for arsenic detoxification are an arsenate oxidoreductase for conversion of arsenate to arsenite [492] and arsenite methyltransferase for conversion of arsenite to the less toxic dimethylarsenite that is amenable to excretion [493]. Such processes may enhance the resilience of corals exposed to natural [494] and site-affected [495] levels of arsenic contamination. In contrast, there were no (organo)cyanide detoxification genes apparent in the A. digitifera genome, but one sequence (v1.01601; K10814) encodes hydrogen cyanide synthase of unknown metabolic purpose (data not tabulated).
Ancillary evidence suggests that the expression of HCN synthase could be linked to quorum sensing [496] for regulating microbial densities of the coral holobiont community.

Epigenetic and DNA-remodelling proteins
In all kingdoms of life, DNA methylation and chromatin remodelling are pivotal to the regulation of gene transcription independent of underlying allelic variation. One such process mediated by epigenetic changes in eukaryotic biology is the all-important cellular differentiation during morphogenetic development. Epigenetic modifications cause the activation, regulation or silencing of certain genes without changing the basic DNA code. Changes in epigenetic regulation can persist during cell division and across multiple generations [497]. In addition, cytosine methylation may be associated with a higher mutation rate, because deamination of the methylated base produces thymine resulting in C/T mutations, which on reproduction may be transmitted by the germline to subsequent generations in selective processes of evolution [498]. On the other hand, environmentally induced destabilisation of the epigenome can produce epigenetic gene variants (epialleles) that activate transcription and mobilization of DNA transposable elements, which may subsequently lead to stable heritable traits of environmental adaptation, as occurs by genetic imprinting in plants [499]. Transposition thus has the potential to direct increased frequencies of permanent genetic mutations for selective adaptation. One way by which genes are regulated at the epigenome is through the remodelling of the chromatin histone-DNA complex (the nucleosome), which by post-translational modification changes the template structure of DNA-associated histone proteins. These modifications are effected by histone-lysine (and histone-arginine) N-methyltransferase enzymes (Table 18), and these proteins may be further modified by acetylation, ADP-ribosylation, ubiquitination and phosphorylation (annotation not tabulated).
The methylation pattern of histone lysine residues is highly predictive of the gene expression states of transcriptional activation and repression [500]. Necessary epigenomic reprogramming of histone modification at different stages of cell development is effected by the activation of histone and lysine-specific demethylase enzymes (Table 18). Determinants for recognition of the histone code are being revealed by a growing body of experimental data providing valuable information on the molecular tractability of binding sites involved in epigenetic signalling [501], which will provide further insight into epigenetic function.
Table 18 Epigenetic and DNA-remodelling proteins in the predicted proteome of A. digitifera
Direct epigenetic modification of DNA (or mRNA) occurs by methylation of cytosine, and to a lesser extent adenosine and guanine, by nucleobase-specific DNA methyltransferases (Table 18) to give 5-methylcytosine (5-meC), 3-methyladenosine (3-meA) and 3-methylguanine (3-meG) nucleotides, respectively. The principal modification product, 5-methylcytosine, behaves much like regular cytosine by pairing with guanine, but in areas of high cytosine methylation, genome transcription is strongly repressed (reviewed in [502]), together with the repression of other chromatin-dependent processes, including the incorporation of transposable elements [503]. Alteration in the methylation status of the entire genome, individual chromosomes or at specific gene sites is essential for normal cellular function, but processes for reprogramming methylated DNA at different stages of cell development, unlike the reversal of histone modifications, are poorly defined [504]. While there are abundant enzymes to repair DNA damage caused by spurious N-alkylation, direct nucleotide C-demethylation (via the hypothetical "DNA demethylase" [505]) is thermodynamically infeasible. Instead, removal of epigenetic C-methylated nucleobases occurs by several base-repair pathways involving DNA excision or mismatch repair enzymes. The genome of A. digitifera encodes a specific DNA glycosylase enzyme [506] for excision of 3-meA, but there are no such enzymes encoded for the excision of 5-meC and 3-meG, although a 5-methylcytosine-specific restriction enzyme is encoded. Another pathway for DNA demethylation requires base-specific deamination by the AID/APOBEC family of deaminase enzymes that, for example, convert 5-meC to thymine, which is subsequently replaced by cytosine by C/T mismatch repair enzymes. These methylated nucleobases are recognized for deamination by the cytosine, adenosine and guanine deaminase enzymes [507] that are encoded in the A. digitifera genome, and their deaminated bases are subsequently removed by DNA mismatch repair enzymes. Additionally, the genome of A. digitifera encodes a methylcytosine dioxygenase enzyme that converts 5-methylcytosine to 5-hydroxymethylcytosine (5-hmC), which is recognized for removal by the base excision repair pathway [508] or via its 5-hmC deaminated intermediate [507]. Combined, these DNA demethylation pathways are able to remodel epigenetic modifications at different stages of cell development.
Most current knowledge on DNA and protein methylation comes from studies of mammals and plants, while our understanding of the extent and roles of DNA methylation in invertebrates, marine invertebrates in particular, is still limited [509]. Little is known about the epigenetic potential of corals to acclimatize and adapt to the thermal and synergistic stressors that cause wide-spread coral "bleaching" [510]. Yet, given that acclimatization occurs via the generation of epiallele variants that can in some instances lead to stable heritable traits of environmental adaptation, there is growing interest in the prospect that epigenetic modifications in corals or their algal symbionts [511] may drive adaptation to defend against the damaging threat imposed by rising temperatures from global climate change. It is anticipated that this field of study will rapidly accelerate with the need to better understand epigenetic processes that may contribute to the persistence of coral reefs.

Conclusions
We offer ZoophyteBase as an unprecedented foundation to interrogate the molecular structure of the predicted A. digitifera proteome. Some key findings include proteins with relevance to host-symbiont function, dysfunction and recovery, including those that direct vacuolar trafficking and proteins linking symbiont photosynthesis to coral calcification. An extensive catalogue of mammalian-like proteins essential to neural function and venoms related to distant animal phyla suggests their origins lie deep in early eumetazoan evolution. Homologues of prokaryotic genes that have not been described previously in any eukaryote genome, such as flagellar proteins and proteins essential for nitrogen fixation and photosynthesis, point towards lateral gene transfer, perhaps mediated by viruses, that may lead to "shared" metabolic adaptations of symbiosis, and provide corals with limited ability for gene-encoded adaptation to a changing global environment. It is anticipated that understanding how the genome of a coral host interacts with that of its vast array of symbionts, and how it may regulate its metabolic quotient, for example through biochemical or epigenetic modification, will rapidly accelerate our ability to predict the fate of coral reefs.

Availability and requirements
ZoophyteBase was constructed using the Metagenome/Genome Annotated Sequence Natural Language Search Engine (MEGGASENSE). This is a general system for the annotation of sequence collections and presentation of the results in a database that can be searched using biologically intuitive search terms. In this implementation, the predicted proteome of A. digitifera (genome assembly v. 1.0 [48]) was used as the source of protein sequences. The annotation was carried out using the KEGG database (release v58 [51]) to relate A. digitifera protein sequences to KEGG orthologues. The homologous protein sequences were used to construct hidden Markov model (HMM) profiles using the HMMER3 package [49]. The predicted proteome sequences of A. digitifera were searched with the HMM profiles to link proteins to appropriate KEGG orthologues [50,512]. A web interface was developed with various tools. The search platform Lucene/Solr [52] was used to implement natural language searches. Protein sequences provided by the user can be used for BLAST [50] searches against the coral proteome. Selected sequences of the coral proteome can be analysed with third-party software (e.g. [53]) to interrogate conserved domains. ZoophyteBase is deployed using Apache-Tomcat (version 7.0.28 for Linux x64 [513]) on the Ubuntu Linux server of the Section of Bioinformatics at the Faculty of Food Technology and Biotechnology, University of Zagreb, Croatia and is accessible at our published web address [47].
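The step that links proteome sequences to KEGG orthologues through HMM profile searches can be sketched as follows. This is a minimal illustration rather than the MEGGASENSE implementation: it assumes the profile search results are available in the whitespace-delimited per-target table that HMMER3's hmmsearch writes with its --tblout option (target sequence name in the first column, profile name in the third, full-sequence E-value in the fifth), and that each profile is named after its KEGG orthologue. The function name, the E-value threshold of 1e-5 and the identifiers in the example are hypothetical.

```python
from typing import Dict, Iterable, Tuple


def best_ko_per_protein(tblout_lines: Iterable[str],
                        max_evalue: float = 1e-5) -> Dict[str, Tuple[str, float]]:
    """Assign each protein the KEGG orthologue profile with the lowest E-value.

    Assumes hmmsearch --tblout style rows; the threshold is an
    illustrative assumption, not a value taken from the text.
    """
    best: Dict[str, Tuple[str, float]] = {}
    for line in tblout_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        if len(fields) < 6:
            continue  # skip malformed rows
        protein, ko = fields[0], fields[2]
        evalue = float(fields[4])  # full-sequence E-value
        if evalue > max_evalue:
            continue  # not significant enough to annotate
        if protein not in best or evalue < best[protein][1]:
            best[protein] = (ko, evalue)
    return best


if __name__ == "__main__":
    # Hypothetical rows: protein IDs and KO numbers are placeholders.
    example = [
        "# target name  accession  query name  accession  E-value  score  bias",
        "prot_0001 - K00001 - 1e-50 170.2 0.1 2e-50 169.0 0.1 1.0 1 0 0 1 1 1 1 example",
        "prot_0001 - K00002 - 1e-10 40.0 0.0 2e-10 39.0 0.0 1.0 1 0 0 1 1 1 1 example",
        "prot_0002 - K00002 - 0.5 5.0 0.0 0.6 4.8 0.0 1.0 1 0 0 1 1 1 0 example",
    ]
    for protein, (ko, evalue) in best_ko_per_protein(example).items():
        print(protein, ko, evalue)
```

Keeping only the single best profile per protein is one possible design choice; a protein with several significant profile matches could instead retain all of them where multi-domain annotation is of interest.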