Transcriptome analysis of orange-spotted grouper (Epinephelus coioides) spleen in response to Singapore grouper iridovirus

Background: Orange-spotted grouper (Epinephelus coioides) is an economically important marine fish cultured in China and Southeast Asian countries. The emergence of infectious viral diseases, including those caused by iridovirus and betanodavirus, has severely affected food production based on this species, causing heavy economic losses. Limited genomic information on E. coioides has hampered the understanding of the molecular mechanisms that underlie host-virus interactions. In this study, we used 454 pyrosequencing to investigate differentially expressed genes in the spleen of E. coioides infected with Singapore grouper iridovirus (SGIV).

Results: Using 454 pyrosequencing, we obtained abundant high-quality ESTs from two spleen complementary DNA libraries constructed from SGIV-infected fish (V) and PBS-injected control fish (C). A total of 407,027 and 421,141 ESTs were produced in the control and SGIV-infected libraries, respectively. Among the assembled ESTs, 9,616 (C) and 10,426 (V) were successfully matched against known genes in the NCBI non-redundant (nr) database with a cut-off E-value below 10^-5. Gene ontology (GO) analysis indicated that "cell part", "cellular process" and "binding" represented the largest categories. Among the 25 clusters of orthologous groups (COG) categories, the cluster for "translation, ribosomal structure and biogenesis" represented the largest group in both the control (185 ESTs) and infected (172 ESTs) libraries. Further KEGG analysis revealed that pathways involved in cellular metabolism and intracellular immune signaling were represented in both the control and infected libraries. Comparative expression analysis indicated that certain genes associated with the mitogen-activated protein kinase (MAPK), chemokine, toll-like receptor and RIG-I signaling pathways were altered in response to SGIV infection. Moreover, the changes in gene expression patterns were validated by qRT-PCR for cytokines, cytokine receptors, transcription factors, apoptosis-associated genes and interferon-related genes.

Conclusion: This study provided abundant ESTs that could contribute greatly to disclosing novel genes in marine fish. Furthermore, the alterations in predicted gene expression patterns reflected possible responses of these fish to virus infection. Taken together, our data not only provide new information for the identification of novel genes from marine vertebrates, but also shed new light on the defense mechanisms of marine fish against viral pathogens.


Background
The orange-spotted grouper (E. coioides), an important cultured marine fish with a high market value, is an ideal model for studying sex differentiation and reproduction [1,2]. Rapid expansion of aquaculture has, however, led to an increased incidence of disease outbreaks in recent years [3,4]. Emerging viral infectious diseases, including iridovirus and nodavirus, have caused serious damage to the grouper aquaculture industry with mortality rates due to iridovirus infections ranging from 30% (adult fish) to 100% (fry) [5][6][7]. To date, three iridoviruses that were isolated from diseased groupers have been characterized: Singapore grouper iridovirus (SGIV), orange-spotted grouper iridovirus (OSGIV) and Taiwan grouper iridovirus (TGIV) [5,6,8]. Nevertheless, the molecular mechanisms associated with iridovirus pathogenesis and virus-host interactions are largely unknown, due to the limited amount of available genomic information on E. coioides.
Rapid progress in next-generation sequencing technologies has made large-scale production of ESTs efficient and economical. De novo transcriptome sequencing using 454 pyrosequencing has thus become an important method for studying non-model organisms [9][10][11][12]. Transcriptome sequencing facilitates functional genomic studies, including global gene expression, novel gene discovery, assembly of full-length genes, and single nucleotide polymorphism (SNP) discovery [9,13]. To our knowledge, the genome sequence of E. coioides is still unavailable, and this has hindered the progress of immunological and developmental research. To overcome this obstacle, we applied 454 pyrosequencing to determine the transcriptome sequence of E. coioides spleen tissue and performed a comparative analysis of the transcriptome data from the control and SGIV-infected groups. The data obtained disclosed a great deal of novel gene information in marine fish and suggested that several intracellular immune signaling pathways were involved in virus infection. These results will shed new light on the defense mechanisms of marine fish against viral pathogens.

Sequence analysis of ESTs from different cDNA libraries
Sequencing data from the two libraries were submitted to the NCBI database (accession number SRA040065.1). In the control (C) and SGIV-infected (V) spleen libraries, a total of 428,867 and 446,009 ESTs were sequenced, respectively. Following trimming of adaptor and low-quality sequences, 407,027 (C) and 421,141 (V) high-quality ESTs were obtained from the two libraries. After sequence assembly, 60,322 non-redundant ESTs were generated in the control library, including 36,076 singlets and 24,246 contigs, with an average length of 504 bp. In the infected library, 66,063 non-redundant ESTs were generated, including 40,527 singlets and 25,536 contigs, with an average length of 547 bp (Table 1).
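The read counts above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers reported in this section; the dictionary layout and the function name `summarize` are illustrative choices, not part of the original analysis.

```python
# Read counts reported in the text for the control (C) and SGIV-infected (V) libraries.
libraries = {
    "C": {"raw": 428_867, "high_quality": 407_027,
          "singlets": 36_076, "contigs": 24_246, "unique": 60_322},
    "V": {"raw": 446_009, "high_quality": 421_141,
          "singlets": 40_527, "contigs": 25_536, "unique": 66_063},
}


def summarize(lib):
    """Return (fraction of reads kept after trimming, singlets + contigs == unique)."""
    retained = lib["high_quality"] / lib["raw"]
    consistent = lib["singlets"] + lib["contigs"] == lib["unique"]
    return retained, consistent


summary = {name: summarize(lib) for name, lib in libraries.items()}
for name, (retained, consistent) in summary.items():
    print(f"{name}: {retained:.1%} of reads retained; totals consistent: {consistent}")
```

In both libraries the singlet and contig counts add up to the reported number of non-redundant ESTs, and roughly 94-95% of the raw reads survive trimming.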
All the contigs and singlets were designated as unique sequences and used for further comparative sequence analysis between the two libraries. After a homology search against the non-redundant protein database at the National Center for Biotechnology Information (NCBI), a total of 9,616 (C) and 10,426 (V) unique sequences showed significant BLASTX hits to known protein sequences. The distribution of significant BLASTX hits over different organisms was analyzed. Due to the lack of E. coioides genomic information, the majority of sequences in the two libraries matched genes or fragments from Tetraodon nigroviridis (Figure 1).

Functional annotation based on GO, COG and KEGG analysis
The putative functions of the unique sequences in the two libraries were analyzed according to Gene Ontology (GO) and Clusters of Orthologous Groups of proteins (COG) classifications. Analysis of GO categories showed that the functional distribution of the genes in the two libraries was similar. A total of 14,166 and 14,352 unique sequences mapped to biological processes, 15,130 and 14,923 sequences mapped to cellular components, and 7,137 and 7,252 sequences mapped to molecular functions in the control and SGIV-infected libraries, respectively. In both libraries, most of the biological process genes were involved in cellular processes, biological regulation and metabolic processes. Most of the cellular component genes encode proteins associated with cell parts and organelles; most of the molecular function genes were associated with binding, catalytic activity, and transporter activity (Figure 2).
Classification of the unique sequences into COG categories is critical for functional and evolutionary studies [14]. Among the 25 COG categories, the cluster for "translation, ribosomal structure and biogenesis" represented the largest group in the control library (185 ESTs), followed by the "posttranslational modification, protein turnover, chaperones" and "general function prediction" clusters. Similarly, in the SGIV-infected library, the cluster for "translation, ribosomal structure and biogenesis" represented the largest group (172 ESTs), followed by the "general function prediction" and "posttranslational modification, protein turnover, chaperones" clusters (Figure 3).
KEGG is a pathway-based categorization of orthologous genes that provides useful information for predicting functional profiles of genes [15]. In this study, the unique sequences of the two libraries were categorized within the KEGG database. The matched sequences were involved in metabolic processes, cellular processes, signal transduction and the cell cycle. Partial KEGG pathways associated with immune and inflammation responses are listed in

Putative genes involved in up-regulation or down-regulation during SGIV infection
Among unique sequences that shared > 30% identity (E-value < 1e-5) with known genes in the NCBI database, 2,057 genes were expressed in both the control and the SGIV-infected libraries. Using Fisher's exact test based on the number of homologous ESTs, we found that 755 genes were significantly up-regulated, while 695 genes were significantly down-regulated in response to SGIV infection. A large number of genes were present in only the control library or only the SGIV-infected library. Subsets of the up-regulated and down-regulated genes are listed in Tables 3 and 4, respectively. The altered genes included cytoskeletal genes, enzymes, and immune-related genes, such as chemokines, interleukins and interferon-induced proteins. These genes showed different expression patterns during SGIV infection, which implies that they may play an important role in physiological processes associated with SGIV infection.

Validation of the changes in gene expression by quantitative real-time PCR
To validate whether the up-regulated or down-regulated genes identified by statistical analysis were involved in SGIV infection, we measured the relative expression of a subset of genes using quantitative real-time PCR (qRT-PCR). As shown in Figure 4, the relative expression of IL-8, chemokine (C-C motif) ligand 18 (CCL18), g-type lysozyme (g-lysozyme) and cystatin B increased significantly after SGIV infection, compared with the expression of these genes in the control fish. In contrast, the expression of the interferon-inducible GTPase 1 (IIGP1),

Discussion
An increasing number of reports reveal that transcriptome sequencing of cDNA has become an efficient strategy for generating large numbers of sequences that represent expressed genes [16]. Transcriptomes from a number of species, including Drosophila melanogaster, yeast, Caenorhabditis elegans and various mammals and plants, have been sequenced for different purposes [17][18][19][20][21]. However, genome and transcriptome data for many "lower" vertebrate species, particularly marine fishes, have not been disclosed. To our knowledge, only a limited number of E. coioides genes have been cloned and characterized based on bioinformatic analysis, including those involved in immune responses after pathogenic attack, growth and development [22][23][24][25][26][27]. Given that the spleen is one of the most important organs associated with immune responses in fish and is also the main target organ for SGIV infection, transcriptome sequencing of the E. coioides spleen can be expected to provide a significant number of ESTs relevant to marine fish immune responses and to contribute to understanding iridovirus-host interactions [5]. After removal of overlapping sequences between the control and SGIV-infected libraries, we obtained 65,374 non-redundant consensus sequences from E. coioides. Apart from sequences related to cellular structure and metabolism, abundant sequences were found to be homologous to known immune-relevant genes in other species, based on BLAST, Conserved Domain Database (CDD), and SWISS-PROT annotation [28][29][30]. More than 80 sequences shared homology with signaling molecules of the mammalian mitogen-activated protein kinase (MAPK) pathways, such as critical molecules associated with extracellular signal-regulated kinase (ERK), p38 MAPK, Ras, RSK2, MKK4, MKK7, ASK1, MEK1/2 and Raf1. The mammalian MAPK signaling pathway is activated during virus infection and contributes to virus replication [31][32][33]. Although MAPK signaling molecules including ERK, c-Jun N-terminal kinase (JNK) and p38 MAPK were activated in SGIV-infected fish spleen (EAGS) cells, identifying the exact roles of these molecules during SGIV replication will benefit from the E. coioides EST information [34,35]. Apart from homologous components of the MAPK cascade, different interferon-related genes were obtained, including the interferon-induced protein viperin, interferon-stimulated gene 15 (ISG15), interferon-induced protein 35 kD (IFP35), interferon-stimulated gene 56 (ISG56), and interferon regulatory factors (IRF-1, IRF-2, IRF-3, IRF-4, IRF-5, IRF-7, IRF-8 and IRF-9). Interferon-induced, or interferon-stimulated, genes are important for the resistance of the host to virus infection, acting on virus entry, replication and release [36][37][38]. The E. coioides IRF-1, IRF-2 and IRF-7 genes have been cloned and characterized, and IRF-7 was confirmed as being vitally important for SGIV replication [39,40]. Human ISG15 expression is strongly up-regulated during viral infections, such as those caused by human cytomegalovirus (HCMV) and herpes simplex virus (HSV), and ISG15 up-regulation is considered to be involved in different strategies relating to the antiviral response [41][42][43][44]. IFP35 and ISG56 are also involved in the cellular antiviral response against virus infection [38,45]. A detailed investigation of the functions of E. coioides interferon-related genes during SGIV infection will contribute greatly to understanding how SGIV exploits, or evades, the host interferon immune response.

COG classification key (figure legend): A: Amino acid transport and metabolism; B: Carbohydrate transport and metabolism; C: Cell division and chromosome partitioning; D: Cell envelope biogenesis, outer membrane; E: Cell motility and secretion; F: Chromatin structure and dynamics; G: Coenzyme metabolism; H: Cytoskeleton; I: Defense mechanisms; J: DNA replication, recombination, and repair; K: Energy production and conversion; L: Extracellular structures; M: Function unknown; N: General function prediction only; O: Inorganic ion transport and metabolism; P: Intracellular trafficking and secretion; Q: Lipid metabolism; R: Nuclear structure; S: Nucleotide transport and metabolism; T: Posttranslational modification, protein turnover, chaperones; U: RNA processing and modification; V: Secondary metabolites biosynthesis, transport, and catabolism; W: Signal transduction mechanisms; X: Transcription; Y: Translation, ribosomal structure and biogenesis.
We also obtained sequences that shared homology with SGIV-encoded immune evasion genes, including lipopolysaccharide-induced tumor necrosis factor-α factor (LITAF), tumor necrosis factor receptor (TNFR), ubiquitin and Bcl-2 [46][47][48]. Iridovirus-encoded LITAF and Bcl-2 could mediate the fate of host cells by regulating apoptosis [47,48]. Many viral immune evasion genes are considered to be "stolen" mimics of host genes, and such viruses may interfere with the host response by modulating or disrupting the function of the corresponding host genes [49][50][51]. The discovery of these sequences will be helpful in studies on host-virus interactions. In addition, we found other molecules involved in immune responses, such as lectin, hepcidin, lysozyme and antimicrobial peptide. The functions of these genes during virus infection will be investigated in further studies.
Based on the results of exploratory statistical analysis, we identified genes that were up-regulated or down-regulated after SGIV infection. The qRT-PCR data supported the hypothesis that the expression of a subset of genes, including cytokine, cytokine receptor, transcription factor, apoptosis-associated, interferon-related and cytoskeleton genes, is regulated by SGIV infection. Previous studies indicated that the expression of different groups of genes relating to cellular structure, apoptosis, gene transcription and immune regulation was altered in response to virus infections or other stimuli [37,[52][53][54]. Further research into the roles of these differentially expressed genes will contribute to an increased understanding of the critical events that take place during SGIV replication.

Conclusions
In summary, we studied the immune response of marine fish to virus infection using SGIV-infected E. coioides as a model. More than 400,000 high-quality ESTs were obtained from each E. coioides spleen cDNA library by 454 sequencing. The resulting unique sequences contribute greatly to the investigation of changes in gene expression patterns and their molecular functions during pathogen infection, and also provide an abundant data source for the identification of novel genes in E. coioides. This gene information can be used to provide further insights into the functions of chemokines, proinflammatory factors, interferon-induced genes and other cytokines, and will thus stimulate further study of the immune response of E. coioides to pathogens. The experimental validation of the gene expression alterations during SGIV infection provides new insights into iridovirus-host interactions.

E. coioides and virus challenge
To construct spleen cDNA libraries, groupers (E. coioides) of 15 cm total length were obtained from a local farm in Guangzhou, China. Screening of sampled fish indicated that they were negative for SGIV infection. All the fish were maintained in a laboratory recirculating seawater system at 25-30°C for 2 weeks. Healthy fish that displayed normal levels of activity were used in this study. The virus suspension used for the challenge was collected from SGIV-infected GS cells. The fish were challenged by injection with 0.2 ml of the SGIV suspension (1 × 10^5 TCID50/ml). Control fish were injected with an equal volume of PBS. At 48 h post-infection, the fish were sacrificed and tissue samples were taken from the spleens. These were stored in liquid nitrogen for later RNA extraction.

RNA extraction, cDNA library construction and 454 sequencing
Total RNA was extracted from the spleens of the control and SGIV-infected fish using an SV total RNA Isolation kit (Promega). The cDNA library preparation and 454-pyrosequencing were performed as described in Salem et al. [11]. This encompassed a number of procedures as described below. In brief, the first and second strand cDNA were synthesized from 1 μg of total RNA

Data analysis
To analyze the data generated by the FLX sequencer, adapter sequences and low-complexity and low-quality sequences were filtered out using SeqClean and LUCY software [55]. The screened high-quality sequences were de novo assembled using CAP3 software under default parameters [56]. ESTs that did not form contigs were designated as singlets. Putative functions of all the unique sequences (contigs and singlets) were predicted using local BLASTall programs against sequences in the NCBI non-redundant (nr) protein database and the Swiss-Prot database (E-value < 1e-5). Each unique sequence was assigned COG and GO terms and mapped to pathways in the KEGG database [14,15].
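The E-value cut-off described above can be illustrated with a short filter over tabular BLAST output. This is a minimal sketch, not the pipeline used in the study: it assumes the hits were exported in the standard 12-column tabular layout, the function name `best_hits` is hypothetical, and the example rows are invented for illustration.

```python
import csv
import io

# Column order of the standard 12-column BLAST tabular output (assumed export format).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-5, min_identity=0.0):
    """For each query, keep the lowest-E-value hit that passes both thresholds."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        evalue = float(hit["evalue"])
        if evalue >= max_evalue or float(hit["pident"]) < min_identity:
            continue
        current = best.get(hit["qseqid"])
        if current is None or evalue < float(current["evalue"]):
            best[hit["qseqid"]] = hit
    return best


# Invented example rows; sequence and protein names are placeholders.
rows = [
    ["contig_1", "protA", "85.0", "120", "18", "0", "1", "360", "1", "120", "1e-40", "200"],
    ["contig_1", "protB", "60.0", "110", "44", "0", "1", "330", "5", "114", "1e-10", "90"],
    ["singlet_7", "protC", "28.0", "50", "36", "0", "1", "150", "9", "58", "0.5", "25"],
    ["singlet_9", "protD", "45.0", "80", "44", "0", "1", "240", "3", "82", "3e-06", "55"],
]
example = "\n".join("\t".join(r) for r in rows)
hits = best_hits(io.StringIO(example))
print(sorted(hits))
```

Passing `min_identity=30.0` would additionally apply the > 30% identity criterion used in the comparative expression analysis.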
To compare the gene expression profiles of the two libraries, EST occurrence was evaluated statistically. The abundance of a unique sequence was classed as increased or decreased if the number of hits in the SGIV-infected library was "significantly more" or "significantly less" than that in the control library. The statistical significance of differences in EST abundance was determined using Fisher's exact test [57,58]. A P value of < 0.05 was considered statistically significant.
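The comparison of EST counts between libraries can be sketched as a two-sided Fisher's exact test on a 2 × 2 table. The example below is an illustration under stated assumptions rather than the authors' implementation: it takes the library totals to be the high-quality EST counts (421,141 and 407,027), the per-gene counts (30 and 8) are invented, and the function names are hypothetical. Log-gamma is used so that the large library sizes do not require huge binomial coefficients.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k), computed via log-gamma."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def log_pmf(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_comb(total, col1)

    observed = log_pmf(a)
    p_value = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        lp = log_pmf(x)
        if lp <= observed + 1e-7:  # tables at least as unlikely as the observed one
            p_value += exp(lp)
    return min(p_value, 1.0)


# Assumed library sizes (high-quality ESTs) and invented counts for one gene.
V_TOTAL, C_TOTAL = 421_141, 407_027
gene_v, gene_c = 30, 8
p_gene = fisher_exact_two_sided(gene_v, V_TOTAL - gene_v, gene_c, C_TOTAL - gene_c)
label = "significantly more" if p_gene < 0.05 and gene_v > gene_c else "not increased"
print(f"p = {p_gene:.2e} ({label} in the SGIV-infected library)")
```

With thousands of genes tested, a per-gene threshold of 0.05 yields some false positives by chance, which is one reason the qRT-PCR validation of selected genes is useful.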

Quantitative real-time PCR
Quantitative real-time PCR was carried out using a LightCycler® 480 Real-Time PCR System (Roche), with SYBR Green as the fluorescent dye, according to the manufacturer's protocol (TOYOBO). Different genes, including cytokines (IL-8, CCL18), cytokine receptors (CCR4), transcription factors (TFIID), apoptosis-associated genes (cystatin B), interferon-related genes (GILT, IIGP1) and others (lysozyme G), were used for validation. Primer sequences are listed in Table 5. Reaction conditions were as follows: 95°C for 1 min, followed by 40 cycles of 94°C for 15 s and 60°C for 1 min. All reactions were performed in biological triplicate and samples were normalized to β-actin. Results were expressed as fold expression relative to β-actin in each experiment, as mean ± SD.
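Normalization to β-actin can be illustrated with a small calculation. The text does not state the exact formula used, so the sketch below assumes the common 2^-ΔCt approach with roughly 100% amplification efficiency; the Ct values and the function name `relative_to_reference` are hypothetical.

```python
from statistics import mean, stdev


def relative_to_reference(target_ct, reference_ct):
    """Per-replicate expression relative to the reference gene: 2^-(Ct_target - Ct_ref)."""
    return [2 ** -(t - r) for t, r in zip(target_ct, reference_ct)]


# Hypothetical Ct values for one target gene and beta-actin, three biological replicates.
control = relative_to_reference([26.0, 26.2, 25.8], [18.0, 18.1, 17.9])
infected = relative_to_reference([23.0, 23.3, 22.9], [18.0, 18.2, 17.9])

for group, values in (("control", control), ("SGIV-infected", infected)):
    print(f"{group}: {mean(values):.4f} ± {stdev(values):.4f} (relative to β-actin)")

# Ratio of mean relative expression, infected over control.
fold_change = mean(infected) / mean(control)
print(f"fold change: {fold_change:.1f}")
```

A lower Ct for the target gene at a similar β-actin Ct corresponds to higher relative expression, so in this invented example the gene would be reported as up-regulated after infection.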