Complete sequence determination of a novel reptile iridovirus isolated from soft-shelled turtle and evolutionary analysis of Iridoviridae

Background: Soft-shelled turtle iridovirus (STIV) is the causative agent of severe systemic diseases in cultured soft-shelled turtles (Trionyx sinensis). To our knowledge, the only molecular information available on STIV concerns the highly conserved STIV major capsid protein, and the complete sequence of the STIV genome is not yet available. Determining the genome sequence of STIV and providing a detailed bioinformatic analysis of its genome content and evolutionary status will therefore facilitate further understanding of the taxonomic position of STIV and the molecular mechanisms of reptile iridovirus pathogenesis.

Results: We determined the complete nucleotide sequence of the STIV genome using 454 Life Science sequencing technology. The STIV genome is 105 890 bp in length with a base composition of 55.1% G+C. Computer assisted analysis revealed that the STIV genome contains 105 potential open reading frames (ORFs), which encode polypeptides ranging from 40 to 1,294 amino acids, as well as 20 microRNA candidates. Among the putative proteins, 20 share homology with the ancestral proteins of the nuclear and cytoplasmic large DNA viruses (NCLDVs). Comparative genomic analysis showed that STIV has the highest degree of sequence conservation and a colinear arrangement of genes with frog virus 3 (FV3), followed by Tiger frog virus (TFV), Ambystoma tigrinum virus (ATV), Singapore grouper iridovirus (SGIV), Grouper iridovirus (GIV) and other iridovirus isolates. Phylogenetic analysis of STIV and other virus genomes was performed based on conserved core genes and the complete genome sequence. Moreover, analysis of the gene gain-and-loss events in the family Iridoviridae suggested that the genes encoded by iridoviruses have evolved to favor adaptation to different natural host species.

Conclusion: This study has provided the complete genome sequence of STIV. Phylogenetic analysis suggested that STIV and FV3 are strains of the same viral species belonging to the Ranavirus genus in the Iridoviridae family. Given virus-host co-evolution and the phylogenetic relationship among vertebrates from fish to reptiles, we also propose that iridoviruses might be transmitted between reptiles and amphibians.


Background
Iridoviruses are nuclear and cytoplasmic large DNA viruses (NCLDVs) that infect invertebrates, such as insects, crustaceans and mollusks, and poikilothermic vertebrates, such as fish, amphibians and reptiles [1]. The serious systemic diseases caused by some members of the Iridoviridae family have had an important impact on modern aquaculture and wildlife conservation. The current members of the family Iridoviridae can be divided into five genera: Ranavirus, Lymphocystivirus, Megalocytivirus, Iridovirus and Chloriridovirus [2]. Typical characteristics of all iridoviruses include icosahedral viral particles (~120 to 300 nm) present in the cytoplasm and genomes that are circularly permuted and terminally redundant [3,4]. At present, 13 iridovirus agents isolated from amphibians, fish and insects have been sequenced completely. These include Lymphocystis disease virus 1 (LCDV-1, genus Lymphocystivirus), Chilo [5,6].
Soft-shelled turtle iridovirus (STIV), the causative agent of a novel viral disease called 'red neck disease' in the farmed soft-shelled turtle (Trionyx sinensis) in China, was first reported in 1998 [7]. The virus could be propagated in several fish cell lines and caused an obvious cytopathogenic effect (CPE). To our knowledge, although several iridovirus-like agents have been isolated from reptiles such as turtles, no genomic information on a reptile iridovirus has been reported [8][9][10][11]. To facilitate understanding of the molecular mechanism of reptile iridovirus pathogenesis, we determined the complete genomic sequence of STIV and compared its genome structure with those of other sequenced iridoviruses to help determine its taxonomic position and evolutionary status.

Features of the STIV genome
The complete STIV genome sequence was determined by 454 Life Sciences Technology as described [12]. About 2.1 million bp were sequenced, corresponding to nearly 20-fold coverage of the STIV genome. The individual sequences were assembled into a continuous sequence using GS De Novo Assembler software (Roche). The results indicated that the complete STIV genome consists of 105 890 bp with 98.5% identity to the complete FV3 genome. The G+C content of STIV is 55.1% (Figure 1). Computer assisted analysis revealed 105 potential open reading frames (ORFs), which encode polypeptides ranging from 40 to 1,294 amino acids. The locations, orientations, sizes and BLASTP results for the putative ORFs are shown in Table 1. Forty-two individual putative gene products showed significant homology to functionally characterized proteins of other species. Forty-nine ORFs with unknown function have orthologs in other sequenced iridovirus genomes and 14 ORFs share no homology with other iridovirus genes. Seven of these ORFs (003L, 019R, 022L, 026L, 036L, 080R and 081R), which partially overlap with other ORFs, are not annotated as ORFs in the FV3 genome. The other seven ORFs (023R, 033R, 039R, 069L, 078R, 101L and 105R) have corresponding orthologs in the FV3 genome, but were not annotated in that analysis [13]. The reconstructed common ancestor of the NCLDVs had at least 41 genes [14], whereas in the STIV genome only 20 putative protein products shared homology with the ancestral proteins of NCLDVs, including proteins involved in viral DNA replication, transcription, virion packaging and morphogenesis (see Additional File 1). In addition, a few noncoding regions were identified in the STIV genome, a feature shared with FV3. In these regions, 20 microRNAs were predicted and are described in detail below.
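As a quick check on the figures quoted above, the following minimal Python sketch shows how fold coverage and G+C content are typically computed. The function names are illustrative, the toy sequence is not STIV data, and the coverage value uses only the approximate read total and genome length given in the text.

```python
def gc_content(seq: str) -> float:
    """Return the G+C fraction of a nucleotide sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


def fold_coverage(total_bases_sequenced: int, genome_length: int) -> float:
    """Average sequencing depth: total sequenced bases divided by genome length."""
    return total_bases_sequenced / genome_length


# Figures quoted in the text: about 2.1 million bp of reads, 105 890 bp genome.
coverage = fold_coverage(2_100_000, 105_890)
print(f"approximate coverage: {coverage:.1f}-fold")

# Toy sequence (hypothetical, not STIV data) to illustrate the G+C calculation.
toy = "GGCCGATCGC"
print(f"G+C of toy sequence: {gc_content(toy):.1%}")
```

With the quoted totals, the computed depth comes out just under 20-fold, in line with the "nearly 20-fold" stated above.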

Repetitive sequences
Repetitive sequences are not only found in eukaryotic genomes [15], but have also been identified in large DNA viruses, where they are involved in genome replication and gene transcription [16,17]. Similar to other iridoviruses, the STIV genome contains 21 repeat sequences (Table 2). Interestingly, a run of 34 tandemly repeated CA dinucleotides, a microsatellite or simple sequence repeat (SSR), was closely associated with a predicted gene encoding a ring finger protein (ORF078L) in the STIV genome. Such a repeat sequence has only been reported in the FV3 genome, but not in other sequenced iridoviruses or mammalian large DNA viruses. SSRs could serve to modify viral genes involved in gene regulation, transcription and protein function, and the extent of this modification depends mainly on the number of repeats [18]. The biological functions of the repeat sequences and the CA dinucleotide microsatellite in STIV remain to be characterized.
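A dinucleotide microsatellite of the kind described above can be located with a simple regular-expression scan. The sketch below is only an illustration: it is not the REPuter or tandem repeats finder workflow used in this study, the function name and the `min_copies` threshold are assumptions, and the toy sequence is invented.

```python
import re


def find_dinucleotide_repeats(seq: str, unit: str = "CA", min_copies: int = 10):
    """Yield (start, end, copies) for each run of `unit` repeated at least
    `min_copies` times. Coordinates are 0-based and end-exclusive."""
    pattern = re.compile(f"(?:{re.escape(unit)}){{{min_copies},}}", re.IGNORECASE)
    for match in pattern.finditer(seq):
        copies = (match.end() - match.start()) // len(unit)
        yield match.start(), match.end(), copies


# Toy sequence (hypothetical): 34 CA copies flanked by unrelated bases.
toy = "ATGC" + "CA" * 34 + "TTGA"
hits = list(find_dinucleotide_repeats(toy))
print(hits)
```

A scan of this kind finds perfect repeats only; dedicated repeat finders also tolerate mismatches and indels within the repeat array.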

DNA replication and repair
STIV encodes a protein (ORF063R) similar to family B DNA polymerases, which contains a nucleotide-polymerizing domain fused to an N-terminal exonuclease domain. In eukaryotes and prokaryotes, DNA polymerase is an essential replication enzyme and is able to proofread misincorporated nucleotides as well as replicate DNA [19]. Besides these functions, the poxvirus DNA polymerases could also play critical roles in catalyzing concatemer formation and promoting virus recombination [20,21]. Some viruses, such as baculoviruses and poxviruses, not only exploit the host cell proliferating cell nuclear antigen (PCNA) proteins to contribute to viral DNA replication [22], but also encode PCNA-like genes themselves [23,24]. A homolog of PCNA was identified in STIV. STIV also encodes a homolog of the poxvirus D5 family proteins (ORF025R) that contains a unique D5N domain and belongs to the helicase superfamily III within the AAA+ ATPase class [25]. The highly conserved D5 protein is required for viral DNA replication or lagging-strand synthesis [26].
Other putative proteins encoded by STIV with known or presumed functions in viral DNA replication, recombination and repair include thymidine kinase (ORF092R), virion packaging ATPase (ORF016R), helicase (ORF057L) and tyrosine kinase (ORF031R), as well as FLAP endonuclease (ORF100R) with a conserved nuclease domain (N- and I-regions). FLAP endonuclease homologs are present not only in STIV and other iridoviruses, but also in poxviruses, ascoviruses and mimivirus [14]. Interestingly, FLAP endonuclease homologs have been identified in herpesviruses and shown to destabilize preexisting host mRNAs in infected cells [27]. Thus, the protein product of ORF100R might function in STIV virogenesis.

Proteins involved in transcription
The gene products involved in transcription include two DNA-dependent RNA polymerase subunits (DdRP, ORF010R and ORF064L), transcription factor-like proteins (ORF001L), transcription elongation factor S-II/TFIIS (ORF088R) and a putative NIF/NLI interacting factor containing a CTD phosphatase domain (ORF041R). The DNA-dependent RNA polymerases (DdRPs) are multifunctional enzymes and exist ubiquitously in prokaryotes, eukaryotes and cytoplasmic DNA viruses [28,29]. The putative protein encoded by ORF088R contains a C2C2 zinc finger domain and is homologous to TFIIS, which is ubiquitous in many organisms and plays an important role in transcript elongation [30,31]. Virally encoded TFIIS regulates the elongation potential of the viral RNA polymerase during vaccinia virus infection [32].

Nucleotide metabolism
Four proteins involved in nucleotide metabolism were predicted in the STIV genome, including the large and small subunits of the ribonucleotide reductase (RNR, ORF042R and ORF071L respectively), deoxyuridine triphosphate nucleotidohydrolase (dUTPase, ORF066R) and RNase III (ORF087L). Viral RNR is either required for virus growth or is involved in anti-apoptosis functions during viral pathogenesis [33,34]. A putative dUTPase homolog encoded by ORF066R contains five conserved motifs and a conserved Tyr residue as the substrate binding site. dUTPase is an essential enzyme and plays multiple cellular roles [35]. In cells infected with Epstein-Barr virus, virally encoded dUTPase homologs function as highly specific enzymes for efficient replication, or serve to upregulate several proinflammatory cytokines [36,37].
STIV ORF087L also contains a well-conserved RNase III catalytic domain that is required for the cleavage of double stranded (ds)RNA templates [38]. Nearly all STIV encoded nucleotide metabolism enzymes have orthologs in other large DNA viruses. This is consistent with the view that the frequent acquisition of nucleotide metabolism enzymes during DNA virus evolution appears to reflect specific adaptations of viruses for the different types of cells in which they propagate [22].

Structural proteins
Despite the emerging information about iridovirus genomes, there has been little focus on the roles of structural proteins in viral pathogenesis. Three putative structural proteins were examined in the STIV genome. ORF096R encodes a major capsid protein 463 amino acids long that shares 99% identity to FV3. Similar to the MCP gene, the two other genes, ORF002L and ORF055R, are also highly conserved in all sequenced iridovirus genomes [5]. ORF002L encodes a putative membrane protein with a poxvirus conserved region and a C-terminal transmembrane domain. In addition, ORF055R is a myristylated membrane protein homolog with two adjacent transmembrane domains and a conserved sequence M-G-X-X-X-(S/T/A) for N-terminal glycine myristylation. The myristylated membrane protein encoded by vaccinia virus plays a role in virus assembly [39]. The roles of the two putative membrane proteins of STIV during viral infection need to be evaluated.
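The N-terminal myristylation consensus quoted above, M-G-X-X-X-(S/T/A), can be expressed as a short pattern match. The following Python sketch is illustrative only: the function name is hypothetical, the example peptides are invented, and a match to this simple consensus is not by itself evidence that a protein is myristylated.

```python
import re

# M-G-X-X-X-(S/T/A) anchored at the N-terminus; X is any residue.
MYRISTYLATION_MOTIF = re.compile(r"^MG.{3}[STA]")


def has_n_myristylation_motif(protein: str) -> bool:
    """Return True if the protein starts with the M-G-X-X-X-(S/T/A) consensus."""
    return MYRISTYLATION_MOTIF.match(protein.upper()) is not None


# Invented peptides, not STIV sequences.
print(has_n_myristylation_motif("MGAAASKLV"))  # residue 6 is S
print(has_n_myristylation_motif("MGAAAKKLV"))  # residue 6 is K
```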

Virus-host interactions
In addition to the essential genes required for virus replication, STIV also contains several putative genes involved in host-virus interactions, especially in immune evasion. STIV ORF054R shares 40% identity with the vaccinia virus 3-beta-hydroxysteroid oxidoreductase-like protein (3-β-HSD), which has been suggested to contribute to virulence by suppressing inflammatory responses [40]. In addition, three proteins that might be involved in apoptotic signaling have also been identified. ORF067R encodes a protein containing a caspase recruitment domain (CARD) and ORF082L encodes a protein sharing sequence homology with the lipopolysaccharide induced tumor necrosis factor-alpha (LITAF) of viruses and eukaryotes [41,42]. There is also a Bcl-2-like protein (ORF103R) containing BH1 and BH2 domains and a typical 'NWGR' signature motif. Bcl-2 homologs are also found in herpesviruses, poxviruses, African swine fever virus (ASFV) and adenoviruses [43]. Considering that several iridovirus agents can induce apoptosis during infection, and that virally induced apoptosis aids the progression of replication and dissemination [44,45], these apoptosis-regulating genes might manipulate the balance of life and death in STIV infected cells. In addition, a virally encoded eIF-2α decoy could inhibit eIF-2α phosphorylation and block interferon action during virus infections. Interestingly, STIV ORF030R, like FV3 ORF026R, encodes a truncated eIF-2α-like protein that differs from the complete eIF-2α homologs conserved among eukaryotes and other viruses, suggesting that STIV and FV3 are likely isolates of the same viral species.

Noncoding RNAs
MicroRNAs (miRNAs) are key regulators of gene expression in higher eukaryotes. Recently, miRNAs have been identified from viruses with double-stranded DNA genomes. A computational method has been applied successfully to predict miRNAs encoded by herpes simplex virus 1 and human cytomegalovirus [46,47]. We applied the same algorithm to the STIV genome and searched for 21-nucleotide (nt) sequences with hairpin-structured precursors. Twelve precursor sequences encoding 20 miRNA candidates were identified in the STIV genome (Table 3). MicroRNAs of mammalian viruses play important roles during infection, such as repressing host immune responses and apoptosis, and regulating gene expression [48,49]. Whether the potential miRNAs are functional in STIV needs further investigation.

We also examined the arrangement of 20 conserved genes, including the major capsid protein and other proteins involved in genome replication, transcription and modification. Given that the origin of virus genome replication is unclear, the MCP gene was chosen as the starting point for all iridovirus genomes. As shown in Figure 3, STIV has a gene order in common with FV3 and TFV, but shows obvious differences from ATV, SGIV and GIV. In addition, the order of these genes differs markedly among genera. The presence of an inversion in ATV and the different gene arrangements are consistent with the high recombination frequency in iridoviruses.

Phylogenetic analysis
To test the phylogenetic relationship of STIV with other members of the iridoviruses, the full-length protein sequences encoded by four conserved core genes, namely the major capsid protein (MCP), a myristylated membrane protein, ribonuclease III and DNA polymerase (DNA pol), were used for phylogenetic analysis. The alignments were performed using ClustalX and the unweighted parsimony bootstrap consensus tree was obtained by heuristic search with 100 bootstrap replicates. As shown in Figure 4A, the four phylogenetic trees provided consistent evidence that STIV is most closely related to FV3, the type species of the genus Ranavirus, followed by TFV, ATV, SGIV and GIV.
Furthermore, given the significant difference in genome length between vertebrate and invertebrate iridoviruses, a phylogenetic analysis based on the complete genomes of 11 sequenced vertebrate iridovirus isolates was performed. The results further suggested that STIV is most closely related to FV3 (Figure 4B). Given the nature of virus-host coevolution and the phylogenetic relationships among vertebrates from fish to reptiles, we propose that iridoviruses might be transmitted between reptiles and amphibians, and that STIV and FV3 are strains of the same viral species belonging to the Ranavirus genus of the family Iridoviridae. Interclass infections of iridoviruses have been observed in in vivo and in vitro studies on sympatric species of fish and amphibians that can be infected by the same virus [50]. Whether STIV infects frogs and FV3 infects turtles are questions that need to be evaluated.

Gene gain and loss in the Iridoviridae family
During virus-host coevolution, gene gain and loss are likely to have host-specific effects. The acquired genes could contribute to the evasion of host defenses, while the lost genes may coincide with either the loss of an antigenic signal to the host cell immune system or the gain of virulence [51,52]. To better understand the evolution of gene content in the Iridoviridae family, we analyzed the gene gain and loss events among the 13 sequenced iridovirus agents. According to our strict homology definition, only 11 clusters of orthologous groups (COGs) contained a homolog from all the iridovirus isolates. Several previously defined conserved core genes were excluded, including the putative replication factor and proliferating cell nuclear antigen (PCNA)-like proteins. These genes shared additional homology characteristics such as a predicted conserved domain, but showed poor alignment scores. We generated a phylogenetic tree based on these 11 concatenated proteins showing the number of genes gained and lost at each branch. As shown in Figure 5, although our mapping of gene gain and loss assumes that gene loss could occur throughout the tree, reptile ranavirus and amphibian ranavirus (+2/-) have fewer gene gain-and-loss events than fish ranavirus (+50/-24), fish lymphocystivirus (+65/-26), fish megalocytivirus (+86/-19) and insect iridovirus (+105/-). The variance among ranaviruses supports the classification of SGIV and GIV into a second Ranavirus group. Moreover, both STIV and FV3 gained five and lost four genes compared with TFV during evolution, again suggesting that STIV shares the highest identity with FV3. In addition, a number of COGs were only present within a specific genus. Tumor necrosis factor receptor (TNFR) homologs or TNFR-associated pro-

Conclusion
In summary, the present study provided the complete genome sequence of a turtle iridovirus. The phylogenetic tree and dot plot analyses suggested that STIV, a novel reptile iridovirus isolate, and FV3 are strains of the same virus species belonging to the genus Ranavirus in the family Iridoviridae. The genome data will not only contribute to a better understanding of reptile iridovirus pathogenesis, but also shed light on the evolution of the different iridovirus isolates.

Virus propagation and genome DNA preparation
The virus strain used for genome sequencing was STIV (strain 9701) isolated from diseased red-neck turtle (Trionyx sinensis) in China [7]. Fathead minnow (FHM) cells were cultured in Minimum Essential Medium (MEM, Gibco/Invitrogen) containing 10% fetal bovine serum (FBS, Gibco). When STIV-infected FHM cells exhibited 80% CPE, cells were collected and frozen at -20°C. The frozen cells were thawed and cell debris was removed by centrifugation at 4 000 × g for 30 min at 4°C. The supernatant containing STIV was then ultracentrifuged in a Beckman SW41 rotor at 28 000 rpm (~130 000 × g) for 1 h at 4°C. The pellet was resuspended in 1 ml of PBS and further purified by discontinuous sucrose gradient (20, 30, 40, 50 and 60%) centrifugation at 28 000 rpm (~130 000 × g) for 1 h. The virus particle band was collected and used to prepare the STIV genomic DNA using phenol-chloroform extraction as described [53].

DNA sequencing
Sequencing of the STIV genome was carried out using a pyrosequencing platform, the Genome Sequencer 20 (GS20) System (454 Life Science Corporation, Roche). Briefly, after the quality of the STIV genome DNA had been assessed by agarose gel electrophoresis and analysed with an Agilent bioanalyzer (Agilent Technologies, Santa Clara, CA, USA), 10 μg samples were sheared by nebulization into 300-500 bp fragments. The whole genomic library was amplified using GS20 emPCR kits and sequenced with the 454 Life Science GS 20 instrument according to the manufacturer's recommendations. The GS De Novo Assembler software generates a consensus sequence of the whole DNA sample by assembly of de novo shotgun sequencing reads into contigs and subsequent ordering of these contigs into scaffolds. The average read length was about 100 bp, with 20-fold coverage of the whole genome. To fill the gaps, 16 oligonucleotide primers were used for polymerase chain reaction (PCR) amplification directly from the genome DNA and the corresponding PCR products were sequenced using an automated ABI 3730 apparatus (Applied Biosystems, Shanghai, China).

Genome structure prediction
Nucleotide and amino acid sequences were analyzed using the DNASTAR software package (Lasergene, Madison, WI, USA). The genomic organization was drawn using the DNAMAN program. Nucleotide sequence and protein database searches were performed using the BLAST programs at the NCBI website http://www.ncbi.nlm.nih.gov. The whole genome sequence was also submitted to http://www.softberry.com (Softberry Inc., Mount Kisco, NY, USA) for identification of all putative ORFs. For more refined analyses, conserved motifs and domains and putative functions of deduced STIV proteins composed of 40 or more amino acids with homologies to other proteins in sequence databases were identified using several online programs as follows: for conserved motifs and domains, http://smart.embl-heidelberg.de and http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi were used; for transmembrane domain predictions, http://www.cbs.dtu.dk/services/TMHMM-2.0/ was used. DNA repetitive sequences were detected computationally using REPuter and a tandem repeats finder [54]. The STIV microRNA prediction was carried out as described [47].
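To illustrate what a 40-amino-acid ORF cutoff means in practice, the sketch below scans both strands of a sequence in all three frames for ATG-to-stop stretches. This is a deliberately simple stand-in and not the Softberry gene finder used here: the function names are hypothetical, the standard stop codons are assumed, coordinates on the minus strand refer to the reverse-complemented sequence, and a circularly permuted genome would need extra handling at the sequence ends.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_aa: int = 40):
    """Return (strand, start, end, length_aa) for each ATG..stop ORF whose
    protein (including the initial Met, excluding the stop) has >= min_aa
    residues. Coordinates are 0-based, end-exclusive, on the scanned strand."""
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens an ORF
                elif start is not None and codon in STOP_CODONS:
                    length_aa = (i - start) // 3
                    if length_aa >= min_aa:
                        orfs.append((strand, start, i + 3, length_aa))
                    start = None  # look for the next ORF in this frame
    return orfs


# Toy sequence (hypothetical): Met + 40 Ala codons + stop, i.e. 41 residues.
toy = "ATG" + "GCT" * 40 + "TAA"
print(find_orfs(toy))
```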

Iridovirus phylogeny
To analyze the evolutionary position of STIV in the family Iridoviridae, four conserved iridovirus genes, which are also present in other large DNA viruses, were evaluated using the PHYLIP program based on the amino acid alignment. Multiple alignments of proteins and nucleotide sequences were generated using the MAFFT 6 and ClustalX programs [55,56]. In addition, a phylogenetic tree was constructed using MEGA version 4 with complete genomic sequences corresponding to the available sequencing data of iridoviruses.

Gene gain and loss events in the Iridoviridae family
All the putative iridovirus genes were obtained from NCBI databases and an all-against-all BLASTP similarity search was performed. The different iridovirus genes were grouped into COGs based on protein sequence similarity. Two proteins were considered homologs if one hit the other in the BLASTP search with an e-value ≤ 10⁻⁵ and the maximal produced alignments covered at least 60% of the longer protein, while the homologous proteins from multiple copies of a gene in one genome were counted only once. Eleven sets of COGs were aligned independently using the ClustalX alignment program, then the alignments were concatenated into a single alignment and a neighbor-joining (NJ) tree was constructed using MEGA version 4. Gene gain and loss events were processed with the PAML software package and assigned to branches in the phylogenetic tree [57].

Figure 3. The genomic arrangement of 20 conserved genes in the family Iridoviridae. Genes are indicated by black outline boxes. The MCP genes were designated as the starting point for all iridovirus genomes and genome names are listed on the right. Horizontal distances are shown proportional to base pair distances and the vertical lines indicate the conserved genes in different iridovirus isolates. The following are the conserved genes according to their order in the STIV genome: major capsid protein (096R); immediate-early protein ICP-46 (097R); FLAP endonuclease (100R); putative replicating factor (001R); DNA-dependent RNA polymerase II largest subunit (010R); D6/D11-like helicase (011L); A32 virion packaging ATPase (016R); unknown protein (021R); unknown (024L); D5 family NTPase (025R); NIF/NLI interacting factor (041R); myristylated membrane protein (055R); phosphotransferase (060R); DNA polymerase (063R); DNA-dependent RNA polymerase subunit II (064L); ribonucleotide reductase small subunit (071R); Ribonuclease III (087L); proliferating cell nuclear antigen, PCNA (091R); thymidine kinase (092R) and thiol oxidoreductase (094R).

Figure 4. Phylogenetic analysis of STIV with other iridovirus isolates based on four conserved core genes and the complete genome sequence. (A) Complete amino acid sequences of major capsid protein (ORF096R), myristylated membrane protein (ORF055R), DNA polymerase (ORF063R) and ribonuclease III (ORF087L) of STIV, FV3, TFV, ATV, SGIV, GIV, LCDV-C, LCDV-1, CIV, IIV-3, ISKNV, RBIV and OSGIV were aligned using Clustal-X and parsimony bootstrap trees generated using PHYLIP. Numbers above branches indicate bootstrap support values based on 100 replicates. (B) Unrooted phylogenetic tree of vertebrate iridoviruses based on the complete genomic sequences. Alignments were made using the MAFFT 6 program and a dendrogram was constructed using the MEGA4 program.
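The homolog criterion stated in the gene gain-and-loss methods (a BLASTP e-value of at most 10⁻⁵ and an alignment covering at least 60% of the longer protein) can be written as a small filter. The sketch below is an assumption-laden illustration: the `BlastHit` record and `is_homolog` function are hypothetical names, the alignment length is taken as reported by BLASTP, and how coverage was computed in the original analysis is not specified in the text.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    """Minimal, hypothetical record of one BLASTP hit."""
    query_len: int      # length of the query protein (aa)
    subject_len: int    # length of the subject protein (aa)
    alignment_len: int  # aligned length reported for the hit (aa)
    evalue: float


def is_homolog(hit: BlastHit, max_evalue: float = 1e-5,
               min_coverage: float = 0.60) -> bool:
    """Apply the two thresholds: e-value and coverage of the longer protein."""
    longer = max(hit.query_len, hit.subject_len)
    coverage = hit.alignment_len / longer
    return hit.evalue <= max_evalue and coverage >= min_coverage


# Invented hits, not data from this study.
print(is_homolog(BlastHit(300, 280, 200, 1e-20)))  # 67% of the longer protein
print(is_homolog(BlastHit(300, 100, 100, 1e-30)))  # only 33% of the longer protein
```

Requiring coverage of the longer protein is the stricter choice: a short protein that matches one domain of a much larger one passes the e-value test but fails the coverage test, which fits the text's remark that some previously defined core genes were excluded.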

Additional file 1
Ancestral proteins of large DNA viruses that are present or absent in the STIV genome. In the STIV genome, only twenty putative protein products shared homology with the ancestral proteins of NCLDVs, including proteins involved in viral DNA replication, transcription, virion packaging and morphogenesis. [http://www.biomedcentral.com/content/supplementary/1471-2164-10-224-S1.doc]

Figure 5. Phylogeny of the iridoviruses based on concatenated protein sequences and the gene gain and loss events.