Sequence analysis of an Archaeal virus isolated from a hypersaline lake in Inner Mongolia, China
- Eulyn Pagaling†1,
- Richard D Haigh†1,
- William D Grant1,
- Don A Cowan2,
- Brian E Jones3,
- Yanhe Ma4,
- Antonio Ventosa5 and
- Shaun Heaphy1
© Pagaling et al; licensee BioMed Central Ltd. 2007
Received: 12 July 2007
Accepted: 09 November 2007
Published: 09 November 2007
We are profoundly ignorant about the diversity of viruses that infect the domain Archaea. Fewer than 100 have been identified and described, and very few of these have had their genomic sequences determined. Here we report the genomic sequence of a previously undescribed archaeal virus.
Haloarchaeal strains with 16S rRNA gene sequences 98% identical to Halorubrum saccharovorum were isolated from a hypersaline lake in Inner Mongolia. Two lytic viruses infecting these were isolated from the lake water. The BJ1 virus is described in this paper. It has an icosahedral head and tail morphology and most likely a linear double-stranded DNA genome exhibiting terminal redundancy. Its genome sequence has 42,271 base pairs with a GC content of ~65 mol%. The genome of BJ1 is predicted to encode 70 ORFs, including one for a tRNA. Fifty of the seventy ORFs had no identity to database entries; twenty showed sequence identity matches to archaeal viruses and to haloarchaea. ORFs possibly coding for an origin of replication complex, integrase, helicase and structural capsid proteins were identified. Evidence for viral integration was obtained.
The virus described here has a very low sequence identity to any previously described virus. Fifty of the seventy ORFs could not be annotated in any way based on amino acid identities with sequences already present in the databases. Determining functions for ORFs such as these is probably easier using a simple virus as a model system.
The three-domain description of cellular life on earth, Eukarya, Bacteria and Archaea, is a firmly established biological tenet. Each domain has an associated, probably vastly diverse, virus population [2–6]. Thousands of viruses infecting representatives of the domain Eukarya have been described and many of their DNA/RNA genomic sequences determined. Something like 5,000–6,000 viruses (bacteriophages) infecting representatives of the domain Bacteria have been described, at least morphologically, although rather fewer DNA/RNA genomic sequences have been determined. In contrast, we are largely ignorant about viruses infecting representatives of the domain Archaea. Just 40 or so have been described and the genomic sequences of only a few have been determined, sixteen being listed in GenBank. All archaeal viruses so far discovered have dsDNA genomes, both linear and circular [8, 9]. Archaeal viruses having an RNA genome have not yet been identified and perhaps do not exist.
The domain Archaea is divided into four established kingdoms, the Crenarchaeota, the Euryarchaeota, the uncultivated Korarchaeota and the very recently identified Nanoarchaeota [10, 11]. Virus particles associated with the first two have been identified, as recently reviewed. About 24 viruses of crenarchaeotes have been identified, often with unusual shapes, e.g. droplets and bottle shapes never observed elsewhere; these viruses have no obvious relationship to phage infecting members of the domain Bacteria [8, 9]. Similarly, about 20 viruses infecting members of the Euryarchaeota have been identified, of which 15 infect haloarchaea, as recently reviewed. These are mostly head/tail viruses of the order Caudovirales, including myoviruses and siphoviruses that may be distantly related to those infecting the domain Bacteria [8, 9], although other morphotypes have also been observed. Only six viruses of the haloarchaea have been sequenced. All were isolated by the Dyall-Smith laboratory in Melbourne, from hypersaline sources in Australia, except for φ Ch1. φ Ch1, a temperate myovirus with a 58.5 kb linear genome whose host is the haloalkaliphile Natrialba magadii, was isolated from a laboratory strain and presumably originates, like the host, from Africa. Lytic viruses HF1 and the closely related HF2, having linear genomes of 75.9 kb and 77.7 kb, infect the haloarchaea Haloferax lucentense and Halorubrum coriense respectively [14, 15]. His1 and the distantly related His2, spindle-shaped viruses with linear genomes of 14.5 and 16 kb respectively, both have lytic and carrier status in Haloarcula hispanica. Finally, a lytic icosahedral virus, SH1, having a linear genome of 31 kb, infects Har. hispanica [17, 18].
We have been studying both archaeal and bacterial prokaryotic diversity in Chinese salt lakes in Inner Mongolia; as part of this study we looked for virus particles associated with haloarchaea. In this report we describe the complete genomic sequence of a ~43 kb virus BJ1.
Description of site and lake water parameters
Lake Bagaejinnor is a hypersaline lake in Inner Mongolia, China [coordinates N45 08 527 E116 36 167]. The lake was sampled in September 2003. It had substantially evaporated over the summer, exposing expanses of pink, salt-encrusted mud flats, and had been reduced to small pools and lagoons of salt-saturated colourless water, pH 8.5. The pink colouration of the salt crystals indicated the presence of haloarchaea. The chemical composition of lake water was determined using laser inductively coupled plasma optical emission spectrometry by the Department of Geology, University of Leicester. Carbonate/bicarbonate concentrations were determined by titration with H2SO4 using a Digital Titrator Model 16900 according to the manufacturer's instructions (Hach Systems for Analysis). Chemical concentrations were Na, 5.32 M; Cl, 4.61 M; S, 1.07 M; Mg, 0.35 M; K, 33.25 mM; Br, 8.05 mM; HCO3, 7.4 mM; B, 4.25 mM; CO3, 3.3 mM; Ca, 0.77 mM; Li, 0.33 mM.
This is necessarily a seasonal snapshot of the lake water chemistry: the composition varies continually, being more dilute in spring following the winter thaw and then gradually becoming concentrated by the hot summer winds. We used trial and error to find a medium with which we could pour both top and bottom agars. The choice of medium composition was constrained by very high salt concentrations interfering with agar solidification and causing "salting out" of some of the components. The eventual salt composition of this medium was identical to that determined for the lake above, with the following exceptions: Na was at 2.85 M, Cl was at 2.6 M, S was at 0.642 M, and Ca and Li were omitted completely.
Identification of a haloarchaeal host
Plaques for BJ1 required one to two weeks to appear on plates because the host is slow growing. Plaque size for BJ1 was variable between experiments ranging from 1–5 mm in diameter, probably due to slight changes in growth conditions; they were also irregularly shaped and turbid. No attempt was made to optimise plaque formation by modifying temperature, salt concentrations or host strain.
Characterisation of virus genome
Genomic nucleic acid ran on 1.2% TAE agarose gels as a discrete single band larger than a 23 kb DNA marker band (data not shown). PFGE also suggested a genomic size greater than 23 kb but less than 48 kb (Fig 3, panel b). Bam HI digestion of the genomic DNA gave 21 discrete bands ranging in size from 6.5 kb to ~500 bp (Fig 3, panel c). From the size of these fragments we estimated a genome size of 42.7 kb, remarkably close to the size eventually determined by sequencing (42.271 kb, see below). In silico digestion of the determined sequence with Bam HI showed that it would generate 20 different fragments, i.e. 4949, 4661, 3762, 3235, 3185, 2952, 2434, 2406, 2004, 1949, 1679, 1617, 1505, 1314, 1275, 1094, 816, 781, 563, and 90 bp, with sizes in close agreement with those we observed. Thus the genomic DNA is not subject to methylation at Bam HI sites.
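The fragment sizes quoted above can be checked arithmetically: for a complete digest of a circular map, the fragment lengths should sum to the genome length. The sketch below performs that check on the reported values and also illustrates how such an in silico digest might be computed. The helper name `circular_digest`, its parameters and the choice to return lengths only are illustrative assumptions, not the authors' procedure.

```python
# Fragment sizes (bp) from the in silico Bam HI digest reported in the text.
REPORTED_FRAGMENTS = [4949, 4661, 3762, 3235, 3185, 2952, 2434, 2406, 2004, 1949,
                      1679, 1617, 1505, 1314, 1275, 1094, 816, 781, 563, 90]
GENOME_LENGTH = 42271  # bp, from the sequenced genome


def circular_digest(sequence, site="GGATCC", cut_offset=1):
    """Return fragment lengths (largest first) for a complete digest of a circle.

    Hypothetical helper: `site` is the Bam HI recognition sequence (G^GATCC)
    and `cut_offset` is the top-strand cut position within the site.
    """
    seq = sequence.upper()
    n = len(seq)
    # Search a wrapped copy so that sites spanning the origin are also found.
    wrapped = seq + seq[:len(site) - 1]
    cuts = sorted({(i + cut_offset) % n
                   for i in range(n) if wrapped.startswith(site, i)})
    if not cuts:
        return [n]  # uncut circle
    # Distance from each cut to the next, wrapping around the circle.
    # A single cut linearises the circle into one full-length fragment.
    fragments = [(cuts[(k + 1) % len(cuts)] - c) % n or n
                 for k, c in enumerate(cuts)]
    return sorted(fragments, reverse=True)


# The reported fragments account for the whole genome.
print(sum(REPORTED_FRAGMENTS) == GENOME_LENGTH, len(REPORTED_FRAGMENTS))
```

Twenty fragments from twenty sites is what a circular map gives; a strictly linear molecule with the same twenty sites would yield twenty-one fragments.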
Genome sequence of BJ1
See Figure 4 and Table 1. The double-stranded genomic DNA isolated from virus particles is shown as a circular sequence 42,271 bp long with a G+C content of 64.8 mol% [EMBL: AM419438]. Exonuclease III susceptibility showed that the DNA is linear, but sequence assembly indicated it to be circular. This indicates that the genome is terminally redundant (and may be circularly permuted). It is unclear if the BJ1 genome ever forms a circular molecule, but if it does then cos sites are unlikely to be involved, as digests with three infrequently cutting restriction enzymes (Hind III, Eco RV and Eco RI) followed by melting at 80°C failed to show any change in the number of bands compared to un-melted digests (data not shown).
Predicted ORFs in virus BJ1
The Shine/Dalgarno sequence from the Halobacterium (Halorubrum) saccharovorum 16S rRNA gene sequence (Accession HSU17364), which is the closest phylogenetic match to the phage host, was complemented (AGGAGGUGA) and used to search 5–15 bp upstream of each putative start site for putative ribosome binding sites (RBS). Fifty-one of the 70 ORFs had sequences suggestive of an RBS (Table 1). One particular stretch of 6 predicted ORFs (ORF43-ORF48) showed no obvious RBSs at all. The lack of an RBS for some genes is not surprising, as archaeal transcription/translation is a mosaic of prokaryotic and eukaryotic mechanisms, and the first gene of an operon, or a singly transcribed gene, often lacks an RBS [22–24].
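The upstream search described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `best_rbs_match`, the use of the DNA form of the motif, and the minimum match length of 4 nt are all assumptions made here, since the text does not state what counted as "suggestive of an RBS".

```python
SD_MOTIF = "AGGAGGTGA"  # DNA form of the AGGAGGUGA motif quoted in the text


def best_rbs_match(genome, start, window=(5, 15), min_len=4):
    """Return the longest SD_MOTIF substring found upstream of a start codon.

    `start` is the 0-based index of the first base of the start codon on the
    forward strand. The window covers the bases 5 to 15 nt upstream of it.
    `min_len` is an assumed threshold, not a value taken from the paper.
    Returns None when no substring of at least `min_len` nt is present.
    """
    lo = max(0, start - window[1])
    hi = max(0, start - window[0] + 1)
    upstream = genome[lo:hi].upper()
    # Try the longest candidate substrings of the motif first.
    for length in range(len(SD_MOTIF), min_len - 1, -1):
        for offset in range(len(SD_MOTIF) - length + 1):
            candidate = SD_MOTIF[offset:offset + length]
            if candidate in upstream:
                return candidate
    return None
```

For ORFs on the reverse strand, the same function could be applied to the reverse complement of the genome with the start coordinate converted accordingly; that step is omitted here for brevity.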
The majority of the ORFs (59/70) had a low calculated isoelectric point (pI < 5), which is similar to the acidic proteins of halophilic organisms [15, 25]. Just three small ORFs (less than 74 aa) were predicted to be extremely basic (pI > 10). No ORF larger than 100 aa had a pI above 6.3. Sixty-three ORFs and the tRNA are coded on one strand (designated forward) and 7 are on the reverse strand. One ORF, ORF30 (13255–14700 bp), entirely overlaps another, ORF31 (13270–14487 bp), running in the opposite direction. It seems probable that both ORFs are coding: ORF30 because it overlaps with the start and stop codons of the ORFs before and after it (ORFs 29 and 32) and has a good consensus RBS; ORF31 because it shows significant homology to integrases (see below).
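The relationship between the two overlapping ORFs can be made explicit with a small coordinate check. The sketch below uses the coordinates given above; the `Orf` record and the `contains` helper are hypothetical names introduced for illustration, and the strand labels follow the statement elsewhere in the text that the putative integrase lies on the reverse strand.

```python
from typing import NamedTuple


class Orf(NamedTuple):
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" forward, "-" reverse

    @property
    def length(self):
        """Length in nucleotides, counting both end coordinates."""
        return self.end - self.start + 1


def contains(outer, inner):
    """True if `inner` lies entirely within the coordinates of `outer`."""
    return outer.start <= inner.start and inner.end <= outer.end


orf30 = Orf("ORF30", 13255, 14700, "+")
orf31 = Orf("ORF31", 13270, 14487, "-")

print(contains(orf30, orf31), orf30.length, orf31.length)
```

Both lengths are multiples of three, as expected for complete reading frames.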
BJ1 ORF analysis
BJ1 ORFs with identifiable BlastX matches to data base entries.
Homologs (% Identity)
59% similarity (E 10^-8) to ORF58 halovirus φ Ch1 (AAM88732)
54% similarity (E 10^-5) to protein Haloquadratum walsbyi (CAJ52235)
similarity (E 10^-13) to protein from Natronomonas pharaonis (CR936257.1)
similarity (E 10^-7) to a protein of φ Ch1 (NP_665930.1)
No significant match to any described protein. InterPro suggests DNA binding protein
similarity (E 10^-3) to bacterial proteins with DnaJ domain; role in DNA replication?
65% similarity (E 10^-6) to protein (AAG20925) of Halobacterium sp. NRC-1. A metal regulated homodimeric repressor with a 'winged helix' DNA binding domain
60% similarity (E 10^-67) to a Har. marismortui protein YP_136906; member of the ORC1/CDC-6 superfamily of NTPases involved in DNA replication
54% similarity (E 10^-13) to halovirus φ H1 repressor protein (AAV47198.1) with a winged helix DNA binding domain
66% similarity (E 10^-17) to the Hqr. walsbyi PadR transcriptional regulator (CAJ51359.1)
similarity is to a Har. marismortui phage integrase (E 10^-66), 45% ID (AAV47153); the λ bacteriophage recombinase family, pfam00589
DNA helicase? 62% similarity (E 10^-128) to Har. marismortui protein (AAV47142) of the Cdc-46/Mcm family of DNA dependent ATPases.
68% similarity (E 0.05) to ArsR-like transcriptional regulator (CAJ51299) from Hqr. walsbyi (92 amino acids long); the similarity being from amino acids 15–68 in ORF39 with 20–72 in CAJ51299
56% similarity (E 10^-37) to halovirus HF1 protein (AAO61337.1), which may be a YonJ-like small subunit of the DNA polymerase (COG1311)
54% similarity to Listonella pelagia phage phiHSIC small terminase subunit (YP_224235.1)
43% similarity (E 0.01) to Streptococcus pneumoniae bacteriophage EJ-1 large terminase (CAE82121)
54% similarity (E 10^-77) to the putative portal protein (NP_665924) of Nab. magadii virus φ Ch1.
49% similarity (E 10^-13) to the capsid protein gpD (AAM88683) of halovirus φ Ch1
48% similarity (E 10^-29) to hp32 (CAA56442) of Hbt. salinarum virus φ H and 47% similarity (E 10^-24) to the capsid protein gpE (AAG32163) of halovirus φ Ch1
51% similarity (E 10^-15) to Enterococcus faecium glycosyl transferase (EAN10921). LPS biosynthesis protein.
The remaining 15 ORFs could have functions tentatively ascribed to them on the basis of amino acid similarity, (Table 2). We place them into three groups. (i) Those probably concerned with DNA replication, gene expression and possibly integration, i.e. ORFs 5, 6, 16, 20, 21, 31, 35, 39 and 43. (ii) Those proteins likely to be involved in virus assembly, i.e. ORFs 48, 49, 50, 52 and 53. (iii) Those proteins with other identifiable functions, i.e. ORF1.
Nine direct repeats longer than 13 nucleotides were observed; the largest was 17 nucleotides, i.e. GGCGGCATCCAACTCGG repeated at positions 34076 and 34120. All of the repeats were located in putative ORFs and we can infer nothing of significance for them. A number of perfect and imperfect inverted repeat/stem-loop structures were identified, often having loops hundreds to thousands of nucleotides in size. One perfect palindrome, GTCCGCTGGA/TCCAGCGGAC, is located at nucleotides 14226–14247 in ORF31, the putative integrase gene. Another palindrome, with its two arms separated by 3 nucleotides (lower case), is ACTATCCGACtggGTCGGATAGT at nucleotides 42048–42070; again both are present in putative ORFs and their significance is unclear, although the last palindrome is located 209 nucleotides from the 3' end of the genome. The BJ1 genome has a low incidence of CTAG and GATC sequences, just three of each of these palindromes being present. This incidence is low compared both to the statistically expected incidence (one every 256 base pairs) and to the related tetramers CGAG and GCTC, which were both found 36 times. CTAG and GATC sequences appear to be selected against by many haloviruses, e.g. these palindromes are absent from the genomes of HF1, HF2, His2 and SH1. This selection pressure is thought to be due to the avoidance of restriction-modification systems in the host cells, and there is evidence that CTAG and GATC palindromes are used by haloarchaeal systems [27, 28].
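The tetramer statistics and palindromes discussed above can be reproduced with a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and the expected count assumes independent bases, which is the simple model implied by "every 256 base pairs" (the optional `gc_fraction` argument is an added convenience, not something used in the paper).

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def is_palindrome(seq):
    """A DNA palindrome equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


def count_overlapping(seq, kmer):
    """Count occurrences of `kmer` on one strand, allowing overlaps."""
    seq, kmer = seq.upper(), kmer.upper()
    return sum(seq.startswith(kmer, i)
               for i in range(len(seq) - len(kmer) + 1))


def expected_count(kmer, genome_length, gc_fraction=0.5):
    """Expected occurrences of `kmer` on one strand, assuming independent bases."""
    p = {"G": gc_fraction / 2, "C": gc_fraction / 2,
         "A": (1 - gc_fraction) / 2, "T": (1 - gc_fraction) / 2}
    prob = 1.0
    for base in kmer.upper():
        prob *= p[base]
    return prob * (genome_length - len(kmer) + 1)


# Palindromes quoted in the text.
print(is_palindrome("GTCCGCTGGATCCAGCGGAC"))
print(reverse_complement("ACTATCCGAC") == "GTCGGATAGT")
# Expected CTAG count for a 42,271 bp genome with equal base frequencies.
print(round(expected_count("CTAG", 42271), 1))
```

With equal base frequencies the expected count of any tetramer in a genome of this size is about 165, against the three occurrences each of CTAG and GATC reported above.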
Predicted ORFs in the sequence inserted into ORF 32 and their highest BlastX matches. Nucleotide numbering is from the 5' end of the insertion sequence; nucleotide 8685 corresponds to nucleotide 14790 in the BJ1 genomic sequence. The sequence at the site of insertion was tgctcggtcgtcaa/CGACGCCGACGACGGCGA; lower case, variant; upper case, BJ1 ORF 32. ORFs are in the forward direction with respect to the virus genome unless indicated by a - sign. * indicates an ORF truncated because of incomplete sequencing (V10) or the insertion event itself (V1 and ORF32); aa indicates the number of amino acids.
Homologs (% Identity)
67% – ornithine cyclodeaminase Natronomonas pharaonis DSM 2160
36% – hypothetical protein VNG6157H Halobacterium sp. NRC-1
70% – cell division protein pelota Natronomonas pharaonis DSM 2160.
28% – hypothetical protein NP4342A Natronomonas pharaonis
38% – hypothetical protein rrnAC2062 Haloarcula marismortui
38% – Alpha/beta hydrolase fold protein Ralstonia eutropha JMP134
75% – hypothetical protein HQ2797A Haloquadratum walsbyi DSM 16790
73% – RtcB-like protein 1 Natronomonas pharaonis DSM 2160
61% – hypothetical protein NP3986A Natronomonas pharaonis DSM 2160
64% – 3-hydroxy-3-methylglutaryl-coenzyme A reductase (HMG-CoA reductase) Haloferax volcanii
100% Phage BJ1 hypothetical protein
Morphological criteria used for virus classification are outlined by the International Committee on Taxonomy of Viruses. Virus BJ1 is an icosahedral head/tailed virus and as such is assigned to the order Caudovirales, which has examples infecting members of both the domains Bacteria and Archaea. BJ1 can also be assigned to the Bradley classification group B and might tentatively be assigned to the family Siphoviridae due to the apparent absence of a contractile tail, base plate and tail fibres and the presence of striations in the tail fibre. If we assume that this classification is phylogenetically justified then it could indicate that the Caudovirales originated before the divergence of the Bacteria and Archaea. An alternative explanation is that the Caudovirales originally infected members of the domain Bacteria but that horizontal gene exchange from mesophilic Bacteria to the Archaea, and the subsequent stabilisation of these genes in the Archaea, allowed the Caudovirales to spread into the domain Archaea (certainly we have detected diverse bacterial populations in the water of Lake Bagaejinnor; SH, unpublished).
As described in the Introduction, very few viruses infecting the domain Archaea have been described and as yet we have little idea as to the extent of virus diversity in this domain. The virus we describe here may not be a common or dominant member of the virus community infecting haloarchaea in saline waters. We screened for lytic virus particles forming plaques on archaeal lawns. These requirements for host culturability, good lawn formation and plaque formation are probably extremely restrictive. As pointed out by others, there is a genuine need to develop other isolation and culture techniques to study both the dominant virus populations and the true extent of archaeal virus variation in samples such as these – perhaps using a combination of electron microscopy and metagenomic sequence studies.
The GC content of BJ1 at 65 mol% is quite close to that reported for Hrr. aidingense, Hrr. lacusprofundi and Hrr. saccharovorum, which varies from about 63–71 mol% [19, 20]. The host strain for BJ1 clearly belongs to the genus Halorubrum, having 98% 16S rRNA gene sequence identity to these Halorubrum species. Its precise taxonomic relationship to these species, in particular whether it belongs to a new Halorubrum species, is the subject of current studies.
Of the ORFs identified in BJ1 described in the results, all of the statistically significant matches are recorded (Table 2). Six of the ORFs (9, 20, 50, 52, 53, 55) are most closely related to the haloarchaeal temperate, isometric head/contractile tail viruses φ Ch1 and the intensively studied φ H. These two viruses are closely related to each other; the completed genome of φ Ch1 shows 97% homology to the genome of φ H, which is about 60% complete. ORF 43 is most closely related to a gene from the haloarchaeal isometric head/contractile tail virus HF1. There are no similarities with the ORFs from either the spindle (His1, His2) or icosahedral (SH1) shaped haloarchaeal viruses described in the Introduction. The most significant matches were ORFs 16, 31 and 35, which are almost certainly the origin of replication complex, integrase and helicase functions respectively of the virus, having highly significant matches to full length proteins in Har. marismortui. ORF50 was also closely related to the putative portal protein (NP_665924) of Nab. magadii virus φ Ch1.
Speculatively, almost all ORFs are in the forward strand in the same direction, consistent with a rolling circle mechanism of DNA replication. The 7 ORFs on the reverse strand, including the integrase, may be poorly expressed. A few ORFs had GTG starts (but with good RBS sequences) and other ORFs lacked RBS sequences altogether; presumably both coding features control/reduce expression levels. The fact that the putative Int gene is coded for on the minor strand with no RBS and that it overlaps with ORF 30 on the major strand may indicate that its expression is tightly controlled; perhaps most infections are lytic with a small proportion of lysogenic events. The suggestion of operons indicated in Fig 4 is also entirely speculative and based on the presence of overlapping stop and start signals; one run of ORFs from 43–48 has no RBS at all. Proteins with putative functions involved in DNA replication and transcription are found in ORFs 1–43; putative structural proteins are found after ORF48, consistent with early and late expression of operons.
Although BJ1 stocks are clonal in origin, the genomic DNA preparation is necessarily derived from a virus pool. Genome sequence projects therefore often give rise to heterogeneous sequences. We found one substantial region of heterogeneity in ORF 32 at nucleotide 14790 involving either a large insertion or, more probably, a substitution event (since terminally redundant virus genomes usually package genomes by a 'head full' mechanism). Distinguishing between these possibilities requires more sequencing. The variant sequence probably involves the acquisition of host-derived DNA, since its GC content (72.6%) is higher than that of the virus (64.8%) and close to that reported for Hrr. saccharovorum (71%). This insertion/substitution has taken place about 300 nucleotides away from the putative integrase gene. The integrase gene in viruses is often the site of insertion as well. We speculate that this variant sequence in the virus population is the result of an integration/excision event (possibly aberrant) during the virus infection used to prepare genomic DNA. This may indicate that BJ1 is a lysogenic virus; plaques were certainly turbid, consistent with this suggestion, but further experiments will be required to prove it. Whether the virus population with this variant sequence is viable will also require further studies. Certainly virus populations with insertions and/or substantial genomic deletions can be viable or at least rescued by functional virus genomes.
Many interesting features remain to be discovered about the BJ1 virus. Optimal growth conditions for this virus need to be established and its host range determined. This will facilitate studies on its environmental stability, patterns of transcription, protein functions, lysogenic potential and the viability of the variant virus. Assignment of protein functions to ORFs which cannot be assigned any function based on sequence identity is probably easier using a virus as a model than any other genome. A systematic effort on this front will reduce the number of unclassified ORFs that metagenomic and archaeal sequencing projects so often throw up.
Cultivation of prokaryotes from environmental samples
Isolates were grown in a modified Classic Halophile Medium (mCHM) broth. This was made in two components: component 1 contains 1% (w/v) yeast extract, 0.75% (w/v) casamino acids, 0.248% (w/v) KCl and 0.3% (w/v) trisodium citrate; component 2 contains 0.162% (w/v) Na2B4O7, 0.084% (w/v) NaBr, 7.116% (w/v) MgCl2.7H2O, 13% (w/v) NaCl, 4.56% (w/v) Na2SO4, 0.062% (w/v) NaHCO3 and 0.036% (w/v) Na2CO3, pH 8.0. The two components were autoclaved separately and mixed once cooled to 60°C, then stored at room temperature. 2% (w/v) agar was added to component 1 if required to make mCHM agar plates, while 0.7% (w/v) agar was added to component 1 to make soft top agar. Prokaryotes were cultivated from brine, salt or sediment samples. Brine was filtered on site through sterile 0.45 μm membrane filters in a 250 ml capacity polycarbonate filter unit (Sartorius) using a Nalgene hand pump until flow stopped. Membrane filters were immediately placed in cold sterile stabilisation buffer (10 mM Tris-HCl, pH 8.0, 1 mM EDTA, 2 M NaCl) and agitated to resuspend the cells. Filtered waters were placed in sterile Falcon tubes. Samples were placed immediately on ice until they could be stored at -20°C, usually within 6 hours of collection. Either cell suspensions from agitated filters were serially diluted and plated onto mCHM agar plates, or about 0.5 g sediment and salt crust was resuspended in 0.5 ml of mCHM and serial dilutions plated onto the mCHM agar plates. These were incubated for two months at 37°C and were periodically checked for the appearance of new colonies, which were picked and grown on fresh plates. Sub-culturing was continued on the same medium until purity was achieved. Isolated colonies were then grown in mCHM broth to an OD695 of 2 to 4, and maintained on sterile beads at -80°C for long-term storage in mCHM broth with 30% (v/v) sterile glycerol.
Identification of haloarchaeal isolates by 16S rRNA gene sequencing
Pure cultures (see above) were lysed in 100 μl nanopure water and boiled for 10 min. Cell debris was pelleted by centrifugation at 13 000 × g for 10 min. 1 μl cell lysate was used in a PCR reaction containing 75 mM Tris-HCl, pH 8.8, 20 mM (NH4)2SO4, 0.01% (v/v) Tween 20, 0.2 mM dNTPs, 3 mM MgCl2, 20 pmol forward primer, 20 pmol reverse primer, 2.5 U Taq polymerase and nanopure water to a final volume of 50 μl. To amplify the 16S rRNA genes, the Archaeal domain specific primer 27Fa, 5'-TCY GGT TGA TCC TGS CGG-3', and rP1, 5'-ACG GHT ACC TTG TTA CGA CTT-3', were used. Reaction conditions were: 95°C for 2 min, followed by 30 cycles of 95°C for 30 s, 50°C for 40 s and 72°C for 2 min, followed by 10 min extension time at 72°C. PCR products were purified using the QIAquick PCR Purification Kit (Qiagen) and stored at -20°C until required. DNA sequencing (see also below) was done by Lark Technologies, Cambridge UK, using the 27Fa and rP1 primers described above (corresponding to nucleotides 27–1492 with E. coli as the reference sequence). The DNA sequences were analysed using the BLASTN homology search program, which is available at the National Center for Biotechnology Information, to identify close matches.
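Both primers above contain IUPAC degenerate bases (Y and S in 27Fa, H in rP1), so each is in practice a small mixture of oligonucleotides. The sketch below enumerates those variants; the function name `expand_primer` is a hypothetical helper introduced here, while the IUPAC code table is standard.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every unambiguous sequence encoded by a degenerate primer."""
    bases = primer.replace(" ", "").upper()
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in bases))]


PRIMER_27FA = "TCY GGT TGA TCC TGS CGG"     # forward primer from the text
PRIMER_RP1 = "ACG GHT ACC TTG TTA CGA CTT"  # reverse primer from the text

for name, primer in (("27Fa", PRIMER_27FA), ("rP1", PRIMER_RP1)):
    variants = expand_primer(primer)
    print(name, len(variants), variants)
```

Under these codes 27Fa expands to four 18-mers and rP1 to three 21-mers.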
Strains were placed on a phylogenetic tree using Molecular Evolutionary Genetics Analysis (MEGA) version 3.1, using the Jukes and Cantor nucleotide substitution model to estimate distances between aligned sequences and the Neighbour-Joining method of tree inference. The support for each node was determined by assembling a consensus tree of 500 bootstrap replicates.
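For readers unfamiliar with the Jukes and Cantor model, the correction it applies to an observed proportion of differing sites p is d = -(3/4) ln(1 - 4p/3). The sketch below is a minimal illustration of that formula, not a substitute for MEGA; the function names and the gap-handling rule (columns with a gap in either sequence are skipped) are assumptions made here.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor_distance(p):
    """Jukes-Cantor corrected distance for an observed proportion p of differences."""
    if not 0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1 - 4 * p / 3)


# 98% identity, as reported for the host 16S rRNA gene, corresponds to p = 0.02.
print(jukes_cantor_distance(0.02))
```

At 98% identity the corrected distance is only marginally larger than the raw 2% difference, because few multiple substitutions are expected at such low divergence.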
Isolation of haloarchaeal virus by plaque assays
Haloarchaeal strains identified as described above were grown in soft top agar. mCHM bottom agar plates were overlaid with mCHM soft top agar containing 0.75% (w/v) agar, kept molten in a 55°C water bath until required. 300 μl of the haloarchaeal strain (OD approximately 0.2 at 695 nm, avoiding absorbance by the archaeal pigments) was added to 3 ml agar cooled to approximately 50°C and mixed. This was immediately poured on top of the bottom agar and left to set. The plates were carefully inverted and incubated in a sealed bag at 37°C for a week or longer. If good lawns were formed, the strain was used to isolate haloarchaeal virus as follows: 10 μl of Bagaejinnor lake water passed through both a 0.45 and a 0.22 μm filter (both from Millipore) was added to 1 ml cell culture and incubated at 37°C in an orbital shaker at 150 rpm overnight. The culture was plated in soft top agar as described and the resulting lawns checked for the appearance of lytic plaques. Single plaques selected for purification were picked with a sterile toothpick. Virus particles were then resuspended in 100 μl mCHM broth; this was then used to infect the host as previously described. This process of plaque purification was repeated twice to ensure that the virus samples were pure. Virus particles remained stable in mCHM broth when placed at 4°C for at least 1 year.
Transmission electron microscopy
5 μl of the virus sample was adsorbed onto glow discharged, carbon coated pioloform grids and fixed in glutaraldehyde vapour for 3 min. Excess sample was blotted from the grid using filter paper. Salts were removed by washing with distilled water. The sample was visualised by negative staining using 1% (w/w) uranyl acetate and viewed on a JEOL 1220 transmission electron microscope fitted with a SIS Megaview III digital camera system. Captured images were viewed and analysed using the ImageJ program.
Viral nucleic acid extraction
Attempts to purify virus nucleic acid from infected liquid cultures were unsuccessful. Accordingly, 30 μl of virus stock (~10^6 pfu/ml) were added to 300 μl of host cell culture (OD approximately 0.2 at 695 nm). Virus particles were left to adsorb onto the host cells for 15 min at room temperature, mixed with soft top agar and poured and incubated as described above to give agar plates with a high density of virus plaques. 0.5 ml halovirus diluent [60% (v/v) of a salt solution containing 0.3% (w/v) KCl, 0.162% (w/v) Na2B4O7, 0.084% (w/v) NaBr, 7.116% (w/v) MgCl2.7H2O, 13% (w/v) NaCl, 4.56% (w/v) Na2SO4, 0.062% (w/v) NaHCO3 and 0.036% (w/v) Na2CO3; 29% (v/v) H2O; 1% (v/v) 1 M Tris pH 7.2; 10% (v/v) glycerol] was added to each plate and the virus harvested by scraping off the soft top agar and homogenising by vortexing for 30 s. Agar and cell debris were pelleted by centrifugation at 10 000 rpm for 20 min. The supernatant was transferred to a fresh clean tube. To increase the yield of virus particles, the pellet was resuspended in 2 ml halovirus diluent and the previous steps of homogenisation and centrifugation were repeated. Combined supernatants were passed through a 0.45 μm filter and then a 0.22 μm filter to further remove agar and cell debris. To remove any exogenous non-virus nucleic acids, DNase I and RNase A were each added to a final concentration of 1 μg/ml and the sample left at room temperature for 30 min.
Virus particles were precipitated by the addition of 1/8 volume polyethylene glycol (PEG) 6000 solution (2.5 M NaCl, 20% (w/v) PEG 6000) and left to incubate for 15 min at room temperature. Virus particles were pelleted by centrifugation at 13 000 × g for 5 min. The supernatant was carefully removed and the pellet resuspended in 100 μl phosphate buffered saline (0.8% w/v NaCl, 0.121% w/v K2HPO4 and 0.034% w/v KH2PO4). To extract genomic nucleic acid from the virus, the resuspended pellet was mixed with an equal volume of phenol chloroform and centrifuged for 30 s. The top, nucleic acid containing aqueous layer was transferred to a fresh tube. Excess phenol chloroform was removed by ether extraction. The nucleic acid was ethanol precipitated, redissolved in 20 μl Tris-EDTA, pH 8.0, and left to rehydrate at 4°C overnight. An extraction from 20 plates typically yielded 1–2 μg nucleic acid.
Genome characterisation and sequencing
1 μg virus nucleic acid was treated with either excess DNase I (NEB), RNase A (Sigma) or Exonuclease III (NEB) in the manufacturer's reaction buffer and incubated at 37°C for 10 min, 60 min or 30 min respectively. Reactions were electrophoresed on Tris-Acetate-EDTA (TAE) agarose gels and stained with SYBR green. Viral nucleic acids were run on a 1% agarose pulsed-field gel (BioRad) in 0.5× TBE buffer at 14°C in a CHEF DR-II apparatus (BioRad). The run time was 22 h with a voltage gradient of 6 V/cm and a linearly ramped pulse time of 50 to 90 s at an angle of 120°.
BJ1 genomic DNA was digested with Bam HI (giving approximately 20 fragments ranging in size from 100 bp to 5 kbp) and cloned into Bam HI-digested pUC18Not I vector. Resulting clones were sequenced using vector-specific oligonucleotide primers pUCF, 5'-GTTTTCCCAGTCACGACGTTG-3', and pUCR, 5'-CACAGGAAACAGCTATGACC-3'; these sequences were used to design further primers to primer walk across the clones. The high G+C content (~65 mol%) of the initial sequences was used to identify restriction enzymes that would likely cut the phage genome to give smaller (on average 500–1000 bp) fragments. Secondary libraries of Sst I and Xho I fragments were created in pUC18Not I and representative clones of these libraries were sequenced using pUCF and pUCR and subsequent primer walking. Finally, the remaining gaps were filled by designing primers to the ends of the larger contigs, orientating these contigs by PCR using phage genome as template, and then primer walking out from the contigs using the PCR amplified products as sequencing template. The genomic sequence was assembled using the Lasergene SeqMan 7.0 program (DNAStar). Final coverage of the genome was 4-fold, with the majority sequenced on both strands or, where bidirectional sequencing was impractical, with multiple sequence runs on the same strand.
Potential ORFs were assigned using the programs FGENESB and GeneMark.hmm v2.5a. tRNA sequences were identified using the tRNAscan-SE program. Translations of potential ORF sequences to amino acids were made with the SeqBuilder program (DNAStar). Statistics for each of the ORFs were calculated using the program ProtParam.
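For readers unfamiliar with ORF assignment, the basic idea can be sketched as a six-frame scan for start-to-stop intervals. This is a deliberately naive stand-in and not the model-based approach of FGENESB or GeneMark.hmm: it assumes ATG as the only start codon, ignores alternative starts, and uses a hypothetical minimum length cut-off.

```python
from typing import List, Tuple

STOPS = {"TAA", "TAG", "TGA"}
STARTS = {"ATG"}  # assumption: alternative start codons are ignored
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 50) -> List[Tuple[int, int, str]]:
    """Return (start, end, strand) for each ORF found in the six frames.

    Coordinates are 0-based, end-exclusive, and always given on the
    forward strand. An ORF runs from the first ATG after a stop to the
    next in-frame stop; `min_codons` counts codons excluding the stop.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in STARTS:
                    start = i
                elif start is not None and codon in STOPS:
                    end = i + 3
                    if (end - start) // 3 - 1 >= min_codons:
                        if strand == "+":
                            orfs.append((start, end, "+"))
                        else:
                            # Map reverse-strand coordinates back to the forward strand.
                            orfs.append((n - end, n - start, "-"))
                    start = None
    return orfs


# Toy sequence: one short ORF (ATG + 5 codons + TAA) flanked by two bases.
toy = "CC" + "ATG" + "GCC" * 5 + "TAA" + "GG"
print(find_orfs(toy, min_codons=5))
```

Dedicated gene finders additionally score codon usage and upstream signals, which matters in a high G+C genome where long stop-free stretches arise by chance.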
GC skew was calculated using online base composition tools. BLAST (blastp and tblastn) and PSI-BLAST were used to search for possible homologies to known proteins, or proteins predicted by translation of the unannotated DNA sequence in GenBank. Inverted repeats in the DNA sequence were identified using Einverted and PALINDROME; direct repeats were located using Palim.
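GC skew is conventionally defined as (G − C)/(G + C) within a sliding window along one strand. The sketch below assumes that standard definition; the window and step sizes are hypothetical choices and not necessarily those of the online tool used here.

```python
from typing import List, Tuple


def gc_skew(seq: str, window: int = 1000, step: int = 500) -> List[Tuple[int, float]]:
    """Return (window_start, skew) pairs, with skew = (G - C) / (G + C).

    Windows without any G or C are assigned a skew of 0.0.
    """
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        values.append((start, skew))
    return values


# Toy example: a G-rich stretch followed by a C-rich stretch.
toy = "GGGC" * 2 + "CCCG" * 2
print(gc_skew(toy, window=8, step=8))
```

A change in the sign of the skew along a genome is often taken as a hint of a replication origin or terminus, although in small viral genomes this signal should be interpreted cautiously.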
This research was supported by the European Commission research programme 'Quality of life and management of living resources', project Multigenome Access Technology for Industrial Catalysts (QLRT-2001-01972).
- Wheelis ML, Kandler O, Woese CR: On the nature of global classification. Proc Natl Acad Sci USA. 1992, 89: 2930-2934. 10.1073/pnas.89.7.2930.
- Angly FE, Felts B, Breitbart M, Salamon P, Edwards RA, Carlson C, Chan AM, Haynes M, Kelley S, Liu H, Mahaffy JM, Mueller JE, Nulton J, Olson R, Parsons R, Rayhawk S, Suttle CA, Rohwer F: The marine viromes of four oceanic regions. PLoS Biol. 2006, 4: e368. 10.1371/journal.pbio.0040368.
- Breitbart M, Hewson I, Felts B, Mahaffy JM, Nulton J, Salamon P, Rohwer F: Metagenomic analyses of an uncultured viral community from human feces. J Bacteriol. 2003, 185: 6220-6223. 10.1128/JB.185.20.6220-6223.2003.
- Breitbart M, Salamon P, Andresen B, Mahaffy JM, Segall AM, Mead D, Azam F, Rohwer F: Genomic analysis of uncultured marine viral communities. Proc Natl Acad Sci USA. 2002, 99: 14250-14255. 10.1073/pnas.202488399.
- Cann AJ, Fandrich SE, Heaphy S: Analysis of the virus population present in equine faeces indicates the presence of hundreds of uncharacterized virus genomes. Virus Genes. 2005, 30: 151-156. 10.1007/s11262-004-5624-3.
- Edwards R, Rohwer F: Viral metagenomics. Nat Rev Microbiol. 2005, 3: 504-510. 10.1038/nrmicro1163.
- Virus Taxonomy: Classification and Nomenclature of Viruses. Edited by: Fauquet CM, Mayo MA, Maniloff J, Desselberger U, Ball LA. 2005, Elsevier, Amsterdam
- Ackermann HW: 5500 Phages examined in the electron microscope. Arch Virol. 2007, 152: 227-243. 10.1007/s00705-006-0849-1.
- Prangishvili D, Forterre P, Garrett RA: Viruses of the Archaea: a unifying view. Nat Rev Microbiol. 2006, 4: 837-848. 10.1038/nrmicro1527.
- Barns SM, Delwiche CF, Palmer JD, Pace NR: Perspectives on archaeal diversity, thermophily and monophyly from environmental rRNA sequences. Proc Natl Acad Sci USA. 1996, 93: 9188-9193. 10.1073/pnas.93.17.9188.
- Huber H, Hohn MJ, Rachel R, Fuchs T, Wimmer VC, Stetter KO: A new phylum of Archaea represented by a nanosized hyperthermophilic symbiont. Nature. 2002, 417: 63-7. 10.1038/417063a.
- Dyall-Smith M, Tang SL, Bath C: Haloarchaeal viruses: how diverse are they?. Res Microbiol. 2003, 154: 309-313. 10.1016/S0923-2508(03)00076-7.
- Klein R, Baranyi U, Rossler N, Greineder B, Scholz H, Witte A: Natrialba magadii virus φCh1: first complete nucleotide sequence and functional organization of a virus infecting a haloalkaliphilic archaeon. Mol Microbiol. 2002, 45: 851-863. 10.1046/j.1365-2958.2002.03064.x.
- Tang SL, Nuttall S, Dyall-Smith M: Haloviruses HF1 and HF2: evidence for a recent and large recombination event. J Bacteriol. 2004, 186: 2810-2817. 10.1128/JB.186.9.2810-2817.2004.
- Tang SL, Nuttall S, Ngui K, Fisher C, Lopez P, Dyall-Smith M: HF2: a double-stranded DNA tailed haloarchaeal virus with a mosaic genome. Mol Microbiol. 2002, 44: 283-296. 10.1046/j.1365-2958.2002.02890.x.
- Bath C, Cukalac T, Porter K, Dyall-Smith M: His1 and His2 are distantly related, spindle-shaped haloviruses belonging to the novel virus group, Salterprovirus. Virology. 2006, 350: 228-239. 10.1016/j.virol.2006.02.005.
- Bamford DH, Ravantti JJ, Ronnholm G, Laurinavicius S, Kukkaro P, Dyall-Smith M, Somerharju P, Kalkkinen N, Bamford JK: Constituents of SH1, a novel lipid-containing virus infecting the halophilic euryarchaeon Haloarcula hispanica. J Virol. 2005, 79: 9097-9107. 10.1128/JVI.79.14.9097-9107.2005.
- Porter K, Kukkaro P, Bamford JK, Bath C, Kivela HM, Dyall-Smith ML, Bamford DH: SH1: A novel, spherical halovirus isolated from an Australian hypersaline lake. Virology. 2005, 335: 22-33. 10.1016/j.virol.2005.01.043.
- McGenity TJ, Grant WD: Genus Halorubrum. Bergey's Manual of Systematic Bacteriology. Edited by: Boone DR, Castenholz RW. 2001, Springer, 1: 320-324. 2nd edition.
- Cui HL, Tohty D, Zhou PJ, Liu SJ: Halorubrum lipolyticum sp. nov. and Halorubrum aidingense sp. nov., isolated from two salt lakes in Xin-Jiang, China. Int J Syst Evol Microbiol. 2006, 56: 1631-1634. 10.1099/ijs.0.64305-0.
- Grigoriev A: Strand-specific compositional asymmetries in double-stranded DNA viruses. Virus Res. 1999, 60: 1-19. 10.1016/S0168-1702(98)00139-7.
- Bell SD, Jackson SP: Transcription and translation in Archaea: a mosaic of eukaryal and bacterial features. Trends Microbiol. 1998, 6: 222-228. 10.1016/S0966-842X(98)01281-5.
- Sartorius-Neef S, Pfeifer F: In vivo studies on putative Shine-Dalgarno sequences of the halophilic archaeon Halobacterium salinarum. Mol Microbiol. 2004, 51: 579-588. 10.1046/j.1365-2958.2003.03858.x.
- Tolstrup N, Sensen CW, Garrett RA, Clausen IG: Two different and highly organized mechanisms of translation initiation in the archaeon Sulfolobus solfataricus. Extremophiles. 2000, 4: 175-179. 10.1007/s007920070032.
- Mongodin EF, Nelson KE, Daugherty S, Deboy RT, Wister J, Khouri H, Weidman J, Walsh DA, Papke RT, Sanchez Perez G, Sharma AK, Nesbo CL, MacLeod D, Bapteste E, Doolittle WF, Charlebois RL, Legault B, Rodriguez-Valera F: The genome of Salinibacter ruber: convergence and gene exchange among hyperhalophilic bacteria and archaea. Proc Natl Acad Sci USA. 2005, 102: 18147-18152. 10.1073/pnas.0509073102.
- Bickle TA, Krüger DH: Biology of DNA restriction. Microbiol Rev. 1993, 57: 434-450.
- Holmes ML, Nuttall SD: Construction and use of halobacterial shuttle vectors and further studies on Haloferax DNA gyrase. J Bacteriol. 1991, 173: 3807-3813.
- Allers T, Mevarech M: Archaeal genetics – the third way. Nature Rev Genet. 2005, 6: 58-73. 10.1038/nrg1504.
- Zillig W, Prangishvili D, Schleper C, Elferink M, Holz I, Albers S, Janekovic D, Gotz D: Viruses, plasmids and other genetic elements of thermophilic and hyperthermophilic Archaea. FEMS Microbiol Rev. 1996, 18: 225-236. 10.1111/j.1574-6976.1996.tb00239.x.
- Stolt P, Zillig W: Transcription of the halophage phi H repressor gene is abolished by transcription from an inversely oriented lytic promoter. FEBS Lett. 1994, 344: 125-128. 10.1016/0014-5793(94)00347-5.
- Purdy KJ, Cresswell-Maynard TD, Nedwell DB, McGenity TJ, Grant WD, Timmis KN, Embley TM: Isolation of haloarchaea that grow at low salinities. Environ Microbiol. 2004, 6: 591-595. 10.1111/j.1462-2920.2004.00592.x.
- Rees HC, Grant WD, Jones BE, Heaphy S: Diversity of Kenyan soda lake alkaliphiles assessed by molecular methods. Extremophiles. 2004, 8: 63-71. 10.1007/s00792-003-0361-4.
- Weisburg WG, Barns SM, Pelletier DA, Lane DJ: 16S ribosomal DNA amplification for phylogenetic study. J Bacteriol. 1991, 173: 697-703.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215: 403-410.
- Kumar S, Tamura K, Nei M: MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief Bioinform. 2004, 5: 150-163. 10.1093/bib/5.2.150.
- Upgrade. [http://rsb.info.nih.gov/ij/upgrade/index.html]
- de Lorenzo V, Herrero M, Jakubzik U, Timmis KN: Mini-Tn5 transposon derivatives for insertion mutagenesis, promoter probing, and chromosomal insertion of cloned DNA in Gram-negative eubacteria. J Bacteriol. 1990, 172: 6568-6572.
- Softberry. [http://www.softberry.com]
- GeneMark. [http://opal.biology.gatech.edu/GeneMark/index.html]
- tRNAscan-SE Search Server. [http://lowelab.ucsc.edu/tRNAscan-SE/]
- ProtParam tool. [http://www.expasy.ch/tools/protparam.html]
- DNA base composition analysis tool. [http://molbiol-tools.ca/Jie_Zheng]
- NCBI/BLAST home. [http://www.ncbi.nlm.nih.gov/BLAST/]
- emboss einverted. [http://bioweb.pasteur.fr/docs/EMBOSS/einverted.html]
- PALINDROME. [http://bioweb.pasteur.fr/seqanal/interfaces/palindrome.html]
- PROTEOME ALLIANCE. [http://tp12.pzr.uni-rostock.de/~moeller/palim/index.html]