Complete genome determination and analysis of Acholeplasma oculi strain 19L, highlighting the loss of basic genetic features in the Acholeplasmataceae

Background: Acholeplasma oculi belongs to the Acholeplasmataceae family, comprising the genera Acholeplasma and ‘Candidatus Phytoplasma’. Acholeplasmas are ubiquitous saprophytic bacteria. Several isolates are derived from plants or animals, whereas phytoplasmas are characterised as intracellular parasitic pathogens of plant phloem and depend on insect vectors for their spread. The complete genome sequences for eight strains of this family have been resolved so far, all of which relied on clone-based sequencing.
Results: The A. oculi strain 19L chromosome was sequenced using two independent approaches. The first approach comprised sequencing by synthesis (Illumina) in combination with Sanger sequencing, while single molecule real time sequencing (PacBio) was used in the second. The genome was determined to be 1,587,120 bp in size. Sequencing by synthesis resulted in six large genome fragments, while the single molecule real time sequencing approach yielded one circular chromosome sequence. High-quality sequences were obtained by both strategies, differing in six positions, which are interpreted as reliable variations present in the culture population. Our genome analysis revealed 1,471 protein-coding genes and highlighted the absence of the F1FO-type Na+ ATPase system and GroEL/ES chaperone. Comparison of the four available Acholeplasma sequences revealed a core-genome encoding 703 proteins and a pan-genome of 2,867 proteins.
Conclusions: The application of two state-of-the-art sequencing technologies highlights the potential of single molecule real time sequencing for complete genome determination. Comparative genome analyses revealed that the process of losing particular basic genetic features during genome reduction occurs in both genera, as indicated for several phytoplasma strains and at least A. oculi. The loss of the F1FO-type Na+ ATPase system may separate Acholeplasmataceae from other Mollicutes, while the loss of those genes encoding the chaperone GroEL/ES is not a rare exception in this bacterial class.
Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-931) contains supplementary material, which is available to authorized users.



Background
Acholeplasma species comprise bacteria of the family Acholeplasmataceae in the class Mollicutes, characterised by the lack of sterol requirement for growth and thereby separated from Mycoplasmataceae and Spiroplasmataceae [1]. The majority of Acholeplasma spp. are described as saprophytes and commensals. An evident assignment as pathogens is hampered by the fact that several Acholeplasma spp. are distributed ubiquitously. Moreover, no primary pathogen is described for this genus. However, the isolation of strains from diseased animals, and classification as putative animal pathogens, applies to species such as A. axanthum and A. oculi [2,3]. This assignment of the type strain A. oculi 19L (syn. A. oculusi) was the result of its isolation from goat eyes infected with keratoconjunctivitis and re-infection experiments [2]. However, the assignment of A. oculi to this disease is rare in contrast to several Mycoplasma spp. [4].
Besides Acholeplasma, the Acholeplasmataceae family also includes the provisional taxon 'Candidatus Phytoplasma'. Phytoplasmas are associated with several hundred plant diseases and thus significant economic losses [5]. After insect vector-mediated transmission, phytoplasmas colonise the sieve cells of a plant as intracellular obligate parasites, often resulting in abnormal growth and reduced vitality. No general evidence for pathogenesis by acholeplasmas in colonised insects and plants has been provided to date. However, a recently published study on the A. laidlawii strain PG-8 supports its phytopathogenicity, which can be increased after nanotransformation in ultramicroform cells and might be correlated to extracellular vesicle formation under experimental conditions [6]. Further studies are needed in this respect, but the results may indicate a mechanism shared by both genera. In contrast, experimentally proven effector proteins or membrane proteins involved in phytoplasma-host interaction were not identified in the acholeplasmas [7]. These genetic elements of phytoplasmas might have originated from horizontal gene transfers. Massive gene loss, in combination with duplication events and genome instability, separates the phytoplasmas from the acholeplasmas. The complete genome sequences of eight strains of this family have been published, comprising the acholeplasmas A. laidlawii strain PG-8A [8], A. brassicae strain O502 and A. palmae strain J233 [7] and the phytoplasmas 'Ca. P. australiense' strain rp-A [9] and NZSb11 [10], 'Ca. P. asteris' strain OY-M [11] and AY-WB [12] and 'Ca. P. mali' strain AT [13]. In the past, all five phytoplasma strains and A. laidlawii were sequenced by applying the whole genome shotgun approach and using plasmid or fosmid libraries as templates for dye-terminator sequencing (Sanger sequencing). In determining the chromosome sequences of A. brassicae and A. palmae, a combination of Sanger sequencing and next generation sequencing methods (pyrosequencing, 454 Life Sciences/Roche) was applied for the first time to this bacterial family [7].
Both taxa show characteristic gene losses. In comparison to acholeplasmas, phytoplasmas lack the F1FO ATP synthase complex, the cell division protein FtsZ, a wider variety of ABC transporters, the Rnf complex and the membrane protein SecG of the Sec-dependent secretion system. Moreover, acholeplasmas possess a rich repertoire of enzymes involved in carbohydrate metabolism, fatty acids, isoprenoids and partial amino acid metabolism [7]. Because these findings were inferred from the analyses of three acholeplasma and five phytoplasma genome sequences, it remains unclear to what extent these differences between the two genera can be truly generalised, or whether other acholeplasmas might share some of these features of their genetic repertoire with the phytoplasmas. Therefore, we determined the complete genome of A. oculi strain 19L by applying two different strategies based on sequencing by synthesis (SBS, Illumina) and, in a second approach, single molecule real time (SMRT, PacBio) sequencing. The subsequent analyses highlight the efficiency of current sequencing technologies and provide remarkable insights into the evolution of Acholeplasmataceae.

Comparison of assemblies derived from SBS and SMRT sequencing
The SBS and SMRT approaches enabled the efficient reconstruction of the complete genome sequence in independent experiments. SBS sequencing provided 1,095,574 paired-end, quality-passed reads with an average length of 101 nt (total read length of 110,652,974 nt). De novo assembly of the SBS-derived reads alone led to the incorporation of 964,613 reads (88%) into six large contigs (513,260 bp, 244,477 bp, 109,253 bp, 547,590 bp, 106,516 bp and 52,461 bp in size), showing a 64-fold read coverage on average (Table 1, Figure 1) and reaching a total contig length of 1,573,557 bp. In turn, the mapping of SBS reads on the finished genome sequence revealed no uncovered regions. Gap regions derived from the SBS de novo assembly cover repetitive sequences of high similarity. In detail, two gaps (4,748 bp and 4,898 bp in size) include the rRNA operon regions (99% sequence identity), two gaps (1,661 bp each) include two transposases (100% sequence identity), one gap (186 bp) borders Acholeplasma phage L2 (>92% sequence identity) and the smallest gap (176 bp) is located close to a heavy metal translocating P-type ATPase (92% sequence identity). Gaps derived from the assembly of SBS reads were closed by primer-walking (Sanger sequencing), resulting in a complete circular chromosomal sequence.
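The SBS assembly figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python that uses only the contig sizes and read counts reported in the text; the N50 value and the coverage estimate are not stated in the paper, and the coverage formula (assembled reads times mean read length, divided by total contig length) is a simplifying assumption that gives a slightly lower value than the assembler-reported 64-fold.

```python
# Contig sizes (bp) and read counts as reported in the text above.
contig_sizes = [513260, 244477, 109253, 547590, 106516, 52461]
total_reads = 1_095_574
assembled_reads = 964_613
mean_read_length = 101  # nt


def n50(sizes):
    """Return the contig length at which half of the assembly is reached."""
    if not sizes:
        raise ValueError("empty list of contig sizes")
    half = sum(sizes) / 2
    running = 0
    for size in sorted(sizes, reverse=True):
        running += size
        if running >= half:
            return size


total_length = sum(contig_sizes)
assembled_fraction = assembled_reads / total_reads
# Rough estimate only: assumes every assembled read contributes its full
# mean length, so it need not match the assembler's own coverage figure.
approx_coverage = assembled_reads * mean_read_length / total_length

print(f"total contig length: {total_length:,} bp")
print(f"N50 (not reported in the paper): {n50(contig_sizes):,} bp")
print(f"reads assembled: {assembled_fraction:.1%}")
print(f"approximate coverage: {approx_coverage:.1f}-fold")
```

The summed contig length reproduces the reported 1,573,557 bp, and the assembled fraction rounds to the reported 88%.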
The SMRT sequencing approach provided 42,300 SMRT reads with a mean read length of 6,747 nt (total read length of 285,414,973 nt). A total of 38,875 reads enabled the gapless reconstruction of the circular chromosome at a size of 1,587,116 bp. A consensus concordance of 99.9991% and 144.1-fold average sequence coverage were reached. Around 8% of the SMRT-derived reads were rejected due to insufficient quality or incomplete read processing during the assembly.
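To put the SMRT figures into perspective, the reported consensus concordance can be converted into an expected number of discordant bases. This is a hedged sketch: it assumes that the concordance value can be read as a per-base agreement rate across the whole chromosome, which the text does not state explicitly, and the Phred-like quality score is derived here for illustration only.

```python
import math

# Values taken from the paragraph above.
genome_length = 1_587_116   # bp, size of the SMRT assembly
concordance = 0.999991      # 99.9991% consensus concordance
total_reads = 42_300
assembled_reads = 38_875

# Assumption: concordance is a per-base agreement rate.
error_rate = 1 - concordance
expected_discordant_bases = genome_length * error_rate
phred_like_quality = -10 * math.log10(error_rate)
rejected_fraction = 1 - assembled_reads / total_reads

print(f"expected discordant bases: {expected_discordant_bases:.1f}")
print(f"Phred-like consensus quality: {phred_like_quality:.1f}")
print(f"reads rejected: {rejected_fraction:.1%}")
```

Under this assumption roughly 14 discordant bases would be expected across the chromosome, and the rejected read fraction matches the "around 8%" given in the text.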
Apart from the 14 kb size difference in the total contig length obtained by the two methods, it is remarkable that only 28 positions differ, concerning 31 bases in total (Table 2). Amplifying these positions, and re-sequencing by Sanger, led to the unambiguous assignment of 22 positions. In addition, the results indicated the presence of six polymorphic positions. Thus, both approaches enabled the efficient reconstruction of the complete chromosome sequence. The final sequence is in accordance with the results obtained through the Sanger sequencing of PCR products covering the ambiguous regions. Consequently, only polymorphisms occurring at high frequency in one PCR template were detected as double peaks in the chromatograms. We conclude that the majority of differences indicate that polymorphisms were already present in the original cell population. Furthermore, one of the substitutions results in a nonsynonymous exchange (Aocu_13520/rpsE) within the 30S ribosomal protein S5, while three substitutions result in synonymous base exchanges (Aocu_08380, Aocu_10040/mutS, Aocu_14610) and two positions are located in intragenic regions. A high incidence of differences (14 nucleotides, Table 2) was identified within a putative phage-associated region (position 972,742-977,480), which may undergo rapid degradation and comprises an integrase (Aocu_08840) and fragments of a restriction-modification system (truncated restriction endonuclease, Aocu_08860; truncated restriction endonuclease, Aocu_08870 and hsdM, Aocu_08880).
In summary, Sanger sequencing confirmed the SBS-derived sequences for 22 out of 28 differences. The deviating SMRT data at these positions may indicate errors or rare sequence variations within the final chromosome sequence of 1,587,120 bp. Polymorphic sites at six sequence positions of the chromosome are supported by SMRT assembly, deviating SBS reads and Sanger sequencing.
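The distinction between synonymous and nonsynonymous exchanges made above can be illustrated with a short sketch. The code assumes the standard codon assignments and uses made-up example codons; it does not reproduce the actual polymorphic positions of A. oculi, and the function name `classify_substitution` is hypothetical.

```python
from itertools import product

# Standard codon assignments (assumption: adjust the table if the organism
# under study uses a different genetic code).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, offset, new_base):
    """Classify a single-base change within a codon.

    codon    -- reference codon on the coding strand, e.g. "GCT"
    offset   -- 0-based position of the changed base within the codon
    new_base -- the alternative base observed at that position
    """
    codon = codon.upper()
    new_base = new_base.upper()
    if len(codon) != 3 or offset not in (0, 1, 2):
        raise ValueError("expected a 3-nt codon and an offset of 0, 1 or 2")
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    if mutated == codon:
        raise ValueError("the alternative base equals the reference base")
    if CODON_TABLE[codon] == CODON_TABLE[mutated]:
        return "synonymous"
    return "nonsynonymous"


# Hypothetical examples, not the positions reported in Table 2.
print(classify_substitution("GCT", 2, "C"))  # Ala -> Ala
print(classify_substitution("GAT", 0, "A"))  # Asp -> Asn
```

In practice the codon and offset would be derived from the annotated reading frame of the affected gene before such a classification is applied.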
Table 2 Sixteen primer pairs were designed for the PCR and Sanger sequencing of 28 different regions resulting from the SMRT (PacBio) and SBS (Illumina) approaches. Primer pairs, numbers of identified differences, positions in the submitted sequence, and SMRT- and SBS-determined sequences in any particular position are provided in addition to the Sanger sequencing results derived from the PCR product. Ambiguous results obtained by Sanger sequencing (*) were interpreted from the sequencing chromatograms.

Benchmarks of the genome of A. oculi and its comparison to other Acholeplasmataceae
The finished consensus sequence of A. oculi strain 19L consists of 1,587,120 bp encoding two rRNA operons, 34 tRNA genes and 1,471 predicted protein-coding genes (Table 3). A. laidlawii strain PG-8A is the closest known relative of A. oculi strain 19L, which is supported by the construction of the phylogenetic tree (Figure 2). This close relationship is also reflected by the prediction of 1,068 shared proteins (77%) compared to 866 (60%) and 973 (57%) proteins shared with A. brassicae strain O502 and A. palmae strain J233 (Figure 3). The predicted core of the four Acholeplasma spp. consists of 703 proteins and the calculated pan-genome comprises 2,867 proteins in total (Figure 4). A. brassicae possesses the highest number of unique proteins (570) and also exhibits the largest genome in this family (1,877,792 bp, 1,704 protein-coding genes, Table 3). Only a basic set of proteins is shared between A. oculi and the five complete phytoplasma genomes. The 293 to 310 predicted shared proteins (27% to 59%) are consistent with previously calculated numbers for other acholeplasmas [7,8] (Figure 3). The second highest number of unique proteins (440) is predicted for A. palmae (Figure 4), which is the closest known relative of the phytoplasmas [7] (Figure 2), a relationship also supported by the highest number of predicted shared proteins with the five completely sequenced phytoplasmas (Additional file 1). In second position among the acholeplasmas, A. oculi shares many of its proteins with the phytoplasmas, supported by its phylogenetic assignment and the PanOCT results obtained. This analysis is also supported by A. oculi and A. laidlawii, which share the highest number of proteins amongst the acholeplasmas (Figure 3). The phytoplasmas' genome reduction process is reflected by the low number of 294 proteins assigned to the shared core within the pan-genome (2,077 proteins in total; Figure 5). Phytoplasma genomes are characterised by extensive gene losses, transposon-mediated gene duplication [12] and horizontal gene-integration events [15].
Figure 2 Phylogenetic tree based on 16S rRNA gene sequences of acholeplasmas (orange) and phytoplasmas (green). The tree was calculated by employing the maximum likelihood algorithm and bootstrap calculation for 1,000 replicates (only values of at least 70% are shown). The bar indicates 0.05 substitutions per nucleotide. The accession numbers are given in parentheses. Mycoplasma genitalium strain G37 is set as an out-group. Species with complete chromosomes available are shown in bold. Roman numerals are given according to acholeplasma clades [14]. The coloured boxes indicate that genes encoding F1FO Na+ ATP synthase (light blue), V1VO Na+ ATP synthase (yellow), V1VO H+ ATP synthase (dark blue), GroEL (red) or SecB (violet) are present (limited to complete genome sequences).
Comparing the pan-genomes of acholeplasmas and phytoplasmas, Venn analysis highlights basic differences in the overall gene content. Complete Acholeplasma and 'Ca. Phytoplasma' genomes collectively encode 402 and 14 predicted unique proteins, respectively (Figure 6, Additional file 2). The 14 unique genes, which are common to the genus 'Ca. Phytoplasma', encode nine hypothetical proteins and five proteins with known functions. Two of the hypothetical proteins contain a sequence-variable mosaic (SVM) motif [16] and comprise SAP05 (AYWB_032), which is described as a putative effector protein [17] inducing the formation of smooth young rosette leaves that lack serrations along the leaf margin [18], and SAP30 (AYWB_402), which is similar to SAP11 containing a eukaryotic nuclear localisation signal [19,20]. This group of unique genes also includes two phytoplasma proteins involved in a suggested alternative pathway in the carbohydrate metabolism of phytoplasmas [7,13,21]. The malate/Na+ symporter (MleP) provides a carbon source which undergoes oxidative decarboxylation by malate dehydrogenase (SfcA), thereby providing pyruvate, which is processed by the dehydrogenase multienzyme complex to yield acetyl coenzyme A. The phosphotransacetylation of acetyl-CoA performed by the PduL-like protein provides acetyl-phosphate, which is processed via acetate kinase (AckA) and results in the formation of ATP and acetate. A. oculi does not encode MleP, SfcA and the phosphate acetyltransferase (Pta), which is common in mycoplasmas, though it is suggested that PduL fulfills this function in Acholeplasmataceae [7,21] including A. oculi. However, the alternative energy-yielding pathway of phytoplasmas utilising malate is clearly not encoded in the analysed acholeplasma genomes. Furthermore, the PanOCT analysis predicted that phytoplasmas encode unique AAA+ ATPase, thymidylate kinase and a DNA-dependent RNA polymerase sigma 70 factor RpoD (IPR013325, IPR014284, IPR007627, Additional file 2). RpoD exhibits only insignificant BlastP hits to acholeplasmas' sigma factors (minimal e-value 9e-08, score 47), and no ortholog was predicted via PanOCT. The existence of a phytoplasma-specific sigma factor points towards some peculiarities in their regulatory system. The other two deduced proteins showed similarities in BlastP analysis to some acholeplasma proteins, although they differed in small domain structures. For instance, the AAA+ ATPase of phytoplasmas gave a hit to the ATP-dependent zinc metalloprotease FtsH, which also contains the AAA+ domain structure, and the thymidylate kinases of acholeplasmas showed an additional conserved site (predicted by the PROSITE database search, http://prosite.expasy.org/), contrary to the thymidylate kinases of phytoplasmas. The overall high number of 402 unique proteins for the four acholeplasmas is interpreted with respect to the diverse environments colonised by acholeplasmas.
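The core, pan-genome and unique protein counts discussed above follow from a simple set-based summary of ortholog clusters. The sketch below illustrates the principle on toy data; the cluster identifiers, genome labels and the function `summarise_pan_genome` are hypothetical, and the code is not a reimplementation of PanOCT, which derives the clusters themselves from sequence similarity and gene neighbourhood.

```python
def summarise_pan_genome(clusters, genomes):
    """Summarise ortholog clusters into pan-genome, core and unique sets.

    clusters -- dict mapping a cluster id to the set of genomes that
                contribute at least one member protein
    genomes  -- iterable of genome labels to include in the comparison
    """
    genomes = set(genomes)
    pan, core = [], []
    unique = {genome: [] for genome in sorted(genomes)}
    for cluster_id, members in clusters.items():
        present = set(members) & genomes
        if not present:
            continue  # cluster has no member in the compared genomes
        pan.append(cluster_id)
        if present == genomes:
            core.append(cluster_id)  # found in every genome
        elif len(present) == 1:
            unique[next(iter(present))].append(cluster_id)
    return {"pan": pan, "core": core, "unique": unique}


# Toy input with four hypothetical genomes A-D.
toy_clusters = {
    "cluster_1": {"A", "B", "C", "D"},
    "cluster_2": {"A", "B"},
    "cluster_3": {"C"},
    "cluster_4": {"A"},
    "cluster_5": {"B", "C", "D"},
}
summary = summarise_pan_genome(toy_clusters, ["A", "B", "C", "D"])
print(len(summary["pan"]), "clusters in the pan-genome")
print(len(summary["core"]), "clusters in the core")
print({genome: len(ids) for genome, ids in summary["unique"].items()})
```

Clusters that are shared by some but not all genomes (here `cluster_2` and `cluster_5`) correspond to the dispensable fraction of the pan-genome.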
A. oculi is also separated from the other three acholeplasmas by the presence of a putative manganese efflux pump (MntP, Aocu_03470), thereby enabling the exportation of manganese ions, which are toxic in higher amounts. The functional relevance of MntP for manganese homoeostasis has been demonstrated for E. coli [24]. The direct comparison of the A. oculi and the E. coli MntP protein shows 31% identical and 56% similar residues. In addition, A. oculi encodes a cadmium resistance transporter (CadD, Aocu_08600) and one amidohydrolase (AmhX, Aocu_08940). In Bacillus subtilis, AmhX enables the cleavage of the amide bond between non-active conjugated amino acids and may mobilise indole-3-acetic acid (IAA) from inactive storage forms in plants besides several other functions [25] (IPR017439 [26]). A. oculi was also detected on plant surfaces [27]. Therefore, one may speculate whether A. oculi can stimulate the growth of colonised plant tissue. Hints for such a manipulation of the IAA metabolism of plants have also been obtained for A. palmae and A. brassicae encoding a putative auxin efflux carrier protein [7], though no experimental studies are available.
A. oculi is separated from the other Acholeplasma spp. by encoding several additional transcriptional regulators such as ubiC (Aocu_00680), gntR (Aocu_00690), Cro/C1 family proteins (Aocu_01770, Aocu_05750, Aocu_08910 and Aocu_13020) and TetR family proteins (Aocu_14450) not assigned to other Cro/C1-type or TetR family proteins in this family. In total, A. oculi encodes 13 Cro/C1 family proteins, nine of which are shared, and four TetR family proteins, one of which is shared by the other acholeplasmas.
Furthermore, A. oculi is separated from other Acholeplasmataceae by encoding the GDP-D-glycero-α-D-manno-heptose biosynthesis pathway providing D-glycero-D-manno-heptose (HddA, GmhA, HddC, GmhB; Aocu_04590-620). This is a precursor of the inner core lipopolysaccharide [28]. These proteins are similar to those found in the pathway that was reconstructed for the Gram-positive bacterium Aneurinibacillus thermoaerophilus strain DSM 10155 (member of Bacillus/Clostridium group) [28]. For acholeplasmas, there is only one report by Mayberry et al. [29] that A. modicum contains heptose among the glycolipids.
Moreover, A. oculi encodes two additional proteins that play a role in the biosynthesis of the amino acid methionine. MetW (Aocu_08790) synthesises methionine from homoserine (IPR010743 [30]), which provides an additional pathway to produce the methionine needed in the initiation of translation. The diaminopimelate epimerase (DapF, Aocu_08990) belongs to the aspartate pathway (IPR001653), from which the four amino acids lysine, threonine, methionine and isoleucine can be synthesised.
All species of the Acholeplasmataceae encode a protein core for the Sec-dependent secretion system (Ffh, FtsY, SecA, SecE, SecY and YidC), whereas the four analysed Acholeplasma spp. additionally encode the membrane protein SecG. The chaperone SecB, which is only encoded in A. laidlawii and A. oculi, binds the precursor protein and directs it to the SecA protein. The function of SecB can also be fulfilled by the proteins DnaK and DnaJ [31], which are encoded in all genome sequences of the family, or by GroEL and GroES [32]. A. oculi lacks the common chaperone GroEL/ES (Figure 7), consistent with conclusions drawn from the draft sequences of phytoplasma strains [33] and the analyses of other species in the Mollicutes that these genes are not essential [34]. The complete genome sequences of the Acholeplasmataceae encode the trigger factor (TF), dnaK, dnaJ, grpE and hrcA. Other heat shock proteins, such as Hsp20, were not identified in A. palmae and 'Ca. P. mali'. Hsp33 is only identified in the acholeplasmas.
Besides GroEL/ES, A. oculi lacks the complete gene set encoding the F1FO-type Na+ ATPase, which was identified in A. laidlawii, A. brassicae and A. palmae (Figures 8 and 9). In turn, A. oculi, A. laidlawii and A. palmae encode one V-type Na+ ATPase. A. palmae differs in gene content by encoding no atpC subunit for this ATPase. In addition, all genes encoding the V1VO H+ ATPase are present in all four acholeplasma strains. Summing up, each acholeplasma species possesses at least one full operon which encodes at least either one H+ or one Na+ ATPase system. The NtpG subunit, namely the rotated central stalk next to NtpD and NtpC [35], is missing in all species. In contrast to the acholeplasmas, the F- and V-type ATPases were not identified in phytoplasmas.

Discussion
F1FO ATPases and V1VO ATPases are membrane complexes which function either as H+ or Na+ translocators [37] (Figure 9). The F1FO ATPase consists of two units: the integral membrane protein FO (atpBEF) acting as a proton channel and the peripheral catalytic stalk F1 (atpHAGDC). The V1VO ATPase is built by the integral membrane protein VO (ntpIK) and the peripheral catalytic stalk V1 (ntpECFABD) [38]. The difference between both transporters is that the V-type ATPase only works in one direction by hydrolysing ATP to produce either a proton or a sodium motive force, while the F-type ATPase is additionally able to act in the other direction by allowing the regulation of the cellular ion pool using the proton motive force, which leads to ATP generation [39].
Following the sequence-based prediction of Dzioba et al. [36], the classification of the ion translocating profile can be inferred from the alignment of the protein sequences of the subunits AtpE (F-type ATPase) and NtpK (V-type ATPase). Certain conserved binding motifs are represented by the amino acids at specific positions, which specify either H+ or Na+ translocation (Figure 9). As a result, one F1FO-type Na+ ATPase is suggested to be encoded by all acholeplasmas except for A. oculi, and one V-type Na+ ATPase is predicted for all acholeplasmas except for A. brassicae. It remains unclear whether the V-type Na+ ATPase of A. palmae is working despite the lack of an atpC subunit, although this species additionally encodes the F1FO-type Na+ ATPase. Moreover, protein sequence alignment leads to the conclusion that all acholeplasmas encode one V-type H+ ATPase. Deductively, all acholeplasmas encode at least one Na+ and one H+ translocator. This finding stands in accordance with the evidence that Acholeplasma laidlawii strain B possesses a (Na+-Mg2+)-ATPase which is capable of actively extruding sodium ions against the concentration gradient [40]. This previously described, but not genetically characterised, cation pump was linked to the characteristically low intracellular sodium level of these bacteria.
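The deduction that every analysed acholeplasma retains at least one Na+ and one H+ translocator can be checked mechanically from the presence and absence pattern stated in the text. The sketch below encodes only what is reported above; the data structure and the helper `translocators` are illustrative assumptions, and the unresolved functionality of the A. palmae V-type Na+ ATPase (no atpC subunit) is not modelled.

```python
# Predicted ATPase systems per species, as described in the text.
ATPASE_SYSTEMS = {
    "F1FO-type Na+": {"A. laidlawii", "A. brassicae", "A. palmae"},
    "V-type Na+": {"A. oculi", "A. laidlawii", "A. palmae"},
    "V-type H+": {"A. oculi", "A. laidlawii", "A. brassicae", "A. palmae"},
}
SPECIES = sorted(set().union(*ATPASE_SYSTEMS.values()))


def translocators(species, ion):
    """List the predicted ATPase systems of a species for one ion ("Na+" or "H+")."""
    return sorted(
        name
        for name, carriers in ATPASE_SYSTEMS.items()
        if name.endswith(ion) and species in carriers
    )


for species in SPECIES:
    sodium = translocators(species, "Na+")
    proton = translocators(species, "H+")
    print(f"{species}: Na+ via {sodium}, H+ via {proton}")

# True if every species has at least one system for each ion.
all_covered = all(
    translocators(species, "Na+") and translocators(species, "H+")
    for species in SPECIES
)
print("every species has a Na+ and a H+ translocator:", all_covered)
```

The table makes the complementary pattern explicit: A. oculi relies on the V-type system alone for Na+ translocation, whereas A. brassicae relies on the F1FO-type system alone.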
Ultimately, the loss of the F1FO-type Na+ ATPase in Acholeplasmataceae, as is the case for A. oculi, may be compensated by the V-type Na+ ATPase. The reason for the loss of this genetic module in phytoplasmas remains unclear, but it might be interpreted with respect to the adaptation of phytoplasmas to the intracellular environment with constant osmotic conditions. The comparison of both V-type ATPase operons encoded by A. oculi highlights low sequence identities of the involved proteins (24% to 52%) and differences in protein lengths (Figure 9). This leads to the suggestion that the operons did not derive from a duplication event.
Besides F1FO ATPase, the loss of groEL/ES is remarkable. Native protein folding is conducted by molecular chaperones such as GroEL/ES (Hsp60), DnaK (Hsp70), DnaJ, GrpE, SecB and other heat-shock proteins (Hsp) [41]. GroEL complexes (800 kDa) consisting of two stacked heptameric rings exhibit ATPase activity [42]. The smaller GroES (10 kDa), together with ATP, binds to GroEL and forms the GroEL/GroES complex. DnaK prevents off-pathway reactions or stabilises certain folding intermediates. DnaJ and GrpE act as co-helpers for DnaK [41]. GroEL/ES is probably replaced by the trigger factor (TF) and DnaK, which has already been shown by Kerner et al. [43] for E. coli, or by the HrcA protein, which is commonly found as a part of the heat-shock regulation of bacteria [44]. TF/DnaK and HrcA are encoded in all analysed species of the Acholeplasmataceae (Figure 7). Several Mollicutes are known to have lost groEL and groES, such as Mesoplasma florum, Mycoplasma hyopneumoniae, Ureaplasma parvum serovar 3, Ureaplasma urealyticum, Mycoplasma mobile and some further Mycoplasma spp. [44]. It is likely that there are even more Mollicutes lacking these proteins. Saccardo et al. [33] suggested, based on draft sequences, that there are four phytoplasma strains of the 16SrIII group that probably lack GroEL/ES. The possibility that this genetic feature can be lost within the Mollicutes is supported by experiments with transposon mutagenesis, showing that GroEL is not or only weakly regulated during heat shock for M. genitalium or M. pneumoniae [45], thereby leading to the suggestion that this chaperone is not essential for Mycoplasma spp. in general and may represent an evolutionary remnant. Two explanations for the evolutionary loss appear possible. First, GroEL is immunogenic, and it would therefore be advantageous to lose it in order to avoid an immune response in mammals [44], a benefit for A. oculi when infecting mammals. Alternatively, Mollicutes possess small genomes which encode few proteins; consequently, they have fewer substrate proteins that must be correctly folded by GroEL.
Figure 9 Alignment made by partial protein sequences of the F-type ATPase subunit c (AtpE) and the V-type ATPase subunit K (NtpK). Assignments were made according to Dzioba et al. [36]. Species are highlighted regarding their phylum assignment to Tenericutes (red), Firmicutes (green) or others (violet). Superscripted numbers indicate position in protein sequence. *UF = unknown function.

Conclusions
This study demonstrated the efficiency of the SMRT approach in the complete de novo determination of bacterial genomes. A. oculi encodes, like other Acholeplasma spp., rich genetic content in comparison to phytoplasmas. The relatively small core genome of phytoplasmas should be interpreted with respect to their intracellular parasitism and their corresponding poor metabolic repertoire. In contrast, acholeplasmas depend on a richer genetic repertoire due to their widespread distribution and colonisation of diverse micro-habitats. However, for the first time, the deduced protein content of A. oculi highlights that the loss of basic genetic elements, including the chaperone GroEL/ES and the F1FO-type Na+ ATPase system, took place in both genera of the Acholeplasmataceae. One could therefore speculate that the common V-type H+ ATPase system in acholeplasmas may regulate the cellular proton pool, and the V-type Na+ ATPase system may compensate for the lack of the F1FO-type Na+ ATPase. The loss of GroEL/ES is interpreted as being not extraordinary for Mollicutes and seems to have occurred several times within this class.

Cultivation
The A. oculi strain 19L isolate was kindly provided by Jerry K. Davis (Purdue University School of Veterinary Medicine, West Lafayette, Ind., USA) from the strain collection of the International Organization for Mycoplasmology (IOM). Cells were cultivated in ATCC® Medium 1039 (www.atcc.org) supplemented with 0.2% polymyxin B (Roth, Karlsruhe, Germany) and 0.2% penicillin G (Merck, Darmstadt, Germany) at 28°C for about 14 days and collected by centrifugation (20 min, 10,000 rpm, 4°C). The DNA isolation of A. oculi strain 19L for SBS was performed with the DNeasy Blood & Tissue Kit (Qiagen, Hildesheim, Germany) according to the manufacturer's instructions. High molecular weight genomic DNA for preparing the PacBio 10-kb library was isolated according to Moore et al. [46].

Sequencing and assembly of SBS data
DNA-Seq libraries were prepared from fragmented DNA (COVARIS S2, Woburn, Massachusetts, USA) according to recommendations made by the supplier (TruSeq DNA sample preparation v2 guide, Illumina, San Diego, CA, USA). Libraries were quantified by fluorometry, immobilised and processed onto a flow cell with a cBot followed by sequencing by synthesis by applying TruSeq v3 chemistry on a HiSeq2500 (all components by Illumina).
The de novo assembly of the reads was performed in CLC Genomics Workbench 7.0 (www.clcbio.com). The assembly data were exported as a BAM file, indexed using SAMtools [47] and imported into Gap5 [48]. Gaps were closed by PCR and primer-walking by applying dye-terminator sequencing performed on an ABI 310 capillary sequencer (Life Technologies, Carlsbad, CA, USA).

Sequencing and assembly of SMRT data
A 10-kb library was prepared and processed as recommended by Pacific Biosciences (www.smrtcommunity.com/SampleNet/Sample-Prep). Library construction and subsequent sequencing were performed using the SMRTbell Template Preparation Reagent Kit 1.0, DNA/Polymerase binding kit P4-C2, MagBead Kit and DNA Sequencing Kit 2.0 (all components supplied by Pacific Biosciences, Menlo Park, CA, USA). The genome was sequenced using PacBio RS II technology (P4-C2 chemistry). Data collected on the PacBio RS II instrument were processed and filtered (SMRT analysis software, version 2.1). All experiments were conducted according to the manufacturers' instructions on a single SMRT cell. Obtained data were analysed on the SMRT Portal V2.1.1 (www.pacb.com/devnet/) by applying the integrated Celera® Assembler. SMRT sequencing and SBS were performed by the Max Planck-Genome-centre Cologne, Germany (http://mpgc.mpipz.mpg.de/home/).

Identification of sequencing differences comparing both sequencing methods
Rare sequencing differences were identified via BlastN (low complexity filter off, word size 7) [49] by applying the SMRT-derived genome sequence as a reference. In addition, SBS data were mapped onto the SMRT sequence in CLC Genomics Workbench 7.0, and Primer-BLAST [50] was used for designing oligonucleotide pairs, thus enabling the PCR amplification of conflict regions (Table 2). Sequences of PCR products were determined by applying dye-terminator sequencing.

Annotation of the genome sequence
The oriC region was determined through the cumulative GC-skew calculation of the chromosome sequence in Artemis [51] and the determination of the DnaA-boxes [8]. The adjusted genome sequence was automatically annotated in RAST [52] and the annotation was manually curated in Artemis by incorporating additional analyses obtained from the InterProScan database [53], RNAmmer [54] and tRNAscan-SE [55].  [13]. The results obtained by the software were also used for the prediction of the pan-, dispensable and core genome [57] of each genus and the family [58].
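The cumulative GC-skew calculation used to locate oriC can be sketched as follows. This is a generic illustration of the technique and not the Artemis implementation: the window size, the function names and the toy sequence are assumptions, and in practice the predicted position would be refined using the DnaA-boxes as described above.

```python
def cumulative_gc_skew(sequence, window=1000):
    """Return (window_end, cumulative skew) pairs along a sequence.

    The skew of each non-overlapping window is (G - C) / (G + C); windows
    without any G or C leave the running total unchanged.
    """
    sequence = sequence.upper()
    points = []
    total = 0.0
    for start in range(0, len(sequence), window):
        chunk = sequence[start:start + window]
        g_count = chunk.count("G")
        c_count = chunk.count("C")
        if g_count + c_count:
            total += (g_count - c_count) / (g_count + c_count)
        points.append((start + len(chunk), total))
    return points


def predicted_origin(sequence, window=1000):
    """Position at which the cumulative GC skew reaches its minimum.

    In many bacterial chromosomes this minimum lies close to the origin of
    replication, and the maximum lies close to the terminus.
    """
    position, _ = min(cumulative_gc_skew(sequence, window), key=lambda p: p[1])
    return position


# Toy sequence: a C-rich half followed by a G-rich half, so the cumulative
# skew falls over the first half and rises over the second.
toy_sequence = "C" * 5000 + "G" * 5000
print(predicted_origin(toy_sequence, window=1000))
```

For a real chromosome, the sequence would be read from a FASTA file and a window of several kilobases is a common choice, because smaller windows yield noisier curves.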