- Research article
- Open Access
Simple sequence repeats in Helicobacter canadensis and their role in phase variable expression and C-terminal sequence switching
BMC Genomics volume 11, Article number: 67 (2010)
Helicobacter canadensis is an emerging human pathogen and zoonotic agent. The genome of H. canadensis was sequenced previously and determined to contain 29 annotated coding regions associated with homopolymeric tracts.
Twenty-one of the repeat-associated coding regions were determined to be potentially transcriptionally or translationally phase variable. In each case the homopolymeric tract was within the predicted promoter region or at the 5' end of the coding region, respectively. However, eight coding sequences were identified with simple sequence repeats toward the 3' end of the open reading frame. In these cases, the repeat tract would be too far into the coding region to be mediating translational phase variation. All of the 29 coding region-associated homopolymeric tracts display variability in tract length in the sequencing read data.
Twenty-nine coding regions have been identified in the genome sequence of Helicobacter canadensis strain NCTC13241 that show variations in homopolymeric tract length in the bacterial population, indicative of phase variation. Five of these are potentially associated with promoter regions, which would lead to transcriptional phase variation. Translational phase variation usually switches expression of a gene ON and OFF due to the repeat region being located sufficiently close to the initiation codon for the resulting frame-shift to lead to a premature termination codon and stop the translation of the protein. Sixteen of the 29 coding regions have homopolymeric tracts characteristic of translational phase variation. For eight coding sequences with repeats located later in the reading frame, changes in the repeat tract length would alter the protein sequence at the C-terminus but not stop the expression of the protein. This mechanism of C-terminal phase variation has implications for stochastic switching of protein sequence in bacterial species that already undergo transcriptional and translational phase variation.
The gastric pathogen Helicobacter pylori was the first identified Helicobacter species, was first cultured in 1984, and is associated with peptic ulcers, mucosa-associated lymphoid tissue lymphomas, chronic active gastritis, chronic atrophic gastritis and subsequent carcinomas, persistent diarrhoea, and increased susceptibility to other diseases [1–3]. Since 1984, over 20 further Helicobacter species have been identified, many of which have been implicated as animal pathogens, but in general these remain poorly characterised.
To date, three H. pylori genome sequences have been published, one each isolated from cases of gastritis, duodenal ulcer, and chronic atrophic gastritis. Five additional genome sequences, not yet completed and published, are from three cases of gastric carcinoma, one from gastric ulcer, and one from a remote Amazonian village. But humans are not the only species plagued by gut problems associated with these bacteria; Helicobacter acinonychis, the genome sequence of which was published in 2006, is believed to contribute to the severe gastritis experienced by cheetahs and other big cats that can lead to premature death of these felines in captivity [8, 9]. At the other end of the mammalian spectrum, mouse colonies with increased incidence of liver tumours were found to be colonized with Helicobacter hepaticus [10, 11], for which strain ATCC 51449 has been genome sequenced. In addition, the Helicobacteraceae includes Wolinella succinogenes, a non-pathogen that can be isolated from the rumen of cattle and is considered to be a phylogenetic intermediate between Helicobacter and Campylobacter. A genome sequence is available for W. succinogenes strain DSM1740.
Some isolates which might have previously been taken for known Helicobacter spp. or Campylobacter spp. are instead novel Helicobacter species. Helicobacter pullorum has been repeatedly isolated from chickens and detected in chicken faeces by molecular methods [16–21]. This species is also found in humans [18, 20, 22–25], where it is associated with gastroenteritis, liver disease, diarrhoea, and gall bladder disease [20, 24–26]. H. pullorum differs from most other Helicobacter spp. in that it lacks a flagellar sheath and is unable to hydrolyze indoxyl acetate. It is this latter phenotype that eventually revealed that some of the isolates being classified as H. pullorum were actually a novel species, Helicobacter canadensis [18, 20, 24], which is indoxyl acetate positive. Labelled an emerging pathogen and zoonotic agent, H. canadensis has been isolated from geese, rodent faeces, swine, and humans, where it has been associated with diarrhoea and bacteraemia [30, 31].
During the annotation of the complete genome sequence of H. canadensis strain NCTC13241, simple sequence repeats were sought and a repertoire of 29 homopolymeric tract-associated coding regions was identified. Five candidates for transcriptional phase variation and 16 candidates for translational phase variation were identified. The remaining eight annotated coding sequences were identified with long poly-G tracts (≥ 10 bp) toward the end of the annotated coding region. These represent a novel mechanism for phase variation in which stochastic switching of simple sequence repeat tract lengths mediates changes in expressed protein sequence. This is distinct from transcriptional phase variation, in which the distance and facing of promoter elements are altered, and from translational phase variation, in which frame-shifts resulting from repeat tract length changes lead to premature termination. We propose that this C-terminal phase variation may also be found in other species employing transcriptional and translational phase variation.
Results and Discussion
Search for phase variable genes
Given the presence of an extensive repertoire of phase variable genes in other Helicobacter spp. [4, 5, 11, 33, 34], simple sequence repeats were sought in the H. canadensis strain NCTC13241 genome sequence data. If repeats were discovered to be within the context of a predicted coding region or predicted promoter such that they had the potential to mediate phase variation, then that feature was annotated as potentially phase variable (Table 1).
Based on previous investigations [33–35], searches were made of the genome sequence data for homopolymeric tracts greater than or equal to (G)7, (C)7, (A)9, and (T)9 and for dinucleotide repeat tracts with five or more copies of the dinucleotide. No potential dinucleotide-mediated phase variable CDSs were identified in the H. canadensis genome sequence data. For all homopolymeric tracts the context of the repeat, tract length, and presence of a frame-shift were assessed to determine that the H. canadensis genome sequence contains 21 potential phase variable genes (Table 1). Of these, six are strong candidates, eight are good candidates, and two are putative candidates for translational phase variation, while five are possibly promoter-associated, mediating transcriptional phase variation. Strong candidates contained frame-shift mutations or long tract lengths (>9), or both. Good candidates contained long tracts in the appropriate position to mediate translational switching. One of the putative candidates (HCAN_0659) contained a shorter (G)9 tract length and no frame-shift, while the other (HCAN_0344) would be considered a strong candidate if an alternative initiation codon 5' of the repeat were chosen rather than that annotated.
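The search criteria above can be sketched in a few lines of Python. This is a minimal illustration and not the GenDB search used in the study: the function names and the example sequence are hypothetical, and only the thresholds ((G)7, (C)7, (A)9, (T)9, and five or more dinucleotide copies) are taken from the text. Only the given strand is scanned, so a poly-C hit corresponds to a poly-G tract on the complementary strand.

```python
import re

# Minimum tract lengths quoted in the text: (G)7, (C)7, (A)9, (T)9.
MIN_HOMOPOLYMER = {"G": 7, "C": 7, "A": 9, "T": 9}
# Minimum number of dinucleotide copies quoted in the text.
MIN_DINUCLEOTIDE_COPIES = 5


def find_homopolymeric_tracts(seq):
    """Return (start, base, length) for every run that meets its threshold."""
    hits = []
    for match in re.finditer(r"([ACGT])\1*", seq.upper()):
        base = match.group(1)
        length = match.end() - match.start()
        if length >= MIN_HOMOPOLYMER[base]:
            hits.append((match.start(), base, length))
    return hits


def find_dinucleotide_tracts(seq):
    """Return (start, unit, copies) for dinucleotide repeats of two different bases."""
    # Group 1 is the dinucleotide unit; (?!\2) forces its two bases to differ.
    pattern = re.compile(
        r"(([ACGT])(?!\2)[ACGT])\1{%d,}" % (MIN_DINUCLEOTIDE_COPIES - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        copies = (match.end() - match.start()) // 2
        hits.append((match.start(), match.group(1), copies))
    return hits


# Hypothetical sequence: one (G)9 tract, one (A)8 run below threshold, one (GA)5 repeat.
example = "ATGC" + "G" * 9 + "TTAC" + "A" * 8 + "CC" + "GA" * 5 + "T"
print(find_homopolymeric_tracts(example))
print(find_dinucleotide_tracts(example))
```

Each hit would still need to be inspected in its genomic context (promoter, 5' end, or 3' end of a CDS) before being called a phase variation candidate, as described in the text.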
It is interesting to note that in all cases the translational phase variation is mediated by poly-G tracts, whereas candidates with poly-C, -A, -T, -GA, -CT, -TC, -AT, and -AG tracts have been reported in other Helicobacter spp. [4, 5, 11, 33, 34]. The sole occurrence of a poly-A tract potentially involved in phase variation in this H. canadensis strain is in the putative promoter region of HCAN_0162, a conserved hypothetical gene.
C-terminal phase variation
In addition to the potentially phase variable genes identified, eight CDSs were found to contain homopolymeric tracts at the 3' end (Table 2) of the reading frame. In this location these could not be mediating phase variable expression of the CDS as most of the mRNA would be translated before reaching the late homopolymeric tract and subsequent termination codon. In most cases, an alteration in the reading frame at that point would change the length of the encoded protein by only a few amino acids (Table 2).
It is interesting to note that, like the transcriptional and translational phase variable CDSs, all of these C-terminal variation CDSs contain poly-G tracts and all are (G)10 or more (Table 2). Given the predominance of poly-G tracts among the transcriptionally and translationally phase variable genes (Table 1) and the length of these tracts (Table 2), these data support a role for tract instability at the end of the gene, as well as at the beginning of the gene and within the promoter.
One of these, flaG (HCAN_0914), contains a (G)11 tract five bases before the termination codon, and alteration of the tract length would lead to slight changes in the length of the encoded protein. As annotated, there are two amino acids before the termination codon. In the other two frames there would be 16 amino acids or no amino acids before the termination codons in those frames (Table 2). FlaG has been shown to affect flagellar length and adherence to HEp-2 cells in Aeromonas. Thus it is intriguing that changes in the length of this homopolymeric tract would alter the C-terminus of this protein.
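The effect of a late tract on the C-terminus can be illustrated with a toy reading frame. The sketch below is hypothetical throughout: the prefix and suffix sequences are invented and are not taken from flaG or any other H. canadensis CDS. It uses the standard codon assignments to show that adding or removing a G late in the frame leaves the N-terminal sequence intact and changes only the residues between the tract and the next stop codon.

```python
from itertools import product

# Standard codon assignments, codons ordered over T, C, A, G ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_to_stop(dna):
    """Translate from the first base until a stop codon or the end of the sequence."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def protein_for_tract_length(prefix, tract_length, suffix):
    """Protein encoded when a poly-G tract of the given length joins prefix and suffix."""
    return translate_to_stop(prefix + "G" * tract_length + suffix)


# Hypothetical sequences chosen so that every frame reaches a stop codon.
PREFIX = "ATGAAACTT"
SUFFIX = "TAACTGATTAGCC"

for n in (9, 10, 11):
    print(n, protein_for_tract_length(PREFIX, n, SUFFIX))
# 9 MKLGGG
# 10 MKLGGGVTD
# 11 MKLGGGGN
```

In all three cases a protein is produced, which is the distinction drawn in the text between C-terminal variation and the ON/OFF switching of translational phase variation.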
For HCAN_0660, a change in reading frame could result in a merger of this CDS and the next downstream CDS, HCAN_0661 encoding 82 amino acids. Although both CDSs are currently hypothetical, such an arrangement suggests that the protein sequences may have roles that can alternatively be filled as individual proteins or jointly as one protein. This finding, that a phase variation event can determine whether two proteins are made separately or are translated as one protein, is perhaps unprecedented. It may be that degeneration events have disrupted similar fusions in this or other bacterial species, which are not now apparent from sequence data.
Population-level variation within homopolymeric tracts
Previous studies into phase variable genes of the Helicobacter spp. have indicated that phase variable switching of a gene can be demonstrated at the population level through DNA sequencing. It has been shown that variations in the lengths of homopolymeric tracts are attributable to variations in the DNA template population and are not an artefact of the sequencing process, although this may be influenced by the length of repeat and the experimental conditions.
When the read data from the genome sequencing were analyzed for variations in homopolymeric tract length, it was found that all of the repeats investigated showed some degree of variation (Tables 1 & 2). While it is known that homopolymeric tracts can be problematic for this sequencing technology (454, Roche), the results are consistent with the level of variation that would be expected at the population level. Phase variable bacteria are known to have dynamically changing populations undergoing phase variation events that lead to intra-population diversity and, in some cases, visible differences on culture plates, including colony sectoring. Variation in repeat lengths at the sequence level should therefore be expected. In each case the range of the repeat lengths observed here is indicated in Tables 1 & 2, and the alignments of the read data are available in Additional files 1 to 29.
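One way to summarize tract length variation across reads is to anchor on the unique sequence flanking a repeat and count the length of the run between the flanks. The sketch below is an assumption about how such a tally could be made and is not the procedure used to build the Additional files; the function name, flanks, and reads are all hypothetical. It assumes the left flank does not end in, and the right flank does not begin with, the repeated base.

```python
import re
from collections import Counter


def tract_length_distribution(reads, left_flank, right_flank, base="G"):
    """Count how many reads carry each tract length between the two flanks."""
    pattern = re.compile(
        re.escape(left_flank.upper()) + "(" + base + "+)" + re.escape(right_flank.upper())
    )
    counts = Counter()
    for read in reads:
        match = pattern.search(read.upper())
        if match:  # reads that do not span both flanks are ignored
            counts[len(match.group(1))] += 1
    return counts


# Hypothetical reads covering one poly-G tract.
reads = [
    "ACTTA" + "G" * 10 + "TCCAT",
    "ACTTA" + "G" * 11 + "TCCAT",
    "ACTTA" + "G" * 10 + "TCCAT",
    "ACTTA" + "G" * 12 + "TCCAT",
    "ACTTA" + "G" * 10,  # truncated read, right flank missing
]
distribution = tract_length_distribution(reads, "ACTTA", "TCCAT")
for length in sorted(distribution):
    print(length, distribution[length])
```

Because homopolymer lengths are error-prone on this sequencing platform, as the text notes, a tally like this indicates variation but cannot by itself separate template variation from sequencing error.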
This level of variation is suggestive evidence of phase variation of these coding regions, indicating that each of these coding regions warrants further study. This evidence is particularly important for those that are putative candidates (Table 1) and those that contain C-terminal repeats (Table 2). In all of the translational phase variation candidates the repeat lengths would correspond to both ON and OFF states of full-length gene expression. The highest degree of variability is seen in HCAN_0457, encoding the vacuolating cytotoxin precursor, with 12 to 20 copies of the poly-G repeat.
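The relationship between tract length and expression state follows from codon arithmetic: lengths that differ from an in-frame length by a multiple of three keep the reading frame, and all others shift it. The sketch below makes that explicit. The in-frame length passed to the function is a hypothetical placeholder, since the text does not state which observed lengths are in frame for any given CDS.

```python
def expression_states(observed_lengths, on_length):
    """Label each tract length ON if it preserves the frame of on_length, else OFF."""
    return {
        length: "ON" if (length - on_length) % 3 == 0 else "OFF"
        for length in observed_lengths
    }


# Range of 12 to 20 copies as reported for HCAN_0457; on_length=14 is an
# assumption made only for illustration.
states = expression_states(range(12, 21), on_length=14)
for length, state in states.items():
    print(length, state)
```

Any observed range spanning three or more consecutive lengths necessarily contains both in-frame and out-of-frame lengths, whichever length is taken as in frame.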
Potential structural consequences of C-terminal phase variation
Each of the eight CDSs identified as containing a homopolymeric tract at the C-terminal end of the reading frame was translated and compared against the NCBI Conserved Domain Database (CDD). In each case where structural data were available, the Cn3D model was assessed to determine what the potential structural consequences would be of alterations in the length and sequence of the C-terminus.
The C-terminal structures available for proteins which share similarity with HCAN_0165 (rfbB) all have an α-helix at the C-terminus. This is potentially eliminated in the shorter forms of this CDS in frames 1 and 2, mediated by a poly-G tract.
The structures of proteins with similar conserved domains to HCAN_0641 and HCAN_0914 suggest that changes in the length of these CDSs would alter the C-terminal β-sheet. HCAN_0914 is flaG; therefore, this potential change in β-sheet structure may have an effect on the H. canadensis flagella.
Changes in the poly-G repeat within HCAN_0643 would not alter the length of this protein; however, they would alter its sequence. In this case, the structure of the similar protein Fmt identified through the CDD suggests that the altered residues would lie within a largely unstructured C-terminus.
For the remaining four CDSs, there were either no CDD hits (HCAN_0653) or none that were full-length (HCAN_0660, HCAN_0665, and HCAN_1332).
Distribution of phase variable genes
When the H. canadensis strain NCTC13241 genome sequence was investigated to identify novel CDSs not found in other Helicobacter genome sequences, only one example of a contiguous cluster longer than 5 genes was found (HCAN_0630 to HCAN_0663). This region is notable for possessing three copies of asnB, encoding asparagine synthetase (HCAN_0654, HCAN_0657, and HCAN_0662). A fourth copy of asnB (HCAN_0730) is 69 kb outside this cluster, adjacent to one of the two STT3 domain-containing PglB copies (HCAN_0729 and HCAN_0930). Most of the other coding sequences in the region are of unknown function, lacking significant matches in GenBank. Notably, a high frequency of potentially phase variation-mediating homopolymeric tracts was detected in and around this region, with eight out of 21 identified candidate phase variable genes (Table 1) and five out of eight C-terminal variable genes (Table 2) being within or near this cluster. These encode a putative oxidoreductase (HCAN_0632), a 3-oxoacyl-[acyl-carrier-protein] reductase (HCAN_0641), a methionyl-tRNA formyltransferase (HCAN_0643), two putative methyltransferases (HCAN_0647 and HCAN_0671), a 2-polyprenyl-3-methyl-5-hydroxy-6-methoxy-1,4-benzoquinol methylase (HCAN_0659), three conserved hypothetical proteins (HCAN_0648, HCAN_0651, and HCAN_0665), and four hypothetical proteins (HCAN_0653, HCAN_0655, HCAN_0660, and HCAN_0670). This cluster of repeat tracts within the CDSs may suggest that this region of the chromosome is a particular hot-spot for the presence of phase variable tracts. This may have functional consequences for the interaction of these gene products or their regulatory controls.
The genome of H. canadensis NCTC13241 contains a capsular polysaccharide export locus (HCAN_0144 to HCAN_0149, HCAN_0150) encoding orthologues of KpsS, KpsD, KpsE, KpsT, KpsM, and KpsC from C. jejuni. This is the first evidence of a polysaccharide capsule in the Helicobacter spp. and, as in C. jejuni, this was only revealed once the genome had been sequenced. The presence of an annotated sialyltransferase just outside of this locus (HCAN_0152, siaD) suggests capsule sialylation, which is known to be an important virulence factor in Neisseria meningitidis and group B Streptococcus. In C. jejuni, however, only LOS has been shown to be sialylated. Two strong, frame-shifted candidates for translational phase variation (HCAN_0151 and HCAN_0153) flank this sialyltransferase CDS, encoding a hypothetical protein and a polysaccharide deacetylase family protein, respectively.
H. canadensis strain NCTC13241 contains five candidate transcriptional phase variable CDSs, 16 candidate translational phase variable CDSs, and eight candidate C-terminal phase variable CDSs. In all cases the read data are indicative of repeat tract length variation in the bacterial population pool collected for DNA extraction and sequencing. A previous study of bacterial genome sequences has suggested that, due to their instability, repeat tracts are selected against in coding sequences. When they are present, there is a demonstrated bias toward homopolymeric tracts within the first one-fifth of the coding region, the location expected for translational phase variation. These data also show that there is a higher proportion of homopolymeric tracts in the final one-fifth of the coding region than in the internal three-fifths, which may support C-terminal variation in other bacterial species. Alterations of the tract length in the candidate C-terminal phase variable H. canadensis CDSs would result in differences in the C-terminus of the encoded proteins, which may impact the function, specificity, and/or antigenicity of the products. Similarly placed repeats have been identified in the Neisseria spp. (Snyder, previously unreported personal observation), which extensively utilize phase variation for transcriptional and translational expression switching. Indeed, the potential for 3' repeats to generate gene fusions, as seen here between HCAN_0660 and HCAN_0661, has been previously speculated. Such genes containing stochastic switches late in the coding region warrant further investigation in the laboratory. In addition, the genome sequence data from other species that employ phase variation should be re-investigated in light of this finding in H. canadensis.
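The positional bias described above divides each coding region into fifths. A minimal sketch of that binning follows, assuming a 0-based tract start coordinate measured from the first base of the CDS; the function name and the example coordinates are hypothetical and do not correspond to any annotated H. canadensis CDS.

```python
def tract_region(tract_start, cds_length):
    """Assign a tract to the first fifth, internal three-fifths, or final fifth of a CDS."""
    if not 0 <= tract_start < cds_length:
        raise ValueError("tract_start must lie within the coding sequence")
    fraction = tract_start / cds_length
    if fraction < 0.2:
        return "first fifth"
    if fraction >= 0.8:
        return "final fifth"
    return "internal three-fifths"


# Hypothetical 900 bp CDS with tracts starting at three different positions.
for start in (30, 450, 870):
    print(start, tract_region(start, 900))
```

Under this scheme, first-fifth tracts are the expected location for translational phase variation and final-fifth tracts are the candidates for C-terminal variation; promoter-associated tracts lie outside the CDS and are not covered by this binning.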
Using the DNA sequence search facility within the GenDB annotation software, simple sequence repeats greater than or equal to (G)7, (C)7, (A)9, and (T)9 were identified. In addition, dinucleotide repeats greater than or equal to four copies were found. Tandem repeats in the genome sequence data were identified using Tandem Repeats Finder. The context of identified repeats was investigated within GenDB and associated CDSs were annotated as described previously. The NCBI BLASTP search was used to access the Conserved Domain Database and associated links to Cn3D structural data files, which were visualized using Cn3D version 4.1 available from the NCBI.
Clemens J, Albert MJ, Rao M, Qadri F, Huda S, Kay B, van Loon FP, Sack D, Pradhan BA, Sack RB: Impact of infection by Helicobacter pylori on the risk and severity of endemic cholera. J Infect Dis. 1995, 171 (6): 1653-1656.
Cover TL, Blaser MJ: Helicobacter pylori infection, a paradigm for chronic mucosal inflammation, a pathogenesis and implications for eradication and prevention. Adv Intern Med. 1996, 41: 85-117.
Marshall BJ, Warren JR: Unidentified curved bacilli in the stomach of patients with gastritis and peptic ulceration. Lancet. 1984, 1 (8390): 1311-1315. 10.1016/S0140-6736(84)91816-6.
Tomb JF, White O, Kerlavage AR, Clayton RA, Sutton GG, Fleischmann RD, Ketchum KA, Klenk HP, Gill S, Dougherty BA: The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature. 1997, 388 (6642): 539-547. 10.1038/41483.
Alm RA, Ling LS, Moir DT, King BL, Brown ED, Doig PC, Smith DR, Noonan B, Guild BC, deJonge BL: Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature. 1999, 397 (6715): 176-180. 10.1038/16495.
Oh JD, Kling-Backhed H, Giannakis M, Xu J, Fulton RS, Fulton LA, Cordum HS, Wang C, Elliott G, Edwards J: The complete genome sequence of a chronic atrophic gastritis Helicobacter pylori strain: evolution during disease progression. Proc Natl Acad Sci USA. 2006, 103 (26): 9999-10004. 10.1073/pnas.0603784103.
Eppinger M, Baar C, Linz B, Raddatz G, Lanz C, Keller H, Morelli G, Gressmann H, Achtman M, Schuster SC: Who ate whom? Adaptive Helicobacter genomic changes that accompanied a host jump from early humans to large felines. PLoS Genet. 2006, 2 (7): e120-10.1371/journal.pgen.0020120.
Eaton KA, Dewhirst FE, Radin MJ, Fox JG, Paster BJ, Krakowka S, Morgan DR: Helicobacter acinonyx sp. nov., isolated from cheetahs with gastritis. Int J Syst Bacteriol. 1993, 43 (1): 99-106.
Munson L, Nesbit JW, Meltzer DG, Colly LP, Bolton L, Kriek NP: Diseases of captive cheetahs (Acinonyx jubatus jubatus) in South Africa: a 20-year retrospective survey. J Zoo Wildl Med. 1999, 30 (3): 342-347.
Boutin SR, Shen Z, Rogers AB, Feng Y, Ge Z, Xu S, Sterzenbach T, Josenhans C, Schauer DB, Suerbaum S: Different Helicobacter hepaticus strains with variable genomic content induce various degrees of hepatitis. Infect Immun. 2005, 73 (12): 8449-8452. 10.1128/IAI.73.12.8449-8452.2005.
Suerbaum S, Josenhans C, Sterzenbach T, Drescher B, Brandt P, Bell M, Droge M, Fartmann B, Fischer HP, Ge Z: The complete genome sequence of the carcinogenic bacterium Helicobacter hepaticus. Proc Natl Acad Sci USA. 2003, 100 (13): 7901-7906. 10.1073/pnas.1332093100.
Wolin MJ, Wolin EA, Jacobs NJ: Cytochrome-producing anaerobic Vibrio succinogenes, sp. n. J Bacteriol. 1961, 81: 911-917.
Vandamme P, Falsen E, Rossau R, Hoste B, Segers P, Tytgat R, De Ley J: Revision of Campylobacter, Helicobacter, and Wolinella taxonomy: emendation of generic descriptions and proposal of Arcobacter gen. nov. Int J Syst Bacteriol. 1991, 41 (1): 88-103.
Baar C, Eppinger M, Raddatz G, Simon J, Lanz C, Klimmek O, Nandakumar R, Gross R, Rosinus A, Keller H: Complete genome sequence and analysis of Wolinella succinogenes. Proc Natl Acad Sci USA. 2003, 100 (20): 11690-11695. 10.1073/pnas.1932838100.
Melito PL, Woodward DL, Bernard KA, Price L, Khakhria R, Johnson WM, Rodgers FG: Differentiation of clinical Helicobacter pullorum isolates from related Helicobacter and Campylobacter species. Helicobacter. 2000, 5 (3): 142-147. 10.1046/j.1523-5378.2000.00022.x.
Ceelen LM, Decostere A, Chiers K, Ducatelle R, Maes D, Haesebrouck F: Pathogenesis of Helicobacter pullorum infections in broilers. Int J Food Microbiol. 2007, 116 (2): 207-213. 10.1016/j.ijfoodmicro.2006.12.022.
Ceelen LM, Decostere A, Bulck Van den K, On SL, Baele M, Ducatelle R, Haesebrouck F: Helicobacter pullorum in chickens, Belgium. Emerg Infect Dis. 2006, 12 (2): 263-267.
Gibson JR, Ferrus MA, Woodward D, Xerry J, Owen RJ: Genetic diversity in Helicobacter pullorum from human and poultry sources identified by an amplified fragment length polymorphism technique and pulsed-field gel electrophoresis. J Appl Microbiol. 1999, 87 (4): 602-610. 10.1046/j.1365-2672.1999.00858.x.
Miller KA, Blackall LL, Miflin JK, Templeton JM, Blackall PJ: Detection of Helicobacter pullorum in meat chickens in Australia. Aust Vet J. 2006, 84 (3): 95-97.
Stanley J, Linton D, Burnens AP, Dewhirst FE, On SL, Porter A, Owen RJ, Costas M: Helicobacter pullorum sp. nov.-genotype and phenotype of a new species isolated from poultry and from human patients with gastroenteritis. Microbiology. 1994, 140 (Pt 12): 3441-3449. 10.1099/13500872-140-12-3441.
Zanoni RG, Rossi M, Giacomucci D, Sanguinetti V, Manfreda G: Occurrence and antibiotic susceptibility of Helicobacter pullorum from broiler chickens and commercial laying hens in Italy. Int J Food Microbiol. 2007, 116 (1): 168-173. 10.1016/j.ijfoodmicro.2006.12.007.
Ceelen L, Decostere A, Verschraegen G, Ducatelle R, Haesebrouck F: Prevalence of Helicobacter pullorum among patients with gastrointestinal disease and clinically healthy persons. J Clin Microbiol. 2005, 43 (6): 2984-2986. 10.1128/JCM.43.6.2984-2986.2005.
Ceelen LM, Haesebrouck F, Favoreel H, Ducatelle R, Decostere A: The cytolethal distending toxin among Helicobacter pullorum strains from human and poultry origin. Vet Microbiol. 2006, 113 (1-2): 45-53. 10.1016/j.vetmic.2005.10.020.
Fox JG: The expanding genus of Helicobacter: pathogenic and zoonotic potential. Semin Gastrointest Dis. 1997, 8 (3): 124-141.
Young VB, Chien CC, Knox KA, Taylor NS, Schauer DB, Fox JG: Cytolethal distending toxin in avian and human isolates of Helicobacter pullorum. J Infect Dis. 2000, 182 (2): 620-623. 10.1086/315705.
Fox JG, Dewhirst FE, Shen Z, Feng Y, Taylor NS, Paster BJ, Ericson RL, Lau CN, Correa P, Araya JC: Hepatic Helicobacter species identified in bile and gallbladder tissue from Chileans with chronic cholecystitis. Gastroenterology. 1998, 114 (4): 755-763. 10.1016/S0016-5085(98)70589-X.
Waldenstrom J, On SL, Ottvall R, Hasselquist D, Harrington CS, Olsen B: Avian reservoirs and zoonotic potential of the emerging human pathogen Helicobacter canadensis. Appl Environ Microbiol. 2003, 69 (12): 7523-7526. 10.1128/AEM.69.12.7523-7526.2003.
Goto K, Jiang W, Zheng Q, Oku Y, Kamiya H, Itoh T, Ito M: Epidemiology of Helicobacter infection in wild rodents in the Xinjiang-Uygur autonomous region of China. Curr Microbiol. 2004, 49 (3): 221-223. 10.1007/s00284-004-4287-6.
Inglis GD, McConville M, de Jong A: Atypical Helicobacter canadensis strains associated with swine. Appl Environ Microbiol. 2006, 72 (6): 4464-4471. 10.1128/AEM.02843-05.
Fox JG, Chien CC, Dewhirst FE, Paster BJ, Shen Z, Melito PL, Woodward DL, Rodgers FG: Helicobacter canadensis sp. nov. isolated from humans with diarrhea as an example of an emerging pathogen. J Clin Microbiol. 2000, 38 (7): 2546-2549.
Tee W, Montgomery J, Dyall-Smith M: Bacteremia caused by a Helicobacter pullorum-like organism. Clin Infect Dis. 2001, 33 (10): 1789-1791. 10.1086/323983.
Loman NJ, Snyder LA, Linton JD, Langdon R, Lawson AJ, Weinstock GM, Wren BW, Pallen MJ: Genome sequence of the emerging pathogen Helicobacter canadensis. J Bacteriol. 2009, 191 (17): 5566-5567. 10.1128/JB.00729-09.
Salaun L, Linz B, Suerbaum S, Saunders NJ: The diversity within an expanded and redefined repertoire of phase-variable genes in Helicobacter pylori. Microbiology. 2004, 150 (Pt 4): 817-830. 10.1099/mic.0.26993-0.
Saunders NJ, Peden JF, Hood DW, Moxon ER: Simple sequence repeats in the Helicobacter pylori genome. Mol Microbiol. 1998, 27 (6): 1091-1098. 10.1046/j.1365-2958.1998.00768.x.
Snyder LA, Butcher SA, Saunders NJ: Comparative whole-genome analyses reveal over 100 putative phase-variable genes in the pathogenic Neisseria spp. Microbiology. 2001, 147 (Pt 8): 2321-2332.
Rabaan AA, Gryllos I, Tomas JM, Shaw JG: Motility and the polar flagellum are required for Aeromonas caviae adherence to HEp-2 cells. Infect Immun. 2001, 69 (7): 4257-4267. 10.1128/IAI.69.7.4257-4267.2001.
Wisniewski-Dye F, Vial L: Phase and antigenic variation mediated by genome modifications. Antonie Van Leeuwenhoek. 2008, 94 (4): 493-515. 10.1007/s10482-008-9267-6.
Estabrook MM, Christopher NC, Griffiss JM, Baker CJ, Mandrell RE: Sialylation and human neutrophil killing of group C Neisseria meningitidis. J Infect Dis. 1992, 166 (5): 1079-1088.
Wessels MR, Rubens CE, Benedi VJ, Kasper DL: Definition of a bacterial virulence factor: sialylation of the group B streptococcal capsule. Proc Natl Acad Sci USA. 1989, 86 (22): 8983-8987. 10.1073/pnas.86.22.8983.
Linton D, Karlyshev AV, Hitchen PG, Morris HR, Dell A, Gregson NA, Wren BW: Multiple N-acetyl neuraminic acid synthetase (neuB) genes in Campylobacter jejuni: identification and characterization of the gene involved in sialylation of lipo-oligosaccharide. Mol Microbiol. 2000, 35 (5): 1120-1134. 10.1046/j.1365-2958.2000.01780.x.
van Passel MW, Ochman H: Selection on the genic location of disruptive elements. Trends Genet. 2007, 23 (12): 601-604. 10.1016/j.tig.2007.08.017.
Meyer F, Goesmann A, McHardy AC, Bartels D, Bekel T, Clausen J, Kalinowski J, Linke B, Rupp O, Giegerich R: GenDB--an open source genome annotation system for prokaryote genomes. Nucleic Acids Res. 2003, 31 (8): 2187-2195. 10.1093/nar/gkg312.
Benson G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999, 27 (2): 573-580. 10.1093/nar/27.2.573.
Nick Loman is supported by BBSRC grant BBE111791. Lori Snyder was previously supported by BBSRC grant BBE111791.
All authors contributed to the original genome sequencing project and have read and approved the final manuscript. LS and NL conducted the homopolymeric tract searches. LS analyzed the repeats in their context and wrote the manuscript.