Comparative genomics of Clostridium bolteae and Clostridium clostridioforme reveals species-specific genomic properties and numerous putative antibiotic resistance determinants

Background: Clostridium bolteae and Clostridium clostridioforme, previously included in the complex C. clostridioforme in the group Clostridium XIVa, remain difficult to distinguish by phenotypic methods. These bacteria, prevailing in the human intestinal microbiota, are opportunistic pathogens with various drug susceptibility patterns. In order to better characterize the two species and to obtain information on their antibiotic resistance genes, we analyzed the genomes of six strains of C. bolteae and six strains of C. clostridioforme, isolated from human infection.
Results: The genome length of C. bolteae varied from 6159 to 6398 kb, and 5719 to 6059 CDSs were detected. The genomes of C. clostridioforme were smaller, between 5467 and 5927 kb, and contained 5231 to 5916 CDSs. The two species display different metabolic pathways. The genomes of C. bolteae contained lactose operons involving a PTS system and complex regulation, which contribute to phenotypic differentiation from C. clostridioforme. The Acetyl-CoA pathway, similar to that of Faecalibacterium prausnitzii, a major butyrate producer in the human gut, was only found in C. clostridioforme. The two species have also developed diverse flagella mobility systems contributing to gut colonization. Their genomes harboured many CDSs involved in resistance to beta-lactams, glycopeptides, macrolides, chloramphenicol, lincosamides, rifampin, linezolid, bacitracin, aminoglycosides and tetracyclines. Overall, antimicrobial resistance genes were similar within a species, but strain-specific resistance genes were found. We discovered a new group of genes coding for rifampin resistance in C. bolteae. C. bolteae 90B3 was resistant to phenicols and linezolid through production of a 23S rRNA methyltransferase. C. clostridioforme 90A8 contained the VanB-type Tn1549 operon conferring vancomycin resistance. We also detected numerous genes encoding proteins related to efflux pump systems.
Conclusion: Genomic comparison of C. bolteae and C. clostridioforme revealed functional differences in butyrate pathways and in flagellar systems, which play a critical role within human microbiota. Most of the resistance genes detected in both species were previously characterized in other bacterial species. A few of them were related to antibiotics inactive against Clostridium spp. Some were part of mobile genetic elements, suggesting that these commensals of the human microbiota act as a reservoir of antimicrobial resistances.
Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3152-x) contains supplementary material, which is available to authorized users.


Background
Clostridium bolteae and Clostridium clostridioforme are members of the normal intestinal microbiota of humans, which can cause intra-abdominal infections when the natural intestinal barrier is altered. C. bolteae is present in stools of most children, but counts are significantly higher in autistic children than in controls [1]. Chronic diarrheal episodes associated with some forms of autism could be attributed to an overabundance of C. bolteae and release of end products of metabolism, such as butyrate, propionate and acetate, that alter the motility and contraction rate of the gastrointestinal tract [2].
The two opportunistic pathogens have been isolated in intra-abdominal infections, bacteremia and abscesses, and various drug susceptibility patterns have been reported [3]. Resistance to penicillin G is common, but only a few C. clostridioforme produce a beta-lactamase, as detected by the nitrocefin test [4]. Most of the strains are susceptible to ampicillin-sulbactam, piperacillin-tazobactam, imipenem, and metronidazole. Resistance to clindamycin and moxifloxacin was previously reported in the complex C. clostridioforme [5]. In general, C. bolteae appears more resistant than C. clostridioforme, with higher MICs of penicillin G, ampicillin-sulbactam, ticarcillin, piperacillin and piperacillin-tazobactam, and more strains producing beta-lactamases [6]. Resistance to aminopenicillins, lincosamides and quinolones has been reported but not documented at the genetic level.
Comparative analysis of 16S rRNA gene sequences of clostridial strains places C. clostridioforme in subcluster XIVa of clostridia, which includes several non-spore-forming cocci and mesophilic spore-forming rod-shaped bacteria [7]. C. bolteae was previously reported as a member of the C. clostridioforme complex including C. clostridioforme (formally), Clostridium aldenense, Clostridium citroniae and Clostridium hathewayi [5,6]. A divergence of 3 % in 16S rRNA separates C. bolteae from C. clostridioforme, but few phenotypic characters distinguish the two species; lactose fermentation is a key phenotypic test [8]. Little is known about their genetics.
Commensal anaerobes of the gastrointestinal tract have been proposed as a reservoir for various antibiotic resistance determinants [9]. However, the sequencing effort for Clostridium spp. mainly concerns C. difficile, a cause of post-antibiotic diarrhoea. Only two genomes of C. bolteae (strains BAA613 and WAL-14578) and three of C. clostridioforme (strains 2149FAA.1, WAL-7855 and CM201.1) are available in public databases.
The aim of this study was to analyse the genomes of six clinical isolates of C. bolteae and six clinical isolates of C. clostridioforme in order to identify genetic specificities between these closely related species. We focused our analysis on CDSs coding for functions that could have a physiological effect within the microbiota, and provide a catalogue of the antimicrobial resistance genes of the two opportunistic pathogens.

Results and discussion
General features of genomes reveal intra- and interspecies variations
A total of 1 to 21 contigs were generated from assembly of reads from Illumina (134 to 185-fold coverage) for the six strains of C. bolteae (Table 1). A total of 10 to 48 contigs were generated (82 to 264-fold coverage) for the six strains of C. clostridioforme. Total genome size varied between species and strains. The size of C. bolteae ranged from 6159 kb for strain 90A7 to 6480 kb for strain 90B3 with 5833 and 6059 DNA-coding sequences (CDSs), respectively, and four 16S rRNA genes. The genome size of C. clostridioforme was smaller, from 5467 kb for strain 90A3 to 5970 kb for strain 90A6 with 5231 and 5916 CDSs, respectively, and four 16S rRNA genes. The phylogenetic tree based on the 16S rRNA sequences showed that the C. bolteae and C. clostridioforme studied were closely related to C. hathewayi, C. aldenense, C. citroniae, C. saccharolyticum and C. symbiosum, members of the Clostridium cluster XIVa of Firmicutes, as previously reported [6,7] (data not shown).
The genomes of C. bolteae and C. clostridioforme are large, and genetic redundancy is prevalent (data not shown). The redundant genes were involved in a variety of metabolic pathways, including carbon metabolism, transport, iron metabolism and amino acid biosynthesis. The differences in the number of CDSs between genomes reflected variation in genetic redundancy more than gain or loss of particular functions. In addition, the genomes integrated mobile elements, i.e. transposons, insertion sequences, plasmids or phages (integrase, capsid protein, …), indicative of lateral gene transfers. Some of them carried antimicrobial resistance genes (see below).
To examine the pangenome of the two species, we compared the 97,210 CDSs obtained from the 12 newly sequenced genomes with those of five other genomes (C. bolteae BAA613, C. bolteae WAL-14578, C. clostridioforme CM201.1, C. clostridioforme 2149FAA.1, and C. clostridioforme WAL-7855). All CDSs were clustered using the BlastClust algorithm at high stringency, above a 90 % sequence identity cut-off and 90 % length overlap. A total of 10,530 clusters were found. Only 2294 (21.78 %) clusters were shared by the two species.
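The clustering step can be illustrated with a minimal sketch. It assumes that pairwise hits have already been computed (for example by an all-against-all protein comparison) and reduced to tuples of (CDS a, CDS b, percent identity, percent length overlap). The function names, the tuple layout and the toy identifiers are hypothetical, and the single-linkage union-find below only approximates the behaviour of BlastClust; it is not the pipeline used in the study.

```python
from collections import defaultdict


def cluster_cds(cds_ids, hits, min_identity=90.0, min_overlap=90.0):
    """Single-linkage clustering of CDSs from pairwise hits (union-find).

    Two CDSs end up in the same cluster when they are connected by a chain
    of hits that each meet both thresholds. The thresholds are treated as
    inclusive minimum values, which is an assumption.
    """
    parent = {c: c for c in cds_ids}

    def find(x):
        # Follow parent links to the cluster representative (path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, overlap in hits:
        if identity >= min_identity and overlap >= min_overlap:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = defaultdict(set)
    for c in cds_ids:
        clusters[find(c)].add(c)
    return list(clusters.values())


def shared_fraction(clusters, species_of):
    """Count clusters containing CDSs from more than one species."""
    shared = sum(1 for cl in clusters if len({species_of[c] for c in cl}) > 1)
    return shared, len(clusters), 100.0 * shared / len(clusters)


# Toy data (hypothetical identifiers, not taken from the study).
cds = ["bol_1", "bol_2", "clo_1", "clo_2"]
species_of = {"bol_1": "bolteae", "bol_2": "bolteae",
              "clo_1": "clostridioforme", "clo_2": "clostridioforme"}
hits = [
    ("bol_1", "clo_1", 95.0, 98.0),  # passes both thresholds
    ("bol_2", "clo_2", 80.0, 99.0),  # identity too low, stays unlinked
]

clusters = cluster_cds(cds, hits)
shared, total, percent = shared_fraction(clusters, species_of)
print(f"{shared} of {total} clusters shared ({percent:.2f} %)")
```

Single linkage is a permissive choice: one qualifying hit is enough to merge two clusters, so the number of clusters depends strongly on the thresholds.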
Using only the newly sequenced genomes, we estimated the species core genome and the strain-specific genes of the six C. bolteae and the six C. clostridioforme (Table 1). A total of 3714 genes formed the core genome of C. bolteae. The number of strain-specific genes in this species varied from 73 to 846. In C. clostridioforme, 3660 genes defined the core genome. A total of 2409 clusters were shared by the two species; 1305 genes were specific to C. bolteae and 1251 to C. clostridioforme. C. bolteae 90A7 and 90B8 had the largest number of unique genes for this species (735 and 846, respectively). C. clostridioforme 90A8, with 1006 (17 %) unique genes, had the largest number of strain-specific genes in this study. These strains integrated a high number of mobile elements. Some unique CDSs were annotated as transporters or regulators. Few of them were involved in defence mechanisms (such as antimicrobial resistance genes) or metabolic pathways. Most of them, often surrounded by CDSs from phages or transposons, were of unknown function (data not shown).
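The core and strain-specific counts follow from simple set operations on cluster membership. The sketch below assumes a mapping from cluster identifier to the set of strains in which that cluster occurs; the mapping, the function names and the toy clusters are hypothetical. The last two lines only re-derive the species-specific counts quoted above from the reported core sizes and the number of shared clusters.

```python
def core_and_unique(cluster_strains, strains):
    """Split clusters into the core genome and strain-specific genes.

    core   : clusters present in every strain of the set
    unique : clusters found in exactly one strain, grouped by strain
    """
    strains = set(strains)
    core = {cid for cid, members in cluster_strains.items()
            if strains <= members}
    unique = {s: set() for s in strains}
    for cid, members in cluster_strains.items():
        if len(members) == 1:
            (only_strain,) = members
            if only_strain in unique:
                unique[only_strain].add(cid)
    return core, unique


def species_specific(core_size, shared_with_other_species):
    """Core clusters of one species that are not shared with the other."""
    return core_size - shared_with_other_species


# Toy membership table (cluster names are made up).
cluster_strains = {
    "c1": {"90A5", "90A7", "90A9"},  # in all three strains: core
    "c2": {"90A5", "90A7"},          # accessory
    "c3": {"90A7"},                  # strain-specific
}
core, unique = core_and_unique(cluster_strains, ["90A5", "90A7", "90A9"])

# Consistency check on the figures reported in the text.
bolteae_specific = species_specific(3714, 2409)
clostridioforme_specific = species_specific(3660, 2409)
```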

Functional differences between species in the core genomes
The challenge of our study was to provide reliable information from draft genomes. Therefore, we focused our analysis on the core genomes. Classification of the CDSs according to the Clusters of Orthologous Groups (COGs) system provided an overview of the functions displayed by the two species. The core genomes of C. bolteae and C. clostridioforme were enriched (over 7 % of total COG matched counts) in COG categories K, E, G and R, corresponding to Transcription (309 and 290 CDSs), Amino acid transport and metabolism (335 and 276 CDSs), Carbohydrate transport and metabolism (431 and 425 CDSs) and General function prediction (366 and 331 CDSs) (Table 2). While C. bolteae and C. clostridioforme are phenotypically related, the pattern of functions obtained through COG annotation differed between the two species (Table 2). An additional 30 CDSs of the Nucleotide transport and metabolism (F) category, 97 CDSs of the Amino acid transport and metabolism (E) category and 79 CDSs of the Signal transduction mechanisms (T) category were specific for C. bolteae. Conversely, 40 CDSs of the Cell wall/membrane/envelope biogenesis (M) category, 50 CDSs of the Replication, recombination and repair (L) category and 18 CDSs of the Lipid transport and metabolism (I) category were specific for C. clostridioforme. Differences between metabolic pathways in C. clostridioforme and C. bolteae seem to be large enough to support delineation of the species.
Among carbohydrate pathways, C. bolteae and C. clostridioforme harboured different systems for the assimilation of lactose, which differ in their phosphorylation states, intermediate metabolites, and bioenergetics (Additional file 1: Table S1). Genes coding for a β-galactosidase, which hydrolyzes lactose to glucose and galactose, were found in both species. An alternative lactose catabolic pathway, the lactose/cellobiose-dependent phosphotransferase system (lac/cell-PTS), was found in almost all genomes of C. bolteae. The lac/cell-PTS operon, previously described in C. acetobutylicum [10], consists of genes for the 6-phospho-β-galactosidase, phosphoglycerate mutase, and lichenan operon transcriptional antiterminator, and of two copies of genes for the lactose/cellobiose family IIC, IIB and IIA components. In such a system, lactose is phosphorylated at the C-6 carbon and the internalized lactose 6-phosphate is degraded into galactose 6-phosphate and glucose by the 6-phospho-β-galactosidase. The exception was C. bolteae 90A7, which lacked the gene for the 6-phospho-β-galactosidase and the genes for the lactose/cellobiose family components. It is likely that this system, inducible by cellobiose or lactose and regulated by several repressors (described in other Gram-positive bacteria [11,12]), accounts for the lactose-negative phenotype in C. bolteae [6]. Using our annotation system, we detected a galactose operon repressor (GalR) among the lacI-family regulators in C. clostridioforme (all strains except 90A8), but not in C. bolteae. Laboratory experiments are needed to determine how the transcription factors from the two species mediate preferences in the utilization of certain carbohydrates over others.
Other distinctive features between the two species were CDSs of the Secondary metabolites biosynthesis, transport and catabolism category, which were only found in C. bolteae (Table 2).
Interestingly, the number of genes of the cell motility and secretion category (N) (34 and 18 CDSs) differed between the two species. Among them, we found CDSs encoding flagellar motility, which is recognized as an essential virulence factor for most motile pathogens. Overall, twenty-four genes (46 clusters + 3 orphans) represented the flagellar operon in the genomes of C. bolteae. Among them, genes for flagellin (fliC) and flagellar cap (fliD), one of the multiple cell-surface adhesins of the bacteria, revealed cluster specificity and microevolution. Genes coding for fliD were represented by one cluster and two additional genes in C. bolteae 90A7 and 90B8. FliC sequences from C. bolteae 90A9, 90B3 and 90B8 formed one cluster, those from 90A5 and 90B7 clustered in another group, and sequences from C. bolteae 90A7 remained orphans (unique genes) after clustering (Fig. 1). They were closely related to flagellin sequences of C. citroniae and C. hathewayi, other Clostridium spp. of the group XIVa that are occasionally isolated from human infections. In addition, C. bolteae 90A9 and 90B3 shared a second operon of only 19 genes in synteny, including a flagellin gene (flaA) closely related to those of C. clostridioforme (63 % identity). Based on conserved residues L87, Q88, R89 and Q96, critical for TLR5 signalling and flagellin polymerisation, these proteins were predicted to have pro-inflammatory properties [13]. In C. clostridioforme, twenty genes (57 other clusters) organised in a single operon encoded the flagellar apparatus. FlaA sequences from C. clostridioforme belonged to a phylogenetic group closely related to flagellin sequences from Eubacterium cellulosolvens isolated from the rumen. Overall, flagellin genes and the organization of flagella-related loci differed between species (Additional file 2: Table S2 and Additional file 3: Table S3), suggesting that motility, chemotaxis, and the occurrence of potential interactions with the colonic mucosa are species specific [14].

Species differences in pathways for butyrate synthesis
Comparison of whole genome sequences revealed that pathways for butyrate synthesis, which play a key role in colonic health in humans, were present in C. bolteae and C. clostridioforme.
The two species produce butyrate through different and complementary routes (Fig. 2, Additional file 4: Table S4). All C. clostridioforme, except 90A8, carried a locus coding for the Acetyl-CoA pathway (from acetyl-CoA to butyryl-CoA), including genes for the beta-hydroxybutyryl-CoA dehydrogenase (hbd), thiolase (thl), crotonase (cro), butyryl-CoA dehydrogenase (bcd) and two electron transfer proteins (ETF alpha, ETF beta) (Additional file 4: Table S4). Only C. bolteae 90A8 and C. clostridioforme 2149FAA.1 contained another putative bcd (74.9 % identity) in their genomes (data not shown). The locus composition and arrangement were similar to those in Faecalibacterium prausnitzii, a major butyrate producer of the human large intestine [15]. The Acetyl-CoA pathway was not found in C. bolteae. Both species shared genes for the two hydroxy-glutaryl-CoA dehydrogenases (HgCoAd) and the glutaconyl-CoA decarboxylase (Gcd) from the Glutarate pathway, which can lead to crotonyl-CoA and then to butyryl-CoA via bcd genes [16].
The final conversion from butyryl-CoA to butyrate can be performed by the butyrate kinase (buk) and the phosphotransbutyrylase (ptb) present in both species (Additional file 4: Table S4). The group of buk sequences from C. clostridioforme branched in the vicinity of the buk sequence from C. citroniae on the phylogenetic tree (Additional file 5: Figure S1). Buk sequences from C. bolteae formed distinct monophyletic groups, with other sequences scattered across the tree, suggesting polymorphism and/or functional variations of the enzyme in this species. Other genes for transferases from the lysine pathway (ato alpha and beta subunits; but-acetate CoA transferase), detected near the butyrate locus in six genomes of C. clostridioforme, may act as final enzymes. Genes from the 4-aminobutyrate pathway (4hbt) may provide another alternative for the terminal step in C. bolteae 90A5, 90B7, WAL14578, and BAA613 [16].
The butyryl-CoA:acetate CoA-transferase (but) of the acetyl-CoA pathway, which catalyses the final step in butyrate production predominant in Clostridium XIVa, was found neither in the common core genomes nor in the genomes of the C. bolteae studied [17]. In the human gut, previous studies on colonic isolates from healthy individuals have shown that the but pathway predominates [18]. Further studies are needed to assess the impact of butyrate production through the Glutarate pathway on the health of colonic cells, in particular in autism, where C. bolteae is overabundant [2].

Identification of antibiotic resistance determinants
Drug resistance genes that had not been recognized by automated annotation were identified by sequence homology searches against ARDB. Resistance genes were predicted with a threshold of 40 % identity (50 % positive substitutions) over 70 % of the sequence length, above the cut-off value usually recommended (see list in Table 3). The gene content and genetic organization of the antimicrobial resistance loci of the six C. bolteae and the six C. clostridioforme were then compared to previous data obtained from C. clostridioforme CM201.1 in our laboratory.
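The homology filter can be sketched as follows. This is a minimal illustration, assuming that the three figures are applied as inclusive minimum values and that the 70 % length criterion is measured against the reference (ARDB) sequence; both readings are assumptions, as are the record layout, the function name and the toy identifiers. It is not the annotation pipeline used in the study.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One query-versus-reference alignment (hypothetical record layout)."""
    query: str          # CDS identifier in the draft genome
    subject: str        # reference resistance gene
    identity: float     # percent identical residues
    positives: float    # percent positive substitutions
    aln_len: int        # aligned length, in residues
    subject_len: int    # full length of the reference protein


def passes(hit, min_identity=40.0, min_positives=50.0, min_coverage=70.0):
    """Return True when a hit meets all three thresholds."""
    coverage = 100.0 * hit.aln_len / hit.subject_len
    return (hit.identity >= min_identity
            and hit.positives >= min_positives
            and coverage >= min_coverage)


# Toy hits (identifiers and values are made up for illustration).
hits = [
    Hit("cds_0001", "ref_A", 48.0, 65.0, 200, 212),  # kept
    Hit("cds_0002", "ref_B", 35.0, 55.0, 300, 310),  # identity too low
    Hit("cds_0003", "ref_C", 60.0, 75.0, 100, 250),  # covers only 40 %
]

candidates = [h.query for h in hits if passes(h)]
print(candidates)
```

Hits that pass such a permissive filter are only candidates; as the next paragraph notes, they were checked against susceptibility testing.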
Because sequence-based predictions might potentially identify determinants that do not lead to antimicrobial resistance, susceptibility testing was performed to obtain information on the predicted response of bacteria to antibiotics. The strains included in this study showed resistance patterns, including ampicillin, macrolides, lincomycin and quinolones, now common in anaerobes (Table 4). Both genomic data (CDSs and annotations) and phenotypic susceptibility tests were considered to identify antibiotic resistance determinants (Tables 3 and 4). Preliminary assays for cloning certain genes were also performed in order to check their capacity to confer antibiotic resistance (see below).

Genes of resistance to antibiotics used for treatment of anaerobic infections
A total of 76 clusters and 21 strain-specific genes potentially involved in antimicrobial resistance were identified (Table 3). These included 42 to 50 CDSs in C. bolteae and 48 to 58 CDSs in C. clostridioforme. From 27 to 42 CDSs per genome were related to mechanisms of resistance to beta-lactams, glycopeptides, macrolides, lincosamides, and metronidazole.
Seven clusters involved in beta-lactam resistance are shared or part of the core genome of the two species (Fig. 3). Three types of beta-lactamases, including class A beta-lactamase, class C beta-lactamase and class D [Oxa type] beta-lactamase, and several metallo-enzymes were recognized in the twelve genomes. All the strains studied, selected for their resistance to ampicillin, shared the gene blaCLO1, previously found in C. clostridioforme CM201.1 (unpublished), but the structure of the integrative conjugative element (ICE) observed in CM201.1 was not found in the newly sequenced genomes. The gene blaCLO1 confers resistance to aminopenicillins and carboxypenicillins in E. coli, and its activity is inhibited by clavulanate and sulbactam. Nine amino acid changes were observed in beta-lactamases of C. bolteae 90A9, 90B3 and 90A8. This closely related beta-lactamase was flanked by insertion sequences (IS66) and a putative gene for class D beta-lactamase (COG 2602) also described in Clostridium sp. M62/1 from the human intestinal microflora (HMP project). Genes for class C beta-lactamases, previously found in the chromosomes of enteric bacteria (COG2680), were also present in C. bolteae and C. clostridioforme.
A high number of predicted genes (32 CDSs) were involved in resistance to glycopeptides (Additional file 6: Figure S2). C. clostridioforme 90A8 was the single strain with all genes required for glycopeptide resistance, in agreement with the phenotype (MIC > 256 mg/l). Vancomycin resistance in this strain was attributed to a VanB-type operon borne by a Tn1549-like element. However, a deletion of ten nucleotides within the relaxase gene of Tn1549 prevents 90A8 from transferring vancomycin resistance in vitro [19]. The other genomes of C. clostridioforme included part of the VanD-type vancomycin resistance operon, but the D-Ala-D-Lac ligase vanD gene was disrupted by a stop codon leading to a truncated protein (Additional file 6: Figure S2). In addition, vanH and vanY, which encode a D-lactate dehydrogenase and a DD-carboxypeptidase, respectively, were missing. Similarly, the genomes of C. bolteae harboured four CDSs homologous to vanRG, vanUG, vanG and vanYG, which formed an incomplete and nonfunctional operon due to the lack of a serine racemase gene. The high number of CDSs coding for glycopeptide resistance (including vanD or vanG, known to be chromosomal and not transferable) found in the genomes of C. bolteae and C. clostridioforme suggests that they are parts of entire ancestral operons, which have evolved in the absence of antibiotic selective pressure [20]. The presence of incomplete van operons is intriguing, but similar observations have been reported in other anaerobes living in microbiomes, such as Clostridium difficile 630 or Ruminococcus spp. [21,22].
Homologues of the adenylyltransferase gene lnuA, conferring resistance to the lincosamides, were present in the core genome of both species (Fig. 3). LnuA genes of C. clostridioforme and C. bolteae had 70 to 72 % identity with orthologs found in C. hathewayi and C. citroniae, respectively. C. clostridioforme 90B1 and 90A6 harboured an additional lnu gene (68 % identity with LnuA from 90A5), with no traces of mobile elements. Lincomycin resistance is common in C. bolteae and C. clostridioforme, often associated with resistance to clindamycin. LnuA proteins of the two species displayed 51 to 54 % identity with LnuA from Staphylococcus, suggesting common functionality, but the role of lnu genes is difficult to establish due to the presence of other putative mechanisms [23]. Similarly, all strains were resistant to erythromycin, whereas two genes homologous to the erythromycin ribosome methylase gene, ermB, were only predicted in the genomes of C. clostridioforme 90A4 and 90A8. Overexpression of multidrug efflux pumps and of various macrolide-specific and macrolide-lincosamide-streptogramin B-specific efflux systems, such as MacB, MefA, VgaA, MsrA/MsrB, and CcmA (Table 3, Additional file 7: Figure S3), found in the genomes studied, can lead to macrolide and lincosamide resistance [24]. In addition, two clusters of CDSs coding for xenobiotic acetyltransferases related to VatB (48 % identity) were found in the core genomes of each of the species (Fig. 3). VatB inactivates virginiamycin [25], but here, resistance was not detected by antimicrobial susceptibility tests (Table 4).
Fig. 3 Distribution of antibiotic resistance genes shared between and within the core of C. bolteae and C. clostridioforme. Genes overlapping at least 90 % length and 90 % of similarity were considered homologs. Resistance genes were predicted on a value up to 40 % identity (50 % of positive substitutions) on 70 % of length by homology sequence research on ARDB. For all C. clostridioforme and some C. bolteae 23S rRNA methyltransferase Cfr-like
A cluster of CDSs homologous to the metronidazole resistance (nim) genes was detected in the core genomes of both species, although metronidazole was very active on the species studied [26]. In Bacteroides fragilis, it has been demonstrated that increased expression of nim genes located downstream from IS elements leads to metronidazole resistance [27]. In the absence of IS elements directly upstream of the nim genes, whether and how these genes could confer metronidazole resistance to C. bolteae and C. clostridioforme remains to be established.

Unexpected observation of new genes of resistance
Regarding other drug resistances, genome data revealed genes related to chloramphenicol and rifampin resistance mechanisms (Table 3). In most genomes of each species, we found CDSs coding for a group A chloramphenicol acetyltransferase [cat], which can inactivate chloramphenicol. However, only C. bolteae 90B3 and 90B8 were resistant to chloramphenicol. The genome of strain 90B8 contained a second copy of the cat gene borne by a Tn4451-like transposon (96 and 90 % identity with Tn4451 and Tn4453, respectively). The 90B3 genome contained a CDS homologous to the gene cfr, coding for a 23S rRNA methyltransferase that is widespread in Gram-positive bacteria. As expected, strain 90B3 was also resistant to florfenicol, tiamulin, and linezolid (MIC = 16 mg/l). Other 23S rRNA methyltransferase (Cfr-like) CDSs were detected in an environment rich in transposable elements (Tn 6103-6110-CTn4 [fragments]) in the genomes of C. clostridioforme and C. bolteae 90A5 and 90B7, but the rRNA methylation did not appear to affect the susceptibility to chloramphenicol (Fig. 3).
The analysis of genomic data revealed CDSs homologous to rifampin-ADP-ribosyltransferase (arr) genes in C. bolteae but not in C. clostridioforme. No mobile elements or traces of mobile elements were found around the arr genes, suggesting they were indigenous to this species. These new Arr sequences branched in the vicinity of Arr proteins from C. saccharoperbutylacetonicum and some Cyanobacteria on the phylogenetic tree (Additional file 8: Figure S4). They were distinct from Arr-2 proteins of Enterobacteriaceae and from Arr proteins of Mycobacterium and Streptomyces spp. All strains, except 90A7, were susceptible to rifampin. In the absence of mutations in rpoB (known to be responsible for rifampin resistance), resistance of C. bolteae 90A7 (MIC: 32 mg/l) was likely due to positive selection of mutations in arrCbol90A7. Susceptibility to rifampin of other C. bolteae was likely due to the lack of promoters upstream from arr (as predicted by in silico analysis) or to nucleotide substitutions within arr leading to amino acid replacement and functional inactivation (data not shown).
Concerning the resistance of all strains to moxifloxacin and ciprofloxacin, all strains of C. bolteae showed several substitutions in the "quinolone-resistance-determining region" (QRDR) of gyrB. We did not find any substitutions in this region of gyrA, including those described in the protein of the quinolone-resistant epidemic strain C. difficile 027 [28]. Therefore, GyrB was likely the preferred target in acquisition of quinolone resistance in these two species. In addition, several CDSs coding for the AcrB inner membrane transporter were present in all the strains. These transporters are part of a resistance-nodulation-division [RND] multidrug efflux pump, known to increase efflux of quinolones in some Gram-negative bacteria [29]. Further studies are needed to determine their influence on the loss of susceptibility of Clostridium spp. to fluoroquinolones.
Overall, similar resistance profiles against antibiotics in C. bolteae and C. clostridioforme can result from various mechanisms.
Genes of resistance to antibiotics less active or inactive against Clostridium spp.
The genomes of C. bolteae and C. clostridioforme carried one or two copies of the undecaprenyl pyrophosphate phosphatase gene bacA, and 2 to 5 copies of the efflux pump gene bcrA, involved in bacitracin resistance, in agreement with their low susceptibility [26]. We also found one cluster of CDSs homologous to the dihydrofolate reductase gene dfrA20 (41 % identity / 97 % length) of Pasteurella multocida in the core genome of both species, which could explain the poor activity of trimethoprim on our Clostridium spp. [30]. In addition, C. clostridioforme 90A6 harboured a CDS identical to dfrA from Enterococcus faecium. The detection of this gene in an environment rich in mobile elements is consistent with a new example of horizontal transfer between Enterococcus spp. and Clostridiales.
Interestingly, the genomes of C. bolteae or C. clostridioforme contained various resistance genes against antibiotics naturally inactive on these species. Five CDSs were homologs of genes that phosphorylate, acetylate or adenylylate aminoglycosides. Four of these putative resistance genes were detected in an environment of mobile elements. Three genes, aadE, sat4 and aph(3')-III, conferring resistance to streptomycin, streptothricin and kanamycin, respectively, were found as part of a transposon delineated by two IS1182 copies in C. clostridioforme 90A3 (two copies), 90A6 and 90B1. The aph(3')-III detected was identical to the aph(3')-III carried by the multidrug resistance plasmid PF856 from E. faecium [31], and was also related to an internal domain (99 % identity) of an SCCmec element of Staphylococcus aureus HT20040085. Similarly, the aminoglycoside 6-adenylyltransferase gene ant(6')-Ia, conferring resistance to streptomycin, was shared by C. clostridioforme 90A1, 90A3 (2 copies), 90A4, 90A6 (2 copies) and C. bolteae 90A7. The adenylyltransferase gene aad(9')-b, which mediates resistance to streptomycin/spectinomycin, was found in C. bolteae 90B3. Two copies of the acetyltransferase gene aac(6')-Im were also present in the genome of C. clostridioforme 90B1. Homologs of AAC(6')-Im, which confers resistance to tobramycin and amikacin, were also found in E. coli (96 % identity), Coprococcus sp., C. difficile and Enterococcus faecium (data not shown). In addition, three CDSs coding for an aminoglycoside kinase (APH), known to be widely distributed in Gram-positive bacteria, were also observed among all the strains [32].
Numerous tetracycline resistance genes were also detected in C. bolteae and C. clostridioforme. They include both efflux genes such as tet40, and ribosome protection determinants such as tetO, tetW, and tet32, previously reported as circulating in gut microflora among distantly related bacteria [33].

Conclusion
We studied the genomes of C. bolteae and C. clostridioforme, two species of the complex Clostridioforme, which can behave either as members of the human microbiota or as opportunistic pathogens.
We compared the genomes of six clinical isolates of C. bolteae and 6 strains of C. clostridioforme with available genome sequences in laboratory and international databanks. These data were used as a basis to reveal differences in functional patterns between the two species. Among them, differences in flagella coding genes and butyrate pathways can potentially influence host-gut microbiota interactions.
The patterns of resistance genes in the genomes were also of particular interest: (i) these bacterial species harbour specific and indigenous putative resistance genes, which included ABC transporters, antibiotic-modifying enzymes and rRNA methyltransferases; (ii) other resistance genes were acquired, as shown by their location within mobile elements in the genomes studied. Some of them corresponded to genes mainly spread in gut bacteria. Others were new determinants, which remain to be analysed for their ability to confer antibiotic resistance. This study emphasizes the role of commensal bacteria of the digestive microbiota as a reservoir for antibiotic resistances.

Future directions
As the costs of whole-genome sequencing continue to decline, it is becoming increasingly available in routine diagnostic laboratories to detect antimicrobial resistance genes in genomes, as a substitute for traditional methods of resistance identification. However, the true challenge will remain to extract the relevant information from the large amount of data and to confirm the functionality of the genes detected.

Bacterial strains, molecular identification and antimicrobial susceptibility testing
We studied retrospectively 6 strains of C. bolteae and 6 strains of C. clostridioforme, resistant to beta-lactams. Strains were isolated from intra-abdominal infections from patients (without links) over 2 years in two hospitals, Paris Saint Joseph and CHRU of Nancy, France.
Genomic DNA for sequencing was prepared from cells by lysis in lysozyme, incubation in proteinase K/SDS, followed by a standard phenol/chloroform extraction procedure. Prior to library preparation DNA quality was assessed by Nanodrop analysis (Thermo Scientific).
Molecular identification of strains was performed by sequencing a 1483-bp PCR fragment from 16S rRNA using universal primers B27F (5'-AGAGTTTGATCCTGGCTCAG) and U1492R (5'-GGTTACCTTGTTACGACTT) [34]. The taxonomic assignment of sequences was checked on the RDP classifier of the Ribosomal Database Project v11.1 [35]. Closely related 16S rRNA sequences found in the database were aligned with the newly determined sequences and a phylogenetic tree was reconstructed according to the neighbor-joining method of the Phylip package [36]. The stability of the groupings was estimated by bootstrap analysis (100 replications). The identification of C. bolteae 90A7, previously misnamed C. clostridioforme 90A7, was corrected (see Results and discussion).
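The resampling behind a bootstrap analysis can be sketched briefly. The example below only illustrates how 100 pseudo-replicate alignments are drawn by sampling alignment columns with replacement; building a neighbor-joining tree from each replicate and counting how often each grouping recurs is not shown. The function name, the seed and the toy alignment are hypothetical, and this is not the Phylip implementation used in the study.

```python
import random


def bootstrap_alignments(alignment, n_replicates=100, seed=0):
    """Draw bootstrap pseudo-replicates of a multiple sequence alignment.

    alignment maps a sequence name to its aligned sequence; all sequences
    must have the same length. Each replicate samples column indices with
    replacement and applies the same columns to every sequence.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()

    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    replicates = []
    for _ in range(n_replicates):
        columns = [rng.randrange(length) for _ in range(length)]
        replicates.append({
            name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()
        })
    return replicates


# Toy alignment (made-up sequences, far shorter than a 16S rRNA fragment).
toy_alignment = {
    "strain_a": "ACGTACGTAC",
    "strain_b": "ACGTTCGTAC",
    "strain_c": "ACGAACGTTC",
}
replicates = bootstrap_alignments(toy_alignment, n_replicates=100)
```

A grouping that appears in most of the trees built from such replicates is considered stable.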

Whole genome sequencing
This sequencing project was part of the Human Microbiome U54 initiative of the Broad Institute (broadinstitute.org). For each genome, paired end libraries were generated using Illumina's Phusion-based library kits following the manufacturer's protocols (Illumina, Hayward, CA, USA). Samples were multiplexed and sequenced on Illumina GAIIx machines and base-called following the manufacturer's protocols. Individual samples of paired 90 nt reads generated 76 to 185-fold coverage of the genomes of C. bolteae (~6.37 Mb) and 82 to 264-fold coverage of the genomes of C. clostridioforme (~5.75 Mb), respectively (Table 1). Sequences were de novo assembled, yielding 1 to 21 contigs for genomes of C. bolteae and 10 to 48 contigs for those of C. clostridioforme.
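Fold coverage relates the number of reads, the read length and the genome size. The short sketch below uses the standard estimate (total sequenced bases divided by genome size); the function names and the read count in the first example are hypothetical, while the 90 nt read length, the ~6.37 Mb genome size and the 185-fold figure come from the text above.

```python
import math


def fold_coverage(n_reads, read_len, genome_size):
    """Average depth: total sequenced bases divided by genome size."""
    return n_reads * read_len / genome_size


def reads_needed(target_coverage, read_len, genome_size):
    """Smallest number of reads that reaches the target average depth."""
    return math.ceil(target_coverage * genome_size / read_len)


# One million 90 nt reads on a 6 Mb genome (illustrative numbers).
example_depth = fold_coverage(1_000_000, 90, 6_000_000)

# Reads of 90 nt required for 185-fold coverage of a ~6.37 Mb genome.
example_reads = reads_needed(185, 90, 6_370_000)
print(example_depth, example_reads)
```

This estimate ignores reads that fail quality filtering or do not assemble, so the depth actually achieved is usually somewhat lower than the nominal figure.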
Accession to our set of genomic sequences of Clostridium spp in public databases