
Genome sequence of adherent-invasive Escherichia coli and comparative genomic analysis with other E. coli pathotypes



Adherent and invasive Escherichia coli (AIEC) are commonly found in ileal lesions of Crohn's Disease (CD) patients, where they adhere to intestinal epithelial cells and invade into and survive in epithelial cells and macrophages, thereby gaining access to a typically restricted host niche. Colonization leads to strong inflammatory responses in the gut suggesting that AIEC could play a role in CD immunopathology. Despite extensive investigation, the genetic determinants accounting for the AIEC phenotype remain poorly defined. To address this, we present the complete genome sequence of an AIEC, revealing the genetic blueprint for this disease-associated E. coli pathotype.


We sequenced the complete genome of E. coli NRG857c (O83:H1), a clinical isolate of AIEC from the ileum of a Crohn's Disease patient. Our sequence data confirmed a phylogenetic linkage between AIEC and extraintestinal pathogenic E. coli causing urinary tract infections and neonatal meningitis. The comparison of the NRG857c AIEC genome with other pathogenic and commensal E. coli allowed for the identification of unique genetic features of the AIEC pathotype, including 41 genomic islands, and unique genes that are found only in strains exhibiting the adherent and invasive phenotype.


Up to now, the virulence-like features associated with AIEC are detectable only phenotypically. AIEC genome sequence data will facilitate the identification of genetic determinants implicated in invasion and intracellular growth, as well as enable functional genomic studies of AIEC gene expression during health and disease.


Crohn's Disease (CD) is a chronic inflammatory bowel disease of the intestinal tract characterized by a strong activation of the intestinal immune system. A complex interaction of genetic, immunologic, and environmental factors contributes to the immunopathology of CD, but despite intensive investigation over the last half-century, a unifying etiology of inflammatory bowel diseases (IBD) has not been uncovered [1, 2]. Abundant clinical and experimental data implicate luminal bacteria or bacterial products in both the initiation and perpetuation of chronic intestinal inflammation [2-4]. Some pathological manifestations observed in CD, including ulcers of the mucosa, mural abscesses and macrophage recruitment and activation, also occur in well-recognized infectious diseases caused by Shigella, Salmonella and Yersinia, in which invasion into mucosal epithelial cells is an important virulence trait [3]. However, a growing body of evidence indicates that the balance between host defence responses and the commensal microbiota plays a key role in the pathogenesis of IBD [2]. Patients with CD display an increased number of coliforms in their feces, particularly during periods of active disease [5], and E. coli antigens are found in most intestinal resection specimens from these patients [6]. Furthermore, it has been shown that early and chronic ileal lesions of CD patients harbour high levels of E. coli that might participate in disease pathogenesis [7-11]. E. coli strains isolated from the ileal lesions of CD patients can exhibit adherent and invasive capabilities in both gastrointestinal epithelial cells and macrophages [10, 12], a phenotype that was the basis for a new pathogenic group called adherent and invasive E. coli (AIEC) [12, 13]. AIEC are enriched in ileal lesions in human CD [7] and are associated with expression of proinflammatory cytokines and inflammation in mice expressing human carcinoembryonic antigen-related cell adhesion molecule (CEACAM) receptors [14].
The predominance of AIEC in human CD patients, in conjunction with a growing body of biological and animal model data [15], has generated intense interest in the possible role of AIEC in the initiation or maintenance of chronic inflammation associated with CD.

We previously reported on a clinical AIEC isolate with serotype O83:H1 (strain NRG857c) that was isolated from the terminal ileum of a patient with CD [16]. NRG857c belongs to the same serogroup as the historical AIEC isolate LF82, first described over a decade ago [10], for which much of the experimental data on AIEC phenotypes have been documented. AIEC do not harbour common virulence factors found in various other pathogenic E. coli, and so the genetic basis for their invasive phenotype, proinflammatory nature and association with CD is not fully understood. Here, we report the complete genome sequence of AIEC NRG857c, which includes a 150-kb plasmid. We found that AIEC are closely related to a group of extraintestinal pathogenic E. coli (ExPEC) associated with urinary tract infections and neonatal meningitis, a finding that confirms and extends previous work [17]. The comparison of this genome with other ExPEC, enteropathogenic E. coli, AIEC LF82, and commensal E. coli facilitated the identification of 41 high-confidence genomic islands and 66 genes unique to E. coli displaying the adherent and invasive phenotype.

Results and Discussion

Genome sequencing and gap closure

AIEC strain NRG857c was shotgun sequenced to 40-fold coverage using pyrosequencing. Assembly of the raw sequence data generated 48 contiguous regions (contigs) greater than 2 kb with a total size of 4.84 Mb. Contigs were ordered by aligning the larger contigs to an optical restriction map using MapSolver and by BLASTX analysis of contig ends. The majority of gaps between contigs were identified because contig ends were syntenic with single-copy genes in previously sequenced E. coli genomes. PCR primers were designed to amplify across these gaps, followed by sequencing to generate "super-contigs" (see Additional File 1, Figure S1). Final gap closure was achieved after incorporation of sequence data for the seven ribosomal RNA operons. Plasmid contigs were identified by BLASTX analysis. Gap closure for the plasmid was done using BLASTN analysis of the terminal sequences, from which PCR primers were designed. Amplification and sequencing of these regions resulted in the assembly, but not closure, of a single plasmid contig.

General features of the NRG857c AIEC genome

The chromosome of NRG857c is 4,747,819 bp (50.68% G+C content), encoding 4,431 genes (Figure 1, Table 1). The plasmid is 147,060 bp (50.92% G+C content) and encodes 155 genes (Table 1). The sequences of both the NRG857c chromosome and plasmid have been deposited in GenBank [GenBank: CP001855, GenBank: CP001856].
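The G+C content reported here, and the G/C skew plotted in Figure 1, are simple base-composition statistics. The following is a minimal illustrative sketch, not the procedure used to produce the reported values; the function names are hypothetical, and the choice to exclude ambiguous bases from the denominator is an assumption.

```python
def gc_content(seq: str) -> float:
    """Return G+C content as a percentage of unambiguous A/C/G/T bases."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / total


def gc_skew(seq: str) -> float:
    """Return the G/C skew, (G - C) / (G + C), for one window of sequence."""
    seq = seq.upper()
    g = seq.count("G")
    c = seq.count("C")
    if g + c == 0:
        return 0.0  # assumption: a window without G or C has no skew
    return (g - c) / (g + c)
```

Applied in sliding windows along a chromosome, the same two functions would give the kind of G+C and skew tracks shown in a genome atlas.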

Figure 1

Comparative genome atlas of NRG857c. The chromosome of NRG857c (two outermost rings are CDS on forward and reverse strand) was compared with those of selected E. coli strains, starting from the outer layer LF82 (AIEC; pale green), APEC-O1 (APEC; blue), CFT073 (UPEC; yellow), MG1655 (K12/commensal; purple) and enterohemorrhagic E. coli O157:H7 Sakai (EHEC, red). Genomic islands were plotted on the NRG857c chromosome (grey blocks). The G+C content and G/C skew are also plotted as indicated.

Table 1 General features of NRG857c genome and other E. coli strains

Phylogenetic position of NRG857c

The phylogeny of AIEC NRG857c was resolved in two ways. First, a phylogenetic tree based on the optical map data was constructed using the unweighted pair group method with arithmetic mean (UPGMA) along with the in silico derived NcoI fragments for other sequenced E. coli strains (Figure 2A). The second method involved multi-locus sequence typing (MLST) with seven housekeeping genes as described previously [18] (Figure 2B; Additional File 2, Table S1), followed by comparison to sequences from other strains [19]. In both analyses NRG857c clustered with avian pathogenic E. coli (APEC-O1) and the uropathogenic E. coli isolates 536 and CFT073. Also in this group was LF82, another AIEC strain of the same serotype as NRG857c (O83:H1), whose genome sequence was retrieved from Genoscope (see note added in revision). LF82 shows high sequence similarity to our strain as analyzed by MapSolver (Additional File 3, Figure S2), by BLASTN analysis (Figure 1), and by phylogenetic analysis (Figure 2).

Figure 2

Phylogenetic analysis of NRG857c compared with representative strains of other enteric bacteria. (A) A phylogenetic tree based on the unweighted pair group method with arithmetic mean was constructed from the optical map data and in silico NcoI restriction digests of other enteric bacterial chromosomes. (B) MLST-based analysis of NRG857c with other enteric bacteria was performed as described in the Methods and sequence data were used to construct a phylogenetic tree. Numbers on the tree branches represent bootstrap support from 1000 bootstrap replicates with a minimum cut-off of 65%. Accession numbers for gene sequences can be found in Additional File 2, Table S1.

A general comparison of the total genome content of NRG857c with several other E. coli pathotypes is shown in Table 1. The majority of human ExPEC belong to phylogenetic group B2 and are categorized based on their clinical spectrum of disease, including urinary tract infections (UPEC) and neonatal meningitis (NMEC) [20-23]. AIEC strains cluster genetically with ExPEC and share some of their phenotypic traits, including the ability to colonize mucosal epithelial cells, invade eukaryotic host cells, and induce inflammatory responses in host animals [24, 25]. Although the prototype EPEC strain E2348/69 (serotype O127:H6) and other EPEC strains belong to the same phylogenetic group as the ExPEC strains [26], they are not generally considered to be invasive organisms. However, recent data suggest that at least two type III secreted proteins (EspT and EspF) can facilitate EPEC invasion into non-phagocytic cells and may define a new category of invasive EPEC [27, 28].

Genomic islands and unique sequences associated with AIEC

Genomic islands (GI) comprise a horizontally acquired flexible gene pool that is a major driver in evolution and niche specialization of pathogenic bacteria [29]. Recent computational methods that take advantage of genetic signatures indicative of horizontal gene transfer enable the high-confidence prediction of GIs in annotated bacterial genomes [30]. To identify putative genomic islands in NRG857c, we used IslandViewer, which uses three independent methods for island prediction: IslandPick, IslandPath-DIMOB and SIGI-HMM. Using the methods and established thresholds described previously [31], we identified 35 genomic islands (GI-1 to GI-35) on the NRG857c chromosome ranging from 4 to 25 kb, with G+C content differing significantly from the genome mean and with poor conservation among the other non-AIEC pathotypes shown in Figure 1 (see Additional File 4, Table S2 for full list of genomic islands and gene content analysis). We limited our comparative analysis here to the strains most related to NRG857c and to two well-described E. coli strains of commensal and pathogenic nature. The conservation of these 35 islands between NRG857c and LF82 was high, suggesting that they may encode traits unique to the adherent and invasive phenotype. Five of the genomic islands (GI-6, -7, -8, -10 and -16) code for defective prophages, three (GI-14, -22, -29) are fimbrial islands, and three (GI-20, -26 and -30) appear to be involved in lipopolysaccharide or capsular polysaccharide biosynthesis. GI-23 is noteworthy because it encodes an EmrKY-TolC multidrug resistance efflux pump and the sensor kinase, EvgA, involved in acid resistance and multidrug resistance in E. coli [32]. GI-15 and GI-19 appear to be metabolic islands involved in the transport and metabolism of various sugars. An additional six genomic islands were identified on the large plasmid (PI-1 to PI-6 in Figure 3) (see Additional File 4, Table S2 for full list of plasmid islands and gene content analysis).

Figure 3

Genomic islands in NRG857c. Genomic islands in the NRG857c chromosome (A) and plasmid (B) were predicted using stringent bioinformatics criteria as described in the Methods. Genomic islands are plotted to scale in blue and labelled clockwise on the genome maps. On the plasmid, genes involved in antimicrobial resistance are indicated in red.

To date, restriction profiles or other biased analyses such as pulse field gel electrophoresis (PFGE), MLST or typing for known virulence genes common to intestinal pathogenic E. coli have failed to uncover unique genetic determinants implicated in the AIEC phenotype [17]. To begin to identify single genetic determinants unique to AIEC, we carried out whole-genome comparisons between NRG857c, LF82, and 29 other non-AIEC genomes of E. coli. NRG857c and LF82 show considerable sequence similarity and synteny (Additional File 3, Figure S2) with 46 chromosomal genes unique to NRG857c and 10 chromosomal genes unique to LF82 (see Additional File 5, Table S3 for full list of genes unique to AIEC). The large plasmids from NRG857c and LF82 show almost no conservation between them (see below), suggesting that they have different ancestry.

Panseq, a Web-based tool designed to analyse the "pan-genome" of closely-related genome sequences, was used to identify genes common to AIEC strains NRG857c and LF82, but absent in other members of this phylogenetic cluster (i.e. APEC-O1, 536, and CFT073). We programmed Panseq to find unique sequences of at least 2-kb present in NRG857c and LF82 but absent in APEC-O1, 536 and CFT073. In this analysis, we found 21 sequences with a combined length of 155-kb that are unique to AIEC strains. Several of these sequences code for prophage elements including a 19.7-kb region encoding the morphogenesis and packaging modules of a P22-like prophage (NRG857_04720 - NRG857_04815). A second interesting region of 47.2-kb extends, with one interruption, from NRG857_09990 to NRG857_10240 and codes for several proteins involved in intermediary metabolism including transport of propanol/propanediol and galactitol. BLASTN analysis of this region revealed two sub-regions, one 20.3-kb and the other 4.4-kb, which are not found in the complete genome sequence of any other E. coli strain. The latter region shows 71% sequence coverage to a region from the complete genome of Citrobacter rodentium ICC168, while approximately half of the longer sequence is also found in an uncharacterized E. coli strain ATCC 8739. This 10.7-kb region has no nucleotide similarity with any other fully sequenced bacterium. BLASTX revealed similarity in this region to two hypothetical Vibrio coralliilyticus ATCC BAA-450 proteins [GenBank: ZP_05883689, GenBank: ZP_05883688] adjacent to orthologs in Burkholderia cenocepacia HI2424 [GenBank: YP_833853, GenBank: YP_833854], which are described as hypothetical proteins.

Plasmid analysis

The 150-kb plasmid in NRG857c is different from the plasmid found in LF82. Whereas plasmid pNRG857c shows significant regions of identity to plasmids in other seropathotypes of E. coli, the 110-kb plasmid of strain LF82 (pLF82) has very little similarity to pNRG857c or pAPEC-O1 (APEC-O1), pColBM (APEC-O103), pUTI89 (UPEC UTI89) and pO157 Sakai (EHEC O157:H7) (Figure 4). The extrachromosomal plasmid in NRG857c is an antimicrobial resistance plasmid with a suite of genes encoding resistance to aminoglycosides, β-lactams, chloramphenicol, mercury, quaternary ammonium salts, sulfonamides, tetracycline, and trimethoprim, several of which appear to be enclosed as transposon blocks. The plasmid may be capable of conjugal transfer as it encodes several tra genes, although we have not experimentally tested this. In addition, there are genes for colicins M and V production and immunity. The antibiotic resistance genes are clustered in three regions of the plasmid in PI-2, PI-3 and PI-4 (Figure 3B). The mercury resistance cassette is identical to IS5075 found in IncA/C2 plasmids pRYC103T24 [GenBank: GQ293500.1], pLEW517 [GenBank: DQ390455.1], NR1 [GenBank: DQ364638.1] and R100 [GenBank: AP000342.1]. The β-lactam-macrolide region is identical to sequences present in plasmid pTZ3721 [GenBank: AB020531.1] and pTZ3723 [GenBank: AB038654.1]. Also of interest to us were several genes involved in siderophore production and iron metabolism. Plasmid pNRG857c has the sitABCD operon that encodes proteins involved in the periplasmic and inner membrane transport of iron and manganese. Two outer membrane proteins (IutA and FepA) are also encoded by the plasmid and are involved in translocation of iron across the membrane. IutA (NRG857_30235) is the ferric-aerobactin receptor, while FepA (NRG857_30015) is an iron-enterobactin outer membrane transporter; both are involved in the tonB-dependent transport pathway for iron and also serve as outer membrane (OM) receptors for colicins [33].
IutA and FepA are encoded on plasmids pAPEC-O103-ColBM, pAPEC-O1-ColBM, pCVM29188_146 (from Salmonella enterica serovar Kentucky [34]), pVM01 (from the APEC strain E3 [35]), and pLVPK (from Klebsiella pneumoniae CG43 [36]). Interestingly, the chromosome contains a FepA paralog (NRG857_02640). The presence of several iron-acquisition genes suggests that Fur regulation of these plasmid-encoded genes occurs [37, 38]. As predicted, the consensus DNA sequence for Fur binding (WAATDRNWNYNAWTW) is found in the upstream regulatory region [39] of the iroBCDE, sitABCD and iucABCD-iutA operons, and of the shiF and fepA genes.
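The Fur consensus above is written in IUPAC nucleotide codes (W = A or T, D = A, G or T, R = A or G, Y = C or T, N = any base). As an illustration only, and not the search method used in this study, a degenerate motif of this kind can be expanded into a regular expression and scanned against an upstream sequence. The function names are hypothetical, and the sketch scans only the strand it is given.

```python
import re

# Standard IUPAC nucleotide codes expanded to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

FUR_BOX = "WAATDRNWNYNAWTW"  # consensus quoted in the text


def iupac_to_regex(motif: str) -> str:
    """Translate an IUPAC-coded motif into a regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())


def find_motif(seq: str, motif: str = FUR_BOX):
    """Return (start, matched_sequence) for every match on the given strand.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=(" + iupac_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]
```

To cover both strands, the reverse complement of the region would have to be scanned as well.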

Figure 4

Gene content analysis of plasmid pNRG857c and comparison to representative strains of other E. coli. BLASTN analysis was performed between each CDS in plasmid pNRG857c against each CDS in pLF82, pO157Sakai, pUTI89, and pAPEC-O1-ColBM. Genes in pNRG857c with orthologs in the other plasmids, defined as >85% identity over entire length of the gene, are connected with a coloured line.

Identification of other potential virulence determinants

The chromosome of AIEC strain NRG857c encodes a variety of potential virulence factors (Table 2). As mentioned above, the plasmid carries several potential virulence factors, including genes for iron acquisition. This would suggest that the plasmid contributes to the overall virulence of this bacterium; however, we have demonstrated previously that a plasmid-cured variant was still able to attach to and invade epithelial cells in vitro [16].

Table 2 Putative virulence factors in NRG857c genome

(i) Type VI secretion system

We identified genes for a complete type VI secretion system (T6SS) that are associated with virulence in other invasive organisms (Table 3) [40-42]. T6SS are phage-related secretion systems found in many Gram-negative pathogens and are thought to be involved in supporting an intracellular lifestyle, although their distribution is not restricted to pathogenic bacteria [43]. The T6SS in NRG857c is found in GI-2, a low GC region of the chromosome directly downstream from a tRNA, which is a common integration site for mobile genetic elements. This T6SS island encodes the conserved core elements of the secretion apparatus, including the valine-glycine repeat protein G (VgrG/NRG857_01165), the ClpV ATPase (NRG_01105) and the hemolysin coregulated protein (Hcp/NRG857_01155) that is 100% identical to Hcp in APEC-O1 and the UPEC strains UTI89 and 536. We also identified a second Hcp upstream of this conserved locus (NRG_01080) that is 100% identical to Hcp in E. coli S88 (O45:K1:H7), which causes neonatal meningitis [44], suggesting that this T6SS island is a mosaic with different ancestries. Other organisms, including Vibrio cholerae, have two hcp genes in different parts of the genome [45], which may impart different functionalities on the secretion apparatus. Whether the T6SS in AIEC facilitates intracellular survival and/or growth will require additional experimentation that we are currently pursuing.

Table 3 Type VI secretion system core proteins in NRG857c

(ii) Adhesins

NRG857c contains genes that are important for adhesion and invasion of AIEC LF82, including nlpI, htrA, yfgL, and dsbA [46-49]. The SPAAN program [50] as well as BLASTP with relaxed stringency was used to identify an extensive list of additional predicted adhesins (Table 4). The majority of the fimbrial operons in NRG857c are found in other E. coli strains, with the exception of the long polar fimbriae (Lpf; NRG857-17915-17923), which might be important for tissue tropism. A second Auf fimbrial system with a potential role as a colonization factor is encoded by genes NRG857_16960 through _17005. Other potential mediators of invasion include a hemagglutinin/invasin (NRG857_17920 to _17923) and an Ibe invasin (NRG857_21885 to _21890). In previous work, the invasion of brain endothelial cells was found to be mediated by the Ibe invasin, which was located on a genomic island called GimA [51]. The presence of GimA was almost exclusive to ExPEC strains of phylogroup B2, and we now show that ibe is also present in AIEC, suggesting it may be involved in the invasive properties of certain strains.

Table 4 Predicted invasion and adhesion factors in NRG857c

In mouse models of AIEC-induced colitis, inflammation requires type I pili expression by the bacterial cells, as no colitis is induced by ΔfimH mutant bacteria [14]. Colitis in this model requires the expression of human CEACAM receptors by transgenic mice, suggesting that the type I pili of AIEC can induce a proinflammatory response via CEACAM receptors in the gut mucosa. In support of this, FimH, the adhesin tip protein, is necessary but not sufficient for adhesion of AIEC strain LF82 to Intestine-407 cells [52]. Polymorphisms in the FimH sequence have been identified in E. coli isolated from IBD patients and healthy individuals. In particular, 7 amino acid variants are associated with E. coli from IBD tissue and 2 variants are associated with E. coli from healthy individuals [53]. Interestingly, FimH in NRG857c contains two disease-associated amino acid variants (N91S, S99N) and none of the SNPs associated with healthy tissue (A48V, A140V). Whether or not these variants are associated with different inflammatory responses or subtle differences in adherence in vivo will be important areas for future work.

(iii) Transcriptional regulators of virulence genes

NRG857c contains global transcriptional regulators including phoP-phoQ, envZ-ompR, slyA and the negative regulators hns, hha, and fis involved in genome architecture and transcriptional regulation [54]. Although these transcriptional factors are common to many bacterial species, in most Gram-negative pathogens they coordinate transcription of virulence genes including secretion system, toxins, adhesins and flagellar biosynthesis machinery [55, 56]. With this completed genome sequence, functional genomics approaches are now possible to understand the regulons of these transcription factors and their roles in intracellular survival and growth of AIEC. Indeed, Fis levels in the cell have already been associated with regulating the adhesive properties of AIEC strain LF82 [57].

(iv) Iron acquisition

Iron acquisition is an essential virulence trait in other ExPEC and these systems are expressed during urinary tract infections in vivo [58, 59]. Since NRG857c had an abundance of iron uptake systems, we designed experiments to test the role of iron acquisition during infection. We made an aerobactin transport mutant by deletion of iutA and tested whether this iron transport system was important for intracellular survival and the ability to colonize animals. We found that the iutA mutant was able to synthesize but not transport aerobactin (Additional File 6, Table S4). To investigate the invasive properties of ΔiutA, we conducted standard gentamicin protection assays in J774A.1 macrophage cells, which did not reveal a significant difference in the uptake at 2 h of the wild type and the iutA mutant (Figure 5A). However, by 4 h after infection and thereafter, the iutA mutant had a significant defect in intracellular survival and/or replication compared to wild type cells. To determine whether the transport of aerobactin was important for bacterial infection in vivo, streptomycin pre-treated mice were infected with wild type NRG857c and the isogenic iutA mutant as described previously for a Salmonella infection model [60]. Wild type NRG857c was recovered at ~50-fold greater abundance from the intestinal tissue compared to ΔiutA (Figure 5B).

Figure 5

Iron uptake by the aerobactin system is important for intracellular survival and for mouse colonization. (A) J774A.1 macrophage cells were infected with wild type NRG857c or iutA mutant cells. The survival of intracellular bacteria was determined at various times after infection. Data are the mean survival of intracellular bacteria with standard deviation (*, P < 0.05, Mann Whitney). (B) The aerobactin iron transport system improves colonization in vivo. Groups of mice were infected orally with wild type NRG857c or iutA mutants. Colonization of the small intestine by NRG857c AIEC was determined three days after infection by enumerating the number of cfu in tissue homogenates. Data are the means with standard errors (**, P < 0.005, Mann Whitney).


The two broad hypotheses accounting for the immunopathology of IBD, deregulation of the intestinal immune system and dysbiosis of the commensal microbiota [61], are likely not mutually exclusive. Both pathways could operate at the same time and in response to known genetic and environmental triggers. Regarding the genetic correlates of the AIEC phenotype, our genome sequence and comparative analyses provide many testable hypotheses to uncover the adhesive, invasive, and proinflammatory nature of AIEC. The fact that the 35 genomic islands in NRG857c are, in many cases, highly conserved in LF82 but weakly conserved or absent in other E. coli pathotypes and commensal organisms suggests that these genomic islands may have an influential role in the expression of the AIEC phenotype. It is also likely that evolved differences in gene expression, or regulatory evolution, have played a pivotal role in generating the phenotypic diversity involved in the pathogen-like behaviour of AIEC, as we have shown previously for another intracellular pathogen [62, 63]. Functional genomics studies enabled by this work will be forthcoming.


AIEC strain and genome sequencing

Escherichia coli AIEC strain NRG857c was isolated from a biopsy of a Crohn's disease patient at the Charite Hospital, Germany [16]. A mutant in aerobactin transport (designated RAA002) was created by disruption of the iutA gene using allelic exchange from a suicide plasmid as described previously [64]. For preparation of genomic DNA, wild type NRG857c cells were grown on solid Luria-Bertani (LB) agar at 37°C. Genomic DNA was extracted from 10 mg of bacteria scraped from a plate using the BioRobot EZ1 with the EZ1 DNA kit (Qiagen, Hilden, Germany). For plasmid purification, bacteria were grown in 4 L of LB broth and plasmid was isolated using a Maxi-prep kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. Total genomic DNA was sequenced using a Genome Sequencer FLX System (454 Life Sciences, Branford, CT, USA) at the McGill University and Genome Quebec Innovation Centre (Montreal, QC, Canada).

Phylotype grouping, optical mapping, and in silico similarity clustering

Phylogenetic determinations were performed by in silico MLST using seven housekeeping genes (aspC, clpX, fadD, icdA, lysP, mdh and uidA). Analysis was performed using the software package MEGA4 [65, 66] and the Neighbour-Joining method under the Tajima-Nei model. An optical map of NRG857c was generated using the restriction enzyme NcoI (OpGen Inc., Madison, WI) and used for contig ordering. Unweighted Pair-Group Method using Arithmetic averages (UPGMA) similarity clustering of the restriction fragments generated in the whole genome optical map of NRG857c with in silico maps of publicly available E. coli isolates was performed using MapSolver version 2.1.1 (OpGen Inc., Madison, WI).
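The in silico restriction maps compared against the optical map can be pictured as lists of fragment lengths obtained by cutting each published chromosome at every NcoI site. The sketch below is a simplified illustration under stated assumptions, not the MapSolver procedure: the function name is hypothetical, the recognition site is taken as CCATGG with the cut after the first base, and a site spanning the origin of a circular sequence is not detected.

```python
def ncoi_fragments(seq: str, site: str = "CCATGG", cut_offset: int = 1,
                   circular: bool = True) -> list:
    """Return restriction fragment lengths for a single-strand scan of seq.

    cut_offset is the number of bases of the site that lie 5' of the cut.
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    if not cuts:
        return [len(seq)]  # uncut molecule
    # Fragments lying between consecutive cut positions.
    fragments = [b - a for a, b in zip(cuts, cuts[1:])]
    if circular:
        # The last cut joins back to the first one across the origin.
        fragments.append(len(seq) - cuts[-1] + cuts[0])
    else:
        fragments = [cuts[0]] + fragments + [len(seq) - cuts[-1]]
    return fragments
```

For example, a 21-base circular sequence with two sites yields two fragments whose lengths sum to 21; treating the same sequence as linear yields three.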

Gap closure

Outward facing primers annealing to adjacent contigs were designed using Primer3Plus, synthesized by SigmaGenosys (Oakville, ON, Canada) and used to amplify DNA of NRG857c using the Expand Long Template PCR system (Roche, Mannheim, Germany). PCR products were analysed on agarose gels, purified with a Montage PCR purification kit (Millipore, Billerica, MA, USA) and sequenced using Sanger sequencing (University of Guelph, ON, Canada). Finished sequence was assembled using SeqManPro (DNASTAR Inc., Madison, WI). For ribosomal RNA (rRNA) operons, primers were designed using the syntenic flanking sequences of each rRNA operon in the E. coli strain CFT073 [67]. These seven rDNA amplicons were sequenced using the flanking primers and specifically designed 16S (rrs) and 23S (rrl) primers based on sequence alignment with CFT073 rDNAs.

Genome annotation and in silico identification of genes unique to AIEC strains NRG857c and LF82

The genome sequence was subjected to automated annotation using the NCBI Prokaryotic Genomes Automatic Annotation Pipeline with the resulting GenBank data incorporated into Kodon (Applied Maths Inc., Austin, TX) for manual curation. A protein database was constructed from 22 Escherichia coli genomes available in GenBank. All of the open reading frames of NRG857c predicted by Glimmer 3 [68] were searched against the protein database using BLASTX running locally [69]. The same comparison was performed using the LF82 nucleotide sequences. A script written with the BioPerl toolkit [70] was used to parse the BLAST output files for sequences that did not have any matches, or sequences with only weak matches using the criteria: (E-value ≥ 0.01), or (Percent Identity < 50%), or (<50% of the query length was used in the BLAST alignment). The predicted ORFs of NRG857c were compared against those of strain LF82 to identify those unique to each strain. Additional comparative genomics analyses were carried out using Panseq [71] and 29 publicly-available E. coli genome sequences (see Additional File 7, Table S5 for list of E. coli genomes and accession numbers used for comparative analyses). The functions of identified sequences were predicted using the annotation engine AutoFACT [72]. Circular genome atlases were generated using CGView [73, 74] or Circos [75].
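The filtering rule described above (no match, or only weak matches by E-value, percent identity, or query coverage) can be expressed compactly. The following is a hedged sketch in Python rather than the BioPerl script used in this study; the `Hit` record, its field names, and the function names are hypothetical, and parsing of the BLAST output itself is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One BLAST hit for a query ORF (hypothetical record layout)."""
    query_id: str
    evalue: float
    percent_identity: float
    query_start: int
    query_end: int
    query_length: int


def is_weak(hit: Hit) -> bool:
    """Apply the three weak-match criteria quoted in the text."""
    aligned = abs(hit.query_end - hit.query_start) + 1
    coverage = aligned / hit.query_length
    return (hit.evalue >= 0.01
            or hit.percent_identity < 50.0
            or coverage < 0.5)


def candidate_unique(query_ids, hits):
    """Return queries with no hits at all, or with only weak hits."""
    strong = {h.query_id for h in hits if not is_weak(h)}
    return [q for q in query_ids if q not in strong]
```

A query is retained as a candidate only if none of its hits passes all three thresholds, which matches the "no matches, or only weak matches" wording.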

Gentamicin protection assays

J774A.1 macrophage cells were seeded at 5 × 10⁵ cells/well in DMEM with L-glutamine and 10% FBS for 16 h prior to infection. Cells were infected at a multiplicity of infection of 10 with wild type NRG857c or the iutA mutant. Infected cells were incubated at 37°C for 2 h, then washed and treated for 2 h with 100 μg/ml gentamicin. At various times post-infection, cells were washed and lysed with 0.1% Triton X-100 in PBS, followed by serial plating on LB agar. Gentamicin protection experiments were performed in triplicate and reported as the percent survival with standard error with statistical significance determined by Student's t test.

Mouse infections

All animal experiments were performed in accordance with protocols approved by the local animal ethics committee at the University of Texas Medical Branch, Galveston, Texas. Female ICR mice of 20-25-g (Charles River Laboratories) were used after 72 h of quarantine as described previously [76]. Briefly, food-restricted animals received streptomycin (5 g/L in drinking water supplemented with 7% fructose) for 48 h prior to oral inoculation with NRG857c or the iutA mutant. Groups of mice (n = 6) were orally inoculated with a suspension of NRG857c bacteria in a final volume of 0.4 mL delivered by gavage (20-gauge needle). The animals were maintained for 72 h, after which the animals were killed and the small intestines removed for homogenization and enumeration of the bacterial load. Groups were compared using the Mann Whitney non-parametric test.

Siderophore utilization and iron uptake bioassays

The synthesis of siderophores by AIEC O83:H1 was analyzed by the colorimetric Arnow assay to detect catechol siderophores [77] and the ferric perchlorate assay for hydroxamates [78]. To restrict the iron availability in liquid or solid medium, the iron chelator 2,2'-dipyridyl was used. To examine the ability to use various siderophores or iron compounds as iron sources, overnight cultures of AIEC O83:H1 were diluted to 1 × 10⁵ bacteria per ml and seeded into L agar containing 2,2'-dipyridyl. Plates were spotted with 5 μl of 8 μM hemin or 5 μl of an overnight culture of a siderophore-producing strain. A sterile disk containing 20 μl of 10 mM FeSO4 was placed on each plate. Growth was monitored around the spots or disk after 18 to 24 hours at 37°C.


References

  1. Sartor RB: Current concepts of the etiology and pathogenesis of ulcerative colitis and Crohn's disease. Gastroenterol Clin North Am. 1995, 24: 475-507.

  2. Xavier RJ, Podolsky DK: Unravelling the pathogenesis of inflammatory bowel disease. Nature. 2007, 448: 427-434. 10.1038/nature06005.

  3. Campieri M, Gionchetti P: Bacteria as the cause of ulcerative colitis. Gut. 2001, 48: 132-135. 10.1136/gut.48.1.132.

  4. Strober W, Fuss IJ, Blumberg RS: The immunology of mucosal models of inflammation. Annu Rev Immunol. 2002, 20: 495-549. 10.1146/annurev.immunol.20.100301.064816.

  5. Giaffer M, Holdsworth C, Duerden B: Virulence properties of Escherichia coli strains isolated from patients with inflammatory bowel disease. Gut. 1992, 33: 646-650. 10.1136/gut.33.5.646.

  6. Liu SL, Sanderson KE: I-CeuI reveals conservation of the genome of independent strains of Salmonella typhimurium. J Bacteriol. 1995, 177: 3355-3357.

  7. Darfeuille-Michaud A, Boudeau J, Bulois P, Neut C, Glasser AL, Barnich N, Bringer MA, Swidsinski A, Beaugerie L, Colombel JF: High prevalence of adherent-invasive Escherichia coli associated with ileal mucosa in Crohn's disease. Gastroenterology. 2004, 127: 412-421. 10.1053/j.gastro.2004.04.061.

  8. Martin H, Campbell B, Hart C, Mpofu C, Nayar M: Enhanced Escherichia coli adherence and invasion in Crohn's disease and colon cancer. Gastroenterology. 2004, 127: 80-93. 10.1053/j.gastro.2004.03.054.

  9. Darfeuille-Michaud A: Adherent-invasive Escherichia coli: a putative new E. coli pathotype associated with Crohn's disease. Int J Med Microbiol. 2002, 292: 185-193. 10.1078/1438-4221-00201.

  10. Boudeau J, Glasser A, Masseret E, Joly B: Invasive ability of an Escherichia coli strain isolated from the ileal mucosa of a patient with Crohn's disease. Infect Immun. 1999, 67: 4499-4509.

  11. Schultsz C, van den Berg F, ten Kate F, Tytgat G: The intestinal mucus layer from patients with inflammatory bowel disease harbors high numbers of bacteria compared with controls. Gastroenterology. 1999, 117: 1089-1097. 10.1016/S0016-5085(99)70393-8.

  12. Darfeuille-Michaud A, Neut C, Barnich N, Lederman E, Di Martino P, Desreumaux P, Gambiez L, Joly B, Cortot A, Colombel JF: Presence of adherent Escherichia coli strains in ileal mucosa of patients with Crohn's disease. Gastroenterology. 1998, 115: 1405-1413. 10.1016/S0016-5085(98)70019-8.

  13. Rolhion N, Darfeuille-Michaud A: Adherent-invasive Escherichia coli in inflammatory bowel disease. Inflamm Bowel Dis. 2007, 13: 1277-1283. 10.1002/ibd.20176.

  14. Carvalho FA, Barnich N, Sivignon A, Darcha C, Chan CH, Stanners CP, Darfeuille-Michaud A: Crohn's disease adherent-invasive Escherichia coli colonize and induce strong gut inflammation in transgenic mice expressing human CEACAM. J Exp Med. 2009, 206: 2179-2189. 10.1084/jem.20090741.

  15. Barnich N, Darfeuille-Michaud A: Adherent-invasive Escherichia coli and Crohn's disease. Curr Opin Gastroenterol. 2007, 23: 16-20. 10.1097/MOG.0b013e3280105a38.

  16. Eaves-Pyles T, Allen CA, Taormina J, Swidsinski A, Tutt CB, Jezek GE, Islas-Islas M, Torres AG: Escherichia coli isolated from a Crohn's disease patient adheres, invades, and induces inflammatory responses in polarized intestinal epithelial cells. Int J Med Microbiol. 2008, 298: 397-409. 10.1016/j.ijmm.2007.05.011.

  17. Martinez-Medina M, Mora A, Blanco M, Lopez C, Alonso MP, Bonacorsi S, Nicolas-Chanoine MH, Darfeuille-Michaud A, Garcia-Gil J, Blanco J: Similarity and divergence among adherent-invasive Escherichia coli and extraintestinal pathogenic E. coli strains. J Clin Microbiol. 2009, 47: 3968-3979. 10.1128/JCM.01484-09.

  18. Konczy P, Ziebell K, Mascarenhas M, Choi A, Michaud C, Kropinski AM, Whittam TS, Wickham M, Finlay B, Karmali MA: Genomic O island 122, locus for enterocyte effacement, and the evolution of virulent verocytotoxin-producing Escherichia coli. J Bacteriol. 2008, 190: 5832-5840. 10.1128/JB.00480-08.

  19. Rasko DA, Rosovitz MJ, Myers GS, Mongodin EF, Fricke WF, Gajer P, Crabtree J, Sebaihia M, Thomson NR, Chaudhuri R, et al: The pangenome structure of Escherichia coli: comparative genomic analysis of E. coli commensal and pathogenic isolates. J Bacteriol. 2008, 190: 6881-6893. 10.1128/JB.00619-08.

  20. Gross WB: Diseases due to Escherichia coli in poultry. Escherichia coli in domestic animals and humans. Edited by: Gyles CL. 1994, Wallingford, UK: CAB International, 237-259.

  21. Kaper JB, Nataro JP, Mobley HL: Pathogenic Escherichia coli. Nature Reviews Microbiology. 2004, 2: 123-140. 10.1038/nrmicro818.

  22. Russo TA, Johnson JR: Proposal for a new inclusive designation for extraintestinal pathogenic isolates of Escherichia coli: ExPEC. J Infect Dis. 2000, 181: 1753-1754. 10.1086/315418.

  23. Croxen MA, Finlay BB: Molecular mechanisms of Escherichia coli pathogenicity. Nat Rev Microbiol. 2010, 8: 26-38.

  24. Antão EM, Wieler LH, Ewers C: Adhesive threads of extraintestinal pathogenic Escherichia coli. Gut Pathog. 2009, 1: 22. 10.1186/1757-4749-1-22.

  25. Kim KS: Pathogenesis of bacterial meningitis: From bacteraemia to neuronal injury. Nature Reviews Neuroscience. 2003, 4: 376-385. 10.1038/nrn1103.

  26. Bando SY, Andrade FB, Guth BEC, Elias WP, Moreira-Filho CA, Pestana de Castro AF: Atypical enteropathogenic Escherichia coli genomic background allows the acquisition of non-EPEC virulence factors. FEMS Microbiology Letters. 2009, 299: 22-30. 10.1111/j.1574-6968.2009.01735.x.

  27. Bulgin R, Arbeloa A, Goulding D, Dougan G, Crepin VF, Raymond B, Frankel G: The T3SS effector EspT defines a new category of invasive enteropathogenic E. coli (EPEC) which form intracellular actin pedestals. PLoS Pathog. 2009, 5 (12): e1000683. 10.1371/journal.ppat.1000683.

  28. Weflen AW, Alto NM, Viswanathan VK, Hecht G: E. coli secreted protein F promotes EPEC invasion of intestinal epithelial cells via an SNX9-dependent mechanism. Cell Microbiol. 2010, 12: 919-929. 10.1111/j.1462-5822.2010.01440.x.

  29. Ochman H, Lawrence JG, Groisman EA: Lateral gene transfer and the nature of bacterial innovation. Nature. 2000, 405: 299-304. 10.1038/35012500.

  30. Juhas M, van der Meer JR, Gaillard M, Harding RM, Hood DW, Crook DW: Genomic islands: tools of bacterial horizontal gene transfer and evolution. FEMS Microbiol Rev. 2009, 33: 376-393. 10.1111/j.1574-6976.2008.00136.x.

  31. Langille MGI, Brinkman FSL: IslandViewer: an integrated interface for computational identification and visualization of genomic islands. Bioinformatics. 2009, 25: 664-665. 10.1093/bioinformatics/btp030.

  32. Nishino K, Inazumi Y, Yamaguchi A: Global analysis of genes regulated by EvgA of the two-component regulatory system in Escherichia coli. J Bacteriol. 2003, 185: 2667-2672. 10.1128/JB.185.8.2667-2672.2003.

  33. Devanathan S, Postle K: Studies on colicin B translocation: FepA is gated by TonB. Mol Microbiol. 2007, 65: 441-453. 10.1111/j.1365-2958.2007.05808.x.

  34. Fricke WF, McDermott PF, Mammel MK, Zhao S, Johnson TJ, Rasko DA, Fedorka-Cray PJ, Pedroso A, Whichard JM, Leclerc JE, et al: Antimicrobial resistance-conferring plasmids with similarity to virulence plasmids from avian pathogenic Escherichia coli strains in Salmonella enterica serovar Kentucky isolates from poultry. Applied and Environmental Microbiology. 2009, 75: 5963-5971. 10.1128/AEM.00786-09.

  35. Tivendale KA, Noormohammadi AH, Allen JL, Browning GF: The conserved portion of the putative virulence region contributes to virulence of avian pathogenic Escherichia coli. Microbiology (Reading, Engl). 2009, 155: 450-460.

  36. Chen YT, Chang HY, Lai YC, Pan CC, Tsai SF, Peng HL: Sequencing and analysis of the large virulence plasmid pLVPK of Klebsiella pneumoniae CG43. Gene. 2004, 337: 189-198. 10.1016/j.gene.2004.05.008.

  37. Baichoo N, Helmann JD: Recognition of DNA by Fur: a reinterpretation of the Fur box consensus sequence. J Bacteriol. 2002, 184: 5826-5832. 10.1128/JB.184.21.5826-5832.2002.

  38. Chen Z, Lewis KA, Shultzaberger RK, Lyakhov IG, Zheng M, Doan B, Storz G, Schneider TD: Discovery of Fur binding site clusters in Escherichia coli by information theory models. Nucleic Acids Res. 2007, 35: 6762-6777. 10.1093/nar/gkm631.

  39. Mey AR, Wyckoff EE, Kanukurthy V, Fisher CR, Payne SM: Iron and fur regulation in Vibrio cholerae and the role of fur in virulence. Infect Immun. 2005, 73: 8167-8178. 10.1128/IAI.73.12.8167-8178.2005.

  40. Pukatzki S, Ma AT, Sturtevant D, Krastins B, Sarracino D, Nelson WC, Heidelberg JF, Mekalanos JJ: Identification of a conserved bacterial protein secretion system in Vibrio cholerae using the Dictyostelium host model system. Proc Natl Acad Sci USA. 2006, 103: 1528-1533. 10.1073/pnas.0510322103.

  41. Ma AT, McAuley S, Pukatzki S, Mekalanos JJ: Translocation of a Vibrio cholerae type VI secretion effector requires bacterial endocytosis by host cells. Cell Host Microbe. 2009, 5: 234-243. 10.1016/j.chom.2009.02.005.

  42. Bingle LE, Bailey CM, Pallen MJ: Type VI secretion: a beginner's guide. Curr Opin Microbiol. 2008, 11: 3-8. 10.1016/j.mib.2008.01.006.

  43. Shrivastava S, Mande SS: Identification and functional characterization of gene components of Type VI Secretion system in bacterial genomes. PLoS ONE. 2008, 3 (8): e2955. 10.1371/journal.pone.0002955.

  44. Plainvert C, Bidet P, Peigne C, Barbe V, Medigue C, Denamur E, Bingen E, Bonacorsi S: A new O-antigen gene cluster has a key role in the virulence of the Escherichia coli meningitis clone O45:K1:H7. J Bacteriol. 2007, 189: 8528-8536. 10.1128/JB.01013-07.

  45. Pukatzki S, McAuley SB, Miyata ST: The type VI secretion system: translocation of effectors and effector-domains. Curr Opin Microbiol. 2009, 12: 11-17. 10.1016/j.mib.2008.11.010.

  46. Bringer MA, Barnich N, Glasser AL, Bardot O, Darfeuille-Michaud A: HtrA stress protein is involved in intramacrophagic replication of adherent and invasive Escherichia coli strain LF82 isolated from a patient with Crohn's disease. Infect Immun. 2005, 73: 712-721. 10.1128/IAI.73.2.712-721.2005.

  47. Barnich N, Bringer MA, Claret L, Darfeuille-Michaud A: Involvement of lipoprotein NlpI in the virulence of adherent invasive Escherichia coli strain LF82 isolated from a patient with Crohn's disease. Infect Immun. 2004, 72: 2484-2493. 10.1128/IAI.72.5.2484-2493.2004.

  48. Bringer MA, Rolhion N, Glasser AL, Darfeuille-Michaud A: The oxidoreductase DsbA plays a key role in the ability of the Crohn's disease-associated adherent-invasive Escherichia coli strain LF82 to resist macrophage killing. J Bacteriol. 2007, 189: 4860-4871. 10.1128/JB.00233-07.

  49. Rolhion N, Barnich N, Claret L, Darfeuille-Michaud A: Strong decrease in invasive ability and outer membrane vesicle release in Crohn's disease-associated adherent-invasive Escherichia coli strain LF82 with the yfgL gene deleted. J Bacteriol. 2005, 187: 2286-2296. 10.1128/JB.187.7.2286-2296.2005.

  50. Sachdeva G, Kumar K, Jain P, Ramachandran S: SPAAN: a software program for prediction of adhesins and adhesin-like proteins using neural networks. Bioinformatics. 2005, 21: 483-491. 10.1093/bioinformatics/bti028.

  51. Homeier T, Semmler T, Wieler LH, Ewers C: The GimA locus of extraintestinal pathogenic E. coli: does reductive evolution correlate with habitat and pathotype?. PLoS ONE. 2010, 5 (5): e10877. 10.1371/journal.pone.0010877.

  52. Boudeau J, Barnich N, Darfeuille-Michaud A: Type 1 pili-mediated adherence of Escherichia coli strain LF82 isolated from Crohn's disease is involved in bacterial invasion of intestinal epithelial cells. Mol Microbiol. 2001, 39: 1272-1284. 10.1111/j.1365-2958.2001.02315.x.

  53. Sepehri S, Kotlowski R, Bernstein CN, Krause DO: Phylogenetic analysis of inflammatory bowel disease associated Escherichia coli and the fimH virulence determinant. Inflamm Bowel Dis. 2009, 15: 1737-1745. 10.1002/ibd.20966.

  54. Dorman CJ: Nucleoid-associated proteins and bacterial physiology. Adv Appl Microbiol. 2009, 67: 47-64.

  55. Fass E, Groisman EA: Control of Salmonella pathogenicity island-2 gene expression. Curr Opin Microbiol. 2009, 12: 199-204. 10.1016/j.mib.2009.01.004.

  56. Yoon H, McDermott JE, Porwollik S, McClelland M, Heffron F: Coordinated regulation of virulence during systemic infection of Salmonella enterica serovar Typhimurium. PLoS Pathog. 2009, 5 (2): e1000306. 10.1371/journal.ppat.1000306.

  57. Miquel S, Claret L, Bonnet R, Dorboz I, Barnich N, Darfeuille-Michaud A: Role of decreased levels of Fis histone-like protein in Crohn's disease-associated adherent invasive Escherichia coli LF82 bacteria interacting with intestinal epithelial cells. J Bacteriol. 2010, 192: 1832-1843. 10.1128/JB.01679-09.

  58. Miethke M, Marahiel MA: Siderophore-based iron acquisition and pathogen control. Microbiol Mol Biol Rev. 2007, 71: 413-451. 10.1128/MMBR.00012-07.

  59. Hagan EC, Mobley HL: Uropathogenic Escherichia coli outer membrane antigens expressed during urinary tract infection. Infect Immun. 2007, 75: 3941-3949. 10.1128/IAI.00337-07.

  60. Barthel M, Hapfelmeier S, Quintanilla-Martinez L, Kremer M, Rohde M, Hogardt M, Pfeffer K, Russmann H, Hardt WD: Pretreatment of mice with streptomycin provides a Salmonella enterica serovar Typhimurium colitis model that allows analysis of both pathogen and host. Infect Immun. 2003, 71: 2839-2858. 10.1128/IAI.71.5.2839-2858.2003.

  61. Strober W, Fuss I, Mannon P: The fundamental basis of inflammatory bowel disease. J Clin Invest. 2007, 117: 514-521. 10.1172/JCI30587.

  62. Tomljenovic-Berube AM, Mulder DT, Whiteside MD, Brinkman FS, Coombes BK: Identification of the regulatory logic controlling Salmonella pathoadaptation by the SsrA-SsrB two-component system. PLoS Genet. 2010, 6 (3): e1000875. 10.1371/journal.pgen.1000875.

  63. Osborne SE, Walthers D, Tomljenovic AM, Mulder DT, Silphaduang U, Duong N, Lowden MJ, Wickham ME, Waller RF, Kenney LJ, et al: Pathogenic adaptation of intracellular bacteria by rewiring a cis-regulatory input function. Proc Natl Acad Sci USA. 2009, 106: 3982-3987. 10.1073/pnas.0811669106.

  64. Torres AG, Redford P, Welch RA, Payne SM: TonB-dependent systems of uropathogenic Escherichia coli: aerobactin and heme transport and TonB are required for virulence in the mouse. Infect Immun. 2001, 69: 6179-6185. 10.1128/IAI.69.10.6179-6185.2001.

  65. Kumar S, Nei M, Dudley J, Tamura K: MEGA: a biologist-centric software for evolutionary analysis of DNA and protein sequences. Brief Bioinformatics. 2008, 9: 299-306. 10.1093/bib/bbn017.

  66. Tamura K, Dudley J, Nei M, Kumar S: MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol Biol Evol. 2007, 24: 1596-1599. 10.1093/molbev/msm092.

  67. Welch RA, Burland V, Plunkett G, Redford P, Roesch P, Rasko D, Buckles EL, Liou SR, Boutin A, Hackett J, et al: Extensive mosaic structure revealed by the complete genome sequence of uropathogenic Escherichia coli. Proc Natl Acad Sci USA. 2002, 99: 17020-17024. 10.1073/pnas.252529799.

  68. Delcher AL, Harmon D, Kasif S, White O, Salzberg SL: Improved microbial gene identification with GLIMMER. Nucleic Acids Research. 1999, 27: 4636-4641. 10.1093/nar/27.23.4636.

  69. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. Journal of Molecular Biology. 1990, 215: 403-410.

  70. Stajich JE, Block D, Boulez K, Brenner SE, Chervitz SA, Dagdigian C, Fuellen G, Gilbert JGR, Korf I, Lapp H, et al: The Bioperl toolkit: Perl modules for the life sciences. Genome Research. 2002, 12: 1611-1618. 10.1101/gr.361602.

  71. Laing C, Buchanan C, Taboada EN, Zhang Y, Kropinski A, Villegas A, Thomas JE, Gannon VP: Pan-genome sequence analysis using Panseq: an online tool for the rapid analysis of core and accessory genomic regions. BMC Bioinformatics. 2010, 11: 461. 10.1186/1471-2105-11-461.

  72. Koski LB, Gray MW, Lang BF, Burger G: AutoFACT: an automatic functional annotation and classification tool. BMC Bioinformatics. 2005, 6: 151. 10.1186/1471-2105-6-151.

  73. Stothard P, Wishart DS: Circular genome visualization and exploration using CGView. Bioinformatics. 2005, 21: 537-539. 10.1093/bioinformatics/bti054.

  74. Grant JR, Stothard P: The CGView Server: a comparative genomics tool for circular genomes. Nucleic Acids Res. 2008, 36: W181-184. 10.1093/nar/gkn179.

  75. Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA: Circos: an information aesthetic for comparative genomics. Genome Research. 2009, 19: 1639-1645. 10.1101/gr.092759.109.

  76. Moen ST, Blumentritt CA, Slater TM, Patel SD, Tutt CB, Estrella-Jimenez ME, Pawlik J, Sower L, Popov VL, Schein CH, et al: Testing the efficacy and toxicity of adenylyl cyclase inhibitors against enteric pathogens using in vitro and in vivo models of infection. Infect Immun. 2010, 78: 1740-1749. 10.1128/IAI.01114-09.

  77. Arnow LE: Colorimetric determination of the components of 3,4-dihydroxyphenylalanine-tyrosine mixtures. Journal of Biological Chemistry. 1937, 118: 531-537.

  78. Atkin CL, Neilands JB, Phaff HJ: Rhodotorulic acid from species of Leucosporidium, Rhodosporidium, Rhodotorula, Sporidiobolus, and Sporobolomyces, and a new alanine-containing ferrichrome from Cryptococcus melibiosum. J Bacteriol. 1970, 103: 722-733.

  79. Miquel S, Peyretaillade E, Claret L, de Vallee A, Dossat C, Vacherie B, Zineb el H, Segurens B, Barbe V, Sauvanet P: Complete genome sequence of Crohn's disease-associated adherent-invasive E. coli strain LF82. PLoS ONE. 2010, 5 (9): 10.1371/journal.pone.0012714.


We thank Dr. Paul Stothard at the Department of Agricultural, Food and Nutritional Science at the University of Alberta for help in generating Figure 1, and Dr. Alexander Swidsinski for providing the original NRG857c isolate. This work was supported by an Innovation in IBD grant to BKC from the Crohn's and Colitis Foundation of Canada and by an operating grant to BKC from the Canadian Institutes of Health Research (MOP-82704). BKC is a CIHR New Investigator (MSH-83721) and the recipient of the Early Researcher Award from the Ontario Ministry of Research and Innovation and the Young Investigator Award in the Biological Sciences from Boehringer Ingelheim (Canada) Ltd.

Note added in revision

While this paper was being revised, Miquel and colleagues reported the genome sequence of LF82, a prototype AIEC isolate [79].

Author information



Corresponding author

Correspondence to Brian K Coombes.

Additional information

Authors' contributions

All authors contributed to the writing of this manuscript as well as overall project design; AK developed the gap closure strategy and manually screened the NCBI pipeline data; MM, PK and KZ carried out the laboratory experiments for closing the chromosome and plasmid sequences; AV, JN and PK performed the bioinformatics analyses.

Electronic supplementary material

Additional File 1: Alignment of NcoI optical map of NRG857c with nine super-contigs generated from shotgun sequencing. The NcoI optical restriction map of NRG857c was aligned with the in silico-generated NcoI restriction maps of nine super-contigs arising from the shotgun sequencing and assembly of the genome. The vertical lines are alignment marks identifying similar restriction fragments between two aligned contigs. (PDF 319 KB)

Additional File 2: Accession numbers and gene coordinates used for in silico MLST analysis. (XLS 25 KB)

Additional File 3: Alignment of NcoI optical map of NRG857c with the in silico-generated map of LF82. The vertical lines are alignment marks identifying similar restriction fragments between two aligned contigs. The region highlighted in red is a region of DNA that is translocated in LF82. (PDF 595 KB)

Additional File 4: Table S2: Predicted Genomic Islands in NRG857c. (XLSX 59 KB)

Additional File 5: Genes unique to NRG857c and/or LF82. (XLSX 37 KB)

Additional File 6: Iron transport in AIEC NRG857c and aerobactin uptake mutant. (PDF 79 KB)

Additional File 7: List of E. coli genomes used for comparative genomics analyses. (XLSX 47 KB)

Rights and permissions

Open Access: This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Nash, J.H., Villegas, A., Kropinski, A.M. et al. Genome sequence of adherent-invasive Escherichia coli and comparative genomic analysis with other E. coli pathotypes. BMC Genomics 11, 667 (2010).


  • Genomic Island
  • Unweighted Pair Group Method With Arithmetic
  • Iron Acquisition
  • Intracellular Survival
  • Neonatal Meningitis