In Gram-negative bacteria, the cytoplasm is surrounded by an inner membrane (IM) and an outer membrane (OM), which are separated by an inter-membrane space called the periplasm. Most of the newly synthesized proteome remains in the cytoplasm, but different machineries translocate non-cytoplasmic proteins to other subcellular localizations, including the inner or outer membrane, the periplasmic space, or the extracellular space. Some of these machineries recognize their substrate proteins by an N-terminal signal peptide (SP), while others are SP-independent. The IM, which is a phospholipid bilayer, is mostly occupied by transmembrane α-helical proteins, by inner membrane lipoproteins on its periplasmic side, and by other membrane-associated proteins on both sides of the membrane. In contrast, the asymmetric OM, which contains phospholipids only in its inner leaflet and lipopolysaccharides in its outer leaflet, is mostly occupied by transmembrane β-barrel proteins and by outer membrane lipoproteins on its periplasmic side.
The biogenesis of an outer membrane β-barrel protein (OMP) begins with the translocation of the newly synthesized, unfolded protein across the IM into the periplasm via the Sec translocation machinery, which requires a cleavable general SP. Once the unfolded OMP reaches the periplasm, it uses the SurA or the Skp/DegP pathway to reach the OM. SurA, Skp and DegP are periplasmic chaperones that bind unfolded OMPs, protect them from aggregation, and thus help them to reach the OM [2, 3]. It has been shown that the SurA pathway and the Skp/DegP pathway can work in parallel, but that the SurA pathway plays an important role under normal growth conditions, while under stress conditions the Skp/DegP pathway plays the major role.
Once periplasmic chaperones deliver the OMPs to the OM, the folding and insertion of the protein into the membrane is mediated by the β-barrel assembly machinery (BAM), without an external energy source such as ATP or ion gradients. This machinery involves an essential multi-domain protein, BamA (Omp85), which consists of a 16-stranded transmembrane β-barrel domain and a large periplasmic part made up of five POTRA (polypeptide transport-associated) domains. BamA is highly conserved in Gram-negative bacteria and also has homologues in mitochondria (Sam50) and chloroplasts (Toc75-V). In addition, the BAM complex, at least in E. coli, contains four lipoproteins, BamB, BamC, BamD and BamE, among which only BamD is essential and conserved in most Gram-negative bacteria. A recent HMM-based sequence analysis by Anwari et al. showed that BamB and BamE are mainly present in α-, β- and γ-proteobacteria, while BamC is present only in β- and γ-proteobacteria. They also found a new lipoprotein subunit of the BAM complex, named BamF, which is present exclusively in α-proteobacteria.
The BAM complex recognizes OMPs as its substrates by binding to an amphipathic C-terminal β-strand of the unfolded β-barrel, but the exact binding mode is still not clear. It was suggested that the C-terminal β-strand binds to BamD once the unfolded OMPs are delivered to the BAM complex by periplasmic chaperones. However, a recent crystal structure of a BamC-BamD subcomplex shows that the unstructured N-terminus of BamC binds to the proposed substrate binding site of BamD. The C-terminal β-strand of an OMP β-barrel domain typically contains an aromatic residue at its C-terminus. It has been reported that deletion or substitution of this C-terminal residue negatively affects the biogenesis of OMPs [10, 11]. Also, in vitro studies showed that the E. coli OM porin PhoE, when lacking its C-terminal Phe residue, fails to open the Omp85/BamA channel. In both studies, overexpression of the mutant OMP was lethal to the cells. At lower concentrations, the mutant protein was tolerated and was inserted into the membrane. This suggests that a weak insertion signal other than the C-terminal residue or β-strand is present.
Robert et al. observed that the N. meningitidis OM porin PorA or its C-terminal β-strand did not open the E. coli Omp85/BamA channel, and a comparison of the C-terminal β-strands from N. meningitidis and E. coli OMPs showed a strong preference for positively charged amino acids at the penultimate (+2) position in neisserial OMPs. When they mutated E. coli PhoE or its C-terminal β-strand, changing Gln to Lys at the +2 position, it no longer opened the channel; in contrast, a Neisseria PorA peptide with Gln instead of Lys increased the channel activity considerably. These results, together with the fact that high concentrations of neisserial OMPs were lethal in E. coli cells, led to the conclusion that the C-terminal insertion signal is species-specific and that the residue at the +2 position is important for this phenomenon. However, the number of peptides and proteins compared in that study was very low relative to the total number of OMPs encoded in the E. coli or N. meningitidis genomes; moreover, the comparison covered only two organisms, one β- and one γ-proteobacterial species. Since neisserial OMPs could be expressed in E. coli at low expression rates, either the neisserial C-terminal insertion signal is weakly recognized by the E. coli BAM complex, or other β-strands in the full-length protein might act as a weak insertion signal.
Thus, there seems to be at least some overlap in the peptide recognition. The intention of this study was to use computational methods to quantify this overlap, and to find out whether the observed (partial) species specificity of the insertion signal is exhibited by all Gram-negative bacterial organisms.
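The position-specific comparison described above (an aromatic residue at the C-terminus, a charged or uncharged residue at the +2 position) can be illustrated with a minimal Python sketch. It is not the method used in this study or in the cited work. The function names, the residue classes (His is assumed not to count as positively charged), and the two toy peptide sets are hypothetical and were invented purely for illustration; they are not real OMP sequences.

```python
# Illustrative residue classes (assumption: His is not counted as positive).
POSITIVE = frozenset("KR")
AROMATIC = frozenset("FWY")


def residue_from_c_terminus(peptide, position):
    """Return the residue at `position` counted from the C-terminus.

    Position 1 is the C-terminal residue and position 2 the penultimate
    one, matching the "+2" convention used in the text.
    """
    if not 1 <= position <= len(peptide):
        raise ValueError(f"position {position} is outside peptide {peptide!r}")
    return peptide[-position]


def fraction_in_class(peptides, position, residue_class):
    """Fraction of peptides whose residue at `position` is in `residue_class`."""
    if not peptides:
        raise ValueError("at least one peptide is required")
    hits = sum(
        residue_from_c_terminus(p, position) in residue_class for p in peptides
    )
    return hits / len(peptides)


# Hypothetical toy peptides, invented for illustration (not real OMP sequences).
toy_set_a = ["GVTAQYQF", "TVSLGYQF", "AVGLNYKF"]
toy_set_b = ["GVSAEYKF", "TLGVRYKF", "AVTLNYRW"]

for name, peptides in (("toy_set_a", toy_set_a), ("toy_set_b", toy_set_b)):
    aromatic_end = fraction_in_class(peptides, 1, AROMATIC)
    positive_plus2 = fraction_in_class(peptides, 2, POSITIVE)
    print(f"{name}: aromatic at +1 = {aromatic_end:.2f}, "
          f"positive at +2 = {positive_plus2:.2f}")
```

Applied to real C-terminal β-strands from many organisms, a tally of this kind would only give a first impression of how strongly the residue preferences at a single position differ between species; it says nothing about the remaining positions of the strand.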