Prediction of peptidoglycan hydrolases- a new class of antibacterial proteins
© The Author(s). 2016
Received: 6 April 2016
Accepted: 19 May 2016
Published: 27 May 2016
The efficacy of antibiotics against bacterial infections is decreasing due to the development of resistance in bacteria, and thus, there is a need to search for potential alternatives to antibiotics. In this scenario, peptidoglycan hydrolases can be used as alternate antibacterial agents due to their unique property of cleaving peptidoglycan cell wall present in both gram-positive and gram-negative bacteria. Along with a role in maintaining overall peptidoglycan turnover in a cell and in daughter cell separation, peptidoglycan hydrolases also play crucial role in bacterial pathophysiology requiring development of a computational tool for the identification and classification of novel peptidoglycan hydrolases from genomic and metagenomic data.
In this study, the known peptidoglycan hydrolases were divided into multiple classes based on their site of action and were used for the development of a computational tool ‘HyPe’ for identification and classification of novel peptidoglycan hydrolases from genomic and metagenomic data. Various classification models were developed using amino acid and dipeptide composition features by training and optimization of Random Forest and Support Vector Machines. Random Forest multiclass model was selected for the development of HyPe tool as it showed up to 71.12 % sensitivity, 99.98 % specificity, 99.55 % accuracy and 0.80 MCC in four different classes of peptidoglycan hydrolases. The tool was validated on 24 independent genomic datasets and showed up to 100 % sensitivity and 0.94 MCC. The ability of HyPe to identify novel peptidoglycan hydrolases was also demonstrated on 24 metagenomic datasets.
The present tool helps in the identification and classification of novel peptidoglycan hydrolases from complete genomic or metagenomic ORFs. To our knowledge, this is the only tool available for the prediction of peptidoglycan hydrolases from genomic and metagenomic data.
KeywordsPeptidoglycan hydrolase N-acetylglucosaminidase N-acetylmuramidases Lytic transglycosylases Endopeptidase N-acetylmuramoyl-L-alanine Carboxypeptidase Cell wall hydrolases Support Vector Machine Random Forest
The compounds which act against bacterial infection either by suppressing its growth or by killing the bacterium are mainly considered as antibacterial agents such as sulfonamide derivatives and tetracycline antibiotic . These antibiotics have been widely used as medicines for humans and animals for fighting against bacterial infection. However, in the last decades, these antibiotics have not shown consistent effectiveness against bacterial infections due the emergence of drug resistance in bacteria against these antibiotics . This problem poses a serious challenge towards the researchers to discover either newer drug molecules with lower bacterial drug resistance or to look for the alternatives of antibiotics . Recently, peptidoglycan hydrolases have been proposed as potential alternative for antibiotics due to their bacteriolytic activity with multifarious spectrum [4–6]. Among the various sites of action of the antibacterial agents, bacterial cell wall has been a widely used target which is also the target site for peptidoglycan hydrolases .
The bacterial cell wall is made up of glycan strands which are cross-linked by flexible peptide side chains, providing strength and rigidity to the bacterial cell wall . The peptidoglycan of both gram-positive and gram-negative bacteria comprises of repeating units of N-acetylglucosamine (NAG) and β-(1–4)-N-acetylmuramic acid (NAM) cross-linked by peptide stem chains attached to the NAM residues . First two peptides of tetra-peptide chain generally consist of L-alanine and D-glutamine or isoglutamine and the last residue is generally D-alanine. The third residue of stem peptide varies across bacteria and is lysine in coccoid gram-positive bacteria and meso-diaminopimelate (mDAP) in gram-negative bacteria and in many gram-positive rods such as Listeria and Bacillus species . The peptidoglycan layer is highly dynamic during cell growth and reshapes on division.
Several studies have been carried out on cell wall autolysins (peptidoglycan hydrolases) in various bacterial populations with roles pertaining to the peptidoglycan turnover along with the other functions in bacteria. Though potentially lethal, these autolysins are universally present among bacteria that have peptidoglycan. Lysostaphin is one of the most studied peptidoglycan hydrolase, excreted by Staphylococcus simulans cleaving the peptidoglycan chain of Staphylococcus aureus without affecting itself . Zoocin A which is produced by Streptococcus zooepidemicus 4881 is also a bacteriolytic cell wall hydrolase similar to lysostaphin . It has recently been demonstrated to be potentially effective in controlling and treating infection caused by Staphylococcus aureus group of bacteria. Millericin B which is another antimicrobial murein hydrolase produced by Streptococcus milleri NMSCC061 inhibits the growth of several bacterial species . A muraminidase Cpl-1 is also a phase lytic enzyme and was used for the first time to treat pneumococcal meningitis infection through intravenous administration . Antimicrobial properties of peptidoglycan hydrolase (such as lysozyme) have been known since several decades . The efficacy of lysozyme has been shown for skin treatment and in the infections of mucus membranes and is used as an ingredient in wound healing ointments [21, 22]. Bacteriophage endolysin PlyC in an aerosolized form was active against pathogenic Streptococcus equi and is considered as the first narrow-spectrum disinfectant against the bacterial strain . Endolysin PlyV12 and Enterolysin A are known peptidoglycan hydrolase having anti-enterococcal activity [24, 25]. Several other peptidoglycan hydrolases such as Acd, LytA, or PL-1 were identified and purified from various origins having antibacterial activity mainly against gram-positive bacteria [26–28]. Taken together, these studies underscore the potential of using peptidoglycan hydrolases as antibacterial agents in several applications including therapeutics.
Identification and classification of novel peptidoglycan hydrolases in the completely sequenced genomes becomes difficult due to the lack of homology of these hydrolases with the previously well characterized peptidoglycan hydrolases. Therefore, in the present work, a machine learning based approach using Random Forest (RF) has been used and demonstrated for the identification and classification of novel peptidoglycan hydrolases from genomic and metagenomic data. The predicted novel peptidoglycan hydrolases belonging to different classes would provide leads for further characterization and potential application as antibacterial agents specifically against various bacterial species.
Results and discussion
Selection of machine learning method
To select appropriate machine learning method and feature inputs for construction of final module providing the most accurate classification, the Amino Acid Composition (AAC) and Dipeptide Composition (DPC) were used as features calculated from randomly selected 10 % data from the positive and negative datasets. At the first level of binary classification, the idea was to optimize the parameters for construction of a module which can predict peptidoglycan hydrolases from input sequences. The selected (10 %) data was used to carry out ten-fold cross-validation in WEKA. It is apparent from Additional file 1 that LibSVM performed better than other machine learning techniques as it showed the highest accuracy of 93.15 % in the case of DPC features and 91.3 % in case of AAC features. RF also showed good results in case of both AAC (86.64 %) and DPC features (86.62 %).
At the second level, termed as ‘multiclass classification’, for a protein predicted as peptidoglycan hydrolase, the category of the protein is predicted using the information from the five categories (see Methods) based on their site of action. AAC features and DPC features of randomly selected 10 % data from the total dataset, where each sequence was tagged with its respective category, were used for evaluation using ten-fold cross-validation using WEKA. LibSVM performed better than the other machine learning methods and showed the highest accuracy of 90.01 % for DPC features and 87.68 % for AAC features. RF also showed good results with an accuracy of 83.45 % in case of DPC as feature input, and 82.63 % when AAC was used as feature input (Additional file 2).
These results suggest that both LibSVM and RF, using AAC and DPC features and on using 10 % of the total data, performed better than the other machine learning methods, and can be further evaluated and optimized on the complete dataset. The further optimization was carried out to select the best features and a machine learning method for construction of the final module.
Optimization of LibSVM and random forest modules
For binary classification
Comparative performance of LibSVM and RF using Amino Acid and Dipeptide as feature inputs for binary classification
For multiclass classification
Performance of LibSVM using Amino acid and Dipeptide composition as feature inputs for multiclass classification
Performance of Random Forest (RF) final models using Amino acid and Dipeptide composition as feature inputs for multiclass classification
From the optimization results, it is evident that at the first level of classification the performances of both LibSVM and RF modules were comparable; however, at the second level the RF module performed better than LibSVM module (Tables 1, 2 and 3). Therefore, for the multiclass classification RF module was considered as the final classifier, and for the binary classification further evaluation was carried out to select between the LibSVM and RF modules. Using these modules, three prediction approaches were constructed using the following strategy.
In the first approach, at the first level (for binary classification) the LibSVM module was implemented using DPC as the feature input and linear kernel at cost factor of 1. The query proteins were classified as ‘positive’ or ‘negative’ hits at this level and all the positive hits were further used as query protein at the second level (for multiclass classification). At the second level, the RF module constructed using DPC at mtry = 64 and ntree = 500 was used for the classification of positive hits (predicted peptidoglycan hydrolases) obtained from the first level.
The second approach was also implemented using the same methodology, however, at the first level the RF module constructed using DPC at mtry = 64, ntree = 500 has been used for binary classification in place of LibSVM module. The RF module at second level remained the same as mentioned in the first approach. The query proteins were analyzed using the same procedure as mentioned in the first tool.
The third approach had only a single level where the RF module constructed using DPC at mtry = 64 and ntree = 500 was used for the classification of query proteins into the five categories as mentioned in Additional file 3.
Performance evaluation of the three approaches
Two datasets were used to examine the performance of the three approaches. The first dataset was constructed using randomly selected 250 known peptidoglycan hydrolases from 24 bacterial genomes (from validation set) and was used as a query to evaluate the performance of the three approaches. The performance was evaluated in terms of the number of peptidoglycan hydrolases which could be predicted positively and the time required for the prediction by each tool. It is apparent that the third approach which used only the RF module for multiclass classification performed better (220 correct predictions out of 250 proteins in 2.90 s) as compared to the other two approaches (Additional file 4).
The complete ORFs predicted from the metagenomic dataset (MH0016) were used as the second dataset for the performance evaluation. BLAST was performed for these ORFs against the positive dataset and the ORFs which showed significant (E <1e-6 and identity ≥80 %) best hits with the positive dataset were considered as true peptidoglycan hydrolases. These selected 41 ORFs were further used to evaluate the results obtained from the three approaches. In this case also, the performance of the third approach was better (30 correct predictions out of 41 in 19 s) than the other two approaches (Additional file 5). It is evident from these results that the third approach which involved only one level of classification using RF module showed the best performance among all the three approaches and, hence was selected for developing the prediction tool termed as ‘HyPe’.
Validation of ‘HyPe’ on independent genomic datasets
The first independent dataset consisted of protein sequences from 24 new bacterial genomes and was used to evaluate the performance of HyPe. The annotated peptidoglycan hydrolases from each genome was used as the reference dataset to compare the predictions made by HyPe and BLAST. To evaluate the performance of BLAST, local alignment was carried out for the proteins present in each genome against the positive dataset. The proteins which showed significant (E <1e−6 and identity ≥80 %) similarity with the positive dataset were selected as positive hits. Similarly, the proteins from each of the 24 bacterial genomes were analyzed using HyPe to predict peptidoglycan hydrolases from each genome.
Performance of HyPe on independent genomic dataset
Alcanivorax pacificus RT type strain W11 5
Bacillus anthracis strain PAK 1
Bacillus cereus strain 03BB87
Bacillus subtilis T30
Bacillus thuringiensis strainHD571
Campylobacter jejuni subsp jejuni strain YH001
Chlamydia trachomatis strainL2b CS19 08
Clostridium botulinum strain NCTC8550
Cronobacter sakazakii SP291
Francisella tularensis subsp. novicida U112
Haemophilus influenzae strain Hi375
Halomonas sp strain KO116
Lactobacillus sp. wkB8
Listeria monocytogenes Serovar 4b Strain IZSAM Lm hs2008
Listeria monocytogenes strain NTSN
Mycobacterium tuberculosis H37RvSiena
Neisseria meningitidis LNP21362
Pasteurella multocida OH1905
Rickettsia rickettsii str Morgan
Staphylococcus aureus strain FCFHV36
Streptococcus iniae strain ISNO
Vibrio alginolyticus NBRC 15630
Vibrio tubiashii ATCC 19109
Weissella ceti strain WS74
Validation of ‘HyPe’ on real metagenomic datasets
The second independent dataset consisted of 24 metagenomic samples obtained from the Human Gut Microbial Gene Catalogue and processed using the methodology discussed in the methods section (Additional file 7). The complete proteins having length ≥100 amino acids were used as query for each metagenome. Using BLAST, the maximum (83) number of peptidoglycan hydrolases were predicted for metagenomic sample MH0045, whereas, using HyPe the maximum (288) number of peptidoglycan hydrolases were predicted for metagenomic sample MH0085, among all metagenomic samples. The maximum number of common predictions by BLAST and HyPe was 79 for MH0074. The complete results from the comparison of BLAST and HyPe on all 24 metagenomic samples are provided in Additional file 8.
Development of the HyPe pipeline
The web server for HyPe is developed using the standalone HyPe application which can be used for the identification of peptidoglycan hydrolases from complete genomic or metagenomic ORFs (Additional file 9). Query sequence will pass through the RF module to predict positive hits (peptidoglycan hydrolases) and also to categorize the resultant peptidoglycan hydrolases into their respective classes.
For the analysis of genomic and metagenomic proteins, separate options ‘Genomic’ and ‘Metagenomic’ has been provided at ‘Application’ page. The user should upload the protein sequence file in FASTA format using the ‘Genomic’ option. The ORFs (in FASTA format) predicted by any gene prediction software should be uploaded using the ‘Metagenomic’ option. The page with the ‘Job ID’ will be displayed to access the results after submission of a query. The standalone version of HyPe is developed for usage on the Linux OS based computer and requires the installation of free packages such as R and Random Forest (Details on Download page on website http://metagenomics.iiserb.ac.in/hype, http://metabiosys.iiserb.ac.in/hype, and in Additional file 10.
Prediction of peptidoglycan hydrolases from pathogenic bacteria using HyPe webserver
Forty-five pathogenic and 3 non-pathogenic (Bdellovibrio bacteriovorus, Bifidobacterium animali and Deinococcus radiodurans) genomes were analyzed using HyPe web server to identify the peptidoglycan hydrolases. The list of predicted peptidoglycan hydrolases from these most common pathogenic bacterial species is provided in Additional file 11. Some of the peptidoglycan hydrolases with well-established antibacterial activity could be identified using HyPe, such as Lysostaphin from Staphylococcus simulans with activity against Staphylococcus aureus , LasA from Pseudomonas aeruginosa  with lytic activity against various Staphylococcus species (i.e., S. saprophyticus, S. epidermidis and S. warneri) , PlyL from Bacillus anthracis which cleaves the cell wall of several Bacillus species when applied exogenously , N-acetylmuramoyl-L-alanine amidase from various organisms which has lytic effect with varying spectrum of activity [6, 32–34], and lytic activity of lysozymes which has been known since long time . Large number of peptidoglycan hydrolases could be predicted in various gram-positive and gram-negative bacterial species, most of them being hypothetical proteins (205 of 1203) (Additional file 11), enabling identification and characterization of novel cell wall hydrolases. With the wide spectrum of activity of these proteins it would be possible to customize the usage of peptidoglycan hydrolases according to the type of antibacterial spectral requirement. It is also plausible to believe that exploration of this class of antibacterial proteins from metagenomic data would lead to identification of novel cell wall hydrolases with desired antibacterial spectra and activity.
The novel strategies for antibacterial development are requisite to tackle the ongoing struggle between emergence of resistance and slow development of new antibiotics. Only a few peptidoglycan hydrolases have yet been identified from completely sequenced genomes as potential antibacterial agents. However, from different metagenomic datasets, more diversely active (with both narrow and broad range spectrum of activity as murein hydrolase) peptidoglycan hydrolases could be identified which is an unexplored area for this class of proteins. These novel peptidoglycan hydrolases have the potential to be used, as shown previously, in food industry for preservation, in agriculture for achieving resistance against phytopathogenic bacteria, and as antibacterial agents [36, 37]. The peptidoglycan hydrolases could be developed into a new class of antibacterial agents to counteract the problem of antibiotic resistance in pathogenic organisms [38–41]. Identification and characterization of novel peptidoglycan hydrolases will also provide insights into better understanding of pathophysiology of various pathogens. To the best of our knowledge, this is the only tool available to predict the peptidoglycan hydrolases from genomic and metagenomic data.
Construction of datasets
Construction of peptidoglycan hydrolase dataset and negative dataset
A total of 399,933 sequences were retrieved from NCBI protein database (website) using the following terms n-acetylmuramidases, n-acetylmuramoyl-l-alanine amidase, n-acetylglucosaminidase, and “murein and carboxypeptidase” “murein and endopeptidase” to obtain the peptidoglycan hydrolases from different bacterial origin. Sequences with ambiguous terms (hypothetical, like, similar, related, unknown, possible, probable, putative, partial, uncharacterized, predicted, inhibitor, regulator, enhancer, unnamed, precursor, fraction, and repressor) in annotations were removed and not considered for further analysis. A total of 281,313 sequences which remained after the removal of ambiguous terms were clustered at 95 % identity using CD-HIT .
The resulting dataset consisting of 62,572 representative sequences was labeled as the positive dataset. To construct the negative dataset, 547,085 protein sequences were retrieved from UniProtKB/Swiss-Prot database (http://www.uniprot.org/downloads, version 20 Nov 2014) . The sequences with annotations containing the search terms used to construct the positive dataset were removed. Clustering was performed for the remaining sequences in the negative dataset at 80 % identity using CD-HIT to avoid over-representation of similar sequences. The resultant dataset consisted of 195,220 sequences. From this dataset, in order to remove sequences which could be remotely related to the positive dataset, BLAST was performed at e-value < 10 using the sequences of the positive dataset as query sequence . All those sequences which showed a similarity with the sequences in the positive dataset were removed and the remaining sequences were considered as the negative dataset. For appropriate training, the sequences smaller than 100 amino acids were removed from the positive and negative datasets. The final positive and negative datasets contained 23,062 and 62,837 sequences, respectively.
Positive dataset were manually curated to classify the peptidoglycan hydrolases sequences into three main categories depending on the site of action at peptidoglycan. The three representative categories were N-acetylmuramoyl-L-alanine amidase (acting on bond between the tetra-peptide side chain and peptidoglycan chain) (NAMLAAmidases), murein peptidases (including endopeptidases and carboxypeptidases) (Peptidases) and the enzymes acting on peptidoglycan chain (N-acetylmuramidases, N-acetylglucosaminidase) (Peptidoglycan_Chain). Remaining sequences which could not be classified into any of the three categories were collectively considered as unclassified peptidoglycan hydrolases and constituted the fourth category (Additional file 3).
Independent genomic datasets
For the performance evaluation of the prediction method, an independent set was created using 24 new bacterial genomes which were released at EMBL-EBI (http://www.ebi.ac.uk/genomes/bacteria.html) during January to May 2015. The positive and negative datasets constructed in the earlier sections contained sequences from genomes which were released earlier to given time period. Thus, it eliminates any chances of biasness in the predictions by the tool since the sequences in the independent set were not included in the training set. Each genome from the independent set was analyzed using the prediction tool to identify peptidoglycan hydrolases. BLAST was performed for each genome against the positive dataset and the results of BLAST were compared with the results of the prediction tool. Further, the peptidoglycan hydrolases were manually identified from each genome using same the set of keywords (positive set) as used in the previous section to identify all known peptidoglycan hydrolases in that genome.
Twenty-four metagenomic samples were obtained from the Human Gut Microbial Gene Catalogue  (Additional file 7). The paired-end reads from each sample were assembled in to a single read of average length 131 bp by FLASH  and were further assembled into contigs using MEGAHIT . ORF predictions were carried out in contigs using MetaGeneMark  and only complete ORFs (with start and stop codon) having length ≥100 amino acids were used for the evaluation of prediction tool.
Amino acid and dipeptide composition
Machine learning techniques
Selection of machine learning method
In the preliminary analysis, functional domains were also searched in the peptidoglycan hydrolase protein sequences in order to apply HMM, however, domain based approach could not be implemented due to the reasons mentioned in Additional file 12.
The AAC and DPC as features and the different machine learning approaches were evaluated using WEKA, which provides several machine learning algorithms for classification, regression and clustering analysis . From the positive and negative dataset, only 10 % of the data was used for the evaluation at first level termed as binary classification. The first level is for the binary classification of a protein as peptidoglycan hydrolase or other function. After the first level where a protein is identified as peptidoglycan hydrolase, the second level is for the classification of this protein into the various categories based on the site of action of peptidoglycan hydrolases. For multi-class classification, the sequences in positive dataset which were classified into four categories were labeled as A, B, C and D, and the negative dataset were labeled as E (Additional file 3) and only 10 % of the data was used for the evaluation at the second level.
Support Vector Machine (SVM)
SVM was implemented via LibSVM package (http://www.csie.ntu.edu.tw/~cjlin/libsvm/) . It is a supervised machine learning algorithm which is used to distinguish the data based on observed patterns within the data. LibSVM provides the flexibility to optimize the number of parameters and kernels [52, 53]. The kernel which provided better results at the time of optimization was considered for the construction of SVM module for the development of prediction tool. To evaluate the unbiased performance of the LibSVM, five-fold cross-validation experiment was performed.
Random Forest (RF)
RF was implemented using the randomForest package provided in R (http://cran.r-project.org//) . RF is one of the implementation of an ensemble-learning method based on the construction of decision trees for classification and regression . Each tree in the forest works as independent model and the output depends upon the overall performance. This algorithm shows better performance as compared to other machine learning methods since several model works together in this approach and the final class is predicted via overall decisions given by individual models. The mode of classification is decided by bootstrapping of classification trees, by choosing a mtry value (which decides the number of variables to be used at each node to split) and trying to minimize the out of bag (OOB) error rate. The OOB error mainly depends upon the strength of the relationship between the trees and strength of each tree [54, 56]. The performance of RF was assessed using various parameters such as mtry and ntree. The parameters which showed the best performance were used for the final RF module construction for the development of prediction tool.
Comparison with BLAST
BLAST (version 2.2.26) was used in this study to compare the performance of the prediction tool on genomic and real metagenomic datasets.
Cross-validation and performance evaluation
AAC, Amino Acid Composition; DPC, dipeptide composition; FN, false negative; FP, false positive; GlcNAc, N-acetyl-D-glucosamine; mDAP, meso-diaminopimelate; MurNAc, N-Acetylmuramic acid; NAG, N-acetylglucosamine; NAM, β-(1–4)-N-acetylmuramic acid; OOB, out of bag; ORF, open reading frame; RF, Random Forest; SVM, Support Vector Machines; TN, true negative; TP, true positive.
We thank MHRD, Govt of India, funded Centre for Research on Environment and Sustainable Technologies (CREST) at IISER Bhopal for its support. However, the views expressed in this manuscript are that of the authors alone and no approval of the same, explicit or implicit, by MHRD should be assumed. We thank Mr. Vishnu Prasoodanan P. K. for help in preparing Fig. 1.
We thank the intramural funding received from IISER Bhopal for carrying out this work.
Availability of data and materials
Complete package of HyPe including the training set will be available at the website after the acceptance of the manuscript.
AKS participated in the design of the study, carried out the metagenomic and machine learning analysis and drafted the manuscript. SK conceived the work, participated in the design of the study, carried out the initial data curation and revised the manuscript. HK participated in initial optimization of SVM and RF. DBD carried out the curation of amidases from selected genomes. VKS conceived the work and participated in the design of the study, and drafted the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Anderson R, Groundwater P, Todd A, Worsley A. Antibacterial agents: chemistry, mode of action, mechanisms of resistance and clinical applications. John Wiley & Sons; 2012.Google Scholar
- Fischbach MA, Walsh CT. Antibiotics for emerging pathogens. Science. 2009;325(5944):1089–93.View ArticlePubMedPubMed CentralGoogle Scholar
- Kamysz W. Are antimicrobial peptides an alternative for conventional antibiotics. Nucl Med Rev Cent East Eur. 2005;8(1):78–86.PubMedGoogle Scholar
- Parisien A, Allain B, Zhang J, Mandeville R, Lan C. Novel alternatives to antibiotics: bacteriophages, bacterial cell wall hydrolases, and antimicrobial peptides. J Appl Microbiol. 2008;104(1):1–13.PubMedGoogle Scholar
- Borysowski J, Weber-Dąbrowska B, Górski A. Bacteriophage endolysins as a novel class of antibacterial agents. Exp Biol Med. 2006;231(4):366–77.Google Scholar
- Kumar A, Kumar S, Kumar D, Mishra A, Dewangan RP, Shrivastava P, Ramachandran S, Taneja B. The structure of Rv3717 reveals a novel amidase from Mycobacterium tuberculosis. Acta Crystallogr Sect D: Biol Crystallogr. 2013;69(12):2543–54.View ArticleGoogle Scholar
- Bush K. Antimicrobial agents targeting bacterial cell walls and cell membranes. Rev Sci Tech. 2012;31(1):43–56.PubMedGoogle Scholar
- Wang S, Shaevitz JW. The mechanics of shape in prokaryotes. Front Biosci (Schol Ed). 2013;5:564–74.View ArticleGoogle Scholar
- Meroueh SO, Bencze KZ, Hesek D, Lee M, Fisher JF, Stemmler TL, Mobashery S. Three-dimensional structure of the bacterial cell wall peptidoglycan. Proc Natl Acad Sci U S A. 2006;103(12):4404–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Le Bourhis L, Werts C. Role of Nods in bacterial infection. Microbes Infect. 2007;9(5):629–36.View ArticlePubMedGoogle Scholar
- Reith J, Mayer C. Peptidoglycan turnover and recycling in Gram-positive bacteria. Appl Microbiol Biotechnol. 2011;92(1):1–11.View ArticlePubMedGoogle Scholar
- Johnson JW, Fisher JF, Mobashery S. Bacterial cell‐wall recycling. Ann N Y Acad Sci. 2013;1277(1):54–75.View ArticlePubMedPubMed CentralGoogle Scholar
- Vollmer W, Joris B, Charlier P, Foster S. Bacterial peptidoglycan (murein) hydrolases. FEMS Microbiol Rev. 2008;32(2):259–86.View ArticlePubMedGoogle Scholar
- Ghuysen J-M, Tipper DJ, Strominger JL. Enzymes that degrade bacterial cell walls. Methods Enzymol. 1966;8:685–99.View ArticleGoogle Scholar
- Frirdich E, Gaynor EC. Peptidoglycan hydrolases, bacterial shape, and pathogenesis. Curr Opin Microbiol. 2013;16(6):767–78.View ArticlePubMedGoogle Scholar
- Baba T, Schneewind O. Target cell specificity of a bacteriocin molecule: a C-terminal signal directs lysostaphin to the cell wall of Staphylococcus aureus. EMBO J. 1996;15(18):4789.PubMedPubMed CentralGoogle Scholar
- Simmonds R, Pearson L, Kennedy R, Tagg J. Mode of action of a lysostaphin-like bacteriolytic agent produced by Streptococcus zooepidemicus 4881. Appl Environ Microbiol. 1996;62(12):4536–41.PubMedPubMed CentralGoogle Scholar
- Beukes M, Bierbaum G, Sahl H-G, Hastings J. Purification and partial characterization of a murein hydrolase, millericin B, produced by Streptococcus milleri NMSCC 061. Appl Environ Microbiol. 2000;66(1):23–8.View ArticlePubMedPubMed CentralGoogle Scholar
- Loeffler JM, Djurkovic S, Fischetti VA. Phage lytic enzyme Cpl-1 as a novel antimicrobial for pneumococcal bacteremia. Infect Immun. 2003;71(11):6199–204.View ArticlePubMedPubMed CentralGoogle Scholar
- Nakimbugwe D, Masschalck B, Deckers D, Callewaert L, Aertsen A, Michiels CW. Cell wall substrate specificity of six different lysozymes and lysozyme inhibitory activity of bacterial extracts. FEMS Microbiol Lett. 2006;259(1):41–6.View ArticlePubMedGoogle Scholar
- Murashova N, Golosova T, Gerasimova L, Gorbuntsova R, Ivanova N. [Lysozyme in the overall therapy of patients with burn trauma]. Antibiotiki. 1975;20(4):369–73.PubMedGoogle Scholar
- Tanaka H, Kitoh Y, Kitabayashi N, Matsumura Y, Okayachi H, Nakatsuji Y, Tanaka K, Kubota K, Namba K, Takemura K. Development of a new delayed healing model of an open skin wound and effects of M-1011G (ointment gauze containing 5 % lysozyme hydrochloride) on the model. Nihon yakurigaku zasshi Folia pharmacologica Japonica. 1994;104(2):121.View ArticlePubMedGoogle Scholar
- Hoopes JT, Stark CJ, Kim HA, Sussman DJ, Donovan DM, Nelson DC. Use of a bacteriophage lysin, PlyC, as an enzyme disinfectant against Streptococcus equi. Appl Environ Microbiol. 2009;75(5):1388–94.View ArticlePubMedPubMed CentralGoogle Scholar
- Yoong P, Schuch R, Nelson D, Fischetti VA. Identification of a broadly active phage lytic enzyme with lethal activity against antibiotic-resistant Enterococcus faecalis and Enterococcus faecium. J Bacteriol. 2004;186(14):4808–12.View ArticlePubMedPubMed CentralGoogle Scholar
- Nilsen T, Nes IF, Holo H. Enterolysin A, a cell wall-degrading bacteriocin from Enterococcus faecalis LMG 2333. Appl Environ Microbiol. 2003;69(5):2975–84.View ArticlePubMedPubMed CentralGoogle Scholar
- Dhalluin A, Bourgeois I, Pestel-Caron M, Camiade E, Raux G, Courtin P, Chapot-Chartier M-P, Pons J-L. Acd, a peptidoglycan hydrolase of Clostridium difficile with N-acetylglucosaminidase activity. Microbiology. 2005;151(7):2343–51.View ArticlePubMedGoogle Scholar
- Rodríguez-Cerrato V, García P, Huelves L, García E, del Prado G, Gracia M, Ponte C, López R, Soriano F. Pneumococcal LytA autolysin, a potent therapeutic agent in experimental peritonitis-sepsis caused by highly β-lactam-resistant Streptococcus pneumoniae. Antimicrob Agents Chemother. 2007;51(9):3371–3.View ArticlePubMedPubMed CentralGoogle Scholar
- Kashige N, Nakashima Y, Miake F, Watanabe K. Cloning, sequence analysis, and expression of Lactobacillus casei phage PL-1 lysis genes. Arch Virol. 2000;145(8):1521–34.View ArticlePubMedGoogle Scholar
- Spencer J, Murphy LM, Conners R, Sessions RB, Gamblin SJ. Crystal structure of the LasA virulence factor from Pseudomonas aeruginosa: substrate specificity and mechanism of M23 metallopeptidases. J Mol Biol. 2010;396(4):908–23.View ArticlePubMedGoogle Scholar
- Kessler E, Safrin M, Blumberg S, Ohman DE. A continuous spectrophotometric assay for Pseudomonas aeruginosa LasA protease (staphylolysin) using a two-stage enzymatic reaction. Anal Biochem. 2004;328(2):225–32.View ArticlePubMedGoogle Scholar
- Low LY, Yang C, Perego M, Osterman A, Liddington RC. Structure and lytic activity of a Bacillus anthracis prophage endolysin. J Biol Chem. 2005;280(42):35433–9.View ArticlePubMedGoogle Scholar
- Mellroth P, Steiner H. PGRP-SB1: an N-acetylmuramoyl L-alanine amidase with antibacterial activity. Biochem Biophys Res Commun. 2006;350(4):994–9.View ArticlePubMedGoogle Scholar
- García-Cano I, Campos-Gómez M, Contreras-Cruz M, Serrano-Maldonado CE, González-Canto A, Peña-Montes C, Rodríguez-Sanoja R, Sánchez S, Farrés A. Expression, purification, and characterization of a bifunctional 99-kDa peptidoglycan hydrolase from Pediococcus acidilactici ATCC 8042. Appl Microbiol Biotechnol. 2015;99(20):8563–8573.View ArticlePubMedGoogle Scholar
- Szweda P, Schielmann M, Kotlowski R, Gorczyca G, Zalewska M, Milewski S. Peptidoglycan hydrolases-potential weapons against Staphylococcus aureus. Appl Microbiol Biotechnol. 2012;96(5):1157–74.View ArticlePubMedPubMed CentralGoogle Scholar
- Hughey V, Johnson E. Antimicrobial activity of lysozyme against bacteria involved in food spoilage and food-borne disease. Appl Environ Microbiol. 1987;53(9):2165–70.PubMedPubMed CentralGoogle Scholar
- Callewaert L, Walmagh M, Michiels CW, Lavigne R. Food applications of bacterial cell wall hydrolases. Curr Opin Biotechnol. 2011;22(2):164–71.View ArticlePubMedGoogle Scholar
- Schmelcher M, Waldherr F, Loessner MJ. Listeria bacteriophage peptidoglycan hydrolases feature high thermoresistance and reveal increased activity after divalent metal cation substitution. Appl Microbiol Biotechnol. 2012;93(2):633–43.View ArticlePubMedGoogle Scholar
- Fenton M, McAuliffe O, O’Mahony J, Coffey A. Recombinant bacteriophage lysins as antibacterials. Bioeng Bugs. 2010;1(1):9–16.View ArticlePubMedPubMed CentralGoogle Scholar
- Fischetti VA. Bacteriophage lysins as effective antibacterials. Curr Opin Microbiol. 2008;11(5):393–400.View ArticlePubMedPubMed CentralGoogle Scholar
- Rodríguez-Rubio L, Martínez B, Rodríguez A, Donovan DM, García P. Enhanced staphylolytic activity of the Staphylococcus aureus bacteriophage vB_SauS-phiIPLA88 HydH5 virion-associated peptidoglycan hydrolase: fusions, deletions, and synergy with LysH5. Appl Environ Microbiol. 2012;78(7):2241–8.View ArticlePubMedPubMed CentralGoogle Scholar
- García‐Cano I, Velasco‐Pérez L, Rodríguez‐Sanoja R, Sánchez S, Mendoza‐Hernández G, Llorente‐Bousquets A, Farrés A. Detection, cellular localization and antibacterial activity of two lytic enzymes of Pediococcus acidilactici ATCC 8042. J Appl Microbiol. 2011;111(3):607–15.View ArticlePubMedGoogle Scholar
- Fu L, Niu B, Zhu Z, Wu S, Li W. CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics. 2012;28(23):3150–2.View ArticlePubMedPubMed CentralGoogle Scholar
- Wu CH, Apweiler R, Bairoch A, Natale DA, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R. The Universal Protein Resource (UniProt): an expanding universe of protein information. Nucleic Acids Res. 2006;34(suppl 1):D187–91.View ArticlePubMedPubMed CentralGoogle Scholar
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.View ArticlePubMedGoogle Scholar
- Qin J, Li R, Raes J, Arumugam M, Burgdorf KS, Manichanh C, Nielsen T, Pons N, Levenez F, Yamada T. A human gut microbial gene catalogue established by metagenomic sequencing. Nature. 2010;464(7285):59–65.View ArticlePubMedPubMed CentralGoogle Scholar
- Magoč T, Salzberg SL. FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics. 2011;27(21):2957–63.View ArticlePubMedPubMed CentralGoogle Scholar
- Li D, Liu C-M, Luo R, Sadakane K, Lam T-W. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics. 2015;31(10):1674–6.View ArticlePubMedGoogle Scholar
- Zhu W, Lomsadze A, Borodovsky M. Ab initio gene identification in metagenomic sequences. Nucleic Acids Res. 2010;38(12):e132.View ArticlePubMedPubMed CentralGoogle Scholar
- Sharma AK, Gupta A, Kumar S, Dhakan DB, Sharma VK. Woods: a fast and accurate functional annotator and classifier of genomic and metagenomic sequences. Genomics. 2015;106(1):1–6.View ArticlePubMedGoogle Scholar
- Hall M, Frank E, Holmes G, Pfahringer B, Reutemann P, Witten IH. The WEKA data mining software: an update. ACM SIGKDD Explorations Newsletter. 2009;11(1):10–8.View ArticleGoogle Scholar
- Chang C-C, Lin C-J. LIBSVM: a library for support vector machines. ACM TIST. 2011;2(3):27.Google Scholar
- Bottou L, Lin C-J. Support vector machine solvers. Large scale kernel machines. 2007;301-20.Google Scholar
- Gupta A, Kapil R, Dhakan DB, Sharma VK. MP3: a software tool for the prediction of pathogenic proteins in genomic and metagenomic data. PLoS One. 2014;9(4):e93907.View ArticlePubMedPubMed CentralGoogle Scholar
- Breiman L. Random forests. Mach Learn. 2001;45(1):5–32.View ArticleGoogle Scholar
- Chaudhary N, Sharma AK, Agarwal P, Gupta A, Sharma VK. 16S classifier: a tool for fast and accurate taxonomic classification of 16S rRNA hypervariable regions in metagenomic datasets. PLoS One. 2015;10(2):e0116106.View ArticlePubMedPubMed CentralGoogle Scholar
- Touw WG, Bayjanov JR, Overmars L, Backus L, Boekhorst J, Wels M, van Hijum SA. Data mining in the Life Sciences with Random Forest: a walk in the park or lost in the jungle? Brief Bioinform. 2013;14(3):315–26.View ArticlePubMedPubMed CentralGoogle Scholar