Molecular characterization of carbapenem-resistant Klebsiella pneumoniae isolates with focus on antimicrobial resistance

Background The increasing incidence of infections caused by carbapenem-resistant Klebsiella pneumoniae (CRKP) in Mengchao Hepatobiliary Hospital of Fujian Medical University in 2017 motivated this investigation of the phenotypes and resistance-associated genes of the emerging CRKP strains. In the current study, seven inpatients who completed treatment in the hospital are enrolled. The whole genomes of the CRKP strains are sequenced using MiSeq short-read and Oxford Nanopore long-read sequencing technology. Prophages are identified to assess genetic diversity within the CRKP genomes. Results The investigation encompasses eight CRKP strains collected from the enrolled patients and from the environment. blaKPC-2 is responsible for phenotypic resistance in six CRKP strains, which are assigned to K. pneumoniae sequence type 11 (ST11). The IncR and ColRNAI plasmids and the pMLST type IncF[F33:A-:B-] co-exist in all KPC-2-producing ST11 CRKP strains. Along with carbapenemases, all K. pneumoniae strains harbor two or three extended-spectrum β-lactamase (ESBL)-producing genes. The fosA gene is detected in all the CRKP strains. Single nucleotide polymorphism (SNP) markers are identified and validated among all CRKP strains, providing valuable clues for distinguishing carbapenem-resistant strains from conventional K. pneumoniae. Conclusions ST11 is the main CRKP type, and blaKPC-2 is the dominant carbapenemase gene harbored by the clinical CRKP isolates in the current investigation. The SNP markers detected would be helpful for distinguishing CRKP strains from general K. pneumoniae. The data provide insights for developing effective strategies to control CRKP and reduce nosocomial infections.

been identified in all gram-negative members of the ESKAPE pathogens [5], and KPC is the most clinically important enzyme because of its prevalence in Enterobacteriaceae [6]. Moreover, pathogens harboring KPC-2 are resistant to all β-lactams and β-lactamase inhibitors except ceftazidime/avibactam, which severely limits treatment options and leads to high mortality rates [7]. Additionally, NDM has become a serious threat to public health because of the rapid global dissemination of NDM-bearing pathogens and its presence on mobile genetic elements in a wide range of species [8]. Consequently, it is urgent to investigate the characteristics of CRKP in order to better control the pathogen and to diagnose and treat patients.
In the current investigation, seven CRKP strains are isolated from patients during their hospitalization and one further CRKP strain is obtained from the dining car in Mengchao Hepatobiliary Hospital of Fujian Medical University (Additional file 1: Table S1). The whole genomes of the CRKP strains are sequenced using MiSeq short-read and Oxford Nanopore long-read sequencing technology. We survey the prevalence of CRKP-mediated infection in the hospital, investigate the molecular characteristics of the strains obtained, and identify the phenotypes and resistance-associated genes of the emerging strains. The detected single nucleotide polymorphism (SNP) markers would be helpful for distinguishing CRKP strains from general K. pneumoniae. The data of this study provide essential insights for developing effective strategies to control CRKP and reduce nosocomial infections.

Antimicrobial susceptibilities of the CRKP strains
The sources of the isolates are given in Table 1, which also lists the infection type and the results of susceptibility testing during the patients' hospitalization. All eight strains in the study are confirmed to be K. pneumoniae, with five strains from sputum, one from bile, one from blood, and one from the environment (Additional file 1: Table S1). Clinical data demonstrate that seven of the eight patients are referred due to pulmonary infection, and another one is referred due to abdominal infection. The susceptibility testing data in Table 1 reveal that all the K. pneumoniae strains are resistant to almost all antibiotics, such as cephalosporins, penicillins, quinolones and carbapenems (imipenem with MICs ≥16 μg/ml). For aminoglycoside antibiotics, all isolates are resistant except 1567D, which is sensitive to amikacin and tobramycin. Strains 1566D, 2038D, 2039D and 2040D are resistant to sulfamethoxazole/trimethoprim with MICs ≥320 μg/ml, and the other strains (1567D, 2035D, 2036D, 2037D) are sensitive to sulfamethoxazole/trimethoprim with MICs ≤20 μg/ml.

Table 1 Antibiotic susceptibility profiles of K. pneumoniae. The results of antimicrobial susceptibility testing: antibiotic MIC (mg/L) and breakpoint interpretation or epidemiological cut-off value
Isolates   1566D      1567D      2035D      2036D      2037D      2038D      2039D      2040D
Source     sputum     bile       sputum     sputum     blood      sputum     sputum     environment
Infection  Pulmonary  Abdominal  Pulmonary  Pulmonary  Pulmonary  Pulmonary  Pulmonary

Genome assembly and annotation
The seven CRKP strains sequenced with short reads are assembled into contigs. As listed in Table 2, the assembled genome sizes of all strains range from 5.4 Mb to 5.8 Mb, with a mean length of 5.7 Mb and an average of 199 contigs. The N50 length of the genomes ranges from 176.6 kb to 251.6 kb, with an average N50 length of 220.4 kb, and the mean GC content is 57.2%. To obtain a more complete genome, the 1567D strain is resequenced via long-read sequencing technology and assembled into three contigs with a total size of 5.6 Mb (Additional file 1: Figure S1). A total of 5841 protein-coding genes are predicted, with lengths between 37 and 1649 bp (Additional file 1: Figure S2).
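The assembly statistics quoted above (N50 and GC content) can be computed from contig sequences in a few lines. The following is a minimal sketch rather than the pipeline used in this study; the function names and the toy contig lengths are assumptions introduced for illustration.

```python
def n50(contig_lengths):
    """Return the N50 (in bp) of an assembly given its contig lengths."""
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half of the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def gc_content(sequence):
    """Return the GC fraction of a nucleotide sequence, ignoring case."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    seq = sequence.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy contig lengths (hypothetical, not taken from Table 2).
toy_lengths = [100, 200, 300, 400, 500]
toy_n50 = n50(toy_lengths)        # 400: contigs of 500 + 400 bp cover half of 1500 bp
toy_gc = gc_content("GGCCAT")     # 4 of 6 bases are G or C
```

N50 is the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches half of the total assembly size, so fewer and longer contigs give a higher N50.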

Characteristics of the CRKP isolates
The eight isolated CRKP bacteria are sequenced on the Illumina MiSeq platform and assembled into whole genomes. To understand genetic diversity, a total of 24 prophages, a class of mobile genetic elements, are identified in the eight CRKP genomes, with sizes ranging from 8.4 kb to 98.9 kb (Fig. 1). According to the criterion that an intact prophage should be longer than 20 kb [9], the prophages detected in most strains (except for 2036D) are complete, with a size of at least 20.2 kb and an average GC percentage of 52.7%. Additionally, three prophages are each identified in three strains, indicating genomic sequence homology among the isolates. The 2036D strain contains only one prophage, probably because of its small genome size and distinct sequence characteristics; a smaller genome is expected to offer fewer neutral targets for prophage integration [9]. Furthermore, multilocus sequence typing (MLST) analysis reveals two unrelated sequence types (STs) among the K. pneumoniae strains isolated from different patients. The 2036D K. pneumoniae strain belongs to ST2632, and the other six strains belong to ST11 (Table 3). pMLST analysis reveals that all six ST11 K. pneumoniae strains are associated with IncF[F33:A-:B-], whereas the ST2632 K. pneumoniae strain is associated with IncHI1 and IncF.
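The length criterion for intact prophages can be applied to a list of predicted regions as sketched below. This is only an illustration of the >20 kb rule cited from [9], not a re-implementation of the prophage caller; the strain labels and the region list are hypothetical, and only the 8.4 kb, 20.2 kb and 98.9 kb sizes echo values mentioned in the text.

```python
INTACT_MIN_LENGTH_BP = 20_000  # length criterion quoted in the text [9]


def meets_length_criterion(length_bp, min_length_bp=INTACT_MIN_LENGTH_BP):
    """True if a prophage region is strictly longer than the threshold."""
    return length_bp > min_length_bp


def count_by_strain(prophages):
    """Count regions passing and failing the criterion per strain.

    prophages: iterable of (strain, length_bp) pairs.
    Returns {strain: {"pass": int, "fail": int}}.
    """
    counts = {}
    for strain, length_bp in prophages:
        entry = counts.setdefault(strain, {"pass": 0, "fail": 0})
        key = "pass" if meets_length_criterion(length_bp) else "fail"
        entry[key] += 1
    return counts


# Hypothetical regions and strain labels, for illustration only.
example = [("strain_A", 8_400), ("strain_A", 20_200), ("strain_B", 98_900)]
summary = count_by_strain(example)
```

The comparison is strict, so a region of exactly 20 kb does not pass; whether the boundary value should count is an assumption that follows the wording "more than 20 kb".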
Plasmid analysis [10] shows that the individual strains carry different circular plasmids. All strains harbor IncR and ColRNAI plasmids, which carry no virulence genes but contain several resistance-associated genes that confer resistance to carbapenems (Table 3). The IncR plasmid is identified as a multidrug-resistance plasmid and has variable copy numbers of certain resistance genes among the K. pneumoniae isolates.
Except for 2036D, all the other K. pneumoniae strains harbor blaKPC-2, which is associated with carbapenem resistance. Extended-spectrum β-lactamase (ESBL) resistance genes such as blaCTX-M, blaTEM, blaLEN and blaSHV are also detected. blaTEM is one of the genes that produce ESBLs.  detected in the 2036D strain and the other five K. pneumoniae strains, respectively. Except for the 2036D strain, the blaTEM-1B gene is observed in all the other six K. pneumoniae strains. aac(3)-IId and rmtB, which encode aminoglycoside resistance, are observed among all strains. oqxA and oqxB, which confer resistance to fluoroquinolones, are exclusively detected in the 2036D strain. fosA, which results in fosfomycin resistance [11], is also detected among all CRKP strains.

Characterizing CRKP SNPs and phylogeny
The SNP markers are identified for all strains sequenced using the short-read MiSeq data. The data demonstrate that 33,716 markers are detected in the 2036D strain, far more than in the other strains, which have an average of 8289 SNPs. The cSNPs, which are located in protein-coding regions, account for the majority of all detected SNPs, with a minimum proportion of 85.5% (Additional file 1: Table S11). In addition, the pairwise comparison analysis reveals that the 2036D isolate is distinct from the other strains based on clusters of sequence similarity obtained using a subprogram of Trinity [12] (Fig. 2a). Furthermore, the 2036D strain shares few SNP loci with the others, which coincides with the strain clusters (Fig. 2b). For validation, all strains show a high detection rate: approximately 153 of 200 SNPs (76.4%) are successfully amplified, which demonstrates the accuracy of the analysis (Additional file 1: Table S11). After filtering out SNP loci that are not located in coding regions, loci with no allele call, and loci that are wild type in every isolate, we eventually obtain 92 SNPs among the 200 validated loci. A total of 40 of the 92 SNPs are all-variation loci, that is, variant in all isolates, which could be used for distinguishing CRKP strains from ordinary K. pneumoniae (Additional file 1: Table S12). In addition, 24 SNPs at strain-unique loci, belonging to strains 2036D (18 loci), 2035D (3 loci), 1566D (2 loci) and 2037D (1 locus), would be helpful resources for specific strain identification in clinical analysis. Five CRKP strains previously isolated in Hangzhou [13] are downloaded from GenBank and compared with the strains in our study. The comparison suggests that the CRKP strains in Hangzhou are different from those in Fuzhou, indicating a geographical difference (Additional file 1: Figure S6).

Table 3 Resistance genes among the patient and environmental isolates
Phosphonic acid: fosA fosA fosA fosA fosA fosA fosA
The phylogenetic tree shows that the 1566D strain is the most distantly related to the other strains in the tree, whereas 2036D differs so much from the other strains that it is not even included in the phylogenetic tree (Additional file 1: Figure S6).
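The distinction between all-variation loci (variant in every isolate) and strain-unique loci (variant in exactly one isolate) can be expressed as a simple classification over a genotype table. The sketch below assumes a presence/absence encoding of the variant allele per isolate; the data structure, function name and locus names are hypothetical and are not taken from Additional file 1: Table S12.

```python
def classify_loci(genotypes, isolates):
    """Split validated loci into all-variation and strain-unique sets.

    genotypes: {locus: {isolate: True if the isolate carries the variant allele}}
    isolates:  list of isolate names to consider.
    Returns (all_variation_loci, {isolate: [unique loci]}).
    """
    all_variation = []
    unique = {}
    for locus, calls in genotypes.items():
        carriers = [iso for iso in isolates if calls.get(iso, False)]
        if len(carriers) == len(isolates):
            all_variation.append(locus)  # variant in every isolate
        elif len(carriers) == 1:
            unique.setdefault(carriers[0], []).append(locus)  # one isolate only
    return all_variation, unique


# Hypothetical calls for three isolates; locus names are placeholders.
isolates = ["2035D", "2036D", "2037D"]
genotypes = {
    "locus_1": {"2035D": True, "2036D": True, "2037D": True},
    "locus_2": {"2035D": False, "2036D": True, "2037D": False},
    "locus_3": {"2035D": True, "2036D": False, "2037D": True},
}
all_variation, unique = classify_loci(genotypes, isolates)
```

Loci carried by some but not all isolates, such as locus_3 in the toy data, fall into neither class; they remain informative for clustering but not for the two uses described above.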

GWAS analysis
To further identify significant SNPs and genes, we perform a genome-wide association study (GWAS) analysis. The patients' body temperature and leukocyte counts are selected as the phenotypic traits. The short sequencing reads of six strains (Fig. 3) are aligned to the 1567D genome using BWA v0.7.17. We call SNPs using Platypus v0.8.1 [14] and then filter the SNPs with plink v1.9. As a result, 9 loci are identified (P < 0.05). Two loci (ygbI and murB) are related to temperature and the other seven loci (IsrD, SufD, yrkF, fabI, sppA, entF and ttuB) are related to leukocyte count (Fig. 3).
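For intuition about a single-locus association test on so few isolates, the sketch below runs an exact two-sided permutation test on the difference in mean phenotype between carriers and non-carriers of a variant. This is a swapped-in illustrative technique and an assumption on our part, not the plink procedure used in the study; the temperature values and genotype labels are hypothetical.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_pvalue(values, carries_alt):
    """Two-sided exact permutation p-value for a difference in group means.

    values:      phenotype value per isolate (e.g. body temperature).
    carries_alt: True/False per isolate, whether it carries the variant.
    """
    n = len(values)
    if n != len(carries_alt):
        raise ValueError("values and carries_alt must have the same length")
    alt_indices = frozenset(i for i, flag in enumerate(carries_alt) if flag)
    k = len(alt_indices)
    if k == 0 or k == n:
        raise ValueError("both allele groups need at least one isolate")

    def abs_mean_diff(group):
        inside = [values[i] for i in group]
        outside = [values[i] for i in range(n) if i not in group]
        return abs(mean(inside) - mean(outside))

    observed = abs_mean_diff(alt_indices)
    # Enumerate every way of assigning the variant to k of the n isolates.
    labelings = list(combinations(range(n), k))
    # Small tolerance so that ties with the observed value are counted.
    extreme = sum(1 for g in labelings if abs_mean_diff(set(g)) >= observed - 1e-12)
    return extreme / len(labelings)


# Hypothetical temperatures (degrees Celsius) and genotypes for six isolates.
temperatures = [36.5, 36.8, 37.0, 38.9, 39.2, 39.5]
carries_alt = [False, False, False, True, True, True]
p_value = exact_permutation_pvalue(temperatures, carries_alt)
```

With six isolates the number of distinct labelings is small (20 for a three-versus-three split), so the set of attainable p-values under this particular test is coarse; this is a property of the sketch and says nothing about the test statistic applied by plink.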

Discussion
The data of the current study confirm that all CRKP strains hold two types of plasmids that carry no virulence genes but harbor an abundance of resistance-associated genes such as ESBLs and carbapenemases. One carbapenemase genotype, blaKPC-2, and two sequence types, ST11 and ST2632, are identified in the study, and KPC-2-positive ST11 is the prevalent type, accounting for six strains. The IncR and ColRNAI plasmids and the pMLST type IncF[F33:A-:B-] co-exist in all KPC-2-producing ST11 CRKP strains. The initial detection of a KPC-2-producing K. pneumoniae isolate from a hospital in China was reported in 2007 [15]. Since then, blaKPC-2-bearing K. pneumoniae isolates have become more prevalent and have been reported in China as well as other countries and areas [16]. Recently, one patient in the US was found to have susceptible K. pneumoniae bacteraemia [15]. However, that case is rather specific, since the patient might have been infected during a visit to and hospitalization in India, which would add more complex environmental factors that confound the results. CRKP of ST11 associated with blaKPC-2 is disseminated widely across China [17,18], which is concordant with the results of our study. These findings suggest that the CRKP-mediated infections in our hospital result from KPC-2-positive ST11 K. pneumoniae isolates. Continuous monitoring will be necessary to prevent further dissemination of carbapenem resistance genes.
Besides carbapenemases, a variety of ESBL genes such as blaCTX-M, blaSHV, blaLEN and blaTEM are present in the CRKP strains of this study. K. pneumoniae is one of the most important infectious agents in the ICU [19]. There are "classic" and hypervirulent strains of K. pneumoniae [20][21][22]. The "classic" non-virulent strain of K. pneumoniae (C-KP) can produce ESBLs and is related to nosocomial infectious outbreaks, especially in the ICU of a hospital. C-KP acquires antimicrobial resistance, such as ESBLs, more easily. In our investigation, blaCTX-M of different types is found among all the CRKP strains. Chromosome-mediated blaSHV and plasmid-mediated blaTEM, which also contribute to ESBL production, are observed in six K. pneumoniae strains. Co-occurrence of blaCTX-M, blaKPC-2, blaSHV-11 and blaTEM-1B is observed in five K. pneumoniae strains. All K. pneumoniae strains harbor two or three ESBL-producing genes (blaCTX-M, blaSHV and blaTEM), which indicates that all isolates contain multiple ESBL resistance genes. Previous reports noted consistent results: co-occurrence of blaTEM, blaSHV and blaCTX-M (any two or all three) was observed among Klebsiella isolates [23].
fosA is frequently identified in E. coli and K. pneumoniae genomes [24,25]. The fosA5 gene was first found in E. coli in 2014 [26]. In 2017, it was reported that all 73 carbapenem-resistant K. pneumoniae isolates from one Chinese area, Zhejiang Province, were positive for fosA5 [27]. Antimicrobial susceptibility testing for fosfomycin is not conducted in this study, but fosA is found among all the CRKP strains, which might indicate that fosfomycin-modifying enzymes account for a majority of the fosfomycin resistance and that the CRKP strains are resistant to fosfomycin. As reported, a clinical Escherichia coli isolate (strain HS102707) and an Enterobacter aerogenes isolate (strain HS112625) are resistant to carbapenems and fosfomycin and positive for the blaKPC-2 and fosA3 genes [25]; likewise, fosA exists in all CRKP strains with blaKPC-2 in our study. Continuous monitoring, together with prudent use of fosfomycin in clinical settings, will be necessary to prevent further dissemination of fosfomycin-resistant bacteria. The oqxA and oqxB genes are related to efflux pumps, which means that antibiotics such as cephalosporins, carbapenems and fluoroquinolones are almost completely expelled from K. pneumoniae through its cell membrane [28]. To our knowledge, these two genes are mainly reported to be responsible for resistance to fluoroquinolones. They have also been reported to be associated with nitrofurantoin resistance.
The short-read genome assemblies of the seven strains consist of many contigs and are highly fragmented. We therefore sequence the 1567D strain on a long-read sequencing platform, which allows us to assemble the genome with considerable improvement in completeness and contiguity. The resistance genes fosA, oqxA and oqxB and the 40 all-variation SNP loci are also identified in this genome, demonstrating the high quality of the assembly. Compared with a previous study that reported 12.3 substitutions on average [29], we identify more SNP markers in each isolate because of our looser threshold. The method of Yang et al. filters out a large number of SNPs with low frequency or depth and thus ensures the quality of the SNPs; however, isolate-specific markers might also be filtered out, which would not leave enough markers for GWAS and downstream analysis in the current study. As Klebsiella pneumoniae is an emerging nosocomial pathogen with extended antibiotic resistance, online resources such as BacWGSTdb [30], which offer rapid typing and phylogenetic relatedness linked to antibiotic resistance genes and clinical data, would be increasingly important in a globalized community. The assembly and annotation information will be beneficial for understanding the whole-genome characteristics of CRKP strains in future studies.

Conclusions
In conclusion, ST11 is the main CRKP type, and blaKPC-2 is the dominant carbapenemase gene harbored by the clinical CRKP isolates of the current investigation. The IncR and ColRNAI plasmids and the pMLST type IncF[F33:A-:B-] exist in all KPC-2-producing ST11 CRKP strains. Besides carbapenemases, all K. pneumoniae strains harbor two or three ESBL-producing genes (blaCTX-M, blaSHV and blaTEM), which indicates that all isolates contain multiple ESBL resistance genes. fosA genes are also found among all the CRKP strains, which may imply that fosfomycin-modifying enzymes account for a majority of the fosfomycin resistance and that the CRKP strains are resistant to fosfomycin. The 40 all-variation SNP loci found in all isolates could be used for distinguishing CRKP strains from ordinary K. pneumoniae. The detected SNP markers would be helpful for distinguishing CRKP strains from general K. pneumoniae. This study provides insights for developing effective strategies to control CRKP and reduce nosocomial infections.

Patient clinical information
In total, seven patients received treatment during their hospitalization, and their data were completely classified and studied. One bacterium was isolated from the dining car in the hospital; since the carrier was not human, there were no clinical data relating to it. All patients were diagnosed with severe pneumonia or lung infection, except patient 1567P, who was diagnosed with abdominal infection (Additional file 1: Table S1). Additional file 1: Tables S2-S8 provide all patients' treatment records as well as the phenotype measurement results in detail.
All patients received systematic medical examinations such as whole blood cell test, blood routine test, blood electrolyte test, blood clotting, fungal D-glucan detection, galactomannan detection, etc. All the records are archived in detail for further investigations.

Bacterial isolates, identification and antimicrobial resistance
Single-patient isolates are obtained from specimens received from inpatients admitted to Mengchao Hepatobiliary Hospital of Fujian Medical University (Fuzhou, China) in 2017. From April 2017 to December 2017, a total of eight CRKP isolates (Additional file 1: Table S1), which are resistant to all the antibiotics tested, such as cephalosporins, penicillins, quinolones, aminoglycosides and carbapenems (imipenem with MICs ≥16 μg/ml) (Table 1), are processed following standard operating procedures: the isolates are extracted according to aseptic operating procedures and cultured on Columbia agar with 5% sheep blood. The study has been performed in accordance with the requirements of the Institutional Ethical Committee of the Faculty of Medicine, Mengchao Hepatobiliary Hospital of Fujian Medical University, which approved this study (No. 2017_036_01).

Whole genome sequencing (WGS) and assembly
The seven isolated CRKP bacteria are sequenced on the Illumina MiSeq (Illumina, San Diego, CA, USA) platform.
The MiSeq short-read sequencing library is generated with 1 ng of purified DNA. End repair adds a phosphate to the 5′ end and an "A" to the 3′ end, and PCR fragments (300~600 bp) are collected after bar-coded adapter ligation. The library is purified with AMPure XP (Beckman Coulter) and then sequenced on the MiSeq platform. In total, 40.5 million reads (2 × 300 bp) with a size of 1.36 Gb are yielded (Additional file 1: Table S9). All short reads are first filtered to remove low-quality sequences and then assembled into contigs using SPAdes v3.11.1 [32].
Subsequently, we select isolate 1567D for long-read sequencing on the Oxford Nanopore MinION (Oxford, UK) platform, because long reads can readily span repeat regions. The sequencing library is constructed with 1.5 μg of purified DNA using the LSK-108 Oxford Nanopore Technologies (ONT) ligation protocol, and the prepared library is sequenced following the standard protocol of the Oxford Nanopore MinION. A total of 7.48 Gb of ultra-long reads are generated with an N50 length of 25,890 bp (Additional file 1: Table S10). The long reads that 'passed' Nanopore base calling are assembled into complete genomic sequences with Canu [33]. The long-read sequencing data of the same isolate are then used to correct base errors in the assembled genome using Nanopolish (https://github.com/jts/nanopolish).
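As a worked piece of arithmetic, the mean sequencing depth implied by the long-read yield can be estimated by dividing the total number of sequenced bases by the assembled genome size. The sketch below assumes that all 7.48 Gb of reads contribute to the 5.6 Mb assembly, which overstates the usable depth if some reads fail base calling or do not align; the function name is an assumption.

```python
def mean_depth(total_bases, genome_size_bp):
    """Estimate mean depth of coverage as total bases over genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return total_bases / genome_size_bp


# Values quoted in the text: 7.48 Gb of Nanopore reads, 5.6 Mb assembly.
nanopore_depth = mean_depth(7.48e9, 5.6e6)  # roughly 1336-fold
```

A depth of this order is far above what long-read assemblers usually need, so in practice the reads would likely be subsampled or only a subset used; that remark is a general observation, not a description of the procedure followed here.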

Detecting Prophages in the CRKP genomes
Putative prophages within the contigs of the CRKP genomic sequences are identified using the PHAST (PHAge Search Tool) web server [34], which checks sequence homology to detect, annotate, and graphically display prophages. Prophage completeness is reported as one of three categories (intact, incomplete, or questionable).

Carbapenemase-resistance gene identifications
To predict the protein-coding genes and functional proteins in the CRKP genomes, all assembled sequences are annotated with the web-based package RAST (Rapid Annotations using Subsystems Technology) [35]. Antibiotic resistance genes, virulence genes, plasmids, and the phenotyping and genotyping of the CRKP genomes are scanned using the Bacterial Analysis Pipeline [36]. Carbapenemase-resistance genes are further identified from the annotated sequences according to Simner et al. [37].
The protein-coding genes of the long-read assembled genome are predicted using GLIMMER (Gene Locator and Interpolated Markov ModelER) v3.02 [38]. To functionally annotate the predicted genes and perform the pathway analysis, we align them to the NR, COG, Swiss-Prot, GO and KEGG databases using blastX (E-value: 10^-5). The annotated genes serve to improve the completeness of some important carbapenemase-resistance genes.
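The E-value cut-off used for functional annotation can be applied to BLAST results as sketched below. The sketch assumes the hits were written in the standard 12-column tabular format (query, subject, identity, alignment length, mismatches, gap opens, query start and end, subject start and end, E-value, bit score), which is not stated in this study; the function name and the gene and protein identifiers are hypothetical.

```python
def filter_blast_hits(lines, max_evalue=1e-5):
    """Keep the best-scoring hit per query among hits passing the E-value cut-off.

    lines: iterable of tab-separated BLAST records in 12-column tabular format.
    Returns {query: (subject, evalue, bitscore)}.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip malformed records
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical records: gene1 has two passing hits, gene2 has none.
records = [
    "gene1\tprotA\t98.5\t300\t4\t0\t1\t900\t1\t300\t1e-50\t400",
    "gene1\tprotB\t60.0\t250\t90\t2\t1\t750\t5\t255\t1e-20\t150",
    "gene2\tprotC\t35.0\t80\t50\t1\t1\t240\t10\t90\t0.5\t30",
]
annotated = filter_blast_hits(records)
```

Keeping only the highest bit score per query is one common convention for assigning a single annotation to each gene; other choices, such as keeping all passing hits, would be equally compatible with the description above.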
Comparisons of strain similarity are performed using the Harvest Tools Suite [39] (version 1.1.2). For all of the isolates sequenced on a particular platform, parsnp is used to compare all the assembled isolates against each other and against a known reference strain. The results are visualized using EvolView.

SNP identification and validation
We download the K. pneumoniae genome from NCBI as the reference (Accession No. PRJNA78789) to identify SNP markers [18]. All high-quality data (Q value > 20, read length > 50 bp, proportion of uncertain bases < 5%) of the eight CRKP strains are aligned to the reference genome sequence using BWA v0.7.17 [40], and the aligned reads are sorted by coordinate with SAMTOOLS v1.4 [41]. The GATK (Genome Analysis Tool Kit) software v3.8.0 [42] is used to detect SNPs as follows: (1) duplicated reads are removed; (2) reads around insertions/deletions are realigned; (3) base quality is recalibrated using default parameters; (4) all variants are identified using the HaplotypeCaller method in GATK with the emitting and calling standard confidence thresholds set at 10.0 and 30.0, respectively. To validate the detected SNPs in the seven CRKP strains, we select 20 loci within each sample that are located in protein-coding regions and sequence them with high read depth. Primers are designed for all chosen markers, which are amplified on the Sequenom MassARRAY iPLEX platform.
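The read-level quality criteria listed above can be written as a small filter. This is a minimal sketch under stated assumptions: "Q value > 20" is interpreted here as the mean Phred score of the read, "uncertain bases" as N characters, and quality strings as Phred+33 encoded; none of these details are specified in the text, and the function names are hypothetical.

```python
def phred33_to_scores(quality_string):
    """Convert a Phred+33 encoded FASTQ quality string to integer scores."""
    return [ord(char) - 33 for char in quality_string]


def passes_filter(sequence, qualities, min_mean_q=20, min_length=50,
                  max_n_fraction=0.05):
    """Apply the three read filters: mean quality, length, and N content.

    All three thresholds are strict, mirroring "> 20", "> 50 bp" and "< 5%".
    """
    if len(sequence) != len(qualities):
        raise ValueError("sequence and qualities must have the same length")
    if len(sequence) <= min_length:
        return False  # also rejects empty reads before any division
    mean_q = sum(qualities) / len(qualities)
    if mean_q <= min_mean_q:
        return False
    n_fraction = sequence.upper().count("N") / len(sequence)
    return n_fraction < max_n_fraction


# A 60 bp read with uniform quality 'I' (Phred 40) passes all three filters.
read = "ACGT" * 15
keep = passes_filter(read, phred33_to_scores("I" * 60))
```

If the original pipeline applied the quality threshold per base rather than to the read mean, the second check would need to be replaced accordingly.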
Additional file 1:
Figure S1. Circle diagram of K. pneumoniae genome sequenced via Oxford Nanopore sequencing technology.
Figure S2. Distribution of protein-coding genes predicted in 1567D strain.
Figure S3. COG classification of the 1567D strain for the carbapenem-resistant K. pneumoniae.
Figure S4. Distribution of K. pneumoniae genes annotated in GO term.
Figure S5. Annotation of KEGG pathways in the carbapenem-resistant K. pneumoniae.
Figure S6. Phylogenetic tree assessing the relatedness of the carbapenem-resistant K. pneumoniae strains in Fuzhou (purple) and in Hangzhou (green) to the reference genome database (blue).
Table S1. Information of strains and patient diagnosis.
Table S2. Phenotypes of 1566D, a.k.a., medical records of 1566P.
Table S3. Phenotypes of 1567D, a.k.a., medical records of 1567P.
Table S4. Phenotypes of 2035D, a.k.a., medical records of 2035P.
Table S5. Phenotypes of 2036D, a.k.a., medical records of 2036P.
Table S6. Phenotypes of 2037D, a.k.a., medical records of 2037P.
Table S7. Phenotypes of 2038D, a.k.a., medical records of 2038P.
Table S8. Phenotypes of 2039D, a.k.a., medical records of 2039P.
Table S9. Illumina MiSeq sequencing yields.
Table S10. Oxford Nanopore sequencing yields.
Table S11. Detection and validation of SNPs in seven strains.
Table S12. A total of 92 all-variation SNPs in seven strains. Red refers to 40 all-variation loci; Bold stands for 24 strain's unique SNP loci.