Characterizing clinical carbapenem-resistant Klebsiella pneumoniae phenotypes and genomic information reveals its molecular epidemiology

The increasing incidence of carbapenem-resistant Klebsiella pneumoniae (CRKP)-mediated infections in Hospital of Medical University in 2017 prompted this investigation to study the gene phenotypes and resistance genes of the emerging CRKP strains. In the current study, seven inpatients with complete treatment records are enrolled. The whole genomes of the CRKP strains are sequenced using MiSeq short-read and Oxford Nanopore long-read sequencing technology. Prophages are identified to assess genetic diversity within the CRKP genomes. The investigation encompasses eight CRKP strains collected from the enrolled patients and from the environment, and shows that bla KPC-2 is responsible for phenotypic resistance in six CRKP strains inferred to belong to K. pneumoniae sequence type 11 (ST-11). Plasmids of the IncR and ColRNAI types and of pMLST type IncF[F33:A-:B-] co-exist in all KPC-2-producing ST-11 CRKP strains. Along with carbapenemases, all K. pneumoniae strains harbor two or three extended-spectrum β-lactamase (ESBL)-producing genes. The fosA gene is detected in all the CRKP strains. Expression of oqxA and oqxB in CRKP strains may contribute to carbapenem resistance, since efflux pumps expel antimicrobials from pathogenic bacteria. Single nucleotide polymorphism (SNP) markers are identified and validated among all CRKP strains, providing valuable clues for distinguishing carbapenem-resistant strains from conventional K. pneumoniae. The data of this study provide essential insights for developing effective strategies to control CRKP and reduce nosocomial infections.


Background
Antibiotic resistance is among the most severe public health challenges today. Carbapenem-resistant Enterobacteriaceae (CRE) arise mainly through the acquisition of carbapenemase genes, and in 2013 CRE were designated an urgent threat to human health by the Centers for Disease Control and Prevention (CDC), USA 1.
Carbapenems such as imipenem, meropenem, and biapenem represent the first-line treatment for serious infections caused by multi-resistant Enterobacteriaceae, including Klebsiella pneumoniae (K. pneumoniae) and Escherichia coli (E. coli) 2. However, carbapenems can be hydrolyzed by carbapenemases in carbapenem-resistant K. pneumoniae (CRKP) 3, which results in resistance to β-lactam antibiotics including carbapenems. Carbapenemases can be divided into Ambler class A β-lactamases (e.g., Klebsiella pneumoniae carbapenemase, KPC), class B metallo-β-lactamases (MBLs) such as the Verona integron-encoded metallo-β-lactamase (VIM) and New Delhi metallo-β-lactamase (NDM) types, and class D enzymes of the OXA-48 type 4. Among Ambler class A β-lactamases, plasmid-mediated KPC has been identified in all gram-negative members of the ESKAPE pathogens 5, and KPC is the most clinically important enzyme owing to its prevalence in Enterobacteriaceae 6. Moreover, pathogens harboring KPC-2 are resistant to all β-lactams and β-lactamase inhibitors, which severely limits treatment options and leads to high mortality rates 7. Additionally, NDM has become a serious threat to public health owing to the rapid global dissemination of NDM-bearing pathogens and its presence on mobile genetic elements in a wide range of species 8. Consequently, it is imperative to investigate the characteristics of CRKP in order to better control the pathogen and to diagnose and treat patients.
In the present investigation, seven CRKP strains are isolated from patients during their hospitalization and one further CRKP strain is obtained from a dining cart in Mengchao Hepatobiliary Hospital of Fujian Medical University (Supplementary Table 1). The whole genomes of the CRKP strains are sequenced using MiSeq short-read and Oxford Nanopore long-read sequencing technology. We conduct surveillance of the prevalence of CRKP-mediated infection in the hospital, investigate the molecular epidemiological characteristics of the strains obtained, and identify the gene phenotypes and resistance genes of the emerging strains. The detected single nucleotide polymorphism (SNP) markers would be helpful for distinguishing CRKP strains from general K. pneumoniae. The data of this study provide essential insights for developing effective strategies to control CRKP and reduce nosocomial infections.

Patient clinical information
In total, seven patients receive treatment during their hospitalization, and their data are fully classified and analyzed. One bacterium is isolated from a dining cart in the hospital; since the carrier is not human, there are no clinical data relating to it. All patients, except patient 1567P, who is diagnosed with abdominal infection, are diagnosed with severe pneumonia or suffer from lung infections (Supplementary Table 1). Supplementary Tables 2-8 provide all patients' treatment records in detail, as well as the phenotype measurement results and data. For instance, 2036P is an 83-year-old male patient. Before admission to our hospital, he was treated in another hospital. His medical record as of August 1, 2017 shows a urea nitrogen of 28.1 mmol/L and a creatinine of 429 μmol/L. His urinary red blood cell malformation rate is as high as 56%. Pulmonary CT indicates nodules in his right upper lung, possibly peripheral lung cancer, and bilateral pleural effusion. He was hospitalized in our hospital from August 25 to September 29, 2017. For his medical treatment records, please refer to Supplementary Table 5.
All patients receive systematic medical examinations, including whole blood cell tests, routine blood tests, blood electrolyte tests, blood clotting tests, fungal D-glucan detection, and galactomannan detection. All records are archived in detail for further investigation.

Bacterial Isolates
Single-patient isolates are obtained from specimens received from inpatients admitted to Mengchao Hepatobiliary

Bacterial Identification and Antimicrobial Resistance
Bacterial isolates are confirmed by matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry (BioMerieux SA, BioMerieux Inc., France). The resistance of the pathogenic bacteria is determined with an automatic microbial identification system (VITEK-2 Compact, BioMerieux Inc., France) and the K-B method (AST-GN13, BioMerieux Inc., France). The results of antimicrobial susceptibility testing are interpreted according to the Clinical and Laboratory Standards Institute (CLSI) guidelines 9.
The quality-control strain is K. pneumoniae ATCC 700603 (American Type Culture Collection, ATCC).

Whole Genome Sequencing (WGS) and Assembly
The seven isolated CRKP strains are sequenced on the Illumina MiSeq platform (Illumina, San Diego, CA, USA). The MiSeq short-read sequencing library is generated from 1 ng of purified DNA. Fragments are end-repaired by adding a phosphate to the 5' end and an "A" to the 3' end, and PCR fragments (300-600 bp) are collected after bar-coded adapter ligation. The library is purified with AMPure XP (Beckman Coulter) and then sequenced on the MiSeq platform. In total, 40.5 million reads (2 × 300 bp) comprising 1.36 Gb of data are obtained (Supplementary Table 9). All short reads are first filtered to remove low-quality sequences and then assembled into contigs using SPAdes v3.11.1 10.
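As an illustration of the read-filtering step, the sketch below applies the thresholds reported for the SNP analysis in this paper (Q value > 20, read length > 50 bp, fewer than 5% uncertain bases) to individual reads. It is not the authors' pipeline: the function names are hypothetical, the Q value is assumed to be the mean Phred score of a read in Phred+33 encoding, and uncertain bases are assumed to be N calls. Whether the same thresholds were applied before assembly is not stated.

```python
from typing import Iterable, Iterator, Tuple

# A read is represented as (name, sequence, quality string); this layout is an assumption.
Read = Tuple[str, str, str]


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (Phred+33 encoding assumed)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def passes_filter(
    seq: str,
    qual: str,
    min_q: float = 20.0,       # keep reads with mean Q strictly above this value
    min_len: int = 50,         # keep reads strictly longer than this many bases
    max_n_frac: float = 0.05,  # keep reads with an N fraction strictly below this value
) -> bool:
    """Return True if a read satisfies all three thresholds."""
    if len(seq) <= min_len:
        return False
    if mean_phred(qual) <= min_q:
        return False
    n_fraction = seq.upper().count("N") / len(seq)
    return n_fraction < max_n_frac


def filter_reads(reads: Iterable[Read]) -> Iterator[Read]:
    """Yield only the reads that pass the filter."""
    for name, seq, qual in reads:
        if passes_filter(seq, qual):
            yield name, seq, qual
```

In practice a dedicated trimming tool would normally be used before assembly; the sketch only makes the three criteria explicit.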
Subsequently, we select isolate 1567D for long-read sequencing on the Oxford Nanopore MinION platform (Oxford, UK), so that repeat regions can be spanned. The sequencing library is constructed from 1.5 μg of purified DNA using the LSK-108 Oxford Nanopore Technologies (ONT) ligation protocol, and the prepared library is sequenced following the standard Oxford Nanopore MinION protocol. A total of 7.48 Gb of ultra-long reads is generated, with an N50 length of 25,890 bp (Supplementary Table 10). The long reads that 'passed' Nanopore base calling are assembled into complete genomic sequences with Canu 11. The long-read sequencing data of the same isolate are used to correct base errors in the assembled genome using Nanopolish (https://github.com/jts/nanopolish).

Detecting Prophages in the CRKP Genomes
The putative prophages within contigs of the CRKP genomic sequences are identified using the PHAST web server (PHAge Search Tool) 12. PHAST checks sequence homology to detect, annotate, and graphically display prophages, and reports the completeness category of each prophage (intact, incomplete, or questionable).

Carbapenem-resistance Gene Identification
To predict the protein-coding genes and functional proteins in the CRKP genomes, all assembled sequences are annotated with the web-based package RAST (Rapid Annotations using Subsystems Technology) 13. The CRKP genomes are scanned for antibiotic resistance and virulence genes, plasmids, and phenotyping and genotyping information using the Bacterial Analysis Pipeline 14.
Carbapenem-resistance genes are further identified from the annotated sequences according to Simner et al. 15.
The protein-coding genes of the long-read assembled genome are predicted using GLIMMER (Gene Locator and Interpolated Markov ModelER) v3.02 16. To functionally annotate the predicted genes and perform pathway analysis, we align them to the NR, COG, Swiss-Prot, GO and KEGG databases using blastX (E-value: 10⁻⁵). The annotated genes serve to improve the completeness of some important carbapenem-resistance genes.
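The E-value cut-off used in the alignment step can be illustrated with a short sketch that keeps, for each predicted gene, the best database hit at or below 10⁻⁵. It assumes the alignments were exported in the standard 12-column BLAST tabular format (`-outfmt 6`), which the paper does not state; the function name `best_hits` and the example identifiers are hypothetical.

```python
import csv
import io
from typing import Dict, Tuple


def best_hits(tabular_text: str, max_evalue: float = 1e-5) -> Dict[str, Tuple[str, float, float]]:
    """Keep the highest-bitscore hit per query among hits with E-value <= max_evalue.

    Expects BLAST tabular output with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best: Dict[str, Tuple[str, float, float]] = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue  # hit is weaker than the cut-off
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Made-up example rows, for illustration only.
example = (
    "gene1\tprotA\t98.0\t300\t6\t0\t1\t900\t1\t300\t1e-150\t520\n"
    "gene1\tprotB\t60.0\t280\t112\t2\t1\t840\t5\t284\t1e-40\t160\n"
    "gene2\tprotC\t30.0\t50\t35\t1\t1\t150\t10\t59\t0.5\t25\n"
)
print(best_hits(example))  # gene2 has no hit passing the cut-off
```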

SNP Identification and Validation
We download a K. pneumoniae genome from NCBI as the reference for identifying SNP markers. All high-quality data (Q value > 20, read length > 50 bp, number of uncertain bases < 5%) of the eight CRKP strains are aligned to the reference genome sequence using BWA v0.7.17 17, and the aligned reads are sorted by coordinate with SAMTOOLS v1.4 18. The GATK (Genome Analysis Toolkit) software v3.8.0 19 is used to detect SNPs as follows: (1)

Antimicrobial Susceptibilities of the CRKP Strains
The sources of the isolates are listed in Table 1, which also gives the type of infection and the results of susceptibility testing during the patients' hospitalization. All eight strains involved in the study are confirmed to be K. pneumoniae, with five strains from sputum, one from bile, one from blood, and one from the environment (Supplementary Table 1). Clinical data show that six of the seven patients are referred for pulmonary infection and one for abdominal infection. The susceptibility testing data in Table 1 reveal that all the K. pneumoniae strains are resistant to almost all antibiotics tested, including cephalosporins, penicillins, quinolones and carbapenems (imipenem MICs ≥16). Among aminoglycoside antibiotics, isolate 1567D is sensitive to amikacin and tobramycin, whereas all other isolates are resistant. Strains 1566D, 2038D, 2039D and 2040D are resistant to sulfamethoxazole/trimethoprim (MICs ≥320), and the other strains (1567D, 2035D, 2036D, 2037D) are sensitive to it (MICs ≤20).

Genome assembly and annotation
The seven short-read sequenced CRKP strains are assembled into contigs. As listed in Table 2, the assembled genome sizes of all strains range from 5.4 Mb to 5.8 Mb, with a mean length of 5.7 Mb and an average of 199 contigs. The N50 length of the genomes ranges from 176.6 kb to 251.6 kb, with an average N50 of 220.4 kb, and the mean GC content is 57.2%. To obtain a more complete genome, strain 1567D is resequenced using long-read sequencing technology and assembled into three contigs with a size of 5.6 Mb (Figure 1). A total of 5,841 protein-coding genes are predicted, with lengths between 37 and 1,649 bp (Supplementary Figure 1). In total, 4,657, 5,097, 4,714, 3,179 and 3,099 predicted genes are functionally annotated in the NR, COG, Swiss-Prot, GO and KEGG databases, respectively (Supplementary Figures 2, 3, 4).
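For readers unfamiliar with the assembly statistics reported here, the following sketch shows how N50 and GC content are conventionally computed from a set of contigs. The function names and the toy contig lengths are illustrative assumptions and are not taken from Table 2.

```python
from typing import Sequence


def n50(contig_lengths: Sequence[int]) -> int:
    """Length of the contig at which the running total of lengths,
    taken from longest to shortest, first reaches half of the assembly size."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def gc_content(sequence: str) -> float:
    """Fraction of G and C bases among the A/C/G/T bases of a sequence."""
    seq = sequence.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


# Toy example: total length 32, so N50 is reached once 16 bases are covered.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
print(gc_content("GGCCAT"))
```

A higher N50 at a similar total size, as obtained for the long-read assembly of 1567D, indicates a less fragmented assembly.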

Characteristics of the CRKP Isolates
The seven isolated CRKP strains are sequenced on the Illumina MiSeq platform and assembled into whole genomes. To understand genetic diversity, 24 prophages, which are mobile genomic elements, are identified in the seven CRKP genomes, with sizes ranging from 8.4 kb to 49.3 kb (Figure 2). According to the criterion that the length of an intact prophage should be more than
Plasmid analysis shows that the individual strains carry different circular plasmids. All strains harbor IncR and ColRNAI plasmids, which carry no virulence genes but contain several associated resistance genes (Table 3). The IncR plasmid is identified as a multidrug-resistant plasmid and has variable copy numbers of certain resistance genes among K. pneumoniae isolates.
Except for 2036D, all K. pneumoniae strains harbor the carbapenemase-producing resistance gene bla KPC-2.
Extended-spectrum β-lactamase (ESBL) resistance genes such as bla CTX-M, bla TEM, bla LEN and bla SHV are also detected. The fosA gene, which confers fosfomycin resistance 21, is also detected in all strains. The oqxA and oqxB genes are likewise detected in all CRKP strains and might contribute to carbapenem resistance, since efflux pumps expel antimicrobials from pathogenic bacteria 22.

Characterizing CRKP SNPs
The SNP markers are identified for all strains sequenced using the short-read MiSeq data. The data demonstrate that SNPs. The cSNPs located in exonic regions make up the majority of all detected SNPs, with a minimum proportion of 85.5% (Supplementary Table 11). In addition, pairwise comparison reveals that isolate 2036D is distinct from the other strains, based on clusters of sequence similarity obtained with a subprogram of Trinity 23 (Figure 3a). Furthermore, strain 2036D shares few SNP loci with the others, which agrees with the strain clusters (Figure 3b).
For validation, all strains show a high detection rate: approximately 153 of 200 SNPs (76.4%) are successfully amplified, which supports the accuracy of the analysis (Supplementary Table 11). After filtering out SNP loci that are not located in exonic regions, that have no called allele, or that are wild-type in every isolate, we eventually obtain 92 SNPs among the 200 validated loci. A total of 40 of these 92 SNPs are all-variation loci in all isolates, which could be used to distinguish CRKP strains from ordinary K. pneumoniae (Supplementary Table 12).

GWAS analysis
To further identify significant SNPs and genes, we perform a genome-wide association study (GWAS). The patients' body temperature and leukocyte counts are selected as phenotypic traits. The short sequencing reads of six strains (Figure 4) are aligned to the 1567D genome using BWA v0.7.17. We call SNPs using Platypus v0.8.1 25, and then filter the SNPs with plink v1.9 on the following conditions: (i) missing loci, (ii) minor allele frequency (MAF) < 0.05 and (iii) significant deviation from Hardy-Weinberg equilibrium (HWE) (P < 0.01). A total of 698 SNP markers remain and are used for the GWAS analysis. As a result, 9 loci are identified (P < 0.05). Two loci (ygbI and murB) are associated with temperature and the other seven loci (IsrD, SufD, yrkF, fabI, sppA, entF and ttuB) are associated with leukocyte count (Figure 4).
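The first two filtering conditions can be made concrete with a minimal sketch. It is not a reimplementation of plink: it assumes haploid calls coded 0 (reference allele), 1 (alternative allele) or None (missing) for each strain, the locus names are made up, and the HWE test in condition (iii) is deliberately left out.

```python
from typing import Dict, List, Optional

Calls = List[Optional[int]]  # one call per strain: 0 = reference, 1 = alternative, None = missing


def minor_allele_frequency(calls: Calls) -> float:
    """Frequency of the less common allele among the non-missing calls."""
    observed = [c for c in calls if c is not None]
    if not observed:
        return 0.0
    alt_freq = sum(observed) / len(observed)
    return min(alt_freq, 1.0 - alt_freq)


def filter_loci(genotypes: Dict[str, Calls], min_maf: float = 0.05) -> Dict[str, Calls]:
    """Drop loci with any missing call or with MAF below min_maf."""
    kept: Dict[str, Calls] = {}
    for locus, calls in genotypes.items():
        if any(c is None for c in calls):
            continue  # condition (i): missing loci
        if minor_allele_frequency(calls) < min_maf:
            continue  # condition (ii): MAF < 0.05
        kept[locus] = calls
    return kept


# Hypothetical calls for six strains at three loci.
genotypes = {
    "locus_a": [0, 1, 0, 1, 0, 0],     # kept: MAF = 2/6
    "locus_b": [0, 0, 0, 0, 0, 0],     # dropped: monomorphic, MAF = 0
    "locus_c": [0, 1, None, 1, 0, 0],  # dropped: one missing call
}
print(sorted(filter_loci(genotypes)))
```

With only six strains, any locus that varies at all has a MAF of at least 1/6, so the MAF threshold in effect removes only monomorphic loci.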

Discussion
Data of the current study confirm that all CRKP strains hold two types of plasmids that carry no virulence genes but harbor an abundance of associated resistance genes, such as ESBL and carbapenemase genes. One carbapenemase genotype, bla KPC-2, and two sequence types, ST-11 and ST-2632, are identified in the study; KPC-2-positive ST-11 is a prevalent cause of infectious outbreaks, especially in hospital ICUs. C-KP more easily acquires antimicrobial resistance such as ESBLs. Different types of bla CTX-M are found in all the CRKP strains. Chromosome-mediated bla SHV and plasmid-mediated bla TEM, which also contribute to ESBL production, are observed in six K. pneumoniae strains. Co-occurrence of bla CTX-M, bla KPC-2, bla SHV-11 and bla TEM-1B is observed in five K. pneumoniae strains. All K. pneumoniae strains harbor two or three ESBL-producing genes (bla CTX-M, bla SHV and bla TEM), which indicates that all isolates contain multiple ESBL resistance genes. Previous reports noted consistent results: co-occurrence of bla TEM, bla SHV and bla CTX-M (any two or all three) was observed among Klebsiella isolates 31.
FosA is frequently identified in E. coli and K. pneumoniae genomes 32. The fosA5 gene was first found in E. coli in 2014 33. In 2017, it was reported that all 73 carbapenem-resistant K. pneumoniae isolates from one Chinese region, Zhejiang Province, were positive for fosA5 34. Although fosfomycin susceptibility testing is not conducted in this study, fosA is found in all the CRKP strains, which might indicate that fosfomycin-modifying enzymes account for the majority of fosfomycin resistance and that the CRKP strains are resistant to fosfomycin. It is reported that the fosA gene is transferred from E. coli to K. pneumoniae through whole-plasmid transmission or mobile genetic element transmission. Because fosA exists in all CRKP strains in our study, this raises doubts about whether fosfomycin can be used as a supplementary drug for urinary tract infections caused by carbapenem-resistant E. coli in the hospital. Continuous monitoring, together with prudent use of fosfomycin in clinical settings, will be necessary to prevent further dissemination of fosfomycin-resistant bacteria.
The oqxA and oqxB genes encode efflux pumps, which expel antibiotics such as cephalosporins, carbapenems and fluoroquinolones from K. pneumoniae across its cell membrane 35. The quinolone/olaquindox efflux pump encoded by oqxA and oqxB has also been associated with tigecycline resistance in K. pneumoniae 36. Although no carbapenemases are observed in strain 2036D, the oqxA and oqxB genes are identified in it. Expression of oqxA and oqxB in CRKP strains might result in carbapenem resistance because efflux pumps expel antimicrobials from pathogenic bacteria. Consequently, carbapenem resistance in this K. pneumoniae isolate might be due to the oqxA/oqxB efflux pumps rather than carbapenemase production.
The genome sequences of the seven strains comprise many contigs and are highly fragmented. We therefore sequence strain 1567D on a long-read sequencing platform, which allows us to assemble the genome with considerable improvement in completeness and contiguity. The resistance genes fosA, oqxA and oqxB and the 40 all-variation SNP loci are also identified in this genome, which demonstrates the high quality of the assembly. The assembly and annotation information will be beneficial for understanding the whole-genome characteristics of CRKP strains in future studies.

Conclusions
In conclusion, ST-11 is the main CRKP type, and bla KPC-2 is the dominant carbapenemase gene harbored by the clinical CRKP isolates in the present study. Plasmids of the IncR and ColRNAI types and of pMLST type IncF[F33:A-:B-] exist in all KPC-2-producing ST-11 CRKP strains. Besides carbapenemases, all K. pneumoniae strains harbor two or three ESBL-producing genes (bla CTX-M, bla SHV and bla TEM), which indicates that all isolates contain multiple ESBL resistance genes. FosA genes are also found in all the CRKP strains, which may imply that fosfomycin-modifying enzymes account for the majority of fosfomycin resistance and that the CRKP strains are resistant to fosfomycin. Expression of oqxA and oqxB in CRKP strains might result in carbapenem resistance because efflux pumps expel antimicrobials from pathogenic bacteria. The 40 all-variation SNP loci found in all isolates could be used for distinguishing CRKP strains from ordinary K. pneumoniae.

Figure 1
Circle diagram of K. pneumoniae genome sequenced via Oxford Nanopore sequencing technology.

Figure 2
Intact prophages identified in seven CRKP strains.

GWAS results of the analysis.