Comparative genomic analysis of Pectobacterium carotovorum subsp. brasiliense SX309 provides novel insights into its genetic and phenotypic features

Background Pectobacterium carotovorum subsp. brasiliense is a broad host range bacterial pathogen that causes blackleg of potatoes and bacterial soft rot of vegetables worldwide. Production of plant cell wall degrading enzymes is usually critical for Pectobacterium infection. However, other virulence factors and the mechanisms of genetic adaptation still need to be studied in detail.
Results In this study, the complete genome of P. carotovorum subsp. brasiliense strain SX309 isolated from cucumber was compared with eight other pathogenic bacteria belonging to the Pectobacterium genus, which were isolated from various host plants. Genome comparison revealed that most virulence genes are highly conserved in the Pectobacterium strains, especially the key virulence determinants involved in the biosynthesis of extracellular enzymes and others including the type II and III secretion systems, the quorum sensing system, and flagellar and chemotactic genes. Nevertheless, some variable regions of the T6SS and the CRISPR-Cas immune system are unique to P. carotovorum subsp. brasiliense.
Conclusions The extensive comparative genomics analysis revealed highly conserved virulence genes in the Pectobacterium strains. However, several variable regions of the type VI secretion system and two subtypes of CRISPR-Cas immune systems possibly contribute to the process of Pectobacterium infection and adaptive immunity.
Electronic supplementary material The online version of this article (10.1186/s12864-019-5831-x) contains supplementary material, which is available to authorized users.

carotovorum subsp. brasiliense BC1 isolated from Brassica rapa ssp. pekinensis [2] and P. carotovorum subsp. brasiliense BZA12 isolated from Cucumis sativus [9], and P. carotovorum subsp. odoriferum BC S7 isolated from Brassica rapa ssp. pekinensis. P. carotovorum subsp. brasiliense was originally reported in Brazil and has since been fully described [10]. Subsequently, P. carotovorum subsp. brasiliense has emerged as a global problem with reports from many regions of the world, including Canada, the United States, the Netherlands, Switzerland, South Africa, Kenya, South Korea, and Japan [11][12][13]. During 2014-2016, a devastating cucumber bacterial soft rot caused by P. carotovorum subsp. brasiliense occurred in northern China [14]. Nevertheless, very few studies have examined the complete genome of P. carotovorum subsp. brasiliense, and consequently the pathogenicity of this subspecies and its genetic adaptation to the host remain largely unknown.
The symptoms caused by Pectobacterium infection include soft rot and wilts resulting from vascular invasion. Extensive studies on the Pectobacterium pathogens that infect vegetable crops and ornamental plants led to the identification of a number of virulence factors, including extracellular degradative enzymes, diverse regulatory systems, and bacterial secretion systems, which collectively contribute to bacterial infection [15]. Pectobacterium spp. are pectinolytic pathogens that produce large quantities of plant cell wall degrading enzymes (PCWDEs). These include pectate lyase (Pel), polygalacturonase (Peh), cellulase (Cel), protease (Prt) and many others that are used to catalyze the breakdown of pectin, the primary plant cell wall component [16]. These exoenzymes are secreted via the type II secretion system (T2SS) [17] under the control of an N-acyl homoserine lactone (AHL)-dependent quorum sensing (QS) system [18]. The virulence genes related to flagella biosynthesis, bacterial colonization, and swimming motility are also regulated by the QS system in P. carotovorum subsp. brasiliense [19]. The type III secretion system (T3SS) plays an important role in the pathogenesis of most plant pathogenic bacteria [20]. Interestingly, however, most T3SS-deficient Pectobacterium strains exhibit virulence in planta similar to that of T3SS-encoding strains [21]. The type VI secretion system (T6SS) may also be important for bacterial pathogenicity and host adaptation in some bacteria, and it has been associated with various biological functions including biofilm formation, host adaptation and bacterial survival [22].
Most archaea and many bacteria protect themselves from infection by foreign genetic elements via CRISPR-Cas adaptive immunity systems to ensure their survival [23]. CRISPR-Cas immunity systems function in three stages: adaptation, CRISPR RNA (crRNA) biogenesis, and interference [24]. Currently, CRISPR-Cas systems are divided into two classes: class 1 (types I, III and IV), which requires multi-Cas protein complexes for interference, and class 2 (types II, V and VI), which employs a single effector protein for interference [24]. In E. coli, the type I-E CRISPR-Cas interfering complex contains not only Cas1 and Cas2 but also all other components of the effector Cascade complex (casA, casB, casC, casD, casE, and crRNA) and the Cas3 nuclease [25]. The subtype I-F Cas1 and Cas3 hybrid proteins interact with each other, suggesting a protein complex for adaptation and a role for the subtype I-F Cas3 proteins in both the adaptation and interference steps of the CRISPR/Cas mechanism [26]. Previous studies have shown that the CRISPR/Cas system in P. atrosepticum encodes six proteins: Cas1, Cas3, Csy1, Csy2, Csy3 and Csy4 [26]. Nevertheless, the biological functions of the CRISPR/Cas systems remain poorly understood in P. carotovorum subsp. brasiliense.
In this study, the complete genome of P. carotovorum subsp. brasiliense (Pcb) strain SX309, which is highly virulent in a wide range of host plant species, was sequenced, annotated and compared with the representative genomes of other Pectobacterium species and P. carotovorum subspecies, with a particular focus on virulence factors, regulatory mechanisms and potential genetic adaptation to the host. Through comparative genomic analysis, we found that the genes encoding PCWDEs, T2SS, T3SS, T6SS, the QS system, the two-component system and LPS are probably major virulence factors, and that the CRISPR/Cas system may be involved in adaptive immunity. Characterization of these functional determinants among the Pectobacterium pathogens will provide novel insights into host-pathogen interactions.

Methods
Bacterial strains and genomic DNA extraction
P. carotovorum subsp. brasiliense SX309 (original number: HG1501090306) was isolated from cucumber fruit showing typical soft rot symptoms in Shanxi Province of China in February 2015 [14]. Its phenotypic and biochemical characteristics and host range were tested and analyzed by Meng et al. [14]. This strain was typically incubated in NB (Nutrient Broth, BD, USA) liquid medium at 28°C with shaking for 48 h. High-quality genomic DNA was extracted from the cultured bacteria using a QIAamp® DNA Mini Kit (Qiagen, Valencia, CA).

Whole-genome sequencing
The complete genome of P. carotovorum subsp. brasiliense SX309 was sequenced at the Beijing Allwegene Technology Corporation on a Pacific Biosciences (PacBio) RS II platform using Single Molecule Real-Time (SMRT) technology. A SMRTbell™ template library with a 20 kb insert size was constructed. The library was then sequenced using C4 sequencing chemistry and P6 polymerase with one SMRT cell, and the reads were filtered by quality and length. The resulting clean reads were assembled de novo with the PacBio SMRT Analysis software [27] (version 2.3.0). The graphical views of genome alignments were generated using CGView software [28].

Phylogenetic analysis
Phylogenetic relationships were determined by multilocus sequence analysis (MLSA) of six housekeeping genes (16S rRNA, gapA, gyrA, atpD, rpoA, and rho) from twenty Pectobacterium spp., seven Dickeya spp., and eight Erwinia spp. GenBank accession numbers associated with the housekeeping loci of all of the strains can be found in Additional file 5: Table S3. The gene sequences were aligned using MUSCLE software and trimmed to remove ambiguously aligned regions. Subsequently, the six housekeeping gene sequences were concatenated in the same order using Sequence-Matrix. The phylogenetic tree was constructed using the maximum likelihood method in MEGA 6.0 software [35], with 1,000 bootstrap replicates, a heuristic search from a random starting tree, and the tree bisection-reconnection branch-swapping algorithm.
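The concatenation step of the MLSA can be illustrated with a minimal Python sketch. It is not the Sequence-Matrix implementation used in the study: the function name concatenate_alignments, the gap-padding of missing loci, and the toy sequences are assumptions made purely for illustration.

```python
from typing import Dict, List

# Fixed locus order, taken from the six housekeeping genes named in the text.
GENE_ORDER: List[str] = ["16S rRNA", "gapA", "gyrA", "atpD", "rpoA", "rho"]


def concatenate_alignments(
    alignments: Dict[str, Dict[str, str]],
    gene_order: List[str] = GENE_ORDER,
) -> Dict[str, str]:
    """Join per-gene aligned sequences into one concatenated row per strain.

    `alignments` maps gene name -> {strain name -> aligned sequence}.
    A strain lacking a gene is padded with gaps (an assumption of this sketch).
    """
    strains = sorted({s for gene in gene_order for s in alignments[gene]})
    rows: Dict[str, List[str]] = {s: [] for s in strains}
    for gene in gene_order:
        per_gene = alignments[gene]
        widths = {len(seq) for seq in per_gene.values()}
        if len(widths) != 1:
            raise ValueError(f"{gene}: aligned sequences differ in length")
        width = widths.pop()
        for s in strains:
            rows[s].append(per_gene.get(s, "-" * width))
    return {s: "".join(parts) for s, parts in rows.items()}


# Toy alignments (made-up sequences, not real Pectobacterium data).
toy = {
    "16S rRNA": {"SX309": "ACGT", "BC1": "ACGA"},
    "gapA": {"SX309": "GG-C", "BC1": "GGTC"},
    "gyrA": {"SX309": "TTA", "BC1": "TTA"},
    "atpD": {"SX309": "CA", "BC1": "CG"},
    "rpoA": {"SX309": "AAT"},  # BC1 locus missing: padded with gaps
    "rho": {"SX309": "GC", "BC1": "GC"},
}
supermatrix = concatenate_alignments(toy)
```

Keeping the locus order fixed for every strain ensures that each column of the concatenated alignment compares homologous positions, which is what the tree-building step requires.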

Extracellular enzyme assays
Plate assays for the activity of Pel, Peh, Cel, and Prt were conducted as described by Chatterjee et al. [17] (1995) with slight modifications. Wells were bored in the agarose medium with a No. 2 cork borer, and the bottoms were sealed with 0.8% (w/v) molten agarose. Bacterial cells were grown in NB liquid medium overnight at 28°C and adjusted to OD600 = 0.8. Samples were applied to the wells, and the plates were incubated for 24 h at 28°C for Pel, Peh, and Cel and for 48 h for Prt. The Pel and Peh plates were developed with 4 N HCl, and the Cel plates were stained with 0.1% (w/v) Congo red solution for 10 min and then washed with 1 M NaCl solution three times. Haloes in the Prt plates became visible without any further treatment. Each treatment was repeated three times, and all of the experiments were repeated three times.

Virulence assays
The virulence and symptom development caused by P. carotovorum subsp. brasiliense SX309 were assessed in cucumber plants (Cucumis sativus) and potato plants (Solanum tuberosum). Cucumber and potato stems were stab-inoculated with 10 μL of a bacterial suspension of the SX309 strain at approximately 1 × 10⁸ CFU/mL. They were then incubated in a moist chamber at 28°C, and the appearance of the symptoms was periodically observed. Sterilized distilled water was used for the negative control inoculations. For each inoculation experiment, three plants were used, and the experiments were repeated three times.

Microscopic analysis
For transmission electron microscope (TEM) observation, bacterial cells were negatively stained using 1% uranyl acetate on collodion-coated 100-mesh grids. The samples were visualized using a Hitachi-7700 transmission electron microscope (Hitachi High-Technologies Corporation, Tokyo). For fluorescence microscope observation, the plasmid pSMC21 containing the gfp gene was used to generate a GFP-tagged P. carotovorum subsp. brasiliense strain [47]. The plasmid was introduced into the bacterial cells by electroporation. The GFP-tagged SX309 strain was then visualized using an Olympus BX51 fluorescence microscope. For scanning electron microscope (SEM) observation, bacterial cells in exponential and stationary phases were fixed using 2.5% glutaraldehyde. Samples were observed using a Hitachi-S3400N scanning electron microscope.

Organism information
P. carotovorum subsp. brasiliense SX309 is a facultatively anaerobic, Gram-negative, non-sporulating bacterium belonging to the Pectobacteriaceae family (Additional file 1: Table S1). Strain SX309 is rod-shaped, with a length of 1.5-2 μm and a diameter of 0.5-0.8 μm. It is motile by means of peritrichous flagella (Additional file 2: Figure S1). Strain SX309 can utilize several carbon sources and grow in 5% NaCl [14]. Pathogenicity tests showed that SX309 is highly virulent in various host plants, including important vegetable crops such as cucumbers and potatoes (Additional file 3: Figure S2). Minimum information about the genome sequence (MIGS) of P. carotovorum subsp. brasiliense SX309 is summarized in Additional file 1: Table S1 [48].
General genomic features of P. carotovorum subsp. brasiliense SX309
A total of 37,555 clean reads with an average length of 12,020 bp and an N50 size of 16,273 bp were generated. Assembly of the clean reads resulted in a single contig with 90.89-fold coverage on average and without any gaps (Additional file 4: Table S2). Thus, the genome of P. carotovorum subsp. brasiliense SX309 is composed of a single circular chromosome that is 4,966,299 bp in size, with no apparent autonomous plasmids (Additional file 5: Table S3 and Fig. 1). The average G+C content of the whole genome is 52.18%, which is similar to that of P. carotovorum subsp. brasiliense BC1 (51.8%), P. carotovorum subsp. brasiliense BZA12 (52.00%), P. carotovorum subsp. carotovorum PCC21 (52.18%) and P. carotovorum subsp. odoriferum BC S7 (51.80%) (Table 1). In total, 4,455 open reading frames (ORFs) have been predicted in the genome of SX309. In addition to 4,252 protein coding genes (CDSs), the chromosome contains 104 RNA genes (76 tRNA genes, 22 rRNA operons, and 6 ncRNAs) and 99 pseudogenes (Additional file 5: Table S3). These annotated genes are transcribed in either the positive or the negative direction relative to the direction of DNA replication (Fig. 1). Using Pfam, SignalP, and the TMHMM database, 3,849 (86.40%), 409 (9.18%), and 19 (0.43%) of the ORFs could be classified into different groups, respectively (Additional file 5: Table S3).
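The read and annotation statistics above follow from simple arithmetic, sketched here in Python. The helper names n50 and mean_coverage are hypothetical and are not part of the SMRT Analysis software. Because the published mean read length is rounded, the recomputed coverage is only expected to approximate the reported value.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Length L such that reads of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no read lengths given")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise AssertionError("unreachable: running total always reaches half")


def mean_coverage(n_reads: int, mean_read_length: float, genome_size: int) -> float:
    """Average sequencing depth = total sequenced bases / genome size."""
    return n_reads * mean_read_length / genome_size


# Figures quoted in the text.
coverage = mean_coverage(37_555, 12_020, 4_966_299)

# Annotation tallies quoted in the text.
rna_genes = 76 + 22 + 6          # tRNA + rRNA + ncRNA
total_orfs = 4_252 + rna_genes + 99  # CDSs + RNA genes + pseudogenes
pfam_percent = round(100 * 3_849 / total_orfs, 2)

# Toy read lengths (made up) to show how N50 behaves.
toy_n50 = n50([2, 3, 4, 5, 6])
```

With the quoted figures, the recomputed depth agrees with the reported 90.89-fold coverage to within rounding, and the RNA gene and ORF tallies add up to 104 and 4,455, respectively.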
Functional categorization of the 4,252 CDSs was analyzed using the Cluster of Orthologous Groups of proteins (COG). The results showed that 3,474 (77.98%) of the predicted genes of SX309 were assigned to the COG categories (Additional file 5: Table S3). Among these assigned genes, 42.47% are related to metabolism, 20.63% to cellular processes and signaling, and 17.13% to information storage and processing. However, 19.77% of the genes cannot be assigned to a functional COG category because their features and functions remain unknown (Table 2). Moreover, the RAST annotation assigned 2,406 genes of strain SX309 to 529 subsystems. Most of the genes are associated with carbohydrates (15.83%), amino acids and derivatives (12.74%), protein metabolism (8.98%), cofactors, vitamins and pigments (7.66%), RNA metabolism (7.13%), membrane transport (5.96%), and stress response (4.29%) (Additional file 6: Figure S3).
Comparison of the P. carotovorum subsp. brasiliense SX309 genome with other completely sequenced Pectobacterium spp.
To understand the relationships of P. carotovorum subsp. brasiliense SX309 with genome-sequenced strains within the Pectobacterium, Dickeya, and Erwinia genera, a phylogenetic tree was constructed based on 16S rRNA and five housekeeping genes (gapA, gyrA, atpD, rpoA, rho) (Additional file 7: Table S4 and Additional file 8: Figure S4). As expected, the twenty Pectobacterium strains, seven Dickeya strains, and eight Erwinia strains were clustered into three major clades. In practice, Pectobacterium spp. are considered broad-host range pathogens, except that P. atrosepticum has been reported almost exclusively from potatoes (Solanum tuberosum) and P. betavasculorum exclusively from sugar beets (Beta vulgaris). P. carotovorum has a broader host range and less restricted survival conditions than P. atrosepticum, P. parmentieri, and P. wasabiae, which are specialized to cause disease in only one or a few host plants [49]. Strain SX309 is able to infect a wide range of plant species [14], which might explain the close relationship between SX309 and other P. carotovorum subspecies. The Dickeya group clearly formed three distinct sub-clades. Strain EC1 was the closest relative of D. zeae Ech586, followed by D. chrysanthemi Ech1591. In contrast, strain IPO2222 was the closest relative of D. dadantii 3937, followed by D. solani ND14b. The Pectobacterium and Dickeya species are close relatives and were formerly classified as Erwinia spp. [6]. The results of our phylogenetic analysis agree with these previous findings. Thus, phylogenetic analysis based on multilocus sequences provided strong support for an accurate classification of the species. Strain SX309 was assigned to the clade of P. carotovorum, which includes BC1 and BZA12 belonging to P. carotovorum subsp. brasiliense.
The average nucleotide identity (ANI) and the genome-to-genome distance calculator, or in silico DDH (isDDH), are two of the most widely accepted bioinformatics tools for calculating whole-genome sequence similarity [3]. In this study, we performed additional calculations of the ANI and DDH values among the representative Pectobacterium strains (Additional file 9: Table S5). The results showed that the ANI and DDH values between strains SX309 and BC1 were approximately 97.43% and 77.70%, respectively. These findings indicated that strains SX309 and BC1 clustered closely and occupy the same taxonomic position. Lower ANI and DDH values were obtained when BC S7, CFBP3304, RNS08.42.1A and SCC3193 were used as reference genomes.
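How ANI and isDDH values translate into a taxonomic call can be sketched as a simple threshold test. The cutoffs used below (ANI of at least 95% and DDH of at least 70%) are commonly cited species boundaries and are an assumption of this sketch, since the text does not state which cutoffs were applied. The function name same_species is hypothetical.

```python
def same_species(
    ani_percent: float,
    ddh_percent: float,
    ani_cutoff: float = 95.0,  # assumed, commonly cited ANI species boundary
    ddh_cutoff: float = 70.0,  # assumed, commonly cited DDH species boundary
) -> bool:
    """Return True when both similarity measures meet their cutoffs."""
    return ani_percent >= ani_cutoff and ddh_percent >= ddh_cutoff


# Values reported in the text for SX309 versus BC1.
sx309_vs_bc1 = same_species(97.43, 77.70)
```

Under these assumed cutoffs the SX309 and BC1 pair passes both tests, which is consistent with the conclusion that the two strains occupy the same taxonomic position.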
To evaluate the evolutionary distance among these sequenced strains within the Pectobacterium genus, the whole genome sequences were compared using Mauve software. At the subspecies level, the genome sequence of strain SX309 was aligned to those of two other P. carotovorum subsp. brasiliense strains (BC1 and BZA12) and its closest fully sequenced relatives, P. carotovorum subsp. carotovorum PCC21 and P. carotovorum subsp. odoriferum BC S7 (Fig. 2a and 2b). This alignment showed that, within subsp. brasiliense, the SX309 genome is much more similar to the BC1 genome than to the BZA12 genome. Among the other subspecies, the SX309 genome is much more similar to the PCC21 genome than to the BC S7 genome, supporting the relationship described above. In comparison to PCC21, there is no significant insertion or deletion of large regions in P. carotovorum subsp. brasiliense SX309, but inversions of large local collinear blocks (LCBs) occurred. Comparison of the whole genome sequences at the species level revealed that the locations of homologous genes differ between SX309 and P. atrosepticum SCRI1043, P. parmentieri SCC3193, P. parmentieri RNS08.42.1A, and P. wasabiae CFBP 3304 (Fig. 2c). Regions with low similarity among the genomes occurred frequently and were distributed randomly. Additionally, the synteny plot of the pairwise alignment supports the previous analysis that strains SX309, PCC21 and BC S7 belong to the same species. Moreover, the SX309 genome is more similar to that of SCRI1043 than to those of the other three Pectobacterium species. However, there were large numbers of changes in the LCBs between SX309 and SCRI1043 during the evolution of the species (Additional file 10: Figure S5).
To identify the specific genes in P. carotovorum subsp. brasiliense SX309, we compared its genome sequence to the complete genome sequences of the eight strains that have been released (Fig. 2). As shown in Fig. 2d, there were 3,480 conserved genes shared by the three strains of P. carotovorum subsp. brasiliense. SX309 shared 202 genes with BZA12 and had 108 genes with counterparts in the BC1 genome. Furthermore, 289 unique genes were present in the genome of SX309, and the functions of most of these unique genes are still unknown. At the subspecies level, the core genome among SX309, PCC21 and BC S7 is composed of 3,018 orthologous genes, which represents approximately 70.98% of all the predicted genes. In addition, 382 unique genes (8.98% of the predicted genes) present in the SX309 genome were not found in the other two genomes within the same species (Fig. 2e). The analysis also revealed that a core genome consisting of 2,995 genes is common to all five species, while P. carotovorum subsp. brasiliense SX309 has 371 unique genes (Fig. 2f).
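The core and unique gene counts reported above are, in essence, set intersections and differences over orthologous gene clusters. The following Python sketch shows the idea on toy data. The function name core_and_unique and the cluster identifiers are hypothetical, and the ortholog-calling step that produces such sets is not shown.

```python
from typing import Dict, Set, Tuple


def core_and_unique(
    gene_sets: Dict[str, Set[str]],
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Return (core clusters shared by all strains, strain-specific clusters)."""
    core = set.intersection(*gene_sets.values())
    unique: Dict[str, Set[str]] = {}
    for strain, genes in gene_sets.items():
        # Union of every other strain's clusters.
        others = set().union(*(g for s, g in gene_sets.items() if s != strain))
        unique[strain] = genes - others
    return core, unique


# Toy ortholog cluster IDs (made up for illustration).
toy_sets = {
    "SX309": {"og1", "og2", "og3", "og4"},
    "BC1": {"og1", "og2", "og5"},
    "PCC21": {"og1", "og3", "og6"},
}
core, unique = core_and_unique(toy_sets)

# Percentages quoted in the text, relative to the 4,252 predicted CDSs.
core_percent = round(100 * 3_018 / 4_252, 2)
unique_percent = round(100 * 382 / 4_252, 2)
```

The two percentages recompute to 70.98 and 8.98, which indicates that the figures in the text are expressed relative to the 4,252 protein coding genes of SX309.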

Plant cell wall-degrading enzymes
Extracellular enzyme assays showed that strain SX309 can produce pectate lyase (Pel), polygalacturonase (Peh), cellulase (Cel), and protease (Prt) (Additional file 11: Figure S6). Genome sequencing revealed the presence of the genes for the synthesis and secretion of plant cell wall-degrading enzymes in strain SX309. A total of 59 known or putatively related genes encoding pectinases, cellulases and proteinases were identified in the SX309 genome. Briefly, the genome of SX309 contains 20 genes encoding pectin degradation enzymes, including pelN, pelI, pelA, pelY, pelC, pelB, pelZ, pelW, and pelX for pectate lyases, pnl for a pectin lyase, pemA and pemB for pectinesterases, paeX and paeY for pectin acetylesterases, pehX, pehN, pehA, and pehK for polygalacturonases, ogl for an oligogalacturonide lyase, and rhiE for a rhamnogalacturonate lyase (Additional file 12: Table S6). These pectin degradation genes were highly conserved in various Pectobacterium species, except that pehK was absent in P. parmentieri SCC3193 and P. parmentieri RNS08.42.1A. Therefore, the production of PCWDEs may be a hallmark of infection for Pectobacterium spp.
Moreover, 23 genes encoding proteases were detected in the SX309 genome (Additional file 12: Table S6). Among them, six protease-encoding genes, including prt1, prtC, prtW, degP, degQ, and glpG, encode serralysin homologs that share more than 90% similarity at the amino acid level. Four ATP-dependent Clp protease-encoding genes (clpS, clpA, clpX, and clpP) were identified, and a Lon protease-encoding gene, lon, was also found in SX309.

Secretion systems
The genome of SX309 contains a wide variety of secretion systems, which are closely related to bacterial pathogenicity (Additional file 13: Table S7). According to the comparative analysis, the P. carotovorum subsp. brasiliense SX309 chromosome contains a highly conserved T2SS gene cluster (gspCDEFGHIJKLMN and outOSB) (Fig. 3), covering 17.669 kb with 15 ORFs. The gsp gene cluster shares an average of 90% similarity with that of various Pectobacterium species at the amino acid level (Additional file 13: Table S7), except that gspC is absent in P. carotovorum subsp. odoriferum BC S7, and gspN is absent in P. parmentieri SCC3193 and P. parmentieri RNS08.42.1A. The outOSB genes are also highly conserved among Pectobacterium spp., except that the outO gene is replaced by BCS7_14675, which encodes a hypothetical protein, in strain BC S7. Among the six Pectobacterium spp., a common characteristic of the T2SS is that it contains pel and pehK genes upstream of gspC, except that the pel gene is absent in strains SCC3193 and RNS08.42.1A (Fig. 3). The genes involved in the secretion-signal recognition particle (Sec-SRP) system are highly conserved in all six Pectobacterium spp., except secA and secE, which are absent in strain BC S7.
Many plant pathogenic bacteria inject multiple effector proteins into plant cells via the type III secretion system for successful infection. A large hrp/hrc gene cluster of 33 genes was identified in the genome of P. carotovorum subsp. brasiliense SX309. SX309 shares high similarity in the hrp/hrc gene cluster sequences with the other Pectobacterium species. However, there are certain variations. For example, the hrp/hrc gene cluster is absent in P. parmentieri SCC3193 and P. parmentieri RNS08.42.1A but present in the other three strains, P. carotovorum subsp. carotovorum PCC21, P. carotovorum subsp. odoriferum BC S7, and P. atrosepticum SCRI1043 (Additional file 13: Table S7). In addition, the dspE and dspF genes encoding the AvrE-family T3SS effectors are also conserved among Pectobacterium spp., except that dspE is absent in P. parmentieri SCC3193 and P. parmentieri RNS08.42.1A, and dspF is absent in strains BC S7, SCC3193, and RNS08.42.1A. Given that most key hrp/hrc genes are highly conserved in strain SX309, it is quite possible that the T3SS in SX309 plays a role in bacterial pathogenicity, which awaits further investigation.
The type VI secretion system (T6SS) is widely present in many Gram-negative bacteria and delivers toxic effector proteins into adjacent bacterial or host cells. In this study, the T6SS gene cluster of P. carotovorum subsp. brasiliense SX309 was found to have 33 genes, among which 15 were identified as core genes (Fig. 4). The 15 core T6SS genes are highly conserved in various Pectobacterium species and subspecies. Biological functions have been assigned for the outer membrane lipoprotein (VasD), inner membrane proteins (ImpL and ImpK), ATPase (ClpV), and regulatory or structural proteins (ImpB, ImpC, TssE, ImpG, ImpH, ImpI, ImpJ, VasH, VasI, VasJ, and VasL) [50] (Additional file 14: Table S8). In addition to the 15 core T6SS genes, there are five vgrG and 13 hcp genes in the SX309 genome that encode extracellular structural components of the secretion machine and specific effectors. Nevertheless, the copy numbers of the vgrG and hcp genes varied substantially among different Pectobacterium species and subspecies (Additional file 14: Table S8).

Quorum-sensing systems
Quorum sensing (QS) is a cell-population density-dependent regulatory mechanism in which gene expression is coupled to the accumulation of chemical signaling molecules known as autoinducers (AI) [51]. In P. carotovorum, two QS systems exist that are distinguished by the nature of the chemical signals involved: the N-acyl homoserine lactone (AHL)-dependent and the autoinducer-2 (AI-2)-dependent signaling systems [52]. In this study, a positive reaction was observed with the AHL biosensor Agrobacterium tumefaciens NTL/pZLR4 (Additional file 15: Figure S7A), suggesting that SX309 could produce AHL signals. A BLAST search of the SX309 genome revealed only one copy of carI (B5S52_21425) and a conserved luxR homolog (B5S52_21420) designated as expR (Additional file 16: Table S9). The proteins encoded by carI/expR each share more than 90% sequence identity at the amino acid level with the AHL biosynthetic and receptor proteins ExpI/ExpR of Pectobacterium spp., except that they have low similarity with those of P. parmentieri SCC3193. In addition, P. carotovorum subsp. brasiliense SX309 has a functional luxS gene (B5S52_05735) and can produce an AI-2 signal (Additional file 15: Figure S7B). A BLAST search for AI-2 receptors showed that SX309 contains one copy of rbsB (B5S52_21960), which encodes the D-ribose ABC transporter substrate-binding protein. In the SX309 genome, two pairs of QS genes, qseB/qseC and gacS/gacA, were identified, and they are highly conserved in SX309 and the other five Pectobacterium strains.

Two-component system
The genome of P. carotovorum subsp. brasiliense SX309 contains 19 two-component systems (TCSs) (Additional file 17: Table S10). Based on [53], the 19 TCSs were grouped into five previously described subfamilies. There are nine HK/RR TCSs of the OmpR subfamily, five TCSs of the NarL subfamily, two TCSs of the CitB subfamily, two TCSs of the NtrC subfamily, and one belonging to the chemotaxis subfamily.

Flagellar and chemotaxis genes
Two sets of genes encoding flagella biosynthetic and chemotactic proteins were found in the genome of P. carotovorum subsp. brasiliense SX309 (Additional file 18: Figure S8). The set for flagella biosynthesis is tightly clustered (B5S52_08335-B5S52_08505) and encodes 39 proteins (FlhDC, FlhBAE, FlgN~K, FliR~C, FliA, and FliZ) with high protein similarity among Pectobacterium spp., except for the genes fliC and fliD, which showed low similarity with those in P. parmentieri SCC3193 and P. parmentieri RNS08.42.1A (Additional file 19: Table S11). These results suggested that the entire flagellar biosynthetic region was probably acquired as a genomic island through horizontal gene transfer in the Pectobacterium genus. The other set of genes, for chemotaxis-related proteins, is split into different clusters in the chromosome of P. carotovorum subsp. brasiliense SX309 (B5S52_08290-B5S52_08330, B5S52_00035, B5S52_06425, B5S52_08270, B5S52_10545, B5S52_12895, B5S52_14295, and B5S52_21005) (Additional file 19: Table S11). BLAST results showed that these chemotactic proteins and chemotaxis family TCSs are highly conserved (average 90% protein similarity) within the Pectobacterium genus.
Fig. 4 Genetic organization of the T6SS major structural gene cluster in Pectobacterium spp. Colored ORFs indicate genes with known function, and the same color represents the same or similar biological function. Genes encoding uncharacterized proteins are indicated by gray ORFs

Lipopolysaccharide
The genes involved in the biosynthesis of LPS in SX309 were identified and found to be clustered (Additional file 20: Table S12). Specifically, all nine genes (lpxACDHBKLM and waaA) and four genes (waaCEFQ) required for the biosynthesis of the core-lipid A complex [55] are present in the SX309 chromosome. In addition, the four genes involved in the assembly and transport of LPS in Gram-negative bacteria (lapB and lptAFG) are also present in the SX309 genome. Furthermore, the gene rfbC, which encodes an O-antigen synthetic protein, was also identified. Two gene clusters (kdsABCD and gmhABCD) were also found to be highly conserved among Pectobacterium spp.
Based on the sequences of the CRISPR spacers, the putative CRISPR targets were also analyzed in six Pectobacterium strains using Viroblast or BLAST plasmid searches. The targeted sequences contained diverse phages, including those of Pectobacterium, Erwinia, and Ralstonia, additional bacterial phages, and various types of plasmids (Additional file 22: Table S14). In the SX309 genome, four CRISPR repeats were identified. Specifically, the CRISPR repeat sequence (CGGTTTATCCCCGCTGGCGCGGGGAACAC) conserved in P. carotovorum subsp. brasiliense SX309 and P. parmentieri RNS08.42.1A contained the highest number of spacers. There were 35 spacers in P. carotovorum subsp. brasiliense SX309 and 31 spacers in P. parmentieri RNS08.42.1A. Three of the 35 spacers in SX309 targeted several types of phages, including Erwinia phage ENT90 and Pectobacterium phage ZF40, but did not target plasmids (Additional file 22: Table S14). Four spacers among the 31 in RNS08.42.1A targeted different types of bacteriophages, including Pectobacterium phage phiTE, Pectobacterium phage ZF40, Erwinia phage vB_EamM_ChrisDB, and Erwinia phage phiEa2809, and did not target bacterial plasmids.
Fig. 5 Diagram of the clustered regularly interspaced short palindromic repeats (CRISPR) with CRISPR associated proteins (Cas) system in Pectobacterium species. Blue indicates the subtype I-F CRISPR-associated protein, orange indicates the subtype I-E CRISPR-associated protein, yellow represents CRISPR repeats
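The repeat-spacer layout behind these spacer counts can be illustrated with a short Python sketch that splits a CRISPR array on the repeat sequence quoted in the text. The function name extract_spacers, the flanking bases, and both toy spacers are hypothetical. The sketch requires exact repeat matches, whereas real arrays can carry degenerate repeats that a dedicated CRISPR-detection tool would handle.

```python
from typing import List

# Repeat sequence quoted in the text for SX309 and RNS08.42.1A (29 nt).
REPEAT = "CGGTTTATCCCCGCTGGCGCGGGGAACAC"


def extract_spacers(array: str, repeat: str = REPEAT) -> List[str]:
    """Return the sequences lying between consecutive exact repeat copies.

    The text before the first repeat and after the last repeat is flanking
    sequence, not a spacer, so it is dropped.
    """
    pieces = array.upper().split(repeat)
    return [piece for piece in pieces[1:-1] if piece]


# Toy array with made-up spacers and flanks (not real SX309 sequence).
SPACER_1 = "ATGC" * 8
SPACER_2 = "TTGACCGTAAGCTTAGGCATCGATCGTACGTA"
toy_array = "AAAT" + REPEAT + SPACER_1 + REPEAT + SPACER_2 + REPEAT + "GGC"

spacers = extract_spacers(toy_array)
```

Each extracted spacer could then be used as a query against phage and plasmid sequence collections, which is the kind of search described above for identifying putative CRISPR targets.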

Discussion
Pectobacterium spp. are considered to be broad-host range pathogens, except that P. atrosepticum and P. parmentieri have been reported almost exclusively from potatoes (Solanum tuberosum) and P. betavasculorum almost exclusively from sugar beets (Beta vulgaris). The taxonomic position of many strains in the Pectobacterium genus has been revised in recent years [2,9,10]. For example, Pcc PC1 was reclassified as P. aroidearum, and P. peruviense, P. polaris and Candidatus P. maceratum were separated from P. carotovorum. In this study, Pcb SX309 was assigned to the clade of P. carotovorum subsp. brasiliense with the other reported Pcb strains (BC1 and BZA12) based on the phylogenetic analysis. This is consistent with the findings of Meng et al. [14]. Moreover, the ANI and DDH values supported the taxonomic position of Pcb SX309. The wide host range of Pcb SX309 is also consistent with an important feature of P. carotovorum [3,14].
PCWDEs including pectinases, cellulases and proteinases are key virulence factors of many important plant bacterial pathogens that cause soft rot disease [1]. Alignment analysis revealed that the genes related to the production of PCWDEs are present in various Pectobacterium species and are highly conserved. A previous study showed that Pel and other pectinases including Peh, pectin lyase (Pnl), pectinesterase, and pectin acetylesterase play a major role in the virulence of and tissue maceration by P. wasabiae [56]. In our study, Pel, Peh, Cel and Prt were detected in SX309. However, the functions of these enzymes in the pathogenicity of Pcb SX309 still need to be determined.
Bacteria have evolved several sophisticated secretion systems that export a wide range of extracellular enzymes and effector proteins. In Gram-negative bacteria, these secretion systems can range from simple transporters to multi-component complexes and have been classified into six types, including type I through type VI secretion systems [57].
Many Gram-negative bacteria use the ubiquitous type II secretion system (T2SS) to translocate extracellular proteins from the periplasm across the outer membrane [58]. The T2SS is well conserved and is primarily composed of common secretion and Sec proteins, which are encoded by 12-15 general secretory pathway (Gsp) genes (GspA to GspO and GspS) that are essential for the bacteria [59]. Previous studies have revealed that pectinases and cellulases are secreted by the T2SS in Pectobacterium, and that its inactivation leads to reduced pathogenicity [60]. T2SSs were also found in D. dadantii, the causal agent of bacterial stem and root rot of sweet potato, and in P. carotovorum (formerly called Erwinia carotovora), which is responsible for soft rot disease in potato and other crops [61]. Moreover, the GspD-GspC T2SS played an important role in D. dadantii [62]. The role of the T2SS in Pcb SX309 remains to be determined.
T3SSs are used by many Gram-negative pathogenic bacteria to deliver virulence proteins (known as effectors) into host cells. Once inside host cells, the effectors manipulate host defenses and promote bacterial growth [63]. Unlike in many other plant bacterial pathogens, the T3SS in P. carotovorum subsp. carotovorum appears to secrete only one effector protein, DspE [64]. Therefore, Pectobacterium does not seem to require the T3SS for pathogenicity [21]. Nevertheless, the T3SS contributes to P. carotovorum growth in the leaves of Arabidopsis thaliana at the early stages of infection [65] and to the virulence of P. atrosepticum on Solanum tuberosum [66]. Whether the virulence of Pcb depends partly on the T3SS during infection of the host plant remains to be determined.
A promiscuous secretion system that possibly participates in bacterial pathogenicity is the recently identified type VI secretion system (T6SS), which is found in diverse Gram-negative bacteria [67]. T6SS gene clusters consist of 13 core genes that are hypothesized to be minimally necessary for function, together with additional conserved genes whose composition varies between species [68]. The vgrG gene (encoding valine/glycine-repeat protein G) contributes to virulence in Acinetobacter baumannii ATCC 19606 [69]. In Acidovorax avenae subsp. avenae strain RS-2, disruption of the genes pppA, clpB, icmF, impJ and impM reduced biofilm formation, and mutation of pppA, clpB, icmF and hcp reduced motility. The vital role of the T6SS in the virulence of strain RS-2 may be partially attributed to its effects on Hcp secretion, biofilm formation and motility. In the Pectobacterium genus, no generalizable conclusion has yet been reached about the biological functions of the T6SS. In Pcc S1, impG strongly influences virulence and the hypersensitive response [70]. It was demonstrated that the PCWDE genes (pelA and prtF) and the T6SS genes (vgrG and hcp3) had the same QS-regulated expression profiles. In P. atrosepticum SCRI1043, the hcp and vgrG genes are induced in response to potato extracts. However, a single-gene mutant impaired in Hcp secretion was reported to be more virulent than the wild-type pathogen in potato tubers [71]. A mutant carrying deletions of both machinery-encoding clusters, which span 16 (W5S_0962-W5S_0978) and 23 (W5S_2418-W5S_2441) genes and include the two putative T6SS-encoding loci, was only modestly affected in virulence in the potato tuber slice assay [56]. To date, T6SSs in many bacteria appear to be involved in pathogenic or symbiotic interactions with their hosts. However, more work is needed to define the function of this intriguing system in Pcb.
Quorum sensing (QS) is a special type of regulation of bacterial gene expression that is usually active at high bacterial population densities. QS systems are widespread among the plant soft-rotting bacteria [7]. Previous research showed that Pectobacterium spp. produce two AHL-family quorum sensing signals, N-3-oxooctanoyl-L-homoserine lactone (3-oxo-C8-AHL) and N-3-oxohexanoyl-L-homoserine lactone (3-oxo-C6-AHL), which are synthesized by the product of the luxI homolog expI [72,73]. The AHL signal is detected by ExpR, which belongs to the LuxR family of proteins, and is transduced into cellular responses. Inactivation of expI resulted in decreased production of PCWDEs and decreased virulence [19]. The second QS system, based on the production of AI-2 signal molecules and controlled by the S-ribosylhomocysteine lyase LuxS, exists in a wide variety of both Gram-negative and Gram-positive bacteria and is involved in bacterial interspecies communication [74]. The LuxS/AI-2 type QS plays a strain-dependent role in the virulence of different Pectobacterium strains. A luxS homolog from Pectobacterium was first reported in a derivative of P. carotovorum subsp. carotovorum ATTn10 and in P. atrosepticum SCRI1043 [75]. A previous study revealed a correlation between the AI-2 level and the production of pectinolytic enzymes. However, Pectobacterium lacks orthologs of both known AI-2 receptors, the LuxPQ receptor and the Lsr ABC transporter [76]. We hypothesize that RbsB is an alternative AI-2 receptor in Pectobacterium strains. However, the function of the rbsB gene still needs to be validated.
Interestingly, a new kind of autoinducer (AI-3) was discovered in enterohemorrhagic Escherichia coli (EHEC). AI-3 is perceived by the sensor kinase QseC and its cognate response regulator QseB [77]. We found that both qseC and qseB are present in Pcb SX309. Overall, the biological significance of the various QS systems, especially the LuxS/AI-2 QS system, in SX309 and other Pectobacterium species remains to be studied further. Previous studies show that the expression of the rsmA/rsmB genes involved in the regulation of PCWDE biosynthesis is also dependent upon the global regulatory GacA/GacS system [18].
To survive, colonize and cause disease, plant-pathogenic bacteria often modulate the expression of their genes using two-component signal transduction systems (TCSs). These systems typically consist of a sensor histidine kinase (HK) and a response regulator (RR) that perform a His-Asp phosphotransfer [78]. It has been reported that virulence, resistance to magainin II, and the expression of pectate lyase in D. chrysanthemi 3937 are mediated by the response of the PhoP-PhoQ TCS to pH and magnesium [79]. Additionally, the GacS/GacA two-component regulators are involved in the global control of virulence in P. carotovorum subsp. carotovorum [80]. However, the functions of these TCSs still need to be addressed.
Bacterial flagella are complex organelles of very early origin that provide swimming and swarming motilities and play a central role in adhesion, biofilm formation, and host invasion [81]. Flagellar proteins are normally responsible for cell motility and intracellular trafficking, secretion and vesicular transport, while the chemotactic proteins are involved in cell motility and signal transduction [82]. In D. dadantii 3937, mutation of fliA, which encodes a sigma factor, eliminated bacterial motility and significantly reduced Pel production and bacterial attachment to plant tissues [82]. Similarly, inactivation of the flgA, fliA, and flhB genes abolished bacterial motility and significantly reduced bacterial virulence in P. carotovorum subsp. carotovorum PCC21 [83]. We have observed that Pectobacterium cells are motile in diseased plant tissues (unpublished data), but whether the production of PCWDEs and of the secretion systems that contribute to virulence is coordinated with motility is still unclear. Thus, the functions of flagellar and chemotactic genes in Pectobacterium pathogenicity, especially in pathogen-host plant interactions, remain to be explored.
LPSs have been shown to play complex roles that differ depending on their origin and the challenged plant. Previous research reported that different defense response patterns could be induced by the LPSs of P. atrosepticum and Pseudomonas corrugata in three Solanaceae species, namely tobacco, tomato, and potato [84]. Additionally, different signaling pathways could also be activated by LPSs in Arabidopsis thaliana cells [85]. A previous study showed that LPSs are crucial for the optimal growth, survival and virulence of P. atrosepticum [86], but the roles of LPSs in the SX309 strain remain to be determined.
The CRISPR-Cas system mediates immunity to invading genetic elements such as bacteriophages, viruses and plasmids [87]. Based on the presence of the Cas3, Cas9, and Cas10 proteins, CRISPR-Cas systems have been classified into three major types, type I, II, and III, respectively. The major types comprise further subtypes (e.g., I-A to I-F), each characterized by a specific set of proteins [90]. Cas1 is the protein hallmark of CRISPR-mediated immunity, and Cas1 and Cas2 were found in all CRISPR-containing organisms [23].
The key factors of the CRISPR-mediated immunity system are small CRISPR RNAs that guide nucleases to complementary target nucleic acids of invading genetic material, which is generally followed by the degradation of the invader [88,89]. Previous studies revealed that the P. atrosepticum SCRI1043 CRISPR-Cas system contains six proteins, including Cas1, Cas3, and the four subtype I-F specific proteins Csy1, Csy2, Csy3, and Csy4, and three CRISPR repeats [26]. In P. atrosepticum, the Csy4 protein was identified as being responsible for processing the CRISPR RNAs into crRNAs and appears to interact with itself in the absence of other Cas proteins [90]. In our study, we found that Pcc, Pco, and Pcb all harbor two subtypes of the CRISPR-Cas system (type I-E and type I-F). In Escherichia coli, primed adaptation by the type I-E CRISPR-Cas system occurs after the Cascade-crRNA complex interacts with a fully matching protospacer that is subject to interference [25]. However, there are relatively few reports concerning the CRISPR-Cas system in Pectobacterium species.
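The signature-protein scheme described above (Cas3, Cas9 and Cas10 marking types I, II and III) can be written as a simple lookup. The sketch below is only an illustration: the function name major_crispr_types is hypothetical, and subtype assignment (e.g., I-E versus I-F) would additionally require the subtype-specific proteins, which are not modeled here.

```python
# Signature proteins named in the text and the major type each one marks.
SIGNATURE_TO_TYPE = {"Cas3": "I", "Cas9": "II", "Cas10": "III"}


def major_crispr_types(cas_proteins):
    """Return the sorted major CRISPR-Cas types implied by a list of Cas protein names.

    Hypothetical helper: proteins without a signature role (e.g., Cas1, Csy1)
    are ignored, so an empty list means no major type could be assigned.
    """
    return sorted(
        {SIGNATURE_TO_TYPE[name] for name in cas_proteins if name in SIGNATURE_TO_TYPE}
    )


# Protein set reported in the text for P. atrosepticum SCRI1043 [26].
scri1043 = ["Cas1", "Cas3", "Csy1", "Csy2", "Csy3", "Csy4"]
print(major_crispr_types(scri1043))  # ['I']
```

For the SCRI1043 protein set, the presence of Cas3 places the system in type I, in agreement with its description as subtype I-F.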
T6SSs can be deployed as versatile weapons to compete with other bacterial cells or to attack simple or higher eukaryotic cells, and they likely play an important role in mediating pathogenic or symbiotic relationships between bacteria and eukaryotes in various environmental niches [68,91,92,93,94]. Antibacterial effector toxins secreted by T6SSs contribute to these antibacterial functions and can be neutralized by corresponding antagonistic immunity proteins to prevent self-killing or sibling intoxication. In Vibrio cholerae, VgrG-3 was found to degrade peptidoglycan and hydrolyse the cell wall of Gram-negative bacteria, and TsaB (type six secretion antitoxin B) was identified as the immunity protein.
In Dickeya dadantii 3937, Rhs, which is linked with the VgrG component of the T6SS, played an important role in intercellular competition [94]. The functions of T6SSs should be determined in future research.