Comparative genomics to examine the endophytic potential of Pantoea agglomerans DAPP-PG 734

Pantoea agglomerans DAPP-PG 734 was isolated as an endophyte from knots (tumors) caused by Pseudomonas savastanoi pv. savastanoi DAPP-PG 722 in olive trees. To understand the plant pathogen-endophyte interaction on a genomic level, the whole genome of P. agglomerans DAPP-PG 734 was sequenced and annotated. The complete genome had a total size of 5,396,424 bp and comprised one circular chromosome and four large circular plasmids. The aim of this study was to identify genomic features that could play a role in the interaction between P. agglomerans DAPP-PG 734 and P. savastanoi pv. savastanoi DAPP-PG 722. For this purpose, a comparative genomic analysis between the genome of P. agglomerans DAPP-PG 734 and those of related Pantoea spp. was carried out. In P. agglomerans DAPP-PG 734, gene clusters for the synthesis of the Hrp-1 type III secretion system (T3SS), type VI secretion systems (T6SS) and autoinducer were identified, which could play an important role in a plant-pathogenic community enhancing knot formation in olive trees. Additional gene clusters for the biosynthesis of two different antibiotics, namely dapdiamide E and antibiotic B025670, were observed in regions between integrative conjugative elements (ICE). The in-depth analysis of the whole genome suggested that the P. agglomerans DAPP-PG 734 isolate is better characterized as an endophytic bacterium with biocontrol activity than as a plant pathogen. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08966-y.

Plasmids 1 and 3 of P. agglomerans DAPP-PG 734 carry several gene clusters for metabolism and the biosynthesis of secondary metabolites, most of which were also observed in P. vagans C9-1 on pPag1 and pPag3 [1, 2] (Figure S2, Figure S3). However, the plasmids of DAPP-PG 734 also show some variation compared to the plasmids of C9-1.
Plasmid 1 includes a complete gene cluster (malGFEKLMQPT) for maltose metabolism (DAPPPG734_22555 - DAPPPG734_22605). The maltose/maltodextrin system is regulated by the activator MalT and contains eight further regulated genes, which are involved in the transport and metabolism of maltose and maltodextrins. Six genes of this cluster encode the components of a high-affinity, binding-protein-dependent ABC transport system: maltoporin, MalM, MalE, MalK and the subunits MalF and MalG [3]. Among the related strains (Figure 3, main document), only two lacked the malGFEKLMQPT gene cluster. Pantoea eucalypti NFPP29 does not contain any plasmid related to plasmid 1 in P. agglomerans DAPP-PG 734 or pPag3 in P. vagans C9-1 [2]. P. vagans FDAARGOS_160, on the other hand, contains a plasmid related to plasmid 1 in DAPP-PG 734, but it lacks all mal genes.
Other clusters for carbohydrate metabolism, including those for arabinogalactan (ganKEFGABCLR) [4] (DAPPPG734_23085 - DAPPPG734_23130) and fructoselysine (frlABDR) (DAPPPG734_21510 - DAPPPG734_21525), are located on plasmid 1, as observed in a large group of Pantoea strains that carry a large universal Pantoea plasmid (LPP-1) [5]. The frlABDR gene cluster consists of four genes encoding the fructoselysine permease FrlA, the fructoselysine-6-phosphate deglycase FrlB, the fructoselysine-6-phosphatase FrlD and the regulator of the fructoselysine operon FrlR (Figure S4). The presence of the frlABDR gene cluster is highly variable among the compared strains (Figure 3, main document), which indicates that frlABDR is a variable trait within the species [5]. Further analyses are therefore necessary to determine the role of this gene cluster within the species.
Consistent with the production of the typical yellow pigment [6], P. agglomerans DAPP-PG 734 contains six genes (crtEXYIBZ) for carotenoid biosynthesis, which are located on plasmid 1 (DAPPPG734_22760 - DAPPPG734_22785), as observed in P. vagans C9-1 on pPag3 [2]. Carotenoids can play an important role in protection against photooxidative damage and environmental stress [7]. The crtEXYIBZ gene cluster was identified in almost all other strains (Figure 3, main document), except P. eucalypti NFPP29, which does not have a plasmid related to plasmid 1 in P. agglomerans DAPP-PG 734.
Furthermore, the gene cluster for thiamine biosynthesis is also present in DAPP-PG 734 on plasmid 1 (DAPPPG734_22825 - DAPPPG734_22840) and consists of four genes (thiOSGF), as discovered in P. vagans C9-1 on pPag3 [2]. Recent results showed that thiamine biosynthesis enhances the biosynthesis of exopolysaccharides (EPS) in Erwinia amylovora, which causes the necrotrophic disease fire blight of apple, pear and other rosaceous plants [8]. Among the compared strains, the thiOSGF gene cluster was absent only in Pantoea sp. PMG_056 and P. eucalypti NFPP29 (Figure 3, main document).
A gene cluster for the reduction of the heavy metal arsenate is also found on plasmid 1 in DAPP-PG 734 (DAPPPG734_22475 - DAPPPG734_22490). In addition, some P. agglomerans strains contain an arsH gene encoding a putative flavoprotein [9]. The presence of the arsCBRH gene cluster is likewise highly variable among the compared strains (Figure 3, main document), indicating that it is a variable trait within the species [9]. It is important to mention that most strains do not include the arsH gene and were therefore marked as partially containing the gene cluster. Additional in vitro research will be necessary to determine the diversification and presence of heavy metal reduction within the species. A further gene cluster for an inner membrane iron and manganese transporter (sitABCD), as observed on LPP-1 [5], is also located on plasmid 1.
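The complete/partial/absent marking described above for the arsCBRH cluster can be sketched as a simple set comparison. This is a minimal illustration, not the pipeline used in the study: the function name `classify_cluster` and the strain inventories are hypothetical, and the example gene lists are invented for demonstration rather than taken from Figure 3.

```python
from typing import Dict, Iterable, List


def classify_cluster(cluster_genes: Iterable[str], strain_genes: Iterable[str]) -> str:
    """Return 'complete', 'partial' or 'absent' for one gene cluster in one strain."""
    required = set(cluster_genes)
    if not required:
        raise ValueError("cluster_genes must not be empty")
    found = required & set(strain_genes)
    if found == required:
        return "complete"  # every gene of the cluster is annotated in the strain
    if found:
        return "partial"   # e.g. arsCBR present but arsH missing
    return "absent"        # none of the cluster genes is annotated


# Genes of the arsenate reduction cluster named in the text.
ARS_CLUSTER: List[str] = ["arsC", "arsB", "arsR", "arsH"]

# Hypothetical gene inventories, for illustration only (not data from the study).
example_strains: Dict[str, List[str]] = {
    "strain_A": ["arsC", "arsB", "arsR", "arsH", "sitA"],
    "strain_B": ["arsC", "arsB", "arsR"],
    "strain_C": ["sitA", "sitB"],
}

calls = {
    name: classify_cluster(ARS_CLUSTER, genes)
    for name, genes in example_strains.items()
}

if __name__ == "__main__":
    for name, call in calls.items():
        print(f"{name}\t{call}")
```

A comparison by gene name alone assumes consistent annotation across genomes; in practice, homology-based matching would likely be needed before such a call is made.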
Based on the genomic analysis, plasmid 3 carries only one complete gene cluster, which is for sucrose metabolism (DAPPPG734_25120 - DAPPPG734_25140) [1, 10]. Sucrose is the most common disaccharide energy source for phytopathogenic bacteria and consists of a glucose unit linked to a fructose unit through a glycosidic linkage [11]. Four structural genes (scrABKY) are responsible for the transport and utilization of sucrose. They encode an ATP-dependent fructokinase, a sucrose-specific porin of the outer membrane, an enzyme of the phosphoenolpyruvate-dependent phosphotransferase system (PTS), and a beta-fructofuranosidase fructohydrolase that cleaves sucrose 6-phosphate into alpha-glucose 6-phosphate and beta-fructose. The sucrose metabolic system is regulated by a sucrose operon repressor (ScrR) and is induced in a sucrose-specific manner [12]. Among the compared strains, the scrABKY gene cluster was absent in only eight other strains (Figure 3, main document). These strains do not contain a plasmid related to plasmid 3 in P. agglomerans DAPP-PG 734 or pPag1 in P. vagans C9-1 [2].
Genes for sorbitol metabolism, as observed in P. vagans C9-1 on pPag2 [1], were absent in DAPP-PG 734. Additionally, P. agglomerans DAPP-PG 734 contains a gene cluster (narIJHGK) for the reduction of nitrate, which can act as a terminal respiratory electron acceptor, to nitrite [13]. This gene cluster is located on the chromosome (DAPPPG734_11365 - DAPPPG734_11390), directly adjacent to the T6SS-2. A gene encoding flavorubredoxin is also located in this nitrate reductase metabolism cluster. Flavorubredoxin acts as a reductase partner of the anaerobic nitric oxide reductase, by which nitric oxide is detoxified using NADH and flavorubredoxin [14]. Among the related strains, only ten other strains lack all nar genes (Figure 3, main document).
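For reference, the nitrate-to-nitrite step mentioned above corresponds to the standard two-electron half-reaction. It is given here as textbook chemistry for illustration only and is not a result of this study:

```latex
\[
\mathrm{NO_3^-} + 2\,\mathrm{H^+} + 2\,e^- \;\longrightarrow\; \mathrm{NO_2^-} + \mathrm{H_2O}
\]
```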
Table S1: Antibiotic resistance gene profile in the genome of Pantoea agglomerans DAPP-PG 734 as predicted using CARD [15].
Figure S1: Genomic islands in Pantoea agglomerans DAPP-PG 734. This figure shows the circular plot of the genome of P. agglomerans DAPP-PG 734 and the predicted genomic islands, which are colored according to the prediction method [16]. Orange indicates genomic islands predicted by SIGI-HMM, blue indicates genomic islands predicted by the IslandPath-DIMOB method and red indicates genomic islands predicted by an integrated analysis.
Figure S4: Gene cluster for degradation of fructoselysine in five strains of Pantoea agglomerans. Identical gene clusters within the genomes are shaded in grey and homologous genes are marked in the same color. The frlABDR gene cluster is colored in red and the additional subunit is colored in orange. White arrows represent genes with no similarity to genes of other strains.
Figure S5: Gene cluster for biosynthesis of dapdiamide E in five Pantoea spp. The gene cluster for biosynthesis of dapdiamide E is shaded in grey, while the required genes are colored in green. Conserved homologous genes that are not part of the dapdiamide E cluster are colored in yellow or in blue. White arrows represent genes with no similarity to other genes. Pseudogenes are marked as dashed colored arrows.
Figure S6: Gene cluster for biosynthesis of antibiotic B025670 in six Pantoea spp. The gene cluster for biosynthesis of antibiotic B025670 is shaded in grey, while the required genes are colored in violet. Homologous genes are marked in the same color. White arrows represent genes with no similarity to other genes. A red asterisk represents a contig break within the sequenced genome.
Figure S7: Gene cluster for type VI secretion system 1 (T6SS-1) in five Pantoea spp. The grey shading shows the conserved regions. Green arrows represent genes encoding domains identified by Boyer et al. [17]; yellow arrows represent genes that were not described by Boyer et al. [17] but are conserved among the Pantoea T6SS-1 loci [5]. Grey arrows show the flanking sites of each strain. Red arrows represent the vgrG and hcp genes. Blue and light salmon arrows represent homologous genes that are not part of the conserved region, while white arrows represent genes that do not belong to the conserved genes and are not similar to other genes. Dashed colored arrows represent pseudogenes.
Figure S8: Gene cluster for type VI secretion system 6 (T6SS-6) in five Pantoea agglomerans strains. This figure shows the genes involved in the biosynthesis of T6SS-6. The grey shading shows the identical cluster within the genomes. Green arrows represent identical genes and grey arrows show the flanking sites of each strain. Red arrows identify the vgrG and hcp effector genes. Dashed colored arrows represent pseudogenes.
Figure S9: Gene cluster for biosynthesis of enterobactin in four Pantoea spp. The ent-fep gene cluster is shaded in grey, while the responsible genes are colored in green. Yellow arrows represent homologous genes that are not conserved. The flanking sequences are indicated by grey arrows.
Figure S11: Gene cluster for biosynthesis of exopolysaccharide (EPS) in four Pantoea agglomerans strains. Identical genes within a genomic range across the compared strains are shaded in grey, while the genes responsible for the biosynthesis of EPS are colored in violet. Homologous genes that are not part of the EPS cluster are grouped by arrows of related colors, and pseudogenes are marked as dashed colored arrows. White arrows represent genes with no similarity to genes in related strains.