Label-free quantitative proteomics of Corynebacterium pseudotuberculosis isolates reveals differences between Biovars ovis and equi strains

Background: Corynebacterium pseudotuberculosis is a pathogen classified into two biovars: C. pseudotuberculosis biovar ovis, the etiologic agent of caseous lymphadenitis, and C. pseudotuberculosis biovar equi, which causes ulcerative lymphangitis. The available whole-genome sequences of different C. pseudotuberculosis strains have made it possible to identify differences in genes related to both the virulence and the physiology of each biovar. To evaluate whether these differences are reflected at the proteomic level, and to better understand the shared and exclusive factors of biovar ovis and biovar equi strains, we applied label-free quantitative proteomics to characterize the proteomes of strains 1002_ovis and 258_equi, isolated from a goat (Brazil) and a horse (Belgium), respectively. Results: We characterized a total of 1230 proteins in 1002_ovis and 1220 in 258_equi with high confidence. The core-proteome of 1002_ovis and 258_equi obtained here is composed of 1122 proteins involved in different cellular processes, which could be necessary for the free living of C. pseudotuberculosis. In addition, 120 proteins from this core-proteome showed statistically significant changes in abundance. In the exclusive proteome, we detected strain-specific proteins in each strain. When the exclusive proteome of each strain was considered together with the proteins that changed in abundance, the proteomic differences between 1002_ovis and 258_equi were related to proteins involved in cellular metabolism, information storage and processing, and cellular processes and signaling. Conclusions: This study reports the first comparative proteomic study of the biovars ovis and equi of C. pseudotuberculosis. The results provide information about factors that can contribute to understanding both the physiology and the virulence of this pathogen.


Background
Corynebacterium pseudotuberculosis is a Gram-positive facultative intracellular pathogen of the Corynebacterium, Mycobacterium, Nocardia, and Rhodococcus (CMNR) group. The CMNR group of pathogens has high G + C content in their genomes and shows a specific cell wall organization composed of peptidoglycan, arabinogalactan, and mycolic acids [1]. C. pseudotuberculosis is subdivided into two biovars: (i) C. pseudotuberculosis biovar ovis (nitrate negative) which is the etiologic agent of caseous lymphadenitis in small ruminants [2] and mastitis in dairy cattle [3] and (ii) C. pseudotuberculosis biovar equi (nitrate positive) that causes ulcerative lymphangitis and abscesses in internal organs of equines [4] and oedematous skin disease in buffalos [5]. C. pseudotuberculosis infection is reported worldwide and causes significant economic losses by affecting wool, meat, and milk production [6][7][8][9].
Our research group has carried out various genome-level studies to explore the molecular basis of the specific and shared factors among different strains of C. pseudotuberculosis that could contribute to biovar-specific pathogenicity. Whole-genome sequencing and analysis of several C. pseudotuberculosis strains belonging to biovars ovis and equi, isolated from different hosts, showed an average genome size of approximately 2.3 Mb, a core-genome of approximately 1504 genes across several C. pseudotuberculosis strains, and accessory genomes of biovar equi and ovis composed of 95 and 314 genes, respectively [10][11][12]. According to the pan-genome analysis, C. pseudotuberculosis biovar ovis showed a more clonal-like behavior than C. pseudotuberculosis biovar equi. In addition, this in silico study revealed interesting variability in pilus genes: biovar ovis strains showed high similarity, whereas biovar equi strains showed great variability, suggesting that this variability could influence the cellular adhesion and invasion of each biovar [10].
Apart from the structural genome informatics studies of C. pseudotuberculosis, some proteomic studies have been conducted to explore the functional genome of this pathogen [13][14][15][16][17][18][19]. However, all of these proteomic studies used only strains belonging to biovar ovis. To date, no proteomic study has compared biovar equi strains with each other or with biovar ovis strains. Therefore, to provide insights into the shared and exclusive proteins of biovar ovis and biovar equi strains and to complement previous studies on the functional and structural genomics of the C. pseudotuberculosis biovars, this study uses an LC-MSE approach [13,18] to report, for the first time, a comparative proteomic analysis of two C. pseudotuberculosis strains, 1002_ovis and 258_equi, isolated from a goat (Brazil) and a horse (Belgium), respectively. Our proteomic dataset validated previous in silico work on C. pseudotuberculosis; in addition, the qualitative and quantitative differences in the proteins identified in the present work may help to understand the factors that contribute to the pathogenic process of biovar ovis and equi strains.

Methods
Bacterial strain and growth condition
C. pseudotuberculosis biovar ovis 1002, isolated from a goat in Brazil, and C. pseudotuberculosis biovar equi 258, isolated from a horse in Belgium, were maintained in brain-heart infusion broth or agar (1.5%) (BHI-HiMedia Laboratories Pvt. Ltd., India) at 37°C. For proteomic analysis, overnight cultures (three biological replicates per strain) in BHI were diluted 1:100 in fresh BHI and grown at 37°C, and cells were harvested during exponential growth at OD600 = 0.8 (Additional file 1: Figure S1).

Protein extraction and preparation of whole bacterial lysates for LC-MS/MS
After bacterial growth, protein extraction was performed according to Silva et al. [18]. The cultures were centrifuged at 4000 x g at 4°C for 20 min. The cell pellets were washed in phosphate buffered saline (PBS) and then resuspended in 1 mL of lysis buffer (7 M urea, 2 M thiourea, 4% CHAPS and 1 M DTT), and 10 μL of Protease Inhibitor Mix (GE Healthcare, Piscataway, NJ, USA) was added. The cells were disrupted by sonication (5 cycles of 1 min) on ice and the lysates were centrifuged at 14,000 x g for 30 min at 4°C. Subsequently, samples were concentrated and the lysis buffer was replaced by 50 mM ammonium bicarbonate at pH 8.0 using a 10 kDa ultra-filtration device (Millipore, Ireland). All centrifugation steps were performed at room temperature. Finally, the protein concentration was determined by the Bradford method [20]. A total of 50 μg of protein from each biological replicate of 1002_ovis and 258_equi was denatured using RapiGEST SF [(0.1%) (Waters, Milford, MA, USA)] at 60°C for 15 min, reduced with DTT [(10 mM) (GE Healthcare)], and alkylated with iodoacetamide [(10 mM) (GE Healthcare)]. For enzymatic digestion, trypsin [(0.5 μg/μL) (Promega, Sequencing Grade Modified Trypsin, Madison, WI, USA)] was added and the samples were placed in a thermomixer at 37°C overnight. The digestion was stopped by the addition of 10 μL of 5% TFA (Sigma-Aldrich, St. Louis, Missouri, USA), and glycogen phosphorylase (Sigma-Aldrich) was added to the digests at 20 fmol·μL−1 as an internal standard for scouting normalization prior to the injection of each replicate for label-free quantitation [21].

LC-HDMSE analysis and data processing
Qualitative and quantitative analyses were performed using a 2D RPxRP (two-dimensional reversed-phase) nanoUPLC-MS (nano ultra-performance liquid chromatography mass spectrometry) approach with multiplexed nanoelectrospray high-definition mass spectrometry (nanoESI-HDMSE). To ensure that the same amount of each sample was injected onto the columns and to ensure standardized molar values across all conditions, stoichiometric measurements based on scouting runs of the integrated total ion current (TIC) were performed prior to analysis. The experiments were conducted using a 1 h reversed-phase gradient from 7% to 40% (v/v) acetonitrile (0.1% v/v formic acid) at a flow rate of 500 nL·min−1 on a 2D nanoACQUITY UPLC system [22]. A nanoACQUITY UPLC HSS (High Strength Silica) T3 1.8 μm, 75 μm × 15 cm column (pH 3) was used in conjunction with a reversed-phase (RP) XBridge BEH130 C18 5 μm, 300 μm × 50 mm nanoflow column (pH 10). Typical on-column sample loads were 250 ng of total protein digest for each of the 5 fractions (250 ng/fraction/load). For all measurements, the mass spectrometer was operated in resolution mode with a typical m/z resolving power of at least 35,000 FWHM and an ion mobility cell filled with nitrogen gas, with a cross-section resolving power of at least 40 Ω/ΔΩ. All analyses were performed using nanoelectrospray ionization in the positive ion mode, nanoESI (+), and a NanoLockSpray (Waters, Manchester, UK) ionization source.
The lock mass channel was sampled every 30 s. The mass spectrometer was calibrated with an MS/MS spectrum of a human [Glu1]-Fibrinopeptide B (Glu-Fib) solution (100 fmol·μL−1) delivered through the reference sprayer of the NanoLockSpray source. The doubly charged ion ([M + 2H]2+ = 785.8426) was used for the initial single-point calibration, and MS/MS fragment ions of Glu-Fib were used to obtain the final instrument calibration. Multiplexed data-independent acquisition (DIA) scanning with the added specificity and selectivity of non-linear 'T-wave' ion mobility (HDMSE) was performed on a Synapt G2-S HDMS mass spectrometer (Waters), which was programmed to switch automatically between standard MS (3 eV) and elevated collision energies HDMSE (19-45 eV) applied to the transfer 'T-wave' CID (collision-induced dissociation) cell with argon gas. The trap collision cell was set to 1 eV, using a millisecond scan time previously adjusted based on the linear velocity of the chromatographic peaks delivered by the nanoACQUITY UPLC to obtain a minimum of 20 scan points for each peak, in both low-energy and high-energy transmission, on an orthogonal acceleration time-of-flight (oa-TOF) analyzer from m/z 50 to 2000. The RF offset (MS profile) was adjusted in such a way that the nanoUPLC-HDMSE data were effectively acquired from m/z 400 to 2000, which ensured that any masses below m/z 400 observed in the high-energy spectra arose from dissociations in the collision cell.

Database searching and quantification
Following protein identification, the quantitative data were processed using dedicated algorithms [23,24] and searched against a database with default parameters to account for ions [25]. The databases used were reversed "on-the-fly" during the database queries and appended to the original database to assess the false discovery rate (FDR) during identification. For proper spectra processing and database searching conditions, the ProteinLynx Global Server v.2.5.2 (PLGS) with IdentityE and ExpressionE informatics v.2.5.2 (Waters) was used. UniProtKB (release 2013_01) with manually reviewed annotations was used, and the search conditions were based on taxonomy (Corynebacterium pseudotuberculosis). We used a database built from the genome annotations of 1002_ovis (CP001809.2) and 258_equi (CP003540.2). These databases were randomized within PLGS v.2.5.2 to generate a concatenated database from both genomes. Thus, the measured MS/MS spectra from the proteomic datasets of 1002_ovis and 258_equi were searched against this concatenated database. A maximum of one missed trypsin cleavage was allowed; variable modifications by carbamidomethyl (C), acetyl N-terminal, phosphoryl (STY) and oxidation (M) were allowed; and a peptide mass tolerance of 10 ppm was used [26]. To increase data quality, we considered peptides as source fragments, peptides with a charge state of at least [M + 2H]2+, and the absence of decoys. The collected proteins were organized by the PLGS ExpressionE algorithm into a statistically significant list that corresponded to higher or lower regulation ratios among the different groups. For protein quantitation, PLGS v2.5.2 was used with the IdentityE algorithm and the Hi3 methodology. The search threshold to accept each spectrum was the program default, with a false discovery rate of 4%.
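The reversed-database strategy described above can be illustrated with a minimal Python sketch. It is not the PLGS implementation: the function names, the "REV_" prefix and the toy sequences are hypothetical, and the FDR is estimated with the simple decoy-to-target ratio, which is only one of several conventions in use.

```python
def build_concatenated_db(targets):
    """Return the target entries plus a reversed (decoy) copy of each one.

    `targets` maps a protein name to its amino acid sequence.
    Decoy entries are marked with a hypothetical "REV_" prefix.
    """
    db = dict(targets)
    for name, seq in targets.items():
        db["REV_" + name] = seq[::-1]
    return db


def estimate_fdr(accepted_ids):
    """Estimate the FDR as (decoy hits) / (target hits) among accepted matches."""
    decoys = sum(1 for name in accepted_ids if name.startswith("REV_"))
    targets = len(accepted_ids) - decoys
    return decoys / targets if targets else 0.0


# Hypothetical toy data, for illustration only (not from this study).
toy_targets = {"protA": "MKTAYIAK", "protB": "MSLNQRW"}
toy_db = build_concatenated_db(toy_targets)
toy_hits = ["protA", "protB", "protA", "REV_protB"]
toy_fdr = estimate_fdr(toy_hits)
print(f"entries in concatenated database: {len(toy_db)}")
print(f"estimated FDR: {toy_fdr:.2f}")
```

In practice the score threshold for accepting matches is lowered or raised until the estimated FDR reaches the chosen limit.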
The quantitative values were averaged over all samples, and the standard deviations at p < 0.05 were determined using the Expression software. Only proteins with a differential expression log2 ratio between the two conditions greater than or equal to 1.2 were considered [26].
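As an illustration of the selection criteria above, the following Python sketch classifies proteins by significance and log2 ratio. It assumes that the 1.2 cutoff applies to the absolute log2 ratio (258_equi/1002_ovis), so that both more and less abundant proteins are retained, and that p ≤ 0.05 is the significance limit. The class, the function and the toy values are hypothetical and are not taken from the dataset.

```python
from dataclasses import dataclass


@dataclass
class ProteinRatio:
    name: str
    log2_ratio: float  # log2(258_equi / 1002_ovis)
    p_value: float


def classify(proteins, min_abs_log2=1.2, alpha=0.05):
    """Split significant proteins into more and less abundant groups.

    A protein is kept when p_value <= alpha and |log2_ratio| >= min_abs_log2.
    """
    more, less = [], []
    for p in proteins:
        if p.p_value > alpha or abs(p.log2_ratio) < min_abs_log2:
            continue
        (more if p.log2_ratio > 0 else less).append(p.name)
    return more, less


# Hypothetical toy values, for illustration only.
toy = [
    ProteinRatio("protein_1", 1.5, 0.01),   # kept, more abundant
    ProteinRatio("protein_2", -2.0, 0.03),  # kept, less abundant
    ProteinRatio("protein_3", 0.4, 0.001),  # ratio below cutoff
    ProteinRatio("protein_4", 3.0, 0.20),   # not significant
]
more_abundant, less_abundant = classify(toy)
print(more_abundant, less_abundant)
```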

Bioinformatics analysis
The proteins identified in 1002_ovis and 258_equi were subjected to bioinformatics analysis using various prediction tools. SurfG+ v1.0 [27] was used to predict subcellular localization, the SignalP 4.1.0 server [28] to predict the presence of N-terminal signal peptides for secretory proteins, the SecretomeP 2.0 server [29] to identify proteins exported by non-classical systems (positive prediction score greater than 0.5), and the LipoP server [30] to identify lipoproteins. Blast2GO [31] and the COG database [32] were used for functional annotation. The protein-protein interaction network was generated using Cytoscape version 2.8.3 [33] with a spring-embedded layout.

Results and discussion
Characterization of the proteome of C. pseudotuberculosis biovar ovis and equi
In this study, we applied the 2D nanoUPLC-HDMSE approach to characterize the proteomes of strains 1002_ovis and 258_equi. Both strains were grown in BHI medium; proteins were then extracted and digested in solution, and the peptides were analyzed by LC/MSE. Our proteomic analysis identified a total of 1227 non-redundant proteins in 1002_ovis (Additional file 2: Table S1 and Additional file 3: Table S2) and 1218 in 258_equi (Additional file 2: Table S1 and Additional file 4: Table S3) (Fig. 1a). Information about the sequence coverage and the number of identified peptides for each protein sequence, as well as information about the native peptides, is available in Additional file 5: Table S4 and Additional file 6: Table S5. Altogether, from the proteomes of these two biovars, we identified a total of 1323 different proteins of C. pseudotuberculosis with high confidence (Fig. 1a) and characterized approximately 58% of the predicted proteome of 1002_ovis [11] (Fig. 1b). In the case of 258_equi, we characterized approximately 57% of the predicted proteome [12] (Fig. 1b). The proteins identified in both proteomes were analyzed with the SurfG+ tool [27] to predict their subcellular localization in four categories: cytoplasmic (CYT), membrane (MEM), potentially surface-exposed (PSE) and secreted (SEC) (Fig. 1c). Further, we identified 83% (43 proteins) of the lipoproteins predicted in 1002_ovis and 79% (41 proteins) in 258_equi. Considering proteins with an LPxTG motif, which are involved in covalent linkage to peptidoglycan, we identified 6 proteins in 1002_ovis and 4 proteins in 258_equi, corresponding to approximately 38% and 34% of the LPxTG proteins predicted in each strain, respectively.
Fig. 1 Characterization of the proteome of C. pseudotuberculosis and correlation with in silico data. a Distribution of the proteins identified in the proteome of 1002_ovis and 258_equi, represented by Venn diagram. b Correlation of the proteomic results with in silico data of the genomes of 1002_ovis and 258_equi. c Subcellular localization of the identified proteins and correlation with the in silico predicted proteome. CYT, cytoplasmic; MEM, membrane; PSE, potentially surface-exposed and SEC, secreted
The biovar equi and biovar ovis core proteome
The core-proteome of 258_equi and 1002_ovis is composed of 1122 proteins (Fig. 1) (Additional file 2: Table S1). Interestingly, when these 1122 proteins were correlated with in silico data of the C. pseudotuberculosis core-genome [10], we observed that 86% (960 proteins) of the open reading frames (ORFs) that encode these proteins are part of the core-genome (Additional file 2: Table S1), which represents approximately 64% of the predicted core-genome of this pathogen. In addition, these data show a set of proteins involved in different cellular processes that could be necessary for the free living of C. pseudotuberculosis. The other 14% (262 proteins) of the proteins that constitute the core-proteome are shared by at least one of the 15 strains used in the core-genome study. According to Gene Ontology analysis [31,32], the 1122 proteins were classified into four important functional groups: (i) metabolism, (ii) information storage and processing, (iii) cellular processes and signaling, and (iv) poorly characterized (Fig. 2a). As observed in the C. pseudotuberculosis core-genome study [10], a large number of proteins were detected in the categories "metabolism" and "information storage and processing".
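The counts reported in the text can be cross-checked with simple set arithmetic. The short Python sketch below uses only numbers stated in this article (1227 and 1218 identified proteins, 1122 shared proteins, 960 core-genome ORFs and a core-genome of approximately 1504 genes); the function name is hypothetical.

```python
def venn_counts(n_a, n_b, n_shared):
    """Return the exclusive and union sizes of two overlapping sets."""
    return {
        "only_a": n_a - n_shared,
        "only_b": n_b - n_shared,
        "union": n_a + n_b - n_shared,
    }


# Values reported in the text: 1002_ovis, 258_equi, core-proteome.
counts = venn_counts(n_a=1227, n_b=1218, n_shared=1122)

# Share of the core-proteome encoded by core-genome ORFs, and share of the
# predicted core-genome (approximately 1504 genes) that was detected.
share_of_core_proteome = 960 / 1122
share_of_core_genome = 960 / 1504

print(counts)
print(f"{share_of_core_proteome:.1%} of the core-proteome")
print(f"{share_of_core_genome:.1%} of the core-genome")
```

The resulting union of 1323 proteins and the exclusive sets of 105 (1002_ovis) and 96 (258_equi) proteins agree with the totals given in the Results.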
Label-free quantification was applied to evaluate the relative abundance of the proteins in the core-proteome of 258_equi and 1002_ovis. The ProteinLynx Global Server (PLGS) v2.5.2 software with the ExpressionE algorithm was used to identify proteins with p ≤ 0.05 (Additional file 2: Table S1). Among these proteins, 120 showed a difference in abundance between 258_equi and 1002_ovis (log2 ratios equal to or greater than 1.2) [26] (Table 1). Within this group (258_equi:1002_ovis), 49 proteins were more abundant and 71 were less abundant (Table 1). To visualize this differential distribution of the core-proteome, a volcano plot of the log2 ratio of 258_equi/1002_ovis versus Log (e) Variance was generated (Fig. 2b). Interestingly, phospholipase D (Pld), the major virulence factor of C. pseudotuberculosis, was more abundant in 258_equi than in 1002_ovis (Table 1). Pld plays an important role in the pathogenic process of C. pseudotuberculosis: owing to its sphingomyelinase activity, this exotoxin increases vascular permeability through the exchange of polar groups attached to membrane-bound lipids and helps the bacteria spread inside the host [34,35]. In addition, this exotoxin is able to reduce the viability of both macrophages and neutrophils [34,36]. In comparative proteomic studies of the exoproteomes of 1002_ovis and C231_ovis, Pld was detected only in the C231_ovis supernatant [13,15,16]. In another study, pld mutant strains showed decreased virulence [37]. Thus, 1002_ovis could have a lower virulence potential than 258_equi.
Fig. 2 Representative results of the core-proteome of 1002_ovis and 258_equi. a Functional distribution of the proteins identified in the core-proteome. b Volcano plot generated by differentially expressed proteins, log2 ratio of 258_equi/1002_ovis. c Biological processes differential between 258_equi and 1002_ovis
The 120 differential proteins were organized by clusters of orthologous groups. When we evaluated the biological processes that comprise each category listed above, we observed that 19 processes differed between 258_equi and 1002_ovis (Fig. 2c, Additional file 7: Figure S2 and Additional file 8: Figure S3). The majority of the more abundant proteins (258_equi:1002_ovis) are related to cellular metabolism. On the other hand, the majority of the less abundant proteins (258_equi:1002_ovis) are classified as poorly characterized or of unknown function. However, when only proteins of known or predicted function are considered, the majority of the less abundant proteins are related to cellular processes and signaling.
Differences among the major functional classes identified from the core-proteome analysis of 1002_ovis and 258_equi
Metabolism
During the infection process, pathogens need to adjust their metabolism in response to nutrient availability inside and outside the host. In our proteomic study, we identified several proteins related to different metabolic pathways. To determine the metabolic network of each strain, the proteins identified in this study were analyzed using Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways [38]. A total of 321 and 320 proteins, corresponding to 1002_ovis and 258_equi respectively, were mapped onto different metabolic pathways (Additional file 9: Figure S4 and Additional file 10: Figure S5). We observed differences between the biovars in amino acid transport and metabolism, carbohydrate transport and metabolism, coenzyme metabolism, energy metabolism, lipid transport and metabolism, nucleotide metabolism, and secondary metabolite biosynthesis, transport and catabolism. Differences in cellular metabolism have also been observed in other comparative proteomic studies of C. pseudotuberculosis [13,16,17,19], as well as in the pathogen Mycobacterium tuberculosis [39].
Interestingly, the PTS system fructose-specific EIIABC component (PstF), related to carbohydrate metabolism, was more abundant in 258_equi than in 1002_ovis (Table 1). This protein showed increased abundance in field isolates of C. pseudotuberculosis biovar ovis grown in BHI when compared to C231_ovis, a reference strain [19]. The increased abundance of PstF in 258_equi suggests that this protein could be important for carbon-source transport in both biovar ovis and biovar equi strains. On the other hand, precorrin-8X methylmutase, involved in cobalamin (vitamin B12) synthesis, may be required only in biovar ovis strains: besides being more abundant in 1002_ovis (Table 1), this protein was also detected with greater abundance in the field isolates of C. pseudotuberculosis biovar ovis grown in BHI [19]. Glutamate dehydrogenase (GDH) was more abundant in 258_equi (Table 1). A study of the pathogen M. bovis showed that GDH contributes to its survival during macrophage infection [40].
In C. pseudotuberculosis, genes related to iron acquisition have been shown to be involved in the virulence of this pathogen [41]. In the core-proteome of 1002_ovis and 258_equi, we detected proteins involved in this process, such as CiuA, FagC and FagD; however, none of these proteins was differentially regulated between the two strains (Additional file 2: Table S1). On the other hand, the HmuT protein, related to hemin uptake, was more abundant in 258_equi (Table 1). Additionally, we detected a cell surface hemin receptor in the exclusive proteome of this strain. Heme represents the major reservoir of iron for many bacterial pathogens, which rely on surface-associated heme-uptake receptors [42]. HmuT is a lipoprotein that acts as a hemin receptor. The hmuT gene is part of the hmuTUV operon, an ABC transport system (hemin transport system) that is normally present in pathogenic Corynebacterium [43,44]. In addition, in the pathogen C. ulcerans, HmuT is required for normal hemin utilization [44].

Information storage and processing
Among the proteins identified in the category "information storage and processing", the majority of the differential proteins were less abundant in 258_equi (Table 1). Only a metallophosphoesterase involved in DNA repair, a SAM-dependent methyltransferase related to transcription, and ribosomal RNA small subunit methyltransferase I, involved in translation, were more induced in 258_equi. In 1002_ovis, exodeoxyribonuclease 7, an important protein of the DNA-damage pathway, was more induced. In addition, we identified the TetR family regulatory protein as more abundant in 1002_ovis; this result was also observed in field isolates of C. pseudotuberculosis from naturally infected sheep [19]. TetR proteins are related to the regulation of multidrug efflux pumps, antibiotic biosynthesis, catabolic processes and cellular differentiation [45]. Other important transcriptional regulators, such as PvdS and GreA, were also induced in 1002_ovis.

Cellular processes and signaling
Our proteomic analyses detected differentially regulated proteins belonging to different antioxidant systems. These could contribute to the survival of C. pseudotuberculosis under various stress conditions, such as exposure to the reactive oxygen species (ROS) and reactive nitrogen species (RNS) that are generally found in macrophages. The three major thiol-dependent antioxidant systems in prokaryotic pathogens are the thioredoxin system (Trx), the glutathione system (GSH-system) and the catalase system [46]. Thioredoxin TrxA and thiol-disulfide isomerase thioredoxin were more abundant in 258_equi (Table 1). These proteins are involved in the Trx-system, which has a major role against oxidative stress [46]. However, proteins such as catalase and glutaredoxin (nrdH) were less abundant in 258_equi (Table 1) and were therefore more abundant in 1002_ovis. Catalase plays an important role in resistance to ROS and RNS, as well as in the virulence of M. tuberculosis [47]. The protein NrdH has a glutaredoxin amino acid sequence and thioredoxin activity. It is present in Escherichia coli [48] and C. ammoniagenes [49], as well as in bacteria where the GSH system is absent, such as M. tuberculosis [50]. Thus, the presence of NrdH may represent one more factor that contributes to the resistance of C. pseudotuberculosis against ROS and RNS during the infection process, as well as to the maintenance of the intracellular redox balance. Proteins such as NorB and Glyoxalase/Bleomycin, which play roles in the nitrosative stress response of 1002_ovis, were identified in the exclusive proteome of this strain (Additional file 3: Table S2) [14,18]. These results show that, besides having proteins that differ in abundance, both strains have a set of proteins that could contribute to adaptation under stress conditions.
Proteomic differences observed in the exclusive proteomes of 258_equi and 1002_ovis
We found 105 and 96 proteins in the exclusive proteomes of 1002_ovis and 258_equi, respectively (Fig. 1) (Additional file 3: Table S2 and Additional file 4: Table S3), related to different biological processes (Additional file 7: Figure S2 and Additional file 8: Figure S3). Interestingly, in these exclusive proteomes we detected proteins specific to each strain (Table 2, Additional file 3: Table S2 and Additional file 4: Table S3). In the exclusive proteome of 258_equi, the ORFs encoding twenty proteins are annotated as pseudogenes in 1002_ovis (Table 2, Additional file 3: Table S2 and Additional file 4: Table S3). In addition, the ORFs encoding six proteins were not detected in the genome of 1002_ovis. These proteins are two CRISPR proteins, MoeB, and three proteins of unknown function. CRISPR is an important bacterial defense system against infection by viruses or plasmids; this immunity is obtained through the integration of short sequences of invasive DNA ('spacers') into the CRISPR loci [51]. The distinction between biovar ovis and biovar equi strains is based on a biochemical assay, in which biovar ovis strains are negative for nitrate reduction, whereas biovar equi strains are positive [52]. However, to date, there is no available information regarding the molecular basis of nitrate reduction in C. pseudotuberculosis biovar equi. MoeB is involved in the biosynthesis of the molybdenum cofactor (Moco), which plays an important role in anaerobic respiration in bacteria and is also required for the activation of nitrate reductase (NAR) [53]. In the closely related pathogen M. tuberculosis, several studies have shown the great importance of the molybdenum cofactor in virulence and in the pathogenic process, mainly in the intracellular environment of macrophages [54].
Therefore, more studies are necessary to explore the true role of Moco in both the physiology and the virulence of biovar equi strains. Another protein that could contribute to the resistance of 258_equi inside macrophages is the NADPH-dependent nitro/flavin reductase (NfrA), which is a pseudogene in 1002_ovis. Studies performed in Bacillus subtilis showed that NfrA is involved in both oxidative stress [55] and heat shock resistance [56].
In 1002_ovis, only the ORF that encodes a DNA methylase was not found in the 258_equi genome (Table 2, Additional file 3: Table S2 and Additional file 4: Table S3). In addition, the ORFs encoding seven proteins identified in the exclusive proteome of strain 1002_ovis are annotated as pseudogenes in 258_equi (Table 2, Additional file 3: Table S2 and Additional file 4: Table S3). Within this group, we identified important proteins involved in cellular adhesion and invasion, which might contribute to the pathogenesis of 1002_ovis. Adhesion to host cells is a crucial step that favors bacterial colonization; this process is mediated by different adhesins [57]. We identified proteins such as the collagen-binding surface protein Cna-like and the Sdr family related adhesin, which are members of the collagen-binding microbial surface components recognizing adhesive matrix molecules (MSCRAMMs) (Table 2). This class of proteins is present in several Gram-positive pathogens and plays an important role in bacterial virulence by acting mainly in cellular adhesion [58][59][60][61].
Another detected protein that might contribute to the virulence of 1002_ovis is neuraminidase (NanH) (Table 2). This protein belongs to a class of glycosyl hydrolases that contributes to the recognition of sialic acids exposed on host cell surfaces [62]. In C. diphtheriae, it was demonstrated that a protein with trans-sialidase activity promotes cellular invasion [63,64]. In addition, NanH was reported to be immunoreactive in the immunoproteome of 1002_ovis, showing the antigenicity of this protein [65]. Interestingly, genomic differences in genes involved in adhesion and invasion have also been observed between biovar ovis and biovar equi strains, mainly in genes related to pili [10,12]. With regard to the pathogenic process of each biovar, biovar equi strains rarely cause visceral lesions [4], whereas biovar ovis strains are mainly responsible for visceral lesions [2,35], which requires a high ability to adhere to and invade host cells. These proteins could therefore be responsible for the ability of biovar ovis strains to attack visceral organs.

Proteogenomic analysis
In our proteomic analysis, the measured MS/MS spectra from the proteomic datasets of 1002_ovis and 258_equi were searched against a concatenated database composed of the genome annotations of 1002_ovis (CP001809.2) and 258_equi (CP003540.2) to identify possible annotation errors or unannotated genes. By adopting the more stringent criteria of considering only proteins represented by a minimum of two peptides and an FDR < 1%, we identified five proteins in 1002_ovis and seven proteins in 258_equi that were not previously annotated. All parameters, as well as the peptide sequences used for the identification of these proteins, are shown in Additional file 11: Table S6 and Additional file 12: Table S7. The proteins identified in this proteogenomic analysis are associated with different biological processes. For instance, aminopeptidase N, involved in amino acid metabolism, was detected in 1002_ovis, whereas cobaltochelatase (cobN), associated with cobalt metabolism, glutamate dehydrogenase (gdh), involved in L-glutamate metabolism, the PTS system fructose-specific EIIABC component, related to fructose metabolism, and phosphoribosylglycinamide formyltransferase, involved in purine biosynthesis, were all detected in 258_equi. Proteins involved in DNA processes, such as uracil DNA glycosylase in 258_equi and exodeoxyribonuclease 7 small subunit in 1002_ovis, were also detected. Proteins with general function prediction only and proteins of unknown function were identified in both strains.

Conclusion
In conclusion, we used a label-free quantitative approach to compare, for the first time, the proteomes of C. pseudotuberculosis strains belonging to the ovis and equi biovars. Taken together, the findings reported here show a set of shared and exclusive factors of 1002_ovis and 258_equi at the protein level, which can contribute to understanding both the physiology and the virulence of these strains. In addition, the functional analysis of the genomes of 1002_ovis and 258_equi allows validation of the in silico genome data of these strains. Thus, the proteins identified here may be used in future investigations as potential new targets for the development of vaccines against C. pseudotuberculosis biovars ovis and equi.