Genomic basis of broad host range and environmental adaptability of Rhizobium tropici CIAT 899 and Rhizobium sp. PRF 81 which are used in inoculants for common bean (Phaseolus vulgaris L.)

Background Rhizobium tropici CIAT 899 and Rhizobium sp. PRF 81 are α-Proteobacteria that establish nitrogen-fixing symbioses with a range of legume hosts. These strains are broadly used in commercial inoculants for application to common bean (Phaseolus vulgaris) in South America and Africa. Both strains display intrinsic resistance to several abiotic stressful conditions such as low soil pH and high temperatures, which are common in tropical environments, and to several antimicrobials, including pesticides. The genetic determinants of these interesting characteristics remain largely unknown. Results Genome sequencing revealed that CIAT 899 and PRF 81 share a highly-conserved symbiotic plasmid (pSym) that is present also in Rhizobium leucaenae CFN 299, a rhizobium displaying a similar host range. This pSym seems to have arisen by a co-integration event between two replicons. Remarkably, three distinct nodA genes were found in the pSym, a characteristic that may contribute to the broad host range of these rhizobia. Genes for biosynthesis and modulation of plant-hormone levels were also identified in the pSym. Analysis of genes involved in stress response showed that CIAT 899 and PRF 81 are well equipped to cope with low pH, high temperatures and also with oxidative and osmotic stresses. Interestingly, the genomes of CIAT 899 and PRF 81 had large numbers of genes encoding drug-efflux systems, which may explain their high resistance to antimicrobials. Genome analysis also revealed a wide array of traits that may allow these strains to be successful rhizosphere colonizers, including surface polysaccharides, uptake transporters and catabolic enzymes for nutrients, diverse iron-acquisition systems, cell wall-degrading enzymes, type I and IV pili, and novel T1SS and T5SS secreted adhesins. 
Conclusions Availability of the complete genome sequences of CIAT 899 and PRF 81 may be exploited in further efforts to understand the interaction of tropical rhizobia with common bean and other legume hosts.


Background
The symbiotic relationship between legumes and nitrogen-fixing bacteria, commonly known as rhizobia, has been the subject of practical and basic studies for over 120 years. Lately, renewed interest in the topic has arisen from its key role in agricultural sustainability, in lowering costs for farmers, in improving soil fertility, and in mitigating greenhouse-gas emissions.
The rhizobia-legume symbiosis starts with a finely tuned molecular dialogue between the partners. Specific signals released by the legume, mainly flavonoids [1], are perceived by the rhizobium that, in response, produces lipochitin oligosaccharides, the Nod factors, that in turn elicit the formation of specialized organs, the root nodules [2]. Rhizobia enter the legume roots and colonize the developing nodules where they differentiate into bacteroids that fix atmospheric nitrogen [3]. In addition, a variety of other bacterial systems are required for root colonization, effective nodulation and nitrogen fixation, including surface polysaccharides and secretion systems [3,4].
Nitrogen fixation is an ancient prokaryotic trait that predates plant evolution [5], as evidenced by the distribution of diazotrophs across a broad range of phylogenetically unrelated bacteria [6]. In contrast, rhizobia-induced nodulation and the corresponding nodulation (nod) genes had a more recent origin, arising with the evolution of the host legumes [7]. The nod and nif genes are contained in symbiotic plasmids or symbiotic islands. Rhizobial symbiotic compartments are enriched in mobile genetic elements, and show evidence of gene acquisition by lateral transfer [8,9]. This dynamic evolution is probably driven by selective pressures to nodulate a range of legume hosts.
The best studied rhizobium-legume symbiotic models are those involving Sinorhizobium (=Ensifer) meliloti-alfalfa (Medicago sativa L.) and Bradyrhizobium japonicum-soybean [Glycine max (L.) Merr.], and significant progress has been achieved by sequencing the genomes of the bacterial partners [10,11]. The genome sequence of Rhizobium etli CFN 42 shed light on its symbiotic relationship with common bean (Phaseolus vulgaris L.), the most important legume for direct human consumption in undeveloped and developing countries. However, poor understanding prevails with respect to other important common bean-nodulating symbionts of tropical acid soils, such as Rhizobium strains CIAT 899 and PRF 81 [12].
CIAT 899, isolated from a common-bean nodule in Colombia, is the type strain of Rhizobium tropici [13]. Originally, R. tropici comprised two phenotypically distinct groups, named type A and type B, but recently a new species, Rhizobium leucaenae, has been proposed to accommodate the type A strains, with strain CFN 299 selected as the representative of the species [14]. Strain PRF 81 was isolated from a common-bean nodule collected in the State of Parana, Brazil; it is phylogenetically related to R. tropici and R. leucaenae [15]. PRF 81 has been described as intermediate in phenotypic characteristics between R. tropici and R. leucaenae, although it has been suggested to be more closely related, both phenotypically and phylogenetically, to the former species [15].
CIAT 899 and PRF 81 are promiscuous rhizobia with host ranges that include several species of the three legume subfamilies [15]. Several wild tropical legumes are the natural hosts of R. tropici and R. leucaenae strains [16], and these rhizobia gained access to common bean nodules when the legume was introduced into tropical regions [12]. The genetic determinants of the broad host range of R. tropici are presently unknown, although the capacity of certain strains, like CIAT 899, to produce a wide range of Nod factor structures [17] surely contributes to this phenotype. In addition to its association with legumes, CIAT 899 is also a proficient maize rhizospheric and endophytic colonizer, with the capacity to promote plant growth [18].
CIAT 899 and PRF 81 have been identified as highly effective in fixing N2 with Andean and Mesoamerican common-bean genotypes, competitive against indigenous rhizobia, genetically stable, and adapted to stressful tropical conditions such as acidic soils and high temperatures [13,15,19,20]. These properties resulted in the inclusion of both strains in commercial inoculants in Brazil [21,22], and in some countries in Africa. For CIAT 899, the genetic determinants of growth at high temperatures and low pH that have been reported [23-27] do not fully explain the ability of this strain to withstand such stressful conditions. In addition, PRF 81 outcompetes CIAT 899 in co-inoculation experiments for bean nodulation under greenhouse conditions [15] and in field trials in acid soils [21]; however, the basis of this differential competitiveness remains unexplored, as does the possible repertoire of genes related to the competitive ability of both strains when challenged with indigenous rhizobia.
CIAT 899 and other R. tropici strains are more resistant to several antimicrobial compounds and heavy metals in comparison to other common bean-nodulating rhizobia, such as R. etli and R. leucaenae [14,28]. R. tropici strains also show strong resistance to pesticides used in agriculture, such as the fungicides Thiram and Captan, that can diminish the survival of rhizobial inoculants when applied to seeds [29,30].
The aim of this study was to compare the genome of R. tropici CIAT 899 with that of Rhizobium sp. PRF 81, with special reference to their symbiotic plasmids and all possible determinants of resistance to stressful conditions and antimicrobial compounds. The sequence of the symbiotic plasmid of R. leucaenae CFN 299 was also obtained and analyzed. Possible mechanisms of broad host range and symbiotic plasmid evolution were found, as well as features that may explain the basis of the environmental stress tolerance and antimicrobial resistance of these strains.

Results and discussion
General characteristics of CIAT 899 and PRF 81 genomes
R. tropici CIAT 899 had a 6,686,337-bp genome composed of a chromosome and three plasmids with rep-type replication systems. General features and statistics of the genome are presented in Table 1. The 3,837,060-bp chromosome contained all three ribosomal operons. The plasmid sizes were in general agreement with previously reported estimates based on Eckhardt gels [31], except for the 2,083,197-bp megaplasmid (pRtrCIAT899c), which was somewhat larger than expected. The second largest replicon (549,467 bp) was the symbiotic plasmid (pSym). The smallest plasmid (pRtrCIAT899a; 216,610 bp) and the pSym had lower G + C content values than the genome average (Table 1).
Sequencing of Rhizobium strain PRF 81 resulted in a high-quality draft genome with an average 17.4-fold coverage, distributed in 103 contigs with an N50 size of 246 kb. The estimated genome size was 7.08 Mb. It was previously shown that PRF 81 has four plasmids [32]. Two contigs were found to represent the complete sequences of the two smallest plasmids, pPRF81a and pPRF81b (Table 1). Given that chromosomes in the Rhizobiaceae are conserved [33], an alignment of the PRF 81 contigs to the genomes of CIAT 899 and other Rhizobium strains allowed the identification of 23 putative chromosomal contigs totaling 3.76 Mb. Interestingly, seven contigs of PRF 81 mapped with a very high level of identity to the CIAT 899 pSym and totaled 520 kb, a value close to the estimated size of the PRF 81 pSym (pPRF81c) [15]. The remaining contigs (2.49 Mb in total) must represent sequences from the PRF 81 megaplasmid (pPRF81d), which has an estimated size of 2.4 Mb, or repeated sequences in the genome.
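The N50 statistic quoted for the PRF 81 draft can be illustrated with a short calculation. The following Python sketch is only an illustration of the usual definition (the contig length at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the assembly); the function name and the toy contig lengths are assumptions and are not taken from the PRF 81 assembly.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    Contigs are sorted from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the
    summed assembly length.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp), for illustration only.
toy_contigs = [400_000, 300_000, 246_000, 100_000, 50_000, 4_000]
print(n50(toy_contigs))
```

A larger N50 for the same total length indicates that more of the assembly sits in long contigs, which is why the value is reported alongside contig number and coverage.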
Putative functions could be assigned to around 67% of the coding DNA sequences (CDS) predicted for each strain. The chromosomes coded for most genes assigned to functionally important classes such as central intermediary metabolism, cellular processes and DNA metabolism. No essential genes were found in plasmids. However, it is noteworthy that megaplasmids from both strains encoded about 45% of all the transport capacity of the genome, 37% of the regulatory functions and 34% of energy-related metabolism.
Phylogeny and whole genome relationships between CIAT 899 and PRF 81
R. tropici forms a clade with R. leucaenae (formerly R. tropici type A), Rhizobium multihospitium, Rhizobium lusitanum, Rhizobium miluonense and Rhizobium rhizogenes, in phylogenies constructed with the 16S rRNA and several housekeeping genes [14,34]. The only other representative of this clade with a sequenced genome is R. rhizogenes K84 [35-37]. We determined the relationships between the genomes of CIAT 899, PRF 81 and other Rhizobiaceae genomes using a dendrogram constructed with the genomic-distance index MUMi [38]. As shown in Figure 1, CIAT 899 and PRF 81 were the most closely related strains, and they had K84 as their sister taxon, which is consistent with the close relationship between the former strains and R. rhizogenes. The dendrogram also correctly depicts accepted phylogenetic relationships clustering strains from the same species (Rhizobium leguminosarum), as well as strains from closely related species like R. etli and Rhizobium phaseoli, or S. meliloti and Sinorhizobium medicae. As expected, due to their close relationship, CIAT 899 and PRF 81 shared a higher number of orthologous genes with each other than with any other sequenced rhizobia or agrobacteria (Additional file 1). Phenotypic and phylogenetic analyses previously reported by us suggest that, although close to CIAT 899, PRF 81 may be divergent enough to be considered as a different species [34]. When a digital DNA-DNA hybridization methodology [39] was performed, PRF 81 shared only 52% of its genome sequence with CIAT 899, suggesting that wet-lab hybridization will be lower than the 70% threshold required for inclusion in the species R. tropici [40].

Table 1 (gene-count rows)
Strain     Replicon           Predicted genes   CDS    tRNA   rRNA
CIAT 899   Chromosome         3734              3672   53     9
CIAT 899   pRtrCIAT899c       1905              1905   -      -
CIAT 899   pSym*              500               500    -      -
CIAT 899   pRtrCIAT899a       212               212    -      -
CIAT 899   Total              6351              6289   53     9
PRF 81†    Chromosome         3479              3419   51     9
PRF 81†    pPRF81d            2093              2093   -      -
PRF 81†    pPRF81c*           478               478    -      -
PRF 81†    pPRF81b            149               149    -      -
PRF 81†    pPRF81a            132               132    -      -
PRF 81†    Total              6331              6271   51     9
* Symbiotic plasmid.
† For PRF 81, the size and G + C content of the chromosome, pD and pC are estimates.
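The MUMi index used for the dendrogram can be sketched numerically. As commonly described, MUMi is 1 minus the ratio between the summed length of non-overlapping maximal unique matches (MUMs) shared by two genomes and the average length of the two genomes, so values near 0 indicate nearly identical genomes and values near 1 indicate very distant ones. The Python sketch below assumes that the MUM lengths have already been obtained and trimmed for overlaps by an external tool; the function name and the toy numbers are hypothetical and are not values from this study.

```python
def mumi(mum_lengths, genome_len_a, genome_len_b):
    """Genomic distance from maximal unique matches.

    mum_lengths: lengths (bp) of non-overlapping MUMs shared by the two
                 genomes (assumed to be precomputed elsewhere).
    genome_len_a, genome_len_b: total lengths (bp) of the two genomes.
    """
    if genome_len_a <= 0 or genome_len_b <= 0:
        raise ValueError("genome lengths must be positive")
    l_mum = sum(mum_lengths)
    l_av = (genome_len_a + genome_len_b) / 2
    return 1 - l_mum / l_av


# Toy example: two 1,000-bp "genomes" sharing 500 bp in MUMs.
print(mumi([300, 200], 1000, 1000))
```

Pairwise distances computed this way can then be fed to any standard clustering method to obtain a dendrogram.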
Synteny, evaluated at the contig level, was found between CIAT 899 and PRF 81 chromosomes and pSyms (Figure 2). Scattered syntenic gene clusters were found between their megaplasmids. These clusters included the septum formation minCDE, histidine degradation hutIHU,G, and protocatechuic acid degradation pcaGHCD,Q,R,IJF genes, all of them previously found to be conserved in secondary replicons of bacteria [35]. Recently, the term chromid was proposed for secondary replicons showing chromosome-like features but plasmid-type replication systems [41]. The megaplasmids from both strains conformed to the characteristics defined for chromids, such as having G + C contents similar to that of the chromosome (Table 1) and harboring core genes usually found in chromosomes. For example, both megaplasmids encoded thiamin and cobalamin (vitamin B12) biosynthesis.
It has been proposed that a "chromid often arises at the origin of a new genus" by the migration of essential or important genes to a plasmid [41]. This may explain why a large proportion of genes contained in chromids are conserved within a genus [41]. In the case of CIAT 899 and PRF 81, 47% of the genes contained in the CIAT 899 chromid had orthologues located in putative chromid contigs of PRF 81. After genus emergence and species divergence, chromids may evolve differently. For example, in the R. etli/R. leguminosarum lineage, it seems that the original chromid suffered events of co-integration and resolution leading to its partition into three different replicons, while within the R. tropici lineage it was maintained as a single replicon. In contrast to CIAT 899 and PRF 81 chromids, those of R. etli and R. leguminosarum are highly syntenic [42]. The lack of chromid gene order conservation between CIAT 899 and PRF 81 may indicate that these strains have a longer history of divergence in comparison to R. etli and R. leguminosarum, again indicating that they belong to different species.
A conserved symbiotic plasmid defining a novel symbiovar: tropici
CIAT 899 and PRF 81 are promiscuous rhizobia sharing a broad host range with R. leucaenae CFN 299 (Additional file 2). Previous studies reported that these three strains produce similar Nod factors [43] and have almost identical sequences in analyzed symbiotic genes. To compare the symbiotic gene repertoire of these strains, we sequenced the CFN 299 pSym. This plasmid was transferred to a plasmid-less Agrobacterium strain and purified. A combination of Sanger and SOLiD sequencing was used, resulting in 27 contigs with 20-fold coverage, representing the almost complete CFN 299 pSym. We found that the three strains have a conserved and syntenic pSym, hereafter designated as the "tropici symbiotic plasmid" (Figure 3). The level of sequence identity between the three plasmids was very high (≥99.9% at the contig level); therefore, all gene products were 100% identical at the amino acid level. This high sequence conservation likely indicates a recent dissemination of the tropici pSym. The only major differences between the three pSyms were the presence or absence of certain insertion sequences (ISs) (Figure 3). We have previously demonstrated that several ISs located in the pSym of CFN 299 are active at transposition [44].
Two replication systems were found in the tropici pSym, one having a set of repABC1 genes and the other only repC2. The highest similarities of RepC2 (74-77%) were with homologues located in S. meliloti and Sinorhizobium fredii plasmids. RepC1 also showed high similarities (82-87%) with sinorhizobial genes and also with RepCs from Rhizobium and Agrobacterium. Interestingly, two regions showing different G + C contents could be recognized in this plasmid: a 236-kb segment harboring the repABC1 genes showed 59%, whereas the other 291-kb segment harboring the repC2 gene showed 57%. The two regions are connected by 19.7-kb and 7.2-kb regions constituted by ISs and integrase genes. Thus, the pSym may have arisen by the co-integration of two plasmids. The largest region contained the core genes required for nodulation and nitrogen fixation, and probably represents the original pSym. The other region contained genes that could have improved the symbiotic or associative abilities, such as nodM, an uptake hydrogenase, and genes for the biosynthesis of plant hormones. The tropici pSym possessed the majority of all IS transposases present in the genomes of CIAT 899 and PRF 81. Members of the IS66 family were the most abundant ISs, as has also been observed in symbiovar phaseoli plasmids [45]. Shared ISs between plasmids may promote co-integration events through homologous recombination.
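A contrast in G + C content between segments of a replicon, such as the 59% and 57% regions described above, can be explored with a sliding-window calculation. The following Python sketch is a minimal illustration and not the procedure used in this study; the function names, the window and step sizes, and the toy sequence are assumptions chosen for clarity.

```python
def gc_content(seq):
    """Percent G + C among unambiguous bases (A, C, G, T) of a sequence."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    return 100.0 * gc / acgt if acgt else 0.0


def windowed_gc(seq, window, step):
    """Return (start, percent G + C) for each full window along seq."""
    return [
        (start, gc_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence: a GC-only half followed by an AT-only half.
toy_seq = "GC" * 10 + "AT" * 10
print(windowed_gc(toy_seq, window=20, step=20))
```

On real plasmid sequences, windows of several kilobases are typically needed before a shift of about two percentage points stands out from local noise.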
The term symbiovar, recently proposed to replace the term biovar, is useful to designate symbiotic variants that exist in different rhizobial species [46]. Symbiovars are defined based on legume-host range and on symbiotic gene sequences. Therefore, it seems reasonable to propose that CIAT 899, CFN 299 and PRF 81 belong to the same symbiovar that here we designate as tropici. R. lusitanum P1-7, also ascribed to the R. tropici clade, has nodD and nifH genes resembling those in CIAT 899 [47] and induces effective nodulation in Leucaena sp.; therefore, it should also be ascribed to symbiovar tropici. The symbiovar tropici pSym shares an analogous history with symbiovar phaseoli pSym in the sense that a highly conserved plasmid has spread and is maintained in several different Rhizobium species [46].

Nodulation genes
Using different techniques, similar Nod factor structures have been attributed to CFN 299 [48], CIAT 899 [49] and PRF 81 [15]. The tropici pSym gene cluster nodD-ABC-SUIJ-HPQ, described in previous studies [49,50], is able to direct the synthesis of Nod factors and to confer nodulation ability when transferred to a non-nodulating strain [51]. nodC directs the synthesis of the Nod factor-backbone chitin oligosaccharide structure that is deacetylated, acylated, methylated, carbamoylated and sulfated by the products of nodB, nodA, nodS, nodU and nodH, respectively [49,50], and exported by an ABC-type transporter encoded by nodIJ [52]. The activated sulfate donor used by NodH is produced by the nodPQ products. Unexpectedly, the two functional domains of NodQ, sulfate adenylyltransferase and adenylylsulfate kinase, previously reported to be carried on a single polypeptide [49], were found in this study to be encoded in distinct but overlapping genes, designated as nodQ1 and nodQ2, respectively.
It has been proposed that the nature of the Nod factor acyl group attached by NodA can contribute to the determination of host range [53]. Interestingly, we found that, in addition to the nodA gene located adjacent to nodBC, two additional nodA genes are present in the tropici symbiotic plasmid. The amino acid identity between the three nodA products ranged from 72 to 78%, suggesting that they are not functionally redundant copies. Phylogeny of the three NodA proteins indicated that they cluster within a large group of sequences from rhizobia isolated mostly from tropical Mimosoideae legumes (Figure 4A), as previously noted for the R. leucaenae CFN 299 nodA gene [54]. Mimosoideae rhizobia may have broader host ranges including Papilionoideae and Faboideae legumes as is known for CIAT 899, PRF 81 and CFN 299. The two additional nodA genes from the tropici symbiotic plasmid, named nodA2 and nodA3, did not have close homologues in the databases (Figure 4A). The copy named nodA2 was part of a gene cluster including nodFE and hsnT (Figure 4B). The acyl carrier protein NodF and the β-acetoacetylsynthase NodE are required for the biosynthesis of α,β-unsaturated fatty acids present in Nod factors produced by some rhizobia and are determinants of host specificity [55]. hsnT encodes an uncharacterized acyltransferase that is unrelated to NodA proteins. It seems likely that the tropici pSym nodA2-hsnT-nodEF gene cluster directs the biosynthesis and incorporation of α,β-unsaturated acyl chains that have not been observed to date in Nod factors produced by CIAT 899 or CFN 299. The NodE protein encoded in the tropici pSym is most similar (Figure 5) to the corresponding protein of Sinorhizobium sp. strain MUS10, a symbiont of the tropical legume Sesbania rostrata [56]. Neighbor genes of the third copy, nodA3, did not provide any indication of its acyl chain specificity. Additional and divergent nodA genes likely expand the diversity of Nod factor acyl chains and may contribute to widening the host range in these strains.
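Percent amino acid identity values such as those reported for the three NodA proteins are computed from pairwise alignments. The Python sketch below shows one common convention, in which identical residues are counted over the aligned columns where neither sequence has a gap. Other conventions use a different denominator (for example, the full alignment length), and the convention used in this study is not stated here, so this is an assumption. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences.

    Columns in which either sequence has a gap are skipped; identity is
    the share of the remaining columns with identical residues.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Toy alignment: 4 identical residues out of 5 gap-free columns.
print(percent_identity("MKT-LV", "MRTALV"))
```

Because the result depends on both the alignment and the denominator, identity values are best compared only when they were produced with the same settings.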
LysR-family transcriptional regulators encoded by nodD genes mediate activation of nod-gene expression in response to plant-produced signal molecules (mainly flavonoids) [57,58]. Different NodD proteins vary in their responses to different sets of flavonoids and some strains have several nodD genes [3]. In CIAT 899, five nodD genes have been described [59] and here we confirmed their presence in the sequenced tropici pSyms. When nodD1, the copy adjacent to nodABC, is transferred to R. etli and R. leguminosarum sv. trifolii, it confers the ability to nodulate L. leucocephala and P. vulgaris, respectively [60]. Of the five nodD genes of CIAT 899, nodD1 is mainly responsible for nod-gene activation with various purified flavonoids, and root or seed exudates from common bean, L. leucocephala, and Macroptilium atropurpureum [59,61]. In agreement, a nodD1 mutant is unable to nodulate L. leucocephala and M. atropurpureum, and shows severely diminished nodulation of common bean [61]. This evidence suggests that the remaining nodD copies fulfill accessory roles in nodulation, at least with the three hosts tested to date. nodD2 and nodD3 were in close vicinity to nodA2-hsnT-nodEF and nodA3, respectively (Figure 4B), and may be responsible for their activation in response to signals from plants that require Nod factors with specialized acyl chains. An AraC-family transcriptional regulator gene was found located adjacent to nodD2 and may also participate in nodA2 regulation. Interestingly, nodD4 and nodD5 were in close proximity to gene clusters required for plant-hormone biosynthesis and nitrogen fixation, respectively, as discussed below.
The tropici pSym did not harbor nolO homologues that in other rhizobia are responsible for carbamoylation at the 3' (or 4') position of the non-reducing terminal glucosamine, indicating that Nod factors produced by our strains are carbamoylated only at the 6' position by NodU. Glycosyl residues attached at the reducing end of the CIAT 899 Nod factors include mannose [49] and fucose or methyl fucose, the latter two present only after growth under saline stress [62]. The mannosyl substitution has been described only in CIAT 899 and the genetic determinants are unknown. Fucosyl residues are present in a range of rhizobial strains harboring nodZ genes, and they can be methylated by the noeI gene product. We did not find homologues to nodZ or noeI in the tropici pSym, nor elsewhere in the CIAT 899 and PRF 81 genomes. A putative glycosyltransferase and methyltransferase were encoded in the tropici pSym though they are not related to any described nod gene.
A nodM homologue was identified in the tropici pSym. NodM is a paralogue of the housekeeping enzyme glucosamine synthase, GlmS, and seemingly provides adequate amounts of the unit blocks used by NodC for Nod factor biosynthesis. Symbiotic plasmids or islands in R. leguminosarum, S. meliloti, S. medicae and Mesorhizobium spp. also harbor paralogous glucosamine synthases. NodM is required for efficient nodulation of some hosts [63].
Nodulation factors produced by CIAT 899 are acetylated at the non-reducing end, a modification that, in R. leguminosarum and S. meliloti, is carried out by the pSym nodL gene product [3]. Although two acetyltransferase genes were located in the tropici pSym, they did not show significant similarities to nodL; moreover, one of those genes was disrupted by an IS in CIAT 899, but not in PRF 81 or CFN 299. On the other hand, a nodL-like gene located in the chromosomes of CIAT 899 and PRF 81 encodes a putative acetyltransferase 63% similar to R. leguminosarum 248 NodL, encoded in its pSym [64]. Genes noeJ and noeK, noeL (=gmd) and nolK (=fcl), required for biosynthesis of the activated sugar donors GDP-D-mannose and GDP-L-fucose, respectively, usually necessary for the biosynthesis of common surface [...]
CIAT 899 and PRF 81 chromosomes also encoded proteins that are highly similar (91%) to the transcriptional repressor NolR of R. leguminosarum. NolR-mediated repression of nod-gene expression after induction by NodD is required for optimal nodulation of some hosts [65]. Inspection of rhizobial genome sequences revealed that nolR homologues are located in chromosomes or chromids.

Nitrogen-fixation genes
The tropici pSym harbored a set of nitrogen-fixation (nif/fix) genes similar to those in other tropical microsymbionts, comprising a higher number of genes in comparison to those present in temperate rhizobial species [66]. nif/fix genes were organized in four clusters in the tropici pSym. The first cluster was preceded by nodD4 and contained nifHDK coding for the nitrogenase structural components, nifEN whose products function as scaffolds for FeMo-cofactor assembly, and nifX coding for a FeMo-cofactor binding protein [6]. These nif genes showed the highest similarity to the corresponding genes of R. etli and M. loti. It is noteworthy that, in no other rhizobial species, a nifHDKEN cluster is preceded by a regulatory nodD gene, and it remains to be established if the expression of these nif genes is dependent on this transcriptional regulator and is increased by specific flavonoids. The second gene group, located after the main nod-gene cluster, included nifTZ, fdxN, fixN, nifBA, fixXCBA, nifWSU and nifQ, which also showed the highest similarity to homologues in M. loti and R. etli. Although the function of some of these genes is not known (nifT, nifW), the remaining nif genes are involved in nitrogenase maturation (nifZ), biosynthesis of the FeMo-cofactor (nifB, S, U and Q), and transcriptional activation of nif genes (nifA) [67]. The fdxN and fix genes present in this cluster are involved in transfer of electrons to nitrogenase [68]. A distinct gene cluster included the fixNOQP operon encoding the cbb3-type terminal symbiotic cytochrome oxidase components, and fixGHIS whose products are thought to be involved in copper uptake and metabolism required for the terminal oxidase [69]. These fix genes showed strong similarities to the corresponding gene clusters in R. leguminosarum and R. etli genomes. 
The fixLJ and fixK genes, encoding regulatory elements that control the expression of the terminal oxidase under microaerobic conditions, were adjacent to each other in the pSym and showed higher identities with S. meliloti genes than with homologues from other Rhizobium strains. In CIAT 899, an IS21 was found inserted between fixLJ and fixK, although it did not alter the coding sequences. Affiliation of the nif/fix genes to different microorganisms indicates a high contribution of gene recruitment by horizontal transfer to the genesis of the tropici pSym. This observation is supported by the proximity of mobile genetic elements to several of the described gene clusters.
In contrast to symbiovar phaseoli pSym that has three nifH copies, a unique nifH gene is present in the tropici symbiotic plasmid as previously reported [13]. More than one nifH gene is present in Sinorhizobium sp. NGR 234, Azorhizobium caulinodans ORS 571, and Bradyrhizobium sp. BTAi1 and ORS278. Multiple gene copies can represent hot spots for recombination and genomic rearrangements [70], in some cases leading to loss of symbiotic properties [71]. On the other hand, repeated sequences can promote the generation of symbiotic amplicons where the number of nod or nif genes can be increased [72].
An inactive uptake-hydrogenase gene cluster in the tropici pSym
Uptake hydrogenases recycle part of the energy spent in the N2-fixation process through the oxidation of hydrogen (H2), whose obligatory production by nitrogenase consumes at least 25% of the reducing power invested in N2 fixation. Particularly in the 1970s and 1980s, searches for symbiotic partners expressing uptake hydrogenase were aimed at increasing plant yields by improving the energetic efficiency of the nitrogen-fixation process [73,74]. Among fast-growing rhizobia, only some strains of R. leguminosarum, R. tropici and Astragalus symbionts express hydrogenase activity [75]. In CIAT 899 and CFN 299, positive hybridization signals are observed with a probe containing hydrogenase genes; however, both strains have Hup− phenotypes [76]. Here we confirmed that the tropici pSym carries genes encoding an uptake hydrogenase (Figure 3) showing similarity (~70%) to R. leguminosarum homologues. Nevertheless, hupH and hupJ are truncated, and hupE, hupG and hupI are absent. The precise functions of hupGHIJ are not known, but they are required for maturation of the HupS hydrogenase subunit, and their combined absence or alteration in the tropici pSym likely explains the Hup− phenotype reported for CIAT 899 and CFN 299 [77]. The other missing gene, hupE, encodes a Ni transporter [78]. A hupE-like gene was located 31 kb away from the hydrogenase cluster in the tropici pSym, and CIAT 899 and PRF 81 chromosomes carried an additional hupE-like transporter. Introduction of a R. leguminosarum hupE-defective hydrogenase gene cluster under the control of a fixJ promoter induces only a weak Hup+ phenotype in CIAT 899 bacteroids [78], suggesting that the alternative Ni transporters could not fully replace the function of hupE.
CPS have been reported in rhizobia and agrobacteria [79,83]. In Sinorhizobium, CPS show structural analogies to group II K-antigens found in Escherichia coli and are, therefore, known as K polysaccharides (KPS) [79]. Although KPS have only been described in sinorhizobia, a first genome draft of PRF 81 revealed three genes for KPS biosynthesis [32]. Two chromosomally located gene regions, known as rkp-1 and rkp-2, as well as a pSymB-located rkp-3 region, are responsible for KPS biosynthesis in S. meliloti [79]. Here we found a cluster of genes homologous to rkp-1 in the chromosomes of CIAT 899 and PRF 81. Genes in this region seem to determine the biosynthesis (rkpA), modification and transport (rkpGHIJ) of a lipophilic molecule that may serve as a lipid carrier or anchor for KPS [84,85]. The only other Rhizobiaceae with an rkp-1 region is the S4 strain of Agrobacterium vitis [35], although the production of KPS by this strain has not been investigated. The rkpU gene, located upstream of rkpA in sinorhizobia and putatively involved in KPS export [86], is also conserved in CIAT 899, PRF 81 and S4. The sinorhizobial rkp-2 region is composed of lpsL and rkpK, genes that are involved in the biosynthesis of nucleotide-sugar precursors for KPS (rkpK) and LPS (lpsL and rkpK) [87]. In CIAT 899 and PRF 81, both genes were chromosomal, although the gene order is reversed in relation to S. meliloti. Similar rkp-2 regions are found in the genomes of other Rhizobium and Agrobacterium strains. In R. leguminosarum and R. rhizogenes, rkpK homologues have been implicated in the biosynthesis of CPS, EPS or LPS [88,89]. The sinorhizobial rkp-3 region contains genes involved in KPS polymerization and export, and also strain-specific genes that determine its sugar composition [79]. No matches to rkp-3 genes were found in CIAT 899 or PRF 81, indicating that the polysaccharide produced by these strains is different from that produced by sinorhizobia.
A cluster of 19 genes located in the CIAT 899 megaplasmid encodes functions putatively related to CPS biosynthesis. Recently, an orthologous syntenic cluster of R. rhizogenes K84 was postulated to be involved in the biosynthesis of a CPS different from the typical high-molecular-weight sinorhizobial KPS [89]. The putative K84 polysaccharide seems to be required for normal attachment and biofilm formation on an abiotic hydrophobic surface, but not on tomato roots [89].
The EPS secreted by CIAT 899 consists of an octasaccharide subunit made of glucose and galactose, decorated with acetyl and pyruvyl groups [90]. The CIAT 899 EPS is structurally similar to succinoglycan (EPS I) produced by S. meliloti [90]. The exo genes of CIAT 899 were organized in a large cluster located in the megaplasmid. A homologous exo cluster was located in a putative megaplasmid contig of PRF 81 with gene products displaying identities between 82% and 98% with those of CIAT 899. The exo gene clusters of these strains differed in that exoH has become a pseudogene in CIAT 899. In S. meliloti, ExoH is responsible for the addition of the succinyl group to EPS I [91]. This is consistent with the absence of succinyl groups in the CIAT 899 EPS [90], and indicates that PRF 81 produces succinylated EPS. Inoculation of alfalfa with a S. meliloti exoH mutant results in ineffective nodules without intracellular bacteria [91]. EPS is not required for nodulation of common bean by CIAT 899, but may contribute to competitiveness [92]. As succinylation of EPS is important for normal nodulation of some hosts, differences in host range between CIAT 899 and PRF 81 may be expected. The arrangements of exo genes in both strains were highly similar to that reported for S. meliloti 1021 [93], except for the absence of exoI and exoT in CIAT 899 and PRF 81. CDS with similarities to exoI and exoT of S. meliloti 1021 were found elsewhere in the genomes of both strains.
LPS represent the major component of the outer leaflet of the outer membrane of Gram-negative bacteria. LPS are composed of three structural domains: lipid A, core oligosaccharide and the O-antigen polysaccharide [94]. Rhizobium lipid A molecules typically contain a secondary very long acyl chain (VLAC) at the 2' position [95]. This VLAC seems to confer stability to the membrane and is required for the establishment of a normal or fully effective symbiosis [96,97]. As in other rhizobia, lipid A of CIAT 899 harbors 27-hydroxyoctacosanoic acid as VLAC; however, it also contains 29-hydroxytriacontanoic acid [98]. The acpXL and lpxXL genes encoding the specialized acyl carrier protein and transferase proteins required for incorporation of VLAC into lipid A, as well as four other genes responsible for the biosynthesis of this substituent, were found in the chromosomes of CIAT 899 and PRF 81 arranged in a syntenic cluster that is also conserved in other rhizobia [96]. Putative LpxE and LpxF phosphatases, and an LpxQ oxidase were also found encoded in the chromosomes of CIAT 899 and PRF 81, suggesting that lipid-A molecules produced by these strains lack 1 and 4' phosphate groups, and have a 2-amino-2-deoxy-gluconate moiety as described for the lipid As of R. leguminosarum and R. etli [95]. Dephosphorylation of lipid A in R. etli seems to confer resistance to antimicrobial peptides but is not required for symbiosis [82]. Glycosyltransferases, 76% identical to the core oligosaccharide mannosyltransferase LpcC from R. leguminosarum [99], were found encoded in the chromosomes of CIAT 899 and PRF 81. This is consistent with the observation that a mutation in noeJ, responsible for the biosynthesis of GDP-mannose, causes the production of LPS molecules with a truncated core in CIAT 899 [30].
No matches to other known genes involved in rhizobial core oligosaccharide biosynthesis were found in CIAT 899 and PRF 81, suggesting that sugar composition of their core oligosaccharides may be different from studied rhizobial strains. The O-antigen is the distal portion of the LPS molecule and can be highly variable from strain to strain [94]. In CIAT 899, the O-antigen units are composed of D-glucose, acetylated 6-deoxy-D-talose and L-fucose [100]. Two loci affecting O-antigen biosynthesis have been reported in CIAT 899 [30]. Both loci were mapped here in the chromosome. The first one includes a nucleotide sugar dehydratase gene, lpsβ2, putatively involved in UDP-D-QuiNAc biosynthesis, the donor of acetylquinovosamine that in other bacteria is the sugar linking the O-antigen and the core oligosaccharide [30]. An orthologue of lpsβ2 was found in the chromosome of PRF 81. The second CIAT 899 locus contains the wzm-wzt genes encoding an ABC-type O-antigen transporter [30]. Interestingly, genome sequencing revealed that wzm-wzt were part of a larger locus encompassing 22 genes predicted to be involved in polysaccharide biosynthesis. This gene cluster was located next to a tRNA-Gln gene, and contained the remnants of a traG gene and an IS transposase, suggesting that it was acquired by lateral gene transfer. Two genes in this cluster encoded a putative sugar transferase for the initiation of O-antigen biosynthesis and a putative WaaL O-antigen ligase. The WaaL protein of CIAT 899 showed hydropathy plots and conserved residues similar to characterized O-antigen ligases (Additional file 3A, B and C). To test the involvement of the CIAT 899 waaL gene in O-antigen biosynthesis, an insertional mutant was constructed.
The mutant produced rough LPS molecules with normal migration behavior, indicating that lipid A and core oligosaccharide were not affected, but did not produce smooth LPS molecules (those carrying O-antigen), in agreement with the predicted function of WaaL as an O-antigen ligase (Additional file 3D). The mutant complemented in trans with a wild-type copy of waaL regained the ability to produce smooth LPS molecules (Additional file 3D). Interestingly, a putative O-antigen biosynthesis gene cluster was found in a similar location next to a tRNA-Gln gene in the PRF 81 chromosome. Homologues of wzm and wzt, as well as a putative waaL gene were found in this cluster, but the remaining genes were different from those of CIAT 899. This is consistent with previously reported LPS-profile differences between CIAT 899 and PRF 81 [15]. Although the exact role of O-antigen during rhizobium-legume interactions is unknown, abnormal symbioses are elicited by mutants lacking this LPS domain, with the effects being more drastic when a legume forming determinate nodules, such as common bean, is involved [79]. Recently, an intact O-antigen was shown to be required for normal endophytic colonization of maize by CIAT 899 and evidence for a possible role of LPS as a protectant against maize lipophilic antimicrobial compounds was presented [30].

Secretion and plasmid-transfer systems
The conserved general secretion (sec) and twin-arginine translocation pathways responsible for the majority of protein export into the periplasm were identified in the CIAT 899 and PRF 81 genomes. Exoprotein secretion through the Gram-negative bacterial envelope is accomplished by several classes of transport system. Some of them, such as the type II and V systems, depend on the activity of the sec pathway for delivery of their substrate proteins into the periplasm, whereas the remaining systems (type I, III, IV, VI, VII) are able to pick proteins directly from the cytosol [101]. Secreted proteins can fulfill general functions but may also be specifically required for interaction with eukaryotic hosts or with other bacteria [101]. In rhizobia, type I, III, IV and VI secretion systems have been shown to be involved in symbioses with legume hosts [4,102]. Only type I, IV and V systems were identified in the genomes of CIAT 899 and PRF 81.
Protein secretion by the type I secretion system (T1SS) occurs through an oligomeric protein channel composed of an inner membrane ATP-binding cassette (ABC) protein, a largely-periplasmic membrane fusion protein (MFP), and a pore-forming outer-membrane protein (OMP) [103]. The ABC- and MFP-encoding genes are usually arranged in an operon, whereas the OMP component may be encoded by a dedicated adjacent gene, or by tolC encoding a common OMP able to interact with several transporters [104]. T1SS protein substrates typically contain carboxy-terminal, glycine- and aspartate-rich repeats known as repeat-in-toxin (RTX) [105] and are often encoded close to ABC and MFP genes. Two and three putative T1SSs were identified in the genomes of CIAT 899 and PRF 81, respectively, all located in their megaplasmids. A T1SS shared by both strains was composed of an ABC-MFP gene operon followed by a divergently oriented gene encoding a protein 83% similar to the endoglycanase ExsH of S. meliloti. ExsH contributes to the production of low-molecular-weight EPS I, but is not involved in symbiosis [106]. The second T1SS of CIAT 899 comprised ABC and MFP genes and a putative T1SS substrate gene. The ABC and MFP components were highly similar (≥79%) to RspD and RspE required for rhizobiocin secretion in R. leguminosarum [107]; nevertheless, the CIAT 899 T1SS substrate protein was unrelated to known bacteriocins and did not show any conserved domain, besides RTX. The second and third T1SS of PRF 81 were composed of ABC-MFP and OMP-ABC-MFP genes, respectively, encoding proteins without close homologues in the databases, and did not have nearby genes encoding T1SS substrate proteins. The only additional identifiable T1SS substrate of PRF 81 possessed copies of the VCBS repeat domain, suggesting that this protein acts as an adhesin.
Other known rhizobial T1SS-exported proteins that do not possess the typical RTX motif, like the glycanases PlyA, PlyC and Egl, and the rhizobial adhering (Rap) proteins, did not have counterparts in the genomes of CIAT 899 or PRF 81.
The type V secretion system (T5SS) allows secretion of large proteins that act as virulence factors [108]. Three subclasses of T5SS are recognized. The autotransporter (AT) subclass consists of multidomain proteins, including a passenger domain that is the functional secreted moiety, and a pore-forming β-barrel domain that mediates secretion through the outer membrane. The trimeric autotransporter (TAA) subclass proteins contain only one third of the β-barrel domain, so they must form trimers in order to be secreted. In the two-partner system (TPS) subclass, the passenger and β-barrel domains are encoded in distinct polypeptides called TpsA and TpsB, respectively [108]. The genomes of CIAT 899 and PRF 81 each encoded two TPS systems and one TAA system. The TPS systems of both strains were related, as each TpsB protein has a close homologue (>80% identical) encoded in the other strain, but the cognate TpsA secreted proteins have diverged (<60% identity). TpsA proteins of both strains were putative filamentous hemagglutinins (FHA). FHA act as adhesins and are virulence factors of animal [109] and plant [110] pathogens, but a possible role in symbiotic relationships has not been determined. The second subclass of T5SS systems found in the CIAT 899 and PRF 81 genomes encoded TAA proteins 69% identical to each other and which were similar to YadA-like adhesins. YadA is a virulence factor of Yersinia enterocolitica that mediates autoagglutination, adhesion to host cells and also protects the bacterium against complement and defensin lysis [111]. YadA-like adhesins have not been characterized in the context of plant-microbe interactions.
Type IV secretion systems (T4SS) are able to transfer protein or nucleoprotein complexes across membranes [112]. We found T4SSs in two of the CIAT 899 plasmids and in all four PRF 81 plasmids. Two P-type T4SS were identified in PRF 81, one located in pPRF81a and the other in the megaplasmid. The T4SS located in pPRF81a was composed of virB/virD4 genes and is most similar to the T4SS of plasmid pAtK84c of R. rhizogenes K84 [113]. The virB/virD4 system of Agrobacterium tumefaciens is responsible for transfer of tumorigenic DNA (T-DNA) into plant cells [114], while in M. loti R7A, a VirB/VirD4 system acts in the translocation of effector proteins into host cells, affecting the symbiosis in a host-dependent manner [115]. We did not find homologues to the two-component regulatory system VirA/VirG that controls expression of virB/virD4 systems in A. tumefaciens and M. loti, nor to M. loti T4SS effector proteins [115,116]. In contrast, a traA gene encoding a relaxase/nuclease, a component of the DNA transfer and replication (Dtr) system that recognizes and cleaves at the origin of transfer during conjugation, was found 10.4 kb downstream of virD4, indicating that the T4SS of pPRF81a is involved in plasmid conjugation rather than protein secretion. The P-type T4SS of the PRF 81 megaplasmid was similar to conjugation machineries present in R. etli CFN42 pSym, S. meliloti 1021 pSymA, and A. tumefaciens C58 pAt plasmid [117], and included the regulatory gene rctA that is required for repression of conjugative transfer [118]. A traA-traCDG Dtr system was located 13.4 kb away from the PRF 81 megaplasmid rctA-virB cluster; nevertheless, the traA gene is truncated, suggesting that this conjugation system may not be functional.
F-type tra/trb T4SSs were identified in the tropici pSym, in pRtrCIAT899a, and in pPRF81b. Sequence analysis of individual tra/trb genes indicated that these T4SS were closely related to various conjugation systems, including those present in R. etli CFN42 p42a and S. fredii GR64 p64a plasmids [119]. pRtrCIAT899a is self-transmissible (our own unpublished data), whereas pPRF81b may have lost this ability as it lacks traCDG genes. All F-type T4SSs identified in the CIAT 899 and PRF 81 genomes were adjacent to repABC genes, and included traI, traR and traM genes, indicating that they are regulated by quorum-sensing mechanisms involving N-acyl homoserine lactones [119]. Interestingly, an IS256 inserted upstream of traR in the CIAT 899 pSym may have disrupted its promoter leading to a constitutive repression of the conjugation system of this plasmid.

Iron uptake
Iron is an essential nutrient that is not readily available under normal conditions, because it is present mostly as insoluble forms. Thus, the efficient acquisition of this element may improve bacterial survival and confer competitiveness [120]. Under Fe-replete conditions, bacteria repress the expression of Fe-uptake systems through special transcriptional regulators. CIAT 899 and PRF 81 have genes for RirA that has replaced Fur as the master regulator of iron-responsive genes in the Rhizobiaceae; and Irr, a second, minor-acting Fe-responsive regulator present only in the Rhizobiales and Rhodobacterales [121]. Both strains also possess a fur homologue. In rhizobia, fur gene products do not regulate Fe-uptake systems; instead they act as regulators of Mn transporters and the genes have been renamed as mur [121].
To capture iron, bacteria and fungi can produce siderophores, high-affinity low-molecular-weight ligands. CIAT 899 possesses a hydroxamate siderophore-biosynthesis gene cluster, some genes of which are similar to the vicibactin vbs genes of R. leguminosarum [122]. No genes for siderophore synthesis were identified in PRF 81, suggesting that this strain is adapted to obtain chelated iron forms from external sources, a strategy used by the endophytic bacterium Azoarcus sp. BH72 [123]. External siderophore-iron complexes are recognized and translocated to the periplasm by outer membrane TonB-dependent receptors, and once in the periplasm they are internalized into the cytoplasm by ABC-type transporters. Bacteria not producing siderophores can utilize "xenosiderophores" by making appropriate receptors and transporters. We found three and one siderophore-receptor genes in CIAT 899 and PRF 81, respectively, and five ABC transporters of the siderophore type encoded in each strain genome. A CIAT 899 receptor/transporter pair similar to several ferric hydroxamate utilization (Fhu) systems was encoded near the siderophore-biosynthesis genes. The second receptor of CIAT 899 was located adjacent to one siderophore transporter and was 55% similar to the anguibactin catechol-siderophore receptor FatA from Vibrio anguillarum [124]. The third CIAT 899 receptor was 85% similar to the single PRF 81 siderophore receptor, and both were 59-62% similar to ShmR of S. meliloti, a receptor required for heme utilization [125]. CIAT 899 and PRF 81 shared a transporter highly similar to the Hmu system of R. leguminosarum that is involved in heme utilization as an iron source [126].
We found three ABC transporters of the ferric ion (Fe3+) type in the genomes of both strains. Fe3+ may be more available in tropical acid soils where CIAT 899 and PRF 81 evolved than in neutral soils, thus explaining the presence of these transporters. Citrate can act as a weak siderophore and may capture enough iron to sustain growth in acid soils [127]. CIAT 899 and PRF 81 possess a citrate synthase gene in their pSyms. We have observed that the citrate synthase gene located in the tropici pSym is induced under low-iron conditions [128], suggesting that both strains do, indeed, use citrate as a siderophore. In contrast to other siderophores, ferric citrate utilization in rhizobia does not seem to require a TonB-dependent receptor [126]. CIAT 899 and PRF 81 have a gene encoding a bacterioferritin used for intracellular iron storage.

Phytohormone production
Several plant hormones, such as ethylene, auxins and gibberellins, have been reported to influence root colonization and distinct stages of nodule development [129]. We found genes involved in plant-hormone metabolism in CIAT 899 and PRF 81 genomes. An acdS gene, with high identity (~84%) to homologues located in symbiosis islands of mesorhizobial strains R7A and MAFF303099, was found in the tropici pSym (Figure 3B). acdS encodes 1-aminocyclopropane-1-carboxylate (ACC) deaminase which degrades ACC, the immediate precursor of ethylene in higher plants [130]. Ethylene inhibits rhizobial infection [129] and it has been shown that strains engineered to overexpress ACC deaminase activity have enhanced symbiotic proficiency [131,132]. The acdS gene of R. leguminosarum strain 128C53K is regulated by a leucine-responsive regulatory protein that is encoded by the lrpL (acdR) gene positioned upstream of acdS [133]. In the vicinity of the tropici pSym acdS, a gene encoding a two-component transcriptional regulator was found that may have a role in its regulation.
Many plant-associated bacteria, including rhizobia, synthesize auxins, in particular indole-3-acetic acid (IAA) [134]. Genomic analysis suggests that IAA biosynthesis may proceed through two and three tryptophan-dependent pathways in CIAT 899 and PRF 81, respectively. In the indoleacetamide pathway, tryptophan is converted to IAA in two consecutive reactions catalyzed by tryptophan monooxygenase (IaaM) and indoleacetamide hydrolase (IaaH). A putative monooxygenase gene, distantly related to IaaM from A. vitis [135], was found encoded in the tropici pSym. An iaaH gene encoding a protein 47% similar to IaaH from A. vitis could be found only in the PRF 81 megaplasmid, indicating that the indoleacetamide pathway is operative in this strain but not in CIAT 899. In a second pathway, indole-3-acetonitrile can be directly converted to IAA by a nitrilase, or by the action of a nitrile hydratase it can be converted to indoleacetamide that is then metabolized by IaaH. Genes encoding proteins with similarities to nitrilase and two subunits of a nitrile hydratase were found in the chromosomes of CIAT 899 and PRF 81 strains. The third IAA biosynthesis route, the indolepyruvate pathway, requires tryptophan transferase and indole-3-acetaldehyde oxidase proteins. Homologues to Sinorhizobium sp. NGR234 genes encoding those proteins were located in the tropici pSym (Figure 3B). The NGR234 genes are flavonoid-inducible although they are not required for nodulation [136]. Interestingly, the transcriptional regulator nodD5 gene is located close to the IAA genes in the tropici pSym, suggesting that it may regulate their transcription in response to plant flavonoids. To test this hypothesis, IAA production was quantified in R. leucaenae CFN 299 and in its 200-kb pSym deleted mutant CFN 299-10 when exposed to various flavonoids.
We observed that IAA production was enhanced by naringenin, genistein, luteolin, chrysin and apigenin in CFN 299, but not in CFN 299-10 (Additional file 4), clearly indicating that a flavonoid-inducible IAA biosynthesis system was encoded in the tropici pSym.
An operon involved in gibberellin biosynthesis has been identified in the B. japonicum genome [137]. The tropici pSym possessed a cluster of genes encoding three putative cytochrome P450s, a sterol dehydrogenase-like enzyme, a geranylgeranyl diphosphate synthase, an ent-copalyl diphosphate synthase, and an ent-kaurene synthetase showing >87% identity to the B. japonicum homologues. We observed that similar gene clusters are present in the symbiosis island of M. loti and in the pSyms of Sinorhizobium sp. NGR 234 and R. etli CFN 42.

Resistance to antimicrobials
One distinguishing phenotype of R. tropici CIAT 899 is its high resistance to several antimicrobial compounds, including antibiotics and pesticides, in comparison to other rhizobia nodulating common bean, such as R. etli and R. leucaenae [14,28]. PRF 81 is as resistant as CIAT 899 to many of these antibiotics (Additional file 2). CIAT 899 also shows high resistance to other types of antimicrobial compounds, such as the fungicides Thiram and Captan that diminish the survival of rhizobia in seed-applied inoculants [29,30]. Resistance to antimicrobial compounds can be conferred by the action of efflux pumps that transport those compounds outside the cells [138]. We performed a systematic search for genes encoding drug-efflux pumps in the genomes of CIAT 899, PRF 81, other rhizobia, and in the genome of Burkholderia cenocepacia J2315. This latter strain was included as an example of a multidrug-resistant bacterium [139]. Interestingly, the genomes of CIAT 899 and PRF 81 had more genes encoding putative efflux pumps than the majority of other rhizobia (Table 2). B. japonicum USDA 110 was the only rhizobial strain with a similar number of genes, although when normalized against genome size, CIAT 899 and PRF 81 genomes showed the highest density of these transporters per Mb. Similarly, B. cenocepacia had more genes than CIAT 899 and PRF 81, but not necessarily at higher density (Table 2). The abundance of efflux pumps in their genomes may explain the higher resistance of CIAT 899 and PRF 81 to many antimicrobials in comparison to other rhizobia.
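The normalization against genome size described above can be sketched as follows. This is a minimal illustration only: the strain labels, genome sizes and gene counts are hypothetical placeholders, not the values reported in Table 2, and the `GenomeCount` class and `rank_by_density` function are names introduced here for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GenomeCount:
    """Efflux-pump gene count for one genome (all values hypothetical)."""
    strain: str
    genome_size_mb: float  # total genome size in megabases
    efflux_genes: int      # number of putative efflux-pump genes

    @property
    def density_per_mb(self) -> float:
        # Gene count normalized by genome size (genes per Mb).
        return self.efflux_genes / self.genome_size_mb


def rank_by_density(genomes):
    """Return genomes sorted from highest to lowest efflux-gene density."""
    return sorted(genomes, key=lambda g: g.density_per_mb, reverse=True)


# Placeholder data, NOT taken from Table 2: a smaller genome with fewer
# genes in absolute terms can still carry the highest density per Mb.
example = [
    GenomeCount("strain_A", genome_size_mb=6.0, efflux_genes=120),
    GenomeCount("strain_B", genome_size_mb=9.0, efflux_genes=126),
    GenomeCount("strain_C", genome_size_mb=8.0, efflux_genes=144),
]

ranked = rank_by_density(example)
for g in ranked:
    print(f"{g.strain}: {g.efflux_genes} genes, {g.density_per_mb:.1f} per Mb")
```

With these placeholder numbers, the genome with the fewest efflux genes ranks first by density, which mirrors the kind of contrast drawn between absolute gene counts and per-Mb density in the comparison with larger genomes.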
The activity of enzymes that modify or break down antibiotics represents another mechanism for resistance [140]. We found 15 and 17 genes encoding proteins of the β-lactamase family in the genomes of CIAT 899 and PRF 81, respectively. Some of them may explain the resistance of both strains to carbenicillin. Both genomes also encoded one or two aminoglycoside phosphotransferases, aminoglycoside acetyltransferases, and streptomycin kinases. Additionally, the presence of one and two genes for putative chloramphenicol acetyltransferases in CIAT 899 and PRF 81, respectively, may contribute to their known resistance to chloramphenicol (Additional file 2).
Soil can be an inimical environment and the production of antimicrobials represents a strategy for clearing niches of competing microorganisms or limiting their growth [141]. In this context, intrinsic resistance to antimicrobials may be advantageous for survival and thriving in the soil [142]. Bacteria colonizing plant tissues are also challenged with a wide range of antimicrobials produced by plants as defense mechanisms [143]. Even mutualistic symbionts like rhizobia are exposed to these compounds when interacting with their legume partners [144,145], and studies suggest that efflux pumps can protect rhizobia and thus promote effective nodulation of some hosts [146,147]. Therefore, the large number of efflux pumps of CIAT 899 and PRF 81 may contribute to the broad host ranges of these bacteria by allowing them to repel the antimicrobial compounds produced by legumes.

Other genes involved in competitiveness
Pili or fimbriae are proteinaceous, filamentous surface appendages that are involved in bacterial surface attachment and virulence [148]. CIAT 899 and PRF 81 genomes possessed a cluster of genes for pili assembly through the chaperone/usher pathway. Several rhizobia produce pili [149] that are involved in root attachment and nodulation [150]. Both strain genomes also harbored a gene cluster for the biosynthesis of type IV pili. These pili are involved in host-cell attachment, biofilm formation, and also in twitching motility due to their capacity to retract [148]. Type IV pili have not been studied in rhizobia although they are involved in plant colonization in the N2-fixing bacterium Azoarcus sp. [151]. Although twitching motility has not been reported in CIAT 899 or PRF 81, both strains display swimming motility, a characteristic that may contribute to efficient host colonization [152]. A flagellum-biosynthesis gene cluster adjacent to a chemotaxis gene cluster was encoded in both strain genomes.
Opines are compounds produced by plant tumors or hairy roots induced by pathogenic Agrobacterium species. The genes required for opine synthesis are contained in the T-DNA that agrobacteria transfer to the plant cells. Opines serve as carbon and nitrogen sources for the bacteria. In a survey of Agrobacterium and rhizobial species, it was reported that rhizobia, with the possible exception of nopaline use by B. japonicum, were unable to use opine or nopaline as carbon and nitrogen sources [153]. Interestingly, we identified genes for the uptake of octopines (occMPQT) in CIAT 899 and PRF 81 genomes, and nopaline (nocTMQ) in PRF 81. We have previously observed that CIAT 899 is able to use octopine as a source of carbon and nitrogen (E. Martínez-Romero, unpublished data). This unusual characteristic of CIAT 899 and PRF 81 may be used in conjunction with plants engineered to produce opines as a strategy to improve competitiveness of inoculant strains against indigenous rhizobial populations [154].
The ability to catabolize root-exudate sugars has also been shown to be an advantageous trait for root colonization or nodulation. CIAT 899 and PRF 81 megaplasmids encoded an orthologue of the rhamnose-uptake and catabolism locus described in R. leguminosarum and required for competitive nodulation of clover (Trifolium repens) [155]. Likewise, CIAT 899 and PRF 81 chromosomes included orthologues of the iolDEB genes involved in inositol catabolism and nodulation in R. leguminosarum [156]. Both strain genomes carry two inositol ABC transporters, one in the megaplasmid and another in the chromosome. It was previously shown that CIAT 899 carries teu genes in pRtrCIAT899a which encode a putative sugar-uptake ABC transporter that is induced by root exudates of common bean and M. atropurpureum, and that is required for nodulation competitiveness [157]. We now report that the teu locus is present in the pPRF81b plasmid.
An orthologue of the transcriptional regulator RosR of R. etli CFN 42 was found encoded in the chromosomes of both strains. RosR belongs to a family of transcriptional regulators like MucR of S. meliloti and Ros of Agrobacterium species that are conserved in the Rhizobiaceae and that influence the expression of genes involved in nodulation or virulence [158]. CIAT 899 and PRF 81 genomes encoded eight homologues of the Bartonella bacilliformis ialB gene. IalB is required for erythrocyte adherence and invasion by bartonellas [159]. Two loci required for cellulose synthesis found in each genome may be involved in root attachment through cellulose fibrils as in R. leguminosarum [160]. A cellulase, 70% identical to CelC2 of R. leguminosarum ANU843, and an endoglucanase, 40% identical to EglA of Azoarcus sp. BH72, encoded in both strain genomes may promote host-cell entry and dissemination by localized cell-wall degradation [161,162]. Vitamin prototrophy may confer an advantage for competitive rhizosphere colonization [163]. Except for biotin, CIAT 899 and PRF 81 harbor similar complements of genes required for vitamin biosynthesis. CIAT 899 lacks biotin-biosynthesis genes, whereas PRF 81 harbored bioBFDAZ genes in its megaplasmid. Both strains may use external biotin through the action of transport systems encoded by bioMN and bioY. Although CIAT 899 appears to be a biotin auxotroph, it may be highly efficient at scavenging biotin as it possesses an additional BioY transporter.

pH stress
CIAT 899 and PRF 81, considered acid-resistant strains [13,15], may face low-pH stress in acid soils and also in the rhizosphere and inside the symbiosome. Rhizobial mechanisms involved in survival and growth under acidic conditions are not well understood. In CIAT 899, a locus including the low-pH-inducible lpiA and the acid tolerance and virulence atvA genes is up-regulated by acidic pH [27]. This locus was found to be conserved in PRF 81 and in many acid-tolerant, as well as acid-sensitive, rhizobia and agrobacteria. lpiA is required for the synthesis of lysyl-phosphatidylglycerol, which confers resistance to antimicrobial peptides and is required for nodulation competitiveness, although not for acid resistance [164]. AtvA and its orthologue AcvB of A. tumefaciens are required for acid tolerance [27], but their function is unknown. Based on similarities to lipases, it has been proposed that AtvA/AcvB is involved in membrane lipid metabolism required for adaptation to acid conditions [27]. Indeed, cell-envelope modifications are common in bacteria exposed to acid stress [165-167]. In CIAT 899, acidity induces an increase in the amount of ornithine lipids (OL) in the membrane and the presence of a hydroxylated OL species produced by OlsC seems to be important for acid resistance [26]. Some Proteobacteria, including PRF 81, carry olsC genes and their presence may correlate with increased tolerance to acidic conditions [26,27]. An eptA homologue encoding a putative lipid A phosphoethanolamine transferase was found in CIAT 899 and PRF 81. In E. coli and Salmonella typhimurium, eptA is induced under mildly acidic conditions and it has been shown that eptA confers acid resistance in Shigella flexneri 2a [168]. eptA homologues are present in R. rhizogenes K84, agrobacteria and sinorhizobia, but not in other Rhizobium species. Cyclopropane-containing fatty acids are thought to improve acid tolerance by reducing membrane permeability to H+ [169].
CIAT 899 and PRF 81 chromosomes possessed two genes encoding cyclopropane-fatty-acyl-phospholipid synthases (cfa). In addition, CIAT 899 possessed three further cfa genes, one in its megaplasmid and two in pRtrCIAT899a. EPS production has been correlated with acid tolerance and exo genes are among the most induced in S. meliloti and A. tumefaciens after an acid shock [170-172]. CIAT 899 and PRF 81 produce copious amounts of EPS, a characteristic that may contribute to their resistance to acid. Under acidic stress, some bacteria can reverse membrane potential (normally negative inside and positive outside) by accumulating positively charged molecules, such as potassium ions (K+), in order to repel protons and slow down H+ influx [173]. The CIAT 899 and PRF 81 genomes encoded a homologue of kcsA, a Streptomyces lividans ion channel that opens at low pH allowing K+ entrance into the cell. Riccillo et al. [24] found that the glutathione synthase gshB gene is required for CIAT 899 tolerance to acidic stress, its activation occurring under low-pH conditions [174]. The inability of a CIAT 899 gshB mutant to grow in low-pH media was related to a diminished capacity to accumulate K+ at low pH, and it was proposed that a KefB/KefC glutathione-regulated K+ efflux transporter may be too active in the absence of glutathione [24]. In support of that hypothesis, we found a putative kefB/kefC homologue in the genome of CIAT 899 and a highly similar gene in the PRF 81 genome.
Prokaryotic ClC-type chloride channels function as H+/Cl- antiporters, able to extrude H+ under low-pH conditions [175]. Interestingly, we found four chromosomal genes for this type of antiporter in CIAT 899 and PRF 81. These paralogous genes probably make different contributions to pH homeostasis, as a CIAT 899 mutant in one channel (sycA) was apparently not impaired in growth at pH 4.5, but rather showed symbiotic defects [25].
Enterobacteria possess efficient systems to cope with extremely acidic conditions that rely on H+ consumption by decarboxylation of externally-supplied amino acids and on the action of antiporters that expel the decarboxylated products in exchange for new amino acid substrates [173]. CIAT 899 and PRF 81 seem to lack these systems as we did not find orthologues to the amino acid decarboxylase or antiporter genes used by enterobacteria. pH homeostasis in alkaline conditions relies on the action of antiporters that direct proton influx to reduce internal pH in exchange for the monovalent cations Na+ or K+, depending on their availability [173]. Homologues of the multicomponent pha1 [176] and pha2 [177] cation/H+ antiporter systems described in sinorhizobia for alkaline pH adaptation were found in CIAT 899 and PRF 81. The CIAT 899/PRF 81 pha2 system may also be involved in the tolerance of these strains to high NaCl concentrations, as observed in sinorhizobia [177]. Both strain genomes carried additional cation/H+ antiporters of the monocomponent type (two in CIAT 899 and one in PRF 81) that may contribute to pH homeostasis or resistance to strongly saline conditions.

Temperature stress
CIAT 899 as well as PRF 81 can grow at temperatures up to 40°C and this characteristic is thought to be important for their success as tropical inoculant strains [13,15]. Several proteins, collectively known as heat-shock proteins (HSPs), are induced after exposure to high temperatures. In E. coli, transcription of many genes encoding HSPs is under the control of the alternative sigma factor RpoH. Proteobacterial species commonly have more than one rpoH homologue, such as two in R. etli, S. meliloti and Brucella melitensis, and three in B. japonicum. Usually, one of the rpoH genes is mainly responsible for coping with heat shock, whereas the others may complement the response or be involved in responses to other stresses. CIAT 899 and PRF 81 possessed two chromosomal rpoH genes, one of which is a close relative to rpoH1 of R. etli and S. meliloti, and to rpoH2 of B. melitensis, responsible for heat-shock responses in those bacteria. In many bacteria, like B. japonicum, the positive control of the heat-shock response by RpoH is combined with a negative control by the HrcA repressor [178]. One putative hrcA gene was found in CIAT 899 and PRF 81, although it may have a restricted role, as has been shown for hrcA in A. tumefaciens [179]. Common HSPs involved in protein folding such as DnaK-DnaJ-GrpE composing the DnaK chaperone system, and GroEL-GroES composing the GroE chaperonin machinery were found encoded in the chromosomes of CIAT 899 and PRF 81, whereas the HtpG chaperone is encoded in their megaplasmids. In contrast with other rhizobia, a single groEL-groES operon was found in CIAT 899 and PRF 81. Both strains possessed several homologues of the ibpA/B genes encoding small HSPs of the HSP20 family involved in reversing protein aggregation induced at high temperatures [180]. Interestingly, two of the small HSP genes resided in the tropici symbiotic plasmid, suggesting a possible role during symbiosis. 
Four homologues of the high temperature requirement HtrA protease/chaperone, required for degradation of misfolded and mislocalized cell-envelope proteins [181], were encoded in both strains.
Several htrA homologues are also present in the genomes of other bacteria [181]. In S. meliloti and Brucella abortus, mutations in one htrA paralogue have only a small effect on growth at high temperatures [182,183], probably due to functional redundancy. Other common components of the heat-shock response, such as the Clp and Lon protease systems and the translation elongation factor LepA, were also found in the CIAT 899 and PRF 81 genomes. The differential expression of several PRF 81 proteins under high-temperature stress, such as DnaK, GroEL, and the translation factors EF-Tu, EF-G and IF2, was recently demonstrated [184]. It is noteworthy that some HSPs also play protective roles during other stressful conditions, like the DnaK machinery during hyperosmotic salt stress [185] and HtrA in oxidative damage [183].
As observed during heat shock, cells respond to cold shock by producing proteins known as cold-shock proteins (CSPs) to overcome the deleterious effects of low temperatures [186,187]. Homologues to the CSPs-coding genes cspA, cspB, cspC and cspG were present in duplicate in CIAT 899 and PRF 81 except for cspB, with only one copy in both strains, and cspC, of which there was only one copy in PRF 81. CspA, the major CSP in E. coli, is an RNA chaperone thought to facilitate translation by destabilizing mRNA secondary structures formed at low temperatures. CspA and the homologous RNA chaperones CspE and CspC are also transcription antiterminators [186]. Although the transcriptional activation of a cspA homologue in S. meliloti upon cold shock has been reported [187], its involvement, or of any other rhizobial CSP gene, in growth at low temperatures has not been documented.

Osmotolerance
Bacteria exposed to osmotic stress react with mechanisms to avoid leakage of intracellular water in hypertonic environments, such as those with high concentration of salts; or to prevent cell burst due to excessive water influx in hypotonic environments [188]. K+ uptake after an osmotic upshift is a rapid response to preserve cell turgor [189]. CIAT 899 and PRF 81 genomes encoded a chromosomally-located K+-uptake protein (kup) gene specifically required for growth under highly osmotic conditions [185,189]. An additional kup gene, found in the PRF 81 megaplasmid, was similar to uncharacterized homologues harbored by Bradyrhizobium and Methylobacterium. The conserved KdpABC K+-transporting ATPase required for osmoadaptation to moderate concentrations of ionic solutes and to normal K+ homeostasis was found encoded in the megaplasmids of both strains [189]. Trk, the main system involved in K+ accumulation after an osmotic upshift in S. meliloti [189], was not found in the genome of either strain.
Biosynthesis and intracellular accumulation of molecules known as compatible solutes, such as trehalose, glycine betaine, proline betaine or ectoine, are common responses under high osmolarity stress conditions [190]. Exogenous compounds like choline or choline sulfate can be taken up by the cell and act as osmoprotectants or induce the biosynthesis of compatible solutes [190]. Trehalose accumulation is osmoregulated in CIAT 899, and growth of this strain in a high salt medium is improved when glycine betaine or choline is present [191,192]. Trehalose biosynthesis in rhizobia not only confers osmotolerance, but is also involved in nodulation competitiveness [193][194][195]. Chromosomes of CIAT 899 and PRF 81 encoded the enzymes OtsA and OtsB for trehalose biosynthesis from UDP-glucose and glucose 6-phosphate [194]. CIAT 899 possessed an additional pathway involving trehalose synthase, TreS, which converts maltose into trehalose [193]. Biosynthesis of glycine betaine from choline may proceed in both strains by the activities of choline dehydrogenase, BetA, and betaine aldehyde dehydrogenase, BetB, which were encoded in their chromosomes in close vicinity to their regulatory protein BetI [196]. Choline sulfate may also be used as a precursor of glycine betaine biosynthesis by the action of choline sulfatase BetC, which was encoded elsewhere in the chromosomes of both strains, in contrast to S. meliloti where it is linked to betAB [196]. Both strains could take up external choline by a homologue of the high-affinity ChoXWV ABC-type transporter [197], also encoded in their chromosomes. ChoXWV belongs to the QAT family that includes importers for glycine betaine, proline betaine, proline and histidine. CIAT 899 and PRF 81 encoded three and four other QAT transporters, respectively. 
The activity of QAT systems may be required for catabolism of their transported substrates rather than for osmotolerance [197]; nevertheless, at least one of the additional QAT systems encoded in each genome had an ATPase component with CBS domains, whose presence has been correlated with osmoregulatory functions [198]. CIAT 899 and PRF 81 megaplasmids each encoded one orthologue of the S. meliloti PrbABCD transporter for uptake of proline betaine at low and high osmolarities [199]. A putative diaminobutyrate-pyruvate aminotransferase ectB gene was found in CIAT 899 and PRF 81, but not the genes required for the remaining steps of ectoine biosynthesis from L-2,4-diaminobutyrate [200]. We did not find genes related to the biosynthesis of the dipeptide N-acetylglutaminylglutamine amide identified in osmotically stressed cultures of S. meliloti [201].
Increase in cell turgor under hypo-osmotic conditions is counteracted by the release of internal solutes, mainly K+, through mechanosensitive channels that respond to stretching of the cell membrane [202]. The CIAT 899 and PRF 81 genomes encoded a large conductance mechanosensitive channel MscL, and five and four small conductance mechanosensitive channels MscS, respectively. Expression of two MscS genes in R. leguminosarum has been shown to be upregulated in the rhizosphere [203], probably indicating that rhizobia face osmotic stress in this environment. Periplasmic glucans (PGs) are polysaccharides composed of glucopyranosyl residues linked with β-glycosidic bonds that accumulate in the periplasm especially under low osmotic conditions, and are thus known as osmoregulated PGs [204]. PGs are required for growth of Agrobacterium and Sinorhizobium in hypo-osmotic media and for proper interaction with their plant hosts [204]. Genes ndvB and ndvA encoding the cyclic β-1,2-glucan synthase and exporter, respectively, were found in CIAT 899 and PRF 81. Interestingly, while NdvA exporters of both strains were 97% identical, the NdvB synthases showed an identity of only 67%. The NdvB synthase of CIAT 899 was similar to homologues present in other rhizobia and this strain produces a cyclic (1 → 2)-β-glucan PG formed by 17 β-glucopyranose units that is structurally similar to PGs produced by other Rhizobiaceae [191,205]. In contrast, the PRF 81 ndvB gene is divergent in comparison to those of other Rhizobiaceae and is not linked to ndvA as in other rhizobia, suggesting that this strain may produce structurally dissimilar PGs.
Aquaporins are water-selective channel proteins that mediate the influx and efflux of water to and from the cell in response to changes in osmolarity. Aquaporins are widely distributed in animals and plants, but are less prevalent in bacteria [206]. A role of bacterial aquaporins in osmoadaptation has been suggested based on studies with E. coli and B. abortus [206,207]. Two aquaporin genes were found in each of the CIAT 899 and PRF 81 genomes, one in the chromosome and the other in the pSym. The chromosomal aquaporin genes of both strains were 97% identical to each other, and displayed 73-74% identity to the B. abortus aquaporin, which is required for prolonged growth in media of low osmolarity [207]. The pSym aquaporin genes were only 42-43% identical to the chromosomal genes, but 76% identical to the sole aquaporin gene of R. etli CFN 42. The CFN 42 gene is located in its pSym and has been shown to be upregulated in bacteroids [208], suggesting a role in symbiosis.

Oxidative stress
Oxidative stress occurs when the cell cannot properly detoxify reactive oxygen species (ROS) or repair the damage they cause on cellular components like proteins, lipids and DNA. Bacteria like rhizobia are exposed to endogenous ROS, like superoxide anion, hydrogen peroxide (H2O2) and organic peroxides, as a result of their own normal metabolism, and to external ROS during interaction with other microorganisms or eukaryotic hosts. The regulators of oxidative stress response to H2O2 and superoxide, OxyR and SoxR, respectively, were found in CIAT 899 and PRF 81. oxyR was located in their megaplasmids, whereas soxR was located in their chromosomes. Two classes of superoxide dismutase (SOD) enzymes that convert superoxide into H2O2 and oxygen were found encoded in the CIAT 899 and PRF 81 genomes. One was a Cu-Zn-SOD (SodC) and the other a Mn/Fe-SOD (SodM). Both genomes encoded a single catalase, 74% identical to KatG of R. etli CFN 42, a dual-function catalase/peroxidase responsible for decomposition of H2O2 and other peroxides, which is not required for nodulation or nitrogen fixation [209]. As in R. etli [209], the katG gene of CIAT 899 and PRF 81 was plasmid-located (megaplasmids), adjacent to oxyR.
A wide range of other enzymes with peroxidase activity and protective roles were encoded in both genomes. Four and three organic hydroperoxide resistance (ohr) protein paralogous genes were found in CIAT 899 and PRF 81, respectively. As their name indicates, Ohr proteins show a marked preference for organic peroxides as substrates. One of the ohr genes of CIAT 899 and PRF 81 was adjacent to its putative ohrR transcriptional regulator. Orthologues of this ohr/ohrR gene pair in S. meliloti and A. tumefaciens are required for resistance to organic hydroperoxide stress [210,211]. We identified four genes for non-heme chloroperoxidases in CIAT 899 and two in PRF 81. This class of enzymes catalyzes concomitant chlorination and peroxidation reactions and is involved in oxidative stress responses [212] and secondary metabolism [213]. Genes for peroxiredoxins of the alkyl hydroperoxide reductase (AhpC)/thiol specific antioxidant (TSA) family were also found in CIAT 899 and PRF 81, with one representative in the former and three genes in the latter. An AhpC/TSA gene has been shown to be upregulated after H2O2 exposure in S. meliloti [212]. CIAT 899 and PRF 81 also encoded one homologue of the bacterioferritin comigratory protein (BCP)-type of peroxiredoxins. AhpC/TSA and BCP peroxiredoxins play protective roles against oxidative damage, including lipid peroxidation [214]. Two genes encoding atypical-2-Cys peroxiredoxins were found in each genome, one in the chromosome and the other in the pSym. The latter gene was 80% identical to prxS of R. etli CFN 42, a peroxiredoxin located in the phaseoli pSym that is involved in bacteroid defence against oxidative stress [215]. A gene in each strain encoded a di-heme cytochrome c peroxidase, a periplasmic enzyme acting on hydroperoxides and organic peroxides [216]. Distribution of genes encoding these enzymes is patchy within the Rhizobiaceae, being present only in R. rhizogenes K84, Rhizobium sp. CIAT894, R. leguminosarum 3841 and WSM2304, and A. vitis S4.
A R. tropici CIAT 899 mutant deficient in glutathione biosynthesis is not only sensitive to acid stress, as previously mentioned, but also to oxidative stress [24]. We found one gene encoding a putative glutathione peroxidase (Gpo) in CIAT 899 and also in PRF 81. A low level of glutathione may affect the oxidative stress response as it compromises the activity of Gpo, which uses glutathione as an electron donor.
In other bacteria, Gpo is involved in organic hydroperoxide resistance and virulence [217]. Glutathione also has a protective effect as it can directly reduce protein disulfide bonds induced by ROS. PRF 81 harbors a second glutathione synthase gene in its megaplasmid. Some agrobacteria and all sinorhizobia genomes also harbor a second glutathione synthase gene.
A gene encoding a member of the Dps family of ferritin proteins was identified in each genome. The ferroxidase activity of Dps proteins protects cells against oxidative damage by simultaneously removing from the cytosol H2O2 and free ferrous ion, which otherwise react to produce hydroxyl radicals by the Fenton reaction [218]. ROS induce protein damage by oxidation of methionine residues to methionine sulfoxides [219]. The CIAT 899 and PRF 81 genomes encoded two MsrA and one MsrB methionine sulfoxide reductase repair proteins [219].
Response to some oxidants like H2O2 may protect cells against osmotic and thermal stress [220]. Such overlap or cross-talk between the stress responses is not uncommon. We have recently found that the expression of several oxidative stress-related proteins of PRF 81 is up-regulated after exposure to high temperatures [184].

Inorganic N assimilation and nitrosative stress
CIAT 899 and PRF 81 can use nitrate (NO3-) as sole nitrogen source. Genome sequencing suggested that both strains can metabolize NO3- to ammonia via assimilatory nitrate reductase and nitrite reductase. Both strains possessed an ABC-type NO3- uptake transporter. PRF 81 seems able to use NO3- for respiration via a respiratory nitrate reductase and a NarK protein providing a bifunctional NO3- uptake symporter and NO3-/nitrite antiporter. Neither CIAT 899 nor PRF 81 possessed genes for denitrification, i.e. NO3- reduction to N2. Denitrification is not a desirable characteristic of inoculant bacteria in agriculture as it causes loss of soil N.
Nitric oxide (NO) is a signaling and defense molecule in eukaryotes, and in the interaction between plant and microorganisms; host-plant cells respond to infection by generating NO [221]. NO is toxic to bacteria; it diffuses across cell membranes into the cytoplasm and reacts with iron centers and thiols causing so-called nitrosative stress [222]. PRF 81, but not CIAT 899, harbored a gene for the flavohemoglobin Hmp, a dioxygenase that catalyzes conversion of NO to NO3- [223]. Bacterial flavohemoglobins are well-known scavengers of NO and play a crucial role in protecting animal pathogens from nitrosative stress during infection [224]. An nsrR-like transcriptional regulator gene, which in other bacteria is involved in nitrosative stress [225], was located adjacent to hmp and may regulate it. Both strain genomes possessed a homologue of the E. coli glutathione-dependent formaldehyde dehydrogenase, an enzyme protective against nitrosative stress due to its potent activity toward S-nitrosoglutathione, the condensation product of glutathione and NO [226]. NO may react with superoxide forming peroxynitrite, a potent oxidant; thus, previously mentioned antioxidant proteins like AhpC/TSA and MsrA/MsrB also protect cells against nitrosative stress.

Metal resistance
CIAT 899 and other R. tropici strains are more resistant to heavy metals than other bean-nodulating rhizobia [13]. This may be an advantageous trait for rhizobia growing in tropical acid soils where some of these metals can reach toxic concentrations. Genome analysis of CIAT 899 and PRF 81 revealed several genes that may be involved in metal resistance. Both strains possessed three genes coding for transporters of the cation diffusion facilitator (CDF) family. Different members of the CDF family are involved in efflux of Zn2+, Co2+, Cd2+, Ni2+ and, less frequently, Mn2+, Fe2+, Cu2+ and mercury ions [227]. Additionally, two and three efflux P-type ATPases were identified in CIAT 899 and PRF 81, respectively. These transporters showed similarities to Zn-, Cd-, Au-, Ag- and Cu-efflux systems, including a protein required for resistance to Zn and Cd in S. meliloti [228]. Strain PRF 81 seems to be better equipped than CIAT 899 to contend with Cu excess, as it possessed three multicopper oxidase genes [229] and two Cu-sequestering protein copD genes [230] versus two and one of these systems, respectively, in CIAT 899. Interestingly, a homologue of the cadD gene required for Cd resistance in S. aureus was found in each genome [231]. cadD homologues are present mostly within the Firmicutes and the only other member of the Rhizobiales carrying a homologous gene was R. rhizogenes K84.
Both genomes carried a two-gene locus encoding a member of the chromate ion transporter (CHR) family and a regulator for chromate resistance [232]. Additionally, the CIAT 899 genome had an alternative CHR efflux transporter system of the short-chain monodomain type [233]. Both strains had genes encoding the arsenic resistance proteins ArsC, which reduces arsenate to arsenite, and ArsB, which exports arsenite outside the cell [234]. CIAT 899, but not PRF 81, had the tellurite resistance proteins TehA and TerC [235].

Conclusions
Genome sequencing of the inoculant strains R. tropici CIAT 899 and Rhizobium sp. PRF 81 revealed that these rhizobia share a highly-conserved symbiotic plasmid. R. leucaenae CFN 299, which has a host range similar to that of CIAT 899 and PRF 81, also possessed this plasmid. The presence of this plasmid in three genetically different strains that are highly efficient in N2 fixation with common bean highlights an evolution towards symbiotic effectiveness by the assembly of a successful set of nodulation and nitrogen-fixation genes in a pSym that has spread and is maintained in geographically distant bacteria. In addition, this pSym carries genes for biosynthesis and modulation of plant-hormone levels, traits that are thought to improve the symbiotic or associative abilities of rhizobia.
The three divergent nodA genes present in the tropici symbiotic plasmid likely allow CIAT 899, PRF 81 and CFN 299 to have an expanded host range in comparison to rhizobia carrying a single gene. Recently, it has been described that Rhizobium strains isolated from nodules of Mimosa plants in French Guiana [236] and New Caledonia [237] each carries two different nodA genes, although they were distinct from the nodA genes present in the tropici pSym (Figure 4A). Symbiotic promiscuity seems to be a general characteristic of rhizobia [3] and our results suggest that reiteration of host-specific nodulation genes, like nodA, may be a previously unrecognized strategy to nodulate a wide range of legume hosts.
Competitiveness is a highly complex attribute that probably results from the combined activities of several classes of genes. CIAT 899 and PRF 81 possessed a wide array of genes coding for functions that may promote their competitiveness for root colonization and nodulation. Some traits encoded in the PRF 81 genome but absent in CIAT 899, such as the capacity to synthesize biotin and the NO-protecting enzyme Hmp, may explain the superior competitive ability of PRF 81.
The CIAT 899 and PRF 81 genomes evidence two rhizobial strains well equipped to cope with the environmental stress conditions found in tropical soils, including acidity and high temperatures, as well as oxidative and osmotic stresses. Proteins contributing to acid-stress response in other bacteria but not yet reported in rhizobia, such as a lipid A phosphoethanolamine transferase and a KcsA-like ion channel were found encoded in both genomes.
Remarkably, both genomes were found to be enriched in genes coding for drug-efflux pumps, a characteristic that helps to explain the resistance of CIAT 899 and PRF 81 to several antimicrobial compounds and probably also allows them to compete more effectively in the soil or rhizosphere against antimicrobial-producing microorganisms, or might even expand their legume host range by promoting resistance to antimicrobial compounds produced by their hosts.

Methods
Bacterial strains and growth conditions
R. tropici CIAT 899 was obtained from the culture collection of the Centro de Ciencias Genómicas, Cuernavaca, Mexico. Rhizobium sp. PRF 81 (=SEMIA 4088) and R. leucaenae CFN 299 were obtained from the Diazotrophic and Plant Growth Promoting Bacteria Collection of Embrapa Soja, Londrina, Paraná, Brazil. Main differences previously reported between the strains are shown in Additional file 2. Bacterial growth conditions and DNA extraction were performed as described before [32,238].

Sequencing, assembly and gap closure
R. tropici CIAT 899 and Rhizobium sp. PRF 81 genomes were sequenced using a whole-genome shotgun strategy. For CIAT 899, a combination of Roche 454 GS-FLX shotgun and 3 kb-insert paired-end libraries was used at the sequencing facility of Virginia Bioinformatics Institute (USA). For PRF 81, a combination of GS-FLX shotgun libraries obtained at Creative Dynamics Inc. (USA) and Sanger sequences previously obtained at Embrapa Soja (Brazil) [32] was used. The pSym of R. leucaenae CFN 299 was transferred to a plasmid-less Agrobacterium strain and purified. The CFN 299 pSym DNA was Sanger sequenced at Embrapa Soja [32,238]. CFN 299 whole-genome sequence reads generated by SOLiD sequencing at Sistemas Genómicos (Spain) were used to complement sequencing of its pSym. For closing gaps, a primer walking strategy was used and PCR products were Sanger sequenced either at Embrapa Soja or at Macrogen (Korea).
Roche 454 sequence reads were de novo assembled using Newbler. For PRF 81, Sanger reads were combined with pyrosequence data. Phrap [239] was used for de novo assembly of the Sanger reads of the CFN 299 pSym, and SOLiD sequences were added to the assembly using Bowtie [240]. Sanger reads obtained during the gap closure stage were added to the assemblies with the assistance of SeqMan Pro (DNASTAR) or Consed [241].

Annotation
The genomes of CIAT 899 and PRF 81, and the pSym of CFN 299, were analyzed using the system for automated bacterial integrated annotation (SABIA) [242] and the rapid annotation using subsystem technology (RAST) server [243]. Both systems allowed the identification of gene-encoding regions, functional annotations, and manual curation of the gene annotations. Both systems also have tools for metabolic reconstruction using KEGG (Kyoto encyclopedia of genes and genomes), for sequence comparisons using BLAST, and for functional comparisons using KEGG or FIGfam. The data for CIAT 899 and PRF 81 were submitted to the GenBank database and were assigned BioProject numbers PRJNA42391 and PRJNA13459, respectively. The data for the pSym of CFN 299 are available at the site http://www.bnf.lncc.br.

Genome comparisons
Genome sequences and annotations from other bacteria were retrieved from the GenBank database. Genome alignments were performed with Mummer [244] and Mauve [245]. Whole genome relationships were depicted with a neighbor joining dendrogram constructed using a matrix of MUMi distances [38]. Orthologues were identified by the bidirectional best hit criterion using BLASTP with a minimum sequence identity of 70% measured over at least 70% of the length of the smallest protein. A systematic search for putative antimicrobial efflux pumps was initiated by scanning the proteome of each genome for sequences displaying PFAM domains characteristic of the MFS, RND, MATE and SMR (super)family transporters with HMMER3 [246]. To distinguish proteins involved in drug efflux from those transporting other kinds of substrates, phylogenetic trees including all members of each (super)family deposited in the Transporter Classification Database [247] were constructed and only those clustering with known drug-efflux pumps were retained. Other genes belonging to functional classes described in this work were identified by searching for conserved motifs in the predicted proteomes and in the six-frame translations of the entire nucleotide sequence of each genome in order to avoid missing genes that were not predicted.
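The bidirectional best hit criterion described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this work: the `Hit` record, the function names and the example values are hypothetical, the best hit is assumed to be the one with the highest bit score, and the 70% identity and 70% coverage thresholds are assumed to be applied before the best hit is chosen.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Hit(NamedTuple):
    """One BLASTP hit (hypothetical record layout, for illustration only)."""
    query: str
    subject: str
    pident: float    # percent identity of the alignment
    aln_len: int     # alignment length in residues
    qlen: int        # query protein length
    slen: int        # subject protein length
    bitscore: float


def passes(hit: Hit, min_ident: float = 70.0, min_cov: float = 0.70) -> bool:
    """Identity >= 70% over >= 70% of the length of the smaller protein."""
    shortest = min(hit.qlen, hit.slen)
    return hit.pident >= min_ident and hit.aln_len >= min_cov * shortest


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Map each query to its top-scoring subject among hits passing the filter."""
    best: Dict[str, Hit] = {}
    for hit in hits:
        if not passes(hit):
            continue
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return {query: hit.subject for query, hit in best.items()}


def bidirectional_best_hits(a_vs_b: Iterable[Hit],
                            b_vs_a: Iterable[Hit]) -> List[Tuple[str, str]]:
    """Return (a, b) pairs in which each protein is the other's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Made-up hits between two small proteomes, A and B.
a_vs_b = [
    Hit("A1", "B1", 92.0, 300, 310, 305, 550.0),
    Hit("A1", "B2", 75.0, 280, 310, 400, 300.0),
    Hit("A2", "B2", 68.0, 390, 400, 400, 410.0),  # identity below 70%
    Hit("A3", "B3", 80.0, 100, 200, 210, 150.0),  # coverage below 70%
]
b_vs_a = [
    Hit("B1", "A1", 92.0, 300, 305, 310, 550.0),
    Hit("B2", "A1", 75.0, 280, 400, 310, 300.0),
    Hit("B3", "A3", 80.0, 100, 210, 200, 150.0),
]

orthologues = bidirectional_best_hits(a_vs_b, b_vs_a)
print(orthologues)
```

In this toy example only A1 and B1 are reported as orthologues: B2 also has A1 as its best hit, but A1's best hit is B1, so the relationship is not reciprocal.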

Genetic manipulations and phenotypic characterizations
A 1541-bp fragment containing the complete R. tropici CIAT 899 waaL gene was PCR amplified with Pfu Ultra polymerase (Stratagene) using primers 108F_orfL (5'-TGACCCAACGGCATAGGA-3') and orfL_23R (5'-AGGCGGTCGTGCAACTA-3'). The PCR product was TA cloned into pCR4-TOPO, generating the plasmid pERN-T16b. To disrupt the gene by targeted insertional mutagenesis, an internal 467-bp EcoRV-ApaI fragment of waaL was obtained from pERN-T16b, blunted with T4 DNA polymerase and cloned into the suicide mobilizable vector pK18mob [248] digested with SmaI. The resulting plasmid, pERN-K1, was introduced into CIAT 899 by conjugal transfer in a triparental mating using a helper strain containing the pRK2013 plasmid [249]. The CIAT899-E4 mutant was selected among the kanamycin-resistant transconjugants obtained after checking for proper pERN-K1 insertion by PCR using vector and waaL specific primers. For complementation experiments, the waaL cloned in pERN-T16b was subcloned as an EcoRI fragment into the expression vector pBBR1MCS-5 under the control of the lacZ promoter [250]. The orientation of the cloned gene was checked by PCR and restriction digestion. The selected plasmid, pERN-B10, was transferred into CIAT899-E4 as described above. LPS were isolated and visualized as previously described [30].
R. leucaenae CFN 299 and its 200-kb spontaneous pSym deletion mutant CFN299-10 [128] were grown in minimal medium supplied with tryptophan in the presence or absence of various flavonoid compounds. After 15 h of growth, cells were harvested and the supernatant treated for IAA extraction. The indole acidic fraction was extracted with ethyl acetate, dried under nitrogen flux and resuspended in methanol. Thin-layer chromatography was carried out and the IAA migration region was excised. IAA was extracted with methanol and quantified by HPLC.