Repertoire, unified nomenclature and evolution of the Type III effector gene set in the Ralstonia solanacearum species complex
- Nemo Peeters†1, 2 (corresponding author),
- Sébastien Carrère†1, 2,
- Maria Anisimova3, 4,
- Laure Plener1, 2, 5,
- Anne-Claire Cazalé1, 2 and
- Stephane Genin1, 2 (corresponding author)
© Peeters et al.; licensee BioMed Central Ltd. 2013
Received: 24 July 2013
Accepted: 29 November 2013
Published: 6 December 2013
Ralstonia solanacearum is a soil-borne beta-proteobacterium that causes bacterial wilt disease in many food crops and is a major problem for agriculture in intertropical regions. R. solanacearum is a heterogeneous species, both phenotypically and genetically, and is considered as a species complex. Pathogenicity of R. solanacearum relies on the Type III secretion system that injects Type III effector (T3E) proteins into plant cells. T3E collectively perturb host cell processes and modulate plant immunity to enable bacterial infection.
We provide the catalogue of T3E in the R. solanacearum species complex, as well as candidates in newly sequenced strains. Ninety-four T3E orthologous groups were defined on phylogenetic bases and ordered using a uniform nomenclature. This curated T3E catalogue is available on a public website and a bioinformatic pipeline has been designed to rapidly predict T3E genes in newly sequenced strains. Systematic analyses were performed to detect lateral T3E gene transfer events and identify T3E genes under positive selection. Our analyses also pinpoint the RipF translocon proteins as major discriminating determinants among the phylogenetic lineages.
Establishment of T3E repertoires in strains representative of the R. solanacearum biodiversity allowed us to determine a set of 22 T3E present in all the strains but provided no clues on host specificity determinants. The definition of a standardized nomenclature and the optimization of predictive tools will pave the way to understanding how variation of these repertoires is correlated to the diversification of this species complex and how they contribute to the different strain pathotypes.
Keywords: Type III effector; Ralstonia solanacearum; Selection; Horizontal gene transfer; Host specificity
Ralstonia solanacearum is a widely distributed soil-borne phytopathogen belonging to the beta subdivision of Proteobacteria. It causes lethal bacterial wilt of more than 200 plant species, including economically important crops [2, 3]. Among the pathogenicity determinants of this bacterium, the Type III Secretion System (T3SS) plays a crucial role because mutants unable to produce this specialized secretion machinery are unable to cause disease on plants. This T3SS ensures the direct translocation of Type III effector (T3E) proteins from the bacterium to the plant cell cytosol [5, 6]. These T3E are presumed to perturb host cell processes and modulate plant innate immunity to allow bacterial infection.
Phylogenetic analyses of Ralstonia strains causing wilt diseases revealed an extensive diversity [8, 9] and this group of organisms is now commonly called the R. solanacearum species complex (RSSC hereafter). This species complex includes strains with broad and narrow host ranges with different geographic origins. Based on phylogenetic analyses and on comparative genomic hybridization, the RSSC has been classified in four phylogenetic groups called phylotypes, which reflect their origins as follows: Asia (phylotype 1), the Americas (phylotype 2), Africa (phylotype 3) or Indonesia (phylotype 4, which includes Ralstonia syzygii and the banana blood disease bacterium BDB) [8, 11, 12]. To date, 14 strains belonging to the RSSC have been completely sequenced.
Pioneering studies have established that T3E repertoires are highly variable among strains and shape the host range of bacterial pathogens [13, 14]. The first exhaustive inventories of RSSC T3E using different in silico or experimental approaches were made in phylotype 1 strains GMI1000 [5, 7] and RS1000 [6, 15]. GMI1000 and RS1000 have almost identical repertoires that comprise 72 and 74 T3E for which T3SS-dependent plant cell targeting has been experimentally validated in RS1000 [6, 15]. A feature of these repertoires is the existence of multigenic T3E families. Functional studies have been carried out on members of the Gala family, which are proteins with F-box and Leucine Rich Repeat domains collectively required for full virulence [16–18], and members of the PopP family, which includes the avirulence proteins PopP1 and PopP2, the latter possessing acetyltransferase activity [20–22]. Recently a functional analysis of the AWR family demonstrated that some AWR T3E induce cell death necrotic reactions on plants and are required for full virulence.
The genome sequence data from strains representative of the biodiversity of the RSSC open the way towards understanding the evolutionary processes that structured their T3E gene repertoire. This will also provide clues towards defining what makes a given strain more aggressive than others on a specific host. However, such comparative genomic approaches are currently hampered by the fact that T3E inventories in multiple strains have not been accurately established: several T3E genes have been overlooked by automatic annotation programs and/or have been incorrectly predicted. Moreover, the lack of a unified nomenclature for RSSC T3E is confusing for a non-expert since many T3E genes from RSSC strains have different names in the published literature (Pop, Avr, Brg, Rip, Hpx or Lrp proteins). This does not help the already difficult task of identifying orthologous and paralogous genes in strains harboring between 46 and 71 T3E genes.
This work presents an integrative and comprehensive database for the T3E of the RSSC. This database is a compendium of manually re-annotated genes across 11 sequenced strains, ordered with a novel and unifying nomenclature. This database is publicly available for browsing and retrieving data and information. Our analyses of this particular gene set, which is at the forefront of the interaction between the bacterium and its host, provide new insight into its evolutionary history and its potential contribution to host specificity.
Results and discussion
Ralstonia solanacearum T3E database
Inventory and re-annotation of T3E genes in the RSSC
Identification of T3E candidate genes in RSSC strains
The genomes of nine RSSC strains from phylotypes 2, 3 and 4 were mined for previously undescribed T3E gene families based on the criteria listed above. In this process, we only kept the T3E candidates strictly fitting both criteria (ii) and (iii) described above. This search yielded 16 RSSC T3E candidates, for which T3SS-dependent translocation is not yet demonstrated. These 16 hypothetical T3E gene families are listed in Additional file 1 as well as in the RSSC-T3E database. Most of the corresponding genes did not display homology to any other known proteins, except for families RSSC-T3E-Hyp5, Hyp6 and Hyp7, which have homologues only in Acidovorax spp. or Xanthomonas spp., both of which are plant pathogenic bacteria.
In many cases, T3E genes appeared to have frameshift mutations or to be split into several independent open reading frames on the assembled genomes. This could be due to mutations leading to gene inactivation or, more probably, to sequence and assembly errors in the available genome sequences. It should be noted that there are important differences in terms of quality in the available assembled genomes (see Methods). In some other cases, genome sequence gaps resulted in incomplete T3E gene prediction. Many genes encoding T3E with internal repeats are predicted as truncated or incomplete, probably due to the difficulty of assembling repeat-containing short sequence reads (Next Generation Sequencing techniques). Frameshift-mutated and incomplete T3E genes were included in the RSSC-T3E database and are distinguished by the prefix “fs” (for frameshift) before the gene name. Future re-sequencing should verify the current pseudogene status of these genes.
Probable non-functional pseudogenes are also listed in the RSSC-T3E database (with the “pg” prefix, for pseudogene). These pseudogenes correspond to genes or gene fragments which are either gene remnants, open reading frames disrupted by a transposable element insertion or frameshift mutated genes confirmed after re-sequencing. The number of predicted pseudogenes varies from one to eight among the eleven strains analyzed (Figure 1B). However, the formal distinction between a pseudogene and a functional gene is difficult to establish without experimental validation. In some cases, the absence of specific domains (e.g. RipC1CMR15 lacking the C-terminal half present in other RipC1 alleles) raises the question of the functionality of the corresponding protein.
The RSSC-T3E database interface
The dataset corresponding to the lists and expert annotation of validated and candidate T3E in the 11 sequenced strains representative of the 4 RSSC phylotypes was compiled in a web interface named “Ralstonia T3E” (https://iant.toulouse.inra.fr/T3E) designed to provide the user with convenient and straightforward access to all the underlying data. The home page provides a synthetic table displaying the distribution of the 94 T3E gene families in the RSSC strains under the proposed nomenclature (see below). This table summarises for each strain whether a gene member is present (in single or multiple copies), absent, or predicted as being non-functional (pseudogene). A specific colour code also indicates genes with putative frameshift mutations. This information is also available as a table in Additional file 2. The clickable T3E genes provide a link to multifasta files of the curated nucleotide and protein sequences as well as a view of the corresponding DNA and protein alignments. Tab-style navigation provides a link to the 16 T3E candidate genes as well as a link to different services like “ScanYourGenome” (see hereafter), PatScan, HMScan and Blast.
Proposed guidelines for the nomenclature of T3Es in RSSC strains
Table 1. List of the T3E genes currently identified in the R. solanacearum species complex and proposal for a unified nomenclature
Columns: Proposed T3E family name; Representative gene member; Functional domain/motif or function; Evidence for T3SS-dependent secretion or translocation
RipA, Rip29, Hpx31, AWR2
Rip44, Hpx32, AWR3
Rip45, Hpx4, AWR4
Rip56, Hpx10, AWR5
RipB, Rip2, Hpx11
Rip34, Hpx25, Brg8
PopF1, PopF2, Rip70
F-box Leucine-Rich Repeats
Gala2, Rip37, Hpx20
F-box LRR protein
F-box LRR protein
Gala4, Rip17, Hpx15
F-box LRR protein
Gala5, Rip18, Hpx16
F-box LRR protein
RipG, Gala6, Rip13, Hpx13
F-box LRR protein
Gala7, Rip14, Hpx14
F-box LRR protein
HLK1, Rip15, Brg19
HLK3, Rip30, Brg18
YopJ acetyltransferase domain
Rip58, Hpx26, Brg44
Nudix hydrolase domain
R. syzygii RALSY_mp30159
UW163 [GenBank accession : CAF32358.1]
Rip64, Hpx24, Brg15, PopS
SKWP1, Rip27, Hpx37
Heat/Armadillo repeat domain
SKWP2, Rip65, Hpx36
Heat/Armadillo repeat domain
Heat/Armadillo repeat domain
SKWP4, Rip20, Hpx30
Heat/Armadillo repeat domain
SKWP5, Rip33, Hpx34
Heat/Armadillo repeat domain
Heat/Armadillo repeat domain
Heat/Armadillo repeat domain
Heat/Armadillo repeat domain
Putative cysteine protease
Rip12, Hpx29, Brg17
Ubiquitin ligase domain
Ubiquitin ligase domain
Harpin, Pectate lyase
AvrA, Rip5, Brg46
R. syzygii RALSY_20037
Rip23, Hpx28, Brg36
This work Additional file 3
Rip43, Hpx33, Brg33
Rip50, Hpx2, Brg34
Ubiquitin ligase domain
Rip66, Hpx9, Brg43
Rip68, Hpx8, Brg45
Rip39, Hpx27, Brg39
Ubiquitin ligase domain
Rip55, Hpx21, Brg37
R. syzygii RALSY_20407
RSc0227, RSp0228 [pseudogene]
YopJ acetyltransferase domain & Ankyrin Repeats
R. syzygii RALSY_20184
Ubiquitin ligase domain
Shigella flexneri OspD family
Rip19, Hpx17, Brg11
Putative transcription factor
Manuscript in preparation
After identifying groups of homologous genes by reciprocal best hit in the curated list of likely RSSC T3E genes, we concentrated our effort on grouping the different genes in orthologous groups and naming them accordingly. Three situations can occur: (i) a single hit (or no hit) in each strain, with conservation of synteny on the genome; (ii) a single hit (or no hit) in each strain, but with a breach of synteny for at least one of the homologous genes; (iii) multiple hits (two or more for at least one strain) in different strains.
In the first case a single orthologous group is defined irrespective of the pairwise identity between the orthologous genes. This can be exemplified by RipB, a single gene present in all strains with pairwise amino acid identity ranging from 72 to 100%. Another case is RipU, also a single gene present in all strains with a strict conservation of synteny, but with surprisingly divergent members (pairwise amino acid identity ranging from 23 to 100%). Even though it is likely that RipU has evolved different functions in the different strains, based on the likely common ancestral origin suggested by the conservation of synteny [29, 30], we advocate keeping a single orthology group.
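The three situations can be sketched as a small decision rule. This is only an illustration and not the authors' pipeline: the `Hit` record, its `syntenic` flag and the function name are hypothetical, and the sketch assumes that synteny has already been assessed for each hit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    strain: str      # strain in which the homologous gene was found
    gene_id: str     # locus identifier of the hit
    syntenic: bool   # True if the hit lies in the conserved genomic context


def classify_homolog_group(hits):
    """Return 'i', 'ii' or 'iii' for a group of homologous hits.

    'i'   : at most one hit per strain, synteny conserved everywhere
    'ii'  : at most one hit per strain, synteny broken for at least one hit
    'iii' : two or more hits in at least one strain
    """
    per_strain = {}
    for hit in hits:
        per_strain.setdefault(hit.strain, []).append(hit)
    if any(len(group) > 1 for group in per_strain.values()):
        return "iii"
    if all(hit.syntenic for hit in hits):
        return "i"
    return "ii"
```

Strains without a hit are simply absent from the input list, which matches the "single hit (or no hit)" wording of situations (i) and (ii).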
In the second situation, an apparent single orthologous group exists but differences in synteny support a scenario of gene duplication followed by gene loss or lateral gene transfer between strains. Here we favour synteny as a ruler for ortholog definition [29, 30]. This is exemplified by RipO1 and RipO2, the latter being present only in the strain R24, devoid of RipO1.
Finally, when there are strains with two or more paralogous genes, we again favour the synteny rule to identify groups of orthology. A careful phylogenetic reconstruction for these homologous genes across the whole species complex (Additional file 4) illustrates the accuracy of the orthology attributions. These phylogenetic trees also highlighted the existence of two paralogs in several strains that clearly belong to a clade defined as an orthologous group (see Additional file 4, for RipA5, RipE1, RipF1, RipG1 and RipH2). We believe that these paralogs result from recent gene duplications specific to a strain or to a group of related strains. We thus chose to name these genes in a way that indicates their recent evolution, e.g. RipA5_1Molk2 and RipA5_2Molk2; RipF1_1CMR15 and RipF1_2CMR15. The rule of synteny is conserved since we verified that all these genes have indeed a conserved synteny (e.g. RipA5_1Molk2, RipA5_1IPO1609, RipA5_1UW551 and RipA5_1Po82 have a conserved genomic location, as do RipA5_2Molk2, RipA5_2IPO1609, RipA5_2UW551 and RipA5_2Po82).
Suggested name reassignment of previously characterized R. solanacearum T3E
Whenever possible, the proposed new nomenclature conserves the original letter designations used in previous annotations, e.g. RipP1 is PopP1; RipP2 is PopP2; RipAA is AvrA. In the case of paralogous genes, the names are, for instance: RipG1, RipG2, … to RipG8 for the GALA gene family [16, 17]; RipA1, RipA2, … to RipA5 for the AWR family. In a few cases, there is evidence for recent T3E gene duplications resulting in two or more gene copies in a single given strain, e.g. strain Psi07 harbors 3 copies of RipG1 and 2 copies of RipH2: these were renamed RipG1_1, RipG1_2, RipG1_3 and RipH2_1, RipH2_2, respectively, to differentiate them from the other RipH and RipG genes in this strain (Table 1).
In addition, a Rip name is proposed for the 9 T3E previously identified as Pop [20, 32–36] or Avr. The Pop designation is historical and was coined when R. solanacearum was known as Pseudomonas solanacearum; the “Avr” term was solely used for the AvrA avirulence protein identified in 1990. These designations can be confusing because the Pop term has also been used to name some Pseudomonas aeruginosa T3E and AvrA also refers to an unrelated T3E from Salmonella species.
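The naming scheme described in this section (a family letter code, an optional paralog number and an optional copy index after an underscore) lends itself to a simple parser. The sketch below is an assumption about how such names could be handled programmatically; the `RipName` type, the regular expression and the three-letter limit on the family code (chosen to accept RipTAL) are hypothetical and are not part of the published nomenclature.

```python
import re
from typing import NamedTuple, Optional


class RipName(NamedTuple):
    family: str                 # letter code, e.g. "G", "AA", "TAL"
    paralog: Optional[int]      # paralog number, e.g. 1 in RipG1
    copy_index: Optional[int]   # recent-duplication index, e.g. 2 in RipG1_2


# Hypothetical pattern: "Rip" + 1-3 capital letters + optional digits + optional "_digits".
_RIP_PATTERN = re.compile(r"Rip([A-Z]{1,3})(\d+)?(?:_(\d+))?")


def parse_rip_name(name):
    """Split a Rip protein name into family, paralog number and copy index."""
    match = _RIP_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"not a Rip-style name: {name!r}")
    family, paralog, copy_index = match.groups()
    return RipName(
        family,
        int(paralog) if paralog else None,
        int(copy_index) if copy_index else None,
    )
```

For example, `parse_rip_name("RipG1_2")` separates the GALA family code `G`, paralog `1` and copy `2`, whereas a historical name such as `PopP1` is rejected. Strain suffixes are deliberately not handled here.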
“ScanYourGenome” a bioinformatic tool for detecting T3E orthologs
In order to swiftly analyse the T3E content of newly produced genome sequences, we developed a protocol for the identification of putative effector candidates. This pipeline is based on a de novo effectome prediction using T3E models. Each candidate is then tested using different methods with decreasing stringency to assign it to the most probable known effector gene (see Methods section). This protocol was first tested on the reference genomes used above for manual annotation of the T3E genes in order to calibrate the detection parameters (see Methods) before using it for predicting T3E in the recently published draft genomes of strains K60, FQY_4 and Y45. This analysis predicted 60, 75 and 73 potential T3E genes in the K60, FQY_4 and Y45 genomes, respectively (Additional file 2). The gene model prediction takes into account possible frameshifts. In addition, when a gene is shorter than 80% of the average length of the other alleles of this Rip gene, the predicted gene is tagged as a potential pseudogene. Both frameshift and pseudogene annotations appear in the prediction. This orthology search engine and the consequent Rip assignment are available to the community for queries of draft or complete genome sequences. For shorter gene sequences a more straightforward blast is advised. The advantage of a sliding scale of orthology detection is the possibility to unequivocally assign each potential T3E gene to a specific orthologous group. Whenever a new candidate T3E gene, experimentally validated as being secreted or translocated into plant cells, does not retrieve an already labelled orthologous Rip family, this gene will be assigned the next available Rip code.
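The length-based tagging rule can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions rather than the ScanYourGenome implementation: the function name, the tag strings and the way frameshift evidence is passed in are all hypothetical; only the 80% threshold comes from the text.

```python
from statistics import mean


def tag_prediction(predicted_len, allele_lens, has_frameshift=False, min_fraction=0.8):
    """Return the annotation tags for one predicted Rip gene.

    predicted_len : length of the predicted gene
    allele_lens   : lengths of the other known alleles of the same Rip gene
    has_frameshift: True if the gene model required a frameshift correction
    min_fraction  : fraction of the mean allele length below which the gene
                    is tagged as a potential pseudogene (0.8 in the text)
    """
    tags = []
    if has_frameshift:
        tags.append("frameshift")
    if allele_lens and predicted_len < min_fraction * mean(allele_lens):
        tags.append("potential_pseudogene")
    return tags
```

With reference alleles of 1000 nt, a 700 nt prediction is tagged as a potential pseudogene, while an 800 nt prediction sits exactly on the threshold and is not tagged because the rule uses "shorter than".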
Evolutionary dynamics of rip genes
Classification of paralogous rip genes
A specific feature of R. solanacearum T3Es is the abundance of paralogous rip genes in all the strains sequenced to date. Some of these paralogous genes are well represented in strains from the four phylotypes, hence they probably originated from ancient duplications in the common ancestor of these diverse strains. This was well documented for the RipG1-G8 and the RipA1-A5 paralogous gene families and is probably also true for RipH1-H3 and RipS1-S8. Although all strains contain members of these paralogous families, the likely ancient duplications do not exclude some phylotype specificities explained by loss or more simply by recent duplications, e.g. RipA1 and RipS6 seem to be specific to phylotype 1; RipG8 is only found in CMR15, the sole representative of phylotype 3; and RipH4 seems to be specific to the phylotype 4 strains (see Additional file 2).
A second group of paralogous rip genes is characterised by a smaller number (2–3) of paralogous sequences in a given strain. Phylogenetic analyses were used to estimate the evolutionary relationships between paralogues using sequence data from the 11 RSSC representative strains. We defined eight additional rip genes (RipC2, RipE2, RipF2, RipO2, RipV2, RipAF2, RipAX2 and RipAZ2) (Table 1 and Additional file 4). Several of these paralogous genes, such as ripC2 or ripO2, seem to differ significantly from RipC1 and RipO1, respectively, and could have originated through lateral gene transfer (see below) since homologous genes exist in other bacterial species. For the gene families present in most of the RSSC strains (ripE2 and ripV2), the genes are located in each strain in a similar genomic context, an observation which also supports a common evolutionary origin. However, the distribution of some paralogs can vary among strains: for example, RipE1 seems to be ubiquitously present whereas RipE2 is absent in phylotype 1 strains.
Protein sequence analyses indicated that RipAR, RipAW, RipV1, RipV2 and RipBG contain putative ubiquitin-ligase domains (see below). Likewise, RipJ, RipK, RipAE, RipBC, RipP1 and RipP2 could all potentially display acetyltransferase activity (see phylogenetic tree in Additional file 5). Notwithstanding this apparent functional conservation, the sequences of these T3E genes have diverged significantly and cannot be assigned to orthologous groups. It has to be noted that the numerical identification of RipP1 and RipP2, and of the pseudogene RipP3GMI1000, is used in reference to their previous names PopP1 [7, 36] (RipP1), PopP2 [20, 22, 44] (RipP2) and PopP3. This is an exception to the previous rule as we do not consider these to be paralogs.
Horizontally acquired rip genes
The detection of horizontal gene transfer (HGT hereafter) events in a given bacterial genome can be performed retrospectively through bioinformatics-based comparative analyses. A frequent hallmark of genes with an extrinsic origin is the difference in GC content of these genes compared with the mean content of the host genome [46, 47]. Thirteen Rip genes exhibit a mean GC% below 60% (whereas the genomic mean content in RSSC strains is 67%) (Additional file 6). In several cases, the T3E gene is physically associated with insertion sequence elements (RipAA, RipAX1, RipO2, RipE2) or integrases (RipAF2), or is part of prophage sequences integrated in the genome (RipP1, RipP2, RipT, RipAG, RipAX2, RipE2, RipBD). From these observations, we can assume that bacteriophage-mediated transfer is an efficient means for lateral transfer of these T3E in the RSSC.
Phylogenetic analyses also provided interesting insights into possible HGT with other bacterial plant pathogens. For example, RipC2CFBP2957, outgroup of the RipC1 gene family, could derive from the XopC T3E from Xanthomonas spp. Furthermore, the low GC content of ripC2 CFBP2957 (61%) supports the hypothesis of an HGT, with the possibility of a shared common ancestor between ripC2 CFBP2957 and xopC. Similar observations can be made with RipO2 R.syzygii R24 (and P. syringae pv. phaseolicola HopG1), RipAF2 R.syzygii R24 (and P. syringae HopF1), RipE1 (and P. syringae HopX1 and Xanthomonas spp. XopE), RipP1 (and Xanthomonas spp. XopJ), RipAX2 (and Xanthomonas gardneri XopG and P. syringae HopH1) and RipH2 (and Xanthomonas sp. XopP), see Additional file 4. Together with RipTAL, already suspected of inter-species transfer [48, 49], this analysis thus provided a total of seven T3E genes that could have been acquired through HGT.
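The GC-content screen is straightforward to reproduce on any set of coding sequences. The sketch below is illustrative only: the function names and the input dictionary are hypothetical, and the default 60% cut-off is simply the value quoted in the text (against a genomic mean of about 67%).

```python
def gc_percent(seq):
    """Return the GC content of a nucleotide sequence as a percentage.

    Ambiguous bases (anything other than A, C, G, T) are ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc_count = sum(b in "GC" for b in bases)
    return 100.0 * gc_count / len(bases)


def low_gc_genes(genes, threshold=60.0):
    """Return the sorted names of genes whose GC% is below the threshold.

    genes: mapping of gene name -> nucleotide sequence
    """
    return sorted(name for name, seq in genes.items() if gc_percent(seq) < threshold)
```

A low GC value alone is only a hint of an extrinsic origin; as the paragraph notes, association with insertion sequences, integrases or prophages provides complementary evidence.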
Evidence of phylogenetic incongruences
Another example of discrepancy between species and gene phylogeny is RipAA. Here the increased polymorphism is correlated with the presence of a hypervariable domain consisting of a Variable Number of Tandem Repeats.
Several rip genes underwent selection and recombination
Table 2. Rip coding sequences under strong diversifying positive selection at the protein level
Columns: Number of strains; Alignment length (nt); Population recombination rate, Ne·r (PLPT); LRT statistic values for codon model pairs (M0 vs M3, M1a vs M2a, M7 vs M8, M8a vs M8); Proportions of sites in different selection regimes (strict negative, ω < 0.15; relaxed negative, 0.15 < ω < 0.9)
Importantly, the presence of a high degree of recombination can hamper LRTs for positive diversifying selection, leading to false positives. However, inference of recombination can also be affected by selection forces [53, 54]. This is why we systematically analysed all data for evidence of recombination (see Additional file 7 for full results). Table 2 also displays the results of tests for recombination for the nine previously identified Rip genes. Among these, only two (RipAW and RipG7) could also be affected by recombination, while for RipAA the evidence of recombination is not clear-cut. The interplay between selection and recombination was already disentangled previously for RipG7, with the conclusion that there is indeed a strong likelihood of positive selection acting on this gene. Here we will not address the question further for RipAA and RipAW but a future analysis with more allelic variants should be informative.
It is interesting to note that in the multigene paralogous families there seems to be one member under positive selection: RipH3, RipS7, RipG7. When we consider only 2 out of 3 LRTs for positive selection (see Additional file 7), we can define 14 more Rip coding sequences with evidence for positive selection, out of which 9 belong to the above-mentioned paralogous families (including RipA5). It is tempting to speculate that after duplications some of the paralogous genes could have undergone sub- or neo-functionalisation allowing the cognate Rip proteins to adapt to evolving plant targets or evade host immunity.
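The "2 out of 3 LRTs" criterion can be made concrete with a short sketch. The LRT statistic is twice the log-likelihood difference between nested codon models. The null distributions used here are assumptions based on common practice and are not stated in this text: a chi-square with 2 degrees of freedom for M1a vs M2a and M7 vs M8, and a 50:50 mixture of a point mass at zero and a chi-square with 1 degree of freedom for M8a vs M8. The function names and the log-likelihood values in the usage note are hypothetical.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio statistic 2 * (lnL_alt - lnL_null), floored at zero."""
    return max(0.0, 2.0 * (lnl_alt - lnl_null))


def p_chi2_df2(stat):
    """Upper-tail probability of a chi-square with 2 df (closed form)."""
    return math.exp(-stat / 2.0)


def p_mixture_chi2_df1(stat):
    """Upper tail of a 50:50 mixture of a point mass at 0 and chi-square(1)."""
    if stat <= 0.0:
        return 1.0
    return 0.5 * math.erfc(math.sqrt(stat / 2.0))


def positive_selection_tests(lnl, alpha=0.05):
    """Return (number of significant tests, p-values) for the three LRTs.

    lnl: mapping of model name ('M1a', 'M2a', 'M7', 'M8', 'M8a') -> log-likelihood
    """
    pvalues = {
        "M1a_vs_M2a": p_chi2_df2(lrt_statistic(lnl["M1a"], lnl["M2a"])),
        "M7_vs_M8": p_chi2_df2(lrt_statistic(lnl["M7"], lnl["M8"])),
        "M8a_vs_M8": p_mixture_chi2_df1(lrt_statistic(lnl["M8a"], lnl["M8"])),
    }
    n_significant = sum(p < alpha for p in pvalues.values())
    return n_significant, pvalues
```

With made-up log-likelihoods such as `{"M1a": -1000.0, "M2a": -990.0, "M7": -1001.0, "M8": -990.5, "M8a": -990.6}`, the first two tests are significant and the third is not, so the gene would meet a 2-out-of-3 rule but not a 3-out-of-3 rule. As the text stresses, recombination can inflate these statistics, so such counts are a screening aid and not a proof of selection.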
Comparative genomics and functional implications
The RSSC T3E core set: a large group of conserved effectors
The establishment of a near-complete T3E repertoire in strains representative of the large phylogenetic diversity of the RSSC allows a more specific and accurate comparison than those based on comparative genomic hybridizations. We performed T3E repertoire comparisons using the following criteria: (i) rip genes listed as pseudogenes in the database were considered non-functional but those listed as containing frameshifts were considered as functional genes. The assumption that all the frameshifts are due to sequencing errors is probably an overestimation. Since we cannot validate this experimentally, and considering that the number of frameshifts identified is inversely correlated with the genomic sequence quality, we will keep this assumption. This is exemplified by the high-quality GMI1000 and CFBP2957 genomes, which do not contain a single frameshift mutation in their T3E genes. (ii) The 16 hypothetical T3E newly identified in the different strains were also included in the repertoire for comparisons.
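Under these criteria, a core set is simply the intersection of the functional repertoires of all strains. The following sketch assumes a hypothetical per-strain table of annotation statuses; the status labels and function names are illustrative and do not reflect the database schema.

```python
# Statuses counted as functional under criterion (i): intact genes and
# genes flagged with a frameshift. Pseudogenes and absent genes are excluded.
FUNCTIONAL_STATUSES = {"present", "frameshift"}


def functional_repertoire(annotations):
    """Return the set of T3E families considered functional in one strain.

    annotations: mapping of family name -> status
                 ('present', 'frameshift', 'pseudogene' or 'absent')
    """
    return {fam for fam, status in annotations.items() if status in FUNCTIONAL_STATUSES}


def core_set(strains):
    """Return the families that are functional in every strain.

    strains: mapping of strain name -> annotations mapping
    """
    repertoires = [functional_repertoire(a) for a in strains.values()]
    if not repertoires:
        return set()
    return set.intersection(*repertoires)
```

Treating frameshifted genes as functional makes the core set larger than it would be under a stricter rule, which is the trade-off acknowledged in the paragraph above.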
T3E repertoire comparisons provide no clues on host specificity determinants
R. solanacearum strains exhibit great variations in host range and it is tempting to speculate that T3E repertoires shape these host range capabilities. In order to tentatively identify candidate genes involved in host specificity, we performed T3E repertoire comparisons within specific phylogenetic groups such as phylotype 2 or 4 using strains with marked host range differences (Figure 4B). These comparisons identified strain-specific genes but did not pinpoint strong host-specificity candidates. Indeed, none of the Molk2-specific T3E is shared with the BDBR229 strain, which is also pathogenic on banana; the same is true for potato-associated T3E genes from the Po82 and UW551/IPO1609 strains. Although more genomic sequences of RSSC strains are needed to perform robust associations between host range and T3E repertoires, these observations already suggest that host range may be controlled by multiple or differential combinations of T3E determinants, or by determinants other than T3E, or that differences in T3E protein sequence or gene expression might also be involved. Similar observations were reported for comparison of P. syringae pathovar T3E repertoires, thus reinforcing the idea that a complex genetic basis underlies host range evolution in plant pathogens.
Finally, intra-phylotype comparisons suggest that the proportion of conserved T3E is higher in phylotype 2 than in phylotype 4 strains (Figure 4C). Although phylotype 4 strains BDBR229 and R24 have undergone gene reduction potentially affecting this comparison, we still believe that this difference reflects the higher genetic diversity within phylotype 4 and could also be associated with the diverse lifestyles among phylotype 4 strains.
Identification of novel T3E gene harboring putative ubiquitin-ligase domains
Molecular functions of most R. solanacearum T3E remain unknown, and more than half of the repertoire corresponds to proteins with no structural motif or domain suggestive of function. The search for functional motifs identified two T3E proteins, RipAR and RipAW, carrying a C-terminal domain structurally related to the Shigella flexneri IpaH ubiquitin ligase domain. Although the overall similarity between IpaH and RipAR/RipAW is low, these R. solanacearum T3E have a C-terminal domain with a predicted structure consisting of 12 alpha-helices, as determined for IpaH family proteins. Most of the highly conserved residues in the IpaH family, including a highly conserved cysteine residue essential for activity, are conserved in RipAR and RipAW (see sequence alignment in Additional file 9). Considering the previously identified T3E RipV, a Salmonella SspH1 homologue, and the RipG family members, R. solanacearum potentially harbors a total of 10 T3E endowed with potential ubiquitin-ligase activity. This highlights the probable central mechanism consisting in subversion of the host’s ubiquitination system by T3E during plant pathogenesis [59, 60].
The specific case of the RipF translocon proteins
The biological implications of this gene duplication of the RipF translocator in some RSSC lineages and the structural divergence between the RipF1/RipF2 family members are unknown. In GMI1000, RipF1_1 has a major role in T3E translocation in tomato and tobacco whereas RipF1_2 plays a minor role in this process on these hosts. The specific involvement of RipF2 and RipF1 in pathogenicity of phylotype 2 strains will need to be addressed in future studies.
T3E are essential to R. solanacearum pathogenesis but progress in understanding their relative contribution to disease through reverse genetic approaches has been hampered by functional redundancies, due to the existence of large T3E repertoires. In this study, we have undertaken groundwork for a global inventory of R. solanacearum T3E at the species level in order to provide the community with a curated dataset, tools and a rationalized nomenclature that should pave the way for future work on RSSC effectomics. We conducted a large scale approach aimed at the identification, expert annotation and phylogenetic analyses of T3E from the RSSC, a species complex showing considerable genomic diversity [10, 11] and responsible for one of the most devastating bacterial diseases of plants worldwide. Our search yielded a total of 94 T3E Rip genes and 16 additional candidate T3E genes distributed among the 11 genomes analyzed in this study. This total of more than 100 predicted T3Es is significantly higher than the T3E inventories from other bacterial plant pathogens. Indeed, in P. syringae, genome analysis of 19 phylogenetically diverse isolates revealed the existence of 58 T3E genes (the online resource http://www.pseudomonas-syringae.org references 61 Hop orthologous groups) whereas this number is estimated at 52 in Xanthomonas spp. These comparisons highlight the great diversity of T3E genes present in the RSSC and the apparent complexity of T3SS-dependent pathogenesis in this species complex.
The RSSC T3E repertoire also appears to be highly dynamic, as evidenced by the number of T3E under positive selection, indicative of possible neo-functionalization, and by the number of T3E pseudogenes identified in this study. In particular, there is an obvious tendency to T3E gene decay in R. syzygii, which is correlated with the genome reduction in this strain. R. syzygii is an exception among the RSSC since it is strictly limited to the clove tree; the T3E repertoire reduction in this strain may be a consequence of this host specialization. On the other hand, the cornucopia of T3E identified in R. solanacearum and other related pathogenic beta-proteobacteria is probably a factor explaining the exceptional adaptation of these pathogens to such a wide diversity of hosts. Importantly, phylogenetic analyses allowed the definition of novel T3E genes, resulting in the definition of new orthologous groups of Rip genes or paralogs of already identified Rip genes. It is conceivable that these newly defined groups correspond to T3E genes with novel functional specificities.
Our analysis should also be helpful for refined functional studies: (i) the RipF1-RipF2 translocon proteins appear as major discriminating determinants among the main lineages of the RSSC and this probably reflects a fundamental evolutionary divergence; (ii) global comparisons of repertoires among genetically diverse strains identified a set of 20–30 core T3E widely distributed in the species, which could presumably be considered as ancestral T3E important in the interaction of the pathogen with its hosts; (iii) the identification of T3E displaying a positive selection pattern may provide hints on the determinants evolving under plant selection pressure; and (iv) our bioinformatics pipeline is designed to rapidly predict and assign Rip identifiers to all homologous T3E genes in newly sequenced strains of the RSSC.
General information on the features of the 11 strains of the RSSC and the corresponding genome sequences used for T3E mining is provided in Additional file 10. These strains are representative of the RSSC in terms of host range, worldwide geographic origin and phylogenetic distribution [10, 11].
T3E inventory and annotation in RSSC genomes
PatScan searches for the hrpII box element (TTCGn16TTCG) were performed in RSSC genomes using previously described criteria, i.e. one mismatch allowed and only hits within the 500 bp region upstream of a start codon considered. The 50 amino acid N-terminal domain of candidate T3E was analysed for a T3SS-dependent export pattern using previously defined criteria, which consider as positive an N-terminal domain meeting at least two of the three following rules: (i) Serine + Proline content >30%, (ii) Leucine content <10% and (iii) absence of acidic residues within the first twelve amino acids.
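The two screening criteria above can be illustrated with a minimal Python sketch. It is not the PatScan-based procedure used in this study: the function names are hypothetical, the mismatch is assumed to be counted only on the eight fixed TTCG positions, and the caller is assumed to supply the 500 bp region upstream of a start codon.

```python
HRPII_HALF = "TTCG"   # conserved half-site of the hrpII box
SPACER = 16           # the n16 spacer of TTCGn16TTCG


def hrpii_hits(upstream, max_mismatch=1):
    """Return 0-based offsets of hrpII-box-like sites in an upstream region.

    Assumption: only the two TTCG half-sites are scored; the spacer is free.
    """
    upstream = upstream.upper()
    site_len = 2 * len(HRPII_HALF) + SPACER
    hits = []
    for i in range(len(upstream) - site_len + 1):
        window = upstream[i:i + site_len]
        fixed = window[:4] + window[-4:]
        mismatches = sum(a != b for a, b in zip(fixed, HRPII_HALF * 2))
        if mismatches <= max_mismatch:
            hits.append(i)
    return hits


def nterm_export_like(protein, window=50):
    """Apply the 'two out of three' rule to the N-terminal domain."""
    nterm = protein.upper()[:window]
    if not nterm:
        return False
    ser_pro = (nterm.count("S") + nterm.count("P")) / len(nterm)
    leu = nterm.count("L") / len(nterm)
    no_acidic = not any(aa in "DE" for aa in nterm[:12])
    rules = [ser_pro > 0.30, leu < 0.10, no_acidic]
    return sum(rules) >= 2
```

A candidate would be retained for expert annotation when `hrpii_hits` returns at least one offset for its upstream region or when `nterm_export_like` is true for its product; how the two lines of evidence are combined is left to the annotator.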
Prediction of T3E start codon
We observed great heterogeneity among the predicted start codons for many T3E families in the annotated RSSC genomes deposited at GenBank. When possible, multiple sequence alignments of the regions located downstream of the hrpII box element were performed to predict the most probable start codon, which was defined as the most distal 5′ initiator codon conserved among the different strain sequences.
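The start codon rule can be sketched as a scan over an existing nucleotide alignment. This is a simplified illustration, not the procedure applied here (which relied on inspection of alignments): the function name is hypothetical, the set of initiator codons (ATG, GTG, TTG) is an assumption, and neither reading frame nor gapped codons are handled.

```python
# Assumption: the usual bacterial initiator codons; the text does not list them.
START_CODONS = {"ATG", "GTG", "TTG"}


def conserved_start_column(aligned):
    """Return the most 5' alignment column at which every sequence carries
    an initiator codon, or None if no such column exists.

    `aligned` is a list of equal-length aligned nucleotide strings
    ('-' for gaps), all starting downstream of the hrpII box.
    """
    length = min(len(seq) for seq in aligned)
    for col in range(length - 2):
        codons = [seq[col:col + 3].upper() for seq in aligned]
        if all(codon in START_CODONS for codon in codons):
            return col
    return None
```

In practice the returned column would still need to be checked for frame consistency with the downstream open reading frame.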
Frameshift and pseudogene prediction
T3E genes were annotated as frameshift in two cases: (i) when several contiguous open reading frames displayed homology to a defined Rip gene sequence (thus resulting in the annotation of two or multiple gene fragments), and (ii) when the T3E gene sequence was located on a contig border (thus resulting in the annotation of a T3E gene fragment).
T3E genes were defined as pseudogenes in the following situations: (i) the structure of the T3E gene was strongly altered, with a gene size <50% of that of other known alleles or with a deletion of the N-terminal domain necessary for T3SS-dependent translocation, (ii) the T3E gene open reading frame was disrupted by the insertion of an IS element, or (iii) there was experimental evidence that the T3E gene product is not translocated or secreted by the T3SS.
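The frameshift and pseudogene rules above can be summarised as a small decision function. The sketch below is illustrative only: the `GeneCall` record, its field names and the "intact" label are hypothetical, and because the text does not state a precedence between the two annotations, both labels are returned when both sets of rules apply.

```python
from dataclasses import dataclass


@dataclass
class GeneCall:
    length: int                     # size of the annotated gene (nt)
    reference_length: int           # size of other known alleles (nt)
    n_fragments: int = 1            # contiguous ORFs matching the same Rip gene
    on_contig_border: bool = False  # gene truncated by a contig end
    nterm_deleted: bool = False     # N-terminal translocation domain missing
    is_disrupted: bool = False      # ORF interrupted by an IS element
    not_translocated: bool = False  # experimental evidence of no translocation


def annotate(call):
    """Return the list of annotation labels that apply to a T3E gene call."""
    labels = []
    if call.n_fragments > 1 or call.on_contig_border:
        labels.append("frameshift")
    if (call.length < 0.5 * call.reference_length or call.nterm_deleted
            or call.is_disrupted or call.not_translocated):
        labels.append("pseudogene")
    return labels or ["intact"]
```

Returning a list rather than a single label keeps ambiguous cases visible for manual curation.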
Detection of candidate effectors in sequenced genomes using “ScanYourGenome”
The first step of the pipeline we developed to detect putative effector candidates is a de novo proteome prediction. To achieve this, we run a blastx search of the genome against the T3E proteins and use the results as input for the prokaryotic gene predictor FrameD. This tool is run twice with the T3E nucleic coding sequences as model: the first pass is done with a high frameshift penalty score and the second with a lower one, allowing frameshift and pseudogene prediction. To ensure the completeness of this new effectome, we add translated regions matching a T3E member according to the blastx results.
The second step of the pipeline is the search for the homologous T3E member of each candidate. To obtain the best precision, we run different methods and synthesise the information, taking into account the specificity of each method and its parameters.
The first method is the search for homology using a modified version of the OrthoMCL pipeline. The modifications are: filter inactivation in the blastp preprocessing step with default parameters, and stepwise decrease of the percent match cutoff (from 90% to 60%) in ortholog clustering in order to retrieve shorter pseudogenes. The best blastp, hmmscan and tblastn hits are also kept in order to complete the OrthoMCL assignment or to resolve ambiguous multiple assignments, especially in the case of paralogous gene families.
The results are ordered according to the stringency of the method (OrthoMCL90 > OrthoMCL80 > OrthoMCL70 > OrthoMCL60 > blastp > HMMscan > tblastn). It is also indicated whether a frameshift mutation was introduced to produce a better homologous sequence. If the candidate gene is shorter than 80% of the average length of the cognate Rip gene, then the gene is tagged as a candidate.
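The ordering and length rules of this step can be sketched as follows. The pipeline itself is written in Perl; this Python fragment is only an illustration, and the function names, the dictionary layout and the family names used in the example are hypothetical.

```python
# Methods listed from most to least stringent, as in the text.
STRINGENCY = ["OrthoMCL90", "OrthoMCL80", "OrthoMCL70", "OrthoMCL60",
              "blastp", "HMMscan", "tblastn"]


def best_assignment(hits):
    """Pick the assignment coming from the most stringent method.

    `hits` maps a method name to the Rip family it assigned.
    Returns a (method, family) tuple, or None if no method gave a hit.
    """
    for method in STRINGENCY:
        if method in hits:
            return method, hits[method]
    return None


def is_short(candidate_len, cognate_lengths, fraction=0.8):
    """True if the candidate is shorter than `fraction` of the mean length
    of the cognate Rip genes, which triggers the 'candidate' tag."""
    mean_len = sum(cognate_lengths) / len(cognate_lengths)
    return candidate_len < fraction * mean_len
```

For example, `best_assignment({"tblastn": "RipX", "OrthoMCL70": "RipY"})` would retain the OrthoMCL70 assignment because it ranks above tblastn.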
This pipeline, written in Perl, is available through the T3E web interface and all parameters are available on demand.
Rip sequences were aligned using the ProGraphMSA program, which implements evolution-aware alignment [65, 66]. This program performs well with indel-rich data as well as with variation in tandem repeats such as leucine-rich repeats, as is often the case here. All phylogenies were reconstructed using a fast maximum likelihood (ML) heuristic search. For all individual Rip genes, we captured information from both nonsynonymous and synonymous sites by performing tree searches under the codon model M0 using CodonPhyML.
Since phylogenies for paralogous gene families described much more diverse datasets, they were reconstructed under the amino acid model LG with Γ-distributed rate variation among sites, as implemented in PhyML v3.0. Branch supports were estimated using the aBayes method, which is fast and accurate, with performance comparable to that of the Bayesian method. Phylogenetic trees were produced using the online software iTOL.
Analysis of selection pressures
Selection pressures were analysed on T3E gene datasets containing three or more orthologs. Selection pressures on T3E genes were evaluated using Markov models of codon substitution, and three pairs of likelihood ratio tests (LRTs) were used to detect positive selection, as previously described.
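For readers unfamiliar with LRTs, the arithmetic of a single test can be sketched as below. This is a generic illustration, not the code used in this study: the function names are hypothetical, and the choice of two degrees of freedom is an assumption made here because it gives the chi-square tail probability a closed form; the model pairs and degrees of freedom actually used are those of the cited method.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference between nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def lrt_pvalue_df2(stat):
    """Chi-square tail probability for 2 degrees of freedom.

    For df = 2 the survival function is exactly exp(-x / 2).
    Negative statistics (alternative fits no better) are clamped to 0.
    """
    return math.exp(-max(stat, 0.0) / 2.0)
```

With log-likelihoods of -1000.0 for the null model and -995.0 for the alternative, the statistic is 10.0 and the corresponding tail probability is exp(-5), i.e. below 0.01.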
Testing for recombination
The same data used for the selection pressure analysis were used to estimate the population recombination rates using the approximate-likelihood coalescent method and permutation test, as previously described.
Availability of supporting data
All the data presented in this work and supporting our analysis are available in a publicly accessible database that has been set up and will be maintained by us.
https://iant.toulouse.inra.fr/T3E is a website designed to provide the user with convenient and straightforward access to all the underlying data.
Out of the 841 Ralstonia solanacearum accessions used in this study, we have submitted 42 new accessions to GenBank and proposed modifications to the annotation of 289 other individual T3E gene accessions. All the GenBank accessions appear on the database webpage (under data/supplementary data) and also in Additional file 11.
BDB: Blood disease bacterium
HGT: Horizontal gene transfer
LRT: Likelihood ratio test
Rip: Ralstonia injected protein
RSSC: Ralstonia solanacearum species complex
T3E: Type III effector
T3SS: Type III secretion system.
We thank Jérôme Gouzy for advice and discussions. This work was supported by funds from the “Laboratoire d’Excellence” (LABEX) entitled TULIP (ANR-10-LABX-41) and grant 31003A_127325 from the Swiss National Science Foundation to M.A.
- Peeters N, Guidot A, Vailleau F, Valls M: Ralstonia solanacearum, a widespread bacterial plant pathogen in the post-genomic era. Mol Plant Pathol. 2013, 14: 651-662. 10.1111/mpp.12038.
- Mansfield J, Genin S, Magori S, Citovsky V, Sriariyanum M, Ronald P, Dow M, Verdier V, Beer SV, Machado MA, Toth I, Salmond G, Foster GD: Top 10 plant pathogenic bacteria in molecular plant pathology. Mol Plant Pathol. 2012, 13: 614-629. 10.1111/j.1364-3703.2012.00804.x.
- Elphinstone JG: The Current Bacterial Wilt Situation: A Global Overview. Bact Wilt Dis Ralstonia Solanacearum Species Complex. Edited by: Allen C, Prior P, Hayward AC. 2005, St Paul, MN, USA: APS Press, 9-28.
- Genin S: Molecular traits controlling host range and adaptation to plants in Ralstonia solanacearum. New Phytol. 2010, 187: 920-928. 10.1111/j.1469-8137.2010.03397.x.
- Cunnac S, Occhialini A, Barberis P, Boucher C, Genin S: Inventory and functional analysis of the large Hrp regulon in Ralstonia solanacearum: identification of novel effector proteins translocated to plant host cells through the type III secretion system. Mol Microbiol. 2004, 53: 115-128. 10.1111/j.1365-2958.2004.04118.x.
- Mukaihara T, Tamura N, Iwabuchi M: Genome-wide identification of a large repertoire of Ralstonia solanacearum type III effector proteins by a new functional screen. Mol Plant Microbe Interact. 2010, 23: 251-262. 10.1094/MPMI-23-3-0251.
- Poueymiro M, Genin S: Secreted proteins from Ralstonia solanacearum: a hundred tricks to kill a plant. Curr Opin Microbiol. 2009, 12: 44-52. 10.1016/j.mib.2008.11.008.
- Fegan M, Prior P: How Complex is the “Ralstonia Solanacearum Species Complex”. Bact Wilt Dis Ralstonia Solanacearum Species Complex. Edited by: Allen C, Prior P, Hayward AC. 2005, St Paul, MN, USA: APS Press, 449-461.
- Wicker E, Lefeuvre P, de Cambiaire J-C, Lemaire C, Poussier S, Prior P: Contrasting recombination patterns and demographic histories of the plant pathogen Ralstonia solanacearum inferred from MLSA. ISME J. 2012, 6: 961-974. 10.1038/ismej.2011.160.
- Genin S, Denny TP: Pathogenomics of the Ralstonia solanacearum species complex. Annu Rev Phytopathol. 2012, 50: 67-89. 10.1146/annurev-phyto-081211-173000.
- Remenant B, de Cambiaire J-C, Cellier G, Jacobs JM, Mangenot S, Barbe V, Lajus A, Vallenet D, Medigue C, Fegan M, Allen C, Prior P: Ralstonia syzygii, the blood disease bacterium and some Asian R. solanacearum strains form a single genomic species despite divergent lifestyles. PLoS One. 2011, 6: e24356. 10.1371/journal.pone.0024356.
- Guidot A, Prior P, Schoenfeld J, Carrère S, Genin S, Boucher C: Genomic structure and phylogeny of the plant pathogen Ralstonia solanacearum inferred from gene distribution analysis. J Bacteriol. 2007, 189: 377-387. 10.1128/JB.00999-06.
- Baltrus DA, Nishimura MT, Romanchuk A, Chang JH, Mukhtar MS, Cherkis K, Roach J, Grant SR, Jones CD, Dangl JL: Dynamic evolution of pathogenicity revealed by sequencing and comparative genomics of 19 Pseudomonas syringae isolates. PLoS Pathog. 2011, 7: e1002132. 10.1371/journal.ppat.1002132.
- Hajri A, Brin C, Hunault G, Lardeux F, Lemaire C, Manceau C, Boureau T, Poussier S: A “repertoire for repertoire” hypothesis: repertoires of type three effectors are candidate determinants of host specificity in Xanthomonas. PLoS One. 2009, 4: e6632. 10.1371/journal.pone.0006632.
- Mukaihara T, Tamura N: Identification of novel Ralstonia solanacearum type III effector proteins through translocation analysis of hrpB-regulated gene products. Microbiol Read Engl. 2009, 155 (Pt 7): 2235-2244.
- Angot A, Peeters N, Lechner E, Vailleau F, Baud C, Gentzbittel L, Sartorel E, Genschik P, Boucher C, Genin S: Ralstonia solanacearum requires F-box-like domain-containing type III effectors to promote disease on several host plants. Proc Natl Acad Sci U S A. 2006, 103: 14620-14625. 10.1073/pnas.0509393103.
- Remigi P, Anisimova M, Guidot A, Genin S, Peeters N: Functional diversification of the GALA type III effector family contributes to Ralstonia solanacearum adaptation on different plant hosts. New Phytol. 2011, 192: 976-987. 10.1111/j.1469-8137.2011.03854.x.
- Kajava AV, Anisimova M, Peeters N: Origin and evolution of GALA-LRR, a new member of the CC-LRR subfamily: from plants to bacteria? PLoS One. 2008, 3: e1694. 10.1371/journal.pone.0001694.
- Lavie M, Seunes B, Prior P, Boucher C: Distribution and sequence analysis of a family of type III-dependent effectors correlate with the phylogeny of Ralstonia solanacearum strains. Mol Plant Microbe Interact. 2004, 17: 931-940. 10.1094/MPMI.2004.17.8.931.
- Deslandes L, Olivier J, Peeters N, Feng DX, Khounlotham M, Boucher C, Somssich I, Genin S, Marco Y: Physical interaction between RRS1-R, a protein conferring resistance to bacterial wilt, and PopP2, a type III effector targeted to the plant nucleus. Proc Natl Acad Sci U S A. 2003, 100: 8024-8029. 10.1073/pnas.1230660100.
- Deslandes L, Olivier J, Theulieres F, Hirsch J, Feng DX, Bittner-Eddy P, Beynon J, Marco Y: Resistance to Ralstonia solanacearum in Arabidopsis thaliana is conferred by the recessive RRS1-R gene, a member of a novel family of resistance genes. Proc Natl Acad Sci U S A. 2002, 99: 2404-2409. 10.1073/pnas.032485099.
- Tasset C, Bernoux M, Jauneau A, Pouzet C, Brière C, Kieffer-Jacquinod S, Rivas S, Marco Y, Deslandes L: Autoacetylation of the Ralstonia solanacearum effector PopP2 targets a lysine residue essential for RRS1-R-mediated immunity in Arabidopsis. PLoS Pathog. 2010, 6: e1001202. 10.1371/journal.ppat.1001202.
- Solé M, Popa C, Mith O, Sohn KH, Jones JDG, Deslandes L, Valls M: The awr gene family encodes a novel class of Ralstonia solanacearum type III effectors displaying virulence and avirulence activities. Mol Plant Microbe Interact. 2012, 25: 941-953. 10.1094/MPMI-12-11-0321.
- Cunnac S, Boucher C, Genin S: Characterization of the cis-acting regulatory element controlling HrpB-mediated activation of the type III secretion system and effector genes in Ralstonia solanacearum. J Bacteriol. 2004, 186: 2309-2318. 10.1128/JB.186.8.2309-2318.2004.
- Sharma V, Firth AE, Antonov I, Fayet O, Atkins JF, Borodovsky M, Baranov PV: A pilot study of bacterial genes with disrupted ORFs reveals a surprising profusion of protein sequence recoding mediated by ribosomal frameshifting and transcriptional realignment. Mol Biol Evol. 2011, 28: 3195-3211. 10.1093/molbev/msr155.
- Edgar RC: MUSCLE: a multiple sequence alignment method with reduced time and space complexity. BMC Bioinforma. 2004, 5: 113. 10.1186/1471-2105-5-113.
- Mukaihara T, Tamura N, Murata Y, Iwabuchi M: Genetic screening of Hrp type III-related pathogenicity genes controlled by the HrpB transcriptional activator in Ralstonia solanacearum. Mol Microbiol. 2004, 54: 863-875. 10.1111/j.1365-2958.2004.04328.x.
- Lindeberg M, Stavrinides J, Chang JH, Alfano JR, Collmer A, Dangl JL, Greenberg JT, Mansfield JW, Guttman DS: Proposed guidelines for a unified nomenclature and phylogenetic analysis of type III Hop effector proteins in the plant pathogen Pseudomonas syringae. Mol Plant Microbe Interact. 2005, 18: 275-282. 10.1094/MPMI-18-0275.
- Lemoine F, Lespinet O, Labedan B: Assessing the evolutionary rate of positional orthologous genes in prokaryotes using synteny data. BMC Evol Biol. 2007, 7: 237. 10.1186/1471-2148-7-237.
- Kristensen DM, Wolf YI, Mushegian AR, Koonin EV: Computational methods for gene orthology inference. Brief Bioinform. 2011, 12: 379-391. 10.1093/bib/bbr030.
- Poueymiro M, Cunnac S, Barberis P, Deslandes L, Peeters N, Cazale-Noel A-C, Boucher C, Genin S: Two type III secretion system effectors from Ralstonia solanacearum GMI1000 determine host-range specificity on tobacco. Mol Plant Microbe Interact. 2009, 22: 538-550. 10.1094/MPMI-22-5-0538.
- Arlat M, Van Gijsegem F, Huet JC, Pernollet JC, Boucher CA: PopA1, a protein which induces a hypersensitivity-like response on specific Petunia genotypes, is secreted via the Hrp pathway of Pseudomonas solanacearum. EMBO J. 1994, 13: 543-553.
- Guéneron M, Timmers AC, Boucher C, Arlat M: Two novel proteins, PopB, which has functional nuclear localization signals, and PopC, which has a large leucine-rich repeat domain, are secreted through the hrp-secretion apparatus of Ralstonia solanacearum. Mol Microbiol. 2000, 36: 261-277. 10.1046/j.1365-2958.2000.01870.x.
- Li J-G, Liu H-X, Cao J, Chen L-F, Gu C, Allen C, Guo J-H: PopW of Ralstonia solanacearum, a new two-domain harpin targeting the plant cell wall. Mol Plant Pathol. 2010, 11: 371-381. 10.1111/j.1364-3703.2010.00610.x.
- Meyer D, Cunnac S, Guéneron M, Declercq C, Van Gijsegem F, Lauber E, Boucher C, Arlat M: PopF1 and PopF2, two proteins secreted by the type III protein secretion system of Ralstonia solanacearum, are translocators belonging to the HrpF/NopX family. J Bacteriol. 2006, 188: 4903-4917. 10.1128/JB.00180-06.
- Lavie M, Shillington E, Eguiluz C, Grimsley N, Boucher C: PopP1, a new member of the YopJ/AvrRxv family of type III effector proteins, acts as a host-specificity factor and modulates aggressiveness of Ralstonia solanacearum. Mol Plant Microbe Interact. 2002, 15: 1058-1068. 10.1094/MPMI.2002.15.10.1058.
- Carney BF, Denny TP: A cloned avirulence gene from Pseudomonas solanacearum determines incompatibility on Nicotiana tabacum at the host species level. J Bacteriol. 1990, 172: 4836-4843.
- Yabuuchi E, Kosako Y, Yano I, Hotta H, Nishiuchi Y: Transfer of two Burkholderia and an Alcaligenes species to Ralstonia gen. nov.: proposal of Ralstonia pickettii (Ralston, Palleroni and Doudoroff 1973) comb. nov., Ralstonia solanacearum (Smith 1896) comb. nov. and Ralstonia eutropha (Davis 1969) comb. nov. Microbiol Immunol. 1995, 39: 897-904. 10.1111/j.1348-0421.1995.tb03275.x.
- Goure J, Pastor A, Faudry E, Chabert J, Dessen A, Attree I: The V antigen of Pseudomonas aeruginosa is required for assembly of the functional PopB/PopD translocation pore in host cell membranes. Infect Immun. 2004, 72: 4741-4750. 10.1128/IAI.72.8.4741-4750.2004.
- Schesser K, Dukuzumuremyi JM, Cilio C, Borg S, Wallis TS, Pettersson S, Galyov EE: The Salmonella YopJ-homologue AvrA does not possess YopJ-like activity. Microb Pathog. 2000, 28: 59-70. 10.1006/mpat.1999.0324.
- Remenant B, Babujee L, Lajus A, Médigue C, Prior P, Allen C: Sequencing of K60, type strain of the major plant pathogen Ralstonia solanacearum. J Bacteriol. 2012, 194: 2742-2743. 10.1128/JB.00249-12.
- Cao Y, Tian B, Liu Y, Cai L, Wang H, Lu N, Wang M, Shang S, Luo Z, Shi J: Genome sequencing of Ralstonia solanacearum FQY_4, isolated from a bacterial wilt nursery used for breeding crop resistance. Genome Announc. 2013, 1: e00125-13.
- Li Z, Wu S, Bai X, Liu Y, Lu J, Liu Y, Xiao B, Lu X, Fan L: Genome sequence of the tobacco bacterial wilt pathogen Ralstonia solanacearum. J Bacteriol. 2011, 193: 6088-6089. 10.1128/JB.06009-11.
- Bernoux M, Timmers T, Jauneau A, Brière C, de Wit PJGM, Marco Y, Deslandes L: RD19, an Arabidopsis cysteine protease required for RRS1-R-mediated resistance, is relocalized to the nucleus by the Ralstonia solanacearum PopP2 effector. Plant Cell. 2008, 20: 2252-2264. 10.1105/tpc.108.058685.
- Didelot X, Maiden MCJ: Impact of recombination on bacterial evolution. Trends Microbiol. 2010, 18: 315-322. 10.1016/j.tim.2010.04.002.
- Dufraigne C, Fertil B, Lespinats S, Giron A, Deschavanne P: Detection and characterization of horizontal transfers in prokaryotes using genomic signature. Nucleic Acids Res. 2005, 33: e6. 10.1093/nar/gni004.
- Kado CI: Horizontal gene transfer: sustaining pathogenicity and optimizing host-pathogen interactions. Mol Plant Pathol. 2009, 10: 143-150. 10.1111/j.1364-3703.2008.00518.x.
- De Lange O, Schreiber T, Schandry N, Radeck J, Braun KH, Koszinowski J, Heuer H, Strauss A, Lahaye T: Breaking the DNA-binding code of Ralstonia solanacearum TAL effectors provides new possibilities to generate plant resistance genes against bacterial wilt disease. New Phytol. 2013, 199: 773-786. 10.1111/nph.12324.
- Fall S, Mercier A, Bertolla F, Calteau A, Gueguen L, Perrière G, Vogel TM, Simonet P: Horizontal gene transfer regulation in bacteria as a “spandrel” of DNA repair mechanisms. PLoS One. 2007, 2: e1055. 10.1371/journal.pone.0001055.
- Wolf YI, Koonin EV: A tight link between orthologs and bidirectional best hits in bacterial and archaeal genomes. Genome Biol Evol. 2012, 4: 1286-1294. 10.1093/gbe/evs100.
- Gil M, Zanetti MS, Zoller S, Anisimova M: CodonPhyML: fast maximum likelihood phylogeny estimation under codon substitution models. Mol Biol Evol. 2013, 30: 1270-1280. 10.1093/molbev/mst034.
- Anisimova M, Nielsen R, Yang Z: Effect of recombination on the accuracy of the likelihood method for detecting positive selection at amino acid sites. Genetics. 2003, 164: 1229-1236.
- Reed FA, Tishkoff SA: Positive selection can create false hotspots of recombination. Genetics. 2006, 172: 2011-2014.
- O’Reilly PF, Birney E, Balding DJ: Confounding between recombination and selection, and the Ped/Pop method for detecting selection. Genome Res. 2008, 18: 1304-1313. 10.1101/gr.067181.107.
- Ryan RP, Vorhölter F-J, Potnis N, Jones JB, Van Sluys M-A, Bogdanove AJ, Dow JM: Pathogenomics of Xanthomonas: understanding bacterium-plant interactions. Nat Rev Microbiol. 2011, 9: 344-355. 10.1038/nrmicro2558.
- Baltrus DA, Nishimura MT, Dougherty KM, Biswas S, Mukhtar MS, Vicente J, Holub EB, Dangl JL: The molecular basis of host specialization in bean pathovars of Pseudomonas syringae. Mol Plant Microbe Interact. 2012, 25: 877-888. 10.1094/MPMI-08-11-0218.
- Singer AU, Rohde JR, Lam R, Skarina T, Kagan O, Dileo R, Chirgadze NY, Cuff ME, Joachimiak A, Tyers M, Sansonetti PJ, Parsot C, Savchenko A: Structure of the Shigella T3SS effector IpaH defines a new class of E3 ubiquitin ligases. Nat Struct Mol Biol. 2008, 15: 1293-1301. 10.1038/nsmb.1511.
- Rohde JR, Breitkreutz A, Chenal A, Sansonetti PJ, Parsot C: Type III secretion effectors of the IpaH family are E3 ubiquitin ligases. Cell Host Microbe. 2007, 1: 77-83. 10.1016/j.chom.2007.02.002.
- Dudler R: Manipulation of Host Proteasomes as a Virulence Mechanism of Plant Pathogens. Annu Rev Phytopathol. 2013, 51: 521-542. 10.1146/annurev-phyto-082712-102312.
- Angot A, Vergunst A, Genin S, Peeters N: Exploitation of eukaryotic ubiquitin signaling pathways by effectors translocated by bacterial type III and type IV secretion systems. PLoS Pathog. 2007, 3: e3. 10.1371/journal.ppat.0030003.
- Büttner D, Nennstiel D, Klüsener B, Bonas U: Functional analysis of HrpF, a putative type III translocon protein from Xanthomonas campestris pv. vesicatoria. J Bacteriol. 2002, 184: 2389-2398. 10.1128/JB.184.9.2389-2398.2002.
- Dsouza M, Larsen N, Overbeek R: Searching for patterns in genomic data. Trends Genet TIG. 1997, 13: 497-498.
- Schiex T, Gouzy J, Moisan A, de Oliveira Y: FrameD: a flexible program for quality check and gene prediction in prokaryotic genomes and noisy matured eukaryotic sequences. Nucleic Acids Res. 2003, 31: 3738-3741. 10.1093/nar/gkg610.
- Li L, Stoeckert CJ, Roos DS: OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 2003, 13: 2178-2189. 10.1101/gr.1224503.
- Szalkowski AM, Anisimova M: Markov models of amino acid substitution to study proteins with intrinsically disordered regions. PLoS One. 2011, 6: e20488. 10.1371/journal.pone.0020488.
- Szalkowski A, Anisimova M: Graph-based modeling of tandem repeats improves global multiple sequence alignment. Nucleic Acids Res. 2013, 41: e162. 10.1093/nar/gkt628.
- Goldman N, Yang Z: A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol Biol Evol. 1994, 11: 725-736.
- Le SQ, Gascuel O: An improved general amino acid replacement matrix. Mol Biol Evol. 2008, 25: 1307-1320. 10.1093/molbev/msn067.
- Yang Z: Maximum likelihood phylogenetic estimation from DNA sequences with variable rates over sites: approximate methods. J Mol Evol. 1994, 39: 306-314. 10.1007/BF00160154.
- Guindon S, Dufayard J-F, Lefort V, Anisimova M, Hordijk W, Gascuel O: New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst Biol. 2010, 59: 307-321. 10.1093/sysbio/syq010.
- Anisimova M, Gil M, Dufayard J-F, Dessimoz C, Gascuel O: Survey of branch support methods demonstrates accuracy, power, and robustness of fast likelihood-based approximation schemes. Syst Biol. 2011, 60: 685-699. 10.1093/sysbio/syr041.
- Letunic I, Bork P: Interactive Tree Of Life v2: online annotation and display of phylogenetic trees made easy. Nucleic Acids Res. 2011, 39 (Web Server issue): W475-W478. 10.1093/nar/gkr201.
- McVean G, Awadalla P, Fearnhead P: A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics. 2002, 160: 1231-1241.
- Yang Z, Nielsen R, Goldman N, Pedersen AM: Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics. 2000, 155: 431-449.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.