Comparative sequence analysis of Solanum and Arabidopsis in a hot spot for pathogen resistance on potato chromosome V reveals a patchwork of conserved and rapidly evolving genome segments
© Ballvora et al; licensee BioMed Central Ltd. 2007
Received: 16 November 2006
Accepted: 2 May 2007
Published: 2 May 2007
Quantitative phenotypic variation of agronomic characters in crop plants is controlled by environmental and genetic factors (quantitative trait loci = QTL). To understand the molecular basis of such QTL, the identification of the underlying genes is of primary interest and DNA sequence analysis of the genomic regions harboring QTL is a prerequisite for that. QTL mapping in potato (Solanum tuberosum) has identified a region on chromosome V tagged by DNA markers GP21 and GP179, which contains a number of important QTL, among others QTL for resistance to late blight caused by the oomycete Phytophthora infestans and to root cyst nematodes.
To obtain genomic sequence for the targeted region on chromosome V, two local BAC (bacterial artificial chromosome) contigs were constructed and sequenced, which corresponded to parts of the homologous chromosomes of the diploid, heterozygous genotype P6/210. Two contiguous sequences of 417,445 and 202,781 base pairs were assembled and annotated. Gene-by-gene co-linearity was disrupted by non-allelic insertions of retrotransposon elements, stretches of diverged intergenic sequences, differences in gene content and gene order. The latter was caused by inversion of a 70 kbp genomic fragment. These features were also found in comparison to orthologous sequence contigs from three homeologous chromosomes of Solanum demissum, a wild tuber bearing species. Functional annotation of the sequence identified 48 putative open reading frames (ORF) in one contig and 22 in the other, with an average of one ORF every 9 kbp. Ten ORFs were classified as resistance-gene-like, 11 as F-box-containing genes, 13 as transposable elements and three as transcription factors. Comparing potato to Arabidopsis thaliana annotated proteins revealed five micro-syntenic blocks of three to seven ORFs with A. thaliana chromosomes 1, 3 and 5.
Comparative sequence analysis revealed highly conserved collinear regions that flank regions showing high variability and tandem duplicated genes. Sequence annotation revealed that the majority of the ORFs were members of multiple gene families. Comparing potato to Arabidopsis thaliana annotated proteins suggested fragmented structural conservation between these distantly related plant species.
The potato (Solanum tuberosum) is the most important crop of the Solanaceae. It is a tetraploid, non-inbred, annual plant species that is vegetatively propagated by tubers. Polyploidy and inbreeding depression prevent the generation of homozygous lines. When the ploidy level is reduced from 4n to 2n, the diploid potatoes are self-incompatible. Potato genotypes at all ploidy levels are therefore heterozygous. The basic chromosome number of potato is twelve and its genome size is on the order of 800 to 1000 megabases, similar to the closely related tomato (Solanum lycopersicum). Detailed RFLP (restriction fragment length polymorphism) linkage maps have been constructed for the twelve chromosomes [2–5, 63], which were subsequently used to locate in the potato genome factors controlling monogenic and polygenic traits of agronomic relevance such as resistance to pests and pathogens or tuber quality (e.g. starch and sugar content) (reviewed in [1, 6]). When the same locus-specific DNA-based markers are used in different mapping populations, the positional information of the mapped factors controlling qualitative and quantitative traits can be compared and integrated. This comparison showed that a number of the factors which control qualitative (R genes) or quantitative resistance (QRL = quantitative resistance loci) to different types of pathogens map to similar positions. These chromosomal regions are so-called hot-spots for pathogen resistance. One of the most conspicuous resistance hot-spots in the potato genome is located on potato chromosome V, in a chromosome segment tagged by the DNA-based markers GP21 and GP179. The 3 cM interval between GP21 and GP179 includes the R genes Rx2 and Nb, both for resistance to Potato Virus X [8, 9], and the R1 gene for race-specific resistance to the oomycete Phytophthora infestans causing late blight [10, 7]. The same markers are also linked to QRL for P. infestans [11–14] and QRL for the root cyst nematodes Globodera rostochiensis and G. pallida [15–18]. As shown by QTL mapping [12–14, 19, 20], this region on potato chromosome V not only contains genes for resistance to various pathogens but also genes controlling plant vigor, plant maturity (the time the plant needs from planting to reach maturity under long day conditions), tuber yield, tuber starch and tuber sugar content.
Two R genes from the chromosome V resistance hot-spot have been functionally characterized: Rx2 for extreme resistance to Potato Virus X and R1 for resistance to P. infestans. Both R genes are members of the superfamily of plant resistance genes characterized by a coiled coil (CC), a nucleotide binding (NB) and a leucine rich repeat (LRR) domain, but otherwise share low sequence similarity. R1 has been introgressed from the allo-hexaploid wild potato species Solanum demissum into the cultivated potato germplasm pool [1, 25] and is one member of a clustered gene family in the GP21–GP179 interval [22, 25]. The molecular basis of the QRL for late blight and root cyst nematodes in the same region is unknown. One possibility is that alleles of the R1 and/or the Rx2 gene, or other members of the R1 gene family and/or another resistance-gene-like (RGL) family in this genome region, encode the factors for the quantitative resistance phenotypes. This would be similar to classical plant genetic studies, where resistance loci with multiple specificities to different races of a pathogen may be alleles of the same gene, or tightly linked genes [11, 26]. However, the resolution of QTL mapping in this region of the potato genome, even when based on large populations related by descent, is low. Genes physically linked to the R1 family, but structurally and functionally unrelated to R1 or other RGLs, cannot be excluded as candidates for the QRL of interest. High resolution mapping and map-based cloning of QTL based on recombinant inbred populations or near isogenic lines is not feasible in potato due to self-incompatibility.
As an alternative approach, genomic sequencing and annotation can provide information on all putative genes present in the whole region, which then might be further examined for function as quantitative trait loci, first in silico by functional annotation and then experimentally by analysis of natural allelic diversity of positional candidates and complementation analysis using candidate gene allelic variants. Powerful bioinformatic tools for gene annotation and functional analysis of sequence related genes in model plants such as Arabidopsis thaliana and rice can facilitate the selection of functional candidate genes among all the positional candidates in a genome segment harboring QTL.
Parts of the genomic region corresponding to the GP21–GP179 interval in S. demissum were sequenced and three different haplotypes A, B and C equivalent to the three homeologous chromosome pairs of S. demissum were identified. That study demonstrated substantial structural variation among the haplotype sequences. In this paper, we report the genomic sequence analysis of two orthologous chromosome segments of the heterozygous, diploid Solanum tuberosum genotype P6/210 in the same GP21–GP179 interval. Two independent bacterial artificial chromosome (BAC) contigs, corresponding to the homologous chromosomes of P6/210, were constructed, sequenced and annotated, thereby extending the genomic sequence information available in this functionally important region of the potato genome. Comparative sequence analysis revealed regions with severe structural distortions and deviations from gene-by-gene co-linearity, which are flanked by conserved regions showing microsynteny with the A. thaliana genome.
Contig construction and BAC sequencing
The potato clone P6/210 used to construct the BAC libraries was heterozygous for R1 (R1/r1). Physical mapping in the context of cloning the R1 resistance gene had identified overlapping BAC insertions that originated from the homologous chromosomes carrying either the R1 resistance allele or an r1 allele for susceptibility. In order to obtain two contigs, one for each homologous chromosome in the R1 genomic region, the physical map was extended. Subsequently, we refer to the two contigs as the R1-contig and the r1-contig.
Summary of sequenced BAC insertions of genotype P6/210 (table body not recovered; one column gave the sequence length [bp]).
Genomic sequence analysis
The 620,226 bp genomic sequence obtained from seven R1- and three r1-BAC insertions was assembled into two distinct, unambiguous stretches of DNA sequence corresponding to the R1- and r1-contig of 417,445 base pairs [GenBank:EF514212] and 202,781 base pairs [GenBank:EF514213], respectively, using MegaMerger. Overall GC content in the R1- and r1-contig is 33.26% and 34.11%, respectively.
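The total sequence length and the GC percentages quoted above are simple quantities to recompute from the deposited sequences. The following is a minimal sketch, not the authors' procedure: the function name `gc_content` and the choice to ignore ambiguous bases are assumptions made for illustration, and only the two contig lengths are taken from the text.

```python
def gc_content(seq: str) -> float:
    """Return the GC content in percent.

    Assumption for this sketch: only unambiguous bases (A, C, G, T) are
    counted, so N or other IUPAC codes do not affect the result.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / total


# Contig lengths in base pairs as reported in the text.
contig_bp = {"R1": 417_445, "r1": 202_781}
total_bp = sum(contig_bp.values())

print(f"total assembled sequence: {total_bp:,} bp")
print(f"GC content of a toy sequence: {gc_content('aattgn'):.1f}%")
```

Applied to the full GenBank records, `gc_content` would be expected to give values close to the reported percentages, although the exact figure depends on how ambiguous bases are treated.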
For easier reference, we label regions along the R1 contig from A to F (Figure 2) based on features revealed by comparison of the R1 with the r1 contig using MUMmer (Figure 2). For region A, no sequence was obtained from r1. Region B is highly similar to the start of contig r1. In region C, co-linearity and alignment are disturbed, and similarity is primarily detected in three tandem repeats. No similarity to r1 is detected in region D. Region E (69,850 bp) again aligns well with r1, but in reverse orientation, indicating a genomic inversion. Region F, resembling region C, is not aligned well but contains two tandem repeats that are highly similar to the tandem repeats in region C, but in inverse orientation. Region B contains a palindromic structure discussed in more detail below.
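The idea that a region aligns "in reverse orientation" can be illustrated with a much simpler method than MUMmer. The sketch below is an assumption-laden stand-in, not the MUMmer algorithm: it counts k-mers shared between a reference and a query in forward orientation and after reverse-complementing the query. All function names and the toy sequence are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> set:
    """Return the set of all distinct k-mers in seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def orientation_votes(reference: str, query: str, k: int = 8) -> tuple:
    """Count k-mers shared with the reference in forward and reverse orientation."""
    ref_kmers = kmers(reference, k)
    forward = len(ref_kmers & kmers(query, k))
    reverse = len(ref_kmers & kmers(reverse_complement(query), k))
    return forward, reverse


def likely_inverted(reference: str, query: str, k: int = 8) -> bool:
    """True if the query matches the reference better after reverse-complementing."""
    forward, reverse = orientation_votes(reference, query, k)
    return reverse > forward


# Toy example: a made-up segment and its inverted copy.
segment = "AACCACACCCAAACACCACA"
print(likely_inverted(segment, segment))
print(likely_inverted(segment, reverse_complement(segment)))
```

A real comparison of contigs several hundred kbp long would need maximal unique matches and positional chaining, which is what dedicated aligners provide; the sketch only conveys why an inverted segment shows up as matches on the opposite strand.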
Comparison to the A, B and C haplotypes of the orthologous genome region in Solanum demissum revealed similar features (supplementary Fig. S3 and S4). The R1 contig is most closely related to the A sequence, with co-linearity and high sequence identity (99%). However, we orient BAC PGEC472P22 (accession AC151815) of A in reverse orientation compared to the orientation given by Kuang et al., thereby introducing the inversion of A and R1 relative to r1, B and C (see Discussion). The r1 contig is more similar to the sequences of B and C haplotypes of S. demissum.
Regions A and B show co-linearity with the S. demissum genomic regions II and III as defined by another study, whereas regions C through F correspond to S. demissum genomic regions IV and V.
We predict that the R1 contig contains 48 genes, seven of which are transposons and three are pseudogenes. Thirteen protein coding genes on the r1 contig can be identified as orthologs of collinear genes on R1. However, on R1 there are two additional protein coding genes. Six transposon genes are specific to r1, and for one pseudogene, we could not identify a partner on R1. On average, one ORF is annotated in every 9 kbp of genomic sequence. The average GC content of exons is 39%.
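The stated density of about one ORF per 9 kbp can be checked from the contig lengths and ORF counts given in the text (48 ORFs on the R1 contig and 22 on the r1 contig). This is a small arithmetic sketch; the helper name `kbp_per_orf` is hypothetical.

```python
def kbp_per_orf(length_bp: int, n_orfs: int) -> float:
    """Average spacing between annotated ORFs in kbp."""
    if n_orfs <= 0:
        raise ValueError("n_orfs must be positive")
    return length_bp / n_orfs / 1000.0


# (length in bp, number of annotated ORFs) as reported in the text.
contigs = {"R1": (417_445, 48), "r1": (202_781, 22)}

for name, (length_bp, n_orfs) in contigs.items():
    print(f"{name}: one ORF per {kbp_per_orf(length_bp, n_orfs):.2f} kbp")

total_bp = sum(length for length, _ in contigs.values())
total_orfs = sum(n for _, n in contigs.values())
overall = kbp_per_orf(total_bp, total_orfs)
print(f"overall: one ORF per {overall:.2f} kbp")
```

Both contigs and the combined figure round to roughly 9 kbp per ORF, consistent with the value quoted above.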
Genomic sequence annotation of the R1- and r1-contig.
ORF No. (1)
Manual Functional Annotation (2)
Manual Functional Annotation (2)
Fragment of disease resistance protein
ZF-HD homeobox protein
Ribosomal protein L34e
No apical meristem (NAM)-like
Retrotransposon RNAse containing
F-box associated domain containing protein
Pseudogene (F-box protein-like)
RNAseH containing reverse transcriptase
TCP transcription factor
RNA dependent RNA polymerase
Hypothetical protein (Pseudogene)
Origin recognition complex subunit 6
RNA dependent RNA polymerase
Sterol desaturase/Acid phosphatase
CAAX amino terminal protease
Disease resistance protein (R1)
Disease resistance protein (R1_1, 82% identity to R1)
Phytochrome kinase substrate
Disease resistance protein (R1_2, 49.3% identity to R1)
Regulator of chromosome condensation, RCC1
Disease resistance protein (R1-3, 74.7% identity to R1)
Hypothetical protein (Pseudogene)
Disease resistance protein (R1-6, 83.5% identity to R1)
Disease resistance protein (R1_4, 80.2% identity to R1)
Disease resistance protein (R1_7, 80.7% identity to R1)
Disease resistance protein (R1_5, 85.9% identity to R1)
Disease resistance protein (R1_8, 83.2% identity to R1)
Where sequence from both contigs was available, most proteins and some transposon-related genes are conserved between the R1 and r1 contig (Figure 1 and Additional file 5). Notable exceptions are 13 retroelements, six of which were found only in the r1 contig (ORF 14/1, ORF 19/1, ORF 21, ORF 49, ORF 51 and ORF 53) and seven only in the R1 contig (ORF 7, ORF 9, ORF 11, ORF 12, ORF 25, ORF 28 and ORF 40). The palindromic structure in region B is formed by an inverted repeat of two highly similar RNA-directed RNA polymerases (ORFs 14 and 16), separated by a hypothetical protein (ORF 15). The r1 specific retrotransposon 14/1 is inserted between the inverted repeats. Region E contains five genes that are conserved between R1 and r1 but in reverse order and orientation, indicating a genomic inversion. The proximal two genes are members of the R1 family, ORF 44 being the functional R1 resistance gene (accession AF447489) and the following ORF 45 a tandem duplicate. ORFs 44 and 45 are conserved in sequence with resistance gene homologues 52 and 54 on contig r1, but in inverse order and orientation, suggesting that the inversion includes these two genes. However, proteins 44 and 45 are more similar to each other than to 52 or 54, indicating that they may have arisen through tandem duplication after the inversion event. Another resistance gene homologue (ORF 46) follows on contig R1, but is less similar to the other R1-related genes. In addition, two genes (ORFs 47 and 48) could not be related to any locus on contig r1. This suggests that the proximal breakpoint of the genomic inversion in contig R1 is after gene 44 or 45.
To map the distal breakpoint of the inversion event on contig R1, we used sequence from S. demissum BAC PGEC093P17 (accession AC149290), as the sequence of contig r1 did not extend sufficiently far in proximal direction. The PGEC093P17 sequence contains orthologs of proteins 43, 42, 41, 39 and 38 in the same order and orientation as found on contig r1, and therefore inverted with respect to R1. As in r1, the R1 specific transposon ORF 40 is not found in PGEC093P17 (Fig. 1). Proximal to protein 38 follow probable orthologs of proteins 37, 36 and 35 in reverse order and orientation compared to contig R1, indicating that these three genes are part of the inversion. The remaining PGEC093P17 sequence proximal to ORF 35 contains only transposon fragments and hypothetical proteins, which are unrelated to genes distal to ORF 35 (mainly F-box genes, Table 2) in the R1-contig. This maps the distal inversion breakpoint in contig R1 between genes 34 and 35. As a consequence of the inversion, genes 23 to 34 in the R1 contig are probably part of a region that is, at least at this genomic location, R1 specific. This region includes two R1 homologous genes (ORF 23 and 24) in tandem orientation to ORF 22, two transposon-related genes and a series of eight F-box genes. For the flanking regions of contig R1, we found almost perfect gene-by-gene co-linearity to r1 or PGEC093P17. We could not assign three genes in contig r1 to any collinear region, two of which are transposons and one resembles a fragmented resistance gene.
Microsynteny with the A. thaliana genome
Syntenic blocks between potato annotated genes in the R1- and r1-contigs and sequence related genes of A. thaliana.
S. tuberosum ORF
A. thaliana ORF (1)
A. thaliana BAC
A. thaliana ORF position (1) [Mbp]
A. thaliana block size
S. tuberosum block size
We found extensive co-linearity of protein-coding genes, interrupted by unilateral insertions of retrotransposons and a region of highly diverged DNA sequence in the vicinity of clusters of tandem duplicated genes. This corresponds to findings in the orthologous genomic region of hexaploid S. demissum. We show that a 70 kb region containing ten protein-coding genes is inverted in the R1 contig and presumably also in the A haplotype of S. demissum. While sequence similarity at the nucleotide level was not sufficient to precisely map the inversion breakpoints in the intergenic regions, order and orientation of homologous gene pairs were consistent without exception and allowed us to estimate the position of the inversion breakpoints. This inversion is not evident in the available sequence of the S. demissum A haplotype, as BAC PGEC472P22 (accession AC151815) mainly contains genes from the inversion and only a few beyond the inversion breakpoint. The gap in the sequence of haplotype A presumably led to BAC PGEC472P22 being oriented to achieve co-linearity with B and C haplotypes. Kuang et al. do not present any data, e.g. mapping of BAC ends or overlaps with neighboring BAC clones, to verify the orientation. Thus, we take the high sequence similarity, uninterrupted across the presumed inversion break point, between the R1 contig and PGEC472P22 to indicate that the inversion also exists in the A haplotype (Additional files 3 and 4).
Currently, there are only a few examples where comparative structural analysis of orthologous genome segments was performed in crop plants over several hundred kb. In all cases reported, micro-structural diversity was found, e.g. in Zea mays [30, 31]. In maize the major contribution to diversity stems from LTR retrotransposons. In our analysis transposon related genes are less conserved, also suggesting recent insertions and deletions. In tomato (S. lycopersicum), a segment on chromosome 6 containing the Mi-1 gene for resistance to root knot nematodes, which has been introgressed from S. peruvianum, was shown to be inverted in resistant when compared to susceptible tomato genotypes. The situation at the potato R1 locus described here remarkably resembles this finding. The R1-contig is part of a genome fragment of unknown size introgressed from S. demissum into S. tuberosum, whereas the r1-contig originated either from S. tuberosum or S. spegazzinii, another closely related tuber bearing Solanum species. P40, the parental donor of the r1 allele, was an inter-specific hybrid between S. tuberosum and S. spegazzinii. The structural differences between the homologous chromosomes could interfere with chromosome pairing and crossing-over during meiosis, explaining the low frequency of recombination observed in this region. Similarly, in regions on tomato chromosomes 6 and 11, where the resistance loci Mi and Tm2a, respectively, have been introgressed from the wild species S. peruvianum, a high degree of recombination suppression was observed [32, 34].
In the R1 contig, the inversion seems to have separated the R1 resistance gene from a tandem array of R1 homologs (proteins 22, 23 and 24). We attempted to date the inversion relative to the duplications of tandem resistance genes by phylogenetic analysis of the protein sequences, but the results were inconclusive (data not shown). Proteins 23, 24, 44 and 45 clustered together. Proteins 22 and 22/1 also clustered together, and even though gene 22 on the R1 contig is truncated at the N-terminus relative to 22/1 and the other R1 homologous proteins, we assume these to be an orthologous pair. For the other R1 homologues, allelic relationships are not clear, and they may have arisen through duplication after the divergence of R1 and r1.
Kuang et al. analyzed the same genomic region on potato chromosome V between three homeologous chromosomes of the allo-hexaploid potato species S. demissum. Alignment of the sequenced S. demissum BACs with the R1- and r1-contig identified haplotypes A and B/C as most similar but not identical to the R1- and r1-contig, respectively. Similarity of individual homologous protein pairs between B and r1 or C and r1 ranges between 90 and 99% identity on the amino acid level, with most, but not all proteins slightly more similar between C and r1 than between B and r1. On the other hand, the B haplotype is structurally more similar to r1, as C shows no tandem repeated R1 homologous genes but only one single copy.
We found a gene density of one gene every 9 kb, which is similar to previous findings in S. demissum (7.6 kb) and tomato (8 kb, [36, 37]), but lower than in A. thaliana (5 kb) and rice (6 kbp), and higher than in barley, where only three genes were found in a stretch of 60 kbp genomic DNA. The overall GC content was 37% and 39.5% within the putative gene coding regions. These values are comparable to tomato (37% overall GC content and 42% in coding regions) and A. thaliana (36% overall GC content and 44% in coding regions) but lower than in rice (44% overall GC content and 54% in coding regions) and maize (47% overall GC content and 55% in coding regions).
The R1 gene family and disease resistance QTL
Annotation of the R1 gene family in S. demissum haplotype A and in the R1-contig was comparable but not identical. In the R1-contig, six putative full-length R1 homologous genes and no partial homolog were annotated besides the R1 resistance gene itself. The six members of the R1 gene family were organized in two clusters of three genes each (proteins 22, 23, 24 and 44 = R1, 45, 46). In the corresponding region of S. demissum haplotype B, four putative full length R1 homologous genes and six partial homologues were identified. In S. demissum haplotype C, one complete R1 homologue was found, whereas three complete members of the R1 gene family (proteins 22/1, 52 and 54) were annotated in contig r1 (Figures 1 and 3). Allelic relationships between the R1 homologues could not be deduced with certainty, but proteins 22 and 22/1 might be allelic based on collinear positions in the R1- and r1-contig, whereas proteins 44 and 45 might be allelic to 54 and 52, respectively, as they are collinear under the assumption that they are part of the proposed genomic inversion.
Most of the molecularly characterized plant R genes are members of tightly linked gene families. The R1 gene family is no exception in this respect. Allelic variants of the nine identified members of the R1 family or additional paralogous members that are not present in genotype P6/210 are non-exclusive candidates for the quantitative resistance traits in the resistance hot spot on potato chromosome V. At this point, we only know that some of the R1 homologues likely have functions other than the R1 resistance gene. This is based on the observation that the R1 homologue encoded by ORF 45 was not able to complement the R1 race specific resistance phenotype (A. B. unpublished results).
Genes sequence-related to retrotransposons, ribosomal genes, RNA dependent RNA polymerase and pseudogenes are ranked low as functional candidates for the quantitative traits in this region of the potato genome. The remaining 30 putative genes, including hypothetical genes and genes with unknown function, are all positional candidates for the QTL. Of particular interest as new candidates for quantitative resistance loci, besides the members of the R1 gene family, are the members of the F-box domain family. F-box proteins are involved in various signaling pathways in A. thaliana, and recently, F-box proteins have been suggested to function as receptors for various plant hormones. Furthermore, an F-box domain was identified in the SGT1 protein that was shown to play a role as co-chaperone in the stabilization of R-proteins [43, 44]. With the annotation of the sequenced region, the list of positional candidate genes for the QTL is certainly not complete, as the sequence covers only part of the GP21–GP179 interval. To ultimately validate the role of any candidate gene for a QTL, complementation analysis with allelic variants is required. Unless high-throughput methods for complementation analysis become available, strategies to reduce the number of candidate genes to be considered for complementation analysis are necessary. For example, we perform functional testing by expression studies and by down-regulation of candidate gene expression by antisense or RNAi approaches. The model plant A. thaliana may also be used to study the function of genes that are most closely sequence-related to potato positional candidate genes. Unfortunately, this approach may not be applicable to the F-box family, as this is also a highly expanded gene family in A. thaliana with diverse cellular roles.
Synteny with A. thaliana
We identified at least five microsyntenic relationships between the R1 contig and A. thaliana. These cover varying stretches of the genome, ranging from just four consecutive genes within 7 kb of A. thaliana and 25 kb of potato to basically the entire R1 contig, covering 405 kb. Frequent insertion-deletion events can be detected. Similar patterns of interrupted co-linearity on the DNA sequence level were found among cereals and between A. thaliana and rice. In the highly collinear tomato genome [2, 5], genomic sequences of 57 kbp and 106 kbp on chromosomes 2 and 7, respectively, have been compared to the A. thaliana genomic sequence. These studies revealed syntenic blocks of comparable redundancy and size with respect to the A. thaliana syntenic regions. In contrast to these previous studies, the contiguous potato sequence compared was 4 to 8 times longer. This revealed that the syntenic potato genes in the R1-contig were organized in three clusters (ORF 2 to 5, ORF 17 to 19 and ORF 38 to 48) that were separated by two non-syntenic regions (ORF 6 to 16 and ORF 20 to 37). In the r1-contig, a non-syntenic region (ORF 20 to 54) separated two syntenic regions (ORF 17 to 19 and ORF 38 to 43).
The most notable of the syntenic relationships spans almost the complete R1 contig (405 kbp) and 54 kbp of A. thaliana chromosome 1. Seven genes are conserved in sequence, order and orientation, except for two from region E that show reverse order and orientation compared to A. thaliana. This could indicate that the genomic inversion occurred in the R1 lineage after the divergence of A. thaliana and potato, with r1 and S. demissum B and C haplotypes showing the ancestral orientation. In the R1 contig, a large number (18) of genes do not show synteny, whereas in the A. thaliana region this only applies to five genes. The discrepancy is less pronounced if the 17 tandemly duplicated genes in potato are ignored.
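To make the notion of a micro-syntenic block concrete, the sketch below groups ortholog pairs into runs that are close together and consistently ordered in both genomes, allowing either the same or the reversed orientation. It is an illustrative simplification, not the method used in this study: the function name, the `max_gap` and `min_size` thresholds and the gene ranks in the example are all hypothetical.

```python
from typing import List, Tuple

# (gene rank along the potato contig, gene rank along an A. thaliana chromosome)
Pair = Tuple[int, int]


def collinear_blocks(pairs: List[Pair], max_gap: int = 5,
                     min_size: int = 3) -> List[List[Pair]]:
    """Group ortholog pairs into collinear blocks.

    A block is extended while the next pair lies within max_gap genes in
    both genomes and keeps the same direction (forward or reversed) in
    the second genome. Blocks with fewer than min_size pairs are dropped.
    """
    blocks: List[List[Pair]] = []
    current: List[Pair] = []
    direction = 0  # +1 forward, -1 reversed, 0 not yet determined
    for pair in sorted(set(pairs)):
        if current:
            d_potato = pair[0] - current[-1][0]
            d_ath = pair[1] - current[-1][1]
            step = (d_ath > 0) - (d_ath < 0)
            if (0 < d_potato <= max_gap and 0 < abs(d_ath) <= max_gap
                    and direction in (0, step)):
                current.append(pair)
                direction = step
                continue
            if len(current) >= min_size:
                blocks.append(current)
            current, direction = [], 0
        current.append(pair)
    if len(current) >= min_size:
        blocks.append(current)
    return blocks


# Made-up ranks: one forward block, one reversed block, one isolated hit.
toy_pairs = [(2, 100), (3, 101), (4, 102), (5, 104),
             (17, 300), (18, 299), (19, 298), (30, 50)]
blocks = collinear_blocks(toy_pairs)
for block in blocks:
    print(block)
```

In this toy input the second block runs in reversed order in the second genome, which is the pattern an inversion such as the one in region E would produce. A potato gene with several A. thaliana hits would break a block in this simple version, so real analyses typically use chaining methods that tolerate such cases.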
The non-syntenic regions correspond to the highly divergent regions between R1 and r1 and included all but one (ORF 40) transposon sequences, all F-box-containing genes and six of the ten resistance-gene-homologues. The annotation of the A. thaliana syntenic regions identified, besides the sequence related ORFs, some transposon sequences but only one F-box-containing gene and no resistance gene homolog. Moreover, the non-syntenic regions in the R1- and r1-contigs coincided with regions II and IV in S. demissum, which showed the highest divergence between the homeologous chromosome segments A, B and C. This suggests that the genome of potato and related species in the sequenced region consists of a patchwork of faster and more slowly evolving segments.
The sequenced potato genomic segment covers a genetic distance of only 0.1 centimorgan. At a hundred times larger scale, when genome-wide genetic maps of potato, sunflower, sugar beet and Prunus were compared to the A. thaliana physical map (macrosynteny), syntenic blocks from 1 to 20 centimorgans were identified. A common fraction of the genomes of these distantly related plant species appears to have been conserved throughout the evolution of the dicots, when compared to the rest of the genome. The GP21–GP179 interval was not part of a macrosyntenic block between potato and A. thaliana.
Two contiguous sequences of 417,445 and 202,781 base pairs were assembled and annotated for a region on potato chromosome V, which contains genes controlling several agronomic traits. Comparative sequence analysis revealed highly conserved collinear regions that flank regions showing high variability and tandem duplicated genes. The co-linearity between the homologous chromosomes was disrupted by non-allelic insertions of retrotransposon elements, stretches of diverged intergenic sequences, differences in gene content and gene order. The latter was mainly caused by inversion of a 70 kbp genomic fragment.
Annotation of the genomic sequence identified 48 putative open reading frames (ORF) in one contig and 22 in the other, with an average of one ORF every 9 kbp. The majority of the ORFs were members of multiple gene families. Ten ORFs were classified as resistance-gene-like, 11 as F-box-containing genes, 13 as transposable elements and three as transcription factors. Comparing potato to Arabidopsis thaliana annotated proteins revealed five micro-syntenic blocks of three to seven ORFs with A. thaliana chromosomes 1, 3 and 5, suggesting fragmented structural conservation between these distantly related plant species.
For contig construction, two BAC genomic libraries were used, each consisting of ca. 100,000 clones (264 microtiter plates of 384 wells each). Both libraries were generated from high molecular weight DNA of the diploid, heterozygous potato clone P6/210, an F1 hybrid of the parental clones P41 (H79.1506/1) and P40 (H80.696/4). The 'BA' library has been described. The library 'BC' was constructed in the cloning vector pBeloBAC11 from partially EcoRI digested genomic DNA. The procedures for the construction of recombinant BAC clones, clone picking and storing were as described previously [22, 51]. The average insertion size of the 'BC' library was 80 kbp, corresponding to an average 8-fold coverage of the potato genome.
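The relation between clone number, insert size and genome coverage can be written out explicitly. The sketch below assumes a genome size of 1000 Mb, the upper end of the 800 to 1000 Mb range quoted in the introduction; the function names are hypothetical. It also evaluates the standard Clarke and Carbon estimate, P = 1 - (1 - I/G)^N, for the probability that a given locus is present in the library, which is not part of the original text.

```python
def library_coverage(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Genome equivalents contained in a library."""
    return n_clones * insert_bp / genome_bp


def locus_probability(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Clarke-Carbon probability that a given locus is in at least one clone."""
    return 1.0 - (1.0 - insert_bp / genome_bp) ** n_clones


n_clones = 264 * 384          # plates x wells, as stated in the text
insert_bp = 80_000            # average insert size of the 'BC' library
genome_bp = 1_000_000_000     # assumption: 1000 Mb genome

coverage = library_coverage(n_clones, insert_bp, genome_bp)
p_locus = locus_probability(n_clones, insert_bp, genome_bp)
print(f"{n_clones} clones, coverage {coverage:.2f}x, P(locus) = {p_locus:.4f}")
```

With a genome size of 800 Mb instead, the same library would correspond to roughly 10 genome equivalents, so the 8-fold figure in the text implies a genome size near the upper end of the quoted range.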
BAC plasmid DNA isolation
A single colony was pre-cultured in 250 μl LB medium including 12.5 mg/l tetracycline for clones from the "BA" library and 12.5 mg/l chloramphenicol for clones from the "BC" library. The pre-culture was used to inoculate 50 ml LB medium containing the corresponding antibiotic. Plasmid DNA was isolated from 50 ml overnight culture using the QIAGEN Plasmid Midi Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions.
High-density colony filters of the libraries were prepared and screened by colony-hybridization as described. The 'BC' library was also screened by the polymerase chain reaction (PCR) after isolating plasmid DNA from bacterial cells pooled at three levels: mini-pools, maxi-pools and super-pools. The 1056 mini-pools were made from 96 clones each, the 264 maxi-pools were prepared from 384 clones each (four mini-pools) and the 88 super-pools consisted of 1152 clones each (three maxi-pools). To identify a single positive clone, four rounds of PCR screening were performed. First, the DNA of the 88 super-pools was used as template. Second, the three maxi-pools constituting a positive super-pool were screened. Third, the four mini-pools of a positive maxi-pool were amplified and last, the 96 clones of the positive mini-pool were screened individually.
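The pooling scheme above can be summarized numerically. The sketch below uses the pool sizes given in the text to compute the total number of clones and the number of PCR reactions needed to trace one positive clone through the four rounds. The mapping of a clone index to its pools assumes that clones were assigned to pools consecutively, which the text does not state; `pool_address` is a hypothetical helper.

```python
CLONES_PER_MINI = 96   # clones per mini-pool
MINI_PER_MAXI = 4      # mini-pools per maxi-pool
MAXI_PER_SUPER = 3     # maxi-pools per super-pool
N_SUPER = 88           # number of super-pools


def pool_address(clone_index: int) -> tuple:
    """Map a 0-based clone index to (super-pool, maxi-pool within super-pool,
    mini-pool within maxi-pool, position within mini-pool).

    Assumption: clones are assigned to pools in consecutive order.
    """
    mini, position = divmod(clone_index, CLONES_PER_MINI)
    maxi, mini_in_maxi = divmod(mini, MINI_PER_MAXI)
    super_pool, maxi_in_super = divmod(maxi, MAXI_PER_SUPER)
    return super_pool, maxi_in_super, mini_in_maxi, position


def reactions_for_one_positive() -> int:
    """PCR reactions needed to trace a single positive clone through all rounds."""
    return N_SUPER + MAXI_PER_SUPER + MINI_PER_MAXI + CLONES_PER_MINI


total_clones = N_SUPER * MAXI_PER_SUPER * MINI_PER_MAXI * CLONES_PER_MINI
print(f"clones in library: {total_clones}")
print(f"reactions per positive clone: {reactions_for_one_positive()}")
print(f"pool address of the last clone: {pool_address(total_clones - 1)}")
```

Under these numbers, a single positive clone is found with 191 reactions instead of testing each of the roughly 100,000 clones individually, which is the practical motivation for hierarchical pooling.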
Overlapping BACs were identified and ordered in two contigs corresponding to the two homologous chromosomes of genotype P6/210 as described previously. In short, BAC insertion ends were sequenced using T3 and T7 oligonucleotides as sequencing primers. The end sequences were used to detect overlaps, either based on 100% sequence identity with already sequenced BACs or by generating amplicons with identical sequences in different BACs. BAC contigs were assigned to either one or the other homologous chromosome by identifying DNA polymorphisms in insertion end sequences that were specific either for the allele inherited from parent P41 or parent P40.
Genomic DNA Sequencing and Assembly
Whole BAC clones were sequenced by the shotgun sequencing strategy. Custom sub-libraries of the BACs were prepared by GATC-Biotech AG (Konstanz, Germany). After physical fractionation of the BAC DNA, the randomly sheared fragments were blunt-ended using T4 DNA polymerase and then ligated into the pCR4Blunt-TOPO vector (Invitrogen, California, USA). Approximately 1300 clones containing ca. 1.5 kbp inserts and 300 to 400 clones with 4 to 5 kbp inserts were produced for each BAC. The smaller inserts were amplified by colony PCR using the primers TO2f (5'-agcggataacaatttcacacagga-3') and TO2r (5'-gacgttgtaaaacgacggccagtg-3'). The PCR was performed in a volume of 100 μl containing 10 pmol of each primer, 0.2 mM dNTPs, 1.5 mM MgCl2, 0.2 units of Taq polymerase and the corresponding buffer (Invitrogen, California, USA). PCR conditions were: initial denaturation at 96°C for 5 min, followed by 34 cycles of denaturation at 94°C for 50 sec, annealing at 55°C for 50 sec and extension at 72°C for 3 min, and a final extension at 72°C for 4 min. Plasmid DNA was purified from the clones carrying the larger (4 to 5 kbp) inserts using the BioRobot 9600 (Qiagen, Hilden, Germany). PCR products and plasmids were sequenced using T3 and T7 primers (Amersham, Pharmacia GE Healthcare Bio-Sciences: Little Chalfont, UK). Sequencing reactions were performed using the ABI PRISM BigDye Terminator Cycle Sequencing Ready Reaction Kit and an ABI377 automated DNA sequencer (PE Biosystems, Foster City, California, USA). The approximately ten-fold redundant sequences were assembled using PreGAP4 and GAP4 from the Staden software package (Medical Research Council Laboratory of Molecular Biology, Cambridge, UK). The Lasergene software package (DNAstar, Madison, WI, USA) was used for sequence assemblies, comparisons and alignments.
The sequences of overlapping BAC inserts were assembled, after trimming any vector sequences, using the SeqMan module of Lasergene and the Megamerger program, which is part of the EMBOSS package.
Dotter and MUMmer were used to align and compare genomic sequences. Putative exons and open reading frames (ORFs) were predicted by the programs GeneMark.hmm and FGeneSH and by alignment of EST (Expressed Sequence Tag) and protein sequences using GenomeThreader. Genes were annotated by combining predicted ORFs from these gene finder programs with alignments of homologous sequences in public databases using the Apollo Genome Annotation Curation Tool (Additional files 1 and 2). All genes were also manually annotated for putative function. Functional descriptions of homologous genes in the SWISSPROT database were compared with homologous protein domains and patterns in the InterPro database. Homologous genes in the SWISSPROT database were identified using BLASTP, and homologous protein domains and patterns were identified by InterProScan. Inparanoid and BLASTX were used for the identification of similar genes in A. thaliana. The deduced amino acid sequences of the annotated ORFs were compared to A. thaliana annotated proteins from The Arabidopsis Information Resource (TAIR) Release 6. The threshold criterion for accepting sequence similarity as significant was an E-value < 10^-10 for BLASTP searches. Polyproteins and transposable elements were excluded from the comparison because of their limited information value. For the same reason, ORFs with more than 30 hits in the A. thaliana genome were also excluded. A two-dimensional array was generated with the potato physical map of 420,000 bp in one dimension and the A. thaliana physical map of 121 Mb in the other. Hits of the putative potato proteins with A. thaliana annotated proteins were positioned in the array according to their base pair coordinates on the local potato and genome-wide A. thaliana physical maps. Syntenic blocks were identified based on the criterion that at least three different ORFs within the potato contigs found hits within an A. thaliana genome fragment of similar size.
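The hit filtering and block criterion described above can be illustrated with a short sketch. It is not the authors' implementation: the `Hit` record, the function names and the sliding-window search are assumptions made for illustration, and `window_bp` stands in for the "genome fragment of similar size", whose exact value is not given in the text. The sketch also assumes that the 30-hit limit is applied after the E-value filter and that all potato ORFs lie within one contig, so only the A. thaliana coordinate is windowed.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One BLASTP hit of a potato ORF against an A. thaliana protein (hypothetical record)."""
    orf: str          # potato ORF identifier
    potato_pos: int   # bp coordinate on the potato contig
    at_chrom: int     # A. thaliana chromosome
    at_pos: int       # bp coordinate on that chromosome
    evalue: float     # BLASTP E-value


def filter_hits(hits, max_evalue=1e-10, max_hits_per_orf=30):
    """Keep significant hits and drop ORFs with more than 30 significant hits."""
    significant = [h for h in hits if h.evalue < max_evalue]
    counts = Counter(h.orf for h in significant)
    return [h for h in significant if counts[h.orf] <= max_hits_per_orf]


def syntenic_blocks(hits, window_bp, min_orfs=3):
    """Report windows on A. thaliana chromosomes hit by >= min_orfs different ORFs.

    Returns tuples (chromosome, start_bp, end_bp, sorted ORF names).
    """
    by_chrom = defaultdict(list)
    for h in hits:
        by_chrom[h.at_chrom].append(h)

    blocks = []
    for chrom, chrom_hits in sorted(by_chrom.items()):
        chrom_hits.sort(key=lambda h: h.at_pos)
        i = 0
        while i < len(chrom_hits):
            # Extend the window while hits stay within window_bp of hit i.
            j = i
            while (j < len(chrom_hits)
                   and chrom_hits[j].at_pos - chrom_hits[i].at_pos <= window_bp):
                j += 1
            window = chrom_hits[i:j]
            orfs = {h.orf for h in window}
            if len(orfs) >= min_orfs:
                blocks.append((chrom, window[0].at_pos, window[-1].at_pos, sorted(orfs)))
                i = j          # continue after the reported block
            else:
                i += 1
    return blocks
```

A call such as `syntenic_blocks(filter_hits(hits), window_bp=100_000)` would list candidate blocks; the window size of 100 kbp here is an arbitrary example value.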
List of abbreviations
QTL: quantitative trait loci
QRL: quantitative resistance loci
ORF: open reading frame
EST: Expressed Sequence Tags
BAC: bacterial artificial chromosome
PCR: polymerase chain reaction
kbp: kilo base pairs
This research was funded under the GABI program (Genome analysis in the biological system of plants) by BMBF (Bundesministerium für Bildung und Forschung), project no 0312290 CONQUEST (Genes CONtrolling QUantitativE traits of Solanum Tuberosum). Part of this work was carried out in the department of plant breeding research and yield physiology, headed by Francesco Salamini, and in the department of plant breeding research and genetics, headed by Maarten Koornneef.
- Gebhardt C: Potato Genetics: Molecular Maps and More in Biotechnology in Agriculture and Forestry. Molecular Marker Systems. Edited by: Lörz H, Wenzel G. 2004, Springer-Verlag, Berlin, Heidelberg, 55: 215-227.
- Bonierbale M, Plaisted RL, Tanksley SD: RFLP maps based on a common set of clones reveal modes of chromosomal evolution in potato and tomato. Genetics. 1988, 120: 1095-1103.
- Gebhardt C, Ritter E, Debener T, Schachtschabel U, Walkemeier B, Uhrig U, Salamini F: RFLP analysis and linkage mapping in Solanum tuberosum. Theor Appl Genet. 1989, 78: 65-75. 10.1007/BF00299755.
- Gebhardt C, Walkemeier B, Henselewski H, Barakat A, Delseny M, Stüber K: Comparative mapping between potato (Solanum tuberosum) and A. thaliana reveals structurally conserved domains and ancient duplications in the potato genome. The Plant J. 2003, 34: 529-541. 10.1046/j.1365-313X.2003.01747.x.
- Tanksley SD, Ganal MW, Prince JP, de Vicente MC, Bonierbale MW, Broun P, Fulton TM, Giovannoni JJ, Grandillo S, Martin GB, Messeguer R, Miller JC, Miller L, Paterson AH, Pineda O, Roeder MS, Wing RA, Wu W, Young ND: High density molecular linkage maps of the tomato and potato genomes. Genetics. 1992, 132: 1141-1160.
- Gebhardt C, Valkonen JPT: Organization of genes controlling disease resistance in the potato genome. Annu Rev Phytopathol. 2001, 39: 79-102. 10.1146/annurev.phyto.39.1.79.
- Meksem K, Leister D, Peleman J, Zabeau M, Salamini F, Gebhardt C: A high-resolution map of the vicinity of the R1 locus on chromosome V of potato based on RFLP and AFLP markers. Mol Gen Genet. 1995, 249: 74-81. 10.1007/BF00290238.
- De Jong W, Forsyth A, Leister D, Gebhardt C, Baulcombe DC: A Potato hypersensitive resistance gene against potato virus X maps to a resistance gene cluster on chromosome V. Theor Appl Genet. 1997, 5: 153-162.
- Ritter E, Debener T, Barone A, Salamini F, Gebhardt C: RFLP mapping on potato chromosomes of two genes controlling extreme resistance to potato virus X (PVX). Mol Gen Genet. 1991, 227: 81-85. 10.1007/BF00260710.
- Leonards-Schippers C, Gieffers W, Gebhardt C, Salamini F: The R1 gene conferring race-specific resistance to Phytophthora infestans in potato is located on potato chromosome V. Mol Gen Genet. 1992, 233: 278-283. 10.1007/BF00587589.
- Leonards-Schippers C, Gieffers W, Schäfer-Pregl R, Ritter E, Knapp SJ, Salamini F, Gebhardt C: Quantitative resistance to Phytophthora infestans in potato: a case study for QTL mapping in an allogamous plant species. Genetics. 1994, 137: 67-77.
- Oberhagemann P, Chatot-Balandras C, Bonnel E, Schäfer-Pregl R, Wegener D, Palomino C, Salamini F, Gebhardt C: A genetic analysis of quantitative resistance to late blight in potato: Towards marker assisted selection. Mol Breed. 1999, 5: 399-415. 10.1023/A:1009623212180.
- Collins A, Milbourne D, Ramsay L, Meyer R, Chatot-Balandras C, Oberhagemann P, De Jong W, Gebhardt C, Bonnel E, Waugh R: QTL for field resistance to late blight in potato are strongly correlated with earliness and vigour. Mol Breed. 1999, 5: 387-398. 10.1023/A:1009601427062.
- Visker MHPW, Keizer LCP, Van Eck HJ, Jacobsen ELT, Colon LT, Struik PC: Can the QTL for late blight resistance on potato chromosome 5 be attributed to foliage maturity type?. Theor Appl Genet. 2003, 106: 317-325.
- Kreike CM, De Koning JRA, Vinke JH, Van Ooijen JW, Stiekema WJ: Quantitatively-inherited resistance to Globodera pallida is dominated by one major locus in Solanum spegazzinii. Theor Appl Genet. 1994, 88: 764-769. 10.1007/BF01253983.
- Rouppe van der Voort J, Wolters P, Folkertsma R, Hutten R, van Zandvoort P, Vinke H, Kanyuka K, Bendahmane A, Jacobsen E, Janssen R, Bakker J: Mapping of the cyst nematode resistance locus Gpa2 in potato using a strategy based on co-migrating AFLP markers. Theor Appl Genet. 1997, 95: 874-880. 10.1007/s001220050638.
- Rouppe van der Voort J, van der Vossen E, Bakker E, Overmars H, van Zandvoort P, Hutten R, Klein-Lankhorst R, Bakker J: Two additive QTLs conferring broad-spectrum resistance in potato to Globodera pallida are localized on resistance gene clusters. Theor Appl Genet. 2000, 101: 1122-30. 10.1007/s001220051588.
- Sattarzadeh A, Achenbach U, Lübeck J, Strahwald J, Tacke E, Hofferbert HR, Rotstein T, Gebhardt C: Single nucleotide polymorphism (SNP) genotyping as basis for developing a PCR-based marker highly diagnostic for potato varieties with high resistance to Globodera pallida pathotype Pa2/3. Mol Breed. 2006, 4: 301-312. 10.1007/s11032-006-9026-1.
- Schäfer-Pregl R, Ritter E, Concilio L, Hesselbach J, Lovatti L, Walkemeier B, Thelen H, Salamini F, Gebhardt C: Analysis of quantitative trait loci (QTL) and quantitative trait alleles (QTA) for potato tuber yield and starch content. Theor Appl Genet. 1998, 97: 834-846. 10.1007/s001220050963.
- Menendez CM, Ritter E, Schäfer-Pregl R, Walkemeier B, Kalde A, Salamini F, Gebhardt C: Cold-sweetening in diploid potato. Mapping QTL and candidate genes. Genetics. 2002, 162: 1423-1434.
- Bendahmane A, Querci M, Kanyuka K, Baulcombe DC: Agrobacterium transient expression system as a tool for the isolation of disease resistance genes: application to the Rx2 locus in potato. Plant J. 2000, 21: 73-81. 10.1046/j.1365-313x.2000.00654.x.
- Ballvora A, Ercolano MR, Weiss J, Meksem K, Bormann CA, Oberhagemann P, Salamini F, Gebhardt C: The R1 gene for potato resistance to late blight (Phytophthora infestans) belongs to the leucine zipper/NBS/LRR class of plant resistance genes. The Plant J. 2002, 30: 361-371. 10.1046/j.1365-313X.2001.01292.x.
- Chisholm ST, Coaker G, Day B, Staskawicz BJ: Host-Microbe Interactions: Shaping the Evolution of the Plant Immune Response. Cell. 2006, 124: 803-814. 10.1016/j.cell.2006.02.008.
- Ross H: Potato breeding. Problems and perspectives. Adv Plant Breed. 1986, Paul Parey Verlag, Berlin and Heidelberg, Supplement 13
- Kuang H, Wei F, Marano MR, Wirtz U, Wang X, Liu J, Shum WP, Zaborsky J, Tallon LJ, Rensink W, Lobst S, Zhang P, Tornqvist CE, Tek A, Bamberg J, Helgeson J, Fry W, You F, Luo MC, Jiang J, Robin Buell C, Baker B: The R1 resistance gene cluster contains three groups of independently evolving, type I R1 homologues and shows substantial structural variation among haplotypes of Solanum demissum. The Plant J. 2005, 44: 37-51. 10.1111/j.1365-313X.2005.02506.x.
- Pryor T, Ellis J: The genetic complexity of fungal resistance genes in plants. Adv Plant Pathol. 1993, 10: 281-305.
- Gebhardt C, Ballvora A, Walkemeier B, Oberhagemann P, Schüler K: Assessing genetic potential in germ plasm collections of crop plants by marker-trait association: a case study for potatoes with quantitative variation of resistance to late blight and maturity type. Mol Breed. 2004, 13: 93-102. 10.1023/B:MOLB.0000012878.89855.df.
- Salvi S, Tuberosa R: To clone or not to clone plant QTLs: present and future challenges. Trends Plant Sci. 2005, 10: 297-304. 10.1016/j.tplants.2005.04.008.
- Stein L: Genome annotation: From sequence to biology. Nature Reviews Genetics. 2001, 2: 493-503. 10.1038/35080529.
- Fu H, Dooner HK: Intraspecific violation of genetic colinearity and its implications in maize. Proc Natl Acad Sci. 2002, 99: 9573-9578.
- Brunner S, Fengler K, Morgante M, Tingey S, Rafalski A: Evolution of DNA Sequence Nonhomologies among Maize Inbreds. Plant Cell. 2005, 17: 343-360. 10.1105/tpc.104.025627.
- Seah S, Yaghoobi J, Rossi M, Gleason CA, Williamson VM: The nematode gene, Mi-1, is associated with an inverted chromosomal segment in susceptible compared to resistant tomato. Theor Appl Genet. 2004, 108: 1635-1642. 10.1007/s00122-004-1594-z.
- Barone A, Ritter E, Schachtschabel U, Debener T, Salamini F, Gebhardt C: Localization by restriction fragment length polymorphism mapping in potato of a major dominant gene conferring resistance to the potato cyst nematode Globodera rostochiensis. Mol Gen Genet. 1990, 224: 177-182. 10.1007/BF00271550.
- Ganal MW, Tanksley SD: Recombination around Tm2a and Mi resistance genes in different crosses of Lycopersicon peruvianum. Theor Appl Genet. 1996, 92: 101-108. 10.1007/BF00222958.
- Michelmore RW, Meyers BC: Clusters of resistance genes in plants evolve by divergent selection and a birth-and-death process. Genome Res. 1998, 8: 9828-9832.
- Mao L, Begum D, Goff SA, Wing RA: Sequence and Analysis of the Tomato JOINTLESS Locus. Plant Physiol. 2001, 126 (3): 1331-1340. 10.1104/pp.126.3.1331.
- Rossberg M, Theres K, Acarkan A, Herrero R, Schmitt T, Schumacher K, Schmitz G, Schmidt R: Comparative sequence analysis reveals extensive microcolinearity in the Lateral Suppressor regions of the tomato, A. thaliana, and Capsella genomes. The Plant Cell. 2001, 13: 979-988. 10.2307/3871354.
- The A. thaliana Genome Initiative: Analysis of the genome sequence of the flowering plant A. thaliana. Nature. 2000, 408: 796-815. 10.1038/35048692.
- The Rice Chromosome 3 Sequencing Consortium: Sequence, annotation, and analysis of synteny between rice chromosome 3 and diverged species. Genome. 2006, 15: 1284-1291.
- Panstruga R, Büschges R, Piffanelli P, Schulze-Lefert P: A contiguous 60 kb genomic stretch from barley reveals molecular evidence for gene islands in a monocot genome. Nuc Acids Res. 1998, 26: 1056-1062. 10.1093/nar/26.4.1056.
- Hulbert SH, Webb CA, Smith SM, Sun Q: Resistance gene complexes: evolution and utilization. Annu Rev Phytopathol. 2001, 39: 285-312. 10.1146/annurev.phyto.39.1.285.
- Kepinski S, Leyser O: The A. thaliana F-box protein TIR1 is an auxin receptor. Nature. 2005, 435: 446-451. 10.1038/nature03542.
- Shirasu K, Schulze-Lefert P: Regulators of cell death in disease resistance. Plant Mol Biol. 2000, 44: 371-385. 10.1023/A:1026552827716.
- Schulze-Lefert P: Plant Immunity: The origami of receptor activation. Curr Biol. 2004, 14: R22-R24.
- Waterhouse PM, Helliwell CA: Exploring plant genomes by RNA-induced gene silencing. Nat Rev Genet. 2003, 4: 29-38. 10.1038/nrg982.
- Ware D, Stein L: Comparison of genes among cereals. Curr Op Plant Biol. 2003, 6: 121-127. 10.1016/S1369-5266(03)00012-8.
- Bennetzen JL, Ma J: The genetic colinearity of rice and other cereals on the basis of genomic sequence analysis. Curr Op Plant Biol. 2003, 6: 128-133. 10.1016/S1369-5266(03)00015-3.
- Ku H.-M, Vision T, Liu J, Tanksley SD: Comparing sequenced segments of the tomato and A. thaliana genomes: Large-scale duplication followed by selective gene loss creates a network of synteny. Proc Natl Acad Sci. 2000, 97: 9121-9126. 10.1073/pnas.160271297.
- Dominguez I, Graziano E, Gebhardt C, Barakat A, Berry S, Arús P, Delseny M, Barnes S: Plant genome archaeology: evidence for conserved ancestral chromosome segments in dicotyledonous plant species. Plant Biotech J. 2003, 1: 91-99. 10.1046/j.1467-7652.2003.00009.x.
- Kim UJ, Birren BW, Slepak T, Mancino V, Boyson C, Kang HL, Simon MI, Shizuya H: Construction and characterization of a human bacterial artificial chromosome library. Genomics. 1996, 34: 213-218. 10.1006/geno.1996.0268.
- Meksem K, Zobrist K, Ruben E, Hyten D, Quanzhou T, Zhang H-B, Lightfoot DA: Two large-insert soybean genomic libraries constructed in a binary vector: applications in chromosome walking and genome wide physical mapping. Theor Appl Genet. 2000, 101: 747-755. 10.1007/s001220051540.
- Rice P, Longden I, Bleasby A: EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 2000, 16 (6): 276-7. 10.1016/S0168-9525(00)02024-2.
- Sonnhammer EL, Durbin R: A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis. Gene. 1995, 167: GC1-GC10. 10.1016/0378-1119(95)00714-8.
- Lukashin AV, Borodovsky M: GeneMark.hmm: new solutions for gene finding. Nuc Acids Res. 1998, 26: 1107-1115. 10.1093/nar/26.4.1107.
- Salamov AA, Solovyev VV: Ab initio Gene Finding in Drosophila Genomic DNA. Genome Res. 2000, 10: 516-522. 10.1101/gr.10.4.516.
- Gremme G, Brendel V, Sparks ME, Kurtz S: Engineering a software tool for gene structure prediction in higher organisms. Information and Software Technology. 2005, 47: 965-978. 10.1016/j.infsof.2005.09.005.
- Lewis SE, Searle SMJ, Harris N, Gibson M, Iyer V, Ricter J, Wiel C, Bayraktaroglu L, Birney E, Crosby MA, Kaminker JS, Matthews B, Prochnik SE, Smith CD, Tupy JL, Rubin GM, Misra S, Mungall CJ, Clamp ME: Apollo: a sequence annotation editor. Genome Biol. 2002, 3: research0082-10.1186/gb-2002-3-12-research0082.
- Bairoch A, Apweiler R: The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nuc Acids Res. 2000, 28: 45-48. 10.1093/nar/28.1.45.
- Apweiler R, Attwood TK, Bairoch A, Bateman A, Birney E, Biswas M, Bucher P, Cerutti L, Corpet F, Croning MDR, Durbin R, Falquet L, Fleishmann W, Gouzy J, Hermjakob H, Hulo N, Jonassen L, Kahn D, Kanapin A, Karavidopoulou Y, Lopez R, Marx B, Mulder NJ, Oinn TM, Pagni M, Servant P, Sigrist CJA, Zdobnov EM: The InterPro database, an integrated documentation resource for protein families, domains and functional sites. Bioinformatics. 2000, 16: 1145-1150. 10.1093/bioinformatics/16.12.1145.
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nuc Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.
- Zdobnov EM, Apweiler R: InterProScan – an integration platform for the signature-recognition methods in InterPro. Bioinformatics. 2001, 17: 847-848. 10.1093/bioinformatics/17.9.847.
- Remm M, Storm CEV, Sonnhammer ELL: Automatic Clustering of Orthologs and In-paralogs from Pairwise Species Comparisons. JMB. 2001, 14: 1041-1052. 10.1006/jmbi.2000.5197.
- GABI Primary Database. [https://gabi.rzpd.de/projects/Pomamo/]
- The Arabidopsis Information Resource. [http://www.arabidopsis.org]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.