- Research article
- Open Access
A general pipeline for the development of anchor markers for comparative genomics in plants
© Fredslund et al; licensee BioMed Central Ltd. 2006
- Received: 11 March 2006
- Accepted: 14 August 2006
- Published: 14 August 2006
Complete or near-complete genomic sequence information is presently only available for a few plant species representing a large phylogenetic diversity among plants. In order to effectively transfer this information to species lacking sequence information, comparative genomic tools need to be developed. Molecular markers permitting cross-species mapping along co-linear genomic regions are central to comparative genomics. These "anchor" markers, defining unique loci in genetic linkage maps of multiple species, are gene-based and possess a number of features that make them relatively sparse. To identify potential anchor marker sequences more efficiently, we have established an automated bioinformatic pipeline that combines multi-species Expressed Sequence Tags (EST) and genome sequence data.
Taking advantage of sequence data from related species, the pipeline identifies evolutionarily conserved sequences that are likely to define unique orthologous loci in most species of the same phylogenetic clade. The key features are the identification of evolutionarily conserved sequences followed by automated design of intron-flanking Polymerase Chain Reaction (PCR) primer pairs. Polymorphisms can subsequently be identified by size- or sequence variation of PCR products, amplified from mapping parents or populations. We illustrate our procedure in legumes and grasses and exemplify its application in legumes, where model plant studies and the genome- and EST-sequence data available have a potential impact on the breeding of crop species and on our understanding of the evolution of this large and diverse family.
We provide a database of 459 candidate anchor loci which have the potential to serve as map anchors in more than 18,000 legume species, a number of which are of agricultural importance. For grasses, the database contains 1335 candidate anchor loci. Based on this database, we have evaluated 76 candidate anchor loci with respect to marker development in legume species with no sequence information available, demonstrating the validity of this approach.
- Intron Position
- Anchor Marker
- Legume Family
- Legume Genome
The ancestors of crops were originally domesticated and bred because they exhibited specific extraordinary features that made them useful for human consumption, animal food, or ornamental display. Understanding the genetics of these man-selected features involves the identification of underlying qualitative and quantitative trait loci (QTLs) through genetic mapping.
Mapping efficiency and ultimately the ability to isolate interesting loci can be improved by facilitating information transfer from genetic model species to crops and vice versa. Synteny, the conservation of gene order along the chromosomes of related species, is a genomic property that can be exploited for this purpose (reviewed in [1–3]). Shared genetic loci can be used as landmarks for the alignment of linkage groups, thereby defining large chromosomal blocks (macrosynteny). The alignment of sequences from related species often reveals the presence of short regions of sequence conservation (microsynteny). Comparing genomic sequences of genetic models such as Arabidopsis  and rice [5, 6], with large collections of ESTs from related plants, enables the identification of shared loci instrumental in projecting the large and repetitive genomes of many crop species onto the genomes of the model species. However, comparisons between genetic maps of distantly related species are usually difficult and less productive. Genomes often undergo chromosomal rearrangements, such as inversions, translocations, duplications, deletions and cycles of polyploidization followed by diploidization . Plants, given their sexual promiscuity and potential for vegetative reproduction, are particularly prone to genome rearrangements . For example, whole genome duplications have occurred at several occasions during the evolution of modern plant species . In the diploid phase, members of a duplicated gene pair are retained or deleted at random in the two duplicated regions, obscuring their common past. This process results in diminished congruency between two genomes that are separated by a polyploidization-diploidization cycle. Hence, in order to avoid the pitfalls of comparative genome mapping, the species to be compared should be carefully chosen.
A central step in genome comparisons is the identification of sequences that can readily be identified in genomes of the species to be compared and serve as "anchors" of their respective genetic maps. Commonly used markers, such as microsatellite or AFLP markers, can give high resolution genetic maps, but are of little comparative value since they are not necessarily conserved across species boundaries. Anchor sequences should be chosen to maximize the potential to serve as markers in several species, and allow the quick estimation of congruency between genetic maps of the organisms. First generation comparative maps of plants relied on Southern hybridization of highly conserved homologous probes and their scoring as RFLP markers [10, 11]. Markers requiring Southern hybridization, although informative, are time consuming and labor intensive. Furthermore, it can be difficult to generate specific hybridization markers, due to cross-hybridization to other genomic regions. PCR based markers are much more efficient as they are amenable to high throughput automation and, if well designed, of high specificity. Towards this goal we employ a genome-wide strategy based on the identification of low-copy evolutionarily conserved sequences within transcribed sequences of representative species. These regions are used as primer annealing sites for PCR amplification of intercalated introns that are subsequently sequenced in order to ascertain polymorphisms between mapping parents. This approach ensures that non-repetitive, transcribed regions of the genome are the primary targets of mapping efforts.
Here we present an automated pipeline for generation of C omparative A nchor T agged S equence markers (CATS ) and apply it to design a set of legume anchor markers.
Identification of legume CATS
Arabidopsis homolog count for the collections of gene indices.
Lotus japonicus (v. 3.0)
Medicago truncatula (v. 8.0)
Glycine max (v. 12.0)
Phaseolus vulgaris (v. 1.0)
Total number of gene indices
One Arabidopsis homolog
Two Arabidopsis homologs
Total (one and two Arabidopsis homologs)
In the next step, the selected EST sequences are aligned with corresponding genomic sequences in order to score intron positions and lengths. This ensures maximal chances of detecting polymorphisms between mapping parents at later steps, since intron sequences are under relaxed evolutionary constraints. Intron position and approximate length, however, are strongly conserved features, even over long evolutionary distances . Here, we make use of the L. japonicus and M. truncatula genome sequences , but the general idea could easily be extended to the Arabidopsis or poplar genome sequences , given the conserved nature of intron positions. We also score the length of introns. This parameter is of interest for two reasons: (i) short introns are less likely to be polymorphic than longer ones, and (ii) using standard polymerases the amplicon size is limited to a maximum of ~3 kb. Aligning L. japonicus EST sequences to the L. japonicus genome identified 450 sequences with introns of appropriate length. Alignment of M. truncatula ESTs to the M. truncatula genome identified 1,391 sequences with introns of appropriate length.
Next, homologous ESTs from as many as possible of the four legumes with the same best Arabidopsis protein match are aligned. One of the included ESTs must originate from L. japonicus or M. truncatula and alignment to its gene must reveal the presence of at least one intron of appropriate length.
The numbers of identified CATS and their anchoring in current L. japonicus and M. truncatula maps
Number of CATS
Number of map- anchored CATS
Synteny between to genomic regions can only be established if markers are true orthologs. Orthology between genes from different species is indicated by the matching of gene trees and species tree. To test this, we constructed a Neighbor-joining tree for each alignment. Of a total of 459 gene trees, 57 (12 %) topologies were incompatible with the species tree, potentially due to sequences of paralogous origin in the alignment. These CATS should be treated with care.
The sequence alignments, phylogenetic trees, known map positions, primer reports and gene annotations of all CATS are web-accessible .
Potential for marker development in the grass family
Testing legume family anchor markers on the outlier peanut
Peanut (Arachis hypogea) represents a phylogenetic outgroup to the clade from which the legume sequences were sampled. Hence, it provides a challenging opportunity to assess the potential of our pipeline to generate pan-legume family anchor markers. Recently a microsatellite marker-based genetic map of the AA component of the allotetraploid genome of peanut, has been developed . This map was developed using a mapping population derived from a cross between A. duranensis, the most probably AA genome donor of peanut, and the closely related A. stenosperma. Anchoring this map to the genome sequence of L. japonicus and M. truncatula will be a central next step in the development of Arachis genetics. Towards this goal, we tested 74 legume family CATS primer pairs which were designed based on the degenerate consensus sequence of the multiple alignment of two to four ESTs included in the legume CATS pipeline. Testing was performed on the in-clade representative common bean and the outlier Arachis to evaluate the robustness and general applicability of the pipeline. We determined whether these primer sets amplified the correct sequences, and scored polymorphisms between mapping parents of central mapping populations for both species (see Experimental procedures). Of the 76 tested primer sets, 43 and 56 (or 58 and 75 %) amplified the expected product for Arachis and bean, respectively. Sequencing of the PCR products revealed that 79 % of the Arachis CATS were polymorphic between Arachis duranensis K7988 X A. stenosperma V10309. Among the bean CATS, 65 % were polymorphic between the mapping parents Bat93 X JaloEEP558.
The automated bioinformatic pipeline described here allows the large-scale generation of marker candidates useful for map construction and comparisons in legumes and grasses and, by extension, to any phylogenetic clade with appropriate comparative sequence information. Since only unique sequences are unambiguous as markers, the number of paralogous sequences in the target genome is of interest. An approximation to this number is obtained by counting homologous sequences in the proteome of a reference species. Since no legume genome has been completely sequenced to date, the pipeline relies on EST data and is not able to discern between orthologous and paralogous origin of homologous sequences. However, selecting sequences with only one or two homologs in the Arabidopsis reference proteome enhances the probability of an orthologous relationship between homologous legume ESTs. For 88% of the CATS candidate developed here, the gene tree follows the species tree, suggesting orthology. Although these indirect criteria maximize congruency when comparing maps, they by no means guarantee it. A common ancestor of the legumes has undergone a whole-genome duplication [9, 16–20], potentially obstructing congruency through differential gene loss in duplicated chromosomal regions. Whereas macrosynteny is not recognizable between M. truncatula and Arabidopsis , microsynteny is generally much better retained [15, 29, 30]. Within a given clade, such as legumes or grasses, both micro- and macrosyntenic relations are readily identifiable [8, 11, 15, 31].
A main application of this algorithm is the transfer of genome information between model species and closely related large genome crops. Dense genetic maps spanning all linkage groups are of invaluable help for breeding purposes. Comparison to genomes of other species, especially model species, can help in the development of new markers in regions of interest and may allow educated guesses at candidate genes in the region under investigation. It is therefore advantageous to use markers that can serve as syntenic anchors, connecting genetic maps of as many species as possible.
66 legume CATS were recently published . Here we extend this collection significantly by automating the bioinformatics tasks involved. Applying this pipeline to the currently available legume EST and genome data, we were able to identify 459 CATS. An interactive table of these CATS primers along with EST alignments, BAC sequences and L. japonicus and M. truncatula map positions is web-available .
Currently, the number of CATS to be identified in legumes is limited by the incomplete M. truncatula or L. japonicus genomic sequences. The sequencing of these genomes is still in progress and will hence allow the identification of many more CATS candidates in the future. Given the conserved nature of intron position and approximate length , the Arabidopsis genome sequence might be equally valuable for the identification of introns, a strategy which has been employed by Choi et al. [31, 32]. However, apart from increasing the number of CATS, such a measure does not allow the linking of genetic map information to sequence information in legumes.
It might be useful to include EST collections from other related species for example ESTs from mung bean (Vigna mungo) in the case of the legume family. For this purpose, we have developed a web-based service, GeMprospector [41, 42], that allows the user to submit other EST collections to the pipeline, both for legumes and grasses.
The genome sequence used for defining introns makes it possible to sample CATS with even dispersal across all linkage groups or enriched in chromosomal regions of interest. As expected, the success-rate of PCR amplification using degenerate primers decreases with phylogenetic distance of the test species from the species chosen to provide the sequence information for the pipeline. The ability to predict successful primer pairs at the phylogenetic distance between the outlier Arachis and the clade defined by M. truncatua, L. japonicus and soybean is remarkable and demonstrates the success of the pipeline.
We present a general framework for the automated design of CATS marker candidates through mining of large-scale sequence collections of diverse origin. The accompanying databases allow legume and grass researchers to get instant and easy access to unprecedented numbers of CATS. Apart from their use as comparative anchor markers, CATS will be useful in the estimation of population genetic parameters such as allele frequencies, effective population sizes and inbreeding coefficients in natural populations and outbreeding crops.
Biological sequence resources
The EST clusters used for this analysis were retrieved from The Institute of Genome Research (TIGR). We downloaded the gene indices (clustered EST collections [34, 35]) for the legumes Lotus japonicus (Release 3.0: 28,460 sequences), Medicago truncatula (Release 8.0: 36,976 sequences), Phaseolus vulgaris (Release 1.0: 9,484) and Glycine max (Release 12.0: 63,676 sequences) and the grasses Hordeum vulgare (Release 9.0: 50,453 sequences), and Sorghum bicolor, (Release 8.0: 39,148 sequences). The Arabidopsis and rice (Oryza sativa) genome, proteome and coding sequences were downloaded from the TIGR FTP site. The L. japonicus and M. truncatula genomic sequences were retrieved using NCBIs ENTREZ.
The Blast package  was obtained from the NCBI. For comparison of nucleotide sequences, we used the Megablast program with a wordsize of 20 and cutoff e-value of 10-40. For DNA – protein comparisons, we used the Blastx program (cutoff e-value 2 × 10-7). A series of Python scripts were written to parse the Blast outputs and assemble collections of homologous sequences. Multiple alignments were generated using ClustalW , and automated primer design was achieved through application of the PriFi program ).
DNA was extracted from Arachis and bean as described in  and , respectively. PCR products were sequenced using Applied Biosystems BigDye version 3.1 and run on an ABI PRISM 310 Genetic Analyzer.
The performance of the pipeline was validated experimentally by making use of the suggested primer pairs to amplify homologous sequences from DNA of mapping parents from two legume mapping populations. For bean, the parents of the Bat93 X JaloEEP558 population , and for Arachis the parents of the A. duranensis K7988 X A. stenosperma V10309 cross were used . PCR amplification conditions were 40 cycles of [94°C 30 seconds, 48°C 30 seconds, 72°C 2 minutes].
We thank Paul Gepts, University of California, Davis, for seeds of the Bat93 and JaloEEP558 bean lines. LS is supported by the Danish Research Council Grant SNF 21-01-0329. JF is supported by the Danish Agricultural and Veterinary Research Council. BKH is funded by the Council for Development Reasearch (RUF) under the Danish Ministry of Foreign Affairs Project number 91210. We also thank the European Union INCO Programme (ARAMAP, reference: ICA4-2001-10072), and the Generation Challenge Program for financial support. We thank the anonymous reviewers for comments on the manuscript.
- McCouch SR: Genomics and synteny. Plant Physiol. 2001, 125: 152-5. 10.1104/pp.125.1.152.PubMedPubMed CentralView ArticleGoogle Scholar
- Schmidt R: Synteny: recent advances and future prospects. Curr Opin Plant Biol. 2000, 3: 97-102. 10.1016/S1369-5266(99)00048-5.PubMedView ArticleGoogle Scholar
- Delseny M: Re-evaluating the relevance of ancestral shared synteny as a tool for crop improvement. Curr Opin Plant Biol. 2004, 7: 126-31. 10.1016/j.pbi.2004.01.005.PubMedView ArticleGoogle Scholar
- The Arabidopsis Genome Initiative: Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000, 408: 796-815. 10.1038/35048692.View ArticleGoogle Scholar
- Yu J, Hu S, Wang J, Wong GK, Li S, Liu B, Deng Y, Dai L, Zhou Y, Zhang X, Cao M, Liu J, Sun J, Tang J, Chen Y, Huang X, Lin W, Ye C, Tong W, Cong L, Geng J, Han Y, Li L, Li W, Hu G, Huang X, Li W, Li J, Liu Z, Li L, Liu J, Qi Q, Liu J, Li L, Li T, Wang X, Lu H, Wu T, Zhu M, Ni P, Han H, Dong W, Ren X, Feng X, Cui P, Li X, Wang H, Xu X, Zhai W, Xu Z, Zhang J, He S, Zhang J, Xu J, Zhang K, Zheng X, Dong J, Zeng W, Tao L, Ye J, Tan J, Ren X, Chen X, He J, Liu D, Tian W, Tian C, Xia H, Bao Q, Li G, Gao H, Cao T, Wang J, Zhao W, Li P, Chen W, Wang X, Zhang Y, Hu J, Wang J, Liu S, Yang J, Zhang G, Xiong Y, Li Z, Mao L, Zhou C, Zhu Z, Chen R, Hao B, Zheng W, Chen S, Guo W, Li G, Liu S, Tao M, Wang J, Zhu L, Yuan L, Yang H: A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science. 2002, 296: 79-92. 10.1126/science.1068037.PubMedView ArticleGoogle Scholar
- Goff SA, Ricke D, Lan TH, Presting G, Wang R, Dunn M, Glazebrook J, Sessions A, Oeller P, Varma H, Hadley D, Hutchison D, Martin C, Katagiri F, Lange BM, Moughamer T, Xia Y, Budworth P, Zhong J, Miguel T, Paszkowski U, Zhang S, Colbert M, Sun WL, Chen L, Cooper B, Park S, Wood TC, Mao L, Quail P, Wing R, Dean R, Yu Y, Zharkikh A, Shen R, Sahasrabudhe S, Thomas A, Cannings R, Gutin A, Pruss D, Reid J, Tavtigian S, Mitchell J, Eldredge G, Scholl T, Miller RM, Bhatnagar S, Adey N, Rubano T, Tusneem N, Robinson R, Feldhaus J, Macalma T, Oliphant A, Briggs S: A draft sequence of the rice genome (Oryza sativa L. ssp.japonica). Science. 2002, 296: 92-100. 10.1126/science.1068275.PubMedView ArticleGoogle Scholar
- Coghlan A, Eichler EE, Oliver SG, Paterson AH, Stein L: Chromosome evolution in eukaryotes: a multi-kingdom perspective. Trends Genet. 2005, 21: 673-82. 10.1016/j.tig.2005.09.009.PubMedView ArticleGoogle Scholar
- Bennetzen JL: Comparative sequence analysis of plant nuclear genomes: microcolinearity, its many exceptions. Plant Cell. 2000, 12: 1021-1029. 10.1105/tpc.12.7.1021.PubMedPubMed CentralView ArticleGoogle Scholar
- Paterson AH, Bowers JE, Chapman BA: Ancient polyploidization predating divergence of the cereals,, its consequences for comparative genomics. Proc Natl Acad Sci U S A. 2004, 101: 9903-9908. 10.1073/pnas.0307901101.PubMedPubMed CentralView ArticleGoogle Scholar
- Fulton TM, Van der Hoeven R, Eannetta NT, Tanksley SD: Identification, analysis,, utilization of conserved ortholog set markers for comparative genomics in higher plants. Plant Cell. 2002, 14: 1457-1467. 10.1105/tpc.010479.PubMedPubMed CentralView ArticleGoogle Scholar
- Draye X, Lin YR, Qian XY, Bowers JE, Burow GB, Morrell PL, Peterson DG, Presting GG, Ren SX, Wing RA, Paterson AH: Toward integration of comparative genetic, physical, diversity,, cytomolecular maps for grasses, grains, using the sorghum genome as a foundation. Plant Physiol. 2001, 125: 1325-1341. 10.1104/pp.125.3.1325.PubMedPubMed CentralView ArticleGoogle Scholar
- Lyons LA, Laughlin TF, Copeland NG, Jenkins NA, Womack JE, O'Brien SJ: Comparative anchor tagged sequences (CATS: for integrative mapping of mammalian genomes. Nature Genetics. 1997, 15: 47-56. 10.1038/ng0197-47.PubMedView ArticleGoogle Scholar
- Fredslund J, Schauser L, Madsen LH, Sandal N, Stougaard J: PriFi: using a multiple alignment of related sequences to find primers for amplification of homologs. Nucl Acids Res. 2005, 33: W516-520. 10.1093/nar/gki425.PubMedPubMed CentralView ArticleGoogle Scholar
- Schlueter JA, Dixon P, Granger C, Grant D, Clark L, Doyle JJ, Shoemaker RC: Mining EST databases to resolve evolutionary events in major crop species. Genome. 2004, 47: 868-876. 10.1139/g04-047.PubMedView ArticleGoogle Scholar
- Mudge J, Cannon SB, Kalo P, Oldroyd GE, Roe BA, Town CD, Young ND: Highly syntenic regions in the genomes of soybean, Medicago truncatula and Arabidopsis thaliana. BMC Plant Biol. 2005, 5: 15+-10.1186/1471-2229-5-15.PubMedPubMed CentralView ArticleGoogle Scholar
- Pfeil BE, Schlueter JA, Shoemaker RC, Doyle JJ: Placing paleopolyploidy in relation to taxon divergence: a phylogenetic analysis in legumes using 39 gene families. Syst Biol. 2005, 54: 441-454. 10.1080/10635150590945359.PubMedView ArticleGoogle Scholar
- Blanc G, Wolfe KH: Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. Plant Cell. 2004, 16: 1667-1678. 10.1105/tpc.021345.PubMedPubMed CentralView ArticleGoogle Scholar
- Vision TJ, Brown DG, Tanksley SD: The origins of genomic duplications in Arabidopsis. Science. 2000, 290: 2114-2117. 10.1126/science.290.5499.2114.PubMedView ArticleGoogle Scholar
- Simillion C, Vandepoele K, Montagu MC, Zabeau M, Peer Y: The hidden duplication past of Arabidopsis thaliana. Proc Natl Acad Sci USA. 2002, 99: 13627-13632. 10.1073/pnas.212522399.PubMedPubMed CentralView ArticleGoogle Scholar
- Bowers JE, Chapman BA, Rong J, Paterson AH: Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature. 2003, 422: 433-438. 10.1038/nature01521.PubMedView ArticleGoogle Scholar
- Fedorov A, Merican AF, Gilbert W: Large-scale comparison of intron positions among animal, plant,, fungal genes. Proc Natl Acad Sci U S A. 2002, 99: 16128-16133. 10.1073/pnas.242624899.PubMedPubMed CentralView ArticleGoogle Scholar
- Young ND, Cannon SB, Sato S, Kim D, Cook DR, Town CD, Roe BA, Tabata S: Sequencing the genespaces of Medicago truncatula, Lotus japonicus. Plant Physiol. 2005, 137: 1174-1181. 10.1104/pp.104.057034.PubMedPubMed CentralView ArticleGoogle Scholar
- Poplar Genome Consortium. [http://genome.jgi-psf.org/Poptr1/Poptr1.home.html]
- Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD: Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res. 2003, 31: 3497-3500. 10.1093/nar/gkg500.PubMedPubMed CentralView ArticleGoogle Scholar
- Ware DH, Jaiswal P, Ni J, Yap IV, Pan X, Clark KY, Teytelman L, Schmidt SC, Zhao W, Chang K, Cartinhour S, Stein LD, McCouch SR: Gramene, a tool for grass genomics. Plant Physiol. 2002, 130: 1606-1613. 10.1104/pp.015248.PubMedPubMed CentralView ArticleGoogle Scholar
- Moretzsohn MC, Leoi L, Proite K, Guimaraes PM, Leal-Bertioli SC, Gimenes MA, Martins WS, Valls JF, Grattapaglia D, Bertioli DJ: A microsatellite-based, gene-rich linkage map for the AA genome of Arachis (Fabaceae). Theor Appl Genet. 2005, 111: 1060-1071. 10.1007/s00122-005-0028-x.PubMedView ArticleGoogle Scholar
- Zhu H, Kim DJ, Baek JM, Choi HK, Ellis LC, Kuester H, McCombie WR, Peng HM, Cook DR: Syntenic relationships between Medicago truncatula, Arabidopsis reveal extensive divergence of genome organization. Plant Physiol. 2003, 131: 1018-1026. 10.1104/pp.102.016436.PubMedPubMed CentralView ArticleGoogle Scholar
- Krusell L, Madsen LH, Sato S, Aubert G, Genua A, Szczyglowski K, Duc G, Kaneko T, Tabata S, de Bruijn F, Pajuelo E, Sandal N, Stougaard J: Shoot control of root development, nodulation is mediated by a receptor-like kinase. Nature. 2002, 420: 422-426. 10.1038/nature01207.PubMedView ArticleGoogle Scholar
- Choi HK, Mun JH, Kim DJ, Zhu H, Baek JM, Mudge J, Roe B, Ellis N, Doyle J, Kiss GB, Young ND, Cook DR: Estimating genome conservation between crop, model legume species. Proc Natl Acad Sci U S A. 2004, 101: 15289-15294. 10.1073/pnas.0402251101.PubMedPubMed CentralView ArticleGoogle Scholar
- Choi HK, Kim D, Uhm T, Limpens E, Lim H, Mun JH, Kalo P, Penmetsa RV, Seres A, Kulikova O, Roe BA, Bisseling T, Kiss GB, Cook DR: A sequence-based genetic map of Medicago truncatula comparison of marker colinearity with M sativa. Genetics. 2004, 166: 1463-1502. 10.1534/genetics.166.3.1463.PubMedPubMed CentralView ArticleGoogle Scholar
- Zhu H, Choi HK, Cook DR, Shoemaker RC: Bridging model, crop legumes through comparative genomics. Plant Physiol. 2005, 137: 1189-96. 10.1104/pp.104.058891.PubMedPubMed CentralView ArticleGoogle Scholar
- Quackenbush J, Cho J, Lee D, Liang F, Holt I, Karamycheva S, Parvizi B, Pertea G, Sultana R, White J: The TIGR Gene Indices: analysis of gene transcript sequences in highly sampled eukaryotic species. Nucleic Acids Res. 2001, 29: 159-164. 10.1093/nar/29.1.159.PubMedPubMed CentralView ArticleGoogle Scholar
- Pertea G, Huang X, Liang F, Antonescu V, Sultana R, Karamycheva S, Lee Y, White J, Cheung F, Parvizi B, Tsai J, Quackenbush J: TIGR Gene Indices clustering tools (TGICL): a software system for fast clustering of large EST datasets. Bioinformatics. 2003, 19: 651-652. 10.1093/bioinformatics/btg034.PubMedView ArticleGoogle Scholar
- Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST, PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.PubMedPubMed CentralView ArticleGoogle Scholar
- Sandal N, Petersen TR, Murray J, Umehara Y, Karas B, Yano K, Kumagai H, Yoshikawa M, Saito K, Hayashi M, Murakami Y, Wang X, Hakoyama T, Imaizumi-Anraku H, Sato S, Kato T, Chen W, Hossain MS, Shibata S, Wang TL, Yokota K, Larsen K, Kanamori N, Madsen E, Radutoiu S, Madsen LH, Radu TG, Krusell L, Ooki Y, Banba M, Betti M, Rispail N, Skot L, Tuck E, Perry J, Yoshida S, Vickers K, Pike J, Mulder L, Charpentier M, Muller J, Ohtomo R, Kojima T, Ando S, Marquez AJ, Gresshoff PM, Harada K, Webb J, Hata S, Suganuma N, Kouchi H, Kawasaki S, Tabata S, Hayashi M, Parniske M, Szczyglowski K, Kawaguchi M, Stougaard J: Genetics of symbiosis in Lotus japonicus : recombinant inbred lines, comparative genetic maps, and map position of 35 symbiotic loci. Mol Plant Microbe Interact. 2006, 19: 80-91.PubMedView ArticleGoogle Scholar
- Medicago truncatula sequencing resources. [http://www.medicago.org/genome]
- Freyre R, Skrotch PW, Geffrey V, Adam-Blondon A-F, Shirmohamadali A, Johnson WC, Llaca V, Nodari RO, Pereira PA, Tsai S-M, Tohme J, Dron M, Nienhuis J, Vallejos CE, Gepts P: Towards, integrated map of common bean 4: Development of a core linkage map, alignment of RFLP maps. Theor Appl Genet. 1998, 97: 847-856. 10.1007/s001220050964.View ArticleGoogle Scholar
- Doyle JJ, Luckow MA: The rest of the iceberg Legume diversity, evolution in a phylogenetic context. Plant Physiol. 2003, 131: 900-910. 10.1104/pp.102.018150.PubMedPubMed CentralView ArticleGoogle Scholar
- Fredslund J, Madsen LH, Hougaard BK, Sandal N, Stougaard J, Bertioli D, Schauser L: GeMprospector – Online Design of Cross-Species Genetic Marker Candidates in Legumes, Grasses. Nucleic Acids Research.Google Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.