Identification of SNP and SSR markers in eggplant using RAD tag sequencing
© Barchi et al; licensee BioMed Central Ltd. 2011
Received: 7 March 2011
Accepted: 10 June 2011
Published: 10 June 2011
The eggplant (Solanum melongena L.) genome is relatively unexplored, especially compared to those of the other major Solanaceae crops tomato and potato. In particular, no SNP markers are publicly available; on the other hand, over 1,000 SSR markers have been developed and are publicly available. We have combined the recently developed Restriction-site Associated DNA (RAD) approach with Illumina DNA sequencing for rapid and mass discovery of both SNP and SSR markers for eggplant.
RAD tags were generated from the genomic DNA of a pair of eggplant mapping parents, and sequenced to produce ~17.5 Mb of sequences arrangeable into ~78,000 contigs. The resulting non-redundant genomic sequence dataset consisted of ~45,000 sequences, of which ~29% were putative coding sequences and ~70% were in common between the mapping parents. The shared sequences allowed the discovery of ~10,000 SNPs and nearly 1,000 indels, equivalent to a SNP frequency of 0.8 per Kb and an indel frequency of 0.07 per Kb. Over 2,000 of the SNPs are likely to be mappable via the Illumina GoldenGate assay. A subset of 384 SNPs was used to successfully fingerprint a panel of eggplant germplasm, producing a set of informative diversity data. The RAD sequences also included nearly 2,000 putative SSRs, and primer pairs were designed to amplify 1,155 loci.
The high throughput sequencing of the RAD tags allowed the discovery of a large number of DNA markers, which will prove useful for extending our current knowledge of the genome organization of eggplant, for assisting in marker-aided selection and for carrying out comparative genomic analyses within the Solanaceae family.
Eggplant (Solanum melongena L., 2n = 2x = 24) is a species belonging to the Solanaceae family. It is assumed to have been first domesticated in South and East Asia, and brought to Europe by Arab traders and immigrants around 600 CE. In production terms, eggplant is the third most important Solanaceae crop species (after potato and tomato; http://faostat.fao.org), and is cultivated all over the world, but most intensively in China and India. About 2.4% of world production in 2009 was sited in Europe, with Italy being the single largest producer.
The estimated genome size of eggplant is 1.1 Gbp. Knowledge of its genome organization is rather limited compared to that of either tomato or potato (http://solgenomics.net/, http://www.potatogenome.net). Genetic maps based on both inter-specific [4, 5] and intra-specific [6–9] crosses have been developed. The most recent inter-specific map comprises 347 COS and RFLP markers spanning 1,535 cM, while the most recent intra-specific maps were constructed by Barchi et al. and Nunome et al. and comprise 238 markers spanning 718.7 cM and 236 markers spanning 951.4 cM, respectively. Nevertheless, the level of marker saturation is still low in the context of both fine mapping and genomic synteny. A small set of SSR markers was developed by Stagel et al. from genic DNA sequence lodged in public access databases, while Nunome et al. reported the identification of over 1,000 SSR markers from a screen of enriched gDNA and cDNA libraries. Many of the latter proved informative for intra-specific mapping and have been used to generate what is currently the best available genetic linkage map. More recently, Fukuoka et al. have published a dataset containing a large number (~16,000) of transcript sequences, but these have yet to be mined for either SSR or SNP markers.
The so-called "Restriction-site Associated DNA" (RAD) method was proposed by Miller et al. as providing a reliable means for genome complexity reduction. The concept is based on acquiring the sequence adjacent to a set of particular restriction enzyme recognition sites. The application of high throughput sequencing technology has allowed significant progress in developing a RAD genotyping platform; specifically, large volumes of polymorphism data can now be generated by applying massively parallel sequencing and multiplexing with RAD tag libraries.
In this report we describe the generation of genomic RAD tags from the two parents of an F2 segregating population used to generate an intra-specific eggplant genetic map; the RAD tags were sequenced using the Illumina platform and then annotated/categorized. These data allowed the discovery of a large number of SNP, indel and SSR markers, and some of the SNPs have been tested against a panel of eggplant accessions.
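The idea of sampling only the sequence next to restriction sites can be illustrated with a small in silico sketch. It is not the library protocol used in this study: the function name `rad_tags`, the toy sequence and the tag length are hypothetical, the PstI recognition sequence (CTGCAG) is used only as an example site, and for simplicity only the sequence downstream of each site is reported.

```python
import re


def rad_tags(sequence, site="CTGCAG", tag_length=20):
    """Return (site_position, tag) pairs for the sequence just downstream of each site.

    Simplified illustration: real RAD tags flank both sides of the cut site.
    """
    seq = sequence.upper()
    tags = []
    # A zero-width lookahead reports every occurrence of the site, even overlapping ones.
    for match in re.finditer(f"(?={re.escape(site)})", seq):
        start = match.start() + len(site)
        tag = seq[start:start + tag_length]
        if len(tag) == tag_length:  # skip sites too close to the end of the sequence
            tags.append((match.start(), tag))
    return tags


# Hypothetical toy sequence containing two example sites.
toy = "AAACTGCAGTTTTGGGGCCCCAAAATTTTCTGCAGACGTACGTACGTACGTACGTAA"
for position, tag in rad_tags(toy, tag_length=10):
    print(position, tag)
```

Because only the neighbourhoods of recognition sites are sampled, the same loci tend to be recovered from different individuals, which is what makes the tags comparable between mapping parents.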
Results and Discussion
Sequencing and contig assembly
Summary statistics of the RAD tag sequencing via Illumina (San Diego, CA)
Illumina reads (million)
Mb of sequences
Total Mb after sequence editing
Average contig length (bp)
Contig length range (bp, min-max)
Common contigs between parents
Number of sequences with SNPs
Total SNPs (frequency): 10,089 (1/1,241 bp)
Total InDels (frequency): 874 (1/14,325 bp)
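The polymorphism frequencies quoted in the abstract (0.8 SNPs and 0.07 indels per Kb) follow directly from the "one per N bp" figures above. The short calculation below is only an arithmetic check; the function name `per_kb` is hypothetical, and the implied length of compared sequence is an inference from the two counts, not a value reported in the text.

```python
def per_kb(bp_per_event):
    """Convert a 'one event per N bp' figure into events per kilobase."""
    return 1000.0 / bp_per_event


snp_per_kb = per_kb(1241)      # SNPs: 1 per 1,241 bp
indel_per_kb = per_kb(14325)   # indels: 1 per 14,325 bp

# Both counts should imply roughly the same amount of compared sequence.
implied_bp_from_snps = 10089 * 1241
implied_bp_from_indels = 874 * 14325

print(round(snp_per_kb, 2), round(indel_per_kb, 3))
print(implied_bp_from_snps, implied_bp_from_indels)
```

Both products come to about 12.5 Mb, so the SNP and indel frequencies appear to have been computed over the same set of shared sequences.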
In all, 6,411 sequences (14.1%) of the SM-I dataset matched 4,761 entries in the Fukuoka 16 K eggplant annotated unigene dataset (hereafter referred to as 16 K). A BlastN search of the SGN Cornell unigene database (http://solgenomics.net/) produced significant hits from 9,476 (20.9%) of the SM-I sequences matching 8,244 SGN unigenes, of which ~47% originated from tomato, ~38% from potato, and ~11% from tobacco. Combining the 16 K and SGN hits produced 12,315 unique sequences; a total of 9,976 sequences were properly annotated, of which 2,123 were annotated in both the SGN and 16 K databases, 6,440 only in SGN, and 1,413 only in 16 K. Some 35,414 SM-I sequences were unrepresented in either of these two databases, and these were used as a batch BlastX query against the TAIR9 Arabidopsis thaliana protein database to allow a putative assignment of function. In all, 2,798 sequences (7.9%) produced a hit with an E value of < e-15, corresponding to 1,853 A. thaliana genes. This rather small number of hits presumably reflects sequence divergence between eggplant and A. thaliana orthologs, although it has been recognized that the BLAST algorithm can be rather inefficient in identifying homologous sequences when short reads are involved. Overall, therefore, the SM-I dataset consists of some 12,774 annotated sequences which match 7,191 A. thaliana loci (Additional file 2).
The successful identification of a large number of SNP (and indel and SSR) markers highlights the utility of the RAD approach for uncovering genome-wide polymorphisms, especially in materials with low polymorphism. The versatility of the method lies in the ease with which different samples of the genome can be accessed merely by changing the identity of the restriction enzyme(s) used to cleave the genomic DNA; its particular advantage in the context of SNP discovery lies in the ease of aligning short DNA fragments between contrasting templates. Note also that the application of Illumina sequencing allowed for the identification of polymorphic sites outside of the restriction enzyme recognition site.
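The annotation tallies in the paragraph above can be cross-checked with a few lines of arithmetic. All counts are taken from the text; the variable names are hypothetical, and the total size of the SM-I dataset is not stated here, so it is inferred as the sum of annotated and unrepresented sequences.

```python
# Counts quoted in the text.
annotated_both = 2123        # annotated in both SGN and 16 K
annotated_sgn_only = 6440
annotated_16k_only = 1413
unrepresented = 35414        # no match in either unigene database
tair_hits = 2798             # unrepresented sequences with a TAIR9 BlastX hit

unigene_annotated = annotated_both + annotated_sgn_only + annotated_16k_only
total_annotated = unigene_annotated + tair_hits

# Inferred (not stated in the text): size of the SM-I dataset.
dataset_size = unigene_annotated + unrepresented

print(unigene_annotated, total_annotated, dataset_size)
print(round(100 * tair_hits / unrepresented, 1))   # share of TAIR9 hits
print(round(100 * 6411 / dataset_size, 1))         # share matching 16 K
print(round(100 * 9476 / dataset_size, 1))         # share matching SGN
```

Under that inference the quoted percentages (14.1%, 20.9% and 7.9%) are all reproduced, which suggests the figures in the paragraph are internally consistent.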
Genetic diversity revealed by SNP markers
Table 2. Solanum melongena lines genotyped with SNP markers (shape and skin colour are indicated). Fruit types listed: long dark purple (6 lines), long light purple (1), oval dark purple (5), round dark purple (1). Further entries: Mel sais (violetta), Violetta di Siracusa, Violetta di Toscana.
Identification of SSRs
A screen of the SM-I dataset resulted in the identification of 1,797 sequences containing 1,877 putative SSRs. A small number of these SSRs (22) were discarded as they had been identified previously [7, 10, 11]. The SSR was present in both mapping parents for 1,145 sequences, in '305E40' alone for 381 sequences, and in '67/3' alone for 329 sequences. At least 1,119 sequences permitted the design of PCR primers, leading to the generation of 1,155 putative markers (Additional file 4). About 4.1% of the SM-I sequences contained an SSR (equivalent to a density of one SSR per 9.0 Kb), which is comparable to the success rate recorded from ESTs of eggplant [7, 10] and tomato, somewhat higher than in potato, but lower than in either coffee or sweet pepper [36, 37]. Thus the RAD technique appears to offer an effective means of discovering SSRs, especially given the understanding that SSRs are more common in transcribed than in genomic sequences.
The 1,855 SSR motifs identified in 1,777 sequences (reported: average motif length, number of repeats)
Frequencies and repeat numbers for the 20 most frequent SSR motifs (reported: % of the total)
The RAD method was highly successful for the rapid and large-scale discovery of DNA markers, even in a species recognized to have low polymorphism. Applied to a pair of eggplant mapping parents, the approach was able to define over 10,000 SNPs, nearly 1,000 indels and 1,800 putative SSRs. The current eggplant genetic maps are far from saturated, and as such have had little impact on breeding. The early maps were based on a wide cross, as this was considered necessary to achieve a sufficient level of polymorphism for the markers then available. With the rapid advances being made in sequencing technology, it is now possible to work with intra-specific crosses which are more relevant to the breeder. The present study has generated a large number of SNP, indel and SSR assays, which should permit the rapid saturation of the best available intra-specific genetic map.
Our primary goal was the identification of SNP markers; however, the RAD tag sequence data also made it possible to identify SSR motifs and to design primer pairs for their amplification. Multi-allelic SSR markers are currently widely applied for both genetic mapping and diversity analyses, despite their development cost and limited throughput. During the last few years, the exploitation of publicly available EST sequences has led to the identification of several thousand new SSR markers in a wide range of vegetable species, such as tomato, pepper, globe artichoke and Brassica, as well as eggplant [10, 36, 37, 39, 44, 45].
The GoldenGate SNP array was highly robust for S. melongena germplasm, but also has potential for a wide-cross population as 84% of the loci were scorable in a contrast between cultivated eggplant and its relative S. aethiopicum. Since these DNA markers define a specific position in the eggplant genome, they should be useful for merging the various genetic linkage maps currently available, some of which include loci related to important agronomic traits. Finally, the markers are very informative for the analysis of genetic diversity, as well as for comparative studies across species within the Solanaceae family.
Plant materials and DNA isolation
DNA was extracted from the two eggplant lines '305E40' and '67/3', which are the parents of an F2 intra-specific mapping population. The female parent, double-haploid line '305E40', produces long, highly pigmented dark purple fruit. The parent '305E40' is an introgression line derived from the somatic hybrid S. melongena cv. 'Dourga' (+) S. aethiopicum which was backcrossed with a tetraploid plant of the eggplant line 'DR2' and then subjected to anther culture; an anther-derived dihaploid plant was backcrossed four times with the line 'Tal1/1', then selfed twice and, finally, made completely homozygous through anther culture [17, 44]. The male parent, line '67/3', was an F8 selection from the intra-specific cross cvs. 'Purpura' × 'CIN2'. Its fruit is round and violet coloured. The DNAs extracted from a set of 23 accessions (including the two mapping parents) representative of the S. melongena gene pool (Table 2), together with an accession of S. aethiopicum (a progenitor of '305E40'), were tested with a subset of the newly developed SNP assays. All DNA samples were extracted from young leaves, using the GenElute™ Plant Genomic DNA Miniprep kit (Sigma, St. Louis, MO), following the manufacturer's protocol.
RAD library preparation, sequencing, assembly
The RAD library was constructed at Floragenex Inc. (USA), according to the protocol described by Baird et al., as follows. Genomic DNA (300 ng) was digested for 60 min at 37° C in a 50 μL reaction containing 20 U each of SgrA I and Pst I (New England Biolabs, Beverly MA, USA). The reactions were stopped by holding at 65° C for 20 min. The P1 adapter (a modified Illumina adapter, see Baird et al.) was ligated to the products of the restriction reaction, and the "barcoding" of the various samples was achieved with a set of index nucleotides in the P1 adapter sequence. A 2.5 μL aliquot of 100 nM P1 adapter was added to each sample, along with 1 μL 10 mM ATP (Promega), 1 μL 10 × NEB Buffer4, 1 μL (equivalent to 1,000 U) T4 DNA ligase (Enzymatics, Inc) and 5 μL water, and the reaction was incubated at room temperature for 20 min, and then heat-inactivated (20 min at 65° C). The reactions were then pooled and the products randomly sheared to a mean size of 500 bp using a Bioruptor (Diagenode). The material was electrophoresed through a 1.5% agarose gel, and the DNA in the range 300-800 bp isolated using a MinElute Gel Extraction Kit (Qiagen). The dsDNA ends were treated with end blunting enzymes (Enzymatics, Inc) to remove overhangs, and the samples purified by passing through a MinElute column (Qiagen). 3'-adenine overhangs were then added by the addition of 15 U Klenow exo- (Enzymatics), followed by an incubation at 37° C for 10 min. Following re-purification, 1 μL 10 μM P2 adapter (a modified Illumina adapter, see Baird et al.) was ligated, as described above for P1. The samples were then purified as above, and eluted in a volume of 50 μL. Following quantification (Qubit fluorimeter), 20 ng were taken as the template for a 100 μL PCR containing 20 μL Phusion Master Mix (NEB), 5 μL 10 μM P1 adapter primer (Illumina), 5 μL 10 μM P2 adapter primer (Illumina) and water. The Phusion PCR settings followed product guidelines (NEB) over 18 cycles.
The amplicons were gel purified, the 300-700 bp size range was excised from the gel, and the DNA content was adjusted to 3 ng/μL. RADs from each parent were sequenced on a Genome Analyzer II (Illumina, San Diego, CA) using paired end 54 bp sequence reads. The paired end sequences from each parent were pooled and segregated by single read RAD sequences. Velvet was used to assemble consensus LongRead contigs from the paired end data. Repetitive element occurrence was searched via CENSOR, a software tool which screens query sequences against a reference collection of repeats (http://www.girinst.org/censor), adopting default parameters and considering Viridiplantae as target database.
The CAP3 algorithm was used to identify sequences in common between the mapping parents, using default parameters except for the overlap length cut-off (set to 80) and the overlap percent identity cut-off (set to 95). The resulting dataset (SM-I; Solanum melongena-Illumina) included singlets from '67/3' and '305E40' as well as contigs deriving from both RAD rounds. A stand-alone BLAST tool was used to provide the optimal annotation for each dataset.
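The two cut-offs quoted above can be read as a simple acceptance test on a candidate overlap between two sequences. The sketch below is not the CAP3 algorithm itself, which computes and scores its own alignments; it only shows how a minimum overlap length of 80 and a minimum identity of 95% would act on an overlap that has already been aligned. The function names are hypothetical.

```python
def overlap_passes(aligned_a, aligned_b, min_length=80, min_identity=95.0):
    """Check an already aligned overlap against length and identity cut-offs.

    aligned_a and aligned_b are equal-length strings; '-' marks a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned overlap strings must have equal length")
    length = len(aligned_a)
    if length < min_length:
        return False
    # Count columns where both sequences carry the same base (gaps never match).
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    identity = 100.0 * matches / length
    return identity >= min_identity


def with_mismatches(seq, positions):
    """Return a copy of seq with 'N' placed at the given positions (test helper)."""
    bases = list(seq)
    for pos in positions:
        bases[pos] = "N"
    return "".join(bases)


reference = "ACGT" * 25  # hypothetical 100-bp overlap
print(overlap_passes(reference, with_mismatches(reference, [0, 10, 20, 30])))          # 96% identity
print(overlap_passes(reference, with_mismatches(reference, [0, 10, 20, 30, 40, 50])))  # 94% identity
```

Raising both thresholds relative to a more permissive setting makes the merging of sequences from the two parents more conservative, at the cost of leaving more sequences as singlets.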
A BlastN search was performed against the SGN Cornell unigene database (http://solgenomics.net/), using as cut-off parameters 90% identity and a minimum alignment of 100 bp. A second BlastN search was made against the 16 K Fukuoka eggplant unigene dataset (referred to in this article as 16 K, http://vegmarks.nivot.affrc.go.jp), using as cut-off parameters 95% identity and a minimum alignment of 100 bp. A BlastX search was carried out against the TAIR9 dataset (http://www.arabidopsis.org), adopting a threshold E-value of e-15. The annotated sequences were assigned a function based on the Gene Ontology tool available at TAIR (http://www.arabidopsis.org/tools/bulk/go/), using A. thaliana orthologs as input (AGI codes), and mapped to higher level categories (plant GO Slim) using GOSlimViewer according to the three principal GO categories "molecular function", "biological process" and "cellular localization".
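The identity and alignment-length cut-offs described above amount to a filter over BLAST hits. The sketch below assumes, purely for illustration, that hits were exported in the standard 12-column tabular layout (query, subject, percent identity, alignment length, and so on); the text does not say which output format was used, and the function name and example hit lines are hypothetical.

```python
def filter_hits(tabular_lines, min_identity, min_length):
    """Keep hits whose percent identity and alignment length meet the cut-offs."""
    kept = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        identity, length = float(fields[2]), int(fields[3])
        if identity >= min_identity and length >= min_length:
            kept.append((query, subject, identity, length))
    return kept


# Hypothetical hits: query, subject, %identity, length, then the remaining standard columns.
example_hits = [
    "rad_0001\tunigene_A\t96.50\t120\t4\t0\t1\t120\t300\t419\t1e-50\t200",
    "rad_0002\tunigene_B\t88.00\t150\t18\t0\t1\t150\t10\t159\t1e-30\t150",
    "rad_0003\tunigene_C\t99.00\t80\t1\t0\t1\t80\t5\t84\t1e-35\t140",
]

print(filter_hits(example_hits, min_identity=90.0, min_length=100))  # SGN-style cut-offs
print(filter_hits(example_hits, min_identity=95.0, min_length=100))  # 16 K-style cut-offs
```

The stricter identity threshold for the eggplant 16 K dataset is consistent with a same-species comparison, whereas the SGN search spans several Solanaceae species.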
SNPs were called using a short read alignment algorithm which aligned non-assembled 50 bp Illumina reads from '67/3' against the '305E40' assembly, by analogy with the MAQ style sequence pileup, at a minimum coverage of 6x; to call indels, an SSAHA-based alignment strategy was applied. Both SNPs and indels were regarded as true polymorphisms when each allele was observed at least three times.
Each SNP was assigned a designability score via a dedicated "assay design tool" (http://www.illumina.com), which identified SNP loci free of other polymorphisms 60 bp either upstream or downstream. A quality score, based on the probability of good performance using the Illumina GoldenGate assay, was assigned to each SNP, where a score > 0.6 indicated a high probability of success.
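The calling rule above (a minimum coverage of 6x, and each allele observed at least three times) can be sketched as a per-site test. This is one possible reading, not the pipeline actually used: it assumes that per-parent base counts are available at each site, that coverage is summed over both parents, and that each parent's allele is its most frequent base. The function name `is_snp` is hypothetical.

```python
def is_snp(counts_a, counts_b, min_coverage=6, min_allele_reads=3):
    """Decide whether a site differs between two parents.

    counts_a and counts_b map a base to the number of reads showing it
    in parent A and parent B, respectively.
    """
    if not counts_a or not counts_b:
        return False
    # Take the most frequently observed base as each parent's allele.
    allele_a = max(counts_a, key=counts_a.get)
    allele_b = max(counts_b, key=counts_b.get)
    coverage = sum(counts_a.values()) + sum(counts_b.values())
    return (
        allele_a != allele_b
        and coverage >= min_coverage
        and counts_a[allele_a] >= min_allele_reads
        and counts_b[allele_b] >= min_allele_reads
    )


print(is_snp({"A": 5}, {"G": 4}))  # both alleles well supported
print(is_snp({"A": 5}, {"G": 2}))  # second allele seen only twice
```

Under this reading, two alleles each seen at least three times already imply six reads in total, so the two thresholds are mutually consistent.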
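The requirement that a SNP be free of other polymorphisms within 60 bp on either side can be expressed as a spacing filter. The sketch below is not the Illumina assay design tool and does not compute its designability or quality scores; it only applies the flanking rule to a hypothetical mapping from contig name to SNP positions.

```python
def isolated_snps(positions_by_contig, flank=60):
    """Return (contig, position) for SNPs with no other polymorphism within `flank` bp."""
    isolated = []
    for contig, positions in positions_by_contig.items():
        ordered = sorted(positions)
        for i, pos in enumerate(ordered):
            # Distance to the nearest polymorphism on each side must exceed the flank.
            left_ok = i == 0 or pos - ordered[i - 1] > flank
            right_ok = i == len(ordered) - 1 or ordered[i + 1] - pos > flank
            if left_ok and right_ok:
                isolated.append((contig, pos))
    return isolated


# Hypothetical polymorphism positions on two contigs.
example = {"contig_1": [100, 130, 300], "contig_2": [50]}
print(isolated_snps(example))
```

A fuller version would also check that each SNP lies at least 60 bp from the contig ends, since the assay needs flanking sequence on both sides; contig lengths are omitted here for brevity.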
Genetic diversity assessment based on the GoldenGate assay
The GoldenGate assay (Illumina, San Diego, CA) was used for SNP genotyping at the UC Davis Genome Center. Automatic allele calling for each locus was obtained by GenCall software (Illumina). As an internal control, two duplicate templates were included in each run. An estimate of PIC (Polymorphism Information Content) was made following the suggestion of Anderson et al. Each SNP locus was scored in binary fashion. A co-phenetic distance matrix based on co-dominant markers was generated, as described by Smouse et al., and used to construct a UPGMA-based dendrogram as implemented within the NTSYS software package v2.10.
SSR motifs were identified by SciRoKo software. Both perfect and imperfect mono-, di-, tri-, tetra-, penta- and hexanucleotide motifs were targeted. Primer pairs were designed from the flanking sequences using PRIMER3 software in batch mode, as implemented in the SciRoKo package. The target amplicon size range was set as 125-250 bp, the optimal annealing temperature 60° C, and the optimal primer length 20 bp.
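The PIC estimate attributed to Anderson et al. is commonly written as one minus the sum of squared allele frequencies at a locus. The sketch below assumes that form, so it should be checked against the cited paper before reuse; the function name `pic` and the example genotype lists are hypothetical.

```python
from collections import Counter


def pic(alleles):
    """Polymorphism information content: 1 minus the sum of squared allele frequencies."""
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles supplied")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical allele calls for one biallelic SNP across a set of homozygous lines.
print(pic(["A"] * 12 + ["G"] * 12))  # evenly split locus
print(pic(["A"] * 23 + ["G"] * 1))   # nearly monomorphic locus
```

For a biallelic SNP this measure cannot exceed 0.5, which is reached when the two alleles are equally frequent in the panel.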
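A search for perfect SSRs of the six motif classes named above can be sketched with regular expressions. This is not SciRoKo, which also handles imperfect repeats and uses its own scoring; the minimum repeat numbers in the sketch are hypothetical thresholds chosen only for illustration, as is the function name.

```python
import re


def find_perfect_ssrs(sequence, min_repeats=None):
    """Return sorted (start, motif, repeat_count) for perfect SSRs with 1-6 bp motifs."""
    # Hypothetical thresholds: motif length -> minimum number of repeats.
    thresholds = min_repeats or {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}
    seq = sequence.upper()
    found = []
    for unit_len, min_rep in thresholds.items():
        # A motif of unit_len bases followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ATAT" is really "AT").
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            found.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(found)


# Hypothetical sequence with an (AT)8 and an (AAG)5 repeat.
toy = "GGC" + "AT" * 8 + "CCGTA" + "AAG" * 5 + "TTC"
print(find_perfect_ssrs(toy))
```

Once a repeat has been located, its flanking sequence is what would be passed to a primer design step with the amplicon size and annealing temperature targets given above.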
This research was partially supported by the Italian Ministry of Agricultural Alimentary and Forest Politics in the framework of the "PROM", "ESPLORA" and "AGRONANOTECH" projects.
- Polignano G, Uggenti P, Bisignano V, Gatta Della C: Genetic divergence analysis in eggplant (Solanum melongena L.) and allied species. Genetic Resources and Crop Evolution. 2010, 57 (2): 171-181. 10.1007/s10722-009-9459-6.
- Daunay M, Lester R, Ano G: Eggplant. Tropical plant breeding. Edited by: Charrier, A, Jacquot, M, Hamon, S & Nicolas, D. CIRAD, Paris, France, 199-222.
- Arumuganathan K, Earle E: Nuclear DNA content of some important plant species. Plant Molecular Biology Reporter. 1991, 9 (3): 208-218. 10.1007/BF02672069.
- Doganlar S, Frary A, Daunay M, Lester R, Tanksley S: A comparative genetic linkage map of eggplant (Solanum melongena) and its implications for genome evolution in the Solanaceae. Genetics. 2002, 161 (4): 1697-1711.
- Wu F, Eannetta N, Xu Y, Tanksley S: A detailed synteny map of the eggplant genome based on conserved ortholog set II (COSII) markers. Theoretical and Applied Genetics. 2009, 118 (5): 927-935. 10.1007/s00122-008-0950-9.
- Nunome T, Ishiguro K, Yoshida T, Hirai M: Mapping of fruit shape and color development traits in eggplant (Solanum melongena L.) based on RAPD and AFLP markers. Breeding science. 2001, 51 (1): 19-26. 10.1270/jsbbs.51.19.
- Nunome T, Negoro S, Kono I, Kanamori H, Miyatake K, Yamaguchi H, Ohyama A, Fukuoka H: Development of SSR markers derived from SSR-enriched genomic library of eggplant (Solanum melongena L.). Theoretical and Applied Genetics. 2009, 119 (6): 1143-1153. 10.1007/s00122-009-1116-0.
- Nunome T, Suwabe K, Iketani H, Hirai M: Identification and characterization of microsatellites in eggplant. Plant Breeding. 2003, 122 (3): 256-262. 10.1046/j.1439-0523.2003.00816.x.
- Barchi L, Lanteri S, Portis E, Stagel A, Vale G, Toppino L, Rotino GL: Segregation distortion and linkage analysis in eggplant (Solanum melongena L.). Genome. 2010, 53 (10): 805-815. 10.1139/G10-073.
- Stàgel A, Portis E, Toppino L, Rotino GL, Lanteri S: Gene-based microsatellite development for mapping and phylogeny studies in eggplant. BMC Genomics. 2008, 9: 357. 10.1186/1471-2164-9-357.
- Fukuoka H, Yamaguchi H, Nunome T, Negoro S, Miyatake K, Ohyama A: Accumulation, functional annotation, and comparative analysis of expressed sequence tags in eggplant (Solanum melongena L.), the third pole of the genus Solanum species after tomato and potato. Gene. 2010, 450 (1-2): 76-84. 10.1016/j.gene.2009.10.006.
- Miller MR, Dunham JP, Amores A, Cresko WA, Johnson EA: Rapid and cost-effective polymorphism identification and genotyping using restriction site associated DNA (RAD) markers. Genome Research. 2007, 17 (2): 240-248. 10.1101/gr.5681207.
- Baird NA, Etter PD, Atwood TS, Currey MC, Shiver AL, Lewis ZA, Selker EU, Cresko WA, Johnson EA: Rapid SNP Discovery and Genetic Mapping Using Sequenced RAD Markers. PLoS ONE. 2008, 3 (10): e3376. 10.1371/journal.pone.0003376.
- Zeng S, Xiao G, Guo J, Fei Z, Xu Y, Roe B, Wang Y: Development of a EST dataset and characterization of EST-SSRs in a traditional Chinese medicinal plant, Epimedium sagittatum (Sieb. Et Zucc.) Maxim. BMC Genomics. 2010, 11 (1): 94. 10.1186/1471-2164-11-94.
- Harris M, Clark J, Ireland A, Lomax J, Ashburner M, Foulger R, Eilbeck K, Lewis S, Marshall B, Mungall C, Richter J, Rubin GM, Blake JA, Bult C, Dolan M, Drabkin H, Eppig JT, Hill DP, Ni L, Ringwald M, Balakrishnan R, Cherry JM, Christie KR, Costanzo MC, Dwight SS, Engel S, Fisk DG, Hirschman JE, Hong EL, Nash RS, et al: The Gene Ontology (GO) database and informatics resource. Nucleic Acids Research. 2004, 32 (Database): D258-261.
- Varshney R, Hiremath P, Lekha P, Kashiwagi J, Balaji J, Deokar A, Vadez V, Xiao Y, Srinivasan R, Gaur P, Siddique KHM, Town CD, Hoisington DA: A comprehensive resource of drought- and salinity-responsive ESTs for gene discovery and marker development in chickpea (Cicer arietinum L.). BMC Genomics. 2009, 10: 523. 10.1186/1471-2164-10-523.
- Toppino L, Vale G, Rotino GL: Inheritance of Fusarium wilt resistance introgressed from Solanum aethiopicum Gilo and Aculeatum groups into cultivated eggplant (S.melongena) and development of associated PCR-based markers. Molecular Breeding. 2008, 22 (2): 237-250. 10.1007/s11032-008-9170-x.
- Simko I, Haynes KG, Jones RW: Assessment of Linkage Disequilibrium in Potato Genome With Single Nucleotide Polymorphism Markers. Genetics. 2006, 173 (4): 2237-2245. 10.1534/genetics.106.060905.
- Velasco R, Zharkikh A, Troggio M, Cartwright DA, Cestaro A, Pruss D, Pindo M, Fitzgerald LM, Vezzulli S, Reid J, Malacarne G, Iliev D, Coppola G, Wardell B, Micheletti D, Macalma T, Facci M, Mitchell JT, Perazzolli M, Eldredge G, Gatto P, Oyzerski R, Moretto M, Gutin N, Stefanini M, Chen Y, Segala C, Davenport C, Dematté L, Mraz A, et al: A High Quality Draft Consensus Sequence of the Genome of a Heterozygous Grapevine Variety. PLoS ONE. 2007, 2 (12): e1326. 10.1371/journal.pone.0001326.
- Barker G, Edwards K: A genome-wide analysis of single nucleotide polymorphism diversity in the world's major cereal crops. Plant Biotechnology Journal. 2009, 7 (4): 318-325. 10.1111/j.1467-7652.2009.00412.x.
- Jiang D, Ye QL, Wang FS, Cao L: The Mining of Citrus EST-SNP and Its Application in Cultivar Discrimination. Agricultural Sciences in China. 2010, 9 (2): 179-190. 10.1016/S1671-2927(09)60082-1.
- Van Deynze A, Stoffel K, Buell CR, Kozik A, Liu J, van der Knaap E, Francis D: Diversity in conserved genes in tomato. BMC Genomics. 2007, 8: 9. 10.1186/1471-2164-8-9.
- Jung J, Park S, Liu W, Kang B: Discovery of single nucleotide polymorphism in Capsicum and SNP markers for cultivar identification. Euphytica. 2010, 175 (1): 91-107. 10.1007/s10681-010-0191-2.
- Feltus FA, Wan J, Schulze SR, Estill JC, Jiang N, Paterson AH: An SNP Resource for Rice Genetics and Breeding Based on Subspecies Indica and Japonica Genome Alignments. Genome Research. 2004, 14 (9): 1812-1819. 10.1101/gr.2479404.
- Schneider K, Kulosa D, Soerensen T, Mohring S, Heine M, Durstewitz G, Polley A, Weber E, Jamsari, Lein J, Hohmann U, Tahiro E, Weisshaar B, Schulz B, Koch G, Jung C, Ganal M: Analysis of DNA polymorphisms in sugar beet (Beta vulgaris L.) and development of an SNP-based map of expressed genes. Theoretical and Applied Genetics. 2007, 115 (5): 601-615. 10.1007/s00122-007-0591-4.
- Riju A, Arunachalam V: Interspecific differences in single nucleotide polymorphisms (SNPs) and indels in expressed sequence tag libraries of oil palm Elaeis guineensis and E. oleifera. Available from Nature Precedings. 2009, [http://precedings.nature.com/documents/3593/version/2]
- Batley J, Barker G, O'Sullivan H, Edwards K, Edwards D: Mining for single nucleotide polymorphisms and insertions/deletions in maize expressed sequence tag data. Plant Physiology. 2003, 132 (1): 84-91. 10.1104/pp.102.019422.
- Holmquist R: Transitions and transversions in evolutionary descent: An approach to understanding. Journal of Molecular Evolution. 1983, 19 (2): 134-144. 10.1007/BF02300751.
- Yang Z, Yoder AD: Estimation of the Transition/Transversion Rate Bias and Species Sampling. Journal of Molecular Evolution. 1999, 48 (3): 274-283. 10.1007/PL00006470.
- Ramirez M, Graham M, Blanco-Lopez L, Silvente S, Medrano-Soto A, Blair M, Hernandez G, Vance C, Lara M: Sequencing and Analysis of Common Bean ESTs. Building a Foundation for Functional Genomics. Plant Physiology. 2005, 137: 1211-1227. 10.1104/pp.104.054999.
- Terol J, Naranjo M, Ollitrault P, Talon M: Development of genomic resources for Citrus clementina: characterization of three deep-coverage BAC libraries and analysis of 46,000 BAC end sequences. BMC Genomics. 2008, 9: 423. 10.1186/1471-2164-9-423.
- Wittwer CT, Reed GH, Gundry CN, Vandersteen JG, Pryor RJ: High-Resolution Genotyping by Amplicon Melting Analysis Using LCGreen. Clinical Chemistry. 2003, 49 (6): 853-860. 10.1373/49.6.853.
- Wu X, Ren C, Joshi T, Vuong T, Xu D, Nguyen H: SNP discovery by high-throughput sequencing in soybean. BMC Genomics. 2010, 11 (1): 469. 10.1186/1471-2164-11-469.
- Yan J, Yang X, Shah T, Sánchez-Villeda H, Li J, Warburton M, Zhou Y, Crouch J, Xu Y: High-throughput SNP genotyping with the GoldenGate assay in maize. Molecular Breeding. 2010, 25 (3): 441-451. 10.1007/s11032-009-9343-2.
- Hyten D, Song Q, Choi I, Yoon M, Specht J, Matukumalli L, Nelson R, Shoemaker R, Young N, Cregan P: High-throughput genotyping with the GoldenGate assay in the complex genome of soybean. Theoretical and Applied Genetics. 2008, 116 (7): 945-952. 10.1007/s00122-008-0726-2.
- Kumpatla S, Mukhopadhyay S: Mining and survey of simple sequence repeats in expressed sequence tags of dicotyledonous species. Genome. 2005, 48: 985-998. 10.1139/g05-060.
- Portis E, Nagy I, Sasvari Z, Stagel A, Barchi L, Lanteri S: The design of Capsicum spp. SSR assays via analysis of in silico DNA sequence, and their potential utility for genetic mapping. Plant Science. 2007, 172: 640-648. 10.1016/j.plantsci.2006.11.016.
- Morgante M, Hanafey M, Powell W: Microsatellites are preferentially associated with nonrepetitive DNA in plant genomes. Nature Genetics. 2002, 30: 194-200. 10.1038/ng822.
- Nagy I, Stagel A, Sasvari Z, Roder M, Ganal M: Development, characterization, and transferability to other Solanaceae of microsatellite markers in pepper (Capsicum annuum L.). Genome. 2007, 50: 668-688. 10.1139/G07-047.
- Shirasawa K, Asamizu E, Fukuoka H, Ohyama A, Sato S, Nakamura Y, Tabata S, Sasamoto S, Wada T, Kishida Y, Tsuruoka H, Fujishiro T, Yamada M, Isobe S: An interspecific linkage map of SSR and intronic polymorphism markers in tomato. Theoretical and Applied Genetics. 2010, 121 (4): 731-739. 10.1007/s00122-010-1344-3.
- Kantety R, La Rota M, Matthews D, Sorrells M: Data mining for simple sequence repeats in expressed sequence tags from barley, maize, rice, sorghum and wheat. Plant Molecular Biology. 2002, 48: 501-510. 10.1023/A:1014875206165.
- Tangphatsornruang S, Sangsrakru D, Chanprasert J, Uthaipaisanwong P, Yoocha T, Jomchai N, Tragoonrung S: The Chloroplast Genome Sequence of Mungbean (Vigna radiata) Determined by High-throughput Pyrosequencing: Structural Organization and Phylogenetic Relationships. DNA Research. 2010, 17 (1): 11-22. 10.1093/dnares/dsp025.
- La Rota M, Kantety R, Yu J, Sorrells M: Nonrandom distribution and frequencies of genomic and EST-derived microsatellite markers in rice, wheat, and barley. BMC Genomics. 2005, 6: 23. 10.1186/1471-2164-6-23.
- Tang J, Baldwin S, Jacobs J, Van der Linden CG, Voorrips RE, Leunissen JAM, Van Eck HJ, Vosman B: Large-scale identification of polymorphic microsatellites using an in silico approach. BMC Bioinformatics. 2008, 9: 374. 10.1186/1471-2105-9-374.
- Scaglione D, Acquadro A, Portis E, Taylor C, Lanteri S, Knapp S: Ontology and diversity of transcript-associated microsatellites mined from a globe artichoke EST database. BMC Genomics. 2009, 10: 454. 10.1186/1471-2164-10-454.
- Rizza F, Mennella G, Collonnier C, Shiachakr D, Kashyap V, Rajam M, Prestera M, Rotino GL: Androgenic dihaploids from somatic hybrids between Solanum melongena and S. aethiopicum group gilo as a source of resistance to Fusarium oxysporum f. sp melongenae. Plant Cell Reports. 2002, 20 (11): 1022-1032. 10.1007/s00299-001-0429-5.
- Zerbino DR, Birney E: Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Research. 2008, 18 (5): 821-829. 10.1101/gr.074492.107.
- Kohany O, Gentles AJ, Hankus L, Jurka J: Annotation, submission and screening of repetitive elements in Repbase: RepbaseSubmitter and Censor. BMC Bioinformatics. 2006, 7: 474. 10.1186/1471-2105-7-474.
- Huang X, Madan A: CAP3: A DNA Sequence Assembly Program. Genome Research. 1999, 9 (9): 868-877. 10.1101/gr.9.9.868.
- McCarthy F, Wang N, Magee GB, Nanduri B, Lawrence M, Camon E, Barrell D, Hill D, Dolan M, Williams WP, Luthe DS, Bridges SM, Burgess SC: AgBase: a functional genomics resource for agriculture. BMC Genomics. 2006, 7 (1): 229. 10.1186/1471-2164-7-229.
- Needleman SB, Wunsch CD: A general method applicable to the search for similarities in the amino acid sequence of two proteins. Journal of Molecular Biology. 1970, 48 (3): 443-453. 10.1016/0022-2836(70)90057-4.
- Li H, Ruan J, Durbin R: Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Research. 2008, 18 (11): 1851-1858. 10.1101/gr.078212.108.
- Ning Z, Cox A, Mullikin J: SSAHA: A fast search method for large DNA databases. Genome Research. 2001, 1725-1729.
- Anderson J, Churchill G, Autrique J, Tanksley S, Sorrells M: Optimizing parental selection for genetic linkage maps. Genome. 1992, 36: 181-186.
- Smouse PE, Peakall R: Spatial autocorrelation analysis of individual multiallele and multilocus genetic structure. Heredity. 1999, 82 (5): 561-573. 10.1038/sj.hdy.6885180.
- Rohlf F: NTSYS-pc Numerical Taxonomy and Multivariate Analysis System version 2.02 User Guide. 1998
- Kofler R, Schlotterer C, Lelley T: SciRoKo: a new tool for whole genome microsatellite search and investigation. Bioinformatics. 2007, 23 (13): 1683-1685. 10.1093/bioinformatics/btm157.
- Rozen S, Skaletsky H: Primer3 on the www for general users and for biologist programmers. Methods Molecular Biology. 2000, 132: 365-386.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.