Transcriptome sequencing of field pea and faba bean for discovery and validation of SSR genetic markers
© Kaur et al; licensee BioMed Central Ltd. 2012
Received: 7 October 2011
Accepted: 20 March 2012
Published: 20 March 2012
Field pea (Pisum sativum L.) and faba bean (Vicia faba L.) are cool-season grain legume species that provide rich sources of food for humans and fodder for livestock. To date, both species have been relative 'genomic orphans' due to limited availability of genetic and genomic information. A significant enrichment of genomic resources is consequently required in order to understand the genetic architecture of important agronomic traits, and to support germplasm enhancement, genetic diversity, population structure and demographic studies.
cDNA samples obtained from various tissue types of specific field pea and faba bean genotypes were sequenced using 454 Roche GS FLX Titanium technology. A total of 720,324 and 304,680 reads for field pea and faba bean, respectively, were de novo assembled to generate sets of 70,682 and 60,440 unigenes. Consensus sequences were compared against the genome of the model legume species Medicago truncatula Gaertn., as well as that of the more distantly related, but better-characterised genome of Arabidopsis thaliana L. In comparison to M. truncatula coding sequences, 11,737 and 10,179 unique hits were obtained from field pea and faba bean. Totals of 22,057 field pea and 18,052 faba bean unigenes were subsequently annotated from GenBank. Comparison to the genome of soybean (Glycine max L.) resulted in 19,451 unique hits for field pea and 16,497 unique hits for faba bean, corresponding to c. 35% and 30% of the known gene space, respectively. Simple sequence repeat (SSR)-containing expressed sequence tags (ESTs) were identified from consensus sequences, and totals of 2,397 and 802 primer pairs were designed for field pea and faba bean. Subsets of 96 EST-SSR markers were screened for validation across modest panels of field pea and faba bean cultivars, as well as related non-domesticated species. For field pea, 86 primer pairs successfully obtained amplification products from one or more template genotypes, of which 59% revealed polymorphism between 6 genotypes. In the case of faba bean, 81 primer pairs displayed successful amplification, of which 48% detected polymorphism.
The generation of EST datasets for field pea and faba bean has permitted effective unigene identification and functional sequence annotation. EST-SSR loci were detected at incidences of 14-17%, permitting design of comprehensive sets of primer pairs. The subsets from these primer pairs proved highly useful for polymorphism detection within Pisum and Vicia germplasm.
The Fabaceae (Leguminosae) is the third largest angiosperm family, containing c. 18,000 species attributed to 650 genera [1–3]. Legumes provide major benefits to cropping systems and the environment, due to the ability to perform symbiotic nitrogen fixation. In comparison to cereals, for which a broad range of genetic and genomic resources are available, genomic databases for legumes are generally still underdeveloped. However, recent advances in sequencing and genotyping technologies offer the opportunity to rapidly ameliorate the status of given species at relatively low cost. Major efforts are currently being directed towards the development of species-specific genomic tools and datasets. As an example, the whole genome sequence of soybean, a warm-season grain legume, has recently been determined (http://www.phytozome.net/soybean).
Cool-season food legumes within the Hologalegina clade of the Fabaceae sub-family Papilionoideae, which includes lentil, chickpea, field pea and faba bean (pulses), are important food and fodder crops, especially in developing countries such as those of the Indian sub-continent. These species are important components of farming systems across Western Asia, the Middle East, North Africa, the Indian sub-continent, North America and Australia. In Australia, pulses are sown over c. 2 million hectares and produce c. 2.5 million tonnes of grain with a commodity value of over AU$675 million. Despite close phylogenetic relationships, pulse species vary considerably in aspects of biology such as genome size, fundamental chromosome number, ploidy level, and degree of reproductive self-compatibility. The genome size of chickpea is relatively small (c. 700 Mb), but pulses of the Vicieae tribe (lentil, pea and faba bean) exhibit much larger genome sizes (in the range from 4-13 Gb). Recently, generation of large-scale lentil transcriptome data by our group has substantially increased the volume of publicly available genomic data for this species. Similar strategies have been pursued for field pea and faba bean in the current study.
Field pea, which is the third most globally important grain legume crop (at 5.5 million hectares per year) after soybean and common bean (Phaseolus vulgaris L.), is a self-pollinating diploid (2n = 2x = 14) species with a genome size of c. 5 Gbp. Various studies have been performed to determine the genetic basis of multiple phenotypic traits in field pea [9–11] and to quantify diversity between different pea cultivars [12–16]. Recently, a comprehensive transcriptome analysis of field pea has been performed using second-generation sequencing technologies, which will contribute significantly to the enrichment of genomic resources for field pea. In contrast, faba bean has not been widely adopted on a global basis. In terms of cultivation area, this species ranks fourth among the cool-season food legumes (at 2.6 million hectares per year) after field pea, chickpea and lentil (http://faostat.fao.org). Faba bean has been traditionally cultivated in the Mediterranean basin, the Nile valley, Ethiopia, Central and East Asia, Latin America, Northern Europe, North America and Australia. Faba bean is a diploid taxon (2n = 2x = 12), and exhibits facultative cross-pollination at frequencies ranging from 4-84%. The nuclear genome size of faba bean is one of the largest yet described among crop legumes, at c. 13 Gb. Formal genetic analysis of faba bean, such as through genetic linkage mapping and identification of quantitative trait loci (QTLs), has so far been hindered by these aspects of biology.
Conventional breeding methods based on phenotypic assessment are currently in use for breeding line selection in field pea and faba bean. Such methods are logistically demanding and time-consuming, especially for traits that require specific biotic or abiotic challenges, such as resistance to individual diseases. In addition to this, when breeding for types eaten as immature seed, quality testing adds considerable complexity to the relevant programs. There is consequently a major requirement for species-specific molecular genetic markers and derived linkage maps for field pea and faba bean, to enable germplasm advancement through genomics-assisted selection.
Current publicly available genetic and genomic tools for field pea and faba bean are limited in extent [20–23], comprising 18,552 and 5,253 ESTs, respectively, that are available in GenBank. In addition to this, a recently sequenced Pisum sativum transcriptome generated a total of 81,449 unigenes that are also available for download in fully annotated FASTA format. Second-generation DNA sequencing systems such as the Roche 454 massively-parallel pyrosequencing platform are capable of rapidly producing species-specific genomic resources to address these shortcomings. This system can generate 4-6 × 10^8 bp from each run, with individual read lengths of 400-500 bp, and is suitable for de novo sequencing of small genomes, whole genome resequencing, SNP detection, and in particular, sequencing of transcriptomes.
ESTs obtained from the latter activity provide valuable resources for gene discovery, large-scale expression analysis, improved genome annotation, elucidation of phylogenetic relationships and facilitation of breeding programs for both plants and animals through provision of SSR and single nucleotide polymorphism (SNP) genetic markers. SSR loci have been widely used for improvement of a range of crop species. Only a limited number of SSRs are available in the public domain for field pea and faba bean, creating an incentive for further discovery and validation. In comparison with genomic DNA-derived SSRs, those located in ESTs are functionally associated with genic regions, and support potential diagnostic genetic marker development [31–34].
This study describes the development, de novo assembly and gene annotation of a transcriptome dataset derived from cDNA samples obtained from several tissues at various stages of development of multiple field pea and faba bean genotypes. Clustering and annotation to generate a unigene set has permitted computational identification of SSR loci, and the design and evaluation of a set of EST-SSR marker-directed primer pairs.
Materials and methods
Seeds of field pea were obtained from the Australian Temperate Field Crops Collection (ATFCC) held at the Department of Primary Industries, Horsham, Victoria, Australia. Faba bean seeds were obtained from the Australian faba bean breeding program at The University of Adelaide, South Australia, Australia. Three to four seeds from each variety of field pea (Parafield, Yarrum, Kaspa, 96-286*) and faba bean (Icarus, Ascot) were selected based on the criteria of genetic diversity and significant agronomic variation, and were sown into commercial potting mix. These genotypes were also potential parents for the genetic mapping populations of field pea and faba bean, to be used to dissect various traits of interest. Germinated plantlets were grown to maturity under glasshouse conditions with natural light at the Department of Primary Industries, Bundoora, Victoria, Australia. Selected plant tissues were harvested for RNA isolation from plants at various stages of development, including leaf (young and mature), stem, flowers, immature pods, mature pods and immature seeds. A total of 4-8 seeds were also germinated in Petri dishes in order to provide material for harvest of seedling root and shoot samples. All of the vegetative plant tissues (leaf and stem) were pooled for RNA isolation and designated LS (leaf/stem) tissue. All of the reproductive organs including flowers, immature pods, mature pods and immature seeds were also pooled for RNA isolation and designated FS (flower/seed) tissue. The seedling-derived root (RG) and shoot (SG) samples were used separately for RNA isolation.
RNA isolation and cDNA preparation
Total RNA isolation and cDNA synthesis were performed as described in an equivalent study performed for lentil.
EST sequence generation, assembly and annotation
cDNAs obtained from the four distinct RNA pools (LS, FS, RG and SG) were combined in equimolar ratio before proceeding to GS FLX library preparation. Approximately 5 μg of bulked cDNA was sheared by nebulisation at 206 kPa for 2-4 min. The GS FLX Titanium shotgun libraries were constructed following manufacturer's instructions (Roche Diagnostics, Castle Hill, NSW, Australia). The ssDNA libraries were quantified using real-time quantitative PCR. Finally, emulsion (em) PCR was performed using the Lib-L emPCR protocol (Roche Diagnostics, Castle Hill, NSW, Australia). The enriched beads obtained as a result of em-PCR were loaded onto picotitre plates for sequencing. All of the pooled cDNA libraries obtained from different genotypes of field pea and faba bean were separately sequenced on individual quarters of picotitre plates.
All sequence reads generated from different genotypes were de novo assembled using the NextGENe software (SoftGenetics, State College, Pennsylvania, USA). The adaptor and primer sequences were removed prior to the assembly using the 'trimming' function (trim sequences with 100% similarity to the primer/adaptor sequence). De novo assembly was performed using the Greedy algorithm and error correction condensation. The Greedy algorithm searches for maximum overlap between reads and extends the overlap to form large contigs; it is recommended for 454 reads or reads with average read length > 70 bp. The error correction condensation tool functions by dividing sequence reads in which homopolymers are found and at least 16 bases intervene between the homopolymer runs. These shorter reads are termed keywords, and comparison of keywords between reads allows the correct determination of the bases at the end of each keyword. Sequence reads that contain variations of low frequency are then corrected.
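The greedy overlap strategy described above can be illustrated with a toy sketch. This is not the implementation used by the assembly software, whose internals are not described here; the function names and the `min_overlap` parameter are hypothetical, and the sketch ignores reverse complements, sequencing errors and the condensation step.

```python
from itertools import permutations


def overlap_length(left, right, min_overlap):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    max_len = min(len(left), len(right))
    for size in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_overlap=3):
    """Repeatedly merge the pair of sequences sharing the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_len, best_pair = 0, None
        for a, b in permutations(range(len(contigs)), 2):
            size = overlap_length(contigs[a], contigs[b], min_overlap)
            if size > best_len:
                best_len, best_pair = size, (a, b)
        if best_pair is None:  # no remaining overlap meets the threshold
            break
        a, b = best_pair
        merged = contigs[a] + contigs[b][best_len:]
        contigs = [c for i, c in enumerate(contigs) if i not in (a, b)]
        contigs.append(merged)
    return contigs


# Three made-up reads that tile a single short transcript
print(greedy_assemble(["ATGGCGT", "GCGTACC", "TACCTTA"]))
```

Reads that share no overlap of at least `min_overlap` bases are returned unmerged, which corresponds to the singletons reported alongside the contigs.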
Assembled contig outputs were deposited in the Transcriptome Shotgun Assembly (TSA) database of GenBank (field pea: JR950756-JR964200; faba bean: JR964201-JR970413). Contigs and singletons were compared against the M. truncatula (Mt 3.0), A. thaliana (TAIR 9 CDS [coding sequences]), G. max (Glyma 1.0) and P. sativum transcriptome databases using BLASTN with a threshold E value of 10^-10. Both field pea and faba bean unigene sets were also analysed by BLASTN against the respective EST and nucleotide sequences publicly available in GenBank. A further search was performed against the non-redundant database of GenBank using the tBLASTX algorithm to derive putative annotations of the unigene set. Gene ontology (GO) terms were assigned to unigenes that showed hits against the Arabidopsis thaliana database using the 'Gene Ontology at TAIR' tool.
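The thresholding step can be sketched as follows. The sketch assumes that the search results are available in the standard 12-column BLAST tabular layout, which the text does not state, and it treats a "unique hit" as the best-scoring match per query sequence; the function name and the example rows are hypothetical.

```python
E_VALUE_THRESHOLD = 1e-10  # threshold stated in the text


def best_hits(tabular_lines, threshold=E_VALUE_THRESHOLD):
    """Return {query_id: (subject_id, evalue)} for the lowest-E-value hit per query.

    Assumed column order (12-column BLAST tabular output):
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    best = {}
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id, subject_id = fields[0], fields[1]
        evalue = float(fields[10])
        if evalue >= threshold:
            continue  # keep only hits with E < threshold
        if query_id not in best or evalue < best[query_id][1]:
            best[query_id] = (subject_id, evalue)
    return best


# Made-up rows for illustration only
example = [
    "contig_1\tMedtr1g001\t92.5\t200\t15\t0\t1\t200\t50\t249\t3e-60\t230",
    "contig_1\tMedtr2g002\t85.0\t180\t27\t0\t1\t180\t10\t189\t1e-20\t100",
    "singleton_7\tMedtr3g003\t80.0\t60\t12\t0\t1\t60\t5\t64\t2e-05\t40",
]
hits = best_hits(example)
print(len(hits), "query sequences with a hit below the threshold")
```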
Discovery of EST-SSRs, primer design and marker validation
Detection of EST-SSR loci and primer pair design were performed using the Batch Primer3 software (http://probes.pw.usda.gov/cgi-bin/batchprimer3/batchprimer3.cgi). The parameters were designed for identification of perfect di-, tri-, tetra-, penta-, and hexanucleotide motifs with minimum repeat numbers of 6, 4, 3, 3, and 3, respectively. Primer design parameters were set as follows: length range = 18 to 23 nucleotides with 21 as optimum; PCR product size range = 100 to 400 bp; optimum annealing temperature = 55°C; and GC content 40-60%, with 50% as optimum.
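The motif criteria above can be expressed as a short regular-expression scan. This is a minimal sketch of perfect-SSR detection under the stated minimum repeat numbers, not the Batch Primer3 code; the function names are hypothetical, and motifs that are themselves repeats of a shorter unit (for example 'ATAT' or 'AA') are skipped so that each locus is reported once.

```python
import re

# Minimum repeat numbers per motif length, as stated in the text
MIN_REPEATS = {2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not a repeat of a shorter unit (e.g. 'ATAT' -> False)."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_perfect_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) further copies
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if not is_primitive(motif):
                continue
            repeats = len(match.group(0)) // motif_len
            hits.append((match.start(), motif, repeats))
    return sorted(hits)


# Made-up sequence containing (AG)7 and (GAA)4
toy = "CCGT" + "AG" * 7 + "CTGC" + "GAA" * 4 + "TCC"
print(find_perfect_ssrs(toy))
```

Start positions are 0-based. Because the scan reports the leftmost match, the same locus may be reported under a rotated motif (for example 'TA' rather than 'AT') depending on the flanking bases.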
Genomic DNA was extracted from target plant genotypes for EST-SSR marker validation using the DNeasy® 96 Plant Kit (QIAGEN), following the manufacturer's instructions. Frozen leaf tissue from each genotype was used for each extraction and ground using a Mixer Mill 300 (Retsch®, Rheinische Straße, Haan, Germany). DNA was resuspended in 50 μl of water and dilutions were performed to obtain a final concentration of 10 ng/μl, followed by storage at -20°C. A collection of randomly selected EST-SSR primer pairs were validated experimentally, forward primers being synthesised with addition of a bacteriophage M13-matching sequence, to enable fluorescent tail addition through the PCR amplification process. PCR conditions included a hot-start at 95°C for 10 minutes, followed by 10 cycles of 94°C for 30 s, 60-50°C for 30 s and 72°C for 30 s, followed by 25 cycles of 94°C for 30 s, 50°C for 30 s and 72°C for 30 s and a final elongation step of 72°C for 10 min. PCR products were separated using an ABI3730xl (Applied Biosystems, Foster City, California, USA) according to manufacturer's instructions with the addition of the ABI GeneScan LIZ500 size standard. Amplification product sizes were determined using the GeneMapper® v3.7 software (Applied Biosystems).
EST sequencing and de novo assembly
Summary of GS FLX sequencing outputs (total number of reads generated, cumulative sequence [Mbp], median read length [bp], number of reads used for assembly)
Summary of data on contig assemblies for field pea and faba bean (number of reads per contig, number of contigs, percentage of total contigs per read number class)
Since M. truncatula is the model legume species that is most closely related to field pea and faba bean, consensus sequences from all contigs and singletons were preferentially compared to Medicago coding sequences. In the case of field pea, a total of 11,737 unique matches were obtained (6,224 contigs and 5,513 singletons) (Additional file 5). The unigene set was also compared against the nr database of GenBank. A total of 9,101 contigs and 13,194 singletons (22,295 unigenes) obtained matches at E < 10^-10. Any query sequences that revealed a highest-ranking match against a non-plant species were removed from the list, leaving a total of 22,057 unique hits (Additional file 6, sheet 'final'). Finally, all of the consensus sequences were compared against the A. thaliana database. A total of 6,156 unique matches were obtained, consisting of 3,668 contigs and 2,488 singletons (Additional file 7).
The faba bean unigene set was also compared with the M. truncatula genome and a total of 10,179 hits were obtained (3,246 contigs and 6,933 singletons) at E < 10^-10 (Additional file 8). The unigene set was subsequently compared to the nr database of GenBank, resulting in 18,244 unique hits composed of 4,508 contigs and 13,736 singletons. Any sequence that matched a non-plant database entry was removed from the list, resulting in 18,052 unique hits (4,668 contigs and 13,584 singletons) (Additional file 9, sheet 'final'). The unigene set was also compared to the A. thaliana database at a threshold value of E < 10^-10 (Additional file 10), and a total of 4,883 hits were obtained, consisting of 1,948 contigs and 2,935 singletons. Finally, the field pea and faba bean unigene sets were also compared against the G. max EST sequence database, which identified 19,451 unique matches for field pea and 16,497 for faba bean (Additional file 11). The contigs and singletons obtained from field pea in the current study were also compared against the unigene set generated from transcriptome analysis of field pea performed by Franssen et al. (2011) and as a result, a total of 45,161 overlapping hits were identified (10,832 contigs [24%] and 34,329 singletons [76%]) (Additional file 12). In some instances, more than one contig revealed hits to the same gene. This may reflect the origin of more than one contig or singleton from a single gene, due either to non-overlapping sequence reads or to high levels of sequence error in a single read. This process has also demonstrated the benefits obtained from comparison between two complementary studies.
All of the ESTs and nucleotide sequences currently available in GenBank for field pea and faba bean were also downloaded to a local server to perform BLASTN searches against field pea and faba bean contigs and singletons obtained from the current study. In the case of field pea, a total of 2,764 EST and 77,431 nucleotide sequences obtained from GenBank showed significant hits against the unigene set generated in the current study (corresponding to 2,244 and 31,624 unique hits, respectively) (Additional file 13, sheets 1-2). For faba bean, a total of 549 ESTs (222 unique matches against the faba bean unigene set) and 3,684 nucleotide sequences (1,277 unique matches against the faba bean unigene set) were found to be common between GenBank and the transcriptome data generated in the current study (Additional file 13, sheets 3-4).
Frequencies of different SSR repeat motif types observed in field pea and faba bean (SSR motif length, repeat unit number)
Validation of EST-SSR assays
Subsets of 96 EST-SSR primer pairs from each of the field pea and faba bean data sets were selected for validation of marker assay performance. For field pea, a total of 86 (90%) successfully obtained amplification products from one or more template genotypes, of which 40 (46.5%) revealed polymorphism between 5 genotypes of field pea. Inclusion of a template sample from the non-domesticated species PS3689 (wild type landrace accession of Pisum sativum from Afghanistan) permitted polymorphism detection by 11 additional primer pairs (an increase to 59.3% of total) (Additional file 15, sheet Fieldpea). For faba bean, 81 primer pairs (84%) exhibited successful amplification, of which 24 (29.6%) detected polymorphism between cultivated V. faba genotypes (Icarus and Ascot). When the non-domesticated V. faba genotype ACC118 was included in the analysis, the polymorphism rate increased to 48% (Additional file 15, sheet Fababean).
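The reported rates follow directly from the counts given above, as the short calculation below shows. Amplification rates are expressed relative to the 96 primer pairs screened, and polymorphism rates relative to the primer pairs that amplified; the variable names are illustrative only. The faba bean figure including ACC118 is omitted because the underlying count is not stated.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


SCREENED = 96  # primer pairs screened per species

# Field pea counts taken from the text
pea_amplified = 86
pea_polymorphic_cultivated = 40
pea_polymorphic_with_wild = pea_polymorphic_cultivated + 11  # PS3689 included

# Faba bean counts taken from the text
faba_amplified = 81
faba_polymorphic_cultivated = 24

rates = {
    "field pea amplification": percent(pea_amplified, SCREENED),
    "field pea polymorphism (cultivated)": percent(pea_polymorphic_cultivated, pea_amplified),
    "field pea polymorphism (with PS3689)": percent(pea_polymorphic_with_wild, pea_amplified),
    "faba bean amplification": percent(faba_amplified, SCREENED),
    "faba bean polymorphism (cultivated)": percent(faba_polymorphic_cultivated, faba_amplified),
}
for label, value in rates.items():
    print(f"{label}: {value}%")
```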
EST assembly and gene annotation
The increasing capacity of DNA sequencing technologies has permitted substantial increases in genomic resource availability for several legume crops that had been previously underdeveloped. Recently, large-scale transcriptome characterisation using the GS FLX platform has been performed for both lentil and pigeonpea [8, 37]. This technology can deliver large amounts of data at considerably lower costs as compared to traditional sequencing methods, and so provides an effective means to expedite analysis of less-studied species. In the present study, equivalent approaches have been applied to the two Vicieae species, field pea and faba bean, in order to develop a transcribed sequence database and to identify and validate EST-SSRs.
GS FLX sequencing has been shown to process homopolymer regions longer than 8 bp ineffectively. Therefore, poly(A) tails at mRNA termini may present major challenges, and result in under-representation of the 3'-ends of transcripts. In the present study, the problem was resolved through use of a modified primer with an interrupted poly(dT) tail. This contributed to an increase in the output of the total number of sequenced fragments by c. 6% (data not shown). A number of other transcriptome studies have used the same approach to overcome the homopolymer sequencing problems [39, 40].
Prior to sequencing, normalisation of the cDNA samples obtained from leaf and stem tissues was performed in order to increase the sequencing efficiency of rare transcripts. The normalisation process helps to reduce over-sampling of abundant transcripts that are present in high quantities, hence increasing confidence of detecting a larger proportion of rare transcripts. Preliminary experiments indicated that normalisation of leaf/stem cDNA could increase the possibility of detecting rare transcripts by c. 10% (unpublished data). Similar approaches have been applied to detect rare transcripts in lentil, M. truncatula, Artemisia annua and greenhouse whitefly [8, 41–43].
The average contig lengths for the target species in this study are comparable to those observed in other studies (Pisum sativum, 454 bp; Pinus contorta, 500 bp; lentil, 770 bp; sweet potato, 790 bp; mungbean, 843 bp). A large proportion of the reads assembled into contigs in the case of field pea (87%), which is comparable to the values observed in other studies (Glanville fritillary butterfly, 91%; Eucalyptus grandis, 88%; Acropora millepora larvae, 90%). In contrast, a relatively smaller proportion (65%) of reads from faba bean assembled into contigs, resulting in lower length and depth as compared to the data derived from field pea. This may be due to the fact that the sequencing output for faba bean was comparatively smaller than that of field pea. Similar results have been observed in other studies. As a result of de novo assembly, a large number of singletons were obtained both for field pea (86,476) and faba bean (79,657), as also observed for other species [17, 42, 44, 48]. Although some singletons may arise as contaminating sequences or artefacts, the majority probably originate from transcripts expressed at low levels, and were consequently retained in the dataset. Many singleton sequences (15% for field pea and 17% for faba bean) were judged to be of high quality on the basis of matches to protein-encoding genes in the existing genic databases, and hence provide valuable sources of information. The remaining singletons could have resulted from various causes, such as incompleteness of known databases, sequencing errors, or short read lengths leading to difficulty in assembly [8, 31].
BLAST searches against databases of model plant species provided annotation data for field pea and faba bean ESTs, with totals of 22,057 and 18,052 unique hits, respectively. These values are very close to the estimated number of total genes (c. 25,000) present in a typical diploid plant genome, based on data from rice (Oryza sativa L.), sorghum (Sorghum bicolor L.), A. thaliana and Brachypodium distachyon [49, 50]. On this basis, the sequences annotated in this study are likely to represent c. 88% and c. 72% of the gene complements of field pea and faba bean, respectively. Such estimates are also supported by comparison with the M. truncatula genome, from which a total of 11,737 unique hits obtained from field pea represented c. 49% of the known gene space, and 10,179 unique hits from faba bean represented c. 41% of the known gene space. Comparisons were also made to G. max, which is more distantly related to the Vicieae tribe species than M. truncatula, being located outside the Hologalegina clade. A total of 19,451 unique hits from field pea and 16,497 from faba bean represent c. 35% and 30% of the known gene space, respectively, based on a total of 55,787 predicted protein-coding loci in the palaeopolyploid genome of soybean. In comparison to the genome of A. thaliana, which is more distantly related to both model and crop legume species within the dicotyledonous plants, the corresponding values were c. 25% for field pea and c. 20% for faba bean.
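The percentages for the typical diploid gene complement and for soybean can be reproduced from the figures quoted above. The sketch below uses only the totals stated in the text (c. 25,000 genes and 55,787 predicted soybean loci); the reference totals behind the M. truncatula and A. thaliana percentages are not given, so those values are not recomputed. Function and variable names are illustrative.

```python
def coverage_percent(annotated, reference_total):
    """Annotated sequences as a whole-number percentage of a reference gene count."""
    return round(100.0 * annotated / reference_total)


TYPICAL_DIPLOID_GENES = 25_000  # approximate estimate quoted in the text
SOYBEAN_PREDICTED_LOCI = 55_787  # predicted protein-coding loci quoted in the text

coverage = {
    "field pea vs typical gene complement": coverage_percent(22_057, TYPICAL_DIPLOID_GENES),
    "faba bean vs typical gene complement": coverage_percent(18_052, TYPICAL_DIPLOID_GENES),
    "field pea vs soybean loci": coverage_percent(19_451, SOYBEAN_PREDICTED_LOCI),
    "faba bean vs soybean loci": coverage_percent(16_497, SOYBEAN_PREDICTED_LOCI),
}
for label, value in coverage.items():
    print(f"{label}: c. {value}%")
```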
Marker discovery and validation
One major advantage of second-generation DNA sequencing technologies is the capacity for computational interrogation of transcriptome data in order to develop large numbers of gene-based genetic markers such as SSRs and SNPs, of which few are currently available in the public domain for either field pea or faba bean. The EST-SSR primer pair sets generated in the current study will prove directly useful for the target species, and due to likely primer site conservation, may also be readily transferable to closely related species. The transcriptome data generated in the current study, being derived from distinct genotypes, may also be used for the detection of SNP markers in field pea and faba bean, to further enrich the available genomic resources for these two species.
The relative proportions of SSR array types in field pea and faba bean were similar to those observed in other plant species [8, 52–54]. In theory, the frequencies of di-, tri-, tetra-, penta-, and hexanucleotide repeats should progressively decrease, based on the relative probability of replication slippage events. However, trinucleotide repeat units were predominant, followed by tetra-, di-, hexa-, and pentanucleotide repeat units. This observation is quite common for EST-derived SSRs, as trinucleotide expansions (or multiples thereof) within translated regions are capable of maintaining reading frame and hence generating a homopolymeric amino acid run within a partially or fully active protein.
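The reading-frame argument can be made concrete with a toy example. The sketch below uses made-up sequences and hypothetical helper names; it simply shows that adding one repeat unit to a trinucleotide array leaves the downstream codons unchanged, whereas adding one unit to a dinucleotide array shifts them.

```python
def codons(seq):
    """Split a sequence into complete codons, reading from position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def with_repeat(prefix, motif, units, suffix):
    """Build a toy coding sequence containing `units` copies of `motif`."""
    return prefix + motif * units + suffix


PREFIX, SUFFIX = "ATG", "CCTTGA"  # made-up flanking sequence

# Trinucleotide expansion: 4 -> 5 units keeps the downstream codons intact
tri_before = codons(with_repeat(PREFIX, "GAA", 4, SUFFIX))
tri_after = codons(with_repeat(PREFIX, "GAA", 5, SUFFIX))

# Dinucleotide expansion: 6 -> 7 units shifts the downstream reading frame
di_before = codons(with_repeat(PREFIX, "GA", 6, SUFFIX))
di_after = codons(with_repeat(PREFIX, "GA", 7, SUFFIX))

print("trinucleotide:", tri_before[-2:], "->", tri_after[-2:])
print("dinucleotide: ", di_before[-2:], "->", di_after[-2:])
```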
The validation results for sub-sets of EST-SSR markers demonstrated that inclusion of non-domesticated genotypes in the study increased rates of polymorphism detection, consistent with the results of similar studies [8, 55]. EST-SSRs generated in the present study will consequently provide a valuable tool for the understanding of global genetic diversity among both non-domesticated and cultivated pea and faba bean germplasm, as well as for dissection of the genetic control of important agronomic traits.
In the current study, the generation of EST-datasets for field pea and faba bean has been described. Unigene sets obtained from field pea and faba bean were annotated against different genomic databases including those of M. truncatula, A. thaliana, G. max, and the nr database from GenBank. Furthermore, the EST dataset was used for design of EST-SSRs, subsets of which were validated across a number of cultivated and wild genotypes of pea and faba bean, indicating effectiveness of polymorphism detection and cross transferability.
This work was supported by funding from the Victorian Department of Primary Industries and the Grains Research and Development Corporation, Australia.
- Zhu H, Choi H-K, Cook DR, Shoemaker RC: Bridging model and crop legumes through comparative genomics. Plant Physiol. 2005, 137: 1189-1196. 10.1104/pp.104.058891.
- Lavin M, Herendeen PS, Wojciechowski MF: Evolutionary rates analysis of Leguminosae implicates a rapid diversification of lineages during the tertiary. Syst Biol. 2005, 54: 574-594.
- Legumes of the world. Edited by: Lewis G, Schrire B, Mackinder B, Lock M. 2005, Kew Publishing.
- Cannon SB, Sterck L, Rombauts S, Sato S, Cheung F, Gouzy J, Wang X, Mudge J, Vasdewani J, Schiex T, Spannagl M, Monaghan E, Nicholson C, Humphray SJ, Schoof H, Mayer KFX, Rogers J, Quétier F, Oldroyd GE, Debellé F, Cook DR, Retzel EF, Roe BA, Town CD, Tabata S, Peer YV, Young ND: Legume genome evolution viewed through the Medicago truncatula and Lotus japonicus genomes. PNAS. 2006, 103: 14959-14964. 10.1073/pnas.0603228103.
- Schmutz J, Cannon SB, Schlueter J, Ma J, Mitros T, Nelson W, Hyten DL, Song Q, Thelen JJ, Cheng J, Xu D, Hellsten U, May GD, Yu Y, Sakurai T, Umezawa T, Bhattacharyya MK, Sandhu D, Valliyodan B, Lindquist E, Peto M, Grant D, Shu S, Goodstein D, Barry K, Futrell-Griggs M, Abernathy B, Du J, Tian Z, Zhu L, Gill N, Joshi T, Libault M, Sethuraman A, Zhang XC, Shinozaki K, Nguyen HT, Wing RA, Cregan P, Specht J, Grimwood J, Rokhsar D, Stacey G, Shoemaker RC, Jackson SA: Genome sequence of the palaeopolyploid soybean. Nature. 2010, 463: 178-183. 10.1038/nature08670.
- Gepts P, Beavis WD, Brummer EC, Shoemaker RC, Stalker HT, Weeden NF, Young ND: Legumes as a Model Plant Family. Genomics for Food and Feed Report of the Cross-Legume Advances through Genomics Conference. Plant Physiol. 2005, 137: 1228-1235. 10.1104/pp.105.060871.
- Gibson G: Pulse Australia-General Crop Information 2009, Pulse Australia. [http://www.pulseaus.com.au/crop_information.aspx]
- Kaur SK, Cogan NOI, Pembleton LW, Shinozuka M, Savin KW, Materne M, Forster JW: Transcriptome sequencing of lentil based on second-generation technology permits large-scale unigene assembly and SSR marker discovery. BMC Genomics. 2011, 12: 265. 10.1186/1471-2164-12-265.
- Timmerman-Vaughan GM, Mills A, Whitfield C, Frew T, Butler R, Murray S, Lakeman M, McCallum J, Russell A, Wilson D: Linkage mapping of QTL for seed yield, yield components and development traits in pea. Crop Sci. 2005, 45: 1336-1344. 10.2135/cropsci2004.0436.
- Gomez-Roldan V, Fermas S, Brewer PB, Puech-Pagès V, Dun EA, Pillot J-P, Letisse F, Matusova R, Danoun S, Portais J-C, Bouwmeester H, Bécard G, Beveridge CA, Rameau C, Rochange SF: Strigolactone inhibition of shoot branching. Nature. 2008, 455: 189-194. 10.1038/nature07271.
- Hecht V, Knowles CL, Schoor JKV, Liew LC, Jones SE, Lambert MJM, Weller JL: Pea LATE BLOOMER1 is a GIGANTEA ortholog with roles in photoperiodic flowering, deetiolation, and transcriptional regulation of circadian clock gene homologs. Plant Physiol. 2007, 144: 648-661. 10.1104/pp.107.096818.
- Burstin J, Deniot G, Potier J, Weinachter C, Aubert G, Baranger A: Microsatellite polymorphism in Pisum sativum. Plant Breed. 2001, 120: 311-317. 10.1046/j.1439-0523.2001.00608.x.
- Tar'an B, Zhang C, Warkentin T, Tullu A, Vandenberg A: Genetic diversity among varieties and wild species accessions of pea (Pisum sativum L.) based on molecular markers, and morphological and physiological characters. Genome. 2005, 48: 257-272. 10.1139/g04-114.
- Smýkal P, Horáček J, Dostálová R, Hýbl M: Variety discrimination in pea (Pisum sativum L.) by molecular, biochemical and morphological markers. J Appl Genet. 2008, 49: 155-166. 10.1007/BF03195609.
- Zong X, Redden RJ, Liu Q, Wang S, Guan J, Liu J, Xu Y, Liu X, Gu J, Yan L, Ades P, Ford R: Analysis of a diverse global Pisum sp. collection and comparison to a Chinese local P. sativum collection with microsatellite markers. Theor Appl Genet. 2009, 118: 193-204. 10.1007/s00122-008-0887-z.
- Jing R, Vershinin A, Grzebyta J, Shaw P, Smýkal P, Marshall D, Ambrose MJ, Ellis N, Flavell AJ: The genetic diversity and evolution of field pea (Pisum) studied by high throughput retrotransposon based insertion polymorphism (RBIP) marker analysis. BMC Evol Biol. 2010, 10: 44. 10.1186/1471-2148-10-44.
- Franssen SU, Shrestha RP, Bräutigam A, Bornberg-Bauer E, Weber APM: Comprehensive transcriptome analysis of the highly complex Pisum sativum genome using next generation sequencing. BMC Genomics. 2011, 12: 227. 10.1186/1471-2164-12-227.
- Torres AM, Avila CM, Gutierrez N, Palomino C, Moreno MT, Cubero JI: Marker-assisted selection in faba bean (Vicia faba L.). Field Crops Res. 2010, 115: 243-252. 10.1016/j.fcr.2008.12.002.
- Ellwood SR, Phan HTT, Jordan M, Hane J, Torres AM, Avila CM, Cruz-Izquierdo S, Oliver RP: Construction of a comparative genetic map in faba bean (Vicia faba L.); conservation of genome structure with Lens culinaris. BMC Genomics. 2008, 9: 380. 10.1186/1471-2164-9-380.
- Dalmais M, Schmidt J, Signor CL, Moussy F, Burstin J, Savois V, Aubert G, Brunaud V, Oliveira Y, Guichard C, Thompson R, Bendahmane A: UTILLdb, a Pisum sativum in silico forward and reverse genetics tool. Genome Biol. 2008, 9: R43. 10.1186/gb-2008-9-2-r43.
- Coyne CJ, McClendon MT, Walling JG, Timmerman-Vaughan GM, Murray S, Meksem K, Lightfoot DA, Shultz JL, Keller KE, Martin RR, Inglis DA, Rajesh PN, McPhee KE, Weeden NF, Grusak MA, Li CM, Storlie EW: Construction and characterization of two bacterial artificial chromosome libraries of pea (Pisum sativum L.) for the isolation of economically important genes. Genome. 2007, 50: 871-875. 10.1139/G07-063.
- Hofer J, Turner L, Moreau C, Ambrose M, Isaac P, Butcher S, Weller J, Dupin A, Dalmais M, Signor CL, Bendahmane A, Ellis N: Tendril-less regulates tendril formation in pea leaves. Plant Cell. 2009, 21: 420-428. 10.1105/tpc.108.064071.
- Hellens RP, Moreau C, Lin-Wang K, Schwinn KE, Thomson SJ, Fiers MWEJ, Frew TJ, Murray SR, Hofer JMI, Jacobs JME, Davies KM, Allan AC, Bendahmane A, Coyne CJ, Timmerman-Vaughan GM, Ellis THN: Identification of Mendel's white flower character. PLoS One. 2010, 10: e13230.
- Moe KT, Chung J-W, Cho Y-I, Moon J-K, Ku J-H, Jung J-K, Lee J, Park Y-J: Sequence information on simple sequence repeats and single nucleotide polymorphisms through transcriptome analysis of Mungbean. J Integr Plant Biol. 2011, 53: 63-73. 10.1111/j.1744-7909.2010.01012.x.
- Thomson NR, Holden MTG, Carder C, Lennard N, Lockey SJ, Marsh P, Skipp P, O'Connor CD, Goodhead I, Norbertczak H, Harris B, Ormond D, Rance R, Quail MA, Parkhill J, Stephens RS, Clarke IN: Chlamydia trachomatis: Genome sequence analysis of lymphogranuloma venereum isolates. Genome Res. 2008, 18: 161-171.
- Bentley DR: Whole-genome re-sequencing. Curr Opin Genet Dev. 2006, 16: 545-552. 10.1016/j.gde.2006.10.009.
- Barbazuk WB, Emrich SJ, Chen HD, Li L, Schnable PS: SNP discovery via 454 transcriptome sequencing. Plant J. 2007, 51: 910-918. 10.1111/j.1365-313X.2007.03193.x.
- Vera JC, Wheat CW, Fescemyer HW, Frilander MJ, Crawford DL, Hanski I, Marden JH: Rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing. Mol Ecol. 2008, 17: 1636-1647. 10.1111/j.1365-294X.2008.03666.x.
- Blanca J, Canizares J, Roig C, Ziarsolo P, Nuez F, Pico B: Transcriptome characterization and high throughput SSRs and SNPs discovery in Cucurbita pepo (Cucurbitaceae). BMC Genomics. 2011, 12: 104. 10.1186/1471-2164-12-104.
- Park YJ, Lee JK, Kim NS: Simple sequence repeat polymorphisms (SSRPs) for evaluation of molecular diversity and germplasm classification of minor crops. Molecules. 2009, 14: 4546-4569. 10.3390/molecules14114546.
- Zeng S, Xiao G, Guo J, Fei Z, Xu Y, Roe BA, Wang Y: Development of a EST dataset and characterization of EST-SSRs in a traditional Chinese medicinal plant, Epimedium sagittatum (Sieb. Et Zucc.) Maxim. BMC Genomics. 2010, 11: 94. 10.1186/1471-2164-11-94.
- Rafalski JA: Novel genetic mapping tools in plants: SNPs and LD-based approaches. Plant Sci. 2002, 162: 329-333. 10.1016/S0168-9452(01)00587-8.
- Loridon K, McPhee K, Morin J, Dubreuil P, Pilet-Nayel ML, Aubert G, Rameau C, Baranger A, Coyne C, Lejeune-Hénaut I, Burstin J: Microsatellite marker polymorphism and mapping in pea (Pisum sativum L.). Theor Appl Genet. 2005, 111: 1022-1031. 10.1007/s00122-005-0014-3.
- Hougaard BK, Madsen LH, Sandal N, Moretzsohn MC, Fredslund J, Schauser L, Nielsen AM, Rohde T, Sato S, Tabata S, Bertioli DJ, Stougaard J: Legume anchor markers link syntenic regions between Phaseolus vulgaris, Lotus japonicus, Medicago truncatula and Arachis. Genetics. 2008, 179: 2299-2312. 10.1534/genetics.108.090084.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol. 1990, 215: 403-410.
- Schuelke M: An economic method for the fluorescent labelling of PCR fragments. Nat Biotechnol. 2000, 18: 233-234. 10.1038/72708.
- Dutta S, Kumawat G, Singh BP, Gupta DK, Singh S, Dogra V, Gaikwad K, Sharma TR, Raje RS, Bandhopadhya TK, Datta S, Singh MN, Bashasab F, Kulwal P, Wanjari KB, Varshney RK, Cook DR, Singh NK: Development of genic-SSR markers by deep transcriptome sequencing in pigeonpea [Cajanus cajan (L.) Millspaugh]. BMC Plant Biol. 2011, 11: 17. 10.1186/1471-2229-11-17.
- Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z: Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005, 437: 376-380.
- Beldade P, Rudd S, Gruber JD, Long AD: A wing expressed sequence tag resource for Bicyclus anynana butterflies, an evo-devo model. BMC Genomics. 2006, 7: 130. 10.1186/1471-2164-7-130.
- Sun C, Li Y, Wu Q, Luo H, Sun Y, Song J, Lui EMK, Chen S: De novo sequencing and analysis of the American ginseng root transcriptome using a GS FLX Titanium platform to discover putative genes involved in ginsenoside biosynthesis. BMC Genomics. 2010, 11: 262. 10.1186/1471-2164-11-262.
- Cheung F, Haas BJ, Goldberg SM, May GD, Xiao Y, Town CD: Sequencing Medicago truncatula expressed sequenced tags using 454 Life Sciences technology. BMC Genomics. 2006, 7: 272. 10.1186/1471-2164-7-272.
- Wang W, Wang Y, Zhang Q, Qi Y, Guo D: Global characterization of Artemisia annua glandular trichome transcriptome using 454 pyrosequencing. BMC Genomics. 2009, 10: 465. 10.1186/1471-2164-10-465.
- Karatolos N, Pauchet Y, Wilkinson P, Chauhan R, Denholm I, Gorman K, Nelson DR, Bass C, ffrench-Constant RH, Williamson MS: Pyrosequencing the transcriptome of the greenhouse whitefly, Trialeurodes vaporariorum reveals multiple transcripts encoding insecticide targets and detoxifying enzymes. BMC Genomics. 2011, 12: 56. 10.1186/1471-2164-12-56.
- Parchman TL, Geist KS, Grahnen JA, Benkman CW, Buerkle CA: Transcriptome sequencing in an ecologically important tree species: assembly, annotation, and marker discovery. BMC Genomics. 2010, 11: 180. 10.1186/1471-2164-11-180.
- Schafleitner R, Tincopa LR, Palomino O, Rossel G, Robles RF, Alagon R, Rivera C, Quispe C, Rojas L, Pacheco JA, Solis J, Cerna D, Kim JY, Hou J, Simon R: A sweetpotato gene index established by de novo assembly of pyrosequencing and Sanger sequences and mining for gene-based microsatellite markers. BMC Genomics. 2010, 11: 604. 10.1186/1471-2164-11-604.
- Vera JC, Wheat CW, Fescemyer HW, Frilander MJ, Crawford DL, Hanski I, Marden JH: Rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing. Mol Ecol. 2008, 17: 1636-1647. 10.1111/j.1365-294X.2008.03666.x.
- Novaes E, Drost DR, Farmerie WG, Pappas GJ, Grattapaglia D, Sedero RR, Kirst M: High-throughput gene and SNP discovery in Eucalyptus grandis, an uncharacterized genome. BMC Genomics. 2008, 9: 312. 10.1186/1471-2164-9-312.
- Meyer E, Aglyamova GV, Wang S, Buchanan-Carter J, Abrego D, Colbourne JK, Willis BL, Matz MV: Sequencing and de novo analysis of a coral larval transcriptome using 454 GSFlx. BMC Genomics. 2009, 10: 219. 10.1186/1471-2164-10-219.
- Vogel: Genome sequencing and analysis of the model grass Brachypodium distachyon. Nature. 2009, 463: 763-768.
- Bevan M, Walsh S: The Arabidopsis genome: A foundation for plant research. Genome Res. 2005, 15: 1632-1642. 10.1101/gr.3723405.
- Barbara T, Palma-Silva C, Paggi GM, Bered F, Fay MF, Lexer C: Cross-species transfer of nuclear microsatellite markers: potential and limitations. Mol Ecol. 2007, 16: 3759-3767. 10.1111/j.1365-294X.2007.03439.x.
- Kumpatla SP, Mukhopadhyay S: Mining and survey of simple sequence repeats in expressed sequence tags of dicotyledonous species. Genome. 2005, 48: 985-998. 10.1139/g05-060.
- Eujayl I, Sledge MK, Wang L, May GD, Chekhovskiy K, Zwonitzer JC, Mian MAR: Medicago truncatula EST-SSRs reveal cross-species genetic markers for Medicago spp. Theor Appl Genet. 2004, 108: 414-422. 10.1007/s00122-003-1450-6.
- Luro FL, Costantino G, Terol J, Argout X, Allario T, Wincker P, Talon M, Ollitrault P, Morillon R: Transferability of the EST-SSRs developed on Nules clementine (Citrus clementina Hort ex Tan) to other Citrus species and their effectiveness for genetic mapping. BMC Genomics. 2008, 9: 287. 10.1186/1471-2164-9-287.
- Castillo A, Budak H, Varshney RK, Dorado G, Graner A, Hernandez P: Transferability and polymorphism of barley EST-SSR markers used for phylogenetic analysis in Hordeum chilense. BMC Plant Biol. 2008, 8: 97. 10.1186/1471-2229-8-97.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.