Integrating genetic maps in bambara groundnut [Vigna subterranea (L) Verdc.] and their syntenic relationships among closely related legumes
- Wai Kuan Ho†1, 2,
- Hui Hui Chai†2,
- Presidor Kendabie3,
- Nariman Salih Ahmad4,
- Jaeyres Jani5,
- Festo Massawe2,
- Andrzej Kilian6 and
- Sean Mayes1, 2, 3
© The Author(s). 2017
Received: 27 May 2016
Accepted: 7 December 2016
Published: 20 February 2017
Bambara groundnut [Vigna subterranea (L) Verdc.] is an indigenous legume crop grown mainly in subsistence and small-scale agriculture in sub-Saharan Africa for its nutritious seeds and its tolerance to drought and poor soils. Given that the lack of ex ante sequence is often a bottleneck in marker-assisted crop breeding for minor and underutilised crops, we demonstrate the use of limited genetic information and resources developed within species, but linked to the well characterised common bean (Phaseolus vulgaris) genome sequence and the partially annotated closely related species; adzuki bean (Vigna angularis) and mung bean (Vigna radiata). From these comparisons we identify conserved synteny blocks corresponding to the Linkage Groups (LGs) in bambara groundnut genetic maps and evaluate the potential to identify genes in conserved syntenic locations in a sequenced genome that underlie a QTL position in the underutilised crop genome.
Two individual intraspecific linkage maps consisting of DArTseq markers were constructed in two bambara groundnut (2n = 2x = 22) segregating populations: 1) The genetic map of Population IA was derived from F2 lines (n = 263; IITA686 x Ankpa4) and covered 1,395.2 cM across 11 linkage groups; 2) The genetic map of Population TD was derived from F3 lines (n = 71; Tiga Nicuru x DipC) and covered 1,376.7 cM across 11 linkage groups. A total of 96 DArTseq markers from an initial pool of 142 pre-selected common markers were used. These were not only polymorphic in both populations but also each marker could be located using the unique sequence tag (at selected stringency) onto the common bean, adzuki bean and mung bean genomes, thus allowing the sequenced genomes to be used as an initial ‘pseudo’ physical map for bambara groundnut. A good correspondence was observed at the macro synteny level, particularly to the common bean genome. A test using the QTL location of an agronomic trait in one of the bambara groundnut maps allowed the corresponding flanking positions to be identified in common bean, mung bean and adzuki bean, demonstrating the possibility of identifying potential candidate genes underlying traits of interest through the conserved syntenic physical location of QTL in the well annotated genomes of closely related species.
The approach of adding pre-selected common markers in both populations before genetic map construction has provided a translational framework for potential identification of candidate genes underlying a QTL of trait of interest in bambara groundnut by linking the positions of known genetic effects within the underutilised species to the physical maps of other well-annotated legume species, without the need for an existing whole genome sequence of the study species. Identifying the conserved synteny between underutilised species without complete genome sequences and the genomes of major crops and model species with genetic and trait data is an important step in the translation of resources and information from major crop and model species into the minor crop species. Such minor crops will be required to play an important role in future agriculture under the effects of climate change.
Keywords: Conserved synteny markers, Mapping, Genotyping-by-sequencing, Genomic comparative analysis
Three crops account for over 60% of all food calories grown in the world: wheat (Triticum spp.), rice (Oryza sativa) and maize (Zea mays), with thirty crops in all accounting for around 95% of total calories consumed [1, 2]. This over-dependence on a limited number of major crops has narrowed the genetic and species base of agriculture. Growing monocultures of crop genotypes selected to respond to intensive inputs potentially makes major crops more vulnerable to pests and diseases and can lead them to perform poorly in low input systems. Exploring the potential of underutilised and minor crops to contribute to agricultural biodiversity may also help agricultural production to cope with the effects of climate change by improving the resilience of future agricultural systems, using crops that have been selected in-field for millennia under low input agriculture.
Underutilised and minor crops often have limited research or development funding with little or no interest from commercial seed companies. Limited resource is often a major challenge in expediting the improvement of any promising underutilised crops through marker-assisted breeding programmes. The ability to translate trait information from model and major crops species to underutilised crops is important to be able to fully exploit available resources, effectively developing research simultaneously into a complex of species, rather than a single species in isolation. Working with species complexes also allows important insights into genetic networks responsible for performance and environmental responses in the context of evolutionary relationships and ecological differences in these crops and their progenitors.
Bambara groundnut [Vigna subterranea (L) Verdc.] is widely grown as a plant protein source by poor farmers, particularly in sub-Saharan Africa, with the seeds containing good levels of protein (18 to 26%) for human nutrition [3, 4]. The crop is drought tolerant and, as a nitrogen-fixing legume, it is able to tolerate low fertility soils and can contribute nitrogen to agricultural systems.
Bambara groundnut is cleistogamous, highly inbreeding and has 11 pairs of chromosomes (2n = 2x = 22). The first genetic linkage map of bambara groundnut was reported by Basu in 2007 using an F2 segregating population (n = 98) derived from an interspecific cross between the non-domesticated wild type (VSSP11) and a domesticated form (DipC). An initial QTL analysis was carried out on trait differences observed for growth habit, maturity and yield production. The resulting map consisted of 20 linkage groups and was 516 cM in length, based on 67 amplified fragment length polymorphism (AFLP) markers and one cross-species simple sequence repeat (SSR) marker, with the inter-marker distance varying from 4.7 to 32 cM. The first intraspecific genetic map used an F3 segregating population (n = 73) derived from a cross between two domesticated forms of bambara groundnut (Tiga Nicuru and DipC), sharing the domesticated parental line with the interspecific cross. This intraspecific map was constructed from 29 SSR and 209 DArT Array markers covering 608.6 cM in 21 linkage groups. The two parental genotypes contrast significantly in growth habit and seed eye pattern. Tiga Nicuru from Mali has a bunchy growth habit with longer peduncle length (p <0.05) and has no eye pattern around the hilum, whereas DipC, collected from Botswana, has a semi-spreading morphology with longer petiole and shorter internode length (p <0.05), greater leaf area (p <0.05) and a dark eye pattern around the hilum, in addition to higher pod number (p <0.05), seed weight per plant (p <0.05) and shelling percentage (p <0.05). A DArT Array-based UPGMA genetic distance analysis grouped the two parents into different sub-sections based on a previous population structure analysis, which has made these two crosses useful for understanding the domestication events in bambara groundnut and the genetic control of a number of morphological and physiological traits.
In the future, a completed and annotated genome sequence will become available through the efforts of the African Orphan Crops Consortium (AOCC), which includes bambara groundnut as one of its targets. This will greatly facilitate research in this species, but it will be some time before a fully assembled and annotated genome is available. Beyond the 101 crops identified for sequencing by AOCC, there are believed to be around 7,000 plant species which have been used by humankind, so the development of translational methodologies for the location of genetic components of traits and their underlying candidate genes is a priority for underutilised crop species. The current research presents one such generic approach, using bambara groundnut as an exemplar species.
Here we report the construction of two genetic linkage maps in two intraspecific crosses using genotyping-by-sequencing (GbS) DArTseq markers (a combination of a set of population-specific markers and a set of pre-selected common ‘link’ markers), followed by the identification of the most likely syntenic locations for the shared markers in the genomes of common bean (Phaseolus vulgaris), adzuki bean (Vigna angularis) and mung bean (Vigna radiata) [11–13]. These three species have 11 pairs of chromosomes and they were chosen not only because they have been sequenced and annotated (although they are at different levels of completion), but also because of the close evolutionary relationship among these legumes. The divergence between Phaseolus and Vigna has been estimated at 8 million years ago (MYA). The putative syntenic blocks identified across legume genomes would facilitate more effective comparison of gene order in the legume species and assist in the identification of the location of genes underlying QTL involved in controlling agronomic and yield traits in bambara groundnut, facilitating the marker-assisted selection process.
Results and discussion
Characterisation of DArTseq markers
Of the total DArTseq markers generated from populations IA and TD, 35.6% and 31.1%, respectively, could be mapped to the common bean genome using the marker sequence tag, with 55% of these matches occurring in genomic contexts expected to be transcribed. In line with expectations based on genetic relatedness, a higher percentage of markers showed good sequence homology hits with the Vigna genomes, with more than 45% of the bambara groundnut DArTseq markers locatable on the adzuki bean and mung bean genomes (DArTseq markers from IA population: 47.8% mapped to adzuki bean genome, 46.2% to mung bean genome; DArTseq markers from TD population: 50.2% mapped to adzuki bean genome, 48.0% mapped to mung bean genome). Nevertheless, some were less informative because they were homologous to superscaffolds in the genome assemblies that had not been assigned to chromosome locations in the sequenced species.
Selection of markers for genetic maps
Genetic linkage maps from selected markers
The distribution of DArTseq markers (dominant DArT and co-dominant SNP) across each linkage group in both bambara groundnut populations
Pre-selected common markers
Syntenic relationships with other legume species
As this syntenic mapping approach is achieved through a cross-species comparison, we have been conservative in terms of mapping stringency so that the possibility of having more than one good match location for a particular bambara groundnut marker sequence tag in the query genome was minimised. However, there is a possibility that the mapping algorithm has misidentified the best match within the genome of the other species, or that the best match is not represented in the genome sequence of the comparison species. For example, the SNP100030767 SNP |F|0-54 marker located at 69.6 cM on LG5 in Map IA has a best match position in a genomic region of Va02, whilst the adjacent markers in this linkage group mapped to Va08. A comparison of the potential sequence matching locations suggested that while the Va02 position gave the higher BLAST score due to a higher number of matching bases (57 out of 61 bases), the second best aligned position at Va08, with a slightly lower match (56 out of 61 bases), is the better alignment and agrees with the adjacent pre-selected common markers between 0 cM and 76.0 cM of this linkage group (Additional file 1: Fig S1a and b). Therefore, for cases where the genetic location and comparative syntenic position did not match those of the flanking markers, we manually used BLAST to determine whether the marker was likely to be a genuine breach of conserved synteny or whether other syntenic target sites might exist. Of the 13 markers that did not show syntenic coherence with their neighbouring markers in the common bean, adzuki bean or mung bean genome comparison, nine could be reassigned through a BLAST analysis to a syntenic location on the same chromosomes as their flanking markers.
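The manual check described above can be sketched in code. The following is a minimal illustration only, not the authors' procedure: it assumes a list of candidate alignments per marker tag and prefers a hit on the chromosome shared by the flanking markers when it is almost as good as the top-scoring hit. The `Hit` record, the function names and the `max_drop` tolerance are hypothetical.

```python
from collections import Counter
from typing import List, NamedTuple, Optional


class Hit(NamedTuple):
    chromosome: str  # chromosome of the candidate match, e.g. "Va08"
    matches: int     # number of matching bases in the alignment
    length: int      # length of the marker sequence tag


def consensus_chromosome(flanking: List[str]) -> Optional[str]:
    """Return the most common chromosome among flanking markers, if any."""
    if not flanking:
        return None
    return Counter(flanking).most_common(1)[0][0]


def pick_syntenic_hit(hits: List[Hit], flanking: List[str], max_drop: int = 1) -> Hit:
    """Choose a hit, favouring synteny with the flanking markers.

    A hit on the flanking consensus chromosome is preferred when it has at
    most `max_drop` fewer matching bases than the best hit (max_drop is an
    assumed tolerance). Otherwise the best-scoring hit is returned, which
    would flag a possible genuine breach of conserved synteny.
    """
    best = max(hits, key=lambda h: h.matches)
    target = consensus_chromosome(flanking)
    coherent = [
        h for h in hits
        if h.chromosome == target and best.matches - h.matches <= max_drop
    ]
    return max(coherent, key=lambda h: h.matches) if coherent else best


# The LG5 example from the text: 57/61 bases on Va02, 56/61 bases on Va08,
# with the neighbouring link markers all on Va08.
hits = [Hit("Va02", 57, 61), Hit("Va08", 56, 61)]
chosen = pick_syntenic_hit(hits, ["Va08", "Va08", "Va08"])
```

Setting `max_drop=0` reproduces a pure best-score rule, under which the marker would remain assigned to Va02.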
A dotted line in LG11 indicates that the SNP100024712 |F|0-54 marker has its best match position on Pv07 (45,623,999 bp), with one insertion, one deletion and eight mismatched bases observed, but has no good match to Pv06, as would be expected from its flanking markers on the same linkage group. The mapped location on Pv07 is in a genomic region flanked by AT rich sequences (496 bp and 392 bp apart upstream and downstream, respectively), and the marker could be mapped to the same chromosomes of adzuki bean and mung bean as the other 11 pre-selected markers on the same linkage group. Together, these observations suggest that this locus detected a genuine divergence between Vigna and Phaseolus species.
The QTL analysis of internode length from TD F3 population
nSNP100015970|F|0-21 (52.2 cM)
Collectively, the finding of a predicted soybean internode length QTL (located through the mung bean genome) is in accordance with our bambara groundnut internode length QTL study and has demonstrated that our approach could be adopted in other minor and underutilised species, both to translate existing information in more studied species and to identify de novo candidate lists. The current coincidence of the QTL in soybean and bambara groundnut for internode length needs further investigation and the gene content within the target region between species of interest is by no means guaranteed to be the same, although the closer the species are related taxonomically, the more likely it is that they will have similar gene content in the regions of conserved synteny.
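The translation of a QTL interval into a physical interval on a sequenced genome can be illustrated with a short sketch. It assumes each link marker carries a genetic position (cM) on the bambara groundnut linkage group and a physical position (bp) on a reference chromosome; the function name, the chromosome label and all positions are invented for illustration and are not taken from the study.

```python
from typing import List, Optional, Tuple

# (cM on the linkage group, reference chromosome, bp on that chromosome)
LinkMarker = Tuple[float, str, int]


def project_qtl_interval(
    link_markers: List[LinkMarker],
    qtl_start_cm: float,
    qtl_end_cm: float,
    chromosome: str,
) -> Optional[Tuple[int, int]]:
    """Return the physical range (min bp, max bp) bounding a QTL interval.

    Only markers mapped to `chromosome` are used. The range runs from the
    nearest marker at or before the QTL start to the nearest marker at or
    after the QTL end; None is returned if either flank is missing.
    """
    on_chrom = sorted((cm, bp) for cm, chrom, bp in link_markers if chrom == chromosome)
    left = [m for m in on_chrom if m[0] <= qtl_start_cm]
    right = [m for m in on_chrom if m[0] >= qtl_end_cm]
    if not left or not right:
        return None
    left_cm, right_cm = left[-1][0], right[0][0]
    spanned = [bp for cm, bp in on_chrom if left_cm <= cm <= right_cm]
    return min(spanned), max(spanned)


# Invented example data: one marker maps to a different chromosome and is ignored.
markers = [
    (40.0, "Vr11", 1_000_000),
    (50.0, "Vr11", 2_500_000),
    (55.0, "Vr11", 3_000_000),
    (60.0, "Vr03", 9_000_000),
    (70.0, "Vr11", 4_200_000),
]
interval = project_qtl_interval(markers, 51.0, 56.0, "Vr11")
```

Taking the minimum and maximum over all spanned markers, rather than only the two flanks, keeps the interval valid when local marker order is inverted between species. Gene models annotated within the returned range would form the candidate list, subject to the caveat above that gene content is not guaranteed to be conserved.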
The ability to use pre-selected GbS markers in two bambara groundnut crosses to generate consistent and coherent linkage maps which have conserved synteny links to the sequenced chromosomes of common bean (Phaseolus vulgaris), adzuki bean (Vigna angularis) and mung bean (Vigna radiata) suggests that this approach could be applied to other species of interest which have limited ex ante sequence information, but have closely related sequenced genome relatives. Our preliminary results overlaying the location of within-species QTL onto sequenced genomes suggest that translation of information (and the generation of gene lists within a trait controlling locus) can be used in an attempt to identify candidate controlling genes for further analysis within a species of interest which has no genome sequence. In bambara groundnut, before the release of a whole genome sequence by AOCC, the identified potential candidate genes involved in internode length regulation could be further investigated for a better understanding of the domestication events in this species and of a key trait for growth habit in different environments.
This study was based on segregation data from two populations of bambara groundnut. The F2 and F3 segregating populations were derived from a controlled cross between single genotype parental lines to produce Population IA: IITA686 (maternal) x Ankpa4 (paternal) and Population TD: Tiga Nicuru (maternal) x DipC (paternal), respectively. Plant materials were planted and trait data collected in the controlled environment FutureCrop Glasshouses, University of Nottingham Sutton Bonington Campus, UK, in 2012. The internode length trait data was recorded in mm for the average length of the fourth internode measured for the five longest stems per plant at harvest.
Young leaves were collected from 263 F2 individual lines, 71 F3 individual lines and two parental lines for each population. The DNA from Population TD was extracted following the Dellaporta method, whereas a column purification method was used for Population IA (DNeasy Plant Mini Kit, Qiagen). A total of 2 μg of genomic DNA of each line was sent to DArT Pty. Ltd. (Canberra, Australia) for DArTseq genotyping.
Marker selection for genetic map construction
DArTseq markers (as identified by the sequence tags) found in both bambara groundnut populations were mapped to the common bean, adzuki bean and mung bean genomes using CLC Genomics Workbench v8 (Qiagen). The default settings for sequence mapping were used except for ‘0.8’ for ‘length fraction’ and ‘ignore’ for ‘non-specific match handling’. Subsequently, the markers mappable to all three genomes were further selected for polymorphism between all parental lines and for not more than five missing data points per individual line.
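The selection criteria above can be expressed as a simple filter. The sketch below is illustrative only: the record layout, the genome keys and the function name are hypothetical, and it assumes that the missing-data limit is applied as a count of missing calls per marker.

```python
from typing import Dict, List

GENOMES = ("common_bean", "adzuki_bean", "mung_bean")


def select_common_markers(markers: List[Dict], max_missing: int = 5) -> List[str]:
    """Return IDs of markers meeting all three selection criteria.

    Each marker is a dict with:
      "id":      marker name
      "mapped":  {genome: True/False}, whether the tag mapped to that genome
      "parents": {population: (parent1_score, parent2_score)}
      "calls":   genotype calls across lines, with None for missing data
    """
    selected = []
    for m in markers:
        mapped_everywhere = all(m["mapped"].get(g, False) for g in GENOMES)
        polymorphic = all(
            p1 is not None and p2 is not None and p1 != p2
            for p1, p2 in m["parents"].values()
        )
        missing = sum(1 for call in m["calls"] if call is None)
        if mapped_everywhere and polymorphic and missing <= max_missing:
            selected.append(m["id"])
    return selected


# Invented example records, one passing and three failing a single criterion.
all_mapped = {g: True for g in GENOMES}
example = [
    {"id": "m1", "mapped": all_mapped,
     "parents": {"IA": (1, 0), "TD": (0, 1)}, "calls": [1, 0, None, 1, None]},
    {"id": "m2", "mapped": {**all_mapped, "mung_bean": False},
     "parents": {"IA": (1, 0), "TD": (0, 1)}, "calls": [1, 0, 1]},
    {"id": "m3", "mapped": all_mapped,
     "parents": {"IA": (1, 0), "TD": (1, 1)}, "calls": [1, 0, 1]},
    {"id": "m4", "mapped": all_mapped,
     "parents": {"IA": (1, 0), "TD": (0, 1)}, "calls": [None] * 6 + [1, 0]},
]
kept = select_common_markers(example)
```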
The presence or absence (0/1) scoring of dominant DArT markers for each individual line in the segregating populations was converted into genotype codes, either (c,a) or (b,d), by comparison with the parental lines. Bi-allelic SNP markers were assigned as ‘a’, ‘h’ and ‘b’ as appropriate in each individual line according to the scoring pattern in both parental lines. The markers that were filtered out in one population were also removed from the other population.
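As a worked illustration of this conversion, the sketch below follows the usual F2-type coding convention, in which ‘a’ and ‘b’ are the two parental homozygotes, ‘h’ is the heterozygote, ‘c’ means "not a" and ‘d’ means "not b". The function names, the use of ‘-’ for missing data and the two-letter SNP call format are assumptions made for the example.

```python
from typing import Optional

MISSING = "-"  # assumed code for missing data


def code_dominant(score: Optional[int], parent1: int, parent2: int) -> str:
    """Convert a presence/absence (1/0) score into a genotype code.

    If the fragment is present in parent 1 only, presence cannot be the
    parent-2 homozygote, so it is coded 'd' (not b) and absence is 'b'.
    If the fragment is present in parent 2 only, presence is 'c' (not a)
    and absence is 'a'.
    """
    if score is None:
        return MISSING
    if parent1 == 1 and parent2 == 0:
        return "d" if score == 1 else "b"
    if parent1 == 0 and parent2 == 1:
        return "c" if score == 1 else "a"
    raise ValueError("marker is not polymorphic between the parents")


def code_snp(call: Optional[str], parent1: str, parent2: str) -> str:
    """Convert a bi-allelic SNP call such as 'AA', 'AG' or 'GG' into a/h/b.

    Both parents are assumed to be homozygous for different alleles.
    """
    if call is None:
        return MISSING
    if call == parent1:
        return "a"
    if call == parent2:
        return "b"
    if set(call) == set(parent1) | set(parent2):
        return "h"
    return MISSING  # unexpected allele: treat as missing


dominant_codes = [code_dominant(s, 1, 0) for s in (1, 0, None)]
snp_codes = [code_snp(c, "AA", "GG") for c in ("AA", "AG", "GA", "GG", None)]
```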
Construction of the genetic linkage maps
A total of 142 pre-selected common markers (Additional file 3: Table S1) meeting the selection criteria were added to the population-specific SNP markers for genetic linkage analysis using JoinMap v4.1. The grouping of markers was set between LOD 2.0 and 10.0 with a step of 0.5 and the Independence LOD option adopted. All genotypic data were first analysed using a Chi-square test in JoinMap v4.1 against the expected segregation patterns for the population and marker segregation type, and for potential segregation distortion at a significance level of p <0.05. Linkage groups were established using the regression mapping approach with grouping at LOD >3.5. The Haldane mapping function with default calculation settings (recombination fraction ≤0.40, ripple value = 1, jump in goodness-of-fit threshold = 5) was selected. Each linkage group was initially screened for double crossover events. Markers showing double crossover events between two neighbouring markers within a distance of 1 to 3 cM were removed from the datasets. This iterative process of marker removal based on graphical genotyping and ‘stress and fit’ testing allowed a high quality framework map to be generated for QTL analysis.
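Two of the calculations named above are simple enough to sketch directly. The code below is a minimal illustration and not JoinMap's implementation: it computes a Chi-square statistic against an expected segregation ratio, with a closed-form p-value for 1 or 2 degrees of freedom, and converts a recombination fraction to map distance with the Haldane function, d = -50 ln(1 - 2r) cM. The function names and the example counts are invented.

```python
import math
from typing import Sequence


def chi_square(observed: Sequence[float], ratio: Sequence[float]) -> float:
    """Chi-square statistic of observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_square_p(stat: float, df: int) -> float:
    """Upper-tail p-value of the Chi-square distribution for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("only df = 1 or df = 2 are handled in this sketch")


def haldane_cm(r: float) -> float:
    """Haldane map distance in cM for a recombination fraction 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


# Invented counts for a co-dominant marker in an F2 (expected 1:2:1, df = 2).
stat = chi_square([40, 50, 10], [1, 2, 1])
distorted = chi_square_p(stat, df=2) < 0.05
distance = haldane_cm(0.10)
```

For a co-dominant marker in an F3 population the expected genotype ratio would be 3:2:3, and for a dominant marker in an F2 it would be 3:1 with one degree of freedom.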
The physical locations of the pre-selected common DArTseq markers in the linkage maps on common bean, adzuki bean and mung bean genomes are illustrated using Circos.
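As an illustration of how such marker correspondences might be prepared for plotting, the sketch below formats one line of a Circos link file, which lists two coordinate spans as `chrA startA endA chrB startB endB`. This is an assumed workflow rather than the one used in the study: the linkage group and chromosome names must match those declared in the karyotype file, the `cm_scale` factor used to turn cM into integer coordinates is a hypothetical choice, and the physical positions are invented.

```python
def circos_link_line(
    linkage_group: str,
    cm: float,
    chromosome: str,
    start_bp: int,
    end_bp: int,
    cm_scale: int = 1000,
) -> str:
    """Format one link between a genetic map position and a physical span.

    The cM position is multiplied by `cm_scale` and rounded so that the
    linkage group axis uses integer coordinates like the physical axes.
    """
    pos = int(round(cm * cm_scale))
    return f"{linkage_group} {pos} {pos} {chromosome} {start_bp} {end_bp}"


# Invented physical coordinates for a 61 bp tag on a Vigna angularis chromosome.
line = circos_link_line("LG5", 69.6, "Va08", 1200000, 1200060)
```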
Detection of QTLs
The internode length trait was subjected to QTL analysis through the interval mapping (IM) and multiple QTL mapping (MQM) approaches using MapQTL® v6.0. The analysis options were left at their defaults, whereby the regression algorithm was used for IM and MQM mapping. The significance threshold of the LOD score was identified through permutation tests using 10,000 iterations. A LOD score generated from IM was termed ‘significant’ if it exceeded the Genome Wide (GW) threshold at p ≤0.05 from the permutation test. Prior to MQM mapping, the closest linked marker to the QTL with significant LOD scores was selected as a co-factor. The locations of QTLs selected by marker cofactors were verified through the LOD table and visual inspection of the LOD profile, together with a 1-LOD drop confidence interval.
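The 1-LOD drop confidence interval can be read off a LOD profile as follows. This is a minimal sketch, not MapQTL's implementation: it takes LOD scores at ordered map positions, finds the peak, and extends the interval outwards while the LOD stays within one unit of the peak. The function name and the profile values are invented.

```python
from typing import Sequence, Tuple


def one_lod_interval(
    positions: Sequence[float],
    lods: Sequence[float],
    drop: float = 1.0,
) -> Tuple[float, float, float]:
    """Return (left bound, peak position, right bound) in map units.

    The bounds are the outermost contiguous positions around the peak
    whose LOD score is at least (peak LOD - drop).
    """
    peak = max(range(len(lods)), key=lambda i: lods[i])
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[peak], positions[right]


# Invented LOD profile along one linkage group (positions in cM).
positions = [0.0, 5.0, 10.0, 15.0, 20.0, 25.0]
lods = [1.0, 2.6, 3.4, 4.0, 3.1, 1.2]
support = one_lod_interval(positions, lods)
```

Because the bounds fall on scored positions only, a finer scan step gives a tighter estimate of the interval.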
AFLP: Amplified fragment length polymorphism
DArT: Diversity arrays technology
LOD: Logarithm of odds
MQM: Multiple QTL mapping
PVE: Phenotypic variation explained
QTL: Quantitative trait loci
RILs: Recombinant inbred lines
SNP: Single nucleotide polymorphism
SSR: Simple sequence repeat
UPGMA: Unweighted pair group method with arithmetic mean
The authors would like to acknowledge funding for HHC from the University of Nottingham Malaysia Campus MIDAS scheme, for PK from the Nigerian Government Tertiary Education Trust Fund and from Crops For the Future for SM and WKH.
Availability of data and materials
All data generated or analysed during this study are included in this published article and its supplementary information files.
WKH carried out the cross species comparison analysis, participated in the genetic mapping and QTL analysis and drafted the manuscript. HHC carried out the genetic mapping and QTL analysis. PK developed the IA population with marker information. NSA contributed phenotypic data and initial marker analysis in the TD population. JJ developed the circos analysis. FM developed the TD population and helped to draft the manuscript. AK developed the bambara groundnut DArTseq analysis pipeline. SM conceived of the study, and participated in its design and coordination and revised the manuscript. All authors read and approved the final manuscript.
Consent for publication
The authors declare that they have no competing interests.
Ethical approval and consent to participate
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Food and Agriculture Organization of the United Nations (FAO). Staple foods: What do people eat? In: Loftas T, editor. Dimensions of need: An atlas of food and agriculture. Rome: Banson; 1995. p. 21–4.
- Convention on Biological Diversity (CBD). Biodiversity for food security and nutrition. In: Get Ready for 2015 Newsletter. 2013.
- Brough SH, Azam-Ali SN. The effect of soil moisture on the proximate composition of Bambara groundnut (Vigna subterranea (L.) Verdc). J Sci Food Agric. 1992;60:197–203.
- Brough SH, Azam-Ali SN, Taylor AJ. The potentials of bambara groundnut (Vigna subterranea) in vegetable milk production and basic protein functionality systems. Food Chem. 1993;47:277–83.
- Azam-Ali SN, Sesay A, Karikari SK, Massawe F, Aguilar-Manjarrez J, Bannayan M, Hampson KJ. Assessing the potential of an underutilised crop: a case study using bambara groundnut. Expl Agric. 2001;37:433–72.
- Forni-Martins ER. New chromosome number in the genus Vigna Savi (Leguminosae-Papilionoideae). Bull Natl Plantentium. 1986;56:129–33.
- Basu S, Roberts JA, Azam-Ali SN, Mayes S. Bambara groundnut. In: Kole CM, editor. Genome mapping and molecular breeding in plants: pulses, sugar and tuber. New York: Springer; 2007. p. 159–73.
- Ahmad NS, Redjeki ES, Ho WK, Aliyu S, Mayes K, Massawe F, Kilian A, Mayes S. Construction of a genetic linkage map and QTL analysis in bambara groundnut (Vigna subterranea (L.) Verdc.). Genome. 2016. doi:10.1139/gen-2015-0153.
- Stadler F. Analysis of differential expression under water deficit stress and genetic diversity in Bambara groundnut (Vigna subterranea (L.) Verdc.) using novel high throughput Dissertation, Technische Universität München. 2009.
- African Orphan Crops Consortium (AOCC). 2015. http://africanorphancrops.org/. Accessed 12 Jan 2016.
- Schmutz J, McClean PE, Mamidi S, Wu GA, Cannon SB, Grimwood J, Jenkins J, et al. A reference genome for common bean and genome-wide analysis of dual domestications. Nat Genet. 2014;46:707–13.
- Kang YJ, Kim SK, Kim MY, Lestari P, Kim KH, Ha B-K, Jun TH, et al. Genome sequence of mungbean and insights into evolution within Vigna species. Nat Commun. 2014;5:5443.
- Kang YJ, Satyawan D, Shim S, Lee T, Lee J, Hwang WJ, Kim SK, et al. Draft genome sequence of adzuki bean, Vigna angularis. Sci Rep. 2015;5:8069.
- Lavin M, Herendeen PS, Wojciechowski MF. Evolutionary rates analysis of Leguminosae implicates a rapid diversification of lineages during the tertiary. Syst Biol. 2005;54:575–94.
- Silva LC, Cruz CD, Moreira MA, Barros EG. Simulation of population size and genome saturation level for genetic mapping of recombinant inbred lines (RILs). Genet Mol Biol. 2007;30:1101–8.
- Alcivar A, Jaconson J, Raingo J, Meksem K, Lightfoot DA, Kassem MA. Genetic analysis of soybean plant height, hypocotyl and internode lengths. J Agric Food Environ Sci. 2007;1:1–20.
- Dellaporta SL, Wood J, Hicks JB. A plant DNA minipreparation: Version II. Plant Mol Biol Report. 1983;1:19–21.
- van Ooijen JW. JoinMap® 4, Software for the calculation of genetic linkage maps in experimental populations. Kyazma B.V., Wageningen, Netherlands; 2006.
- Krzywinski M, Schein JE, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, et al. Circos: An information aesthetic for comparative genomics. Genome Res. 2009;19:1639–45.
- van Ooijen JW. MapQTL® 6, Software for the mapping of quantitative trait loci in experimental populations of diploid species. Kyazma B.V., Wageningen, Netherlands; 2009.