Method verification
Wild Vitis species can be used in grape breeding programs to introgress disease and abiotic stress resistance into susceptible germplasm belonging to the domesticated grape, V. vinifera. Commercial cultivars with wild Vitis ancestry are often referred to as “hybrids”. An evaluation of ancestry across commercial hybrids can provide insight into the history of hybrid grape breeding and a foundation for future efforts to select for ancestry based on marker data. Previous work provided accurate ancestry estimates of interspecific grape cultivars using Vitis9KSNP array data for cultivars belonging to the USDA germplasm collection [24]. We applied the same PCA-based method to evaluate the ancestry of some of the most widely grown hybrid cultivars sampled from North America and Europe using GBS data.
PCA provides a clear separation of wild Vitis and V. vinifera samples along PC1, with commercial hybrids found between the two ancestral groups (Fig. 1a). The projected position of a hybrid along PC1 was used to calculate its percentage V. vinifera ancestry (Fig. 1b).
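As a minimal sketch of how a position along PC1 can be turned into a percentage, the following assumes a simple linear interpolation between the centroids of the two ancestral groups. The function name `vinifera_percent`, the clamping to 0–100 %, and the PC1 scores are hypothetical illustrations, not the exact procedure or data of this study.

```python
from statistics import mean

def vinifera_percent(pc1_score, wild_pc1, vinifera_pc1):
    """Place a sample between the two ancestral centroids on PC1 (assumed linear)."""
    wild_centroid = mean(wild_pc1)
    vinifera_centroid = mean(vinifera_pc1)
    fraction = (pc1_score - wild_centroid) / (vinifera_centroid - wild_centroid)
    # Assumption: samples lying beyond a centroid are reported as 0 % or 100 %.
    return 100.0 * min(1.0, max(0.0, fraction))

# Hypothetical PC1 scores, for illustration only (not data from this study).
wild = [-0.21, -0.19, -0.20]
vinifera = [0.09, 0.11, 0.10]

# A hybrid projected halfway between the centroids gives roughly 50 %.
print(vinifera_percent(-0.05, wild, vinifera))
```

Under this assumption, the estimate depends only on the sample's PC1 coordinate relative to the two reference groups, which is why the choice of ancestral samples matters.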
In order to evaluate the accuracy of our ancestry estimates, we performed in silico crosses between wild Vitis and V. vinifera populations using our genome-wide SNP data to simulate F1 hybrids as well as progeny of simulated F1 hybrids backcrossed to either V. vinifera or wild Vitis. The simulated progeny were projected onto PC axes determined using the ancestral populations, and the resulting PCA plot is shown in Fig. 2a.
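One way such in silico crosses could be set up is sketched below. It assumes biallelic SNPs coded as 0, 1 or 2 copies of a counted allele and treats SNPs as independent (no linkage), which is a simplification; the function names and genotypes are hypothetical and do not reproduce the simulation used here.

```python
import random

def gamete(genotype, rng):
    """Draw one allele per SNP from a diploid genotype coded 0, 1 or 2."""
    # A parent carrying g copies of the counted allele transmits it with probability g / 2.
    return [1 if rng.random() < g / 2 else 0 for g in genotype]

def cross(parent_a, parent_b, rng):
    """Offspring genotype is the sum of one gamete from each parent, SNP by SNP."""
    return [a + b for a, b in zip(gamete(parent_a, rng), gamete(parent_b, rng))]

rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical genotypes, for illustration only.
vinifera = [2, 2, 1, 0, 2]
wild = [0, 0, 1, 2, 0]

f1 = cross(vinifera, wild, rng)            # simulated F1 hybrid
bc_vinifera = cross(f1, vinifera, rng)     # F1 backcrossed to V. vinifera
bc_wild = cross(f1, wild, rng)             # F1 backcrossed to wild Vitis
print(f1, bc_vinifera, bc_wild)
```

At SNPs where the two parents are fixed for different alleles, every simulated F1 is heterozygous, which is what places F1 progeny midway between the ancestral groups.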
The expected V. vinifera content in an F1 offspring with one V. vinifera and one wild Vitis parent is 50 %, and the mean estimated content in the simulated F1 population described here was 50.1 %, with a 95 % confidence interval (CI) ranging from 42.7 % to 57.2 %. In progeny produced by an F1 hybrid backcrossed to wild Vitis, the expected V. vinifera content is 25 %, which was the mean estimate of our simulated data, with a 95 % CI of 18.4 % to 32.6 %. Finally, the mean V. vinifera content in simulated F1 hybrids backcrossed to V. vinifera is expected to be 75 %, and our results have a mean value of 75.1 %, with a 95 % CI of 68.5 % to 80.9 %. The proximity of our simulated values to expected values provides support for the accuracy of our method, but it is worth noting that our 95 % confidence intervals indicate that estimates may deviate by as much as 7–8 % from the expected value. Moreover, the accuracy of our estimates may decrease in cases where crosses are generated from parents whose ancestry differs significantly from the samples used as ancestral populations in the present study. Ancestry estimates for simulated progeny are shown in Fig. 2b.
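The 7–8 % figure can be checked directly from the expected values and interval bounds reported above. The short sketch below does only that arithmetic; the helper name `max_deviation` is ours.

```python
def max_deviation(expected, ci_low, ci_high):
    """Largest distance between the expected value and either bound of the interval."""
    return round(max(expected - ci_low, ci_high - expected), 1)

# (expected %, 95 % CI lower bound, 95 % CI upper bound), as reported in the text.
reported = {
    "F1": (50.0, 42.7, 57.2),
    "F1 x wild Vitis": (25.0, 18.4, 32.6),
    "F1 x V. vinifera": (75.0, 68.5, 80.9),
}

for label, (expected, low, high) in reported.items():
    print(label, max_deviation(expected, low, high))
# F1 7.3
# F1 x wild Vitis 7.6
# F1 x V. vinifera 6.5
```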
Commercial grape ancestry estimation
The distribution of V. vinifera content estimated for the hybrid grape cultivars examined in this work is shown in Fig. 3a, and the ancestry estimates for each cultivar are listed in Fig. 3b.
Hybrids previously genotyped in Sawler et al. [24] and replicated in this study using GBS include ‘Bertille-seyve 5563’ (DVIT 169), ‘Van Buren’ (DVIT 1129), ‘Rofar Vidor’ (DVIT 2258), DVIT 2180, ‘Jackson Sel. #3’ (DVIT 2916), and ‘Marechal Foch’ (California) (DVIT 214). The ancestry estimates for these samples differed by 2–5 % from those previously estimated, with the exception of DVIT 2180 where our estimate of V. vinifera ancestry was 19 % higher than in the previous work. DVIT 2180 is an unnamed accession simply identified as a Vitis species by the USDA. Given that the tissue for both studies was collected separately, the large difference in our estimates may be due to mislabelling or sample mix-up. Regardless of this discrepancy, the position of this sample in PC space confirms that it is indeed a hybrid sample (Fig. 1a).
To further evaluate the accuracy of our ancestry estimates, we compared V. vinifera ancestries inferred from well-known pedigrees to our genomics-based ancestry estimates. For example, ‘Beta’ is a cross between Vitis riparia and ‘Concord’, a Vitis labrusca cross thought to possess some V. vinifera ancestry due in part to its hermaphroditic flowers [36, 37]. Sawler et al. [24] estimated the V. vinifera content of ‘Concord’ as 31 %. Based on these values, the percentage V. vinifera found in ‘Beta’ is expected to be approximately 16 %, and it was estimated as 11 % here (Fig. 3a). ‘Baco Noir’ is a known F1 hybrid between ‘Folle Blanche’ (V. vinifera) and V. riparia, and therefore it is expected to be 50 % V. vinifera. Our estimate is 46 %, which falls within the 95 % confidence interval of the V. vinifera ancestry estimates from our simulated F1 hybrid offspring. In these two cases, our genomics-based ancestry estimates are consistent with pedigree-based estimates.
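The pedigree-based expectations used above follow from averaging the V. vinifera content of the two parents. The sketch below reproduces that arithmetic with the values quoted in the text, assuming V. riparia contributes 0 % V. vinifera; the helper names are ours.

```python
def midparent(parent_a_pct, parent_b_pct):
    """Expected V. vinifera content of offspring: the mean of the two parents."""
    return (parent_a_pct + parent_b_pct) / 2

def within(value, interval):
    """True if value lies inside the closed interval (low, high)."""
    low, high = interval
    return low <= value <= high

SIMULATED_F1_CI = (42.7, 57.2)  # 95 % CI of simulated F1 estimates, from the text

beta_expected = midparent(0.0, 31.0)          # V. riparia x 'Concord' (31 %)
baco_noir_expected = midparent(100.0, 0.0)    # V. vinifera parent x V. riparia

print(beta_expected)                  # 15.5, i.e. approximately 16 %
print(baco_noir_expected)             # 50.0
print(within(46.0, SIMULATED_F1_CI))  # True: the 'Baco Noir' estimate of 46 %
```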
Our study also included several cultivars collected from multiple locations, and the ancestry estimates were generally similar or equivalent for these replicates from different geographic regions. For example, ‘Frontenac’ samples from two locations in Nova Scotia and from Missouri, as well as a Gris sport, were all estimated to be 30 % V. vinifera. ‘Marquette’ samples from both Nova Scotia and Missouri were estimated to contain 37 % V. vinifera. However, the ancestry estimate (52 %) for a ‘Marechal Foch’ accession retrieved from the USDA germplasm collection was 6 % and 7 % higher than the estimates for samples collected from two different locations in Nova Scotia. IBS values indicate that this accession is likely not the same cultivar as the ‘Marechal Foch’ grown in Nova Scotia (Additional file 4: Figure S2). Still, all ancestry estimates of ‘Marechal Foch’ fall within the putative F1 range, which is expected given that ‘Marechal Foch’ is the offspring of ‘101–14 Mgt.’ (V. riparia x V. rupestris) x ‘Goldriesling’ (V. vinifera). ‘Leon Millot’ (44 %) and ‘Marechal Joffre’ (47 %) are siblings of ‘Marechal Foch’, and their ancestry estimates also fall within the range expected from an F1 hybrid (Fig. 3b) [38].
Within-cultivar differences in ancestry estimates may be due partially to genotyping error. Curation error also leads to the mislabeling of samples and misidentification of cultivars. Previous work on V. vinifera cultivars from the USDA collection revealed widespread curation error [7], and recent work on the same collection found that the species names assigned to samples were incorrect in approximately 4 % of cases [24]. In another example, three different Italian varieties all referred to as ‘Bonarda’ had no direct genetic relationship with each other [39]. Thus, curation error represents a likely source for the discrepancies we observe between samples with identical names.
While our data do not allow us to resolve first-degree relationships, we did examine the distribution of IBS values based on expected relationships derived from pedigree data (Additional file 4: Figure S2). We found that, while many cultivars do share alleles in a manner that supports their expected relationship, several pairs of samples that are supposed to be either geographic replicates or first-degree relatives did not have IBS values consistent with their pedigrees. For example, the IBS value for ‘Villaris’ and ‘Felicia’ (0.83) was at least 0.02 lower than those of all other sibling pairs examined. Additionally, the ‘Seyval Blanc’ sampled from Germany does not resemble the ‘Seyval Blanc’ from Nova Scotia to the degree we expect. In both cases, the V. vinifera ancestry estimates also differed. Furthermore, ‘Orion’, ‘Staufer’ and ‘Phoenix’ are all progeny of crosses between ‘Villard Blanc’ (62 %) and V. vinifera varieties, which has been confirmed by simple sequence repeat genotyping (Rudolf Eibach, personal communication). However, the ancestry expected for these progeny from pedigree information (~81 %) is higher than what we observe (59 %–65 %). Further work is required to determine whether these discrepancies reflect sample mislabeling, cross-contamination, or genotyping error.
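For readers unfamiliar with IBS, the sketch below computes a pairwise value under one common definition, the proportion of alleles shared identical by state across SNPs genotyped in both samples. This definition, the 0/1/2 genotype coding with `None` for missing calls, and the example genotypes are assumptions for illustration; the values reported here may have been computed differently.

```python
def ibs(sample_a, sample_b):
    """Proportion of alleles shared identical by state between two samples."""
    shared = 0
    compared = 0
    for a, b in zip(sample_a, sample_b):
        if a is None or b is None:
            continue  # skip SNPs with a missing call in either sample
        # Genotypes are allele counts (0, 1, 2), so 2 - |a - b| alleles are shared.
        shared += 2 - abs(a - b)
        compared += 2
    if compared == 0:
        raise ValueError("no SNPs genotyped in both samples")
    return shared / compared

# Hypothetical genotypes, for illustration only.
sample_1 = [0, 1, 2, 2, None]
sample_2 = [0, 2, 2, 0, 1]
print(ibs(sample_1, sample_2))  # 0.625
```

Under this definition, replicates of the same cultivar should approach 1 (up to genotyping error), so a markedly lower value between identically named samples points to a labeling problem.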
Wild species introgression
Often the best source for improvement of a crop plant is its wild relatives [11]. One crop that has benefited greatly from the use of wild relatives in breeding is tomato. Disease resistance in most commercial tomato cultivars is the result of genes introgressed from wild species [40, 41]. However, recurrent backcrossing to elite varieties is performed for several generations in order to remove undesirable genes introduced from the wild relative [41]. In tomato, it is customary to continue backcrossing to elite germplasm for 4 to 6 generations before the resulting hybrid is tested commercially [42].
In comparison to tomato, grape breeding appears to still be in its infancy. Approximately one third (22/64) of the hybrids analyzed in this study have V. vinifera content consistent with F1 hybridization (Fig. 3b). Our results suggest that grape breeders have not extensively backcrossed with V. vinifera in order to introgress wild genes of interest. The distribution of V. vinifera ancestry across hybrids actually implies that backcrosses to wild Vitis species have been more frequent than backcrosses to V. vinifera during hybrid grape breeding. Breeders may have generated hybrids with high wild content when aiming to introgress numerous beneficial traits from wild relatives over a small number of generations. Further local ancestry estimates would be required in order to determine the number of generations of crossing.
The high number of hybrids consistent with F1 hybridization suggests that, overall, recent hybrid grape breeding has not followed standard breeding practices that aim to introgress desirable traits from wild species by repeatedly backcrossing to elite germplasm. Alternatively, because breeders often target numerous traits for introgression from the wild, the optimal V. vinifera content may be lower than the desired elite content in other crops. Ultimately, the crucial factor will be which desirable parts of each ancestral genome are captured, rather than the final V. vinifera percentage.
One instance where repeated backcrossing to V. vinifera has been exploited is in the development of Pierce’s disease (PD) resistant wine grapes by tracking PD resistance alleles from the wild species V. arizonica through marker-assisted selection (MAS) [43]. Seedlings resistant to PD were repeatedly backcrossed to V. vinifera, resulting in progeny with 97 % V. vinifera ancestry in the fifth generation, a value much higher than any estimates of commercial cultivars examined in this study [44]. There are many more opportunities for desirable traits, such as cold hardiness, to be introgressed from wild Vitis species into novel elite cultivars [45].
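The 97 % figure matches the expectation from repeated backcrossing, in which each backcross to V. vinifera halves the remaining wild content on average. The sketch below assumes that the F1 counts as the first generation, so the fifth generation corresponds to four backcrosses; this reading and the function name are our assumptions.

```python
def vinifera_after_backcrosses(n_backcrosses, start_pct=50.0):
    """Expected % V. vinifera after n backcrosses to V. vinifera, starting from an F1."""
    pct = start_pct
    for _ in range(n_backcrosses):
        # Offspring expectation is the mean of the hybrid parent and a 100 % V. vinifera parent.
        pct = (pct + 100.0) / 2
    return pct

for n in range(5):
    print(f"generation {n + 1}: {vinifera_after_backcrosses(n)} %")
# generation 1: 50.0 %
# generation 2: 75.0 %
# generation 3: 87.5 %
# generation 4: 93.75 %
# generation 5: 96.875 %
```

These are expectations only; individual seedlings vary around them, and selection for a resistance locus retains wild sequence around that locus.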
The use of molecular markers can also allow breeders to introgress multiple resistance genes into a single variety, a process called pyramiding [46]. ‘Regent’ is a cross between ‘Diana’, a V. vinifera variety, and the hybrid grape ‘Chambourcin’, which has 46 % V. vinifera ancestry according to our work. Based on these values, the expected V. vinifera ancestry of ‘Regent’ is approximately 73 %, and our estimate is 68 %. The complex pedigree of ‘Regent’ enabled the introgression of resistance to mildews and botrytis from several Vitis species, as well as high frost tolerance and early maturity [47]. In 2013, ‘Regent’ ranked 12th in Germany according to total acreage [48]. Recently, ‘Regent’ was crossed with VHR 3082-1-42 (Muscadinia rotundifolia x V. vinifera, then backcrossed four times with V. vinifera) to successfully combine powdery and downy mildew resistance genes into a single variety whose ancestry likely exceeds 80 % V. vinifera [49].
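The expectations in this paragraph can be traced through the pedigrees with a small recursive calculation. In the sketch below a pedigree is a nested pair of parents, each either a known percentage or another pair. Treating Muscadinia rotundifolia as 0 % V. vinifera and using our 68 % estimate for ‘Regent’ are assumptions made for illustration.

```python
def expected_vinifera(node):
    """Expected % V. vinifera: a number is a known value, a pair is a cross of two parents."""
    if isinstance(node, tuple):
        parent_a, parent_b = node
        return (expected_vinifera(parent_a) + expected_vinifera(parent_b)) / 2
    return float(node)

VINIFERA = 100.0

# 'Regent' = 'Diana' (V. vinifera) x 'Chambourcin' (46 % in this study).
regent_pedigree = (VINIFERA, 46.0)
print(expected_vinifera(regent_pedigree))  # 73.0

# VHR 3082-1-42: M. rotundifolia x V. vinifera, then four backcrosses to V. vinifera.
vhr = (0.0, VINIFERA)  # assumption: M. rotundifolia carries no V. vinifera ancestry
for _ in range(4):
    vhr = (vhr, VINIFERA)
print(expected_vinifera(vhr))  # 96.875

# 'Regent' (68 % as estimated here) x VHR 3082-1-42.
print(expected_vinifera((68.0, vhr)))  # 82.4375
```

Using the pedigree-based 73 % for ‘Regent’ instead of the observed 68 % raises the last value further, so the expectation exceeds 80 % either way.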
The Institute for Grapevine Breeding Geilweilerhof, which developed ‘Regent’, bred 6 of the 7 cultivars with the highest V. vinifera content in our study (Fig. 3b). Thus, some breeders have produced hybrids with a high percentage of V. vinifera ancestry while retaining desirable characteristics from wild species. However, the overall lack of evidence for repeated backcrossing to V. vinifera in hybrid grape breeding indicates that grape breeders have yet to fully exploit the potential of combining key traits from wild species into novel cultivars with high V. vinifera content.