Prospects and limits of marker imputation in quantitative genetic studies in European elite wheat (Triticum aestivum L.)

Table 2 Correlations between Rogers’ distance matrices of the individual lines of the test population

Data set	Ref 50 (cor)	Ref 100 (cor)	Ref 200 (cor)	Ref 300 (cor)
9 k panel	0.95	0.95	0.95	0.95
Beagle	0.83	0.92	0.95	0.96
FImpute	0.95	0.96	0.97	0.97
IMPUTE2	0.96	0.97	0.98	0.98
Random Forest	0.61	0.61	0.61	0.66

Correlations (cor) for the imputation methods are based solely on the imputed parts of the data sets (90 k SNP minus 9 k SNP data) and the original 90 k SNP data set. The 9 k panel row gives the correlation between the Rogers’ distance matrices of the original 9 k and original 90 k SNP data sets. The imputed low-to-high marker density data sets were generated by map-dependent (Beagle, FImpute, and IMPUTE2) and map-independent (Random Forest) imputation algorithms for reference populations of 50, 100, 200, and 300 out of 371 lines. All correlations were significantly larger than zero (P < 0.01) according to a Mantel test.
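The quantity reported in the table can be illustrated with a minimal sketch. It is not the authors' code. It assumes biallelic SNPs coded as 0, 1, or 2 copies of one allele, in which case the per-locus term of Rogers’ distance reduces to the absolute difference in allele frequency. It also assumes a simple one-sided permutation Mantel test. The function names `rogers_distance` and `mantel_test`, the number of permutations, and the toy data are illustrative assumptions only.

```python
import numpy as np


def rogers_distance(genotypes):
    """Pairwise Rogers' distance for biallelic markers.

    genotypes: (n_lines, n_markers) array coded as 0/1/2 copies of one allele
    (an assumed coding). For two alleles per locus, the per-locus term
    sqrt(0.5 * sum_alleles (p - q)^2) simplifies to |p - q|.
    """
    freq = np.asarray(genotypes, dtype=float) / 2.0
    # Broadcasting builds an (n, n, m) array; fine for a toy example,
    # but large panels would need a chunked computation.
    diff = np.abs(freq[:, None, :] - freq[None, :, :])
    return diff.mean(axis=2)


def mantel_test(d1, d2, n_perm=999, seed=0):
    """Pearson correlation of two distance matrices with a one-sided
    permutation P value (H1: correlation larger than zero)."""
    rng = np.random.default_rng(seed)
    n = d1.shape[0]
    upper = np.triu_indices(n, k=1)  # each pair of lines counted once
    x = d1[upper]
    r_obs = np.corrcoef(x, d2[upper])[0, 1]
    hits = 0
    for _ in range(n_perm):
        perm = rng.permutation(n)
        # Permute rows and columns of d2 together to keep it a valid matrix.
        d2_perm = d2[np.ix_(perm, perm)]
        if np.corrcoef(x, d2_perm[upper])[0, 1] >= r_obs:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return r_obs, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Toy stand-ins for an original and an imputed marker matrix (inbred lines).
    original = rng.integers(0, 2, size=(20, 200)) * 2
    imputed = original.copy()
    errors = rng.random(original.shape) < 0.05  # 5 % simulated imputation errors
    imputed[errors] = 2 - imputed[errors]
    r, p = mantel_test(rogers_distance(original), rogers_distance(imputed))
    print(f"cor = {r:.2f}, P = {p:.3f}")
```

In this sketch, rows and columns of one matrix are permuted jointly, because the entries of a distance matrix are not independent observations. This is why a Mantel test is used here instead of an ordinary significance test on the correlation coefficient.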

ISSN: 1471-2164