Table 1 Overview of the read assignment accuracy of EAGLE-RC compared with the concatenation method on real datasets. The first part of the table describes each dataset: the species of origin, the type of replication (biological or technical), the sequencing strategy, and the divergence between the two progenitor species, represented by the two-way average nucleotide identity (ANI). The sequencing strategy gives the sequencing layout (PE = paired-end, SE = single-end) followed by the read length in bp. The two-way ANI was obtained using the ANI calculator from [56] with default parameters. The ANI value for Mimulus could not be calculated because of excessive computation time requirements (> 6,000 CPU hours). The second part of the table shows the average number of uniquely mapped reads for each approach, which was used to calculate the average error rate in the third part of the table. The error rate was calculated as the number of reads assigned to the wrong genome divided by the total number of reads that were uniquely mapped and deduplicated.

From: ARPEGGIO: Automated Reproducible Polyploid EpiGenetic GuIdance workflOw

| Species | Type of replicate (#) | Sequencing layout and read length | Two-way average nucleotide identity | Average number of uniquely mapped reads (concatenated genome) | Average number of uniquely mapped reads (read classification) | Average error rate (concatenated genome) | Average error rate (read classification) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Arabidopsis halleri | Biological (2) | PE150 | 94.29 ± 3.94% | 17,258,758 | 18,311,330 | 3.98% | 1.16% |
| Arabidopsis lyrata | Biological (2) | PE150 | 94.29 ± 3.94% | 22,204,342 | 23,301,056 | 5.94% | 1.45% |
| Mimulus guttatus | Technical (4) | SE150 | N.A. | 1,420,116 | 1,288,800 | 26.78% | 7.52% |
| Mimulus luteus | Technical (4) | SE150 | N.A. | 3,889,458 | 3,760,614 | 9.80% | 2.29% |
| Gossypium arboreum | Technical (2) | PE125 | 91.07 ± 4.68% | 253,912,667 | 254,261,702 | 0.0044% | 0.0013% |
| Gossypium raimondii | Technical (2) | PE125 | 91.07 ± 4.68% | 242,590,069 | 246,935,598 | 0.0039% | 0.0019% |
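
The error rate defined in the caption (wrongly assigned reads divided by uniquely mapped, deduplicated reads) can be illustrated with a minimal Python sketch. The function names and the read counts below are hypothetical and are not taken from ARPEGGIO or EAGLE-RC. The sketch also assumes that the average error rate is the mean of the per-replicate rates, which the caption does not state explicitly.

```python
from typing import Iterable, Tuple


def error_rate(misassigned_reads: int, unique_dedup_reads: int) -> float:
    """Fraction of uniquely mapped, deduplicated reads assigned to the wrong genome."""
    if unique_dedup_reads <= 0:
        raise ValueError("unique_dedup_reads must be positive")
    if not 0 <= misassigned_reads <= unique_dedup_reads:
        raise ValueError("misassigned_reads must lie between 0 and unique_dedup_reads")
    return misassigned_reads / unique_dedup_reads


def average_error_rate(replicates: Iterable[Tuple[int, int]]) -> float:
    """Mean of per-replicate error rates (assumed averaging scheme).

    Each replicate is a (misassigned_reads, unique_dedup_reads) pair.
    """
    rates = [error_rate(m, t) for m, t in replicates]
    if not rates:
        raise ValueError("at least one replicate is required")
    return sum(rates) / len(rates)


if __name__ == "__main__":
    # Hypothetical counts for two replicates, for illustration only.
    replicates = [(400, 10_000), (600, 20_000)]
    print(f"{average_error_rate(replicates):.2%}")
```

With these made-up counts the two replicates have error rates of 4% and 3%, so the reported average would be their mean.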