Mining statistically-solid k-mers for accurate NGS error correction

BMC Genomics

Table 1 Data sets used to evaluate the performance of the error correction models

Data set	Genome name	Genome size (bp)	Error rate (%)	Read length (bp)	Coverage	Number of reads	Insert length	Is synthetic
R1	S. aureus	2,821,361	1.28	101	46.3 ×	1,294,104	180	No
R2	R. sphaeroides	4,603,110	1.08	101	45.0 ×	2,050,868	180	No
R3	H. chromosome 14	88,218,286	0.52	101	41.8 ×	36,504,800	155	No
R4	B. impatiens	249,185,056	0.86	124	150.8 ×	303,118,594	400	No
S1	H. chromosome 14	88,218,286	0.97	101	41.8 ×	36,504,800	180	Yes
S2	B. impatiens	249,185,056	0.98	124	150.8 ×	303,118,594	400	Yes
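
The Coverage column appears consistent with the usual definition of average sequencing depth: number of reads × read length / genome size. The sketch below is a minimal Python check under that assumption. The source does not state how coverage was computed, and the function name `coverage` is illustrative only.

```python
def coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average depth: total sequenced bases divided by genome size (assumed definition)."""
    return num_reads * read_length / genome_size


# Values copied from Table 1: (number of reads, read length in bp, genome size in bp).
# S1 and S2 share these values with R3 and R4, so they are not repeated here.
datasets = {
    "R1": (1_294_104, 101, 2_821_361),
    "R2": (2_050_868, 101, 4_603_110),
    "R3": (36_504_800, 101, 88_218_286),
    "R4": (303_118_594, 124, 249_185_056),
}

for name, (num_reads, read_length, genome_size) in datasets.items():
    depth = coverage(num_reads, read_length, genome_size)
    print(f"{name}: {depth:.1f} x")
```

Rounded to one decimal place, the results match the tabulated coverage values (46.3, 45.0, 41.8 and 150.8).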

ISSN: 1471-2164