
Table 1 Compression ratio for the H. sapiens dataset-60 (171GB)

From: Sketch distance-based clustering of chromosomes for large genome database compression

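Compression ratio with each algorithm, by reference:

| Reference | HiRGC | iDoComp | GDC2 | ERGC | NRGC | SCCG |
|---|---|---|---|---|---|---|
| GCA_000004845 | 339.80 | *184.20* | 238.98 | 11.00 | 122.67 | 225.35 |
| hg19 | 346.80 | 26.78 | *242.60* | *128.11* | 137.41 | *265.03* |
| YH | *351.74* | 134.13 | 237.39 | 108.26 | 123.24 | 228.20 |
| GCA_000252825 | 241.01 | 92.65 | 230.45 | 102.62 | 176.08 | 122.25 |
| Huref | 245.79 | 140.84 | 224.47 | 69.59 | *177.85* | 123.26 |
| ECC clustering result | **443.51** | **238.68** | **248.11** | **293.24** | **184.13** | **313.60** |
| Ratio gain* | 22.05% | 22.83% | 2.22% | 56.31% | 3.41% | 15.49% |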
  1. Bold text indicates the highest compression ratio for each algorithm; italic text indicates the best result obtained with a fixed single reference.
  2. *The ratio gain of ECC relative to the best fixed single-reference case.
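
The footnote does not spell out how the ratio gain is computed. The following is a minimal sketch, assuming the gain is (ECC ratio - best fixed-reference ratio) / ECC ratio, expressed as a percentage. This formula is inferred from the table values and is not stated in the source; the names `ratio_gain` and `gains_by_algorithm` are hypothetical. Under this assumption the iDoComp, GDC2, ERGC, NRGC and SCCG columns reproduce the printed gains, while the HiRGC column gives 20.69% rather than the printed 22.05%, so the assumed definition may not match the authors' calculation for that column.

```python
# Compression ratios transcribed from Table 1, one list entry per algorithm.
ALGORITHMS = ["HiRGC", "iDoComp", "GDC2", "ERGC", "NRGC", "SCCG"]

FIXED_REFERENCE = {
    "GCA_000004845": [339.80, 184.20, 238.98, 11.00, 122.67, 225.35],
    "hg19":          [346.80, 26.78, 242.60, 128.11, 137.41, 265.03],
    "YH":            [351.74, 134.13, 237.39, 108.26, 123.24, 228.20],
    "GCA_000252825": [241.01, 92.65, 230.45, 102.62, 176.08, 122.25],
    "Huref":         [245.79, 140.84, 224.47, 69.59, 177.85, 123.26],
}

ECC = [443.51, 238.68, 248.11, 293.24, 184.13, 313.60]


def ratio_gain(ecc_ratio, best_fixed_ratio):
    """Assumed definition: improvement relative to the ECC ratio, in percent."""
    return (ecc_ratio - best_fixed_ratio) / ecc_ratio * 100.0


def gains_by_algorithm():
    """Return the assumed ratio gain per algorithm, rounded to two decimals."""
    gains = {}
    for col, name in enumerate(ALGORITHMS):
        # Best case among the fixed single-reference runs for this algorithm.
        best_fixed = max(row[col] for row in FIXED_REFERENCE.values())
        gains[name] = round(ratio_gain(ECC[col], best_fixed), 2)
    return gains


if __name__ == "__main__":
    for name, gain in gains_by_algorithm().items():
        print(f"{name}: {gain:.2f}%")
```

Taking the column-wise maximum over the fixed references mirrors the footnote's "best case"; a different baseline (for example, the mean over references) would give different gains.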