
Table 1 Statistics of the real data sets. The 14-bacterial-species data set includes 14 clusters. The viral set includes 9 clusters. Cluster counts in the bacterial, maize LTRs, and human microbiome sets are unknown.

From: MeShClust v3.0: high-quality clustering of DNA sequences using the mean shift algorithm and alignment-free identity scores

| Data set | Sequence count | Total length | Maximum length | Minimum length | Mean length | Median length |
| --- | ---: | ---: | ---: | ---: | ---: | ---: |
| 14 bacterial species | 1,328 | 4,256,374,969 | 9,270,175 | 801,203 | 3,205,102 | 2,874,351 |
| Bacterial | 10,562 | 38,577,794,947 | 16,040,666 | 112,031 | 3,652,509 | 3,647,501 |
| LTRs | 253,224 | 346,337,915 | 5,999 | 100 | 1,368 | 1,187 |
| Microbiome | 1,071,335 | 269,374,512 | 372 | 171 | 251 | 256 |
| Viral | 96 | 635,979 | 13,246 | 2,605 | 6,625 | 7,458 |
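
The mean lengths in Table 1 can be cross-checked against the other columns, since a mean length should equal the total length divided by the sequence count. The short sketch below recomputes each mean from the values in the table. The names `TABLE_1`, `recomputed_means`, and `mismatches` are illustrative only and do not come from the MeShClust software, and rounding to the nearest integer is an assumption, because the table does not state how the means were rounded.

```python
# Values copied from Table 1: (sequence count, total length, reported mean length).
TABLE_1 = {
    "14 bacterial species": (1_328, 4_256_374_969, 3_205_102),
    "Bacterial": (10_562, 38_577_794_947, 3_652_509),
    "LTRs": (253_224, 346_337_915, 1_368),
    "Microbiome": (1_071_335, 269_374_512, 251),
    "Viral": (96, 635_979, 6_625),
}


def recomputed_means(table):
    """Return {data set: total length / sequence count, rounded to an integer}.

    Rounding to the nearest integer is an assumption; the table does not
    say how the reported means were rounded.
    """
    return {name: round(total / count) for name, (count, total, _) in table.items()}


def mismatches(table):
    """Return the data sets whose recomputed mean differs from the reported mean."""
    means = recomputed_means(table)
    return [name for name, (_, _, reported) in table.items() if means[name] != reported]


if __name__ == "__main__":
    for name, mean in recomputed_means(TABLE_1).items():
        print(f"{name}: {mean:,}")
    print("mismatches:", mismatches(TABLE_1))
```

Under that rounding assumption, every recomputed mean agrees with the reported value, so the sequence count, total length, and mean length columns are mutually consistent. The median, maximum, and minimum lengths cannot be checked this way, because they depend on the individual sequence lengths, which the table does not list.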