Table 1 Summary for genome sequencing, assembly and annotation

From: The genome of medicinal leech (Whitmania pigra) and comparative genomic study for exploration of bioactive ingredients

 

| | H. robusta (Ref [7]) | W. pigra (this study) | H. medicinalis (Ref [8]) | H. medicinalis (Ref [9]) |
|---|---|---|---|---|
| Size of genome assembly | 228 Mbp | 177 Mbp | 187 Mbp | 177 Mbp |
| Num. of scaffolds | 1,993 | 10,050 | 14,042 | 19,929 |
| Num. of scaffolds (> 2 Kbp) | 1,124 | 3,495 | 5,277 | 10,128 |
| Scaffold N50 | 3,060 Kbp | 728 Kbp | 97 Kbp | 504 Kbp |
| Total reads | 3,176,156 | 118,388,619 | 62,184,084 | NA |
| Reads mapping to genome (%) | 2,839,951 (89%) | 112,480,685 (95%) | NA | NA |
| Sequencing coverage depth | 7.92X | 100X | 73X | 146X |
| Repetitive content (%) | 33 | 23 | NA | 24 |
| GC (%) | 33 | 35 | 41 | 35 |
| Num. of predicted genes | 23,400 a | 26,743 a | 14,596 a | 17,205 a |
| Protein length | 376 | 438 | 464 | NA |
| Mean exon length | 203 bp | 205 bp | 224 bp | NA |
| Mean intron length | 526 bp | 391 bp | 716 bp | NA |
| Mean number of exons per gene | 6.1 | 6.4 | 8 | NA |

Note: a The genomes of H. robusta (Ref [7]), W. pigra (this study), H. medicinalis (Ref [8]) and H. medicinalis (Ref [9]) were annotated using Genewise, BRAKER_v2, AUGUSTUS_v3 and MAKER_v2, respectively. NA, data not available in the references. GC, fraction of guanine plus cytosine nucleobases. Scaffold N50, the length such that half of the assembled sequence is in scaffolds of this length or longer.
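
The Scaffold N50 definition in the note can be made concrete with a short sketch. The function name `scaffold_n50` and the toy scaffold lengths below are illustrative assumptions, not taken from the study or from any of the assemblers it cites. The sketch sorts scaffolds from longest to shortest and returns the length at which the running total first reaches half of the assembly size.

```python
def scaffold_n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    Hypothetical helper for illustration only.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half_total = sum(ordered) / 2            # half of the assembled sequence
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the running total to at least
        # half of the assembly defines the N50.
        if running >= half_total:
            return length


# Toy lengths invented for this example (total 300, so half is 150).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(scaffold_n50(toy_lengths))  # 80 + 70 = 150 reaches half, so N50 is 70
```

Because N50 depends only on the length distribution, an assembly with fewer, longer scaffolds has a higher N50 than a more fragmented assembly of similar total size, which is why the table reports it alongside the scaffold counts.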