Table 3 Transcriptome coverage analysis based upon CEGMA and BUSCO alignment

From: A survey of the complex transcriptome from the highly polyploid sugarcane genome using full-length isoform sequencing and de novo assembly from short read sequencing

CEGMA alignment

| Assembly | Count | # CEGs Protein | Complete CEGs count | % Completeness | Partial CEGs | % Partials | Missing CEGs | % Missing | Total CEGs | % Complete and partial CEGs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PacBio isoforms | 107,598 | 248 | 240 | 96.77 | 3 | 1.2 | 5 | 2.02 | 243 | 97.98 |
| de novo contigs | 906,566 | 248 | 242 | 97.58 | 6 | 2.4 | 0 | 0.00 | 248 | 100.00 |
| Unigenes | 72,269 | 248 | 224 | 90.32 | 13 | 5.2 | 11 | 4.44 | 237 | 95.56 |
| SoGI dataset | 121,342 | 248 | 156 | 62.90 | 61 | 24.6 | 31 | 12.50 | 217 | 87.50 |
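
The percentage columns in the CEGMA block appear to follow directly from the counts, taken over the 248 core eukaryotic genes. The sketch below recomputes them from the counts in the table; the function and variable names are illustrative assumptions and do not come from the original analysis.

```python
# Counts copied from the CEGMA rows of Table 3; 248 is the number of CEGs.
TOTAL_CEGS = 248

cegma_counts = {
    "PacBio isoforms": {"complete": 240, "partial": 3, "missing": 5},
    "de novo contigs": {"complete": 242, "partial": 6, "missing": 0},
    "Unigenes": {"complete": 224, "partial": 13, "missing": 11},
    "SoGI dataset": {"complete": 156, "partial": 61, "missing": 31},
}


def pct(count, total=TOTAL_CEGS):
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100.0 * count / total, 2)


def summarize(counts):
    """Derive the percentage columns and the complete-plus-partial total."""
    found = counts["complete"] + counts["partial"]
    return {
        "pct_complete": pct(counts["complete"]),
        "pct_partial": pct(counts["partial"]),
        "pct_missing": pct(counts["missing"]),
        "total_found": found,
        "pct_found": pct(found),
    }


if __name__ == "__main__":
    for name, counts in cegma_counts.items():
        print(name, summarize(counts))
```

In every row the complete, partial, and missing counts sum to 248, so the "Total CEGs" column is the complete count plus the partial count.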

BUSCO notation alignment

| Assembly | Total complete (%) | Single copy BUSCOs (%) | Duplicated BUSCOs (%) | Fragmented BUSCOs (%) | Missing BUSCOs (%) | Complete and fragmented (%) |
| --- | --- | --- | --- | --- | --- | --- |
| PacBio transcript isoforms | 83.58 | 17.15 | 66.42 | 6.69 | 9.73 | 90.27 |
| de novo transcript contigs | 92.99 | 10.04 | 82.95 | 4.71 | 2.30 | 97.70 |
| Unigenes | 79.60 | 63.81 | 15.79 | 11.40 | 9.00 | 91.00 |
| SoGI | 46.65 | 26.67 | 19.98 | 34.41 | 18.93 | 81.07 |
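
In BUSCO notation, the complete fraction is the sum of the single-copy and duplicated fractions, and the complete, fragmented, and missing fractions together account for 100%. The sketch below checks the BUSCO rows of the table against those identities. The 0.02 tolerance is an assumption chosen to absorb rounding in the published two-decimal values, and the names are illustrative.

```python
# Percentages copied from the BUSCO rows of Table 3, in the order:
# (complete, single-copy, duplicated, fragmented, missing, complete + fragmented)
busco_rows = {
    "PacBio transcript isoforms": (83.58, 17.15, 66.42, 6.69, 9.73, 90.27),
    "de novo transcript contigs": (92.99, 10.04, 82.95, 4.71, 2.30, 97.70),
    "Unigenes": (79.60, 63.81, 15.79, 11.40, 9.00, 91.00),
    "SoGI": (46.65, 26.67, 19.98, 34.41, 18.93, 81.07),
}

TOLERANCE = 0.02  # assumed allowance for rounding to two decimals


def is_consistent(row, tol=TOLERANCE):
    """Check the standard BUSCO identities on one table row."""
    complete, single, duplicated, fragmented, missing, comp_frag = row
    return (
        abs(single + duplicated - complete) <= tol
        and abs(complete + fragmented - comp_frag) <= tol
        and abs(complete + fragmented + missing - 100.0) <= tol
    )


if __name__ == "__main__":
    for name, row in busco_rows.items():
        print(name, "consistent" if is_consistent(row) else "inconsistent")
```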