Table 1 Summary statistics of the genomes, automatically annotated and manually annotated gene sets, and gene model properties for the seven analyzed species

From: Repertoire-wide gene structure analyses: a case study comparing automatically predicted and manually annotated gene models

  

| | | Holometabolous | Holometabolous | Holometabolous | Holometabolous | Hemimetabolous | Hemimetabolous | Hemimetabolous |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| | | Coleoptera | Coleoptera | Hymenoptera | Hymenoptera | Hemiptera | Hemiptera | Thysanoptera |
| | | Anoplophora glabripennis | Leptinotarsa decemlineata | Athalia rosae | Orussus abietinus | Cimex lectularius | Oncopeltus fasciatus | Frankliniella occidentalis |
| | Assembly size [Mbp] (% determined nucleotides) | 707.7 (85.1) | 1170.2 (58.0) | 163.8 (95.7) | 201.2 (92.7) | 650.5 (79.0) | 1098.7 (70.4) | 415.8 (63.4) |
| | AUTO | 22,253 | 24,732 | 11,956 | 10,966 | 14,085 | 19,587 | 18,021 |
| | OGS | 22,035 | 24,671 | 11,894 | 10,959 | 13,953 | 19,615 | 17,553 |
| | AUTO-SUB | 749 | 972 | 805 | 659 | 795 | 1013 | 1118 |
| | AUTO-SUB % of AUTO | 3.4 | 3.9 | 6.7 | 6.0 | 5.6 | 5.2 | 6.2 |
| | MAN-SUB | 770 | 933 | 825 | 670 | 778 | 945 | 1127 |
| | MAN-SUB % of OGS | 3.5 | 3.8 | 6.9 | 6.1 | 5.6 | 4.8 | 6.4 |
| | MAN-ADD | 216 | 98 | 50 | 30 | 221 | 161 | 381 |
| | MAN-ADD % of OGS | 1.0 | 0.4 | 0.4 | 0.3 | 1.6 | 0.8 | 2.2 |
| Median transcript length [bp] | AUTO-SUB | 6183 | 8562.5 | 4340 | 5200 | 4362 | 9324 | 5001.5 |
| | MAN-SUB | 5789.5 | 9280 | 3208 | 3996 | 4360 | 11,244 | 4064 |
| Median protein length [aa] | AUTO-SUB | 358 | 255 | 445 | 430 | 358 | 257 | 419.5 |
| | MAN-SUB | 389 | 300 | 423 | 419 | 372.5 | 320 | 419 |
| Median exon count p.t. | AUTO-SUB | 4 | 4 | 6 | 5 | 5 | 4 | 6 |
| | MAN-SUB | 4 | 4 | 5 | 5.5 | 5 | 4 | 6 |
| Median median exon length p.t. [bp] | AUTO-SUB | 1210 | 984 | 2220 | 2151 | 1200 | 1086 | 1807.5 |
| | MAN-SUB | 1345.5 | 1127 | 1786 | 1828 | 1194.5 | 1347 | 1755 |
| Median median intron length p.t. [bp] | AUTO-SUB | 354.75 | 1192 | 107.5 | 1278.25 | 75 | 126.75 | 108 |
| | MAN-SUB | 359 | 1363 | 100.5 | 1434 | 74 | 123 | 100.75 |

  1. Summary statistics on assemblies and manual annotation actions for each species and selected set-wide property values of MAN-SUB and AUTO-SUB
  2. aa: amino acids; bp: base pairs; det. nucs.: determined nucleotides (i.e., not N); Mbp: mega base pairs; OGS: official gene set; p.t.: per transcript
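
The percentage rows in Table 1 follow directly from the count rows, and the "median median" rows read as a two-level summary. The Python sketch below recomputes the three percentage rows from the counts given in the table and illustrates one plausible reading of "median median exon length p.t.": the median, across transcripts, of each transcript's median exon length. The function names `percent_of` and `median_of_medians` and the toy exon lengths are hypothetical and are not taken from the article; the article's own pipeline may differ.

```python
from statistics import median

# Gene model counts copied from Table 1, in column order:
# A. glabripennis, L. decemlineata, A. rosae, O. abietinus,
# C. lectularius, O. fasciatus, F. occidentalis
AUTO = [22253, 24732, 11956, 10966, 14085, 19587, 18021]
OGS = [22035, 24671, 11894, 10959, 13953, 19615, 17553]
AUTO_SUB = [749, 972, 805, 659, 795, 1013, 1118]
MAN_SUB = [770, 933, 825, 670, 778, 945, 1127]
MAN_ADD = [216, 98, 50, 30, 221, 161, 381]


def percent_of(part, whole):
    """Return part/whole in percent, per species, rounded to one decimal."""
    return [round(100 * p / w, 1) for p, w in zip(part, whole)]


def median_of_medians(lengths_per_transcript):
    """Median over transcripts of each transcript's median feature length.

    Assumed reading of the 'median median ... p.t.' rows; transcripts
    without any feature of the given kind are skipped.
    """
    return median(median(lengths) for lengths in lengths_per_transcript if lengths)


auto_sub_pct = percent_of(AUTO_SUB, AUTO)  # row "AUTO-SUB % of AUTO"
man_sub_pct = percent_of(MAN_SUB, OGS)     # row "MAN-SUB % of OGS"
man_add_pct = percent_of(MAN_ADD, OGS)     # row "MAN-ADD % of OGS"

# Hypothetical exon lengths [bp] for three transcripts (illustration only).
toy_exons = [[100, 200, 300], [150], [50, 250]]
toy_result = median_of_medians(toy_exons)

print(auto_sub_pct)
print(man_sub_pct)
print(man_add_pct)
print(toy_result)
```

Recomputing the percentages this way is a quick consistency check on the reconstructed table: each computed value should match the corresponding cell to one decimal place.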