
Table 1 Overview of assembly statistics

From: An improved genome assembly uncovers prolific tandem repeats in Atlantic cod

| Assembly | Total assembly size (Mbp) | N50 contig (kbp) | N50 scaffold (Mbp) | Percentage gap bases | CEGMA^a | BUSCO^b | REAPR^c | FRC bam^d | Potential conflict (sequences)^e |
|---|---|---|---|---|---|---|---|---|---|
| gadMor1^f | 832 | 2.3 | 0.14 | 26.9 | 444 (96.9%) | 3 308 (89.4%) | 2 547 | 4 210 772 | 76 |
| ALPILM | 660 | 4.4 | 0.16 | 28.7 | 424 (92.6%) | 3 016 (81.6%) | 19 787 | 2 182 096 | 122 |
| NEWB454 | 656 | 6.2 | 1.30 | 24.4 | 435 (95.0%) | 3 109 (84.1%) | 18 117 | 2 044 008 | 26 |
| CA454ILM | 647 | 9.9 | 0.50 | 3.49 | 447 (97.5%) | 3 379 (91.4%) | 7 406 | 1 351 500 | 96 |
| CA454PB | 682 | 95 | 0.27 | 1.62 | 431 (94.1%) | 3 310 (89.5%) | 8 617 | 1 508 054 | 188 |
| gadMor2^g | 643 | 116 | 1.15 | 1.69 | 435 (95.0%) | 3 447 (93.2%) | 7 359 | 1 248 792 | 15 |
  1. ^a CEGMA annotates 458 highly conserved eukaryotic genes
  2. ^b BUSCO annotates 3,698 Actinopterygii-specific genes
  3. ^c REAPR analyses the discordance between the expected order, orientation and distance of mapped paired reads; the value is the number of potential errors detected (fewer is better)
  4. ^d FRC bam uses a similar approach to REAPR; the value is the total number of features, i.e., potential assembly problems (fewer is better)
  5. ^e Number of sequences mapping to more than one linkage group or to multiple linkage groups (fewer is better)
  6. ^f From [5]
  7. ^g 93% of the gadMor2 assembly is additionally oriented and ordered into 23 linkage groups (Additional file 1: Table S3)
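
The percentages in the CEGMA and BUSCO columns can be related to the gene-set sizes given in the footnotes. The sketch below assumes each percentage is simply the gene count divided by the set size (458 for CEGMA, 3,698 for BUSCO); the table does not state the formula, so this is an assumption, and the names `completeness` and `max_deviation` are hypothetical helpers introduced only for illustration. The counts and percentages are copied from the table.

```python
# Gene-set sizes taken from the table footnotes (CEGMA and BUSCO).
CEGMA_TOTAL = 458
BUSCO_TOTAL = 3698

# (assembly, CEGMA count, reported CEGMA %, BUSCO count, reported BUSCO %),
# copied from Table 1.
ROWS = [
    ("gadMor1", 444, 96.9, 3308, 89.4),
    ("ALPILM", 424, 92.6, 3016, 81.6),
    ("NEWB454", 435, 95.0, 3109, 84.1),
    ("CA454ILM", 447, 97.5, 3379, 91.4),
    ("CA454PB", 431, 94.1, 3310, 89.5),
    ("gadMor2", 435, 95.0, 3447, 93.2),
]


def completeness(found: int, total: int) -> float:
    """Return the share of the reference gene set that was found, in percent."""
    return 100.0 * found / total


def max_deviation(rows=ROWS) -> float:
    """Largest absolute gap between recomputed and reported percentages."""
    gaps = []
    for _name, cegma, cegma_pct, busco, busco_pct in rows:
        gaps.append(abs(completeness(cegma, CEGMA_TOTAL) - cegma_pct))
        gaps.append(abs(completeness(busco, BUSCO_TOTAL) - busco_pct))
    return max(gaps)


if __name__ == "__main__":
    for name, cegma, cegma_pct, busco, busco_pct in ROWS:
        print(
            f"{name}: CEGMA {completeness(cegma, CEGMA_TOTAL):.2f}% "
            f"(reported {cegma_pct}%), "
            f"BUSCO {completeness(busco, BUSCO_TOTAL):.2f}% "
            f"(reported {busco_pct}%)"
        )
    print(f"largest deviation: {max_deviation():.3f} percentage points")
```

Under this assumption, every recomputed value lies within about 0.1 percentage point of the reported one. The small residual differences may reflect rounding in the original table or in the tools' own output.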