
Table 2 Assembly statistics for the original reference genomes and de novo long-read-derived genomes of Thalassiosira pseudonana and Phaeodactylum tricornutum. The mitochondrial and organellar genomes for both diatoms were assembled by Canu and Flye and were excluded from the assembly statistical analyses. All Canu and Flye assemblies were corrected first with long reads using Racon and Nanopolish, then with Illumina short reads using Pilon (see Methods and Materials for more details). The BUSCO odb9 eukaryotic database (303 genes) was used to assess the different assemblies. BUSCO scores are reported as total gene completeness (C), complete single-copy (S), complete duplicated (D) and fragmented (F) orthologs.

From: Re-examination of two diatom reference genomes using long-read sequencing

 

| Assembly | Total length (Mbp) | Read depth coverage | No. contigs | Largest contig (Mbp) | Contig N50 (Mbp) | Contig L50 | No. scaffolds | Largest scaffold (Mbp) | Scaffold N50 (Mbp) | Scaffold L50 | G+C content (%) | % identity to reference | BUSCO completeness | ALE score |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Phaeodactylum tricornutum | | | | | | | | | | | | | | |
| Reference (Bowler et al. 2008) | 27.4 | 9.6x | 179 | n/a | 0.42 | 20 | 88 (a) | 2.53 | 0.95 | 11 | 48.8 | n/a | C:82.5% S:80.2% D:2.3% F:5.9% | n/a |
| Canu | 57.0 | 40x | 291 | 2.51 | 0.25 | 43 | n/a | n/a | n/a | n/a | 48.7 | 99.3 | C:85.4% S:33.3% D:52.1% F:3.0% | -734,959,595 |
| Flye | 33.5 | 72x | 196 | 1.66 | 0.36 | 24 | n/a | n/a | n/a | n/a | 48.7 | 99.1 | C:80.9% S:71.0% D:9.9% F:4.6% | -781,367,384 |
| Canu-Bionano hybrid | 66.8 | n/a | n/a | n/a | n/a | n/a | 219 (b) | 2.78 | 1.06 | n/a | n/a | n/a | n/a | n/a |
| Thalassiosira pseudonana | | | | | | | | | | | | | | |
| Reference (Armbrust et al. 2004, Bowler et al. 2008) | 32.4 | n/a | 115 | n/a | 1.27 | 8 | 64 (c) | 3.04 | 1.99 | 7 | 46.9 | n/a | C:81.2% S:79.2% D:2.0% F:5.3% | n/a |
| Canu | 47.3 | 40x | 222 | 2.77 | 0.98 | 14 | n/a | n/a | n/a | n/a | 46.9 | 99.4 | C:79.2% S:59.7% D:19.5% F:6.6% | -1,238,092,187 |
| Flye | 33.8 | 48x | 52 | 2.76 | 1.38 | 8 | n/a | n/a | n/a | n/a | 47.0 | 99.4 | C:80.6% S:78.9% D:1.7% F:5.6% | -1,047,071,217 |

(a) The number of scaffolds reflects the 33 chromosome-level scaffolds and 55 unplaced, smaller contigs.

(b) The number of scaffolds for the Canu-Bionano hybrid includes both the 49 scaffolds that were assembled from the 138 long-read contigs that met the minimum length requirement (≥150 kb) for Bionano optical map anchoring and the 155 unanchored contigs <150 kb.

(c) The number of scaffolds reflects the 27 chromosome-level scaffolds and 37 unplaced, smaller contigs.
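
The BUSCO notation used in the table can be read programmatically. The sketch below is illustrative only and is not part of the authors' pipeline: it parses the summary strings copied from Table 2, checks that total completeness equals single-copy plus duplicated (C = S + D), and derives the percentage of missing orthologs as 100 - C - F, a value the table does not report directly. The names `BUSCO_SUMMARIES`, `parse_busco` and `missing_percent` are hypothetical helpers introduced for this example.

```python
import re

# BUSCO summary strings copied from Table 2 (assembly label -> summary).
BUSCO_SUMMARIES = {
    "P. tricornutum reference": "C:82.5% S:80.2% D:2.3% F:5.9%",
    "P. tricornutum Canu": "C:85.4% S:33.3% D:52.1% F:3.0%",
    "P. tricornutum Flye": "C:80.9% S:71.0% D:9.9% F:4.6%",
    "T. pseudonana reference": "C:81.2% S:79.2% D:2.0% F:5.3%",
    "T. pseudonana Canu": "C:79.2% S:59.7% D:19.5% F:6.6%",
    "T. pseudonana Flye": "C:80.6% S:78.9% D:1.7% F:5.6%",
}


def parse_busco(summary):
    """Turn 'C:82.5% S:80.2% ...' into {'C': 82.5, 'S': 80.2, ...}."""
    pairs = re.findall(r"([CSDF]):\s*([\d.]+)%", summary)
    return {key: float(value) for key, value in pairs}


def missing_percent(scores):
    """Percentage of orthologs neither complete nor fragmented (derived, not from the table)."""
    return round(100.0 - scores["C"] - scores["F"], 1)


for label, summary in BUSCO_SUMMARIES.items():
    scores = parse_busco(summary)
    # Complete = single-copy + duplicated; allow for rounding in the reported values.
    assert abs(scores["S"] + scores["D"] - scores["C"]) < 0.15, label
    print(f"{label}: duplicated {scores['D']}%, missing {missing_percent(scores)}%")
```

Listing the duplicated fraction side by side makes the contrast between the Canu and Flye assemblies of the same organism easy to see, for example 52.1% versus 9.9% for P. tricornutum.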