Table 1 Genome annotation statistics of P. wickerhamii, two closely related Chlorellales: A. protothecoides and Helicosporidium sp., and two pathogenic fungi: C. albicans and T. rubrum. Data acquired from GFF files available at NCBI Genome (https://www.ncbi.nlm.nih.gov/genome)

From: A first insight into the genome of Prototheca wickerhamii, a major causative agent of human protothecosis

| Characteristic | P. wickerhamii | A. protothecoides | Helicosporidium sp. | C. albicans | T. rubrum |
|---|---|---|---|---|---|
| Sequencing | | | | | |
| GenBank assembly accession (NCBI accession no. of assembly) | JADZLO010000000 | GCA_000733215.1 (ASM73321v1) | GCA_000690575.1 (Helico_v1.0) | GCA_000182965.3 (ASM18296v3) | GCA_000151425.1 (ASM15142v1) |
| Assembly length (Mb) | 16.7 | 22.9 | 12.4 | 14.3 | 22.5 |
| Contig number | 21 | 1386 | 5666 | 88 | 624 |
| N50 contig | 1,578,614 | 35,091 | 3036 | 334,289 | 83,988 |
| Scaffold number | 21 | 374 | 5666 | 8 | 36 |
| N50 scaffold | 1,578,614 | 285,543 | 3036 | 2,231,883 | 2,156,965 |
| Genome coverage (fold) | ca. 150x | 145x | 62x | 700x | 8.19x (7.49x Q > 20) |
| Sequencing platform | PacBio; Illumina MiSeq 2 × 300 | 454 GS FLX Titanium; Illumina HiSeq 2000 | Illumina HiSeq; Illumina GAIIx | Illumina GAIIx | Sanger ABI |
| GC content | | | | | |
| GC content total (%) | 64.5 | 63.5 | 61.7 | 33.5 | 48.3 |
| GC content exons (%) | 68.7 | 68.1 | 66.5 | 35.1 | 51 |
| GC content introns (%) (between exons) | 60.9 | 63 | 58.8 | 29.6 | 43 |
| GC content intergenic regions (%) | 58.2 | 58.1 | 58.4 | 30.7 | 45.3 |
| Protein coding genes | | | | | |
| Number of genes | 6081 | 7016 | 6033 | 6263 | 8804 |
| Average gene length (bp) | 2135 | 2347 | 1031 | 1447 | 1572 |
| Average exon length (bp) | 288 | 206 | 366 | 1336 | 454 |
| Average no. of exons per gene | 5.1 | 5.7 | 2.2 | 1.1 | 3.1 |
| Average intron length (bp) | 162.8 | 247.2 | 170.2 | 146.1 | 85.4 |
| Average no. of introns per gene | 4.1 | 4.7 | 1.2 | 0.1 | 2.1 |
| Genes with introns (%) | 97.4 | 88.7 | 56.3 | 6.8 | 81.7 |
| Mean intergenic length (bp) | 1734.1 | 2184.4 | 1027 | 937.8 | 1108.1 |
| Coding sequence ratio (%)* | 2.7 | 3.26 | 2.06 | 2.28 | 2.56 |
| Percentage coding | 45.9 | 36.1 | 39.9 | 62.6 | 53.7 |
| Gene density (genes per Mb) | 365.1 | 306.4 | 486.5 | 438.0 | 391.3 |
| tRNA genes | 64 | 71 | 29 | 126 | 100 |
| Repetitive DNA in genome assembly (%) | 2.25 | 1.98 | 1.23 | 4.6 | 1.89 |

*Coding sequence ratio = assembly length (Mb) / number of genes × 1000, i.e. the average span of assembly per gene, expressed in kb.
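The footnote formula and the gene density row can be re-derived from the assembly lengths and gene counts reported in Table 1. The following is a minimal sketch, not the authors' pipeline: the function names and the `TABLE_1` dictionary are hypothetical, and the inputs are the rounded values printed in the table rather than the underlying GFF data.

```python
# Illustrative only: inputs are the rounded values printed in Table 1,
# given as (assembly length in Mb, number of protein-coding genes).
TABLE_1 = {
    "P. wickerhamii": (16.7, 6081),
    "A. protothecoides": (22.9, 7016),
    "Helicosporidium sp.": (12.4, 6033),
    "C. albicans": (14.3, 6263),
    "T. rubrum": (22.5, 8804),
}


def coding_sequence_ratio(assembly_mb: float, n_genes: int) -> float:
    """Footnote formula: assembly length (Mb) / number of genes * 1000."""
    return assembly_mb / n_genes * 1000


def gene_density(assembly_mb: float, n_genes: int) -> float:
    """Genes per Mb of assembly."""
    return n_genes / assembly_mb


for species, (mb, genes) in TABLE_1.items():
    ratio = coding_sequence_ratio(mb, genes)
    density = gene_density(mb, genes)
    print(f"{species}: ratio = {ratio:.2f}, gene density = {density:.1f} genes/Mb")
```

With these rounded inputs the computed ratios agree with the table. The gene density for P. wickerhamii comes out at about 364.1 rather than the tabulated 365.1, which would be consistent with the published figure having been computed from an unrounded assembly length, although that is an assumption.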