Table 1 Summary statistics for individual and merged assemblies

From: De novo sequence assembly and characterisation of a partial transcriptome for an evolutionarily distinct reptile, the tuatara (Sphenodon punctatus)

| Kmer | Assembly | No. transcripts >100 bp | N50 | Mean length | Max length | Total no. bases |
|---|---|---|---|---|---|---|
| 21 | Initial | 33024 | 844 | 525 | 5689 | 17,354,832 |
| | Representative | 29082 | 786 | 501 | 5659 | 14,561,997 |
| 25 | Initial | 28723 | 746 | 491 | 5689 | 14,105,603 |
| | Representative | 26715 | 706 | 474 | 5659 | 12,660,658 |
| 29 | Initial | 26236 | 615 | 431 | 5689 | 11,307,053 |
| | Representative | 25016 | 590 | 419 | 5659 | 10,488,297 |
| 33 | Initial | 23648 | 488 | 363 | 5584 | 8,591,562 |
| | Representative | 22972 | 469 | 355 | 5584 | 8,148,996 |
| 37 | Initial | 19180 | 369 | 311 | 5111 | 5,898,486 |
| | Representative | 18821 | 357 | 301 | 5111 | 5,664,511 |
| 41 | Initial | 12230 | 281 | 263 | 5750 | 3,218,609 |
| | Representative | 12090 | 273 | 258 | 5750 | 3,122,927 |
| Merged | | 35680 | 747 | 479 | 5750 | 17,086,468 |
| Final | | 32911 | 675 | 451 | 5659 | 14,828,283 |
| Annotated | | 15965 | 927 | 586 | 5659 | 9,357,209 |

For each kmer, data are shown for both the initial Velvet/Oases assembly (Initial) and the assembly containing only one representative transcript from each locus (Representative). The “Merged” assembly is the result of merging the representative assemblies from different kmers using CD-HIT-EST; the “Final” assembly is what remains after potentially misassembled transcripts were removed; and the “Annotated” set contains only transcripts with a significant BLAST match. Kmer = required length of overlap match between two reads in Velvet; N50 = length-weighted median contig length.
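
The summary columns can be illustrated with a minimal Python sketch, assuming only a list of transcript lengths in bp. The function names `n50` and `summarise`, the `min_length` parameter, and the toy lengths are hypothetical and are not taken from the study. The sketch uses the common convention for N50 (the length of the transcript at which the running sum of lengths, sorted longest first, reaches half of the total), which may differ in detail from the tool used to produce the table.

```python
def n50(lengths):
    """Length-weighted median: the length at which the cumulative sum of
    lengths, sorted longest first, reaches at least half of the total."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def summarise(lengths, min_length=100):
    """Summary statistics for transcripts strictly longer than min_length bp.

    The strict '>' mirrors the '>100 bp' column header; this reading of the
    header is an assumption.
    """
    kept = [n for n in lengths if n > min_length]
    if not kept:
        raise ValueError("no transcripts longer than min_length")
    return {
        "n_transcripts": len(kept),
        "n50": n50(kept),
        "mean_length": sum(kept) / len(kept),
        "max_length": max(kept),
        "total_bases": sum(kept),
    }


# Toy lengths for illustration only; these are not data from the table.
toy_lengths = [100, 200, 300, 400, 500, 1000]
stats = summarise(toy_lengths)
```

In the toy example the 100 bp transcript is excluded by the length filter, and the two longest transcripts (1000 and 500 bp) together account for more than half of the remaining 2400 bases, so the N50 is 500 even though the mean length is 480.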