Table 1 Macadamia integrifolia genome and transcriptome sequencing, assembly and annotation statistics

From: Genome and transcriptome sequencing characterises the gene space of Macadamia integrifolia (Proteaceae)

| Library type | Reads post QC (millions) | Nucleotides post QC (gigabases) |
| --- | --- | --- |
| Genome sequencing: | | |
| Illumina GAIIx 480 bp Insert (2x150 bp PE) | 101.7 | 30.51 |
| Illumina GAIIx 700 bp Insert (2x150 bp PE) | 48.6 | 14.58 |
| Illumina HiSeq 8000 bp Insert (2x100 bp MP) | 32.4 | 6.48 |
| Total | 182.7 | 51.57 |
| Transcriptome sequencing: | | |
| Illumina HiSeq Flower (2x100 bp PE) | 82.1 | 16 |
| Illumina HiSeq Shoot (2x100 bp PE) | 70 | 13.7 |
| Illumina HiSeq Leaf (2x100 bp PE) | 76 | 14.9 |
| Total | 228.1 | 44.6 |

| Genome assembly | Contigs | Scaffolds |
| --- | --- | --- |
| Number | 210,726 | 193,493 |
| Minimum size (bp) | 388 | 500 |
| Maximum size (bp) | 379,349 | 643,490 |
| N50 (bp) | 3,522 | 4,745 |
| Total assembly length (Mb) | 477 | 518 |

| Transcriptome assembly | Statistics |
| --- | --- |
| Number of transcripts | 298,030 |
| Maximum transcript length (bp) | 17,814 |
| Minimum transcript length (bp) | 224 |
| Mean transcript length (bp) | 823 |
| Standard deviation (bp) | 886 |
| Total length (bp) | 245,373,045 |
| N50 (bp) | 1,339 |

| Genome annotation | Statistics |
| --- | --- |
| Number of gene models | 35,337 |
| Average gene length (bp) | 2,518 |
| Average coding sequence length (bp) | 1,090 |
| Gene models similar to Arabidopsis thaliana TAIR10 (a) | 74% |
| Gene models similar to Nelumbo nucifera (a) | 79% |
| Eukaryotic 458 CORE genes available (a) | 96% |

(a) BLASTP, E-value cutoff 1e-05
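
The table reports N50 for the contigs, scaffolds and transcripts without stating how it is calculated. The sketch below uses the common definition: the length at which sequences, sorted from longest to shortest, first cover at least half of the total length. It is only an illustration and is not taken from the authors' pipeline. The function names and the toy lengths are hypothetical. The second helper checks the genome sequencing rows under the assumption that the "Reads post QC" column counts read pairs, which reproduces the listed gigabase values for those three libraries. The transcriptome rows do not reproduce exactly under the same assumption (for example 82.1 x 2 x 100 bp gives 16.42, against the 16 listed), and the table gives no reason for the difference.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Common definition (assumed here): the length L such that sequences
    of length >= L together cover at least half of the total length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


def paired_end_gigabases(million_pairs, read_length_bp):
    """Nucleotides in gigabases, assuming two reads of equal length per pair."""
    return million_pairs * 2 * read_length_bp / 1000


# Toy lengths (hypothetical, not macadamia data).
toy_lengths = [10, 8, 6, 4, 2]
toy_n50 = n50(toy_lengths)

# Genome sequencing rows from the table: (million pairs, read length in bp).
genome_rows = [(101.7, 150), (48.6, 150), (32.4, 100)]
genome_gigabases = [paired_end_gigabases(m, r) for m, r in genome_rows]
```

For the toy lengths the total is 30, and the two longest sequences (10 + 8 = 18) are the first to reach half of it, so the N50 is 8.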