Table 2 Transcriptome assembly quality evaluation metrics

From: A genome-wide transcriptome map of pistachio (Pistacia vera L.) provides novel insights into salinity-related genes and marker discovery

| Parameters | CLC | SOAPdenovo-Trans | Trinity | Merged assembly (SOAPdenovo-Trans) | Merged assembly (CLC) |
| --- | --- | --- | --- | --- | --- |
| Number of contigs | 83,390 | 85,091 | 144,103 | 93,865 | 90,632 |
| Average transcript length (bp) | 787 | 695 | 1,139 | 885 | 945 |
| Minimum transcript length (bp) | 300 | 300 | 300 | 300 | 300 |
| Maximum transcript length (bp) | 12,370 | 12,097 | 13,939 | 13,329 | 12,167 |
| N50 (bp) | 909 | 788 | 1,679 | 1,300 | 1,450 |
| N90 (bp) | 392 | 361 | 489 | 397 | 425 |
| Percentage of contigs ≥1 kb | 20.92 | 16.47 | 41.44 | 30 | 33.83 |
| Percentage of reads mapped back to assembly | 88.51 | 83.81 | 97.49 | 93.54 | 96.98 |
| Percentage of mapped reads in pairs | 78.23 | 70.71 | 89.93 | 83.67 | 87.76 |
| Percentage of mapped broken paired reads | 10.28 | 13.1 | 7.86 | 9.54 | 9.22 |
| Percentage of complete core proteins by CEGMA analysis | 79.03 | 69.35 | 97.98 | 95.16 | 94.35 |
| Percentage of partial core proteins by CEGMA analysis | 93.15 | 93.15 | 99.6 | 97.74 | 98.79 |
| Number of unique proteins found in blastx | 20,782 | 20,872 | 25,065 | 20,789 | 20,911 |
| Number of unique contigs hit by proteins in tblastn | 44,479 | 44,503 | 44,807 | 44,302 | 44,803 |
| Number of unique contigs with reciprocal best hits | 13,730 | 13,724 | 15,690 | 14,003 | 14,101 |
| Number of unique contigs with orthologue hit ratio of 0.8–1 | 5,963 | 7,249 | 12,903 | 7,750 | 8,423 |
| RSEM-EVAL score | −19,590,755,435 | −22,457,309,826 | −11,702,493,371 | −14,998,853,445 | −13,665,328,361 |
  1. Transcriptome assembly evaluation metrics for single k-mer (k = 25) assemblies generated by CLC Genomics Workbench, SOAPdenovo-Trans and Trinity, as well as for the merged assemblies built with k-mer lengths ranging from 25 to 63 with a step size of 4
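
The N50 and N90 rows follow the usual length-weighted definition: Nx is the length of the shortest contig in the smallest set of longest contigs that together cover at least x% of all assembled bases. The sketch below is a minimal illustration of that definition, not the pipeline used to produce the table; the function names `nx` and `percent_at_least` and the toy contig lengths are assumptions made for the example.

```python
def nx(lengths, x):
    """Return the Nx statistic for a list of contig lengths (in bp).

    Contigs are visited from longest to shortest; the result is the length
    of the contig at which the running sum first reaches x% of all bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length


def percent_at_least(lengths, cutoff):
    """Return the percentage of contigs whose length is >= cutoff (in bp)."""
    return 100 * sum(1 for length in lengths if length >= cutoff) / len(lengths)


# Hypothetical toy assembly, not data from the table.
toy_lengths = [300, 400, 500, 800, 1000, 2000]
n50 = nx(toy_lengths, 50)
n90 = nx(toy_lengths, 90)
share_1kb = percent_at_least(toy_lengths, 1000)
```

For the toy lengths the total is 5,000 bp, so N50 is reached at the 1,000 bp contig and N90 at the 400 bp contig. Because N90 requires covering more bases, it is never larger than N50, which matches the ordering of the two rows in every column of the table.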