Table 2 Transcriptome assembly quality evaluation metrics

From: A genome-wide transcriptome map of pistachio (Pistacia vera L.) provides novel insights into salinity-related genes and marker discovery

| Parameters | CLC | SOAPdenovo-Trans | Trinity | Merged assembly (SOAPdenovo-Trans) | Merged assembly (CLC) |
| --- | --- | --- | --- | --- | --- |
| Number of contigs | 83,390 | 85,091 | 144,103 | 93,865 | 90,632 |
| Average transcript length (bp) | 787 | 695 | 1139 | 885 | 945 |
| Minimum transcript length (bp) | 300 | 300 | 300 | 300 | 300 |
| Maximum transcript length (bp) | 12,370 | 12,097 | 13,939 | 13,329 | 12,167 |
| N50 (bp) | 909 | 788 | 1679 | 1300 | 1450 |
| N90 (bp) | 392 | 361 | 489 | 397 | 425 |
| Percentage of contigs ≥1 kb | 20.92 | 16.47 | 41.44 | 30 | 33.83 |
| Percentage of reads mapped back to assembly | 88.51 | 83.81 | 97.49 | 93.54 | 96.98 |
| Percentage of mapped reads in pairs | 78.23 | 70.71 | 89.93 | 83.67 | 87.76 |
| Percentage of mapped broken paired reads | 10.28 | 13.1 | 7.86 | 9.54 | 9.22 |
| Percentage of complete core proteins by CEGMA analysis | 79.03 | 69.35 | 97.98 | 95.16 | 94.35 |
| Percentage of partial core proteins by CEGMA analysis | 93.15 | 93.15 | 99.6 | 97.74 | 98.79 |
| Number of unique proteins found by blastx | 20,782 | 20,872 | 25,065 | 20,789 | 20,911 |
| Number of unique contigs hit by proteins in tblastn | 44,479 | 44,503 | 44,807 | 44,302 | 44,803 |
| Number of unique contigs with reciprocal best hits | 13,730 | 13,724 | 15,690 | 14,003 | 14,101 |
| Number of unique contigs with orthologue hit ratio of 0.8–1 | 5963 | 7249 | 12,903 | 7750 | 8423 |
| RSEM-EVAL score | −19,590,755,435 | −22,457,309,826 | −11,702,493,371 | −14,998,853,445 | −13,665,328,361 |

  1. Transcriptome assembly evaluation metrics for the single k-mer (k = 25) assemblies generated by CLC Genomics Workbench, SOAPdenovo-Trans and Trinity, and for the merged assemblies built with k-mer lengths ranging from 25 to 63 with a step size of 4
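
The length-based rows of the table (N50, N90 and the percentage of contigs ≥1 kb) follow from the list of contig lengths alone. The sketch below illustrates the standard definitions and is not the authors' pipeline: the function names `nx` and `percent_at_least` and the toy length list are hypothetical, chosen only for illustration.

```python
def nx(lengths, fraction):
    """Return the Nx length: the contig length at which the running sum of
    lengths, taken from longest to shortest, first reaches `fraction` of
    the total assembly length (fraction=0.5 gives N50, 0.9 gives N90)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    threshold = sum(lengths) * fraction
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    return min(lengths)  # only reached if fraction > 1


def percent_at_least(lengths, cutoff=1000):
    """Percentage of contigs whose length is at least `cutoff` bp."""
    return 100.0 * sum(1 for n in lengths if n >= cutoff) / len(lengths)


# Hypothetical toy assembly (lengths in bp), not data from the paper.
toy_lengths = [300, 400, 500, 900, 1200, 1700]
n50 = nx(toy_lengths, 0.5)
n90 = nx(toy_lengths, 0.9)
pct_1kb = percent_at_least(toy_lengths)
```

In practice the lengths would be read from the assembly FASTA file; the toy list keeps the example self-contained.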