Table 1 Summarised EnTAP annotation statistics

From: Defence transcriptome assembly and pathogenesis related gene family analysis in Pinus tecunumanii (low elevation)

 

| Statistic | Pinus tecunumanii (LE): PnteEviGene | Pinus tecunumanii (LE): Pnte_v1.0 | Pinus patula: PiptEviGene | Pinus patula: Pipt_v2.0 |
| --- | --- | --- | --- | --- |
| Assembly Statistics | | | | |
| Total Sequences | 91,552 | 28,621 | 325,974 | 52,735 |
| Total Transcriptome Length (Mb) | 100.95 | 50.01 | 259.02 | 72.15 |
| Average Sequence Length (nt) | 1103 | 1747 | 794 | 1368 |
| N50 (nt) | 1551 | 2296 | 870 | 1897 |
| Longest Sequence (nt) | 16,886 | 16,886 | 16,570 | 16,570 |
| Shortest Sequence (nt) | 351 | 351 | 351 | 351 |
| %GC | 48% | 44% | 50% | 46% |
| Sequence filtering (a) | | | | |
| Sequences with FPKM > 1 | 77,563 (84.72%) | | 203,996 (62.58%) | |
| Sequences with GeneMarkS-T predicted CDS | 74,556 (81.44%) | | 194,568 (59.69%) | |
| Total proteins after clustering to 90% identity | 70,748 (77.28%) | | 167,961 (51.53%) | |
| Annotation (b) | | | | |
| Predicted protein frame | | | | |
| Complete | 32,193 (45.50%) | 18,971 (66.27%) | 61,134 (36.40%) | 26,835 (50.89%) |
| Internal | 18,279 (25.84%) | 3347 (11.70%) | 57,027 (33.95%) | 9602 (18.21%) |
| 3′-partial | 6686 (9.45%) | 1895 (6.63%) | 21,848 (13.01%) | 4354 (8.26%) |
| 5′-partial | 17,398 (24.59%) | 4408 (15.40%) | 54,559 (32.48%) | 11,944 (22.65%) |
| Similarity Search Annotation | | | | |
| Sequences with informative BLASTp alignments | 19,296 (27.27%) | 15,192 (53.09%) | 38,714 (23.05%) | 27,328 (51.82%) |
| Sequences with uninformative BLASTp alignments | 15,437 (21.82%) | 6131 (21.41%) | 34,101 (20.30%) | 9261 (17.56%) |
| Sequences without BLASTp alignments | 36,015 (50.91%) | 7300 (25.50%) | 95,146 (56.65%) | 16,147 (30.62%) |
| Functional Annotation | | | | |
| Sequences with family assignment | 55,627 (78.63%) | 28,484 (99.52%) | 128,952 (76.77%) | 52,166 (98.92%) |
| Sequences with at least one GO term | 31,640 (44.72%) | 16,197 (56.60%) | 77,063 (45.88%) | 33,712 (63.93%) |
| Sequences with at least one pathway (KEGG) assignment | 17,303 (24.46%) | 8004 (27.98%) | 47,383 (28.21%) | 21,094 (40.00%) |
| Annotation Summary | | | | |
| Unannotated Sequences | 14,568 (20.59%) | 0 (0.00%) | 36,698 (21.85%) | 0 (0.00%) |
| Total sequences annotated | 56,180 (79.41%) | 28,621 (100.00%) | 131,263 (78.15%) | 52,735 (100.00%) |
| Non-pine origin sequences | 27,550 (38.94%) | 0 (0.00%) | 78,527 (46.75%) | 0 (0.00%) |

(a) Percentages are relative to the EviGene assemblies.
(b) Percentages are relative to the clustered GeneMarkS-T assemblies for the EviGene columns and relative to total sequences for Pnte_v1.0 and Pipt_v2.0.
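
The two footnotes define different denominators for the percentages in Table 1. As a minimal sketch of that arithmetic, the Python snippet below recomputes the "Sequences with FPKM > 1" and "Total sequences annotated" rows from the counts in the table. The variable and function names (`percent`, `annotation_denominator`, and the dictionaries) are hypothetical and used only for illustration; they are not part of EnTAP or the authors' pipeline.

```python
# Counts copied from Table 1; all names below are illustrative assumptions.
total_sequences = {
    "PnteEviGene": 91552,
    "Pnte_v1.0": 28621,
    "PiptEviGene": 325974,
    "Pipt_v2.0": 52735,
}
# "Total proteins after clustering to 90% identity" (EviGene columns only).
clustered_proteins = {"PnteEviGene": 70748, "PiptEviGene": 167961}

fpkm_gt_1 = {"PnteEviGene": 77563, "PiptEviGene": 203996}
total_annotated = {
    "PnteEviGene": 56180,
    "Pnte_v1.0": 28621,
    "PiptEviGene": 131263,
    "Pipt_v2.0": 52735,
}


def percent(count, denominator):
    """Return count as a percentage of denominator, rounded to 2 decimals."""
    return round(100 * count / denominator, 2)


def annotation_denominator(assembly):
    """Footnote b: clustered proteins for EviGene columns, else total sequences."""
    return clustered_proteins.get(assembly, total_sequences[assembly])


# Footnote a: filtering percentages are relative to the EviGene assembly size.
filtering_pct = {
    name: percent(count, total_sequences[name])
    for name, count in fpkm_gt_1.items()
}

# Footnote b: annotation percentages use the assembly-specific denominator.
annotation_pct = {
    name: percent(count, annotation_denominator(name))
    for name, count in total_annotated.items()
}

for name in total_sequences:
    print(name, filtering_pct.get(name, "n/a"), annotation_pct[name])
```

Other rows can be checked the same way by swapping in their counts; small last-digit differences from the published values are possible.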