
Table 4 Summary of co-identity in redundancy-removed contigs

From: Removal of redundant contigs from de novo RNA-Seq assemblies via homology search improves accurate detection of differentially expressed genes

 

| | Raw Contigs | Longest Contigs | Clustered Contigs | Annotated Contigs | Plant DB Contigs | PlantClust50 DB Contigs |
| --- | --- | --- | --- | --- | --- | --- |
| Correlation coefficient of log2-fold change between gene dataset and reference sequences | 0.60 | 0.60 | 0.88 | 0.88 | 0.83 | 0.73 |
| No. of DECs exhibiting co-identity with DEGs^a | 8602 (73.8 %) | 8331 (74.2 %) | 8506 (75.8 %) | 8303 (74.0 %) | 7594 (67.7 %) | 5878 (52.4 %) |
| No. of DECs redundant or not exhibiting co-identity with DEGs^b | 15,706 (64.6 %) | 14,219 (63.1 %) | 28,640 (77.1 %) | 1372 (14.2 %) | 9052 (54.4 %) | 4809 (45.0 %) |
| P-value^c | 0.44 | 0.20 | 0.01 | 1.00 | 0.93 | 0.93 |
| Fisher’s exact test^d | 30 | 30 | 31 | 4 | 29 | 17 |

  a. Number of contigs identical to DEGs in the gene dataset (values in parentheses are the percentage of all DEGs identical to DECs)
  b. Number of contigs not corresponding to DEGs in the gene dataset (values in parentheses are the percentage of contigs without corresponding genes in the contig group)
  c. Correlation between the gene database and the contig group in GO slim term distribution, calculated by the Kolmogorov–Smirnov test
  d. No. of GO terms significantly different from the gene dataset in annotation count
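
The first row of the table reports a correlation coefficient of log2-fold change between each contig group and the reference. As a rough illustration of how such a value could be computed, the sketch below assumes a Pearson coefficient and a pseudocount of 1 when taking log ratios; the table does not state either choice. The function names `log2_fold_change` and `pearson` and all expression values are hypothetical and are not taken from the study.

```python
import math


def log2_fold_change(treated, control, pseudocount=1.0):
    """Return log2((treated + p) / (control + p)); the pseudocount p avoids log(0)."""
    return math.log2((treated + pseudocount) / (control + pseudocount))


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up (control, treated) expression values for four contigs and the
# reference genes they are paired with. Illustrative only, not study data.
contig_expr = [(10, 40), (25, 20), (5, 80), (100, 90)]
gene_expr = [(12, 50), (30, 28), (4, 60), (95, 100)]

contig_lfc = [log2_fold_change(t, c) for c, t in contig_expr]
gene_lfc = [log2_fold_change(t, c) for c, t in gene_expr]

r = pearson(contig_lfc, gene_lfc)
print(f"correlation of log2-fold change: {r:.2f}")
```

A coefficient near 1 would indicate that fold changes measured on contigs track those measured on the paired reference genes. Values such as 0.60 for the raw contigs would indicate weaker agreement.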