Table 1 Protein coding sequence features of the various sets analysed

From: The parasite Trichomonas vaginalis expresses thousands of pseudogenes and long non-coding RNAs independently from functional neighbouring genes

Spanning column header: CDSN

| Category | TVAG (1) | CDSP | PSEUDO | LNCRNA | INTG (2) | RNDN (3) |
|---|---|---|---|---|---|---|
| Number | 59672 | 22609 | 1976 | 2175 | 4606 | 4606 |
| Median longest ORF length | 636 | 1002 | 195 | 165 | 156 | 120 |
| Mean longest ORF length | 917.64 | 1320.23 | 286.64 | 262.63 | 199.45 | 127.05 |
| Median relative longest ORF | 99.58% | 89.19% | 42.11% | 44.69% | 34.31% | 24.52% |
| Longest ORF ≥50 aa | 99.59% | 98.92% | 64.83% | 55.82% | 53.58% | 26.90% |
| Proportion of stop codons (4) | 0.29% | 1.45% | 3.02% | 3.08% | 4.16% | 5.38% |
| GC-Content | 35.49% | 34.62% | 31.07% | 29.42% | 27.82% | 30.52% |

  (1) Annotated protein-coding genes.
  (2) Intergenic regions without expression evidence randomly selected in size of CDSN.
  (3) Order of nucleotides randomized per sequence.
  (4) In reading frame with lowest number of stop codons.
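
The per-sequence features in the table (longest ORF, relative longest ORF, proportion of stop codons, GC-content) and the randomized control of footnote (3) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes uppercase DNA of at least three nucleotides, the standard stop codons TAA, TAG and TGA, forward reading frames only, an ORF defined as a stop-free run of codons with no start codon required, and a relative longest ORF defined as ORF length divided by sequence length. All function names are hypothetical.

```python
import random

# Assumption: standard genetic code stop codons, uppercase DNA input.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def frame_codons(seq, frame):
    """Return the complete codons of seq read in forward frame 0, 1 or 2."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def gc_content(seq):
    """Fraction of nucleotides that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_orf_length(seq):
    """Nucleotide length of the longest stop-free codon run in any forward frame.

    Assumption: no start codon is required and only the forward strand is read.
    """
    best = 0
    for frame in range(3):
        run = 0
        for codon in frame_codons(seq, frame):
            run = 0 if codon in STOP_CODONS else run + 1
            best = max(best, run)
    return best * 3


def min_stop_proportion(seq):
    """Stop-codon share in the forward frame with the fewest stop codons.

    Mirrors footnote (4); ties go to the first such frame. Needs len(seq) >= 3.
    """
    per_frame = []
    for frame in range(3):
        codons = frame_codons(seq, frame)
        if codons:
            stops = sum(codon in STOP_CODONS for codon in codons)
            per_frame.append((stops, len(codons)))
    stops, total = min(per_frame, key=lambda pair: pair[0])
    return stops / total


def shuffle_nucleotides(seq, rng):
    """Randomize nucleotide order within one sequence, as in footnote (3)."""
    chars = list(seq)
    rng.shuffle(chars)
    return "".join(chars)


if __name__ == "__main__":
    seq = "ATGAAATAGCCC"  # toy sequence, not taken from the data set
    rng = random.Random(0)  # fixed seed so the shuffle is reproducible
    orf = longest_orf_length(seq)
    print("GC-content:", gc_content(seq))
    print("Longest ORF (nt):", orf)
    print("Relative longest ORF:", orf / len(seq))
    print("Proportion of stop codons:", min_stop_proportion(seq))
    print("Shuffled:", shuffle_nucleotides(seq, rng))
```

Shuffling within a sequence keeps its length and base composition and only destroys nucleotide order, so a shuffled set can serve as a composition-matched baseline for ORF length and stop codon frequency.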