Table 7 Selected nucleotide pattern frequencies for human and mouse data

From: A Support Vector Machine based method to distinguish long non-coding RNAs from protein coding transcripts

 

Row   GRCh37 and GRCm38          GRCh38 and GRCm38
1     aa, aaa, ac, aca, acg      aa, aaa, ac, aca, acg
2     act, ag, aga, at, ata      act, ag, aga, at, ata
3     atc, atg, att, ca, caa     atc, atg, att, ca, caa
4     cac, cag, cat, cc, cca     cac, cag, cat, cc, ccc
5     ccc, cg, cgc, ct, cta      cg, cgc, ct, cta, ctc
6     ctc, ctg, ga, gag, gc      ctg, ga, gac, gag, gc
7     gcg, gg, ggg, gt, gtc      gcg, gg, ggg, gt, gtc
8     gtg, ta, tac, tag, tat     gtg, ta, tac, tag, tat
9     tc, tca, tcg, tct, tg      tc, tca, tcg, tct, tg
10    tga, tgt, tt, ttg, ttt     tga, tgt, tt, ttg, ttt

  1. The GRCh37, GRCh38 and GRCm38 data sets were analyzed to identify the 50 pattern frequencies with the highest PCA loadings. The patterns “cca” (GRCh37 and GRCm38 only) and “gac” (GRCh38 and GRCm38 only) are the only differences between the two selections.
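
As an illustration only, the following Python sketch compares the two pattern selections listed in Table 7 and computes pattern frequencies for a short sequence. The function name `pattern_frequencies`, the toy sequence, and the normalization (overlapping windows, count divided by the number of windows of that length) are assumptions made for this sketch and may differ from the definition used in the paper. The PCA step that produced the selections is not reproduced here.

```python
from collections import Counter

# Patterns from the "GRCh37 and GRCm38" column of Table 7.
GRCH37_GRCM38 = """
aa aaa ac aca acg act ag aga at ata
atc atg att ca caa cac cag cat cc cca
ccc cg cgc ct cta ctc ctg ga gag gc
gcg gg ggg gt gtc gtg ta tac tag tat
tc tca tcg tct tg tga tgt tt ttg ttt
""".split()

# Patterns from the "GRCh38 and GRCm38" column of Table 7.
GRCH38_GRCM38 = """
aa aaa ac aca acg act ag aga at ata
atc atg att ca caa cac cag cat cc ccc
cg cgc ct cta ctc ctg ga gac gag gc
gcg gg ggg gt gtc gtg ta tac tag tat
tc tca tcg tct tg tga tgt tt ttg ttt
""".split()


def pattern_frequencies(sequence, patterns):
    """Return the relative frequency of each pattern in the sequence.

    Assumption: overlapping windows are counted, and each count is divided
    by the number of windows of the same length as the pattern.
    """
    seq = sequence.lower()
    freqs = {}
    for k in sorted({len(p) for p in patterns}):
        windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        counts = Counter(windows)
        total = len(windows)
        for p in patterns:
            if len(p) == k:
                freqs[p] = counts[p] / total if total else 0.0
    return freqs


# Patterns that appear in exactly one of the two selections.
differences = sorted(set(GRCH37_GRCM38) ^ set(GRCH38_GRCM38))

# Hypothetical toy transcript, used only to exercise the function.
toy_sequence = "ATGCCAGACTTT"
toy_freqs = pattern_frequencies(toy_sequence, GRCH38_GRCM38)

if __name__ == "__main__":
    print("Differences between selections:", differences)
    print("Frequency of 'gac' in toy sequence:", toy_freqs["gac"])
```

Each selection contains 50 patterns, so a transcript maps to a 50-dimensional feature vector under this sketch; the symmetric difference of the two sets recovers the two patterns named in the table note.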