Table 7 Selected nucleotide pattern frequencies for human and mouse data

From: A Support Vector Machine based method to distinguish long non-coding RNAs from protein coding transcripts

Row  GRCh37 and GRCm38         GRCh38 and GRCm38
1    aa, aaa, ac, aca, acg     aa, aaa, ac, aca, acg
2    act, ag, aga, at, ata     act, ag, aga, at, ata
3    atc, atg, att, ca, caa    atc, atg, att, ca, caa
4    cac, cag, cat, cc, *cca*  cac, cag, cat, cc, ccc
5    ccc, cg, cgc, ct, cta     cg, cgc, ct, cta, ctc
6    ctc, ctg, ga, gag, gc     ctg, ga, *gac*, gag, gc
7    gcg, gg, ggg, gt, gtc     gcg, gg, ggg, gt, gtc
8    gtg, ta, tac, tag, tat    gtg, ta, tac, tag, tat
9    tc, tca, tcg, tct, tg     tc, tca, tcg, tct, tg
10   tga, tgt, tt, ttg, ttt    tga, tgt, tt, ttg, ttt

Note: The GRCh37, GRCh38 and GRCm38 data sets were analyzed to identify the 50 pattern frequencies with the highest PCA loadings. The patterns “cca” and “gac”, marked with asterisks, are the only differences between the two columns.
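
The claim that the two selections differ in only one pattern each can be checked directly from the table. The following is a minimal sketch, not part of the original paper: the pattern lists are transcribed from Table 7, and the names `H37_M38`, `H38_M38` and `compare_selections` are hypothetical, chosen here for illustration.

```python
# Patterns transcribed from Table 7 (column 1: GRCh37 and GRCm38).
# Variable and function names are illustrative assumptions, not from the paper.
H37_M38 = """
aa aaa ac aca acg act ag aga at ata
atc atg att ca caa cac cag cat cc cca
ccc cg cgc ct cta ctc ctg ga gag gc
gcg gg ggg gt gtc gtg ta tac tag tat
tc tca tcg tct tg tga tgt tt ttg ttt
""".split()

# Patterns transcribed from Table 7 (column 2: GRCh38 and GRCm38).
H38_M38 = """
aa aaa ac aca acg act ag aga at ata
atc atg att ca caa cac cag cat cc ccc
cg cgc ct cta ctc ctg ga gac gag gc
gcg gg ggg gt gtc gtg ta tac tag tat
tc tca tcg tct tg tga tgt tt ttg ttt
""".split()


def compare_selections(first, second):
    """Return the patterns unique to each selection, as two sorted lists."""
    first_set, second_set = set(first), set(second)
    return sorted(first_set - second_set), sorted(second_set - first_set)


only_first, only_second = compare_selections(H37_M38, H38_M38)
print("patterns per column:", len(H37_M38), len(H38_M38))
print("only in GRCh37/GRCm38:", only_first)
print("only in GRCh38/GRCm38:", only_second)
```

Using sets makes the comparison independent of the order in which the patterns are listed, which matters here because the one substitution shifts the row layout of the second column.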