Table 2 Criteria for verifying rice uORFs in 5 out of 5

From: Conserved upstream open reading frames in higher plants

The columns from "UniProt protein length (AA)" through "GO classification (Arabidopsis thaliana)" report the alignment of uORFSCAN-identified main proteins with UniProt proteins (c).

| Accession | FL-cDNA (a) | Upstream & in-frame stop codon | Agreement with genome annotation (b) | UniProt protein length (AA) | Align length (AA) | Identities (%) | Expect | Annotation | GO classification (Arabidopsis thaliana) | uORF valid |
|---|---|---|---|---|---|---|---|---|---|---|
| AK106095 | Yes | Yes | Yes | 392 | 392 | 100 | 2.2e-217 | Oxidoreductase | [go:19538] protein metabolism; [go:16706] oxidoreductase activity | |
| AK103391 | Yes | Yes | Yes | 371 | 371 | 100 | 3.4e-194 | Trehalose-6-phosphate phosphatase | [go:5992] trehalose biosynthesis; [go:9507] chloroplast; [go:4805] trehalose-phosphatase activity | |
| AK100589 | Yes | Yes | Yes | 398 | 398 | 100 | 1.1e-215 | AdoMetDC | [go:6596] polyamine biosynthesis; [go:5694] chromosome | |
| AK073303 | Yes | Yes | Yes | 257 | 257 | 100 | 1.6e-141 | Acyl-CoA independent ceramide synthase | [go:6672] ceramide metabolism; [go:16020] membrane; [go:3824] catalytic activity; [go:16811] hydrolase activity | |
| AK072868 | Yes | Yes | Yes | 443 | 443 | 100 | 3.6e-238 | uncharacterized protein (probable CBL-interacting serine/threonine-protein kinase 15) | [go:6468] protein phosphorylation; [go:7165] signal transduction; [go:5524] ATP binding; [go:4672] protein kinase activity | |
| AK072649 | Yes | Yes | Yes | 480 | 488 | 76 | 9.6e-199 | Ribosomal protein S6 Kinase | [go:45946] positive translation; [go:6468] protein phosphorylation; [go:9507] chloroplast; [go:16301] kinase activity | |
| AK066145 | No | Yes | Yes | 119 | 119 | 100 | 1.3e-59 | Membrane-anchored ubiquitin-fold protein | [go:6464] protein modification | Yes |
| AK064792 | Yes | Yes | 197 (d) | 109 | 108 | 57 | 8.4e-26 | F9L1.29 protein | Not available | Yes |
| AK060523 | Yes | Yes | Yes | 166 | 166 | 100 | 1.9e-88 | uncharacterized protein (probable ankyrin-3) | [go:9507] chloroplast; [go:5515] protein binding | |

a. The rice cDNA was used in a blastn search against the "NCBI EST_Others" database (rice) to look for longer 5' ESTs.
b. The rice cDNA was used in a blastn search against the "TIGR Rice Genome Annotation DB: Coding Sequences" database to verify the cDNA ORF.
c. The rice cDNA was translated in the same frame as the main open reading frame identified by uORFSCAN, including translations upstream of the predicted start methionine. The resulting protein sequence was used in a blastp search against "The UniProt Knowledgebase (UniProt)" database.
d. The genome annotation for the CDS is longer by the indicated number of base pairs.
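
The translation step in footnote c and the "Upstream & In-frame stop codon" check can be illustrated with a short Python sketch. This is not the authors' code or part of uORFSCAN: the function names, the 0-based `main_start` argument (position of the predicted start ATG), and the toy sequence are all assumptions made for illustration, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_in_main_frame(cdna, main_start):
    """Translate cdna in the frame of the main ORF (hypothetical helper).

    Translation begins at the first in-frame codon of the cDNA, so residues
    upstream of the predicted start methionine are included. It ends at the
    first stop codon at or after main_start. Upstream stops appear as '*'.
    """
    cdna = cdna.upper()
    residues = []
    for i in range(main_start % 3, len(cdna) - 2, 3):
        aa = CODON_TABLE.get(cdna[i:i + 3], "X")  # 'X' for ambiguous codons
        if aa == "*" and i >= main_start:
            break
        residues.append(aa)
    return "".join(residues)


def has_upstream_inframe_stop(cdna, main_start):
    """Return True if a stop codon lies upstream of, and in frame with, the start."""
    cdna = cdna.upper()
    return any(
        CODON_TABLE.get(cdna[i:i + 3]) == "*"
        for i in range(main_start % 3, main_start, 3)
    )


# Toy sequence (made up): GCT TAA GGC | ATG GCT TGA
toy_cdna = "GCTTAAGGCATGGCTTGA"
print(translate_in_main_frame(toy_cdna, 9))    # A*GMA
print(has_upstream_inframe_stop(toy_cdna, 9))  # True
```

An upstream in-frame stop, as in the toy sequence, indicates that the main ORF cannot extend further 5' in that frame; the translated sequence would then serve as the query for a protein-level search such as blastp.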