Fig. 1 | BMC Genomics

Fig. 1

From: Systematic discovery of novel eukaryotic transcriptional regulators using sequence homology independent prediction


Feature-based prediction pipeline to identify novel transcriptional regulator families. a Pipeline workflow: First, Arabidopsis protein families were filtered based on their size and the GO annotations of their members. Then, uncharacterized families with more than 2 members were filtered based on subcellular localization patterns using Yloc [98], percentage of disordered residues using Predisorder [99], and the ability of at least one member to activate transcription of a reporter gene in yeast (autoactivation) [22, 103,104,105,106,107]. Numbers in the Venn diagram give the number of families in which most members are nuclear localized (blue), the percentage of disordered residues is high (green), or autoactivation in yeast was observed (red). Families that met all criteria (intersection of the Venn diagram) were considered candidate regulator families. b-j Proportion of proteins predicted to contain a nuclear localization signal (NLS) (b-d), distribution of the percentage of disordered amino acid residues (e-g), and proportion of proteins with autoactivation activity (h-j) in the background (white), TFs (dark gray), and predicted regulators (light gray). The background corresponds to all proteins in Arabidopsis (b, e, h), fruit fly (c, f, i), or human (d, g, j) genomes or the set of proteins that were tested for autoactivation in yeast (h, i, j). * = p-value < 0.0001, chi-square test with Yates correction (b-d and h-j) or t-test (e-g)
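
The family filter in panel a can be read as an intersection of three sets, which the following minimal Python sketch illustrates. It is not the authors' implementation: the `Family` record, the function `candidate_families`, the example data, and both numeric cutoffs (`MIN_FRAC_NUCLEAR`, `MIN_PCT_DISORDERED`) are hypothetical, since the caption gives no thresholds beyond "more than 2 members".

```python
from dataclasses import dataclass


@dataclass
class Family:
    """Hypothetical per-family summary; field names are illustrative only."""
    name: str
    n_members: int
    frac_nuclear: float          # fraction of members predicted nuclear
    mean_pct_disordered: float   # mean percentage of disordered residues
    any_autoactivation: bool     # at least one member autoactivates in yeast


# Assumed cutoffs: the caption does not state numeric thresholds.
MIN_FRAC_NUCLEAR = 0.5       # "most members" nuclear localized
MIN_PCT_DISORDERED = 30.0    # "high percentage" of disordered residues


def candidate_families(families):
    """Return names of families meeting all three criteria (Venn intersection)."""
    # Size filter stated in the caption: more than 2 members.
    eligible = [f for f in families if f.n_members > 2]
    nuclear = {f.name for f in eligible if f.frac_nuclear > MIN_FRAC_NUCLEAR}
    disordered = {f.name for f in eligible
                  if f.mean_pct_disordered >= MIN_PCT_DISORDERED}
    autoactive = {f.name for f in eligible if f.any_autoactivation}
    return nuclear & disordered & autoactive


# Made-up example data, for illustration only.
example = [
    Family("FAM_A", 4, 0.75, 45.0, True),   # passes every criterion
    Family("FAM_B", 2, 1.00, 60.0, True),   # too few members
    Family("FAM_C", 5, 0.20, 55.0, True),   # mostly non-nuclear
    Family("FAM_D", 3, 0.90, 50.0, False),  # no autoactivation
]

print(sorted(candidate_families(example)))
```

Keeping the three criteria as separate sets mirrors the Venn diagram: the size of each set, and of each pairwise overlap, can be reported alongside the final intersection.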
