Figure 2 | BMC Genomics

Figure 2

From: Improving analysis of transcription factor binding sites within ChIP-Seq data based on topological motif enrichment

Background sequence selection impacts motif over-representation analyses. (a) For each background, the fraction of its 43 analyses that reported the ChIP’d TF among the top 5 enriched PWMs (x-axis) is plotted against the average skew of the over-representation results across the same 43 analyses (y-axis). Skew is the negative slope of the line fitted to the over-representation scores versus PFM GC content (i.e. the values displayed in Figure 1). The ideal is a large x-axis value (vertical dashed line) and an average skew of zero (horizontal dashed line). (b) and (c) summarize the standard deviation (y-axis) and mean (x-axis) of the ‘non-outlier’ oPOSSUM over-representation scores for the 10 backgrounds against each of the 43 ChIP-Seq datasets: panel (b) displays the average value for each background across the 43 datasets, and panel (c) displays the individual values for all 430 analyses. The ideal result would lie at the origin (the intersection of the dashed lines). In all panels, each of the 10 backgrounds tested is denoted by a single coloured symbol: light green circle – randomly chosen background from the dataset of mappable sequences, dark green cross – randomly chosen background from the dataset of DNase accessible sequences, orange circle – mononucleotide shuffled background, brown cross – mononucleotide shuffled background within a sliding window, black circle – dinucleotide shuffled background, gray cross – dinucleotide shuffled background within a sliding window, magenta triangle – 3rd order Markov model generated background sequences, blue circle – background selected from the mappable sequences dataset to match the GC composition of the target sequences, light blue cross – background selected from the mappable sequences dataset to match the distribution of GC composition in sliding windows of the target sequences, and red triangle – GC background from HOMER 2.
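
The two quantities plotted in panel (a) can be illustrated with a minimal sketch. It assumes the fitted line is an ordinary least-squares fit, which the caption does not state, and that ranks are 1-based. The function names `skew` and `top5_fraction` and the input lists are hypothetical and are not taken from oPOSSUM or the article.

```python
from statistics import mean


def skew(gc_contents, scores):
    """Negative slope of a least-squares line of scores versus PFM GC content.

    Assumption: ordinary least squares; the caption only says "line fitted".
    """
    if len(gc_contents) != len(scores) or len(gc_contents) < 2:
        raise ValueError("need two or more paired (GC content, score) values")
    gc_mean = mean(gc_contents)
    score_mean = mean(scores)
    # Sum of squared deviations of GC content (denominator of the slope).
    sxx = sum((g - gc_mean) ** 2 for g in gc_contents)
    if sxx == 0:
        raise ValueError("GC content values are all identical; slope undefined")
    # Sum of cross-deviations (numerator of the slope).
    sxy = sum((g - gc_mean) * (s - score_mean)
              for g, s in zip(gc_contents, scores))
    return -(sxy / sxx)


def top5_fraction(tf_ranks):
    """Fraction of analyses in which the ChIP'd TF ranked in the top 5.

    Assumption: tf_ranks holds one 1-based rank per analysis.
    """
    if not tf_ranks:
        raise ValueError("need at least one rank")
    return sum(1 for rank in tf_ranks if rank <= 5) / len(tf_ranks)
```

Under these assumptions, a background whose scores rise with PFM GC content gives a negative skew, one whose scores fall with GC content gives a positive skew, and a value near zero indicates little GC-dependent bias.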
