Figure 1 | BMC Genomics

Figure 1

From: RECLU: a pipeline to discover reproducible transcriptional start sites and their alternative regulation using capped analysis of gene expression (CAGE)

Summary of the pipeline. The procedures enclosed by the pink dotted line are contained in the pipeline. The Roman numerals correspond to Additional file 1. (A) Tag count data are clustered using the improved Paraclu program [6] (version 4). Clusters are discarded when their TPM per base is < 0.1; this criterion replaces filtering on the total tag count in a cluster. (B) Hierarchical stability is calculated from the stability values provided by Paraclu, as the sum of the stabilities of the hierarchical clusters. (C) Regions with 90% overlap between replicates are extracted using BEDtools [15] (version 2.12.0). (D) The IDR package [9] (version 1.1) in R is run to evaluate reproducibility between replicates. Clusters with IDR ≥ 0.1 are discarded as irreproducible. (E) Clusters longer than 200 bp are also discarded. (F) Differentially expressed genes are detected using the edgeR package [17] in R.
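The thresholds in steps (A), (C), (D) and (E) can be illustrated with a minimal Python sketch. It is not the pipeline's own code, which relies on Paraclu, BEDtools and the R IDR package. The `Cluster` record, its field names and the helper functions are hypothetical. The sketch also assumes that the 90% overlap is required reciprocally (relative to both clusters) and ignores strand; the caption does not state either point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    """Hypothetical record for one cluster (field names are assumptions)."""
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    tpm: float   # total TPM summed over the cluster
    idr: float   # IDR value assigned to the cluster


def length(c: Cluster) -> int:
    return c.end - c.start


def tpm_per_base(c: Cluster) -> float:
    return c.tpm / length(c)


def overlap_fraction(a: Cluster, b: Cluster) -> float:
    """Fraction of cluster `a` that is covered by cluster `b`."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start)
    return max(shared, 0) / length(a)


def overlaps_between_replicates(a: Cluster, b: Cluster, min_frac: float = 0.9) -> bool:
    # Assumption: the 90% criterion must hold relative to both clusters.
    return (overlap_fraction(a, b) >= min_frac
            and overlap_fraction(b, a) >= min_frac)


def passes_filters(c: Cluster) -> bool:
    # (A) discard TPM per base < 0.1, (D) discard IDR >= 0.1,
    # (E) discard clusters longer than 200 bp.
    return tpm_per_base(c) >= 0.1 and c.idr < 0.1 and length(c) <= 200
```

For example, a 100 bp cluster with a total of 20 TPM and an IDR of 0.01 passes all three filters, while the same cluster stretched to 300 bp is discarded by the length criterion in (E).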
