ChIP-seq assay and data representation. a Schematic representation of a ChIP-seq experiment. Chromatin is first crosslinked and fragmented. DNA fragments bound by a specific protein are isolated with an antibody and sequenced from their ends using a short-read sequencing technology. The reads are then computationally mapped to the genome. Note that reads mapping to the plus (+) and minus (–) strands of the genome are expected to form clusters upstream and downstream of the protein binding site, respectively. b ChIP-seq data representation in SGA format, the working format of the ChIP-Seq tools. Each line contains five obligatory fields: sequence identifier (here an NCBI RefSeq ID), feature name (designating a ChIP-seq experiment), sequence position, strand and read count. Note that only the genomic position corresponding to the 5′ end of the mapped sequence read is recorded in an SGA file.
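
The five-field layout described in panel b can be illustrated with a minimal Python sketch. It is not part of the ChIP-Seq tools: the names `SgaRecord`, `parse_sga_line` and `read_to_sga` are hypothetical, the fields are assumed to be tab-separated, read coordinates are assumed to be 1-based and inclusive, and the example values are invented for illustration only.

```python
from typing import NamedTuple


class SgaRecord(NamedTuple):
    """One SGA line: the five obligatory fields named in the caption."""
    seq_id: str    # sequence identifier, e.g. an NCBI RefSeq ID
    feature: str   # feature name designating the ChIP-seq experiment
    position: int  # genomic position of the 5' end of the read
    strand: str    # "+" or "-"
    count: int     # number of reads at this position and strand


def parse_sga_line(line: str) -> SgaRecord:
    """Parse one SGA line (assumption: fields are tab-separated)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 fields, got {len(fields)}")
    # Any fields beyond the five obligatory ones are ignored in this sketch.
    seq_id, feature, position, strand, count = fields[:5]
    if strand not in ("+", "-"):
        raise ValueError(f"unexpected strand: {strand!r}")
    return SgaRecord(seq_id, feature, int(position), strand, int(count))


def read_to_sga(seq_id: str, feature: str, start: int, end: int,
                strand: str) -> SgaRecord:
    """Reduce a mapped read to its 5' end.

    Assumption: start and end are 1-based, inclusive genomic coordinates
    with start <= end. The 5' end of a plus-strand read is its start;
    the 5' end of a minus-strand read is its end.
    """
    five_prime = start if strand == "+" else end
    return SgaRecord(seq_id, feature, five_prime, strand, 1)


# Invented example values, for illustration only.
record = parse_sga_line("NC_000001.10\tCTCF\t10470\t+\t1")
minus_read = read_to_sga("NC_000001.10", "CTCF", 10600, 10635, "-")
```

Because a minus-strand read is recorded at its rightmost mapped coordinate and a plus-strand read at its leftmost, the stored positions of the two strands fall on opposite sides of the binding site, which is the clustering pattern sketched in panel a.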

