Volume 14 Supplement 8
QChIPat: a quantitative method to identify distinct binding patterns for two biological ChIP-seq samples in different experimental conditions
- Bin Liu†1, 2, 3, 4,
- Jimmy Yi†1,
- Aishwarya SV1,
- Xun Lan1,
- Yilin Ma1,
- Tim HM Huang5,
- Gustavo Leone6 and
- Victor X Jin5Email author
© Liu et al.; licensee BioMed Central Ltd. 2013
Published: 9 December 2013
Many computational programs have been developed to identify enriched regions for a single biological ChIP-seq sample. Given that many biological questions are often asked to compare the difference between two different conditions, it is important to develop new programs that address the comparison of two biological ChIP-seq samples. Despite several programs designed to address this question, these programs suffer from some drawbacks, such as inability to distinguish whether the identified differential enriched regions are indeed significantly enriched, lack of distinguishing binding patterns, and neglect of the normalization between samples.
In this study, we developed a novel quantitative method for comparing two biological ChIP-seq samples, called QChIPat. Our method employs a new global normalization method: nonparametric empirical Bayes (NEB) correction normalization, utilizes pre-defined enriched regions identified from single-sample peak calling programs, uses statistical methods to define differential enriched regions, then defines binding (histone modification) pattern information for those differential enriched regions. Our program was tested on a benchmark data: histone modifications data used by ChIPDiffs. It was then applied on two study cases: one to identify differential histone modification sites for ChIP-seq of H3K27me3 and H3K9me2 data in AKT1-transfected MCF10A cells; the other to identify differential binding sites for ChIP-seq of TCF7L2 data in MCF7 and PANC1 cells.
Several advantages of our program include: 1) it considers a control (or input) experiment; 2) it incorporates a novel global normalization strategy: nonparametric empirical Bayes correction normalization; 3) it provides the binding pattern information among different enriched regions. QChIPat is implemented in R, Perl and C++, and has been tested under Linux. The R package is available at http://motif.bmi.ohio-state.edu/QChIPat.
ChIP-seq (chromatin immunoprecipitation (ChIP) coupling with DNA sequencing) is widely used to precisely map the location of transcription factor (TF) binding or histone modification sites at a genome-wide scale [1–5]. Other related sequencing-based techniques DNase-seq , FAIRE-seq  are often used to define the open chromatin regions and identify accessible regulatory regions. Many computational programs [4, 7–9] have been developed to identify enriched regions (referring to either binding sites or histone modification sites) for the ChIP-seq data. However, a majority of these programs were developed for a single ChIP-seq sample data. Given that many biological questions are often asked to compare the enriched regions under two different conditions, such as before and after using drugs, with and without chemical or hormone treatment, binding patterns of different transcription factors, or one transcription factor binding information in two cell types, it is critical to develop new programs to meet this need and to quantitatively compare two biological ChIP-seq samples. In our recent study , we found following TGFβ stimulation of the A2780 epithelial ovarian cancer cell line, ChIP-seq identified SMAD4 binding loci were classified into four distinct binding patterns: 1) Basal; 2) Shifted; 3) Stimulated Only; 4) Unstimulated Only, indicating that TGFβ stimulation alters SMAD4 binding patterns in epithelial ovarian cancer cells. However, the binding patterns were measured on a qualitative basis. Therefore, it is essential to develop a quantitative method to address this increasingly generalized biological question.
A study from Xu et al  employed a Hidden Markov Model (HMM) to identify the differential binding enrichments from two ChIP-seq samples (ChIPDiff), which was tested on three histone modifications data in two different cell types. However, the ChIPDiff program suffers from some shortcomings. Firstly, the HMM-based algorithm itself cannot distinguish whether those identified differential enriched regions are indeed significantly enriched or not. This is because in many cases a more enriched region in sample A comparing to sample B isn't necessarily an enriched region in sample A itself and vice versa. Therefore, the program tends to detect many false positive modification sites. Secondly, this program directly uses the samples' sequencing reads to perform the comparison without considering the information in the control experiment of each sample. However, as noted by Zhang et al. , the control sample is critical for modeling the sequencing and mapping biases. Thirdly, it is questionable whether this program is suitable for narrow peaks comparison since the program sets the bin size at least 1 kb. For example, the peak width for a typical TF binding site is around 300-500 bp. Fourthly, it fails to define a binding pattern for each identified differential binding site after comparing two samples. A few other studies have made the same efforts to develop programs, DBChIP , DIME , to detect differential binding sites in two samples, but these programs suffer from the same drawbacks either lacking distinguishing binding patterns or neglecting the normalization between samples.
In this study, we developed a novel quantitative method for comparing two biological ChIP-seq samples, called QChIPat, not only to detect the differential binding sites but also to classify them into distinct binding patterns. Our method employs a new global normalization method (nonparametric empirical Bayes correction normalization) [14–17], utilizes pre-defined enriched regions identified from single-sample enriched regions identification programs, uses statistical methods to define differential enriched regions, then defines binding pattern information for those differential enriched regions. Our program was tested on a benchmark data, histone modifications data  used by ChIPDiffs . It was then applied on two study cases: one to identify differential histone modification sites for ChIP-seq of H3K27me3 and H3K9me2 data in AKT1-transfected MCF10A cells; the other to identify differential binding sites for ChIP-seq of TCF7L2 data in MCF7 and PANC1 cells.
Evaluation of QChIPat with benchmark data
A summary of comparing for the benchmark data with a bin size of 30 bp.
2875 ( 85.8%)
Comparison of different normalization methods
ChIP-seq samples performed in different conditions and time usually have different numbers of raw sequencing reads due to the number of sequencing lanes conducted for each sample (sequencing depth) and multiple-matched or non-matched matched reads in the sample. Before comparing two samples, it is necessary to normalize the read counts to eliminate system errors and minimize sequencing depth difference between two samples. In our program, we employed three normalization methods including linear normalization, nonparametric empirical Bayes correction normalization and quantile normalization, which are based on different assumptions of the data distribution. A comparative summary of the three different normalization methods is shown in Table 1. The results of the method without using any normalization methods were also reported. The results of ChIPDiff were used as a baseline to gauge the performance of each different normalization method. For each method, the percentage of DHMSs overlapping with ChIPDiff results is shown in parentheses. The results showed that the DHMSs identified by quantile and NEB normalization methods highly overlap with those predicted by ChIPDiff, while the linear normalization method and non-normalization method have fewer overlapping DHMSs with ChIPDiff. For all the three data, the overlapping percentage between quantile and ChIPDiff is the highest one among all the three normalization methods in ESC, while NEB method achieves the highest overlapping percentage in NPC. Furthermore, the NEB method identifies more DHMSs than the quantile method in ESC and a similar number of DHMSs as that of the quantile method in NPC. Therefore, the overall performance of the NEB is better than that of other normalization methods on the three data. The results are not surprising since NEB normalization has two advantages compared with other normalization methods. Firstly, NEB estimates the proportion of the reads as p0=n1/S, where n1 is the frequency of the read with count 1 and S is the sequencing depth of the ChIP-Seq data. Secondly, NEB adjusts read counts based on both the observed read counts and the nature of the frequency distribution of the read counts.
Optimization of the NEB-based method
Application to K9 and K27 datasets
To further test the performance and efficiency of QChIPat, we applied it to a study case, ChIP-seq of H3K27me3 and H3K9me2 in AKT1-tranfected MCF10A vs vehicle control, conducted in our laboratory. Many studies have shown that among many types of histone modifications, H3K27me3 and H3K9me2, two repressive marks are critical for normal and aberrant differentiation of stem and progenitor cells  as well as in the development of cancers . Our previous study also demonstrated that epigenetic silencing of a set of AKT1-mediated genes . AKT1 kinase is a key downstream effector of the phosphoinositide 3-kinase (PI3K) signaling pathway that regulates diverse cellular functions, including growth, proliferation, survival, metabolism, motility, angiogenesis, and vesicle trafficking [22, 23]. Therefore, it is important to determine AKT1-mediated genome-wide histone modification patterns for H3K27me3 and H3K9me2 and its subsequent influence on downstream target genes. We chose the NEB normalization method since it performed best on the benchmark data H3K27me3, H3K4me3 and H3K36me3 compared to other three methods (see section Comparison of different normalization methods).
The binding patterns identified for the K9 and K27 datasets.
Num. of DHMSs
H3K27me3 (vehicle control)
H3K9m2 (vehicle control)
Application to TCF7L2 in MCF7 and PANC1 cells
We next performed IPA analysis on genes associated with MCF7 and PANC1 "Only" TCF7L2 binding sites (Figure 3C). Interestingly, we found that the genes associated with "MCF7 Only" binding sites differed significantly from the genes associated with "PANC1 Only" binding sites in terms of the three IPA categories of network functions, physiological system development and function, and canonical pathways. For all three gene expression categories, only cancer and post-translational modification (network functions) showed up for both cases of genes. This functional difference between genes associated with "MCF7 Only" TCF7L2 binding sites compared to genes associated with "PANC1 Only" TCF7L2 binding sites is expected, because the genes are differentially regulated by definition of the "Only" binding pattern: genes associated with "Only" differential enriched regions (TCF7L2 binding sites in this case) in one sample are not associated with any differential enriched regions in the other sample, Therefore, the significant difference in function of the two groups of genes supports the accuracy of QChIPat's "Only" pattern detection.
Discussion and conclusions
In this study, we developed a quantitative method, QChIPat, to identify distinct binding patterns for two biological ChIP-seq samples, and the program is implemented in R, Perl and C++, and run in Linux system. Using a benchmarking test data, histone modifications data , we compared the performance of three different normalization methods on identification of DHMSs, including linear normalization, NEB correction normalization and quantile normalization, then compared it with ChIPDiffs program . Although each normalization method has its own advantages, we found that NEB correction normalization is a more suitable normalization method in our QChIPat. Two notable advantages of NEB correction normalization are: 1) NEB estimates the proportion of the reads using the frequency of the read against the sequencing depth, which avoids both underestimating the proportion of highly abundant reads and overestimating the proportion of low and intermediate abundance reads, and 2) NEB adjusts read counts based on both the observed read counts and the nature of the frequency distribution of the read counts. This is able to correct abundant reads among high and low regions. Therefore, the enriched regions can be correctly identified. In addition, QChIPat has implemented a novel feature: categorizing binding (or modification) sites into three distinct binding patterns. This novel feature implementation is very important since quantitatively associating each of the differential binding sites with one of two samples is essential to reveal key biological functions. For example, Once we identify "Only" binding of genes in Sample A and Sample B, we are able to perform GO or other pathway analyses on these genes and may identify some interesting biological functions. This was demonstrated in our two study cases (Figures 2 and 3). Thus, compared to other similar programs, ChIPDiffs , DBChIP , and DIME , QChIPat is able to not only identify differential enriched binding regions between two samples but also to further classify these regions into distinct binding patterns associated with each sample.
Our QChIPat was further tested and validated on two study cases, ChIP-seq of H3K27me3 and H3K9me2 in AKT1-tranfected MCF10A vs. vehicle control, and ChIP-seq of TCF7L2 in MCF7 vs. PANC1 cells, both from our previous studies. Interestingly, both H3K37me3 and H3K9me2 marks showed two major binding patterns "Only" or "Shift" binding at AKT1-tranfected MCF10A cells (Table 2). Although our previous study has shown  AKT signaling can be a trigger of the epigenetic silencing at many downstream target genes through the cross-talk between DNA methylation and H3K27me3, it didn't show the relationship between H3K37me3 and H3K9me2. Nevertheless, it is a little bit surprising that both AKT1-mediated up- and down-regulated genes have at least one differential binding site for H3K27me3 or H3K9me2 marks but there is no significant difference in enrichments since both histone marks are recognized as repressive marks. It is possibly that activated AKT signaling may trigger downstream key transcription factors and further dynamically regulate epigenetic processes for the interplay of these two histone modifications. A more thorough underlying mechanism for this epigenetic process needs to be elucidated in an experimental study in the future. The test of QChIPat on ChIP-seq of TCF7L2 in MCF7 and PANC1 cells was used to evaluate its accuracy in detecting binding patterns of a particular transcription factor. The correlation between gene expression levels with the detected MCF7 and PANC1 "Only" TCF7L2 binding sites demonstrated that QChIPat is also applicable to compare narrow peaks in two ChIP-seq samples such as TFs in addition to broad peaks such as repressive histone modifications.
Several notable advantages of QChIPat include: 1) this software is able to make use of the information in control experiment; 2) this software incorporates a novel global normalization strategy: nonparametric empirical Bayes correction normalization; and 3) this software provides the binding pattern information among different enriched regions.
Nevertheless, there are several limitations. First, although three binding patterns were presented in this study, another binding pattern that may be interesting is where there is a sharp enriched region in one ChIP-seq sample, while a broad enriched region in the other ChIP-seq sample, but the total number of mapped reads are approximately the same. We look forward to deeper investigations on this binding pattern and will consider this pattern in the updated version of QChIPat in the future. Second, in this study, we only focus on comparing two biological ChIP-seq samples and identifying their differential binding patterns. Another interesting task is to identify differential binding patterns from two groups of ChIP-seq samples, which requires the comparison of more than two samples. We are studying new mathematical and statistical models for quantitatively identifying binding patterns from two groups of ChIP-seq samples in future work.
Three optional normalization methods are provided in QChIPat, including linear normalization, nonparametric empirical Bayes correction normalization and Quantile normalization, which are based on different assumptions of the data distribution.
1) Linear normalization : Linear normalization is the simplest and most straightforward way to normalize ChIP-seq data by scaling the total number of reads of different ChIP-seq samples into the same level. This normalization method is reasonable if the total number of reads in the two samples is roughly the same. In our program, linear normalization is used, in which all reads in a sample is divided by the total number of reads in the sample so that the reads in two samples are in the same scale. This can be formulated as p = x/n, where x is the number of reads in each bin (in order to calculate the distribution of the data, the whole genome is divided into small fixed size regions which are called bins in this study, and then the total number of reads in each bin is counted), proportion p is the normalized reads number in a given bin and n is the total number of reads of the ChIP-Seq data.
3) Quantile normalization : If the distribution of the two samples is assumed to be the same, quantile normalization can be applied to the ChIP-seq data for further comparison. In this package, the read number distribution of the first sample is used as the reference distribution and the second sample is transformed to make sure that the distribution of the two samples is the same. The transformation can be formulated as is the distribution of the first sample and F2 is the distribution of the second sample. Since the Wilcoxon rank test only makes use of the rank of the read number, choosing either the first or second sample does not affect the final comparison results.
Comparison of enriched regions
In order to reduce the number of false positive enriched regions and make use of the samples obtained by control experiments, three optional publicly available enriched region identification programs are included in our method, including BELT , MACS  and FindPeaks . The users are able to select one based on their datasets and interests. After identifying the enriched regions in each sample, each enriched region is divided into equal length bins. The number of bins in each enriched region is a parameter, which needs to be optimized based on the experiments. If the length of the enriched region is less than the bin number, the bin number is automatically set to the length of the enriched region. Since the length of enriched regions may be different from each other, the length of bin may vary for different peaks. After the normalized read number is counted, the number of reads in each bin in the corresponding location in the comparing sample is also calculated, and then the two groups of reads number can be used for Wilcoxon rank test.
Wilcoxon rank test
There are two options of Wilcoxon rank tests: Wilcoxon signed rank test and Wilcoxon rank sum test. If it is believed that the read number of the two ChIP-Seq data is paired, Wilcoxon signed rank test should be used (wil.paired=TRUE). Let Z i = X i - Y i for i = 1, ... b n , and assume that the Z i is independent; each Z i comes from a same continuous population and is symmetric about its median θ. If not paired, Wilcoxon rank sum test should be used (wil.pair=FALSE). Let F denotes the distribution of X, the read enrichment of Sample A, and G denotes the distribution of Y, the read enrichment of Sample B. Assume that F and G have the same shape and they differ only by median: G(t)=F(t-θ) .
H0: θ=0. (The read enrichment difference between Sample A and B is not significant);
H1: θ>0 (Sample A has more significant read enrichment);
H2: θ<0 (Sample B has more significant read enrichment).
For signed rank test, let R i represents the rank of each ordered |Z i | and φ i denote whether Z i is positive. When Z i is positive, φ i is 1. Otherwise, it is -1. The Wilcoxon signed rank statistic T+ is calculated as . For the rank sum test, all × and Y are put together and the pooled data is ranked. Let S j denotes the rank of Y s , and then the Wilcoxon rank sum statistics is the sum of all S j (). Then, by checking the corresponding table, p-value can be calculated.
Determination of differential enriched regions
Once a p-value from Wilcoxon rank test and read ratio (, x1,istands for the reads number in each bin of sample A and x2,jstands for the reads number in each bin of sample B) and difference () have been calculated, a set of significant differential enriched regions in each sample are determined based on the user defined cutoffs.
Definition of distinct binding patterns
Each differential enriched region, identified from comparison of a sample A with a sample B, is classified into one of the following three distinct binding patterns: 1) Sample A/B "Unchanged" - one of two corresponding differential enriched regions in separate samples that are associated with the same annotated gene and are within 1 kb distance of each other; 2) Sample A/B "Shift"- one of two corresponding differential enriched regions in separate samples that are associated with the same gene, but are more than 1 kb apart from one another; and 3) Sample A/B "Only" - a differential enriched region associated with a gene in one sample, but no differential enriched regions are associated with that gene in the other sample. Note that the program, by default, associates a gene with a differential enriched region if they are within 100 kb of each other.
The ChIP-seq of histone modification data from Mikkelsen et al  (the raw data can be downloaded at ftp://ftp.broad.mit.edu/pub/papers/chipseq/Mikkelsen2007/alignments/) including H3K27me3 (K27), H3K4me3 (K4) and H3K36me3 (K36) were used for the benchmarking test to evaluate the performance of our program QChIPat.
QChIPat provides comprehensive output information including a summary report, differential enriched regions in each sample with BED format and UCSC web browser and Affymetrix Integrated Genome Browser compatible wiggle (.wig) for the visualization, as well as the assigned binding pattern of differential enriched regions.
Data sets for two study cases
We thank the members in the laboratory for helpful discussions. This study was supported by a Pelotonia Idea Grant from the Ohio State University Comprehensive Cancer Center, the National Natural Science Foundation of China (No. 61300112), the Natural Science Foundation of Guangdong Province (No. S2012040007390), Shanghai Key Laboratory of Intelligent Information Processing, China (Grant No. IIPL-2012-002), and funds from University of Texas Health Science Center at San Antonio.
The publication costs for this article were funded by University of Texas Health Science Center at San Antonio.
This article has been published as part of BMC Genomics Volume 14 Supplement 8, 2013: Selected articles from the International Conference on Intelligent Biology and Medicine (ICIBM 2013): Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/14/S8.
- Park PJ: ChIP-seq: advantages and challenges of a maturing technology. Nature Reviews Genetics. 2009, 10: 669-680.PubMedPubMed CentralView ArticleGoogle Scholar
- Johnson DS, Mortazavi A, Myers RM, Wold B: Genome-wide mapping of in vivo protein-DNA interactions. Science. 2007, 316: 1497-1502. 10.1126/science.1141319.PubMedView ArticleGoogle Scholar
- Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K: High-resolution profiling of histone methylations in the human genome. Cell. 2007, 129: 823-837. 10.1016/j.cell.2007.05.009.PubMedView ArticleGoogle Scholar
- Frietze S, Lan X, Jin VX, Farnham PJ: Genomic targets of the KRAB and SCAN domain-containing zinc finger protein 263. J Biol Chem. 2010, 285: 1393-1403. 10.1074/jbc.M109.063032.PubMedPubMed CentralView ArticleGoogle Scholar
- Robertson G, Hirst M, Bainbridge M, Bilenky M, Zhao Y, Zeng T, Euskirchen G, Bernier B, Varhol R, Delaney A, Thiessen N, Griffith OL, He A, Marra M, Snyder M, Jones S: Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat Methods. 2007, 4: 651-657. 10.1038/nmeth1068.PubMedView ArticleGoogle Scholar
- Song L, Zhang Z, Grasfeder LL, Boyle AP, Giresi PG, Lee BK, Sheffield NC, Gräf S, Huss M, Keefe D, Liu Z, London D, McDaniell RM, Shibata Y, Showers L, Simon J, Vales T, Wang T, Winter D, Zhang Z, Clarke N, Birney E, Iver V, Crawford G, Lieb J, Furey T: Open chromatin defined by DNaseI and FAIRE identifies regulatory elements that shape cell-type identity. Genome Res. 2011, 21: 1757-1767. 10.1101/gr.121541.111.PubMedPubMed CentralView ArticleGoogle Scholar
- Fejes AP, Robertson G, Bilenky M, Varhol R, Bainbridge M, Jones SJ: FindPeaks 3.1: a tool for identifying areas of enrichment from massively parallel short-read sequencing technology. Bioinformatics. 2008, 24: 1729-1730. 10.1093/bioinformatics/btn305.PubMedPubMed CentralView ArticleGoogle Scholar
- Jothi R, Cuddapah S, Barski A, Cui K, Zhao K: Genome-wide identification of in vivo protein-DNA binding sites from ChIP-Seq data. Nucleic Acids Res. 2008, 36: 5221-5231. 10.1093/nar/gkn488.PubMedPubMed CentralView ArticleGoogle Scholar
- Zhang Y, Liu T, Meyer CA, Eeckhoute J, Johnson DS, Bernstein BE, Nussbaum C, Myers RM, Brown M, Li W, Liu X: Model-based Analysis of ChIPSeq (MACS). Genome Biol. 2008, 9: R137-10.1186/gb-2008-9-9-r137.PubMedPubMed CentralView ArticleGoogle Scholar
- Kennedy BA, Deatherage DE, Gu F, Tang B, Chan MW, Nephew KP, Huang TH, Jin VX: ChIP-seq defined genome-wide map of TGFβ/SMAD4 targets: implications with clinical outcome of ovarian cancer. PLoS One. 2011, 6: e22606-10.1371/journal.pone.0022606.PubMedPubMed CentralView ArticleGoogle Scholar
- Xu H, CL W, Lin F, WK S: An HMM approach to genome-wide identification of differential histone modification sites from ChIP-seq data. Bioinformatics. 2008, 24: 2344-2349. 10.1093/bioinformatics/btn402.PubMedView ArticleGoogle Scholar
- Liang K, Keles S: Detecting differential binding of transcription factors with ChIP-seq. Bioinformatics. 2012, 28: 121-122. 10.1093/bioinformatics/btr605.PubMedPubMed CentralView ArticleGoogle Scholar
- Taslim C, Huang THM, SL : DIME: R-package for identifying diferential ChIP-seq based on an ensemble of mixture models. Bioinformatics. 2011, 27: 1569-1570. 10.1093/bioinformatics/btr165.PubMedPubMed CentralView ArticleGoogle Scholar
- Wu ZJ, Meyer CA, Choudhury S, Shipitsin M, Maruyama R, Bessarabova M, Nikolskaya T, Sukumar S, Schwartzman A, Liu JS, Polyak K, Liu X: Gene expression profiling of human breast tissue samples using SAGE-Seq. Genome Research. 2010, 20: 1730-1739. 10.1101/gr.108217.110.PubMedPubMed CentralView ArticleGoogle Scholar
- Good IJ: The population frequencies of species and the estimation of population parameters. Biometrika. 1953, 40: 237-264.View ArticleGoogle Scholar
- Robbins H: An empirical Bayes approach to statistics. In Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability. 1956, Berkeley, CA.: University of California Press, 1:Google Scholar
- Orlitsky A, Santhanam NP, Zhang J: Always Good Turing: Asymptotically optimal probability estimation. Science. 2003, 302: 427-431. 10.1126/science.1088284.PubMedView ArticleGoogle Scholar
- Mikkelsen TS, Ku M, Jaffe DB, Issac B, Lieberman E, Giannoukos G, Alverez P, Brockman W, Kim TK, Koche RP, et al: Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature Reviews Genetics. 2007, 448: 553-560.Google Scholar
- Simon JA, Kingston RE: Mechanisms of polycomb gene silencing: knowns and unknowns. Nat Rev Mol Cell Biol. 2009, 10: 697-708.PubMedView ArticleGoogle Scholar
- Lo PK, Sukumar S: Epigenomics and breast cancer. Pharmacogenomics. 2008, 9: 1879-1902. 10.2217/14622418.104.22.1689.PubMedPubMed CentralView ArticleGoogle Scholar
- Zuo T, Liu TM, Lan X, Weng YI, Shen R, Gu F, Huang YW, Liyanarachchi S, Deatherage DE, Hsu PY, Taslim C, Ramaswamy B, Shapiro C, Lin H, Cheng A, Jin V, Huang TS: Epigenetic silencing mediated through activated PI3K/AKT signaling in breast cancer. Cancer Res. 2011, 71 (5): 1752-1762. 10.1158/0008-5472.CAN-10-3573.PubMedPubMed CentralView ArticleGoogle Scholar
- Manning BD, Cantley LC: AKT/PKB signaling: navigating downstream. Cell. 2007, 129: 1261-1274. 10.1016/j.cell.2007.06.009.PubMedPubMed CentralView ArticleGoogle Scholar
- Vivanco I, Sawyers CL: The phosphatidylinositol 3-Kinase AKT pathway in human cancer. Nat Rev Cancer. 2002, 2: 489-501. 10.1038/nrc839.PubMedView ArticleGoogle Scholar
- Mimeault M, Bonenfant D, Batra SK: New advances on the functions of epidermal growth factor receptor and ceramides in skin cell differentiation, disorders and cancers. Skin Pharmacol Physiol. 2004, 17 (4): 153-166. 10.1159/000078818.PubMedView ArticleGoogle Scholar
- Rofstad EK, Halsor EF: Vascular endothelial growth factor, interleukin, platelet-derived endothelial cell growth factor, and basic fibroblast growth factor promote angiogenesis and metastasis in human melanoma xenografts. Cancer Res. 2000, 60 (17): 4932-4938.PubMedGoogle Scholar
- Germann M, Wetterwald A, Guzmán-Ramirez N, Pluijm Gvd, Culig Z, Cecchini MG, Williams ED, Thalmann GN: Stem-Like Cells with Luminal Progenitor Phenotype Survive Castration in Human Prostate Cancer. Stem Cells. 2012, 30 (6): 1076-1086. 10.1002/stem.1087.PubMedView ArticleGoogle Scholar
- Ye T, Krebs AR, Choukrallah MA, Keime C, Plewniak F, Davidson I, Tora L: seqMINER: an integrated ChIP-seq data interpretation platform. Nucleic Acids Res. 2011, 39 (6): e35-10.1093/nar/gkq1287.PubMedPubMed CentralView ArticleGoogle Scholar
- Bolstad BM, Irizarry RA, Astrand M, Speed TP: A Comparison of Normalization Methods for High Density Oligonucleotide Array Data Based on Bias and Variance. Bioinformatics. 2003, 19 (2): 185-193. 10.1093/bioinformatics/19.2.185.PubMedView ArticleGoogle Scholar
- Lan X, Bonneville R, Apostolos J, Wang W, Jin VX: W-ChIPeaks: a comprehensive web application tool to process ChIP-chip and ChIP-seq data. Bioinformatics. 2011, 27: 428-430. 10.1093/bioinformatics/btq669.PubMedPubMed CentralView ArticleGoogle Scholar
- Gale WA, Sampson G: Good-Turing frequency estimation without tears. J Quant Ling. 1995, 2: 217-237. 10.1080/09296179508590051.View ArticleGoogle Scholar
- Hollander M, Wolf DA: Nonparametric Statistical Method. 1999, WileyGoogle Scholar
- Frietze S, Wang R, Yao L, Tak YG, Ye Z, Gaddis M, Witt H, Farnham PJ, Jin VX: Cell type-specific binding patterns reveal that TCF7L2 can be tethered to the genome by association with GATA3. Genome Biol. 2012, 13: R52-10.1186/gb-2012-13-9-r52.PubMedPubMed CentralView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.