Figure 1 | BMC Genomics

Figure 1

From: Normal colon epithelium: a dataset for the analysis of gene expression and alternative splicing events in colon disease

Principal Components Analysis and Analysis of Variance of Gene Expression Data: Gene-level values were summarized from the intensities of the exons mapping to each locus. (A) The PCA plot shows data from CELLs (red), NORMAL tissues (blue) and tumors (green). The tumor and normal tissues cluster together, while the CELLs form a discrete cluster distant from the other samples. (B) Following a 3-way ANOVA, the sources of variation were plotted. SCAN DATE is a major contributor to the variation, which can be attributed to differences in processing between the 2 laboratories. SEX is also a major source of variation, because 7 of the samples in the CELL group were classified as female whereas the TUMOR/NORMAL data set had 5 of each sex. (C) The SEX and SCAN DATE sources of variation were removed from the ANOVA and the PCA was performed again. The three sample types cluster more closely, but the CELLs still retain a degree of separation. (D) The ANOVA source-of-variation histogram following removal of the batch effects due to SCAN DATE and SEX. The major source of variation is now TISSUE TYPE.
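
The adjustment described in panels (C) and (D) can be illustrated with a minimal sketch. This is not the authors' software or exact procedure: it assumes NumPy, uses synthetic data, and the names `remove_batch_effects` and `pca_scores` are hypothetical. Nuisance covariates (here stand-ins for SEX and SCAN DATE) are regressed out of each gene by least squares, and PCA is then run on the adjusted matrix.

```python
import numpy as np

def remove_batch_effects(expr, nuisance):
    """Regress nuisance covariates out of every gene, keeping per-gene means.

    expr:     (n_samples, n_genes) expression matrix
    nuisance: (n_samples, k) numeric or dummy-coded covariates
    """
    cov = nuisance - nuisance.mean(axis=0)          # center covariates
    centered = expr - expr.mean(axis=0)             # center each gene
    beta, *_ = np.linalg.lstsq(cov, centered, rcond=None)
    return expr - cov @ beta                        # subtract fitted nuisance part

def pca_scores(expr, n_components=2):
    """Project samples onto the leading principal components via SVD."""
    centered = expr - expr.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]

# Synthetic example (assumed values, not data from the paper).
rng = np.random.default_rng(0)
n_samples, n_genes = 12, 50
sex = np.tile([0.0, 1.0], 6)            # alternating 0/1 labels
scan_batch = np.repeat([0.0, 1.0], 6)   # two processing batches
expr = (rng.normal(size=(n_samples, n_genes))
        + 2.0 * scan_batch[:, None]     # simulated scan-date shift
        + 1.0 * sex[:, None])           # simulated sex shift

nuisance = np.column_stack([sex, scan_batch])
adjusted = remove_batch_effects(expr, nuisance)
scores = pca_scores(adjusted)           # (n_samples, 2) coordinates for plotting
```

In a real analysis the factor of interest (tissue type) would normally be kept in the model alongside the nuisance terms, so that biological signal correlated with batch is not removed by accident; the sketch omits this for brevity.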
