| | Tissue I, Normal | Tissue I, Cancer | Tissue II, Normal | Tissue II, Cancer | Pooled, Normal | Pooled, Cancer |
|---|---|---|---|---|---|---|
| Gene A | 280 | 580 | 20 | 20 | 300 | 600 |
| Other genes | 20,000 | 80,000 | 380,000 | 620,000 | 400,000 | 700,000 |
- This hypothetical case illustrates both how the Cochran-Mantel-Haenszel (CMH) test is applied and how Simpson's paradox can arise. Gene A is the gene under investigation. Expression counts from all other genes are pooled into the "Other genes" row. Bold typeface indicates columns showing higher cancer vs. normal propensities. CMH is applied to the stratified tissue columns, not to the pooled data. A casual look at the pooled data alone would suggest that Gene A has higher expression in cancer (χ² test p-value close to 0 when only the pooled columns are analyzed). However, closer inspection of each tissue column reveals the opposite: within both Tissue I and Tissue II, Gene A's share of expression is higher in normal than in cancer. The observed difference between cancer and normal in the "Other genes" row is, in theory, mostly due to sampling bias.
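
The comparison described in the caption can be reproduced with a short script. The following is a minimal sketch in plain Python, not the authors' implementation: the function names (`fraction`, `pearson_chi2`, `cmh`) are hypothetical, both statistics are computed without a continuity correction, and p-values use the chi-square distribution with one degree of freedom. Exact p-values depend on the test variant chosen, so the printed numbers may not match the figures quoted in the caption.

```python
import math

# Counts from the table, per tissue:
# (gene_a_normal, gene_a_cancer, other_normal, other_cancer)
strata = {
    "Tissue I": (280, 580, 20_000, 80_000),
    "Tissue II": (20, 20, 380_000, 620_000),
}

# Pooled counts are the element-wise sums over the tissues.
pooled = tuple(sum(column) for column in zip(*strata.values()))


def fraction(gene, other):
    """Share of Gene A among all expression counts in one column."""
    return gene / (gene + other)


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def pearson_chi2(table):
    """Pearson chi-square for one 2x2 table (no continuity correction)."""
    gn, gc, on, oc = table
    n = gn + gc + on + oc
    numerator = n * (gc * on - gn * oc) ** 2
    denominator = (gn + gc) * (on + oc) * (gc + oc) * (gn + on)
    stat = numerator / denominator
    return stat, chi2_sf_1df(stat)


def cmh(tables):
    """CMH statistic over several 2x2 tables (no continuity correction).

    Also returns the summed (observed - expected) count of Gene A in cancer;
    a negative value means Gene A is under-represented in cancer.
    """
    diff = 0.0
    var = 0.0
    for gn, gc, on, oc in tables:
        n = gn + gc + on + oc
        row_gene, row_other = gn + gc, on + oc
        col_cancer, col_normal = gc + oc, gn + on
        diff += gc - row_gene * col_cancer / n
        var += row_gene * row_other * col_cancer * col_normal / (n * n * (n - 1))
    stat = diff * diff / var
    return stat, chi2_sf_1df(stat), diff


# Per-column Gene A shares: the direction flips between strata and pooled data.
for name, (gn, gc, on, oc) in list(strata.items()) + [("Pooled", pooled)]:
    print(f"{name}: normal={fraction(gn, on):.3e}, cancer={fraction(gc, oc):.3e}")

pooled_stat, pooled_p = pearson_chi2(pooled)
cmh_stat, cmh_p, cmh_diff = cmh(strata.values())
print(f"Pooled chi-square: stat={pooled_stat:.3f}, p={pooled_p:.3g}")
print(f"CMH (stratified):  stat={cmh_stat:.3f}, p={cmh_p:.3g}, diff={cmh_diff:.1f}")
```

Because the CMH statistic sums observed-minus-expected counts within each tissue before combining them, it keeps the per-tissue direction of the effect, whereas the pooled test mixes tissues that contribute very different totals to the normal and cancer columns.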