New methods for separating causes from effects in genomics data

Statnikov, Alexander; Henaff, Mikael; Lytkin, Nikita I; Aliferis, Constantin F

doi:10.1186/1471-2164-13-S8-S22

BMC Genomics

Table 1 High-level description of the tested causal orientation methods.

From: New methods for separating causes from effects in genomics data

Method	Reference	Key principles	Sufficient assumptions for causally orienting X → Y	Sound
ANM	[14]	Assuming X → Y with Y = f(X) + e₁, where X and e₁ are independent, there will be no such additive noise model in the opposite direction X ← Y, X = g(Y) + e₂, with Y and e₂ independent.	• Y = f(X) + e₁; • X and e₁ are independent; • f is non-linear, or one of X and e₁ is non-Gaussian; • Probability densities are strictly positive; • All functions (including densities) are 3 times differentiable.	Yes
PNL	[15]	Assuming X → Y with Y = f₂(f₁(X) + e₁), there will be no such model in the opposite direction X←Y, X = g₂(g₁(Y) + e₂) with Y and e₂ independent.	• Y = f₂(f₁(X) + e₁); • X and e₁ are independent; • Either f₁ or e₁ is Gaussian; • Both f₁ and f₂ are continuous and invertible.	Yes
IGCI	[16, 17]	Assuming X→Y with Y = f(X), one can show that the KL-divergence (a measure of the difference between two probability distributions) between P(Y) and a reference distribution (e.g., Gaussian or uniform) is greater than the KL-divergence between P(X) and the same reference distribution.	• Y = f(X) (i.e., there is no noise in the model); • f is continuous and invertible; • Logarithm of the derivative of f and P(X) are not correlated.	Yes
GPI-MML	[18]	Assuming X→Y, the least complex description of P(X, Y) is given by separate descriptions of P(X) and P(Y | X). By estimating the latter two quantities using methods that favor functions and distributions of low complexity, the likelihood of the observed data given X→Y is inversely related to the complexity of P(X) and P(Y | X).	• Y = f(X, e); • X and e are independent; • e is Gaussian; • The prior on f and P(X) factorizes.	No
ANM-MML	[18]	Same as for GPI-MML, except for a different way of estimating P(Y | X) and P(X | Y).	• Y = f(X) + e; • X and e are independent; • e is Gaussian; • The prior on f and P(X) factorizes.	No
GPI	[18]	Assuming X→Y with Y = f(X,e₁), where X and e₁ are independent and f is "sufficiently simple", there will be no such model in the opposite direction X←Y, X = g(Y,e₂) with Y and e₂ independent and g "sufficiently simple".	Same as for GPI-MML.	No
ANM-GAUSS	[18]	Same as for ANM-MML, except for the different way of estimating P(X) and P(Y).	Same as for ANM-MML.	No
LINGAM	[13]	Assuming X→Y, if we fit linear models Y = b₂X+e₁ and X = b₁Y+e₂ with e₁ and e₂ independent, then we will have b₁ < b₂.	• Y = b₂X+e₁; • X and e₁ are independent; • e₁ is non-Gaussian.	Yes

The last column indicates whether a method is sound, i.e., whether it provably orients a causal structure correctly when its sufficient assumptions hold. Because causal orientation methodologies are fairly new and not completely characterized, proofs of correctness may yet become available for GPI-MML, ANM-MML, GPI, and ANM-GAUSS. All methods implicitly assume that there are no feedback loops. The noise term in each model is denoted by a lowercase "e", with or without a subscript.
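
To make the IGCI row of the table more concrete, the following is a minimal sketch of a slope-based IGCI estimator with a uniform reference distribution, as described in the IGCI literature [16, 17]. It is not the implementation evaluated in the article. The function names (`igci_slope`, `igci_direction`) are hypothetical, and the sketch relies on the assumptions listed in the table: Y = f(X) with no noise, and f continuous and invertible. Each variable is rescaled to [0, 1], the points are sorted by the candidate cause, and the score is the average of log|Δy/Δx| over neighbouring points; the direction with the smaller score is preferred.

```python
import math


def _rescale(values):
    """Map values linearly onto [0, 1] (uniform reference distribution)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot rescale a constant variable")
    return [(v - lo) / (hi - lo) for v in values]


def igci_slope(x, y):
    """Slope-based score for the candidate direction x -> y.

    Returns the mean of log|dy/dx| over neighbouring points after
    sorting by x. Lower scores favour x -> y.
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    pairs = sorted(zip(_rescale(x), _rescale(y)))
    total, count = 0.0, 0
    for (x0, y0), (x1, y1) in zip(pairs, pairs[1:]):
        dx, dy = x1 - x0, y1 - y0
        if dx != 0 and dy != 0:  # skip ties, where the log is undefined
            total += math.log(abs(dy / dx))
            count += 1
    if count == 0:
        raise ValueError("no usable neighbouring pairs")
    return total / count


def igci_direction(x, y):
    """Return 'X -> Y' if the x -> y score is smaller, else 'Y -> X'."""
    return "X -> Y" if igci_slope(x, y) < igci_slope(y, x) else "Y -> X"


if __name__ == "__main__":
    import random

    rng = random.Random(0)
    x = [rng.random() for _ in range(500)]  # cause: uniform on [0, 1]
    y = [v ** 3 for v in x]                 # effect: noise-free, invertible f
    print(igci_direction(x, y))
```

For this noise-free cubic example the score for x → y is close to log 3 − 2 (about −0.9) and the reverse score has the opposite sign, so the sketch prefers X → Y. On real expression data the noise-free assumption does not hold exactly, so the output should be read as a heuristic indication rather than a proof of direction.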


ISSN: 1471-2164
