CAUSALPATH – THE SCIENCE
Conceptually, CD methods can be thought of as identifying and quantifying all plausible causal mechanisms that explain the data equally well (the Markov equivalence class). To make predictions, one reasons with the set of possible models: e.g., if all plausible models agree that X causes Y, this hypothesis is postulated. To eliminate possible models, certain assumptions about the nature of causality are made.
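To make this reasoning concrete, the following minimal Python sketch (illustrative only; the hard-coded models and the helper function are our own constructions, not part of any CausalPath component) checks which causal claims are shared by every member of an equivalence class:

# Illustrative sketch: reasoning over an equivalence class of causal models.
# Each model is a set of directed edges; a relation such as "X causes Y" is
# only postulated when every model in the class contains it.

def postulated_edges(equivalence_class):
    """Return the directed edges on which all equivalent models agree."""
    common = set(equivalence_class[0])
    for model in equivalence_class[1:]:
        common &= set(model)
    return common

# The chain X - Y - Z has three Markov-equivalent orientations ...
chain_class = [
    {("X", "Y"), ("Y", "Z")},   # X -> Y -> Z
    {("Y", "X"), ("Z", "Y")},   # X <- Y <- Z
    {("Y", "X"), ("Y", "Z")},   # X <- Y -> Z
]
# ... whereas the collider X -> Y <- Z is alone in its class.
collider_class = [{("X", "Y"), ("Z", "Y")}]

print(postulated_edges(chain_class))     # set(): no orientation can be claimed
print(postulated_edges(collider_class))  # {('X','Y'), ('Z','Y')}: both causes postulated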
Suppose in a dataset we observe that two quantities X and Y are statistically associated (interchangeably, correlated or dependent), denoted as Dep(X;Y) (or reciprocally Ind(X;Y) to denote independence). A reasonable assumption is that associations appear because of causal structure, i.e., either one is causing the other or (inclusively) a third confounding variable is causing both. This is known as Reichenbach's Common Cause Principle. Another foundational assumption is the Markov Condition (Spirtes et al. 2001): indirect influences or associations become independent given the direct causes. Thus, if X influences Z indirectly (only) through Y, then we expect Ind(X ; Z | Y). If two variables X and Z are not correlated, could they nevertheless be causally related to each other? The answer is yes, but only in special cases requiring precisely balanced relationships between the probability densities involved. Assuming that this is not the case is called the Faithfulness Condition (hence, no dependency implies no causal relationship) (Spirtes et al. 2001). Thus, if we observe Dep(X;Y), Dep(Y;Z), and Ind(X;Z), we can infer X – Y – Z, where each edge denotes a causal relation of unknown direction or a latent confounder; the missing edge between X and Z denotes lack thereof. Next, suppose the data suggest Dep(X;Z|Y), i.e., a conditional dependence. The simplest way to explain this dependence is that Y does not cause either X or Z: if it did, it would be a common cause of X and Z and we would expect Dep(X ; Z) instead. We denote with L latent confounding variables, i.e., unobserved common causes of the observed quantities. Assembling the above constraints together, we can graphically represent some of the remaining causal possibilities: X → Y ← Z, X ← L → Y ← Z, X ← L → Y ← Z where additionally X → Y (both a latent confounder and a causal relation), or even X ← L1 → Y ← L2 → Z. Assuming that there exist no latent confounding variables is a strong assumption, called Causal Sufficiency. Additionally, the lack of feedback cycles is often assumed, i.e., the causal structure is representable by a Directed Acyclic Graph. When all the assumptions stated above are made, the only causal structure that remains is X → Y ← Z, which forms the graph of a Bayesian Network.
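The (in)dependence pattern of the collider described above can be reproduced with a short simulation. The sketch below is purely illustrative: the data-generating equations, sample size, and the use of simple partial-correlation tests are assumptions made here for exposition, not choices prescribed by the text.

# Simulate the collider X -> Y <- Z and check the (conditional)
# (in)dependence pattern: Dep(X;Y), Dep(Y;Z), Ind(X;Z), Dep(X;Z|Y).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 20000
X = rng.normal(size=n)
Z = rng.normal(size=n)
Y = X + Z + 0.5 * rng.normal(size=n)   # Y is a common effect (collider)

def corr_pvalue(a, b):
    """p-value of the (unconditional) Pearson correlation between a and b."""
    return stats.pearsonr(a, b)[1]

def partial_corr_pvalue(a, b, c):
    """p-value of the correlation between a and b after regressing out c."""
    ra = a - np.polyval(np.polyfit(c, a, 1), c)   # residual of a given c
    rb = b - np.polyval(np.polyfit(c, b, 1), c)   # residual of b given c
    return stats.pearsonr(ra, rb)[1]

print("Dep(X;Y)?   p =", corr_pvalue(X, Y))            # tiny p: dependent
print("Dep(Y;Z)?   p =", corr_pvalue(Y, Z))            # tiny p: dependent
print("Ind(X;Z)?   p =", corr_pvalue(X, Z))            # p not small: independent
print("Dep(X;Z|Y)? p =", partial_corr_pvalue(X, Z, Y)) # tiny p: dependent given Y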
Dropping the Causal Sufficiency assumption requires a new type of graph to represent possible latent confounding variables, called Maximal Ancestral Graphs (MAGs) (T. Richardson & Spirtes 2002). MAGs are graphical models generalizing Bayesian Networks to distributions with latent variables. MAGs capture causal probabilistic relations; they can represent different types of structural uncertainty (e.g., the fact that A may be causing B or the two may share a latent common cause); and they capture the observable dependencies and independencies. They assume cross-sectional data and do not allow the presence of causal cycles (feedback loops). There exist (asymptotically) sound and complete algorithms for inducing such models, such as FCI (Colombo et al. 2012; Spirtes et al. 2001). MAGs can also be interpreted as path-diagrams that are subcases of Structural Equation Models [Kaplan]. The equivalence class of MAGs is represented with another type of graph called a Partially-Oriented Ancestral Graph (PAG), e.g., X •→ Y ←• Z, where the • mark denotes that either X causes Y or (inclusively) the two may share a latent confounder. In some cases, more elaborate analysis of the data independencies may lead to inducing purely causal relations X → Y even when admitting latent confounders. Additionally, further analysis may reveal the presence of latent confounding variables, a feat conceptually equivalent to postulating the presence of a then-unseen planet (eventually named Pluto) from disturbances observed in the orbits of nearby planets in the early 20th century. The basic approach presented here is complemented by other causal principles and assumptions, reasoning on the parameters of the models and the relations, as well as algorithms for inducing models and relations, which together form the field of Causal Discovery.
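As an illustration of constraint-based discovery under latent confounding, the sketch below runs an off-the-shelf FCI implementation on data similar to the earlier simulation, but with a confounder L that is withheld from the algorithm. The use of the open-source causal-learn package is our own assumption (the text names only the FCI algorithm), and its exact API may differ across versions.

# Hedged sketch: FCI on simulated data with a hidden confounder L.
# causal-learn is assumed installed; only the observed variables are passed in.
import numpy as np
from causallearn.search.ConstraintBased.FCI import fci

rng = np.random.default_rng(0)
n = 5000
L = rng.normal(size=n)                     # latent confounder (not given to FCI)
X = L + rng.normal(size=n)
Z = rng.normal(size=n)
Y = X + Z + L + 0.5 * rng.normal(size=n)

data = np.column_stack([X, Y, Z])          # observed variables only
g, edges = fci(data, alpha=0.05)           # returns a PAG and its list of edges
print(g.graph)                             # edge-mark matrix (tails, arrows, circles)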