Creation of causal graphs

This page asks how causal graphs are created. Where do they come from?

In all the examples I have seen, causal graphs are presented to the reader, sometimes with alternatives. It seems like the writer has some process for generating these graphs, but what is this process? Is it just human intuition?

The second problem of causal inference that Shalizi lists is "Given data about a system, find its causal structure". This seems to be the same question as asking where causal graphs come from. He says this problem is harder than the first problem he lists ("Given the causal structure of a system, estimate the effects the variables have on each other"), and he only goes into the first problem. (He probably discusses the second problem as well, but I haven't read that far.)[1]
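To make the first problem concrete, here is a minimal sketch of estimating an effect once the causal structure is taken as given. Everything in it is an assumption for illustration and comes from none of the cited sources: the graph (Z causes both X and Y, and X causes Y), the linear-Gaussian form, the coefficients, and the helper names `slope` and `residuals`.

```python
import random

def slope(x, y):
    """OLS slope of y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def residuals(x, y):
    """What is left of y after regressing it on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

# Assumed structure: Z -> X, Z -> Y, X -> Y (Z is a confounder).
rng = random.Random(0)
n = 20000
true_effect = 2.0  # assumed effect of X on Y
z = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [true_effect * xi + 3.0 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

# Ignoring the graph: regress Y on X alone (confounded by Z).
naive = slope(x, y)

# Using the graph: adjust for Z by residualizing X and Y on Z first.
adjusted = slope(residuals(z, x), residuals(z, y))

print(f"naive estimate:    {naive:.2f}")
print(f"adjusted estimate: {adjusted:.2f}")
```

In this simulation the adjusted estimate should land near the assumed effect of 2.0, while the naive one is pulled away from it by the confounder. The point is that the adjustment step is only justified because the graph was assumed in advance, which is exactly what the second problem does not allow.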

Morgan and Winship introduce a causal graph and then write: "suppose that these relationships are derived from a set of theoretical propositions that have achieved consensus in the relevant scholarly community" (p. 30).[2] It's not clear what these "theoretical propositions" are or how the community came up with the relationships in the first place.

Pearl writes: "The sharp distinction between statistical and causal concepts can be translated into a useful principle: behind every causal claim there must lie some causal assumption that is not discernable from the joint distribution and, hence, not testable in observational studies. Such assumptions are usually provided by humans, resting on expert judgment. Thus, the way humans organize and communicate experiential knowledge becomes an integral part of the study, for it determines the veracity of the judgments experts are requested to articulate."[3]
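One way to see why such an assumption cannot be read off the joint distribution is a two-variable example. The sketch below is my own illustration, not Pearl's: the two models, the coefficient 0.8, the noise variance 0.36, and the helper name `implied_moments` are all assumed. It compares a model where X causes Y with one where Y causes X, both linear with zero-mean Gaussian noise.

```python
import math

def implied_moments(coef, cause_var, noise_var):
    """Moments implied by: effect = coef * cause + noise.

    Returns (var(cause), var(effect), cov(cause, effect)).
    """
    effect_var = coef ** 2 * cause_var + noise_var
    cov = coef * cause_var
    return cause_var, effect_var, cov

# Model A (assumed): X -> Y, with Y = 0.8 * X + noise.
var_x_a, var_y_a, cov_a = implied_moments(0.8, 1.0, 0.36)

# Model B (assumed): Y -> X, with X = 0.8 * Y + noise.
var_y_b, var_x_b, cov_b = implied_moments(0.8, 1.0, 0.36)

# Zero-mean Gaussians with equal covariance matrices have the same
# joint distribution, so no observational data can separate A from B.
same_joint = (
    math.isclose(var_x_a, var_x_b)
    and math.isclose(var_y_a, var_y_b)
    and math.isclose(cov_a, cov_b)
)

# The models still disagree about interventions. Setting X to a value:
x_set = 1.0
mean_y_do_x_a = 0.8 * x_set  # in A, Y responds to X
mean_y_do_x_b = 0.0          # in B, Y is upstream of X and does not move

print("same joint distribution:", same_joint)
print("E[Y | do(X=1)] under A:", mean_y_do_x_a)
print("E[Y | do(X=1)] under B:", mean_y_do_x_b)
```

Both models imply the same variances and covariance, yet they give different answers about what happens when X is set by intervention. Choosing between them therefore takes information from outside the observed distribution.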

I think Chapter 2 of Pearl's book is basically about this, whereas Chapter 3 assumes you already have the graph.

Also see the "Discovering Causal Structure from Observations" chapter of Shalizi's book.

Pearl makes a similar point elsewhere: "These considerations imply that the slogan “correlation does not imply causation” can be translated into a useful principle: one cannot substantiate causal claims from associations alone, even at the population level—behind every causal conclusion there must lie some causal assumption that is not testable in observational studies." A footnote to this passage adds: "The methodology of “causal discovery” (Spirtes et al. 2000; Pearl 2000a, Chapter 2) is likewise based on the causal assumption of “faithfulness” or “stability,” a problem-independent assumption that concerns relationships between the structure of a model and the data it generates."[4]

So there seems to be a two-step process: (1) you assume "faithfulness" or "stability" (these are again causal assumptions, but problem-independent ones), which allows you to construct a causal graph from the data; (2) you use the causal graph and the joint distribution to infer the strength of the causal relationships.

Alternatively, it seems you could (1') draw a few candidate causal graphs using intuition, then (2') use the joint distribution to figure out which graph is correct and how strong the causal relationships are.
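A rough sketch of how candidate graphs might be checked against data follows. It is an illustration under stated assumptions rather than any of the cited authors' methods: the data are simulated from a linear-Gaussian collider, partial correlation stands in for a proper conditional independence test, the 0.05 cutoff is arbitrary, and the names `corr`, `candidates`, and `consistent` are hypothetical.

```python
import math
import random

def corr(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    saa = sum((p - ma) ** 2 for p in a)
    sbb = sum((q - mb) ** 2 for q in b)
    return sab / math.sqrt(saa * sbb)

# Simulated data; the true (assumed) structure is X -> Z <- Y.
rng = random.Random(1)
n = 20000
x = [rng.gauss(0, 1) for _ in range(n)]
y = [rng.gauss(0, 1) for _ in range(n)]
z = [xi + yi + rng.gauss(0, 1) for xi, yi in zip(x, y)]

r_xy, r_xz, r_yz = corr(x, y), corr(x, z), corr(y, z)

# Partial correlation of X and Y given Z (linear-Gaussian shortcut
# for testing conditional independence).
r_xy_given_z = (r_xy - r_xz * r_yz) / math.sqrt(
    (1 - r_xz ** 2) * (1 - r_yz ** 2)
)

threshold = 0.05  # arbitrary cutoff for "looks independent"
observed = {
    "x_indep_y": abs(r_xy) < threshold,
    "x_indep_y_given_z": abs(r_xy_given_z) < threshold,
}

# Independencies each candidate graph implies, assuming faithfulness.
candidates = {
    "chain X -> Z -> Y": {"x_indep_y": False, "x_indep_y_given_z": True},
    "collider X -> Z <- Y": {"x_indep_y": True, "x_indep_y_given_z": False},
}

consistent = [name for name, implied in candidates.items() if implied == observed]
print("graphs consistent with the data:", consistent)
```

Here the data rule out the chain and leave the collider. The check has limits, though: a fork X <- Z -> Y implies the same independencies as the chain, so data of this kind can only narrow the candidates down to a set of graphs that share their independence pattern.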

References

  1. Cosma Rohilla Shalizi. "Identifying Causal Effects from Observations". April 7, 2016.
  2. Stephen L. Morgan; Christopher Winship. Counterfactuals and Causal Inference: Methods and Principles for Social Research. 2nd ed. Cambridge University Press. 2015.
  3. Judea Pearl. Causality. p. 40.
  4. Judea Pearl. "Causal inference in statistics: An overview".