
A fundamental problem

If calculating the effect of a treatment is so easy, then what are we doing here? Well, as we all know, there is no multiverse 🙁 🙃

Even if there were one, let’s not kid ourselves: we don’t have superpowers 🤷🏻‍♀️

The fundamental problem of causal inference

In reality, it is impossible to observe all potential outcomes for each subject. Once we observe one potential outcome, we become blind to all of the others. For instance, for a subject who ended up enrolling in our job training program (receiving the treatment), we will never observe their potential outcome under no treatment. We can only ask what would have happened if that subject hadn’t enrolled in the training program. This unobserved potential outcome is the subject’s counterfactual (Aliyah’s case).

On the other hand, if a subject didn’t, in fact, enroll in the training program, their counterfactual is their potential outcome under treatment (Connor’s case).

So to recap, before the treatment decision is made, any outcome is a potential outcome. Everybody has potential for $Y^0$ and $Y^1$. After the treatment decision is made and the outcomes are observed, we only have the observed outcomes.

Counterfactual outcomes are the ones that would have been observed had the treatment been different, so they are never observed. Which outcome we do observe depends on the treatment status $D_i$:

  • If, for a person $i$, $D_i=0$, then $Y_i^0$ is the observed outcome for that person.
  • If, for a person $i$, $D_i=1$, then $Y_i^1$ is the observed outcome for that person.

Likewise…

  • If, for a person $i$, $D_i=0$, then $Y_i^1$ is the counterfactual outcome for that person.
  • If, for a person $i$, $D_i=1$, then $Y_i^0$ is the counterfactual outcome for that person.
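The two lists above can be condensed into one rule: the observed outcome is $D_i Y_i^1 + (1 - D_i) Y_i^0$, and the counterfactual is the other potential outcome. Here is a minimal Python sketch of that rule as a thought experiment. The function name `split_outcomes` and the earnings figures are made up for illustration; in real data we never get to see both potential outcomes for the same person.

```python
def split_outcomes(d, y0, y1):
    """Return (observed, counterfactual) for treatment status d in {0, 1}."""
    if d not in (0, 1):
        raise ValueError("d must be 0 or 1")
    observed = d * y1 + (1 - d) * y0        # the outcome we actually see
    counterfactual = d * y0 + (1 - d) * y1  # the outcome we never see
    return observed, counterfactual


# Hypothetical earnings in thousands; these numbers are invented, not course data.
aliyah = split_outcomes(d=1, y0=20, y1=25)  # enrolled in the program
connor = split_outcomes(d=0, y0=22, y1=24)  # did not enroll

print(aliyah)  # (25, 20)
print(connor)  # (22, 24)
```

For Aliyah the observed outcome is her outcome under treatment, while for Connor it is his outcome under no treatment.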

Being unable to observe both potential outcomes for the same subject is called the fundamental problem of causal inference, and it is our biggest challenge in this course. 🤨

Despite the hypothetical nature of counterfactuals, our minds never stop thinking about them. What if candidate B had won? What if I had applied for that job? What if I had chosen architecture in college instead of engineering?

The causal effect of the treatment on a specific subject will always be impossible to estimate because we don’t observe counterfactual outcomes. As Rubin suggests, the estimation of causal effects is then simply a missing data problem where we’re interested in predicting the unobserved potential outcomes.
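To see why this is a missing data problem, it can help to write the data down the way we would actually receive it. The sketch below is only an illustration: the field names and the earnings values are hypothetical, and `None` stands for a potential outcome that was never observed.

```python
# Observed data only: each person's counterfactual column is missing (None).
# Names come from the lesson's examples; the numbers are invented.
records = [
    {"name": "Aliyah", "d": 1, "y0": None, "y1": 25},
    {"name": "Connor", "d": 0, "y0": 22, "y1": None},
]


def individual_effect(row):
    """Return y1 - y0, or None when either potential outcome is unobserved."""
    if row["y0"] is None or row["y1"] is None:
        return None
    return row["y1"] - row["y0"]


effects = [individual_effect(r) for r in records]
print(effects)  # [None, None]
```

Every row has exactly one missing entry, so the individual effect cannot be computed for anyone.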

However, under certain causal inference assumptions, even if we can’t find individual treatment effects, we can find the average treatment effect over a population of subjects. The fundamental problem of causal inference does not mean that causal inference is impossible. Depending on the situation, we might still be able to infer causality. Throughout this course, we will see how we can still figure out the effect of treatments even when some potential outcomes are not observed. The Rubin causal model (RCM) helps us think more clearly.
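As a rough illustration of how an average effect can be recovered even though every individual effect is hidden, here is a small simulation. It rests on assumptions that this lesson does not spell out: treatment is assigned by a fair coin flip, the true effect is the same for everyone, and all numbers are synthetic. The function name `simulate_difference_in_means` is hypothetical.

```python
import random


def simulate_difference_in_means(n=10_000, true_effect=3.0, seed=0):
    """Estimate the average treatment effect from observed outcomes only.

    Assumes random assignment (a coin flip), which is one example of the
    kind of assumption that makes the average effect recoverable.
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        y0 = rng.gauss(10.0, 2.0)   # potential outcome without treatment
        y1 = y0 + true_effect       # potential outcome with treatment
        d = 1 if rng.random() < 0.5 else 0
        if d == 1:
            treated.append(y1)      # only y1 is observed for the treated
        else:
            control.append(y0)      # only y0 is observed for the untreated
    return sum(treated) / len(treated) - sum(control) / len(control)


estimate = simulate_difference_in_means()
print(round(estimate, 2))
```

The printed estimate should land close to the true effect of 3.0, even though the code never compares `y0` and `y1` for the same subject when forming the estimate.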
