
Estimating causal effects

We talked about the main causal inference assumptions and specifically what happens if the most important assumption (the ignorability assumption) is violated. Much of the rest of this course is about ways to mitigate the violation of this assumption.

In this lesson, we’ll assume for a moment that all of our causal assumptions hold. In such a world, how would we calculate causal effects, given that we have identified a set of confounders?

You saw that stratification is one option for finding the causal effect of the treatment.

The idea of stratification is simple. Given that we have already identified the confounders (for example age, sex, and education), we first stratify (categorize) the data into strata (categories) based on those confounders. Then, in each stratum, we identify those who received the treatment and those who didn’t, and calculate the treatment effect by comparing the average outcomes of the treatment and control groups. Remember that we can only do this because of the ignorability assumption.

By doing that, however, we end up with multiple treatment effects: one for each stratum. The overall causal effect is a weighted average of those stratum-specific effects, typically weighted by the share of subjects that falls in each stratum.
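Written out, the weighted average could look like the following. The notation is an assumption introduced here for illustration: s indexes strata, n_s is the number of subjects in stratum s, n is the total sample size, and the Y-bar terms are the average outcomes of the treated (1) and control (0) subjects in stratum s. Weighting by stratum share targets the average effect over the whole sample; other weights would target other populations.

```latex
\hat{\tau}_s = \bar{Y}_{s,1} - \bar{Y}_{s,0},
\qquad
\hat{\tau} = \sum_{s} \frac{n_s}{n}\,\hat{\tau}_s
```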

The idea behind this is that, if the causal inference assumptions (and specifically the ignorability assumption) hold, the strata defined by the confounders will be homogeneous, i.e., we are comparing 🍎 to 🍎. This means that within each stratum, we can safely assume that assignment to the treatment and control groups occurred as if it were randomized. Comparing the observed outcomes of the treated and control subjects within each stratum gives us the treatment effect within that stratum.
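The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `stratified_effect`, the column names, and the toy data are all made up for the example, the treatment is assumed to be coded 0/1, and strata are weighted by their share of the usable sample.

```python
from collections import defaultdict
from statistics import mean


def stratified_effect(rows, confounders, treatment="treated", outcome="outcome"):
    """Size-weighted average of per-stratum differences in mean outcome.

    rows: list of dicts, one per subject (hypothetical layout).
    confounders: keys whose values jointly define a stratum.
    Strata that lack treated or control subjects are skipped and returned.
    """
    # Group outcomes by stratum, then by treatment status (0 or 1).
    strata = defaultdict(lambda: {0: [], 1: []})
    for row in rows:
        key = tuple(row[c] for c in confounders)
        strata[key][row[treatment]].append(row[outcome])

    effects, skipped = {}, []
    for key, groups in strata.items():
        if groups[0] and groups[1]:
            n = len(groups[0]) + len(groups[1])
            effects[key] = (mean(groups[1]) - mean(groups[0]), n)
        else:
            skipped.append(key)  # no comparison possible in this stratum

    total = sum(n for _, n in effects.values())
    if total == 0:
        raise ValueError("no stratum contains both treated and control subjects")
    overall = sum(effect * n for effect, n in effects.values()) / total
    return overall, effects, skipped


# Toy data (invented): one confounder with two levels.
rows = [
    {"sex": "F", "treated": 1, "outcome": 12.0},
    {"sex": "F", "treated": 1, "outcome": 14.0},
    {"sex": "F", "treated": 0, "outcome": 10.0},
    {"sex": "F", "treated": 0, "outcome": 10.0},
    {"sex": "M", "treated": 1, "outcome": 9.0},
    {"sex": "M", "treated": 0, "outcome": 8.0},
]

overall, effects, skipped = stratified_effect(rows, ["sex"])
print(effects)   # per-stratum (effect, stratum size)
print(overall)   # weighted average of the per-stratum effects
```

In the toy data the effect is 3 in the four-person stratum and 1 in the two-person stratum, so the weighted average is (3 × 4 + 1 × 2) / 6 ≈ 2.33.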

Problems with stratification

Unfortunately, stratification typically presents a set of problems 😔

Stratification can lead to empty cells: If our sample size is small and the number of strata is large, there will be strata without any control observations, without any treatment observations, or without both. In those strata the treatment effect cannot be computed at all, and in sparsely populated strata the estimate will be unstable.

Think about a study with 100 subjects and only two covariates: gender (with levels male or female) and age (with levels 0-9, 10-19, 20-29, etc.). If we stratify by these covariates, we might have strata (such as 70-79 male subjects) for which there are no observations. Even if there are observations in a stratum, it could be that all of them are subjects who received the treatment and there are no subjects in the control group (or vice versa).
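To get a feel for how quickly this happens, here is a small simulation of the 100-subject example. Everything in it is an assumption made for illustration: the helper name `stratum_support`, the age distribution (roughly normal around 40, clipped to 0-89), and the coin-flip treatment assignment. It simply counts how many strata are empty, how many contain only one of the two groups, and how many are usable.

```python
import random
from collections import Counter
from itertools import product


def stratum_support(subjects, sexes, age_bands):
    """Classify every (sex, age band) stratum as empty, one_sided, or usable."""
    counts = Counter((s["sex"], s["age_band"], s["treated"]) for s in subjects)
    summary = Counter()
    for sex, band in product(sexes, age_bands):
        n_treated = counts[(sex, band, 1)]
        n_control = counts[(sex, band, 0)]
        if n_treated == 0 and n_control == 0:
            summary["empty"] += 1
        elif n_treated == 0 or n_control == 0:
            summary["one_sided"] += 1  # only treated or only control subjects
        else:
            summary["usable"] += 1
    return summary


sexes = ["male", "female"]
age_bands = [f"{lo}-{lo + 9}" for lo in range(0, 90, 10)]  # 0-9, ..., 80-89

rng = random.Random(0)  # fixed seed so the run is repeatable
subjects = []
for _ in range(100):
    # Assumed age distribution: most subjects are middle-aged, few in the tails.
    age = min(89, max(0, int(rng.gauss(40, 15))))
    lo = age // 10 * 10
    subjects.append({
        "sex": rng.choice(sexes),
        "age_band": f"{lo}-{lo + 9}",
        "treated": rng.randint(0, 1),
    })

summary = stratum_support(subjects, sexes, age_bands)
print(dict(summary))
```

Adding a third covariate to the `product` call multiplies the number of strata while the sample stays at 100, which is why the problem gets worse as more confounders are included.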

We should, therefore, think about alternative methods. Some of those alternatives include regression, matching, or inverse probability weighting.

Regression analysis requires further assumptions that we won’t discuss here, but it’s a better tool for dealing with continuous variables (as opposed to categories) and it’s better at extrapolation. Regression analysis isn’t within the scope of this course. If you’re interested in learning more about regression models, take our econometrics course.

A major drawback of regression analysis is that it assumes (without any warrant) a specific functional form that dictates the relationship between variables. In future modules, we will also see how matching does not depend on any functional form.
