Quasi-experimental designs

We’ve seen that randomized experiments can be costly or impossible to implement and that instrumental variables can have their own pitfalls (mainly, the assumptions are hard to satisfy).

Another way of doing causal studies is by taking advantage of quasi-experiments. Quasi-experimental designs or natural experiments are a special type of observational study. They involve observational data that just happens to fit some of the qualifications of a randomized experiment, not by design but just by the nature of the data.

Quasi-experimental designs can help mitigate the confounding issues in observational settings. In this module, we’ll use the terms quasi-experimental designs and natural experiments interchangeably. Let’s take a look at a study in economics.

Can watching Sesame Street improve future academic outcomes for children? Parents would love to know the answer to this question, but a lot could go wrong when making causal inferences here.

For starters, access to the show may be tied to factors that also impact the outcome variable. For example, household income may impact the likelihood of watching the show and will also impact academic outcomes.

How can quasi-experimental design help here? For the answer, we turn to researchers Kearney and Levine who authored a study on this very question.

In the early 1970s, Sesame Street wasn’t accessible to all children. Only about a third of American children watched the program while one-third of America was unable to watch the show even if they wanted to. Rather than focusing on non-random explanations for why some children would have access to the show over others, Kearney and Levine first focused on whether there were any random factors that helped explain which children had access to the show and which didn’t.

It turns out that at that time TV stations broadcast on either VHF or UHF channels, and the type of signal affected the likelihood of being able to watch Sesame Street. UHF signals were weaker, and many TV sets of the era received them poorly, so households in UHF markets had spotty access to shows like Sesame Street.

Because of federal laws in the US, the show could only be aired on the weaker UHF signals in cities like LA or Washington DC, but in cities such as New York and Boston the show was aired via the stronger VHF signals. So kids were “naturally” randomized into having good or bad access to the show. Note, however, that a weak signal didn’t mean the kids never watched the show, and a strong signal didn’t mean they definitely watched it.

What is the problem mentioned above called?
SUTVA violation
Non-compliance
Non-ignorable treatment
The fundamental problem of causal inference

The interesting finding of the study was that children who had good access to the show benefited academically, and the benefits were largest for children from disadvantaged families.

What is natural about natural experiments?

What distinguishes natural experiments from other types of observational studies is the random (or as-if random) assignment of the treatment. This brings natural experiments closer to the realm of randomized experiments, the main difference being that the treatment is not randomly assigned by the researcher.

You may have also noticed some similarities between IV studies and quasi-experimental methods. In fact, instrumental variables can also be thought of as quasi-experimental methods. The natural experiment mentioned above could easily be interpreted as an instrumental variable study where the instrument (or the encouragement) is the strength of the signal in one’s city. The instrument arises from forces of nature or government policy, which produce a situation similar to a randomized experiment. Likewise, the Vietnam-era lottery study that we covered in the previous module could be considered a natural experiment.
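To make the IV reading concrete, here is a minimal simulation sketch. It is not the authors’ analysis: the data are synthetic, and every name and number (the function names, a true effect of 2.0, the watch probabilities, an unobserved "income" confounder) is a hypothetical assumption chosen for illustration. It compares a naive watched-versus-not comparison with the Wald (IV) estimate that uses signal strength as the instrument.

```python
import random
from statistics import mean

def simulate_signal_study(n=20000, true_effect=2.0, seed=0):
    """Synthetic encouragement design with non-compliance (all values hypothetical)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        income = rng.gauss(0, 1)             # unobserved confounder
        strong_signal = rng.random() < 0.5   # instrument: as-if random by city
        # Watching depends on the signal AND on income, so compliance is imperfect.
        p_watch = 0.2 + 0.5 * strong_signal + (0.2 if income > 0 else 0.0)
        watched = 1.0 if rng.random() < p_watch else 0.0
        score = 50 + true_effect * watched + 3.0 * income + rng.gauss(0, 1)
        rows.append((strong_signal, watched, score))
    return rows

def naive_estimate(rows):
    """Difference in mean score between watchers and non-watchers (confounded)."""
    watched = [score for _, w, score in rows if w == 1.0]
    not_watched = [score for _, w, score in rows if w == 0.0]
    return mean(watched) - mean(not_watched)

def wald_estimate(rows):
    """Return (intention-to-treat effect, first stage, Wald ratio)."""
    strong = [r for r in rows if r[0]]
    weak = [r for r in rows if not r[0]]
    itt = mean(r[2] for r in strong) - mean(r[2] for r in weak)
    first_stage = mean(r[1] for r in strong) - mean(r[1] for r in weak)
    return itt, first_stage, itt / first_stage

rows = simulate_signal_study()
itt, first_stage, wald = wald_estimate(rows)
print(f"naive: {naive_estimate(rows):.2f}  ITT: {itt:.2f}  "
      f"first stage: {first_stage:.2f}  Wald: {wald:.2f}")
```

In this synthetic setup the naive comparison overstates the effect because higher-income children are more likely to watch, while the Wald ratio lands close to the assumed effect because signal strength is unrelated to income by construction.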

In some natural experiments, nature (the physical world) does the randomization for us. For instance, Burke et al. exploit randomness in rainfall to understand the effect of climate on conflict. But be careful not to interpret the word “nature” or “natural” as only referencing non-human effects or intervention. In the Vietnam War lottery example, it was a bunch of bureaucrats who came up with the draft policy, and hence, determined which people received the treatment. When we say “natural” experiments, we are simply highlighting the fact that the treatment occurred naturally without the involvement of the researcher. The bureaucrats in the Vietnam study had no intention of assessing the effect of military service on future income. They just wanted to send people to war 🙁

In a perfect natural experiment, the treatment is fully randomized and can be assumed to be exogenous (the ignorability assumption is satisfied). In other words, we can assume that those who ended up in the treatment group were just as likely to end up in the control group and vice versa.

Although this natural or haphazard assignment to treatment doesn’t necessarily guarantee randomness, it is still preferable to self-selection into treatment by the subjects themselves, which is the case for most observational studies.
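The contrast between haphazard assignment and self-selection can be sketched with a small simulation. This is an illustration under stated assumptions only: the data are synthetic, and the function name, the true effect of 2.0, the "income" confounder, and the opt-in probabilities are all hypothetical.

```python
import random
from statistics import mean

def difference_in_means(self_selected, n=20000, true_effect=2.0, seed=1):
    """Treated-minus-control mean outcome on synthetic data (hypothetical values)."""
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n):
        income = rng.gauss(0, 1)  # affects the outcome and, possibly, take-up
        if self_selected:
            # Subjects choose: higher-income households opt in more often.
            p_treat = 0.8 if income > 0 else 0.2
        else:
            # Haphazard assignment: unrelated to income.
            p_treat = 0.5
        is_treated = rng.random() < p_treat
        outcome = true_effect * is_treated + 3.0 * income + rng.gauss(0, 1)
        (treated if is_treated else control).append(outcome)
    return mean(treated) - mean(control)

print("haphazard assignment:", round(difference_in_means(self_selected=False), 2))
print("self-selection:      ", round(difference_in_means(self_selected=True), 2))
```

With assignment unrelated to income, the simple difference in means sits near the assumed effect. With self-selection, the same comparison also picks up the income gap between the two groups and overstates the effect.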

Here are some examples of quasi-experimental designs in practice.

Examples of natural experiments

The Oregon Health Insurance Experiment

One of the most well-known natural experiments in public health and economics is the Oregon Health Insurance Experiment. The study concerned Medicaid, a joint federal and state program in the United States designed to help cover medical expenses for low-income individuals.

In early 2008, annual enrollment for Medicaid had closed, but the state of Oregon decided to re-open enrollments for a limited number of spots. When they did this, about 90,000 people signed up for only 10,000 remaining spots. What did the state do?

They ran a lottery to determine who should get the benefit. 10,000 people were randomly selected to receive the benefit out of the 90,000 who applied. Researchers were then able to study the impact of access to Medicaid on health-related outcomes by comparing those who won the lottery to those who did not.
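Why does a lottery make winners and losers comparable? A rough sketch, using the 90,000 applicants and 10,000 spots mentioned above but entirely synthetic data: the baseline health index and the function name are hypothetical, introduced only to illustrate a balance check.

```python
import random
from statistics import mean

def lottery_balance_gap(n_applicants=90_000, n_spots=10_000, seed=42):
    """Draw a lottery and compare a baseline covariate between winners and losers."""
    rng = random.Random(seed)
    # Hypothetical standardized health index measured BEFORE the lottery.
    baseline_health = [rng.gauss(0, 1) for _ in range(n_applicants)]
    # Pick winners uniformly at random, without replacement.
    winners = set(rng.sample(range(n_applicants), n_spots))
    winner_mean = mean(baseline_health[i] for i in winners)
    loser_mean = mean(
        baseline_health[i] for i in range(n_applicants) if i not in winners
    )
    return len(winners), winner_mean - loser_mean

n_winners, gap = lottery_balance_gap()
print(n_winners, "winners; baseline gap between winners and losers:", round(gap, 3))
```

Because the draw ignores everyone’s characteristics, the baseline gap should be close to zero, so later differences in outcomes between the two groups can more plausibly be attributed to winning the lottery.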

Why is the experiment above a natural experiment and not a randomized experiment?
Researchers decided who received treatment and who didn’t.
The state of Oregon had no intention of doing research or running an experiment.
Assignment to the treatment (Medicaid benefits) was not random.
Assignment to treatment was based on a non-human intervention.

Long-term effects of in-utero influenza exposure during the 1918 influenza pandemic

We all know about pandemics by now. In 1918, at the end of WWI, a flu pandemic killed more people around the world than had died in the war. The pandemic caught the US by surprise in October 1918 and lasted until January of the following year. During that time, about one-third of women of childbearing age contracted the virus. Researchers have used this data to study the long-term health effects of in-utero exposure to the virus.

Because parents could not have planned their pregnancies around an unexpected pandemic, among children born at around the same time, some were in utero while the virus was circulating and some weren’t. Data showed that cohorts in utero during the pandemic displayed reduced educational attainment, increased rates of physical disability, lower income, lower socioeconomic status, and higher transfer payments compared with other birth cohorts.

Why is this a natural experiment? Clearly, no researcher imposed the virus on the individuals involved in the study. And because the pandemic arrived unexpectedly, treatment assignment, i.e., in-utero exposure to the virus, can be treated as close to random.

Class size and academic achievement

The causal effect of class size on academic achievement is a tough causal question. Typically, we worry a lot about selection bias, because schools with small classes tend to be very different from those with large class sizes.

In Israel, however, a historical rule called Maimonides’ rule states that classes may not exceed 40 students (a magical number). If a class exceeds 40 students, one of two things happens. The class is either divided into two smaller classes or some of the students enrolled are redistributed to another class so that the class is capped at 40. This provides data for a natural experiment study on how class size affects academic outcomes.

Why is this a natural experiment? The class-size cutoff wasn’t created by researchers; it was dictated by policy. Why is it experimental? Well, cohorts with 41 students shouldn’t be in very different schools compared to cohorts with 40 students. The only difference is that a cohort of 41 gets broken down into two smaller classes but a cohort of 40 doesn’t. In the next lesson, we will see that this is a classic case of a regression discontinuity design, which is a particular type of natural experiment.
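The jump in class size at the cutoff can be written down directly. The sketch below assumes that a cohort is split into the smallest number of equal-sized classes that respects the cap. The splitting convention and the function name are assumptions for illustration, since schools may instead redistribute students as described above.

```python
def predicted_class_size(enrollment, cap=40):
    """Average class size if a cohort is split into the fewest classes under the cap.

    Assumes equal-sized classes; `cap` defaults to the 40-student limit.
    """
    if enrollment <= 0:
        raise ValueError("enrollment must be a positive integer")
    n_classes = (enrollment - 1) // cap + 1  # 1 class up to 40, 2 up to 80, ...
    return enrollment / n_classes

for enrollment in (39, 40, 41, 80, 81):
    print(enrollment, "students ->", predicted_class_size(enrollment), "per class")
```

Under this convention, a cohort of 40 stays in one class of 40, while a cohort of 41 is split into two classes averaging 20.5 students. That sharp drop for one extra enrolled student is what makes the comparison around the cutoff informative.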

These are just a few examples of natural or quasi-experimental settings. These scenarios don’t eradicate confounding bias but lessen its effect on our causal estimates. For instance, you could still argue that parents to some extent plan the month in which they want their children to be born. As a result, the month of birth might be correlated with parental characteristics, and there is still some confounding.

If we look around, we may find natural experiments that would help us answer causal questions. Finding natural experiments requires some subject matter knowledge: knowing policies and previous research, following the news, etc. Can you think of an example around you?

Two types of quasi-experimental methods

The examples above all fit under the category of quasi-experimental methods in causal inference. In general, any natural experiment or quasi-experimental design falls into one of the following two categories:

  1. A setting where we can believe that the treatment is assigned almost randomly. We will see in the next lesson that regression discontinuity designs fit this description. Some argue that instrumental variables are a form of approximate randomized experiment. Additionally, randomized experiments created by natural random elements (such as weather patterns, date of birth, genes, etc.) are examples of this type of natural experiment.
  2. A setting where it’s hard to believe that the treatment was assigned randomly but there is no obvious reason to believe the treatment group is significantly different from the control group. We will see soon that synthetic control methods and difference-in-differences methods help us with this type of natural experiment. This form of natural experiment is less compelling and harder to sell as a quasi-experimental approach because the treatment assignment isn’t perceived to be random.

In this module, we’ll go over some quasi-experimental methods that might help you understand causal effects in the absence of a proper randomized experiment. Get ready to have some fun with quasi-experimental methods 🎩
