The LATE estimator

Let’s get our mind set on our goal. We want to find the treatment effect in a randomized experiment in which not every subject assigned to the treatment actually receives it, and not every subject assigned to the control abstains from it. In other words, we have non-compliance. In randomized experiments such as these, non-compliance means that treatment assigned isn’t the same as treatment received. Assignment is still randomized, but treatment received is not, so the treatment effect can’t simply be estimated as the difference in average outcomes between those who received the treatment and those who didn’t.

A word of warning. This lesson has a lot of mathematical notation, but we’re pretty sure you’ll survive 🤓 Just go one step at a time.

One thing is easy to estimate: the causal effect of treatment assignment. As you’ve already seen, we call this the intention-to-treat (ITT) causal effect:

$$\text{ITT} = E(Y^{Z=1} - Y^{Z=0}) = E(Y \mid Z=1) - E(Y \mid Z=0)$$
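As a minimal sketch of the formula above, the ITT can be estimated as a plain difference in mean outcomes between the two assignment groups. The function name `itt_estimate` and the toy data are hypothetical and only serve as an illustration.

```python
from statistics import mean

def itt_estimate(z, y):
    """Difference in mean outcomes: assigned to treatment minus assigned to control."""
    assigned_treatment = [yi for zi, yi in zip(z, y) if zi == 1]
    assigned_control = [yi for zi, yi in zip(z, y) if zi == 0]
    return mean(assigned_treatment) - mean(assigned_control)

# Hypothetical toy data: z is treatment assigned, y is a binary outcome.
z = [1, 1, 1, 1, 0, 0, 0, 0]
y = [1, 1, 1, 0, 1, 0, 0, 0]

itt = itt_estimate(z, y)
print(itt)  # mean outcome 0.75 under Z=1 minus 0.25 under Z=0
```

Note that treatment received plays no role here: the ITT only compares groups by assignment.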

Now let’s see how we go from the ITT to the actual treatment effect (the effect of treatment received). We’ll use the following acronyms:

  • Always-taker (AT)
  • Complier (CM)
  • Never-taker (NT)
  • Defier (DF)

With these four compliance groups, we can break down $E(Y \mid Z=1)$ into the contribution of each group:

$$E(Y \mid Z=1) = E(Y \mid Z=1, \text{AT}) \Pr(\text{AT}) + E(Y \mid Z=1, \text{CM}) \Pr(\text{CM}) + E(Y \mid Z=1, \text{NT}) \Pr(\text{NT}) + E(Y \mid Z=1, \text{DF}) \Pr(\text{DF})$$

With our monotonicity assumption (the no-defier assumption), we assumed away the defiers, so $\Pr(\text{DF}) = 0$. The last term in our equation, $E(Y \mid Z=1, \text{DF}) \Pr(\text{DF})$, is therefore equal to zero and can be dropped:

$$E(Y \mid Z=1) = E(Y \mid Z=1, \text{AT}) \Pr(\text{AT}) + E(Y \mid Z=1, \text{CM}) \Pr(\text{CM}) + E(Y \mid Z=1, \text{NT}) \Pr(\text{NT})$$

For our always-takers and never-takers, the assignment, $Z$, should have no effect on the outcome, $Y$. Remember, the always-takers and the never-takers will always/never take the treatment regardless of their assignment, and we assume that assignment influences the outcome only through the treatment actually received. Therefore, $E(Y \mid Z=1, \text{AT}) = E(Y \mid \text{AT})$ and $E(Y \mid Z=1, \text{NT}) = E(Y \mid \text{NT})$, and we can further simplify the expression as follows:

$$E(Y \mid Z=1) = E(Y \mid \text{AT}) \Pr(\text{AT}) + E(Y \mid Z=1, \text{CM}) \Pr(\text{CM}) + E(Y \mid \text{NT}) \Pr(\text{NT})$$

Similarly, for $Z=0$ we have:

$$E(Y \mid Z=0) = E(Y \mid \text{AT}) \Pr(\text{AT}) + E(Y \mid Z=0, \text{CM}) \Pr(\text{CM}) + E(Y \mid \text{NT}) \Pr(\text{NT})$$

Next, we can replace $E(Y \mid Z=1)$ and $E(Y \mid Z=0)$ in the intention-to-treat estimator formula above:

$$\begin{aligned}
\text{ITT} &= E(Y^{Z=1} - Y^{Z=0}) = E(Y \mid Z=1) - E(Y \mid Z=0) \\
&= E(Y \mid \text{AT}) \Pr(\text{AT}) + E(Y \mid Z=1, \text{CM}) \Pr(\text{CM}) + E(Y \mid \text{NT}) \Pr(\text{NT}) \\
&\quad - E(Y \mid \text{AT}) \Pr(\text{AT}) - E(Y \mid Z=0, \text{CM}) \Pr(\text{CM}) - E(Y \mid \text{NT}) \Pr(\text{NT})
\end{aligned}$$

The always-taker and never-taker terms cancel, so this simplifies to:

$$\begin{aligned}
\text{ITT} &= E(Y \mid Z=1) - E(Y \mid Z=0) \\
&= E(Y \mid Z=1, \text{CM}) \Pr(\text{CM}) - E(Y \mid Z=0, \text{CM}) \Pr(\text{CM}) \\
&= \big(E(Y \mid Z=1, \text{CM}) - E(Y \mid Z=0, \text{CM})\big) \Pr(\text{CM})
\end{aligned}$$

With a little bit of rearrangement we have:

$$E(Y \mid Z=1, \text{CM}) - E(Y \mid Z=0, \text{CM}) = \dfrac{\text{ITT}}{\Pr(\text{CM})}$$

Finally, for compliers, treatment received is equal to treatment assigned, so we can replace $Z=1$ with $D=1$ and $Z=0$ with $D=0$. We then get:

$$E(Y \mid D=1, \text{CM}) - E(Y \mid D=0, \text{CM}) = E(Y^1 \mid \text{CM}) - E(Y^0 \mid \text{CM}) = \dfrac{\text{ITT}}{\Pr(\text{CM})}$$
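The derivation above can be checked numerically. The sketch below builds a small hypothetical population in which each subject's compliance type and both potential outcomes are known (something we never observe in practice), then confirms that the ITT equals the complier effect multiplied by the share of compliers. All numbers and names are made up for illustration.

```python
from statistics import mean

# Hypothetical population of (compliance type, Y0, Y1); no defiers (monotonicity).
population = (
    [("AT", 5.0, 7.0)] * 2   # always-takers
    + [("CM", 3.0, 6.0)] * 5  # compliers
    + [("NT", 4.0, 4.5)] * 3  # never-takers
)

def received(kind, z):
    """Treatment received D, given compliance type and assignment z."""
    if kind == "AT":
        return 1
    if kind == "NT":
        return 0
    return z  # compliers do what they are assigned

def expected_outcome(z):
    """E(Y | Z=z): because Z is randomized, each arm mirrors the whole population."""
    return mean(y1 if received(kind, z) else y0 for kind, y0, y1 in population)

itt = expected_outcome(1) - expected_outcome(0)
share_cm = mean(1 if kind == "CM" else 0 for kind, _, _ in population)
complier_effect = mean(y1 - y0 for kind, y0, y1 in population if kind == "CM")

late = itt / share_cm
print(itt, complier_effect * share_cm, late)
```

The always-taker and never-taker outcomes are identical in both arms, so they drop out of the difference, just as in the algebra.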

LATE estimator

Remember that we wanted to estimate the effect of treatment received on the outcome. Well, $E(Y^1 \mid \text{CM}) - E(Y^0 \mid \text{CM})$ is what we are looking for, except it’s the treatment effect only among the compliers. Because the treatment effect applies only to a subset of the population, we call it the local average treatment effect (LATE).

LATE is the average treatment effect among compliers: the subjects who receive the treatment if and only if they are assigned to it.

LATE is related to ITT through the following relationship:

$$\text{LATE} = \dfrac{\text{ITT}}{\Pr(\text{CM})} = \dfrac{E(Y \mid Z=1) - E(Y \mid Z=0)}{\Pr(\text{CM})}$$

As we said before, ITT is easy to estimate because $Z$ is randomized and is therefore independent of the potential outcomes. We can also easily calculate $\Pr(\text{CM})$ as we did in the previous lesson. $\Pr(\text{CM})$ is the share of compliers in the sample, which we found to be:

$$\text{Compliance rate} = \text{Share of subjects that were treated among all who were encouraged} - \text{Share of subjects that were treated among all who were not encouraged}$$

The expression above can be written as:

$$\text{Share of compliers} = E(D \mid Z=1) - E(D \mid Z=0)$$

Finally, let’s put everything together:

$$\text{LATE} = \dfrac{\text{ITT}}{\Pr(\text{CM})} = \dfrac{E(Y \mid Z=1) - E(Y \mid Z=0)}{E(D \mid Z=1) - E(D \mid Z=0)}$$
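A minimal sketch of this ratio in code, assuming three equal-length lists for assignment `z`, treatment received `d`, and outcome `y`. The function name `late_estimate` and the toy data are hypothetical.

```python
from statistics import mean

def diff_by_assignment(z, values):
    """Mean of `values` among Z=1 minus mean of `values` among Z=0."""
    assigned = [v for zi, v in zip(z, values) if zi == 1]
    not_assigned = [v for zi, v in zip(z, values) if zi == 0]
    return mean(assigned) - mean(not_assigned)

def late_estimate(z, d, y):
    """Ratio of the ITT on the outcome to the estimated share of compliers."""
    itt = diff_by_assignment(z, y)             # E(Y|Z=1) - E(Y|Z=0)
    share_compliers = diff_by_assignment(z, d)  # E(D|Z=1) - E(D|Z=0)
    if share_compliers == 0:
        raise ValueError("Assignment did not change treatment uptake; LATE is undefined.")
    return itt / share_compliers

# Hypothetical toy data: five subjects per assignment arm.
z = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
d = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]  # uptake: 0.8 if encouraged, 0.2 otherwise
y = [1, 1, 1, 0, 0, 1, 0, 0, 0, 0]  # mean outcome: 0.6 if encouraged, 0.2 otherwise

late = late_estimate(z, d, y)
print(round(late, 3))  # ITT of about 0.4 divided by a complier share of about 0.6
```

The guard against a zero denominator matters in practice: if assignment barely moves uptake, the ratio becomes very unstable.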

which translates to:

$$\text{Causal effect of treatment received on the outcome among compliers} = \dfrac{\text{Causal effect of treatment assigned on the outcome}}{\text{Causal effect of treatment assigned on treatment received}}$$

Note that if we have perfect compliance and everybody is a complier, then $\Pr(\text{CM}) = 1$ and, as a result, $\text{LATE} = \text{ITT}$. With non-compliance, $\Pr(\text{CM}) < 1$, so dividing by it scales the ITT up: LATE has the same sign as ITT but is larger in magnitude (whenever ITT is nonzero). In other words, the ITT understates the size of the effect among compliers.
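As a quick worked example with hypothetical numbers, suppose the ITT is 0.12 and 60 percent of subjects are compliers. The ITT is then scaled up by the inverse of the complier share:

```latex
\text{LATE} = \frac{\text{ITT}}{\Pr(\text{CM})} = \frac{0.12}{0.6} = 0.2
```

With the same ITT and perfect compliance, the denominator would be 1 and LATE would stay at 0.12.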

LATE and ATT

We just saw that LATE captures the average treatment effect among compliers. So it’s not really an average treatment effect (ATE) or even an average treatment effect among the treated (ATT).

Consider a special case where the non-encouraged group is excluded from taking the treatment. For instance, imagine an encouragement design where only the encouraged group can enroll in a training program because the experiment is designed so that the encouraged group receives a link in an email to enroll in the program. In contrast, the non-encouraged group doesn’t get the link. Therefore, those in the non-encouraged group can’t enroll in the program because they don’t even know about it. However, note that this could still be a case of non-compliance because those in the encouraged group can still choose whether they want to enroll or not.

Based on the table we saw in the previous lesson, there won’t be any second row because we can’t have a case where $Z=0$ and $D=1$. Therefore, there won’t be any always-takers (we already assumed there are no defiers). If there are no always-takers, then according to the fourth row of the table, everybody who is treated is a complier.

In short, all treated units are compliers. And because encouragement is randomized, the compliers who end up treated are representative of all compliers.

Therefore, LATE, which captures the treatment effect among the compliers, also captures the treatment effect among the treated, so $\text{LATE} = \text{ATT}$.
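To illustrate this special case, the sketch below uses a hypothetical population with only compliers and never-takers, where both potential outcomes are known for every subject. It compares the LATE ratio with the effect among the treated and with the effect in the whole population. All names and numbers are assumptions made for the example.

```python
from statistics import mean

# Hypothetical population of (compliance type, Y0, Y1).
# No always-takers: the non-encouraged group cannot get the treatment.
population = (
    [("CM", 3.0, 6.0)] * 3
    + [("CM", 2.0, 4.0)] * 2
    + [("NT", 4.0, 9.0)] * 5
)

def received(kind, z):
    """Compliers follow their assignment; never-takers are never treated."""
    return z if kind == "CM" else 0

def mean_outcome(z):
    """E(Y | Z=z), using the fact that randomization makes both arms alike."""
    return mean(y1 if received(kind, z) else y0 for kind, y0, y1 in population)

def mean_received(z):
    """E(D | Z=z)."""
    return mean(received(kind, z) for kind, _, _ in population)

itt = mean_outcome(1) - mean_outcome(0)
share_compliers = mean_received(1) - mean_received(0)
late = itt / share_compliers

# Only encouraged subjects can be treated, so the treated are the Z=1 subjects with D=1.
att = mean(y1 - y0 for kind, y0, y1 in population if received(kind, 1) == 1)
ate = mean(y1 - y0 for _, y0, y1 in population)

print(late, att, ate)
```

In this made-up population the never-takers have a larger treatment effect than the compliers, so LATE matches the ATT but not the ATE.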

We run an encouragement design in which students who are encouraged receive information about applying for government-sponsored higher education financial aid. We want to know how financial aid helps with college graduation. We find out that those in the encouraged group are 20 percentage points more likely to graduate than those in the non-encouraged group. We also know that the share of subjects treated in the encouraged group is 0.9, and the share of subjects treated in the non-encouraged group is 0.1. What is the estimate of the local average treatment effect of the financial aid on graduation among the compliers (in percentage points)?

  • 23
  • 15
  • 18
  • 25
