Treatment effects

So far we have mainly talked about two treatment effects (also called causal effects): the average treatment effect (ATE) and the average treatment effect on the treated (ATT). Let’s discuss these treatment effects in more detail.

Comparison of treatment effects

ATE is a concept based on “averages” and, therefore, has the advantages and disadvantages of averages. For one, ATE is sensitive to outliers. If the treatment effect for a specific subject is very large, it will highly influence the average treatment effect.

Additionally, ATE does not reflect the treatment effect for any specific subject. It is the treatment effect averaged across all subjects.
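A small numeric sketch of both points. The individual effects below are made-up numbers (in practice individual treatment effects are never observed directly); the only purpose is to show how a single extreme subject moves the average.

```python
import numpy as np

# Hypothetical individual treatment effects (in dollars) for six subjects.
effects = np.array([500, 700, 600, 400, 800, 600])

# Same subjects, except the last one now has an extreme effect.
effects_with_outlier = np.array([500, 700, 600, 400, 800, 60000])

ate = effects.mean()
ate_with_outlier = effects_with_outlier.mean()
median_with_outlier = np.median(effects_with_outlier)

print("ATE without outlier:", ate)
print("ATE with outlier:", ate_with_outlier)
print("Median effect with outlier:", median_with_outlier)
```

With the outlier, the average no longer resembles the effect of any individual subject, while the median barely moves.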

In program evaluation studies, policymakers may be interested in something beyond the “average” impact of the program. Think about a case where the average impact of a treatment is negative but the treatment is still appealing because of its distributional impacts.

We can imagine two job training programs that both have the same average treatment effect (close to zero) measured in terms of increases in future earnings. However, the first one increases future earnings at the top of the earnings distribution while the other increases future earnings at the bottom of the distribution. Policymakers would potentially be interested in the second one and not the first one even if both programs don’t really increase future earnings on average.

In such cases, we can look at quantile treatment effects. For instance, we can look at the treatment effect among the bottom quartile (the bottom 25% of the distribution). We may also want to look at median treatment effects (50%).
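The two-program story can be sketched with made-up numbers. The earnings vectors and the helper `quantile_effect` below are hypothetical, and the sketch assumes one common definition of a quantile treatment effect: the difference between the same quantile of the treated and untreated outcome distributions.

```python
import numpy as np

# Hypothetical earnings without training for eight workers.
y0 = np.array([10000, 15000, 20000, 25000, 30000, 40000, 60000, 90000])

# Program A (made up): gains at the top, losses at the bottom.
y1_a = y0 + np.array([-2000, -2000, -1000, -1000, 0, 0, 3000, 3000])
# Program B (made up): gains at the bottom, losses at the top.
y1_b = y0 + np.array([3000, 3000, 0, 0, -1000, -1000, -2000, -2000])

def quantile_effect(y1, y0, q):
    """Difference between the q-th quantiles of the two outcome distributions."""
    return np.quantile(y1, q) - np.quantile(y0, q)

# Both programs have the same average effect.
ate_a = np.mean(y1_a - y0)
ate_b = np.mean(y1_b - y0)

# The bottom-quartile effects tell the two programs apart.
q25_a = quantile_effect(y1_a, y0, 0.25)
q25_b = quantile_effect(y1_b, y0, 0.25)

print("ATE:", ate_a, ate_b)
print("Effect at the 25th percentile:", q25_a, q25_b)
```

Both averages are zero, yet the effect at the 25th percentile is negative for program A and positive for program B, which is the kind of difference an average alone hides.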

ATE vs ATT

When it comes to causal effects, we are particularly interested in two types of causal effects: ATE (which is the one we’ve talked about so far) and the average treatment effect on the treated or ATT.

As we previously saw, ATT is basically the treatment effect if we only look at those who received the treatment and not everybody in the study.

$$\text{ATT} = E[Y^1 - Y^0 \mid D=1] = E[Y^1 \mid D=1] - E[Y^0 \mid D=1]$$

In other words, ATT is the average difference between the potential outcomes under treatment and under no treatment, taken over the subjects who actually received the treatment ($D=1$).

In general, ATT is not equal to ATE, and the two measure different things. Many medical and program evaluation studies focus on ATT because we often care about the causal effect of a drug or program for the patients or participants who actually receive it. In our job training example, we’re interested in how the program affects the future earnings of those who actually received the training, not those who didn’t.

For a medical example, suppose we are trying to estimate the effect of smoking on the incidence of lung cancer. In this case, we are basically interested in ATT as opposed to ATE since we won’t be interested in the effect of smoking among non-smokers.

ATE also implicitly assumes that every participant could be switched from their current treatment status to the opposite one, which usually isn’t possible.

If ATT differs from ATE, this is likely an indication that the treatment assignment was non-random. When assignment is non-random, the group that received the treatment is not a representative sample of all the subjects in the study, so the average effect within that group need not match the average effect overall.
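A simulation can illustrate this. The sketch below is only an illustration under stated assumptions: the potential outcomes are simulated (so both are known, which never happens with real data), the distributions and parameters are arbitrary, and the helper `ate_and_att` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulated potential outcomes; the individual effect varies across subjects.
y0 = rng.normal(30000, 5000, size=n)
effect = rng.normal(2000, 3000, size=n)
y1 = y0 + effect

def ate_and_att(d):
    """Return (ATE, ATT) for a 0/1 treatment vector d."""
    diff = y1 - y0
    return diff.mean(), diff[d == 1].mean()

# Random assignment: a coin flip unrelated to the potential outcomes.
d_random = rng.integers(0, 2, size=n)
# Non-random assignment: only subjects who benefit select into treatment.
d_selected = (effect > 0).astype(int)

ate_r, att_r = ate_and_att(d_random)
ate_s, att_s = ate_and_att(d_selected)

print("Random assignment:     ATE =", round(ate_r), " ATT =", round(att_r))
print("Non-random assignment: ATE =", round(ate_s), " ATT =", round(att_s))
```

Under the coin-flip assignment the two quantities nearly coincide, whereas under self-selection ATT is noticeably larger than ATE because the treated group consists of the subjects who gain from treatment.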

Imagine we know the treatment vector (t) and both potential outcomes (y0 without treatment, y1 with treatment) for six subjects, as given below:

t <- c(1, 0, 1, 1, 0, 0)
y0 <- c(20000, 12000, 30000, 70000, 40000, 22000)
y1 <- c(18000, 13000, 57000, 70000, 45000, 23000)
# Creating a data frame of all the three vectors
dt <- data.frame(t, y0, y1)
# Calculating the average treatment effect on the treated
att <- mean(dt$y1[dt$t == 1] - dt$y0[dt$t ==1])

import pandas as pd

t = [1, 0, 1, 1, 0, 0]
y0 = [20000, 12000, 30000, 70000, 40000, 22000]
y1 = [18000, 13000, 57000, 70000, 45000, 23000]

# Creating a data frame of all the three vectors
dt = pd.DataFrame({'t': t, 'y0': y0, 'y1': y1})

# Calculating the average treatment effect on the treated:
# the mean of (y1 - y0) over the rows where t == 1
treated = dt[dt['t'] == 1]
att = (treated['y1'] - treated['y0']).mean()
print(att)

clear
input t y0 y1
1 20000 18000
0 12000 13000
1 30000 57000
1 70000 70000
0 40000 45000
0 22000 23000
end
* Calculating the average treatment effect on the treated
generate effect = y1 - y0
summarize effect if t == 1
display r(mean)

In the example above, what is the value of average treatment effect on the treated (ATT)?

Too lazy to calculate that

8333.33 dollars

4333.56 dollars

5333.33 dollars

Other treatment effects

There are other causal effects besides ATE and ATT that are used. Here’s a non-exhaustive list:

Average treatment effect on the untreated or ATU which is the average treatment effect for those in the control group
Causal relative risk, commonly calculated as the ratio of the expected potential outcomes, $E(Y^1)/E(Y^0)$
Conditional average treatment effect or CATE which is the average treatment effect in a subpopulation identified by one or more covariates such as the treatment effect among female workers, or treatment effect among large Asian countries
Conditional average treatment effect on the treated or CATT which is similar to the one above but only among those who are in the treatment group
Local average treatment effect or LATE which we will discuss in a later module on instrumental variables
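To make a few of these concrete, here is a minimal sketch using the six-subject example from this lesson. The `female` column is a hypothetical covariate added only to illustrate CATE; it is not part of the original data. As before, both potential outcomes are treated as known, which is possible only in a toy example.

```python
import pandas as pd

dt = pd.DataFrame({
    "t":  [1, 0, 1, 1, 0, 0],
    "y0": [20000, 12000, 30000, 70000, 40000, 22000],
    "y1": [18000, 13000, 57000, 70000, 45000, 23000],
    # Hypothetical covariate, not part of the original example.
    "female": [1, 0, 0, 1, 1, 0],
})
dt["effect"] = dt["y1"] - dt["y0"]

ate = dt["effect"].mean()                                 # all subjects
att = dt.loc[dt["t"] == 1, "effect"].mean()               # treated only
atu = dt.loc[dt["t"] == 0, "effect"].mean()               # untreated only
cate_female = dt.loc[dt["female"] == 1, "effect"].mean()  # subgroup

# ATE is the average of ATT and ATU, weighted by the share of treated subjects.
p_treated = dt["t"].mean()
ate_from_parts = p_treated * att + (1 - p_treated) * atu

print("ATE:", ate, "ATU:", atu, "CATE (female):", cate_female)
print("ATE rebuilt from ATT and ATU:", ate_from_parts)
```

The last two lines show how ATE, ATT, and ATU relate: when the treated and untreated groups have different average effects, ATE sits between ATT and ATU.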
