
Here we show how to deal with confounding events; see Tsai (2024) for the full treatment. We first simulate some data that include a confounding event.

library(fastdid)
library(ggplot2)
library(data.table)
simdt <- sim_did(1e+04, 5, cov = "cont", hetero = "all", balanced = TRUE, seed = 1, 
                 second_cohort = TRUE, second_het = "no") # confounding event
dt <- simdt$dt #dataset

#ground truth att
att <- simdt$att |> merge(dt[,.(w = .N),by = "G"], by = "G")
att[, event_time := time-G]
att <- att[event == 1,.(att = weighted.mean(attgt, w)), by = "event_time"]
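The aggregation above averages group-time effects within each event time, weighting by cohort size. A minimal sketch of the same arithmetic on a toy table may make this easier to follow; the names `toy_att`, `sizes` and `toy_dyn` and all numbers are hypothetical and are not taken from `sim_did`.

```r
library(data.table)

# Hypothetical group-time effects for two cohorts (G = 2 and G = 3)
toy_att <- data.table(G     = c(2, 2, 3, 3),
                      time  = c(2, 3, 3, 4),
                      attgt = c(1, 2, 3, 5))

# Hypothetical cohort sizes, used as aggregation weights
sizes <- data.table(G = c(2, 3), w = c(300, 100))

# Attach the weights, convert calendar time to event time,
# then take the size-weighted mean within each event time
toy_att <- merge(toy_att, sizes, by = "G")
toy_att[, event_time := time - G]
toy_dyn <- toy_att[, .(att = weighted.mean(attgt, w)), by = "event_time"]
print(toy_dyn)
```

At event time 0 the larger cohort dominates: (300 * 1 + 100 * 3) / 400 = 1.5.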

Using the default estimator, the estimates (black line) are biased relative to the ground truth (red line).

naive_result <- fastdid(data = dt, 
                 timevar = "time", cohortvar = "G", unitvar = "unit", 
                 outcomevar = "y", result_type = "dynamic")
plot_did_dynamics(naive_result) + 
  geom_line(aes(y = att, x = event_time), 
            data = att, color = "red") + theme_bw()

Diagnostics can be obtained by using the indicator time >= G2 as the outcome. Because the effect of the confounding event is set to a constant 10, the bias of the default estimator is roughly 10 times the diagnostic estimates.

dt[, D2 := time >= G2]
diag <- fastdid(data = dt, 
                 timevar = "time", cohortvar = "G", unitvar = "unit",
                 outcomevar = "D2", result_type = "dynamic")
plot_did_dynamics(diag) + theme_bw()
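The "10 times" relationship follows from the linearity of difference-in-differences: if the confounding event shifts the outcome by a constant, the bias equals that constant times the DiD of the exposure indicator. The sketch below checks this in a hand-rolled two-period, two-group setting. It does not use fastdid; the helper `did_2x2`, the table `toy` and the exposure shares are hypothetical choices for illustration.

```r
library(data.table)

set.seed(1)
n <- 2000

# Hypothetical two-period panel; the first half of units gets the main event in period 2
toy <- CJ(unit = 1:n, time = 1:2)
toy[, treated := unit <= n / 2]

# The confounding event hits 60% of treated units and 20% of control units in period 2
toy[, hit := fifelse(treated, unit %% 5 < 3, unit %% 5 < 1)]
toy[, D2 := as.numeric(time == 2 & hit)]

# The main event has zero effect here, so any estimated effect on y is pure bias
toy[, y := 10 * D2 + rnorm(.N)]

# Plain 2x2 difference-in-differences of group means
did_2x2 <- function(d, col) {
  means <- d[, .(avg = mean(get(col))), by = .(treated, time)]
  (means[treated == TRUE  & time == 2, avg] - means[treated == TRUE  & time == 1, avg]) -
    (means[treated == FALSE & time == 2, avg] - means[treated == FALSE & time == 1, avg])
}

bias      <- did_2x2(toy, "y")   # naive estimate minus the true effect of 0
diag_toy  <- did_2x2(toy, "D2")  # differential exposure to the confounding event
c(bias = bias, ten_times_diag = 10 * diag_toy)
```

The two numbers agree up to sampling noise, which is the pattern the diagnostic plot is meant to reveal.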

Using the double DiD estimator, with the confounding event's cohort supplied through cohortvar2, the estimates recover the ground truth.

double_result <- fastdid(data = dt, 
                 timevar = "time", cohortvar = "G", unitvar = "unit",
                 outcomevar = "y", result_type = "dynamic",
                 cohortvar2 = "G2", event_specific = TRUE)
plot_did_dynamics(double_result) + 
  geom_line(aes(y = att, x = event_time), 
            data = att, color = "red") + theme_bw()

Double DiD also allows two additional aggregation schemes: group-group-time ("group_group_time") and dynamic-staggered ("dynamic_stagger", event time by event stagger, G1-G2).

double_result_ds <- fastdid(data = dt, 
                 timevar = "time", cohortvar = "G", unitvar = "unit",
                 outcomevar = "y", result_type = "dynamic_stagger",
                 cohortvar2 = "G2", event_specific = TRUE)

double_result_ggt <- fastdid(data = dt, 
                 timevar = "time", cohortvar = "G", unitvar = "unit",
                 outcomevar = "y", result_type = "group_group_time",
                 cohortvar2 = "G2", event_specific = TRUE)
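To make the two dimensions of the dynamic-staggered scheme concrete, the sketch below computes event time and event stagger by hand on a toy panel. It assumes that G1 corresponds to the `G` column and that stagger is defined as G - G2, following the "G1-G2" description above; the table `toy_panel` and its values are hypothetical, and the column names are not claimed to match what fastdid returns.

```r
library(data.table)

# Hypothetical units with the timing of the main event (G) and the confounding event (G2)
toy_panel <- data.table(unit = 1:3, G = c(3, 3, 4), G2 = c(3, 5, 2))

# Expand to a balanced panel with five periods per unit
toy_panel <- toy_panel[, .(time = 1:5), by = .(unit, G, G2)]

toy_panel[, event_time := time - G]     # periods since the main event
toy_panel[, event_stagger := G - G2]    # assumed convention: G1 - G2

# Each (event_time, event_stagger) pair is one cell of the aggregation
unique(toy_panel[, .(event_time, event_stagger)])
```

Under this convention, a stagger of 0 means both events start in the same period, and a negative stagger means the main event comes first.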