Staggered adoption • coresynth

library(coresynth)

When units adopt treatment at different times, coresynth fits each adoption cohort separately and aggregates the cohort ATTs with weights proportional to N_treated × T_post, the number of treated units in the cohort times its number of post-treatment periods (Clarke et al. 2023):

$\hat\tau = \frac{\sum_g N_{tr,g}\, T_{post,g}\, \hat\tau_g}{\sum_g N_{tr,g}\, T_{post,g}}.$

All six estimators support this. Staggered timing is detected automatically from the treatment column, so no extra flag is needed.

A staggered panel

Here u1 is treated from period 11 and u2 from period 16; the remaining ten units (u3 to u12) are never treated.

set.seed(42)
N <- 12; TT <- 20
f   <- cumsum(rnorm(TT, 0, 0.5))
lam <- rnorm(N, 1, 0.3)
dat <- expand.grid(time = seq_len(TT), id = paste0("u", seq_len(N)))
dat$y <- as.vector(outer(f, lam)) + rnorm(nrow(dat), 0, 0.3)

dat$d <- 0L
dat$d[dat$id == "u1" & dat$time > 10] <- 1L
dat$d[dat$id == "u2" & dat$time > 15] <- 1L
dat$y[dat$d == 1] <- dat$y[dat$d == 1] + 2.0   # true ATT = 2.0
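
Before fitting, it can help to confirm the cohort structure encoded in the treatment column. The base R sketch below rebuilds only the id, time and d columns of the panel and reads off each unit's first treated period. It is one way to inspect the data by hand and is not a description of how coresynth detects staggered timing internally.

```r
# Rebuild only the id/time/treatment columns of the example panel
dat <- expand.grid(time = 1:20, id = paste0("u", 1:12))
dat$d <- 0L
dat$d[dat$id == "u1" & dat$time > 10] <- 1L
dat$d[dat$id == "u2" & dat$time > 15] <- 1L

# First treated period per unit; never-treated units drop out of the table
treated <- dat[dat$d == 1L, ]
first_treated <- tapply(treated$time, droplevels(treated$id), min)
first_treated
#> u1 u2 
#> 11 16

# More than one distinct adoption period means the design is staggered
length(unique(first_treated)) > 1
#> [1] TRUE
```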

Fitting

fit <- scm_fit(y ~ d | id + time, data = dat, method = "sdid")
fit$estimate          # aggregate ATT
#> [1] 1.872811
fit$staggered         # TRUE
#> [1] TRUE

Per-cohort detail is in cohort_estimates:

fit$cohort_estimates
#>   cohort n_treated T_pre T_post estimate    weight
#> 1     11         1    10     10 1.839366 0.6666667
#> 2     16         1    15      5 1.939702 0.3333333
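
The aggregate ATT can be reproduced by hand from this table. The sketch below copies the printed cohort rows into a data frame (the values are taken from the output above, not recomputed by coresynth) and applies the N_treated × T_post weighting in base R.

```r
# Cohort table copied from the fit$cohort_estimates printout
cohorts <- data.frame(
  cohort    = c(11, 16),
  n_treated = c(1, 1),
  T_post    = c(10, 5),
  estimate  = c(1.839366, 1.939702)
)

# Weights proportional to N_treated x T_post, normalised to sum to one
w <- with(cohorts, n_treated * T_post)
w <- w / sum(w)
round(w, 7)
#> [1] 0.6666667 0.3333333

# Weighted average of the cohort ATTs
att <- sum(w * cohorts$estimate)
round(att, 6)
#> [1] 1.872811
```

The result agrees with fit$estimate: the period-11 cohort has twice as many post-treatment periods as the period-16 cohort, so it carries twice the weight.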

Choosing the control group

control_group controls which units serve as donors for each cohort:

"clean" (default) — never-treated units plus not-yet-treated units (those adopting later than the current cohort).
"never_treated" — never-treated units only.

fit_clean <- scm_fit(y ~ d | id + time, data = dat, method = "sdid",
                     control_group = "clean")
fit_nt    <- scm_fit(y ~ d | id + time, data = dat, method = "sdid",
                     control_group = "never_treated")
c(clean = fit_clean$estimate, never_treated = fit_nt$estimate)
#>         clean never_treated 
#>      1.872811      1.962067
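
The donor sets implied by the two options can be sketched in base R. The helper donor_pool() below is hypothetical and written only for illustration (it is not a coresynth function); it applies the definitions above to the adoption periods of the example panel.

```r
# Adoption period per unit for the example panel; Inf marks never-treated units
adopt <- c(u1 = 11, u2 = 16, setNames(rep(Inf, 10), paste0("u", 3:12)))

# Hypothetical helper (not part of coresynth): donors for the cohort adopting at g
donor_pool <- function(adopt, g, control_group = c("clean", "never_treated")) {
  control_group <- match.arg(control_group)
  # "clean": anything adopting strictly later than g, including never-treated
  # "never_treated": units that never adopt
  keep <- if (control_group == "clean") adopt > g else is.infinite(adopt)
  names(adopt)[keep]
}

donor_pool(adopt, 11, "clean")          # u2 plus u3..u12 (11 donors)
donor_pool(adopt, 16, "clean")          # u3..u12 only (10 donors)
donor_pool(adopt, 11, "never_treated")  # u3..u12 only (10 donors)
```

Under "clean" the two cohorts therefore have 11 and 10 donors. That would be consistent with the n_controls values of 10.5 and 11 printed in the inference section (a cohort average and a count of unique donors, respectively), although this reading is inferred from the output rather than documented behaviour.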

The choice involves a trade-off. "clean" gives earlier cohorts a larger donor pool, but a later adopter stays in that pool after it adopts, so its treated outcomes enter the earlier cohort’s synthetic control for the remaining post-treatment periods. In this panel, u2 is in the donor pool of the period-11 cohort and is itself treated in periods 16 to 20. This contamination attenuates the aggregate ATT, and more so the larger the later adopter’s treatment effect. If the effect you expect is sizeable relative to the outcome noise, "never_treated" is the safer choice; it matches the fixed donor-pool rule of Ben-Michael, Feller & Rothstein (2022), who exclude any unit treated within the evaluation window.

Across estimators

The same call works for every method:

methods <- c("scm", "sdid", "gsc", "mc", "tasc", "si")
sapply(methods, function(m)
  scm_fit(y ~ d | id + time, data = dat, method = m)$estimate)
#>      scm     sdid      gsc       mc     tasc       si 
#> 1.835085 1.872811 1.760712 2.292777 3.153267 1.934303

SCM: partially pooled synthetic controls

The classic SCM was designed for a single treated unit, so its staggered extension rests on Ben-Michael, Feller & Rothstein (2022, JRSS-B). Units adopting at the same time are averaged into one cohort — fully pooling within a cohort is justified by their theory — and each cohort gets its own synthetic control. How the cohort fits interact is controlled by nu:

nu = NULL (default) — each cohort is fitted independently with the classic V-optimised SCM (the behaviour of earlier versions).
nu in [0, 1] — partially pooled SCM: all cohort weight vectors are chosen jointly, trading off per-cohort pre-treatment fit (nu = 0) against the fit of the average placebo gap across cohorts (nu = 1), which anchors the aggregate ATT.
nu = "auto" — the paper’s heuristic based on how well separate synthetic controls already balance the average.
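
For orientation, the joint problem that nu interpolates can be written in the form used by Ben-Michael, Feller & Rothstein (2022). This is a sketch of the paper's objective; how coresynth scales the two imbalance terms, and whether it applies the ridge penalty, are assumptions not confirmed by this page.

$\min_{\Gamma}\; \nu\,\tilde q^{\,pool}(\Gamma)^2 + (1-\nu)\,\tilde q^{\,sep}(\Gamma)^2 + \lambda\,\lVert\Gamma\rVert_F^2.$

Here $\Gamma$ collects the donor weight vectors of all cohorts, $\tilde q^{\,sep}$ is the (normalised) average per-cohort pre-treatment imbalance, $\tilde q^{\,pool}$ is the (normalised) imbalance of the average across cohorts, and $\lambda \ge 0$ is an optional ridge penalty. Setting $\nu = 0$ leaves only the per-cohort term, and $\nu = 1$ leaves only the pooled term.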

fixedeff = TRUE additionally demeans every unit by its own pre-treatment mean within each cohort (an intercept shift), which the paper recommends when outcome levels differ across units.
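
The intercept shift can be illustrated on a toy outcome matrix. The numbers, unit labels and T_pre below are made up for the illustration, and the code is a base R sketch of the idea, not coresynth's implementation.

```r
# Toy outcome matrix: 3 units (rows) by 6 periods (columns), levels differ by unit
Y <- rbind(u1 = c(10, 11, 12, 13, 16, 17),
           u3 = c( 1,  2,  3,  4,  5,  6),
           u4 = c(50, 51, 52, 53, 54, 55))
T_pre <- 4  # pre-treatment periods for this cohort (assumed for the toy example)

# Subtract each unit's own pre-treatment mean from all of its periods
pre_mean   <- rowMeans(Y[, seq_len(T_pre), drop = FALSE])
Y_demeaned <- sweep(Y, 1, pre_mean, "-")

# Pre-period means are now zero for every unit, so levels are aligned
rowMeans(Y_demeaned[, seq_len(T_pre)])
#> u1 u3 u4 
#>  0  0  0
```

After the shift the three units share the same pre-treatment path despite very different levels, so donor weights only need to match trends rather than levels.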

fit_pp <- scm_fit(y ~ d | id + time, data = dat, method = "scm",
                  nu = "auto", fixedeff = TRUE)
fit_pp$estimate
#> [1] 1.97436
fit_pp$pooling[c("nu", "q_sep", "q_pool", "q_sep_separate", "q_pool_separate")]
#> $nu
#> [1] 0.8693383
#> 
#> $q_sep
#> [1] 0.3485897
#> 
#> $q_pool
#> [1] 0.2143312
#> 
#> $q_sep_separate
#> [1] 0.3147809
#> 
#> $q_pool_separate
#> [1] 0.2356445

Relative to the separate solution, q_pool (the pooled imbalance) falls from 0.236 to 0.214, at the cost of a small increase in q_sep (the average per-cohort imbalance) from 0.315 to 0.349.

Inference under staggered adoption

sdid_inference(), gsc_inference(), and si_inference() all extend to staggered fits. jackknife_global is the staggered-specific variant: it removes each unique control unit from all cohorts at once, so the correlation between cohort estimates that share donors is reflected in the standard error.

library(broom)
tidy(sdid_inference(fit, method = "bootstrap", n_boot = 100, seed = 1))
#>   term estimate  std.error statistic     p.value conf.low conf.high    method
#> 1  ATT 1.872811 0.09200243  20.35611 4.09877e-92 1.711835  2.058967 bootstrap
#>   alternative n_controls staggered
#> 1   two.sided       10.5      TRUE
tidy(sdid_inference(fit, method = "jackknife_global"))
#>   term estimate std.error statistic      p.value conf.low conf.high
#> 1  ATT 1.872811 0.1185776  15.79397 3.423562e-56 1.640404  2.105219
#>             method alternative n_controls staggered
#> 1 jackknife_global   two.sided         11      TRUE
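
The idea behind a delete-one jackknife over unique control units can be sketched in base R. The function jackknife_se() and its estimate_fn callback are hypothetical names introduced here, and the standard jackknife variance formula is assumed; the page does not state the exact formula that jackknife_global uses.

```r
# Standard delete-one jackknife standard error.
# `units` are the unique control ids; `estimate_fn(drop)` is a hypothetical
# callback that re-estimates the aggregate ATT with unit `drop` removed
# from every cohort's donor pool.
jackknife_se <- function(units, estimate_fn) {
  theta <- vapply(units, estimate_fn, numeric(1))
  n <- length(theta)
  sqrt((n - 1) / n * sum((theta - mean(theta))^2))
}

# Toy check with a stand-in estimator (the mean of the remaining values):
# for a sample mean the jackknife SE equals sd(x) / sqrt(n).
x  <- c(u3 = 1.7, u4 = 2.1, u5 = 1.9, u6 = 2.4, u7 = 1.6)
se <- jackknife_se(names(x), function(drop) mean(x[names(x) != drop]))
all.equal(se, sd(x) / sqrt(length(x)))
#> [1] TRUE
```

In a real application estimate_fn would refit the staggered model once per dropped unit, which is why each control is removed from all cohorts in the same replicate.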

Staggered SCM fits use scm_inference(), the wild bootstrap of Ben-Michael, Feller & Rothstein (2022): donor weights stay fixed and per-treated-unit effect contributions are perturbed with mean-zero multipliers. The interval is unreliable with only a few treated units, and a warning is issued when there are fewer than 5. The example below has just two, so its interval is shown for illustration only.

tidy(scm_inference(fit_pp, n_boot = 500, seed = 1))
#>   term estimate  std.error statistic     p.value conf.low conf.high
#> 1  ATT  1.97436 0.02866306  68.88168 0.001996008 1.929855  2.018865
#>           method alternative n_controls staggered
#> 1 wild_bootstrap   two.sided         11      TRUE
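
The mechanics of a wild bootstrap with fixed donor weights can be sketched as follows. The effect contributions are made-up numbers, and the Rademacher (plus or minus one) multipliers are an assumption chosen for simplicity; the page only says the multipliers are mean-zero, so scm_inference() may use a different distribution.

```r
set.seed(1)
# Hypothetical per-treated-unit effect contributions (not taken from any fit)
tau_j <- c(1.6, 2.3, 1.9, 2.2, 2.0, 1.8)
J     <- length(tau_j)
dev   <- tau_j - mean(tau_j)

# Each draw flips the sign of every centred contribution at random
# (Rademacher multipliers: +1 or -1 with equal probability, mean zero).
# Donor weights are never re-estimated, so only `dev` is perturbed.
B    <- 2000
boot <- replicate(B, mean(sample(c(-1, 1), J, replace = TRUE) * dev))

sd(boot)                # bootstrap standard error of the average effect
sqrt(sum(dev^2)) / J    # its analytic counterpart under these multipliers
```

With J this small the bootstrap distribution has at most 2^J distinct values, which is one way to see why intervals based on a handful of treated units are fragile.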

Notes

plot() and augment() for staggered fits operate per cohort; the aggregate synthetic series is not defined (Y_synth = NULL).
Staggered SCM fits support covariates = (covariates are partialled out) but not the predictors = pred(...) interface.
SI additionally supports staggered adoption and multi-arm treatments simultaneously; see Estimators.