For many research questions, treatment effects are likely to follow a temporal pattern rather than being uniform.
If there is a binary treatment indicator (e.g. marriage, childbirth), we can use impact functions (or count functions) around the onset of the treatment in two-way FE models to investigate its temporal pattern (Ludwig and Brüderl 2021).
# Order
mwp <- mwp[order(mwp$id, mwp$year), ]

# Count since treatment by id
mwp$Treat_count <- ave(mwp$marry, mwp$id, FUN = function(x) cumsum(x))

# First treatment instance & distribute
mwp$Treat_first <- ifelse(mwp$Treat_count == 1, mwp$year, 0)
mwp$Treat_first <- ave(mwp$Treat_first, mwp$id, FUN = max)
Dummy impact function
# Create event time indicator
mwp$time_to_treatment <- mwp$year - mwp$Treat_first

# Define reference periods (use minus 2 to allow for anticipation in -1)
control <- c(-2, min(mwp$time_to_treatment))
mwp$time_to_treatment <- ifelse(
  mwp$time_to_treatment %in% control | mwp$Treat_first == 0,
  -9999, mwp$time_to_treatment
)
mwp$time_to_treatment <- relevel(as.factor(mwp$time_to_treatment), "-9999")
library(ggplot2)

# Adjusting the results matrix setup to include all marcount levels
coef.df <- data.frame(
  time = factor(c(-4:7), levels = c(-4:7)), # Include all levels as factors
  att = NA,
  se = NA
)

# Extracting coefficients and SEs for marcount levels
# (fe_dummy is the fitted event-study model; its estimation is not shown here)
output <- summary(fe_dummy)$coefficients
for (i in levels(coef.df$time)) {
  coef_name <- paste0("time_to_treatment", i)
  if (coef_name %in% rownames(output)) {
    coef.df[coef.df$time == i, c("att", "se")] <- output[coef_name, 1:2]
  }
}

# Fill reference category
coef.df$att[coef.df$time == control[1]] <- 0
coef.df$se[coef.df$time == control[1]] <- 0
coef.df$model <- "TWFE Event-Study Design"
coef.df$time2 <- as.numeric(as.character(coef.df$time))

# Calculate 95% CI
interval2 <- -qnorm((1 - 0.95) / 2)
coef.df$ll <- coef.df$att - coef.df$se * interval2
coef.df$ul <- coef.df$att + coef.df$se * interval2

# Pre vs post
coef.df$post <- ifelse(coef.df$time2 >= 0, 1, 0)
coef.df$post <- factor(coef.df$post, labels = c("Before treatment", "After treatment"))

# Plot
zp <- ggplot(coef.df, aes(x = time, y = att)) +
  geom_hline(yintercept = 0) +
  geom_vline(xintercept = 4.5, linetype = "dashed") +
  geom_pointrange(
    data = coef.df,
    aes(x = time, y = att, ymin = ll, ymax = ul, color = post, shape = post)
  ) +
  scale_color_viridis_d(option = "B", end = 0.80, begin = 0.2, direction = -1) +
  theme_minimal() +
  theme(
    panel.grid.minor = element_blank(),
    text = element_text(family = "Times New Roman", size = 16),
    axis.text = element_text(colour = "black"),
    legend.position = "bottom",
    legend.title = element_blank()
  ) +
  scale_x_discrete() +
  labs(x = "Event time (Year of marriage = 0)", y = paste0("Effect on ln Wage"))
zp
# save mwp
mwp_new <- mwp
Dynamic Diff-in-Diff - the problem
Diff-in-Diff
Figure: Diff-in-Diff design, adapted from Cunningham (2021).
The difference-in-differences (Diff-in-Diff) design is a simple yet powerful approach to evaluating the impact of a treatment in a panel data setting. In its basic form, the \(2 \times 2\) Diff-in-Diff estimator involves two groups—treatment (\(T\)) and control (\(C\))—observed at two time points, before and after the treatment.
In this setup, the treatment is uniform across observations and occurs at the same time. Diff-in-Diff is thus equivalent to two-way FE.
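The equivalence can be checked on a small example. The following is a minimal sketch using a hypothetical toy panel (the data frame `toy` and all its values are made up for illustration): it computes the \(2 \times 2\) Diff-in-Diff from the four group means and compares it with the coefficient from a two-way FE regression with unit and period dummies.

```r
# Hypothetical toy panel: 4 units, 2 periods; units 3 and 4 are treated in period 2
toy <- data.frame(
  id    = rep(1:4, each = 2),
  time  = rep(1:2, times = 4),
  treat = rep(c(0, 0, 1, 1), each = 2),
  y     = c(1.0, 1.5,  2.0, 2.3,  3.0, 4.1,  2.5, 3.8)
)
toy$post <- as.numeric(toy$time == 2)
toy$d    <- toy$treat * toy$post   # treatment indicator

# 2x2 Diff-in-Diff from the four group means (rows: treat, columns: post)
m <- tapply(toy$y, list(toy$treat, toy$post), mean)
did_means <- (m["1", "1"] - m["1", "0"]) - (m["0", "1"] - m["0", "0"])

# Same quantity from a two-way FE regression (unit and period dummies)
twfe <- lm(y ~ d + factor(id) + factor(time), data = toy)
did_twfe <- unname(coef(twfe)["d"])

print(c(means = did_means, twfe = did_twfe))
```

In this balanced two-period setting both numbers coincide (0.8 in the toy data).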
The situation becomes more intricate with multiple time periods, varying treatment timing, and dynamic treatment effects, i.e. effects that follow a temporal pattern.
With multiple time periods, Goodman-Bacon (2021) has demonstrated that the two-way fixed effects (TWFE) estimator can be seen as a weighted average of all possible \(2 \times 2\) Diff-in-Diff estimators.
The weights are influenced by:
the group size
treatment variance within each subgroup (how long we observe each combination before and after treatment)
FE and Diff-in-Diff
In many settings this is not a problem at all, particularly when treatment effects are homogeneous, i.e. when all individuals experience the same static treatment effect. TWFE will then give the correct results.
However, two-way FE may produce biased results when treatment effects are dynamic over time and treatment timing varies (Roth et al. 2023).
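To illustrate the problem, here is a minimal sketch with a hypothetical, noise-free panel (the data frame `sim` and all parameter values are assumptions chosen for illustration). There are no never-treated units, treatment timing is staggered, and the effect grows with each period of exposure. The static TWFE coefficient is compared with the true average effect across treated observations.

```r
# Hypothetical balanced panel: 4 units, 6 periods, no never-treated units.
# Units 1-2 are first treated in period 3, units 3-4 in period 5.
sim <- expand.grid(time = 1:6, id = 1:4)
sim$first <- ifelse(sim$id <= 2, 3, 5)
sim$d     <- as.numeric(sim$time >= sim$first)

# Dynamic effect: 1 in the first treated period, growing by 1 each period
sim$effect <- ifelse(sim$d == 1, sim$time - sim$first + 1, 0)

# Outcome without noise: unit effect + common period effect + treatment effect
sim$y <- 2 * sim$id + 0.5 * sim$time + sim$effect

# True average effect across all treated observations
true_att <- mean(sim$effect[sim$d == 1])

# Static two-way FE estimate of the treatment dummy
twfe <- lm(y ~ d + factor(id) + factor(time), data = sim)
twfe_att <- unname(coef(twfe)["d"])

print(c(true_att = true_att, twfe = twfe_att))
```

Every single treatment effect in this example is at least 1 and the average is about 2.17, yet the TWFE coefficient is 0.5. The reason is that already-treated units, whose effects are still unfolding, serve as controls for the later-treated units.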
Several novel estimators have been proposed to address this problem. For a review, see, for instance, Roth et al. (2023) or Rüttenauer and Aksoy (2024).
Potential solutions
The idea of these estimators can be described as (parametrically or non-parametrically) estimating several \(2 \times 2\) Diff-in-Diffs.
In a multi-group and heterogeneous treatment-timing setting, we compute group-time average treatment effects by grouping all treatment units that receive treatment at the same period into a common group \(g\).
Dynamic Diff-in-Diff
For each treatment group \(g\) and time period \(t\), we estimate group-specific and time-specific ATTs:
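As a sketch in the notation \(\delta(g,t)\) of the aggregation formula, and assuming that the period just before treatment, \(g-1\), serves as the base period and that \(C\) denotes the chosen control group (never-treated or not-yet-treated units), the group-time ATT for a post-treatment period \(t \geq g\) can be written as a \(2 \times 2\) Diff-in-Diff:

\[
\delta(g,t) = \mathrm{E}[Y_{t} - Y_{g-1} \mid G = g] - \mathrm{E}[Y_{t} - Y_{g-1} \mid C].
\]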
This obviously yields a large number of different treatment effects. But we can combine them, e.g. by the length of exposure to the treatment:
\[
\theta_D(e) := \sum_{g=1}^G \mathbf{1} \{ g + e \leq T \} \delta(g,g+e) P(G=g | G+e \leq T),
\]
where \(e\) specifies for how long a unit has been exposed to the treatment. It is basically the average effect across all treatment-timing groups \(e\) periods after the onset of treatment.
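The aggregation can be sketched in a few lines of base R. The group-time ATTs in `attgt` and the group sizes in `n_g` are hypothetical numbers made up for illustration; the weights \(P(G=g \mid G+e \leq T)\) are approximated by the relative group sizes among the groups observed at exposure \(e\).

```r
# Hypothetical group-time ATTs delta(g, t) for two treatment-timing groups
# (g = first treated period), observed up to the final period T = 5
attgt <- data.frame(
  g     = c(3, 3, 3, 4, 4),
  t     = c(3, 4, 5, 4, 5),
  delta = c(0.10, 0.20, 0.30, 0.05, 0.15)
)
n_g <- c("3" = 60, "4" = 40)   # hypothetical group sizes

attgt$e <- attgt$t - attgt$g                 # length of exposure
attgt$n <- n_g[as.character(attgt$g)]        # size of the respective group

# theta_D(e): average delta(g, g + e) over the groups observed at exposure e,
# weighted by their relative group size
theta_D <- sapply(split(attgt, attgt$e), function(d) {
  weighted.mean(d$delta, w = d$n)
})
print(theta_D)
```

At exposure \(e = 2\) only the group first treated in period 3 is still observed, so its effect enters with weight one.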
Assumptions
1) Staggered treatment adoption: once treated, a unit remains treated
2) Parallel trends assumption
   based on never-treated (very strong)
   based on not-yet-treated (a bit more likely)
3) No treatment anticipation
   based on never-treated (a bit more likely)
   based on not-yet-treated (very strong)
Trade-off: If assumption 2) is likely to hold, we can use only the never-treated as controls to relax assumption 3). If assumption 3) is likely to hold, we can include the not-yet-treated as control to relax assumption 2).
Dynamic Diff-in-Diff
Note
The estimator of Callaway and Sant’Anna (2021) uses a single period before treatment (by default the year before treatment) as the pre-treatment period for each of the many \(2\times 2\) Diff-in-Diff estimators. The estimator is thus sensitive to anticipation.
In contrast, Borusyak, Jaravel, and Spiess (2021) use all pre-treatment periods as control periods. Their estimator is thus less sensitive to anticipation, but more sensitive to violations of parallel trends.
Example - Marriage and wage
As an example, we use the mwp panel data, containing information on wages and family status of 268 men.
As an illustration, we investigate the ‘marriage wage premium’: we analyse whether marriage leads to an increase in the hourly wage for men.
# treatment timing = year if married
mwp$treat_timing <- ifelse(mwp$marry == 1, mwp$year, NA)

# set never treated to zero
mwp$treat_timing[mwp$evermarry == 0] <- 0

# if married is not NA, use min year per id (removing NAs)
mwp$treat_timing[!is.na(mwp$marry)] <- ave(
  mwp$treat_timing[!is.na(mwp$marry)],
  mwp$id[!is.na(mwp$marry)],
  FUN = function(x) min(x, na.rm = TRUE)
)
head(mwp[, c("id", "year", "marry", "evermarry", "treat_timing")], n =35)
To make this more interpretable, we re-aggregate the individual results to a dynamic time-averaged effect (we now restrict this to event times from -3 to 6).
wages.dyn <- aggte(wages.attgt, type = "dynamic", na.rm = TRUE,
                   min_e = -3, max_e = 6)
summary(wages.dyn)
Call:
aggte(MP = wages.attgt, type = "dynamic", min_e = -3, max_e = 6,
na.rm = TRUE)
Reference: Callaway, Brantly and Pedro H.C. Sant'Anna. "Difference-in-Differences with Multiple Time Periods." Journal of Econometrics, Vol. 225, No. 2, pp. 200-230, 2021. <https://doi.org/10.1016/j.jeconom.2020.12.001>, <https://arxiv.org/abs/1803.09015>
Overall summary of ATT's based on event-study/dynamic aggregation:
    ATT    Std. Error     [ 95%  Conf. Int.]
 0.0438        0.0419    -0.0384       0.126

Dynamic Effects:
 Event time Estimate Std. Error [95% Simult.  Conf. Band]
         -3   0.0333     0.0494       -0.0958      0.1625
         -2   0.0152     0.0476       -0.1093      0.1396
         -1   0.0314     0.0355       -0.0614      0.1242
          0   0.0088     0.0389       -0.0929      0.1105
          1   0.0484     0.0403       -0.0569      0.1538
          2   0.0711     0.0420       -0.0386      0.1809
          3   0.0704     0.0526       -0.0670      0.2079
          4   0.0270     0.0549       -0.1165      0.1704
          5   0.0322     0.0595       -0.1232      0.1876
          6   0.0488     0.0635       -0.1172      0.2148
---
Signif. codes: `*' confidence band does not cover 0
Control Group: Not Yet Treated, Anticipation Periods: 0
Estimation Method: Inverse Probability Weighting
The did package also comes with a handy command ggdid() to plot the results.
These individual effects are similar to running many separate regressions, each of which computes one \(2 \times 2\) DD estimator, e.g. for group 1981:
t <- 1981

# run individual effects
for (i in sort(unique(mwp$year))[-1]) {

  # not yet treated
  mwp$notyettreated <- ifelse(mwp$treat_timing > t & mwp$treat_timing > i, 1, 0)

  # select 1981 group, never-treated and not yet treated
  oo <- which(mwp$treat_timing == t | mwp$treat_timing == 0 | mwp$notyettreated == 1)
  df <- mwp[oo, ]

  # after set to 1 for rolling year i
  df$after <- NA
  df$after[df$year == i] <- 1

  # control year
  if (i < t) {
    # if i is still before actual treatment, compare to previous year
    tc <- i - 1
  } else {
    # if i is beyond actual treatment, compare to year before actual treatment (t-1)
    tc <- t - 1
  }
  df$after[df$year == tc] <- 0

  # Restrict to the two years we want to compare
  df <- df[!is.na(df$after), ]

  # Define treated group
  df$treat <- ifelse(df$treat_timing == t, 1, 0)

  # Estimate 2x2 DD
  tmp.lm <- lm(lnw ~ treat * after, data = df)

  # Print
  print(paste0(i, ": ", round(tmp.lm$coefficients[4], 4)))
}
The Sun and Abraham (2021) estimator calculates the cohort-specific average treatment effect on the treated \(CATT_{e,\ell}\) for \(\ell\) periods from the initial treatment and for the cohort of units first treated at time \(e\). These cohort-specific and time-specific estimates are then averaged using their sample weights.
The algorithm
1) Estimate \(CATT_{e,\ell}\) with a two-way fixed effects estimator that interacts the cohort and relative period indicators.
   The control group cohort \(C\) can either be the never-treated or, if there are no never-treated units, the latest-treated cohort, as proposed by Sun and Abraham (2021). By default, the reference period is the relative period before the treatment, \(\ell=-1\).
2) Calculate the sample weights of the cohort within each relative time period, \(Pr\{E_{i}=e\mid E_{i}\in[-\ell,T-\ell]\}\).
3) Use the estimated coefficients from step 1), \(\widehat{\delta}_{e,\ell}\), and the estimated weights from step 2), \(\widehat{Pr}\{E_{i}=e\mid E_{i}\in[-\ell,T-\ell]\}\), to calculate the interaction-weighted estimator \(\widehat{\nu}_{g}\).
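As a sketch following the notation of Sun and Abraham (2021), the regression of step 1) and the interaction-weighted estimator can be written out as follows. The symbols \(\alpha_i\) and \(\lambda_t\) (unit and period fixed effects) and \(D_{it}^{\ell}\) (an indicator for unit \(i\) being \(\ell\) periods from its initial treatment in period \(t\)) are not defined in the text above and are introduced here as assumptions:

\[
Y_{it} = \alpha_i + \lambda_t + \sum_{e \notin C} \sum_{\ell \neq -1} \delta_{e,\ell} \left( \mathbf{1}\{E_i = e\} \cdot D_{it}^{\ell} \right) + \epsilon_{it},
\]

\[
\widehat{\nu}_{g} = \frac{1}{|g|} \sum_{\ell \in g} \sum_{e} \widehat{\delta}_{e,\ell} \, \widehat{Pr}\{E_{i}=e \mid E_{i}\in[-\ell,T-\ell]\},
\]

where \(g\) is a set of relative periods (e.g. a single relative period, or all post-treatment periods).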
The imputation estimator of Borusyak, Jaravel, and Spiess (2021) starts from a model of the untreated potential outcomes, \(Y_{it}(0) = A_{it}^{'}\lambda_i + X_{it}^{'}\delta + \varepsilon_{it}\),
where \(A_{it}^{'}\lambda_i\) contains unit FEs, but also allows interacting them with observed covariates unaffected by the treatment status,
\(X_{it}^{'}\delta\) nests period FEs but additionally allows for time-varying covariates,
\(\lambda_i\) is a vector of unit-specific nuisance parameters,
and \(\delta\) is a vector of nuisance parameters associated with common covariates.
The algorithm
For every treated observation, estimate expected untreated potential outcomes \(A_{it}^{'}\lambda_i + X_{it}^{'}\delta\) by some unbiased linear estimator \(\hat Y_{it}(0)\) using data from the untreated observations only,
For each treated observation (\(\in\Omega_1\)), set \(\hat\tau_{it} = Y_{it} - \hat{Y}_{it}(0)\),
Estimate the target by a weighted sum \(\hat\tau = \sum_{it\in\Omega_1}w_{it}\hat\tau_{it}\).
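The three steps can be sketched in base R. This is a minimal illustration on hypothetical, noise-free data with unit and period FEs only and equal weights \(w_{it}\); the data frame `sim` and all parameter values are assumptions, and this is not the implementation of any particular package.

```r
# Hypothetical balanced panel: 6 units, 5 periods; units 1-2 are never treated,
# units 3-4 are first treated in period 3, units 5-6 in period 4
sim <- expand.grid(time = 1:5, id = 1:6)
sim$first <- c(Inf, Inf, 3, 3, 4, 4)[sim$id]
sim$d     <- as.numeric(sim$time >= sim$first)

# Dynamic effect: 1 at onset, growing by 0.5 per period of exposure
sim$tau <- ifelse(sim$d == 1, 1 + 0.5 * (sim$time - sim$first), 0)

# Outcome without noise: unit effect + non-linear period effect + effect
sim$y <- 2 * sim$id + 0.3 * sim$time^2 + sim$tau

sim$id_f   <- factor(sim$id)
sim$time_f <- factor(sim$time)

# Step 1: fit unit and period FEs on untreated observations only,
# then predict the untreated potential outcome for every observation
fit0 <- lm(y ~ id_f + time_f, data = sim[sim$d == 0, ])
sim$y0_hat <- predict(fit0, newdata = sim)

# Step 2: observation-specific effects for treated observations
treated <- sim$d == 1
sim$tau_hat <- ifelse(treated, sim$y - sim$y0_hat, NA)

# Step 3: weighted sum, here with equal weights (a simple average)
att_hat <- mean(sim$tau_hat[treated])
print(att_hat)
```

Because the toy data contain no noise, the imputed effects coincide with the simulated ones and their average is 1.4. Other weights \(w_{it}\), e.g. by event time, would yield a dynamic effect profile instead of a single number.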
Example 2: Marriage and satisfaction with sex life
We use 13 waves of panel data from the Panel Analysis of Intimate Relationships and Family Dynamics (pairfam) survey, release 14.1 (Brüderl et al. 2023), to examine how the transition into first marriage is associated with changes in respondents’ sexual satisfaction.
See Rüttenauer and Kapelle (2024) for more information.
Remember that we have to make the parallel trends assumption in two-way FE models. A violation of the parallel trends assumption leads to biased estimates.
Usually, when controlling for time fixed effects, we make the assumption that every observation experiences the same “effect of time”.
However, we can relax this assumption by giving each individual their own intercept and their own slope.
Fixed Effects Individual Slopes
The FEIS estimator
\[
y_{it} = \beta x_{it} + \alpha_{1i} + \alpha_{2i} w_{it} + \zeta_t + \epsilon_{it},
\]
includes the person-fixed effects \(\alpha_{1i}\) and person-specific slopes \(\alpha_{2i}\), i.e. an interaction between the person indicators and another time-varying variable \(w_{it}\), which often is a function of time.
Whether time-fixed effects \(\zeta_t\) can be included depends on the specification of \(w_{it}\).
As with the conventional FE, FEIS can be estimated using lm() by including \(N-1\) individual-specific dummies and interaction terms of each slope variable with the \(N-1\) individual-specific dummies (\((N-1) *J\) controls).
This is however highly inefficient.
We can achieve the same result by running lm() on pre-transformed data. To do so, specify the ‘residual maker’ matrix \(\boldsymbol{\mathbf{M}}_i = \boldsymbol{\mathbf{I}}_T - \boldsymbol{\mathbf{W}}_i(\boldsymbol{\mathbf{W}}^\intercal_i \boldsymbol{\mathbf{W}}_i)^{-1}\boldsymbol{\mathbf{W}}^\intercal_i\), and estimate
\[
\tilde{\boldsymbol{\mathbf{y}}}_{i} = \tilde{\boldsymbol{\mathbf{X}}}_{i}\boldsymbol{\mathbf{\beta}} + \tilde{\boldsymbol{\mathbf{\epsilon}}}_{i},
\]
where \(\tilde{\boldsymbol{\mathbf{y}}}_{i} = \boldsymbol{\mathbf{M}}_i\boldsymbol{\mathbf{y}}_{i}\), \(\tilde{\boldsymbol{\mathbf{X}}}_{i} = \boldsymbol{\mathbf{M}}_i\boldsymbol{\mathbf{X}}_{i}\), and \(\tilde{\boldsymbol{\mathbf{\epsilon}}}_{i} = \boldsymbol{\mathbf{M}}_i\boldsymbol{\mathbf{\epsilon}}_{i}\) are the residuals of regressing \(\boldsymbol{\mathbf{y}}_{i}\), each column-vector of \(\boldsymbol{\mathbf{X}}_{i}\), and \(\boldsymbol{\mathbf{\epsilon}}_{i}\) on \(\boldsymbol{\mathbf{W}}_i\).
FEIS intuitively
estimate the individual-specific predicted values for the dependent variable and each covariate based on an individual intercept and the additional slope variables of \(\boldsymbol{\mathbf{W}}_i\),
‘detrend’ the original data by these individual-specific predicted values, and
run an OLS model on the residual (‘detrended’) data.
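These steps can be sketched in base R on a hypothetical, noise-free panel with individual intercepts and individual linear time trends (the data frame `sim`, the helper `detrend()`, and all parameter values are assumptions for illustration). The sketch compares the detrending approach with the dummy-variable version using lm() and with a conventional FE model.

```r
# Hypothetical panel: 3 units, 6 periods, true effect of x is 0.5.
# Units with steeper time trends switch into treatment later.
sim <- expand.grid(time = 1:6, id = 1:3)
sim$x <- as.numeric(sim$time >= c(3, 4, 5)[sim$id])
sim$y <- 0.5 * sim$x + 2 * sim$id + c(0.1, 0.4, 0.8)[sim$id] * sim$time

# Steps 1 and 2: per unit, regress a variable on intercept + time
# and keep the residuals (the 'detrended' variable)
detrend <- function(v, id, time) {
  out <- numeric(length(v))
  for (i in unique(id)) {
    idx <- id == i
    out[idx] <- residuals(lm(v[idx] ~ time[idx]))
  }
  out
}
sim$y_dt <- detrend(sim$y, sim$id, sim$time)
sim$x_dt <- detrend(sim$x, sim$id, sim$time)

# Step 3: OLS on the detrended data
feis <- lm(y_dt ~ 0 + x_dt, data = sim)

# Dummy-variable version: individual intercepts and individual slopes
lsdv <- lm(y ~ x + factor(id) + factor(id):time, data = sim)

# Conventional FE for comparison: individual intercepts only
fe <- lm(y ~ x + factor(id), data = sim)

print(c(feis = unname(coef(feis)["x_dt"]),
        lsdv = unname(coef(lsdv)["x"]),
        fe   = unname(coef(fe)["x"])))
```

Both FEIS versions recover the simulated effect of 0.5, while the conventional FE estimate is pushed upwards because the individual time trends are attributed to the treatment.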
FEIS Mundlak
Similarly, we can estimate a correlated random effects (CRE) model (Chamberlain 1982; Mundlak 1978; Wooldridge 2010) that includes the individual-specific predictions \(\hat{\boldsymbol{\mathbf{X}}}_{i}\) as additional covariates to obtain the FEIS estimator.
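A minimal sketch of this Mundlak-type approach, again on hypothetical noise-free data (the data frame `sim`, the column `x_hat`, and all parameter values are assumptions for illustration): we predict \(x\) for each individual from an intercept and a linear time trend and add these predictions to a pooled regression.

```r
# Hypothetical panel: 3 units, 6 periods, true effect of x is 0.5,
# with individual intercepts and individual linear time trends
sim <- expand.grid(time = 1:6, id = 1:3)
sim$x <- as.numeric(sim$time >= c(3, 4, 5)[sim$id])
sim$y <- 0.5 * sim$x + 2 * sim$id + c(0.1, 0.4, 0.8)[sim$id] * sim$time

# Individual-specific predictions of x from an intercept and a time trend
sim$x_hat <- NA_real_
for (i in unique(sim$id)) {
  idx <- sim$id == i
  sim$x_hat[idx] <- fitted(lm(x ~ time, data = sim[idx, ]))
}

# Pooled regression including the individual-specific predictions
cre <- lm(y ~ x + x_hat, data = sim)
beta_cre <- unname(coef(cre)["x"])
print(beta_cre)
```

The coefficient on `x` equals the FEIS estimate (0.5 here), because the part of `x` that is left after controlling for `x_hat` is exactly the detrended `x`.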
The slopes do not necessarily have to be time- or person-specific. You could also control for family-specific pre-treatment conditions or regional time trends (Rüttenauer and Ludwig 2023).
Example
As an example, we use the mwp panel data, containing information on wages and family status of 268 men.
As an illustration, we investigate the ‘marriage wage premium’: we analyse whether marriage leads to an increase in the hourly wage for men.
RE: Married observations have a significantly higher wage than unmarried observations.
FE: If people marry, they experience an increase in wages afterwards. The effect is significant and slightly lower than the RE.
FEIS: Accounting for the individual wage trend before marriage, we do not observe an increase in wages if people marry. The effect is small and non-significant.
Overall, this indicates that there is a problem with non-parallel trends: Those with steeper wage trajectories are more likely to marry (or marry earlier).
As mentioned above, we can achieve the same by 1) manually calculating the individual specific trends and 2) including them as additional covariates in the model.
The biggest limitations
It is crucial to model the trends correctly
You need \(k+1\) time-periods per unit to estimate the trend based on \(k\) variables (related to selection?)
With dynamic treatment effects (which are not specified in your model), the individual trends may absorb the unfolding treatment effect
References
Athey, Susan, Mohsen Bayati, Nikolay Doudchenko, Guido Imbens, and Khashayar Khosravi. 2021. “Matrix Completion Methods for Causal Panel Data Models.” Journal of the American Statistical Association 59: 1–15. https://doi.org/10.1080/01621459.2021.1891924.
Borusyak, Kirill, Xavier Jaravel, and Jann Spiess. 2021. “Revisiting Event Study Designs: Robust and Efficient Estimation.” https://doi.org/10.48550/ARXIV.2108.12419.
Brüderl, Josef, Sonja Drobnič, Karsten Hank, Franz J. Neyer, Sabine Walper, Christof Wolf, Philipp Alt, et al. 2023. “The German Family Panel (pairfam) / Beziehungs- und Familienpanel (pairfam).” https://doi.org/10.4232/PAIRFAM.5678.14.1.0.
Callaway, Brantly, and Pedro H. C. Sant’Anna. 2021. “Difference-in-Differences with Multiple Time Periods.” Journal of Econometrics 225 (2): 200–230. https://doi.org/10.1016/j.jeconom.2020.12.001.
Clark, Andrew E., and Yannis Georgellis. 2013. “Back to Baseline in Britain: Adaptation in the British Household Panel Survey.” Economica 80 (319): 496–512. https://doi.org/10.1111/ecca.12007.
Cunningham, Scott. 2021. Causal Inference: The Mixtape. New Haven and London: Yale University Press.
De Chaisemartin, Clément, and Xavier D’Haultfœuille. 2020. “Two-Way Fixed Effects Estimators with Heterogeneous Treatment Effects.” American Economic Review 110 (9): 2964–96. https://doi.org/10.1257/aer.20181169.
Ludwig, Volker, and Josef Brüderl. 2021. “What You Need to Know When Estimating Impact Functions with Panel Data for Demographic Research.” Comparative Population Studies 46. https://doi.org/10.12765/CPoS-2021-16.
Mundlak, Yair. 1978. “On the Pooling of Time Series and Cross Section Data.” Econometrica 46 (1): 69. https://doi.org/10.2307/1913646.
Roth, Jonathan, Pedro H. C. Sant’Anna, Alyssa Bilinski, and John Poe. 2023. “What’s Trending in Difference-in-Differences? A Synthesis of the Recent Econometrics Literature.” Journal of Econometrics 235 (2): 2218–44. https://doi.org/10.1016/j.jeconom.2023.03.008.
Rüttenauer, Tobias, and Ozan Aksoy. 2024. “When Can We Use Two-Way Fixed-Effects (TWFE): A Comparison of TWFE and Novel Dynamic Difference-in-Differences Estimators.” Preprint. SocArXiv. https://doi.org/10.31235/osf.io/cpvzf.
Rüttenauer, Tobias, and Henning Best. 2021. “Environmental Inequality and Residential Sorting in Germany: A Spatial Time-Series Analysis of the Demographic Consequences of Industrial Sites.” Demography 58 (6): 2243–63. https://doi.org/10.1215/00703370-9563077.
Rüttenauer, Tobias, and Volker Ludwig. 2023. “Fixed Effects Individual Slopes: Accounting and Testing for Heterogeneous Effects in Panel Data or Other Multilevel Models.” Sociological Methods & Research 52 (1): 43–84. https://doi.org/10.1177/0049124120926211.
Sun, Liyang, and Sarah Abraham. 2021. “Estimating Dynamic Treatment Effects in Event Studies with Heterogeneous Treatment Effects.” Journal of Econometrics 225 (2): 175–99. https://doi.org/10.1016/j.jeconom.2020.09.006.
Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data. Cambridge, Mass.: MIT Press.
Wooldridge, Jeffrey M. 2021. “Two-Way Fixed Effects, the Two-Way Mundlak Regression, and Difference-in-Differences Estimators.” SSRN Electronic Journal. https://doi.org/10.2139/ssrn.3906345.
Zoch, Gundula, and Stefanie Heyne. 2023. “The Evolution of Family Policies and Couples’ Housework Division After Childbirth in Germany, 1994.” Journal of Marriage and Family 85 (5): 1067–86. https://doi.org/10.1111/jomf.12938.