Notes on:
Abadie, Athey, Imbens, Wooldridge (2022) When Should You Adjust Standard Errors for Clustering?

1 The clustering problem as a design problem

The authors argued that the clustering problem is “in essence a design problem, either a sampling design or an experimental design issue.”

The common argument – “The clustering problem is caused by the presence of a common unobserved random shock at the group level that will lead to correlation between all observations within each group”. There are some problems with this argument:

  • It’s too broad to apply. By this argument, a regression of wages on years of education could be clustered by age cohort just as easily as by state.
  • Why don’t researchers cluster by groups in randomized experiments?

1.1 Sampling design issue

Clustering problem due to sampling design issue
This happens when the sampling follows a two-stage process:
  1. A subset of clusters is sampled randomly from a population of clusters.
  2. Units are sampled randomly from the sampled clusters.
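
The two stages above can be sketched in a few lines of Python. This is only an illustration: the function name `two_stage_sample`, the toy population of 50 clusters with 200 units each, and the sample sizes are all assumptions made for these notes, not anything taken from the paper.

```python
import random

def two_stage_sample(units_by_cluster, n_clusters, n_per_cluster, rng):
    """Draw a clustered sample: first clusters, then units within them."""
    # Stage 1: sample a subset of clusters from the population of clusters.
    sampled_clusters = rng.sample(sorted(units_by_cluster), n_clusters)
    sample = []
    for c in sampled_clusters:
        members = units_by_cluster[c]
        # Stage 2: sample units at random within each sampled cluster.
        k = min(n_per_cluster, len(members))
        sample.extend((c, u) for u in rng.sample(members, k))
    return sample

rng = random.Random(0)
# Hypothetical population: 50 clusters of 200 units each.
population = {c: [f"unit_{c}_{i}" for i in range(200)] for c in range(50)}
sample = two_stage_sample(population, n_clusters=10, n_per_cluster=20, rng=rng)
```

The key feature is that 40 of the 50 clusters contribute no units at all, so the sample says nothing directly about them. That is the sense in which the sampling is clustered.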

This case is not so relevant in economics: “Most of the samples that we work with are close enough to random that we typically worry more about the dependence due to a group structure than clustering due to stratification.” (in fn. 10, Angrist and Pischke 2009, 309).

1.2 Experimental design issue

Clustering problem due to experimental design issue
This happens when clusters of units, rather than individual units, are assigned to treatment.

This perspective best fits the typical application in economics, but, surprisingly, it is rarely presented explicitly as the motivation for cluster adjustments to the standard errors.
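
The difference between unit-level and cluster-level assignment can be made concrete with a small sketch. The function names, the 20 clusters of 30 units, and the assignment probability of 0.5 are hypothetical choices for illustration only.

```python
import random

def assign_by_unit(cluster_ids, p, rng):
    """Non-clustered assignment: each unit gets its own independent draw."""
    return [1 if rng.random() < p else 0 for _ in cluster_ids]

def assign_by_cluster(cluster_ids, p, rng):
    """Clustered assignment: one draw per cluster, shared by all its units."""
    draw = {c: (1 if rng.random() < p else 0) for c in sorted(set(cluster_ids))}
    return [draw[c] for c in cluster_ids]

def clusters_with_mixed_treatment(cluster_ids, w):
    """Count clusters that contain both treated and control units."""
    seen = {}
    for c, wi in zip(cluster_ids, w):
        seen.setdefault(c, set()).add(wi)
    return sum(1 for values in seen.values() if len(values) > 1)

rng = random.Random(1)
cluster_ids = [c for c in range(20) for _ in range(30)]  # 20 clusters, 30 units each
w_unit = assign_by_unit(cluster_ids, 0.5, rng)
w_cluster = assign_by_cluster(cluster_ids, 0.5, rng)
```

Under `assign_by_cluster` no cluster ever contains both treated and control units, whereas under `assign_by_unit` treatment varies within clusters. Within-cluster variation in treatment is also what the bias correction mentioned in Section 2.1 requires.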

2 Main takeaways

2.1 Proposed procedure for the cluster problem

First,
the researcher should assess whether the sampling process is clustered or not, and whether the assignment mechanism is clustered.
  • If the answer to both is no, one should not adjust the standard errors for clustering, irrespective of whether such an adjustment would change the standard errors.
Second,
in general, the standard Liang-Zeger clustering adjustment (Liang and Zeger 1986) is conservative unless one of three conditions holds:
  1. there is no heterogeneity in treatment effects
  2. we observe only a few clusters from a large population of clusters
  3. a vanishing fraction of units in each cluster is sampled, e.g. at most one unit is sampled per cluster.
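
For reference, the two variance estimators being compared can be written for an OLS regression as follows. The notation (regressor vector X_i, residual \hat\varepsilon_i, clusters c = 1, ..., C) is assumed for these notes rather than copied from the paper, and finite-sample degrees-of-freedom corrections are omitted.

```latex
% Heteroskedasticity-robust (Eicker-Huber-White) variance
\hat{V}_{\mathrm{EHW}}
  = \Big(\sum_{i} X_i X_i'\Big)^{-1}
    \Big(\sum_{i} \hat\varepsilon_i^{2}\, X_i X_i'\Big)
    \Big(\sum_{i} X_i X_i'\Big)^{-1}

% Cluster-robust (Liang-Zeger) variance
\hat{V}_{\mathrm{LZ}}
  = \Big(\sum_{i} X_i X_i'\Big)^{-1}
    \Big(\sum_{c=1}^{C}
      \Big(\sum_{i \in c} X_i \hat\varepsilon_i\Big)
      \Big(\sum_{i \in c} X_i \hat\varepsilon_i\Big)'\Big)
    \Big(\sum_{i} X_i X_i'\Big)^{-1}
```

The two expressions differ only in the middle term: the cluster version sums the scores X_i \hat\varepsilon_i within each cluster before taking outer products, so it also includes every within-cluster cross-product between different units.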
Third,
the (positive) bias from standard clustering adjustments can be corrected if all clusters are included in the sample and, further, there is variation in treatment assignment within each cluster. For this case the authors propose a new variance estimator.
Fourth,
if one estimates a fixed effects regression (with fixed effects at the level of the relevant clusters), the analysis changes. Then, heterogeneity in the treatment effects is a requirement for a clustering adjustment to be necessary.

2.2 Why there would be no formal test for the clustering problem

The authors argue that the design perspective on clustering, related to randomization inference (e.g., Rosenbaum [2002], Athey and Imbens [2017]), clarifies the role of clustering adjustments to standard errors and aids in the decision whether to, and at what level to, cluster, both in standard clustering settings and in more general spatial correlation settings (Bester et al. [2009], Conley [1999], Barrios et al. [2012], Cressie [2015]). For example, they show that, contrary to common wisdom, correlations between residuals within clusters are neither necessary nor sufficient for cluster adjustments to matter. Similarly, correlations between regressors within clusters are neither necessary nor sufficient for cluster adjustments to matter or to justify clustering. In fact, cluster adjustments can matter, and substantially so, even when both residuals and regressors are uncorrelated within clusters. Moreover, the question of whether, and at what level, to adjust standard errors for clustering is a substantive question that cannot be informed solely by the data. In other words, although the data are informative about whether clustering matters for the standard errors, they are only partially informative about whether one should adjust the standard errors for clustering. A consequence is that in general clustering at too aggregate a level is not innocuous, and can lead to standard errors that are unnecessarily conservative, even in large samples.

It is important to:

  • Define the estimand clearly

  • Articulate precisely the relation between the sample and the population:

    1. how units in the sample were selected and, most importantly, whether there are clusters in the population of interest that are not represented in the sample
    2. how units were assigned to the various treatments, and whether this assignment was clustered.

    …If either the sampling or assignment varies systematically with groups in the sample, clustering will in general be justified.

2.3 Debunked misconceptions

  • Clustering matters only if the residuals and the regressors are both correlated within clusters
  • If clustering matters, one should cluster
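
A small Monte Carlo sketch can illustrate why “if clustering matters, one should cluster” fails. Everything in it is an assumption chosen for illustration and is not the paper's own simulation: units are sampled at random (each unit's cluster is drawn uniformly), treatment is assigned at the unit level, and the treatment effect is +1 in even-numbered clusters and -1 in odd-numbered ones. Neither sampling nor assignment is clustered, so the design argument says the heteroskedasticity-robust standard error is the appropriate one. Both standard errors are computed without degrees-of-freedom corrections.

```python
import random
import statistics

def simulate_once(rng, n_units=2000, n_clusters=50):
    """One sample: random sampling of units, unit-level random assignment."""
    cluster = [rng.randrange(n_clusters) for _ in range(n_units)]
    w = [1 if rng.random() < 0.5 else 0 for _ in range(n_units)]
    # Heterogeneous effects: +1 in even clusters, -1 in odd clusters.
    y = [(1.0 if c % 2 == 0 else -1.0) * wi + rng.gauss(0.0, 1.0)
         for c, wi in zip(cluster, w)]

    # OLS of y on a constant and w (equivalent to a difference in means).
    w_bar = statistics.fmean(w)
    y_bar = statistics.fmean(y)
    d = [wi - w_bar for wi in w]
    s = sum(di * di for di in d)
    tau_hat = sum(di * yi for di, yi in zip(d, y)) / s
    alpha_hat = y_bar - tau_hat * w_bar
    resid = [yi - alpha_hat - tau_hat * wi for yi, wi in zip(y, w)]
    scores = [di * ei for di, ei in zip(d, resid)]

    # Robust (EHW) standard error: square each unit's score, then sum.
    se_robust = sum(sc * sc for sc in scores) ** 0.5 / s

    # Cluster (Liang-Zeger) standard error: sum scores within each cluster
    # first, then square the cluster totals.
    cluster_score = [0.0] * n_clusters
    for c, sc in zip(cluster, scores):
        cluster_score[c] += sc
    se_cluster = sum(cs * cs for cs in cluster_score) ** 0.5 / s
    return tau_hat, se_robust, se_cluster

rng = random.Random(12345)
draws = [simulate_once(rng) for _ in range(200)]
sd_tau = statistics.stdev([t for t, _, _ in draws])
mean_se_robust = statistics.fmean([r for _, r, _ in draws])
mean_se_cluster = statistics.fmean([c for _, _, c in draws])
print(f"sd of tau_hat across samples: {sd_tau:.4f}")
print(f"mean robust SE:               {mean_se_robust:.4f}")
print(f"mean cluster SE:              {mean_se_cluster:.4f}")
```

In this setup the cluster standard error should come out well above the robust one, so clustering “matters”, yet the robust standard error is the one that tracks the actual spread of the estimates across repeated samples. The cluster adjustment is therefore conservative here, which is the point of the second misconception.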

3 See also

4 References

Angrist, Joshua D., and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton University Press. https://www.xarg.org/ref/a/0691120358/.
Liang, Kung-Yee, and Scott L. Zeger. 1986. “Longitudinal Data Analysis Using Generalized Linear Models.” Biometrika 73 (1): 13–22. https://doi.org/10.1093/biomet/73.1.13.

This post is in the collection of my public reading notes.