
Randomization inference, "ri" sampling_method in rwolf, gives too tight a sample of null t-statistics #717

Open · marcandre259 opened this issue Nov 15, 2024 · 3 comments

@marcandre259 (Contributor)

Possible issue I noticed while working on #698.

The behavior was initially noticed when comparing the "wild-bootstrap" and "ri" sampling_method p-values in a case where the parameter of interest has no association with the outcome.

Because the null t-distribution is too tight, the resulting p-value is too small.
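For context on why a too-tight null matters: the RI p-value is just the share of resampled t-statistics at least as extreme as the observed one, so a too-narrow null distribution mechanically shrinks it. A minimal sketch of that computation (the helper name and the +1 finite-sample correction are mine, not necessarily pyfixest's internals):

```python
import numpy as np

def empirical_pvalue(t_obs: float, t_null: np.ndarray) -> float:
    """Two-sided p-value from an empirical null distribution of t-statistics."""
    # Count null draws at least as extreme as the observed statistic; the
    # +1 correction keeps the p-value strictly above zero.
    return (1 + np.sum(np.abs(t_null) >= np.abs(t_obs))) / (1 + t_null.size)

# If the null draws cluster too tightly around zero, even a modest |t_obs|
# clears nearly all of them and the p-value collapses toward 1 / (reps + 1).
```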

To reproduce:

```python
import pyfixest as pf
import numpy as np

import matplotlib.pyplot as plt

# Get data and break any association between X1 and Y by shuffling X1.
data = pf.get_data()

# Note: the rng must actually be used; calling np.random.default_rng(232)
# without assigning it does not seed anything.
rng = np.random.default_rng(232)
data["X1"] = rng.choice(data["X1"], size=data.shape[0], replace=False)

fit = pf.feols("Y ~ X1", data=data)

fit.summary()
```

```
Estimation: OLS
Dep. var.: Y, Fixed effects: 0
Inference: iid
Observations: 998

| Coefficient   |   Estimate |   Std. Error |   t value |   Pr(>|t|) |   2.5% |   97.5% |
|:--------------|-----------:|-------------:|----------:|-----------:|-------:|--------:|
| Intercept     |     -0.160 |        0.119 |    -1.344 |      0.179 | -0.394 |   0.074 |
| X1            |      0.033 |        0.090 |     0.367 |      0.714 | -0.144 |   0.211 |

RMSE: 2.304   R2: 0.0
```

```python
seed = 111
df_wild, df_t_wild = fit.wildboottest(
    param="X1", reps=9999, return_bootstrapped_t_stats=True, seed=seed
)

rng = np.random.default_rng(232)
fit.ritest(
    resampvar="X1", reps=9999, type="randomization-t",
    store_ritest_statistics=True, rng=rng
)

# Compare the two empirical null distributions of the t-statistic.
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(12, 4))
ax[0].hist(fit._ritest_statistics, label="RI t stats", alpha=0.4)
ax[0].axvline(x=fit._ritest_sample_stat, linestyle="--", label="Observed RI t stat", color="black")
ax[0].legend()
ax[1].hist(df_t_wild, label="Wild t stats", alpha=0.4, color="orange")
ax[1].axvline(df_wild["t value"], label="Observed Wild t stat", color="black", linestyle="--")
ax[1].legend()
```

[Figure: comparing_null_t_empirical_distributions — side-by-side histograms of the RI and wild-bootstrap null t-statistics, with the observed t-statistic marked; the RI distribution is much narrower.]

@s3alfisc (Member)

Yes, this looks wrong! I'll take a look later. Thanks for reporting!

@s3alfisc (Member)

On second thought, this might not necessarily be a bug, for two reasons:

- Slightly different nulls: the randomization inference estimator tests a "sharp" null hypothesis of no effect for any individual, i.e. we test $H_0: Y_i(1) = Y_i(0)$ for all $i$, which is slightly different from testing that the average treatment effect is zero (which is what we do when we run inference via the bootstrap). A hand-rolled sketch of the randomization-t under the sharp null follows after this list.
- Different properties of the tests: it might be that the bootstrap is more conservative (or the ritest less conservative), leading to different distributions.
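To make the first point concrete, here is what the randomization-t does under the sharp null, hand-rolled with plain numpy on synthetic data (iid standard errors; a sketch of the idea, not pyfixest's implementation):

```python
import numpy as np

def ols_tstat(y: np.ndarray, x: np.ndarray) -> float:
    """Slope t-statistic for y ~ 1 + x with iid standard errors."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se

rng = np.random.default_rng(232)
y = rng.normal(size=500)
x = rng.normal(size=500)

t_obs = ols_tstat(y, x)

# Under the sharp null Y_i(1) = Y_i(0), outcomes are fixed regardless of
# treatment, so every permutation of x yields a valid draw from the null
# distribution of the t-statistic.
t_null = np.array([ols_tstat(y, rng.permutation(x)) for _ in range(999)])

p = (1 + np.sum(np.abs(t_null) >= np.abs(t_obs))) / (1 + t_null.size)
# With data generated under the null, t_null should be roughly standard
# normal in spread, not much tighter than the wild-bootstrap draws.
```

If pyfixest's `type="randomization-t"` draws come out much tighter than a sketch like this, that would point at the resampling step rather than at the sharp-vs-average distinction.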

Will have to think about this more. I took a look at the code and it looked mostly fine, though I'll have to check again. The difference in the widths of the sampling distributions does indeed look suspicious.

@marcandre259 (Contributor, Author) commented Nov 16, 2024

Hi @s3alfisc,

Given that randomization inference tests the sharp hypothesis, I would then expect the bootstrap approach to be the less conservative one.

I took a quick look at this paper, which confirms this with simulations in Table 1 (Fisher -> sharp null, Neyman -> average null, afaik).

Namely, sharp null rejection implies average null rejection.
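For reference, the relationship between the two nulls (the sharp null is the stronger hypothesis):

$$H_0^{\text{sharp}}: Y_i(1) = Y_i(0) \;\; \forall i \quad \Longrightarrow \quad H_0^{\text{avg}}: \mathbb{E}\big[Y_i(1) - Y_i(0)\big] = 0$$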

As far as progress on #698 goes, I'll get back to including RI for Westfall-Young now that this issue is open.
