# About RECAST

## What is RECAST?
RECAST stands for Replication and Extension with Causal AI Statistical Toolkit. It is an autonomous multi-agent pipeline that takes a published econometrics paper and its replication data and produces a RECAST — a complete package of replicated results, a DoubleML extension, and a referee-reviewed report.
## Terminology
| Term | Meaning |
|---|---|
| to RECAST a paper | To run the full pipeline: replicate + extend + review |
| RECASTed | A paper that has completed the full pipeline |
| a RECAST | The output package: results, DML extension, and peer-reviewed report |
| RECASTing | The process of running the pipeline |
Usage examples: “This paper has been RECASTed.” · “Run /recast to RECAST your paper.” · “The RECAST of Finkelstein (2012) finds…”
## What is DoubleML?
Double/Debiased Machine Learning is a framework for causal inference that uses cross-fitting to remove the regularization bias introduced when machine learning methods estimate nuisance parameters. Proposed by Chernozhukov et al. (2018), it yields root-n consistent, asymptotically normal estimates of structural parameters even when the first stage is estimated with high-dimensional or nonparametric ML models.
DML is particularly useful for replication because it relaxes the functional-form assumptions of traditional IV and OLS estimators while retaining their causal identification arguments.
## What is a Causal Forest?
A Causal Forest is a nonparametric method for estimating heterogeneous treatment effects (Athey, Tibshirani, and Wager, 2019). Built on honest random forests, it estimates individual-level Conditional Average Treatment Effects (CATEs) — how much each unit benefits from treatment — and provides valid confidence intervals through sample splitting.
RECAST uses EconML’s CausalForestDML (which combines DML residualization with forest-based heterogeneity estimation) and CausalIVForest (for IV designs with binary instruments).
## Two pipelines
RECAST offers two extension methods, run via separate commands:
| Command | Method | Best for |
|---|---|---|
| /recast | DoubleML (PLIV/PLR) | Robust ATE under flexible nuisance; comparing learners |
| /recast-cf | Causal Forest | Individual-level treatment effects; heterogeneity drivers |
Both methods share the same replication stages (1–3) and the same review loop (referees + revision).
## When does causal ML help? An honest assessment
These methods are not always an improvement over traditional econometrics. Here is when they add value — and when they don’t:
**DoubleML adds value when:**
- The original specification relies on strong functional form assumptions (linear, log-linear)
- There are many potential controls and concern about model selection
- You want to check robustness of the ATE to flexible first-stage estimation
**DoubleML adds little when:**
- The experiment is clean and well-randomized (e.g., Oregon lottery) — the parametric and DML estimates will be similar
- The nuisance functions are inherently unpredictable (randomized instrument → near-zero R²)
- Sample size is very small (cross-fitting with K=5 needs decent fold sizes)
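The second point above suggests a quick diagnostic (a sketch, not part of the pipeline itself): if the treatment or instrument is essentially unpredictable from covariates, the cross-fitted R² of the first-stage model will be near zero, and flexible nuisance estimation has little to work with.

```python
# Diagnostic sketch: cross-fitted R² of the first-stage model.
# A randomized instrument yields near-zero R²; a confounded treatment does not.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 8))
d_random = rng.normal(size=n)                      # e.g. a lottery-style instrument
d_confounded = X[:, 0] + 0.5 * rng.normal(size=n)  # treatment driven by covariates

scores = {}
for name, d in [("randomized", d_random), ("confounded", d_confounded)]:
    scores[name] = cross_val_score(
        RandomForestRegressor(n_estimators=100, random_state=0),
        X, d, cv=5, scoring="r2").mean()
    print(f"{name}: cross-fitted R^2 = {scores[name]:.2f}")
```

A near-zero (or negative) score in the first case is not a failure of the ML model; it is the design working as intended, and a sign the parametric and DML estimates will roughly coincide.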
**Causal Forest adds value when:**
- You have rich individual-level covariates that could drive heterogeneity
- The policy question is “who benefits most?” not just “does it work on average?”
- Treatment varies at the individual level
**Causal Forest adds little when:**
- Treatment is assigned at the group level (e.g., ethnic group) — no individual variation to split on
- Covariates are sparse (only fixed effects / dummies) — the forest splits on design artifacts, not substance
- N is very small — honest splitting requires sufficient leaf samples
## Why automate replication?
Replication studies are methodologically valuable but labour-intensive. Researchers must:
- Reconstruct the original author’s cleaning and specification decisions from sparse documentation
- Re-implement the analysis in a reproducible environment
- Design and justify an ML-based extension
- Write up and defend the results
RECAST automates steps 2–4. The framework cannot replace human judgment on the identification strategy (step 1), but the paper intelligence notebook extracts the key decisions from the PDF, making them explicit and auditable.
## Limitations
- The pipeline automates structure, not judgment. Referee reports are AI-generated and may miss domain-specific subtleties.
- The DoubleML extension inherits the original paper’s identification assumptions. If the instrument is weak or the exclusion restriction is questionable, the DML estimate will be equally questionable.
- Stage 1 (paper intelligence) depends on PDF quality. Scanned or poorly formatted papers may produce an incomplete `paper_spec.json`.
## Suggest a paper
Want to see a paper RECASTed? Submit it here — just provide the paper link and replication data URL. We’ll review it and run the pipeline.
Track the status of all submissions on the Tracker page.
## Citation
If you use RECAST in your research, please cite:
```bibtex
@software{recast_gallea,
  title  = {{RECAST}: Replication and Extension with Causal AI Statistical Toolkit},
  author = {Gallea, Quentin},
  year   = {2025},
  url    = {https://github.com/qgallea/recast-causal-ai}
}
```

## References
Athey, S., Tibshirani, J., & Wager, S. (2019). Generalized random forests. The Annals of Statistics, 47(2), 1148–1178.
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., & Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters. The Econometrics Journal, 21(1), C1–C68.