# Data

This repository does not track raw market data (see `.gitignore`). Large binaries don't belong in a public git repo, and Binance's public API makes the data trivially reconstructible.

## BTCUSDT 1-minute OHLCV

Used in experiments 2 and 3 (baseline and honest replication against real data). To obtain it:

1. If `download_data.py` exists in this directory, run it. It pulls the exact date range used in the original experiment from the public Binance API and writes `BTCUSDT_1m.parquet`.
2. Verify that the SHA-256 hash of the resulting file matches the one recorded in `audit/input/MANIFEST.md` (the forensic record of the original dataset used when the bug was found).

```bash
sha256sum BTCUSDT_1m.parquet
```

If the hash doesn't match, the date range or the Binance API response has drifted. Do not proceed with replication until it's reconciled.

## Synthetic GBM data (noise harness)

Generated on the fly by `experiments/01_generate_gbm.py`. No download needed. This is the point of using synthetic null data: it requires zero external dependency and is perfectly reproducible from a seed.
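
To illustrate what "perfectly reproducible from a seed" means for the synthetic GBM data, here is a minimal standard-library sketch of a seeded geometric Brownian motion generator. It is not the contents of `experiments/01_generate_gbm.py`: the function name `generate_gbm`, its parameters, and the default values (start price, drift, volatility, one-minute time step) are all assumptions chosen for illustration.

```python
import math
import random


def generate_gbm(n_steps, s0=100.0, mu=0.0, sigma=0.5,
                 dt=1.0 / (365 * 24 * 60), seed=42):
    """Return n_steps + 1 prices following geometric Brownian motion.

    Hypothetical helper: mu and sigma are annualized, dt is one minute
    expressed in years. All defaults are illustrative, not the values
    used by the repository's experiments.
    """
    rng = random.Random(seed)  # local RNG, so global random state is untouched
    drift = (mu - 0.5 * sigma ** 2) * dt  # log-return drift per step
    vol = sigma * math.sqrt(dt)           # log-return std dev per step
    prices = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)           # standard normal shock
        prices.append(prices[-1] * math.exp(drift + vol * z))
    return prices


if __name__ == "__main__":
    first = generate_gbm(1_000, seed=42)
    second = generate_gbm(1_000, seed=42)
    # Same seed, same path: no stored data file is needed.
    print(first == second)
```

Because the path is a pure function of the seed and parameters, recording those values is enough to regenerate the exact series, which is why no hash check is needed for this dataset.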