# Data

This repository does not track raw market data (see `.gitignore`). Large
binaries don't belong in a public git repo, and Binance's public API makes
the data trivially reconstructible.

## BTCUSDT 1-minute OHLCV

Used in experiments 2 and 3 (baseline and honest replication against real
data). To obtain it:

1. If `download_data.py` exists in this directory, run it. It pulls the
   exact date range used in the original experiment from the public Binance
   API and writes `BTCUSDT_1m.parquet`.
2. Verify that the SHA-256 hash of the resulting file matches the one
   recorded in `audit/input/MANIFEST.md` (the forensic record of the original
   dataset used when the bug was found).

```bash
sha256sum BTCUSDT_1m.parquet
```

If the hash doesn't match, the date range or the Binance API response has
drifted. Do not proceed with replication until the mismatch is reconciled.

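Where `sha256sum` is not available, the same check can be sketched in Python. This is a minimal illustration rather than a script shipped with this repository: the helper names `sha256_of_file` and `matches_manifest` are hypothetical, and the expected hash is assumed to be copied by hand from `audit/input/MANIFEST.md`, since the manifest's layout is not described here.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_manifest(path, expected_hex):
    """Compare a file's digest with a hash copied from the manifest.

    `expected_hex` is assumed to be pasted in by hand; surrounding
    whitespace and letter case are ignored.
    """
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling `matches_manifest("BTCUSDT_1m.parquet", expected_hex)` returns `False` on any mismatch, which is the signal to stop and reconcile before replicating.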
## Synthetic GBM data (noise harness)

Generated on the fly by `experiments/01_generate_gbm.py`. No download is
needed. That is the point of using synthetic null data: it has no external
dependencies and is perfectly reproducible from a seed.
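To illustrate what "reproducible from a seed" means, here is a minimal sketch of seeded geometric Brownian motion using only the Python standard library. It is not the code in `experiments/01_generate_gbm.py`; the function name `generate_gbm` and every parameter default below are assumptions chosen for illustration.

```python
import math
import random


def generate_gbm(n_steps, s0=100.0, mu=0.0, sigma=0.01, dt=1.0, seed=0):
    """Simulate one GBM price path with n_steps + 1 points.

    All defaults are illustrative assumptions, not the harness's settings.
    """
    # A local generator keeps the path independent of global random state.
    rng = random.Random(seed)
    drift = (mu - 0.5 * sigma ** 2) * dt  # drift of the log-price per step
    vol = sigma * math.sqrt(dt)           # standard deviation of each log-return
    prices = [s0]
    for _ in range(n_steps):
        shock = rng.gauss(0.0, 1.0)       # standard normal increment
        prices.append(prices[-1] * math.exp(drift + vol * shock))
    return prices
```

Two calls with the same seed return identical paths in the same Python environment, so a run can be repeated without storing any data file.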