# Experiments

Five experiments, run in order. Each script is named `0N_<name>.py`. Scripts that don't exist yet are listed here as a spec so that the CI workflow and the paper's Section 6 stay in sync with what's actually implemented.

| # | Script | Purpose | Status |
|---|--------|---------|--------|
| 1 | `01_generate_gbm.py` | Generate pure-noise GBM price series (fixed seed, documented params) | pending |
| 2 | `02_baseline_replication.py` | Run K12 golden hyperparameters on real BTCUSDT 1m with the buggy backtester; expect Sharpe ≈ 14.49 | pending (needs `audit/input/code`) |
| 3 | `03_honest_replication.py` | Same hyperparameters and data with the `time_machine.py` engine; expect Sharpe ≈ -0.25 | pending (needs `audit/input/code`) |
| 4 | `04_noise_control.py` | Run both engines across ≥30 independent GBM seeds and compare the Sharpe distributions | pending |
| 5 | `05_noise_harness.py` | CI-gating version of experiment 4: fails the build if mean Sharpe on noise falls outside a pre-registered null band | pending |

## Reproducibility rules

- Every script must take `--seed` and print it in its output.
- Every output JSON must include: seed, kernel version/hash, library versions (numpy/pandas), and a UTC timestamp.
- No script may read from `audit/input/` in a way that couples the public reproduction path to the forensic copy. `audit/input/` is for our own verification, not for the published reproduction instructions.

## Environment

Pin dependencies in `requirements.txt` (to be added alongside the first script). CI installs from that file; see `.github/workflows/noise-harness.yml`.
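
As a rough illustration of the reproducibility rules, the sketch below shows one way a script such as `01_generate_gbm.py` might accept `--seed` and emit the required metadata. None of this is the project's actual implementation: the `--kernel-hash` and `--n-steps` flags, the JSON field names, the helper functions, and the GBM parameter defaults (`s0`, `mu`, `sigma`) are all hypothetical placeholders. It also uses the standard-library `random` module so that it runs without the pinned dependencies; a real script would presumably draw from numpy instead.

```python
import argparse
import json
import math
import random
from datetime import datetime, timezone
from importlib import metadata


def library_version(name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def generate_gbm(seed, n_steps, s0=100.0, mu=0.0, sigma=0.01):
    """Simulate one GBM price path with unit time steps.

    The defaults are placeholders, not the project's documented parameters.
    """
    rng = random.Random(seed)  # dedicated generator, so the seed fully fixes the path
    prices = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        # Exact log-normal update for dt = 1: S_{t+1} = S_t * exp(mu - sigma^2/2 + sigma*z)
        prices.append(prices[-1] * math.exp(mu - 0.5 * sigma**2 + sigma * z))
    return prices


def build_output(seed, kernel_hash, n_steps):
    """Assemble the output record with the metadata the rules ask for."""
    prices = generate_gbm(seed, n_steps)
    return {
        "seed": seed,
        "kernel_hash": kernel_hash,  # assumed to be supplied by the caller
        "library_versions": {n: library_version(n) for n in ("numpy", "pandas")},
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "n_prices": len(prices),
        "final_price": prices[-1],
    }


def main(argv=None):
    parser = argparse.ArgumentParser(description="Sketch of a seeded GBM generator")
    parser.add_argument("--seed", type=int, required=True)
    parser.add_argument("--kernel-hash", default="unknown")  # hypothetical flag
    parser.add_argument("--n-steps", type=int, default=1000)  # hypothetical flag
    args = parser.parse_args(argv)
    record = build_output(args.seed, args.kernel_hash, args.n_steps)
    print(json.dumps(record, indent=2))  # the seed appears in the printed output


# Example call with an explicit argument list, equivalent to `--seed 42 --n-steps 10`.
main(["--seed", "42", "--n-steps", "10"])
```

In a real script the final line would normally be replaced by an `if __name__ == "__main__": main()` guard so that arguments come from the command line. Using a dedicated `random.Random(seed)` instance rather than the module-level generator keeps the path independent of any other code that touches global random state.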