# Experiments

Five experiments, run in order. Each script is `0N_<name>.py`. Scripts that
don't exist yet are listed here as a spec so the CI workflow and the paper's
Section 6 stay in sync with what's actually implemented.

| # | Script | Purpose | Status |
|---|--------|---------|--------|
| 1 | `01_generate_gbm.py` | Generate pure-noise GBM price series (fixed seed, documented params) | pending |
| 2 | `02_baseline_replication.py` | Run K12 golden hyperparameters on real BTCUSDT 1m, buggy backtester → expect Sharpe ≈ 14.49 | pending (needs `audit/input/code`) |
| 3 | `03_honest_replication.py` | Same hyperparameters/data, `time_machine.py` engine → expect Sharpe ≈ -0.25 | pending (needs `audit/input/code`) |
| 4 | `04_noise_control.py` | Run both engines across ≥30 independent GBM seeds, compare Sharpe distributions | pending |
| 5 | `05_noise_harness.py` | CI-gating version of experiment 4: fails the build if mean Sharpe on noise falls outside a pre-registered null band | pending |
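Since `01_generate_gbm.py` is still pending, the following is only a minimal sketch of what a seeded, drift-free GBM generator could look like. The function name `generate_gbm` and every parameter value (`s0`, `mu`, `sigma`, the one-minute `dt`) are assumptions chosen for illustration, not the documented parameters the real script will use.

```python
import numpy as np

MINUTES_PER_YEAR = 365 * 24 * 60  # assumed calendar for 1m bars


def generate_gbm(n_steps, seed, s0=100.0, mu=0.0, sigma=0.5,
                 dt=1.0 / MINUTES_PER_YEAR):
    """Return n_steps + 1 GBM prices starting at s0.

    All defaults are hypothetical. mu=0.0 makes the price a martingale,
    i.e. there is no drift for a strategy to exploit.
    """
    rng = np.random.default_rng(seed)  # seeded generator: same seed, same path
    z = rng.standard_normal(n_steps)
    # Exact log-space GBM step: (mu - sigma^2 / 2) * dt + sigma * sqrt(dt) * Z
    log_steps = (mu - 0.5 * sigma**2) * dt + sigma * np.sqrt(dt) * z
    log_path = np.concatenate(([0.0], np.cumsum(log_steps)))
    return s0 * np.exp(log_path)
```

Working in log space keeps every price strictly positive, and passing the seed explicitly (rather than relying on global RNG state) means two calls with the same seed return identical series.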
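The gating logic described for experiments 4 and 5 could be sketched roughly as follows. Everything here is an assumption made for illustration: `placeholder_engine` stands in for the real backtest engines (which are not part of this sketch), the GBM parameters are arbitrary, and the band `(-5.0, 5.0)` is a made-up value, not the pre-registered null band.

```python
import numpy as np

MINUTES_PER_YEAR = 365 * 24 * 60  # assumed annualisation factor for 1m bars


def gbm_prices(n_steps, seed, s0=100.0, sigma=0.5):
    """Drift-free GBM prices (hypothetical parameters)."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / MINUTES_PER_YEAR
    steps = -0.5 * sigma**2 * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_steps)
    return s0 * np.exp(np.concatenate(([0.0], np.cumsum(steps))))


def sharpe_ratio(returns, periods_per_year=MINUTES_PER_YEAR):
    """Annualised Sharpe ratio of per-bar returns (risk-free rate taken as 0)."""
    returns = np.asarray(returns, dtype=float)
    std = returns.std(ddof=1)
    if std == 0.0:
        return 0.0
    return float(returns.mean() / std * np.sqrt(periods_per_year))


def placeholder_engine(prices, seed):
    """Stand-in for a backtest engine: random long/flat positions.

    Positions are drawn independently of the prices, so there is no
    look-ahead and no real signal.
    """
    rng = np.random.default_rng(seed)
    bar_returns = np.diff(prices) / prices[:-1]
    positions = rng.integers(0, 2, size=bar_returns.size)  # 0 = flat, 1 = long
    return positions * bar_returns


def noise_sharpes(seeds, n_steps=10_000):
    """One Sharpe ratio per seed, each on an independent noise series."""
    return np.array([
        sharpe_ratio(placeholder_engine(gbm_prices(n_steps, s), s + 1_000_000))
        for s in seeds
    ])


def within_null_band(sharpes, band):
    """Return (passed, mean_sharpe) for a (low, high) band."""
    low, high = band
    mean_sharpe = float(np.mean(sharpes))
    return low <= mean_sharpe <= high, mean_sharpe


def run_gate(seeds=range(30), band=(-5.0, 5.0)):
    """Return a process exit code: 0 if the mean Sharpe is inside the band."""
    ok, mean_sharpe = within_null_band(noise_sharpes(seeds), band)
    print(f"seeds={len(seeds)} mean_sharpe={mean_sharpe:.3f} band={band} pass={ok}")
    return 0 if ok else 1
```

A CI entry point would then end with something like `sys.exit(run_gate())`, so a mean Sharpe outside the band fails the build. When choosing the band, note that the sampling error of an annualised Sharpe on pure noise grows roughly with `sqrt(periods_per_year / n_bars)`, so the band width has to be fixed together with the series length and the number of seeds.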
## Reproducibility rules

- Every script must take `--seed` and print it in its output.
- Every output JSON must include: seed, kernel version/hash, library versions
  (numpy/pandas), and a UTC timestamp.
- No script may read from `audit/input/` in a way that couples the public
  reproduction path to the forensic copy. `audit/input/` is for our own
  verification, not for the published reproduction instructions.
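One way a script could satisfy the first two rules is sketched below. The JSON field names (`kernel_version`, `library_versions`, `timestamp_utc`, `results`) and the helper names are assumptions, and the kernel version/hash is treated as an opaque string supplied by the caller, since how it is computed is not specified here.

```python
import argparse
import json
from datetime import datetime, timezone
from importlib import metadata


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Experiment script skeleton (illustrative).")
    parser.add_argument("--seed", type=int, required=True,
                        help="RNG seed, echoed in the output")
    return parser.parse_args(argv)


def library_version(name):
    """Installed version of a package, or a marker string if it is missing."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return "not installed"


def build_output(seed, kernel_version, results):
    """Wrap results with the metadata the reproducibility rules ask for."""
    return {
        "seed": seed,
        "kernel_version": kernel_version,  # opaque string; hypothetical field name
        "library_versions": {name: library_version(name)
                             for name in ("numpy", "pandas")},
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "results": results,
    }


def main(argv=None):
    args = parse_args(argv)
    # "UNKNOWN" is a placeholder; a real script would pass the actual hash.
    output = build_output(args.seed, kernel_version="UNKNOWN", results={})
    print(json.dumps(output, indent=2))  # the seed is printed as part of the JSON
    return output
```

Making `--seed` a required argument means a run cannot silently fall back to an unrecorded default, and reading versions through `importlib.metadata` records what is actually installed rather than what `requirements.txt` claims.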
## Environment

Pin dependencies in `requirements.txt` (to be added alongside the first
script). CI installs from that file; see `.github/workflows/noise-harness.yml`.