# Lookahead Bias in Vectorized Backtesting: A Noise Harness Diagnostic

This repository accompanies the paper documenting a lookahead-bias defect found
in a vectorized backtester (the "K12" kernel) and a noise-harness methodology
that detects this class of bug using pure Geometric Brownian Motion (GBM) data.
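
The idea behind a noise harness can be illustrated with a minimal sketch. This is not the K12 kernel or the harness shipped under `experiments/`; the function names, the toy sign-of-return strategy, and the threshold below are all assumptions made for illustration. On pure GBM data no strategy should show a statistically significant edge, so a backtest that does is most likely reading future information.

```python
import numpy as np


def gbm_prices(n_steps, mu=0.0, sigma=0.01, s0=100.0, seed=0):
    """Simulate one GBM price path (hypothetical helper, per-bar mu and sigma)."""
    rng = np.random.default_rng(seed)
    log_returns = (mu - 0.5 * sigma**2) + sigma * rng.standard_normal(n_steps)
    return s0 * np.exp(np.concatenate(([0.0], np.cumsum(log_returns))))


def toy_strategy_pnl(prices, lookahead):
    """Per-bar PnL of a toy sign-of-return strategy (illustrative only).

    lookahead=True  : the position held over bar t uses bar t's own return,
                      which is the kind of defect the harness should expose.
    lookahead=False : the position held over bar t uses bar t-1's return.
    """
    returns = np.diff(np.log(prices))
    signal = np.sign(returns)
    if lookahead:
        position = signal
    else:
        # Shift the signal by one bar so only past information is used.
        position = np.concatenate(([0.0], signal[:-1]))
    return position * returns


def edge_t_stat(pnl):
    """t-statistic of the mean per-bar PnL against a zero-edge null."""
    return pnl.mean() / (pnl.std(ddof=1) / np.sqrt(len(pnl)))


def flags_lookahead(pnl, t_threshold=5.0):
    """Flag a backtest whose edge on pure noise is implausibly large.

    The threshold of 5.0 is an arbitrary choice for this sketch.
    """
    return abs(edge_t_stat(pnl)) > t_threshold


prices = gbm_prices(n_steps=100_000)
biased = toy_strategy_pnl(prices, lookahead=True)
honest = toy_strategy_pnl(prices, lookahead=False)
print("biased flagged:", flags_lookahead(biased))
print("honest flagged:", flags_lookahead(honest))
```

In this sketch the biased variant earns the absolute value of every return, so its mean PnL is far from zero, while the lagged variant has no systematic edge on noise.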

## Structure

```
paper/        LaTeX source (revtex4-2)
experiments/  Description and specs of the 5 planned experiments
data/         Instructions to obtain the BTCUSDT 1m dataset (no large binaries in git)
results/      Raw experiment outputs (json/csv), git-tracked once produced
audit/input/  Forensic copy of the original K12 code/data/results for reproduction
```

## How to reproduce

1. Read `experiments/README.md` for the experiment list and what each one tests.
2. Read `data/README.md` to obtain the dataset (or regenerate synthetic GBM data).
3. Run the scripts under `experiments/`. CI also runs the noise harness
   automatically on every PR (see `.github/workflows/noise-harness.yml`).
4. Compare your output against the reference files in `results/`.

## License — read this before reusing anything

This repository carries **two separate licenses** for two separate kinds of content:

| Content | License | File |
|---|---|---|
| Code: experiment scripts, harness, CI workflows, anything under `experiments/`, `data/`, `.github/` | **MIT** | [`LICENSE`](LICENSE) |
| Paper text and figures, anything under `paper/` | **CC-BY 4.0** | [`LICENSE-TEXT-CC-BY-4.0.md`](LICENSE-TEXT-CC-BY-4.0.md) |

Use the code freely, commercially or not, with attribution (MIT terms).
Reuse the paper text/figures freely, commercially or not, with attribution (CC-BY 4.0 terms).
These are independent grants — reusing the code does not require complying with CC-BY, and vice versa.

## Authorship

See [`AUTHORS.md`](AUTHORS.md).