# Seed-robust QI lane

The seed-robust quantitative indicator (QI) lane measures whether FAST
reduced-MHD trajectories are stable under tiny stochastic initial-condition
perturbations. It is a reproducibility and sensitivity gate, not a production
uncertainty-quantification study.

## What the lane checks

For a deterministic base ensemble seed, MHX expands a fixed list of child seeds,
adds zero-mean, unit-RMS-scaled perturbations to the cosine-tearing initial
condition, evolves each FAST trajectory, and records:

- `gamma_fit`
- `final_total_energy`
- `final_magnetic_energy`
- `final_kinetic_energy`
- `final_magnetic_divergence_linf`

For each metric the lane writes the ensemble samples, mean, sample standard
deviation, coefficient of variation (CV), minimum, maximum, and pass/fail gate.

The reported coefficient of variation is

$$
\mathrm{CV}(m)=
\frac{\operatorname{std}_s[m_s]}
     {\max(|\operatorname{mean}_s[m_s]|,\epsilon_m)},
$$

where $s$ indexes seeds and $\epsilon_m$ is the metric-specific floor used to
avoid meaningless ratios when the physical quantity is expected to be near
zero. Gates are applied to CV, absolute mean, or both depending on the metric.

## Physics-motivated gates

The default gates are deliberately conservative for FAST validation:

- Growth/decay fits should be seed robust at the few-percent CV level.
- Magnetic and total energy should be insensitive to tiny perturbations at the
  `1e-3` CV level.
- Kinetic energy may be near zero in the short linear FAST run, so it is gated
  by both CV and absolute mean.
- `B_perp = (∂_y ψ, -∂_x ψ)` is analytically solenoidal, so spectral magnetic
  divergence should remain near roundoff.

These gates complement the FKR, current-sheet, energy-budget, and duration
validation lanes by checking that stochastic seed choice does not dominate the
reported FAST trajectory diagnostics.

The seed perturbation itself is deliberately smooth and tiny. It is not a
surrogate for broadband turbulent noise, kinetic particle noise, or a physical
uncertainty model. Its purpose is narrower: catch fragile diagnostics and
accidental seed dependence before larger campaigns are launched.

## Artifacts

The single-amplitude writer is available as a Python API:

```python
from mhx.benchmarks import write_seed_robust_qi_validation

write_seed_robust_qi_validation("outputs/benchmarks/seed_robust_qi")
```

or from the CLI:

```bash
mhx benchmark seed-robust-qi \
  --outdir outputs/benchmarks/seed_robust_qi \
  --seeds 0,1,2,3 \
  --nx 16 --ny 16 \
  --t-end 0.12
```

Expected files:

- `diagnostics.json`
- `validation.json`
- `ensemble.npz`
- `figures/qi_summary.png` when `matplotlib` is available
- `manifest.json`

## Amplitude-sweep QI

The stronger reviewer-facing gate keeps the seed list fixed and sweeps the
seed-noise amplitude `epsilon`.  It validates two questions:

1. At each `epsilon`, are the fitted growth rate, energies, and divergence
   metrics insensitive to seed choice?
2. As `epsilon` increases through a tiny admissible range, do metric means stay
   close to the zero-noise baseline?

```bash
mhx benchmark seed-robust-qi-sweep \
  --outdir outputs/benchmarks/seed_robust_qi_sweep \
  --seeds 0,1,2,3 \
  --amplitudes 0,1e-9,1e-8 \
  --nx 16 --ny 16 --steps 12
```

Expected files:

- `diagnostics.json`
- `validation.json`
- `sweep.npz`
- `figures/qi_sweep_cv.png`
- `figures/qi_sweep_mean_drift.png`
- `manifest.json`

The NPZ stores a metric cube with shape
`(n_amplitudes, n_seeds, n_metrics)`.  The JSON diagnostics store
`metric_cv_max` and `metric_relative_mean_drift_max` so reviewers can audit
whether failures come from seed spread or amplitude drift.

Pure helpers are also exposed for tests and downstream rollout:
`generate_seed_ensemble`, `seeded_perturbation`,
`make_seeded_initial_state`, `compute_metric_statistics`, and
`default_seed_robust_qi_gates`.

## Claim boundary

The manifest is `claim_level = "validation"`. The QI lane can support claims
that FAST diagnostics are stable under tiny smooth seed perturbations and,
for the sweep command, under a documented perturbation-amplitude range for the
tested configuration. It cannot support production uncertainty quantification,
plasmoid-count statistics, or turbulent ensemble convergence. Those require
larger ensembles, long-duration campaign manifests, and convergence sweeps.

For a production nonlinear campaign, use the QI lane as a secondary check after
the duration and convergence gates pass. A production QI bundle should archive:

- the exact seed list;
- the base configuration and perturbation amplitude;
- all scalar samples, not only means;
- failure thresholds chosen before the run;
- the figure and JSON manifest in the production artifact directory.

## Source links

- QI implementation:
  [`src/mhx/benchmarks/seed_robust_qi.py`](https://github.com/uwplasma/MHX/blob/main/src/mhx/benchmarks/seed_robust_qi.py)
- CLI entrypoint:
  [`src/mhx/cli/main.py`](https://github.com/uwplasma/MHX/blob/main/src/mhx/cli/main.py)
- Validation-suite integration:
  [`src/mhx/benchmarks/suite.py`](https://github.com/uwplasma/MHX/blob/main/src/mhx/benchmarks/suite.py)
- Tests:
  [`tests/test_seed_robust_qi.py`](https://github.com/uwplasma/MHX/blob/main/tests/test_seed_robust_qi.py)

## Review checklist

Before citing a seed-robust QI result, verify:

1. `validation.json` has `passed = true`.
2. `manifest.json` records `claim_level = "validation"` or a justified
   production claim after longer campaigns.
3. `ensemble.npz` contains the full metric samples.
4. `figures/qi_summary.png` matches the archived JSON values.
5. The seed list and perturbation amplitude are written in the command or
   config bundle.