Abstract

A mixture-of-experts (MoE) router uses an observed market state to decide which sub-models trade. We test the strongest version available to us: route the validated Phase 1.7 per-coin directional specialists through a freshly fit, strictly causal three-state market-regime detector. Under an honest, leakage-controlled evaluation, the routed book trails an always-on book of the same experts by 4.36 Sharpe (a lift of −4.36) on the well-powered discovery window, with a mildly negative lift on the walk-forward window. The regime-conditional skill that exists in-sample does not persist out-of-sample, and routing additionally throws away the diversification that makes the 175-expert always-on book strong. This replicates two earlier regime-detector findings on a much stronger expert set and with three balanced regimes, so the failure is not a regime-coverage artefact. We did not build the live router.

1 · The idea, and why it's tempting

By the end of the Phase 1.7 funnel we hold a roster of per-coin specialists: (architecture, coin) pairs whose directional signal is sign-consistent across three out-of-sample regimes and clears its noise floor. The natural hypothesis is that these specialists are conditionally good — a coin's short specialist should fire in a risk-off regime, a long specialist in a risk-on grind. If true, a regime router would lift the book by activating the right experts at the right time and silencing the rest. This is the classic mixture-of-experts gating argument.

The hypothesis is testable, and the honest test is unkind to it.

2 · The detector we built

The detector is a Gaussian mixture model (GMM) over ten causal, market-wide hourly features: BTC realised volatility at 24h and 168h, cross-sectional mean realised vol and return dispersion, average coin-to-market correlation, BTC trend at 24h and 168h, funding mean and dispersion, and open-interest velocity. Every feature is backward-looking. The standardiser and the GMM are fit on the training window only (≤ 2025-12-31, 11,361 bars), then frozen and applied by argmax posterior to all later bars, so labelling is strictly causal across both out-of-sample windows. The number of states is chosen by a knee rule over BIC, because raw min-BIC degenerately fragments into thin, near-duplicate mid-vol states; the knee selects k = 3.
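The fit-then-freeze discipline and the knee rule can be sketched as follows. This is a minimal illustration and not the code in `src/signals/regime_gmm.py`: the function names, the toy feature rows, the BIC values, and the specific knee criterion (largest discrete second difference of the BIC curve) are all assumptions, since the text does not say which knee rule was used. The mixture fit itself is omitted; the BIC values would come from fitting one mixture per candidate k on the standardised training window.

```python
from statistics import mean, pstdev

def fit_standardiser(train_rows):
    """Column means and std devs estimated on the training window only."""
    cols = list(zip(*train_rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return mus, sigmas

def standardise(rows, mus, sigmas):
    """Apply the frozen training statistics to any rows, including later bars."""
    return [[(x - m) / s for x, m, s in zip(r, mus, sigmas)] for r in rows]

def knee_k(bic_by_k):
    """Hypothetical knee rule: the k with the largest second difference of BIC.

    Falls back to the smallest k when fewer than three candidates are given.
    """
    ks = sorted(bic_by_k)
    best_k, best_curv = ks[0], float("-inf")
    for prev, k, nxt in zip(ks, ks[1:], ks[2:]):
        curv = bic_by_k[prev] - 2 * bic_by_k[k] + bic_by_k[nxt]
        if curv > best_curv:
            best_k, best_curv = k, curv
    return best_k

# Toy feature rows (two features only, made-up values).
train = [[0.004, 0.01], [0.006, -0.02], [0.005, 0.04]]
later = [[0.009, -0.05]]
mus, sigmas = fit_standardiser(train)      # statistics come from train only
later_z = standardise(later, mus, sigmas)  # later bars never update them

# Made-up BIC curve that keeps improving slightly after k = 3.
bics = {1: 1000.0, 2: 800.0, 3: 650.0, 4: 640.0, 5: 632.0, 6: 625.0}
chosen_k = knee_k(bics)
min_bic_k = min(bics, key=bics.get)
```

On this made-up curve the raw minimum keeps drifting to larger k for marginal gains, while the knee stops where the improvement flattens, which is the behaviour the paragraph above describes.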

Three interpretable regimes · centroids in original units
Fit on train only, frozen, applied causally · 14,528 labelled hours · mean dwell ~34h
| Regime | BTC RV 168h | BTC trend 168h | XS corr | OI velocity | Reading |
|---|---|---|---|---|---|
| LOWVOL_UP | 0.0041 (low) | +3.2% | +0.81 | +4.2% | risk-on grind, OI building |
| MEDVOL_DN | 0.0044 | −1.1% | +0.87 | −1.5% | choppy mid-vol drift-down |
| HIVOL_DN | 0.0075 (high) | −4.8% | +0.88 | +1.3% | high-vol stress / sell-off |

Both out-of-sample windows contain all three regimes with non-trivial mass (the discovery window is near-balanced at 25 / 45 / 30 percent), so the negative verdict below is not caused by a missing regime.
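Regime mass and mean dwell are simple functions of the hourly label sequence. The sketch below shows one way to compute them; the helper names and the toy label sequence are hypothetical, and defining dwell as the average length of a run of identical consecutive labels is an assumption about how the ~34h figure was obtained.

```python
from collections import Counter
from itertools import groupby

def regime_shares(labels):
    """Fraction of hours spent in each regime."""
    counts = Counter(labels)
    n = len(labels)
    return {regime: c / n for regime, c in counts.items()}

def mean_dwell(labels):
    """Average length, in bars, of a run of identical consecutive labels."""
    runs = [sum(1 for _ in group) for _, group in groupby(labels)]
    return sum(runs) / len(runs)

# Toy hourly label sequence (made-up, 12 hours).
labels = (["LOWVOL_UP"] * 3 + ["MEDVOL_DN"] * 5
          + ["HIVOL_DN"] * 2 + ["MEDVOL_DN"] * 2)
shares = regime_shares(labels)
dwell = mean_dwell(labels)
```

A coverage check of this kind, run separately on each out-of-sample window, is what rules out the missing-regime explanation.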

3 · The decisive numbers

The book is an equal-weight portfolio of the per-coin specialist directional trades: honest rolling-median centring, sign-corrected, non-overlapping 24-hour holds, 4 bps per leg, phase-averaged over 24 entry offsets. We fit the static regime → specialist mapping on a train half and evaluate it on a disjoint validate half. Lift = routed book Sharpe − always-on book Sharpe on the same bars.
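The book construction and the lift can be sketched as below. This is a simplified illustration and not the canonical metrics implementation: the function names and toy data are hypothetical, the cost model assumes two legs (entry and exit) per trade, the annualisation factor assumes 365 daily holds per year, and an entry with no active expert is assumed to earn zero. Rolling-median centring and sign correction are taken as already applied to the inputs.

```python
import math
from statistics import mean, stdev

HOLD = 24                   # hours per non-overlapping hold
COST = 2 * 4e-4             # 4 bps per leg; assumes entry + exit legs per trade
ANNUALISE = math.sqrt(365)  # assumes 365 daily holds per year

def book_returns(signed_fwd, active, offset):
    """Equal-weight net book return for each hold entered at offset, offset+24, ...

    signed_fwd[e][t]: sign-corrected 24h forward log return of expert e at hour t.
    active[e][t]:     True if expert e is allowed to trade at hour t.
    """
    n_hours = len(signed_fwd[0])
    out = []
    for t in range(offset, n_hours, HOLD):
        legs = [signed_fwd[e][t] - COST
                for e in range(len(signed_fwd)) if active[e][t]]
        out.append(mean(legs) if legs else 0.0)  # flat when nothing is active
    return out

def sharpe(rets):
    """Annualised Sharpe of per-trade returns; 0.0 when undefined."""
    sd = stdev(rets) if len(rets) > 1 else 0.0
    return ANNUALISE * mean(rets) / sd if sd > 0 else 0.0

def phase_averaged_sharpe(signed_fwd, active):
    """Average the Sharpe over all 24 possible entry offsets."""
    return mean(sharpe(book_returns(signed_fwd, active, o)) for o in range(HOLD))

def lift(signed_fwd, routed_active):
    """Routed book Sharpe minus always-on book Sharpe on the same bars."""
    always_on = [[True] * len(row) for row in signed_fwd]
    return (phase_averaged_sharpe(signed_fwd, routed_active)
            - phase_averaged_sharpe(signed_fwd, always_on))

# Toy data: 2 experts, 10 days of hourly entries (made-up numbers).
n_hours = 240
signed_fwd = [[0.004 * math.sin(0.37 * t + e) + 0.001 for t in range(n_hours)]
              for e in range(2)]
# Hypothetical routing: silence expert 1 on even-numbered days.
routed = [[True] * n_hours, [(t // HOLD) % 2 == 1 for t in range(n_hours)]]
toy_lift = lift(signed_fwd, routed)
```

Phase averaging matters because a single entry offset uses only one twenty-fourth of the available entries, so its Sharpe depends on an arbitrary choice of start hour.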

[Figure: Routed vs always-on, annualised net Sharpe. Static regime→specialist map, fit on train half, evaluated on disjoint validate half · 4 bps / leg]

On the well-powered discovery window the always-on book earns +4.98 Sharpe and the routed book just +0.63, a lift of −4.36. On the walk-forward window the routed book is also slightly worse (−0.76 against −0.52 for the always-on book). The conditional skill found in-sample does not generalise, and routing discards diversification.

4 · The "adaptive" variant is a trap, not a green light

A more flexible router — a causal expanding-window gate that activates an expert once its cumulative signed return in the current regime has been positive — looks spectacular: +6.20 Sharpe on discovery, +5.86 on walk-forward. It is not deployable evidence, for two reasons.

Why we don't publish the +6.2

Our numerical-plausibility contract flags any Sharpe above 5 and any per-bar mean/std above 0.3 as "verify before you believe." Both adaptive books trip both flags. We treat them as confounded and do not report them as deployable alpha — the same discipline that caught two earlier too-clean headline Sharpes that turned out to be metric bugs.
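For concreteness, the gate described in this section and the two plausibility flags can be sketched as follows. Everything here is illustrative: the function names are hypothetical, the gate credits each trade to the regime at its entry hour and only after its 24-hour hold has finished (one way to keep it causal, assumed rather than confirmed), and the mapping from annualised Sharpe to a per-bar ratio assumes 24-hour bars and a sqrt(365) annualisation factor.

```python
import math
from collections import defaultdict

HOLD = 24  # hours; a trade entered at hour s is only realised at s + HOLD

def adaptive_gate(signed_fwd, regimes):
    """Expanding-window gate for one expert.

    active[t] is True when the expert's cumulative realised signed return,
    over trades entered in the same regime as hour t, is positive.
    """
    cum = defaultdict(float)
    active = []
    for t, regime in enumerate(regimes):
        s = t - HOLD                  # trade whose hold has just finished
        if s >= 0:
            cum[regimes[s]] += signed_fwd[s]
        active.append(cum[regime] > 0.0)
    return active

def plausibility_flags(annual_sharpe, per_bar_ratio):
    """The two 'verify before you believe' thresholds quoted in the text."""
    return {
        "sharpe_above_5": annual_sharpe > 5.0,
        "per_bar_above_0.3": per_bar_ratio > 0.3,
    }

# Toy check of the gate: constant positive returns, regime switches at hour 24.
regimes = ["A"] * 24 + ["B"] * 48
gate = adaptive_gate([0.01] * 72, regimes)

# Assumed conversion from annualised Sharpe to a per-24h-bar mean/std ratio.
adaptive = {"discovery": 6.20, "walk_forward": 5.86}
flags = {name: plausibility_flags(s, s / math.sqrt(365))
         for name, s in adaptive.items()}
```

Under that assumed conversion, 6.20 and 5.86 correspond to per-bar ratios of about 0.32 and 0.31, which is consistent with both books tripping both flags.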

5 · This is the third time

Two earlier internal experiments reached the same verdict with weaker tooling: a rule-baseline regime MoE and a hidden-Markov regime detector both failed to beat an always-on baseline. The standard objection to those was regime coverage — one of them evaluated on a window that was 89 percent a single regime. This run removes that objection: three balanced, well-populated regimes over both out-of-sample windows, routing the much stronger Phase 1.7 IC specialists. The conclusion holds. Market-wide regime routing of a per-coin specialist book does not generalise.

6 · What would change the conclusion

A negative result is a verdict on a specific construction, not on the whole idea. We would revisit regime conditioning if:

None of these were pursued here. The gate's only job was to decide whether to build the live router, and the answer is no. The capital stays in the always-on book and the cross-sectional ensemble, both of which we can defend.

Units

Book Sharpe is the annualised Sharpe of non-overlapping 24-hour-hold net per-trade returns on the 24-hour forward log-return series, sign-corrected, 4 bps per leg, from the canonical metrics implementation. Lift is routed minus always-on on identical validate-half bars. Targets were verified to be raw forward log returns, not a rank transform, before any Sharpe was computed.

Sources & references

  1. Jacobs, R. A., Jordan, M. I., Nowlan, S. J., & Hinton, G. E. (1991). Adaptive Mixtures of Local Experts. Neural Computation.
  2. Hamilton, J. D. (1989). A New Approach to the Economic Analysis of Nonstationary Time Series and the Business Cycle. Econometrica. (regime-switching)
  3. López de Prado, M. (2018). Advances in Financial Machine Learning. Wiley. (selection bias under online gating)
  4. Axon Ridge internal — `research/experiments/results/Phase1_7_regime_routing_gate_2026-05-28.md`
  5. Axon Ridge internal — `src/signals/regime_gmm.py`, `scripts/build_gmm_regime_labels.py`