№ 01
Literature Review · Foundation

What we actually read: a grounded survey

The architecture families this program is built on — patch-attention transformers, MLP mixers, modern TCNs, gradient-boosted trees, LOB-specialised networks — and the primary literature each rests on. Where a paper's claims did not replicate on Hyperliquid data, we note it.

Read →
№ 02
Empirical · Architecture

A funnel through 28 model families

Every architecture family in the modern time-series literature run through a single audited protocol on a 26-coin Hyperliquid panel: 116 hyperparameter cells, identical OOS windows, identical cost model. No favourites. Most fail — that is the point. The few that survive go to the next stage.

Read →
№ 03
Empirical Finding · Architecture

What transferred and what didn't: architecture lessons

Top-tier models from the literature (ModernTCN, PatchTST, TimesNet) do not transfer cleanly to short-horizon crypto perps. The strongest standalone cross-sectional signal on the held-out window came from a tuned GRU and a tabular ranker. We explain why sequence models struggle and what the tabular models see that they don't.

Read →
№ 04
Methodology

The full validation pipeline: Pass A, Pass B, and walk-forward

A single backtest is a hypothesis, not a result. We run three gates in sequence. Pass A tests the most recent live regime — the hardest one to overfit away. Pass B excises the test cascade from training entirely, catching models that memorised the evaluation window. Walk-forward uses a true held-out future block neither pass touches. A (model, coin) pair must be sign-consistent across all three to be a candidate. Each gate catches a different failure mode; none is redundant.
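The sign-consistency rule can be sketched in a few lines. This is a minimal illustration, not the pipeline's code: it assumes each gate reduces to one signed score per (model, coin) pair, and the names `sign` and `is_candidate` are hypothetical.

```python
def sign(x: float) -> int:
    """Return +1, -1 or 0 for a signed score."""
    return (x > 0) - (x < 0)


def is_candidate(pass_a: float, pass_b: float, walk_forward: float) -> bool:
    """True only if all three gate scores share one non-zero sign.

    Assumption: each gate yields a single signed score for the pair.
    A score of exactly zero is treated as a failure here.
    """
    signs = {sign(pass_a), sign(pass_b), sign(walk_forward)}
    return len(signs) == 1 and 0 not in signs


print(is_candidate(0.8, 0.3, 0.5))    # consistent positive
print(is_candidate(0.8, -0.1, 0.5))   # Pass B disagrees
```

A consistently negative pair also passes this check, since the rule as stated asks for consistency of sign rather than a positive sign.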

Read →
№ 05
Methodology · Lessons Learned

Information coefficient is not Sharpe

IC measures rank correlation between predicted and realised returns; it says nothing about dollar PnL or trading costs. We published a +7.38 headline Sharpe that became −3.53 once the pipeline was corrected to use raw forward returns at 4 bps per leg. The IC had looked great throughout. The bug, the fix, and why IC and dollar Sharpe must never be conflated.
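A toy cross-section shows how the two quantities can diverge. The sketch below is illustrative only: the helper names, the median-split long-short portfolio, and the reading of "4 bps per leg" as one entry leg plus one exit leg charged on gross exposure are all assumptions, not the pipeline's actual definitions.

```python
import math
from statistics import mean, median


def ranks(xs):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    out = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman_ic(pred, realised):
    """Rank correlation between predictions and realised returns."""
    a, b = ranks(pred), ranks(realised)
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def long_short_net_pnl(pred, realised, cost_bps_per_leg=4.0):
    """Equal-weight long/short around the median prediction, net of costs.

    Assumption: one entry leg and one exit leg per bar, each charged
    on gross exposure.
    """
    n = len(pred)
    med = median(pred)
    weights = [((p > med) - (p < med)) / n for p in pred]
    gross = sum(w * r for w, r in zip(weights, realised))
    exposure = sum(abs(w) for w in weights)
    cost = 2 * cost_bps_per_leg * 1e-4 * exposure
    return gross - cost


pred = [0.1, 0.2, 0.3, 0.4]
realised = [-0.0002, -0.0001, 0.0001, 0.0002]  # raw forward returns
print(spearman_ic(pred, realised))         # perfect ranking
print(long_short_net_pnl(pred, realised))  # negative after costs
```

The ranking is perfect, yet the realised spread is about a basis point, so the assumed round-trip cost outweighs the gross PnL.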

Read →
№ 06
Methodology · Signal Quality

Excess above noise: the signal quality floor

A positive IC is not enough — it must exceed its own standard error. We define excess IC as |IC| − SE(IC), where SE is estimated from the per-bar IC time series. Cells with excess IC ≤ 0 are statistically indistinguishable from noise regardless of their headline IC. Applying this floor as a promotion gate cut roughly half of otherwise-passing cells and meaningfully improved the quality of what reached capital.
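The floor can be sketched as follows. The formula |IC| − SE(IC) is from the text; taking IC as the mean of the per-bar series and SE as the sample standard deviation divided by √n are assumptions made for this illustration (a plain standard error that ignores autocorrelation), and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def excess_ic(per_bar_ic):
    """|IC| - SE(IC), with both estimated from the per-bar IC series.

    Assumptions: IC is the mean of the per-bar values, and SE is the
    sample standard deviation over sqrt(n), ignoring autocorrelation.
    """
    n = len(per_bar_ic)
    if n < 2:
        raise ValueError("need at least two bars to estimate SE")
    ic = mean(per_bar_ic)
    se = stdev(per_bar_ic) / math.sqrt(n)
    return abs(ic) - se


def passes_floor(per_bar_ic):
    """A cell passes only if its excess IC is strictly positive."""
    return excess_ic(per_bar_ic) > 0


noisy = [0.30, -0.28, 0.25, -0.23]   # positive mean, large dispersion
steady = [0.02, 0.03, 0.02, 0.03]    # small but stable
print(passes_floor(noisy), passes_floor(steady))
```

Both toy series have a positive mean IC, but only the stable one clears its own standard error.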

Read →
№ 07
Methodology · Lessons Learned

Five ways Sharpe gets warped — and how we catch them

A Sharpe ratio is only as trustworthy as its inputs. We document five bugs that produced plausible-looking numbers on real data: future-leaking centring (one line of code turned −2 SR into +2 SR); cross-sectional rank targets passed as raw returns (+7.4 became −3.5); h-bar overlap in the cost function (+8 SR vanished); non-causal feature construction; and stale-data evaluation. For each: what the bug looks like, what plausibility flag catches it, and the canonical fix.
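To make the first bug concrete, here is one plausible form of future-leaking centring next to a causal alternative. The source does not show the offending line, so both functions are hypothetical illustrations rather than the pipeline's code.

```python
def centre_leaky(xs):
    """Subtract the full-sample mean: every output depends on future values."""
    m = sum(xs) / len(xs)
    return [x - m for x in xs]


def centre_causal(xs):
    """Subtract the mean of strictly earlier values only.

    Assumption: the first point has no history, so it is left uncentred.
    """
    out = []
    running = 0.0
    for i, x in enumerate(xs):
        prior_mean = running / i if i else 0.0
        out.append(x - prior_mean)
        running += x
    return out


a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 2.0, 3.0, 100.0]  # same history, different final value

# Causal centring of the first three points ignores the final value.
print(centre_causal(a)[:3] == centre_causal(b)[:3])
# Leaky centring of the first three points changes with it.
print(centre_leaky(a)[:3] == centre_leaky(b)[:3])
```

Perturbing a later value and checking whether earlier outputs move is a cheap test for this class of leak.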

Read →
№ 08
Negative Result · Debunked

Debunked: regime-routed experts don't beat always-on

We built a three-state market-regime detector and routed validated per-coin specialists through it — a mixture-of-experts architecture. Out-of-sample it destroyed approximately 4.4 Sharpe versus simply leaving every specialist always on. The adaptive variant that looked compelling in-sample tripped every plausibility flag we had. We did not build the live router.

Read →
№ 09
Negative Result · Debunked

Debunked: stacked per-coin specialists

Training a dedicated model per coin and stacking their outputs into a portfolio signal looks appealing — more data per coin, no cross-sectional blending noise. In practice the per-coin models overfit to coin-specific regimes, fail to generalise across the portfolio, and produce correlated drawdowns exactly when diversification is needed most. The cross-sectional approach is more robust.

Read →
№ 10
Empirical Finding · Pending update

The ensemble beats every single arm

A sign-corrected, z-scored blend across the best cell of each architecture — the cross-sectional ensemble — out-Sharped every standalone model on the walk-forward window. This paper is being updated to reflect the Phase 1.7 reproduced results before the figures are confirmed. The finding is expected to hold; the numbers are under revision.
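A sign-corrected, z-scored blend can be sketched as below. Equal weighting across arms, cross-sectional z-scoring per bar, and signs fixed in advance from an earlier window are assumptions made for illustration; the arm names and numbers are invented.

```python
from statistics import mean, pstdev


def zscore(xs):
    """Cross-sectional z-score; a constant cross-section maps to zeros."""
    m, s = mean(xs), pstdev(xs)
    if s == 0:
        return [0.0] * len(xs)
    return [(x - m) / s for x in xs]


def blend(arm_preds, arm_signs):
    """Equal-weight average of sign-corrected, z-scored arm predictions.

    arm_preds: {arm name: per-coin predictions for one bar}
    arm_signs: {arm name: +1 or -1}, assumed fixed from an earlier window
    """
    n_coins = len(next(iter(arm_preds.values())))
    total = [0.0] * n_coins
    for name, preds in arm_preds.items():
        for i, z in enumerate(zscore(preds)):
            total[i] += arm_signs[name] * z
    return [t / len(arm_preds) for t in total]


# Hypothetical arms on different scales; the second is inverted.
preds = {"arm_a": [1.0, 2.0, 3.0], "arm_b": [30.0, 20.0, 10.0]}
signs = {"arm_a": 1, "arm_b": -1}
signal = blend(preds, signs)
print(signal)
```

Z-scoring puts arms with different output scales on a common footing, and the sign flip lets an inversely informative arm contribute in the right direction.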

Coming →

A note on figures

All headline Sharpe ratios on this site are annualised dollar Sharpe, computed on raw forward returns under a uniform cost convention of 4 basis points per leg, our operational estimate for the intended trading size on Hyperliquid. Information coefficient and rank-IR are reported separately and never substituted. Backtest performance is not a forecast of live results.
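For reference, an annualised Sharpe on net per-period returns can be sketched as follows. The bar frequency is not stated here, so `periods_per_year` is left as a parameter; the function name, the zero risk-free rate, and the use of the sample standard deviation are assumptions.

```python
import math
from statistics import mean, stdev


def annualised_sharpe(net_returns, periods_per_year):
    """Mean over sample standard deviation, scaled by sqrt(periods per year).

    Assumptions: returns are already net of costs, the risk-free rate
    is zero, and periods are treated as independent.
    """
    if len(net_returns) < 2:
        raise ValueError("need at least two periods")
    sd = stdev(net_returns)
    if sd == 0:
        raise ValueError("zero volatility: Sharpe is undefined")
    return mean(net_returns) / sd * math.sqrt(periods_per_year)


net = [0.010, -0.005, 0.020, 0.000]  # illustrative net returns per period
print(annualised_sharpe(net, periods_per_year=365))
```

The square-root scaling assumes non-overlapping, serially uncorrelated periods, which is why overlapping h-bar returns need separate handling.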