Mechanism diagram · MLP / Linear
[Diagram: Input (lookback L, flat features) → Dense layer with activation → Dense layer + residual → Linear head. Linear floors (DLinear / NLinear-RevIN) collapse to one dense layer.]

How it works

KAN replaces fixed activation functions with learnable univariate transforms — typically B-splines.

Each edge in the network carries its own learnable spline; the composition of these splines represents the multivariate function.
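
In equation form, the per-edge structure can be sketched as follows. The symbols (layer width n_l, number of basis functions G, coefficients c, B-spline basis B_g, depth D) are notation chosen here for illustration, not taken from this report. Some implementations also add a fixed base activation to each edge; that term is omitted.

```latex
% One KAN layer: output node j sums the univariate edge functions over inputs i.
x_{l+1,j} = \sum_{i=1}^{n_l} \phi_{l,j,i}\left(x_{l,i}\right),
\qquad
\phi_{l,j,i}(x) = \sum_{g=1}^{G} c_{l,j,i,g}\, B_g(x)

% A depth-D network composes the layers.
\mathrm{KAN}(x) = \left(\Phi_{D-1} \circ \cdots \circ \Phi_{1} \circ \Phi_{0}\right)(x)
```

Only the coefficients c are learned. The nonlinearity therefore sits on the edges rather than on the nodes, which simply add up what arrives.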

On forecasting tasks, KAN layers map a flattened lookback to the forecast horizon via stacked spline edges.
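
A minimal NumPy sketch of this mapping, forward pass only, might look like the following. It is an illustration under stated assumptions, not the implementation evaluated here: the class name `KANLayer`, the helper names, the layer sizes, the grid range, and the random initialisation are all hypothetical. For brevity it uses degree-1 B-splines (hat functions) on a uniform grid, whereas KAN implementations typically use higher-degree splines.

```python
import numpy as np


def hat_basis(x, grid):
    """Degree-1 B-spline (hat function) basis on a uniform grid.

    x has any shape (...); grid has shape (G,). Returns shape (..., G).
    Inputs are clipped to the grid range, where the basis sums to 1.
    """
    h = grid[1] - grid[0]
    x = np.clip(x, grid[0], grid[-1])
    return np.maximum(0.0, 1.0 - np.abs(x[..., None] - grid) / h)


class KANLayer:
    """One spline per edge: coef[i, j, :] parameterises the edge input i -> output j."""

    def __init__(self, n_in, n_out, grid_size=8, x_range=(-3.0, 3.0), seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.linspace(x_range[0], x_range[1], grid_size)
        self.coef = rng.normal(scale=0.1, size=(n_in, n_out, grid_size))

    def __call__(self, x):
        # x: (batch, n_in) -> basis: (batch, n_in, G)
        basis = hat_basis(x, self.grid)
        # Evaluate every edge spline, then sum over inputs i at each output node o.
        return np.einsum("big,iog->bo", basis, self.coef)


def kan_forecast(lookback, layers):
    """Pass a flattened lookback window through stacked KAN layers."""
    out = lookback
    for layer in layers:
        out = layer(out)
    return out


# Hypothetical sizes: 32-step flattened lookback, 16 hidden nodes, 4-step horizon.
LOOKBACK, HIDDEN, HORIZON = 32, 16, 4
layers = [KANLayer(LOOKBACK, HIDDEN, seed=0), KANLayer(HIDDEN, HORIZON, seed=1)]
x = np.random.default_rng(42).normal(size=(5, LOOKBACK))  # toy batch of 5 windows
forecast = kan_forecast(x, layers)  # shape (5, HORIZON)
```

Training (fitting `coef` by gradient descent) is left out. In practice an autodiff framework would replace the plain NumPy arrays.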

Pros and cons on this universe

Pros

  • Strong BNB long (+0.268) and ETH long (+0.262) per-coin IC.
  • Theoretically grounded in the Kolmogorov–Arnold representation theorem.
  • Ranks #18 overall by IC.
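
This section does not define how the IC figures above are computed. A common convention, assumed here, is the Spearman rank correlation between forecasts and realised returns, evaluated separately for each coin. The sketch below follows that assumption. The function names and the toy numbers are hypothetical, and it does not model the long/short split.

```python
import numpy as np


def ordinal_rank(a):
    """Ranks 0..n-1; ties are broken by position, which is adequate for continuous data."""
    a = np.asarray(a, dtype=float)
    order = np.argsort(a, kind="stable")
    ranks = np.empty(len(a), dtype=float)
    ranks[order] = np.arange(len(a))
    return ranks


def spearman_ic(pred, realized):
    """Assumed IC definition: Pearson correlation of the ranks (Spearman)."""
    return float(np.corrcoef(ordinal_rank(pred), ordinal_rank(realized))[0, 1])


def per_coin_ic(pred_by_coin, realized_by_coin):
    """Compute the IC separately for each coin key present in pred_by_coin."""
    return {
        coin: spearman_ic(pred_by_coin[coin], realized_by_coin[coin])
        for coin in pred_by_coin
    }


# Toy data only; these are not the values behind the figures reported above.
toy_pred = {"BNB": [0.1, 0.3, 0.2, 0.5], "ETH": [0.4, 0.1, 0.3, 0.2]}
toy_real = {"BNB": [0.0, 0.2, 0.1, 0.4], "ETH": [0.1, 0.4, 0.2, 0.3]}
toy_ic = per_coin_ic(toy_pred, toy_real)
```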

Cons / failure modes

  • Pass A vs walk-forward divergence (Pass A IC −0.043 vs walk-forward IC −0.009, both negative); flagged regime-fragile.
  • Cluster overlap with KAN-derived mixer (RMoK).
  • STRAT-13 deployment was gated by regime fragility.
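
For readers unfamiliar with walk-forward evaluation, the sketch below generates expanding-window folds in which each test block strictly follows its training data. It is a generic illustration. "Pass A" is not defined in this section and is not modelled, and the function name, fold count, and window sizes are hypothetical rather than the settings used for the numbers above.

```python
def walk_forward_splits(n_samples, n_folds, min_train):
    """Yield (train_idx, test_idx) ranges with an expanding training window.

    The first fold trains on the first min_train samples. Each later fold
    extends the training window by the previous test block.
    """
    test_size = (n_samples - min_train) // n_folds
    if test_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(n_folds):
        train_end = min_train + k * test_size
        yield range(0, train_end), range(train_end, train_end + test_size)


# Hypothetical sizes: 100 time-ordered samples, 4 folds, 20 initial training samples.
splits = list(walk_forward_splits(n_samples=100, n_folds=4, min_train=20))
```

Averaging a per-fold IC over such splits gives a walk-forward IC that can be compared with a single-pass estimate. A large gap between the two is the kind of divergence the first bullet describes.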

References