Mechanism diagram · Hybrid / Mixer
[Diagram: MLP Mixer / Hybrid. Input series (lookback × N coins) → RevIN → Patching (optional) → Embed (→ d-model) → Time-mixing MLP (mix across time tokens) → Feature-mixing MLP (mix across channels) → Forecast head → Forecast]

How it works

Inputs pass through an embedding and then a router, which learns soft weights over multiple KAN (Kolmogorov-Arnold Network) expert subnetworks.

Each KAN expert uses spline-based learnable univariate transforms instead of fixed activations.
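
As a rough illustration of a spline-based learnable univariate transform, the sketch below builds one KAN-style layer in NumPy. It is not this model's implementation: it assumes degree-1 B-splines (piecewise-linear "hat" functions) on a uniform grid for simplicity, and the names `hat_basis`, `KANLayer`, `grid_size` and `x_range` are hypothetical.

```python
import numpy as np

def hat_basis(x, grid):
    """Evaluate degree-1 B-spline (hat) basis functions on a uniform grid.

    x: array of any shape; grid: 1-D increasing, uniformly spaced knots (G,).
    Returns an array of shape x.shape + (G,); values sum to 1 along the last axis.
    """
    x = np.clip(x, grid[0], grid[-1])   # inputs outside the grid are clamped
    h = grid[1] - grid[0]               # uniform knot spacing (assumption)
    return np.maximum(0.0, 1.0 - np.abs(x[..., None] - grid) / h)

class KANLayer:
    """One layer where every (input, output) edge has its own learnable 1-D function."""

    def __init__(self, d_in, d_out, grid_size=8, x_range=(-3.0, 3.0), seed=0):
        rng = np.random.default_rng(seed)
        self.grid = np.linspace(x_range[0], x_range[1], grid_size)
        # coef[i, o, g]: weight of knot g for the edge from input i to output o
        self.coef = rng.normal(0.0, 0.1, size=(d_in, d_out, grid_size))

    def __call__(self, x):
        # x: (batch, d_in)
        basis = hat_basis(x, self.grid)                  # (batch, d_in, G)
        # phi_{i,o}(x_i) = sum_g coef[i,o,g] * basis[b,i,g]; output_o = sum_i phi_{i,o}(x_i)
        return np.einsum("big,iog->bo", basis, self.coef)

layer = KANLayer(d_in=4, d_out=3)
x = np.random.default_rng(1).normal(size=(5, 4))
y = layer(x)   # shape (5, 3)
```

In this sketch the nonlinearity lives in the per-edge coefficients `coef` rather than in a fixed activation such as ReLU; training would update those coefficients by gradient descent.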

The router weights are applied to the expert outputs, and the aggregate is passed to the forecast head.
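
The routing and aggregation steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the model's code: the router is assumed to be a single linear map followed by a softmax, the aggregate is assumed to be a gate-weighted sum, and small `tanh` maps stand in for the KAN experts. All names (`soft_moe_forward`, `router_w`, `expert_ws`) are hypothetical.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def soft_moe_forward(h, router_w, experts):
    """Softly route embedded inputs across experts and aggregate their outputs.

    h:        (batch, d_model) embedded inputs
    router_w: (d_model, n_experts) router weights (assumed linear router)
    experts:  list of callables mapping (batch, d_model) -> (batch, d_out)
    """
    gates = softmax(h @ router_w)                       # (batch, n_experts), rows sum to 1
    outs = np.stack([f(h) for f in experts], axis=1)    # (batch, n_experts, d_out)
    mixed = (gates[..., None] * outs).sum(axis=1)       # gate-weighted sum over experts
    return mixed, gates

rng = np.random.default_rng(0)
batch, d_model, d_out, n_experts = 4, 16, 8, 3
h = rng.normal(size=(batch, d_model))
router_w = rng.normal(scale=0.1, size=(d_model, n_experts))
# Stand-in experts (assumption): a real KAN expert would use learnable spline transforms.
expert_ws = [rng.normal(scale=0.1, size=(d_model, d_out)) for _ in range(n_experts)]
experts = [lambda x, w=w: np.tanh(x @ w) for w in expert_ws]

mixed, gates = soft_moe_forward(h, router_w, experts)   # mixed feeds the forecast head
```

With soft routing as sketched here, every expert is evaluated for every input and only the weights differ; a sparse top-k router would be a different design choice.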

Pros and cons on this universe

Pros

  • Strong on the major-long side (per-coin IC: ETH +0.176, BNB +0.146).
  • Conditional computation: the router weights the experts differently for different inputs.
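
For readers unfamiliar with the per-coin IC figures quoted above, the sketch below shows one common way to compute an information coefficient per coin. The source does not state its exact definition, so this assumes a Spearman rank correlation over time between forecasts and realized returns; the function names and the synthetic data are illustrative only.

```python
import numpy as np

def rank(a):
    # Ordinal ranks; ties are broken by position, which is fine for continuous data.
    return np.argsort(np.argsort(a)).astype(float)

def per_coin_ic(pred, realized):
    """Spearman rank correlation over time, computed separately for each coin.

    pred, realized: arrays of shape (T, N) with T time steps and N coins.
    Returns an array of shape (N,).
    """
    n_coins = pred.shape[1]
    return np.array([
        np.corrcoef(rank(pred[:, j]), rank(realized[:, j]))[0, 1]
        for j in range(n_coins)
    ])

# Synthetic example: weakly informative forecasts for 3 hypothetical coins.
rng = np.random.default_rng(0)
T, N = 500, 3
realized = rng.normal(size=(T, N))
pred = 0.2 * realized + rng.normal(size=(T, N))
ic = per_coin_ic(pred, realized)   # one IC value per coin
```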

Cons / failure modes

  • No dossier: added via the Nixtla curiosity arm, with sparse internal literature.
  • KAN family has thin published track record on financial perpetuals.
  • Likely cluster-redundant given typical Nixtla mixer signal profile.

References