Mechanism diagram · Transformer / Attention
How it works
TimeXer separates inputs into endogenous (the target series) and exogenous (in our setup, funding rates).
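The endogenous series is split into patches, and the patch tokens go through self-attention together with a learnable global endogenous token that summarizes the entire series.
Each exogenous series (here, each series in the funding stack) is embedded as a single variate-level token. Variate-wise cross-attention then uses the global token as the query over these exogenous tokens, injecting the funding signal into the endogenous representation.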
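The data flow described above can be sketched in plain Python. This is a minimal single-layer illustration, not the reference implementation: the names `timexer_block`, `D_MODEL`, and `PATCH_LEN`, the random weights standing in for learned projections, and the synthetic series are all assumptions made for the example. Layer norm, feed-forward layers, multi-head splitting, and the forecasting head are omitted.

```python
import math
import random

random.seed(0)
D_MODEL = 8    # hypothetical embedding width
PATCH_LEN = 4  # hypothetical patch length

def linear(n_in, n_out):
    """Random (n_in x n_out) matrix standing in for a learned projection."""
    return [[random.gauss(0, 1 / math.sqrt(n_in)) for _ in range(n_out)]
            for _ in range(n_in)]

def project(vec, weights):
    """Vector-matrix product: (n_in,) times (n_in, n_out) gives (n_out,)."""
    return [sum(v * w for v, w in zip(vec, col)) for col in zip(*weights)]

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    """Single-head scaled dot-product attention with a residual connection."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        mixed = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(q))]
        out.append([a + b for a, b in zip(q, mixed)])
    return out

def timexer_block(endo, exo_list, w_patch, w_variate, global_token):
    # 1. Split the target series into non-overlapping patches and embed each one.
    patches = [endo[i:i + PATCH_LEN]
               for i in range(0, len(endo) - PATCH_LEN + 1, PATCH_LEN)]
    tokens = [project(p, w_patch) for p in patches] + [global_token]
    # 2. Self-attention over the patch tokens plus the global token.
    tokens = attend(tokens, tokens, tokens)
    if not exo_list:
        return tokens  # no exogenous channels: only the patch path remains
    # 3. Embed each exogenous series as one variate-level token.
    exo_tokens = [project(series, w_variate) for series in exo_list]
    # 4. Cross-attention: only the global token queries the exogenous tokens.
    tokens[-1] = attend([tokens[-1]], exo_tokens, exo_tokens)[0]
    return tokens

# Synthetic inputs: one target series and two made-up funding-rate series.
lookback = 16
endo = [math.sin(t / 3) for t in range(lookback)]
funding = [[0.01 * math.cos(t / 5) for t in range(lookback)],
           [0.02 * math.sin(t / 7) for t in range(lookback)]]
w_patch = linear(PATCH_LEN, D_MODEL)
w_variate = linear(lookback, D_MODEL)
global_token = [random.gauss(0, 1) for _ in range(D_MODEL)]

with_exo = timexer_block(endo, funding, w_patch, w_variate, global_token)
without_exo = timexer_block(endo, [], w_patch, w_variate, global_token)
```

In this sketch the exogenous signal reaches the representation only through the global token. In a stacked model, the updated global token would pass that information to the patch tokens in the next layer's self-attention. With an empty exogenous list, only the patch self-attention path is left.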
Pros and cons on this universe
Pros
- Built for exogenous-aware time-series forecasting — aligned with the funding-z hypothesis.
- Published at NeurIPS 2024: recent and well-replicated.
- Ranks #13 on IC, comfortably positive.
Cons / failure modes
- A recent MSE-only crypto replication on BTC + M2 reported only marginal lifts.
- Without exogenous channels, it collapses toward PatchTST and loses its differentiator.
- Its IC profile falls in the same cluster as PatchTST's, so the two are largely redundant.
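The IC figures above are not defined in this card. One common reading, assumed here, is the rank information coefficient: the Spearman correlation between forecasts and realized returns in each period. Under that assumption, redundancy between two models can be gauged by correlating their per-period IC series. The sketch below uses made-up toy numbers, and the function names are hypothetical.

```python
import math

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def rank_ic(forecasts, realized):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(forecasts), ranks(realized))

# Toy per-period IC series for two models (illustrative values only).
ic_timexer = [0.04, 0.01, 0.05, -0.02, 0.03]
ic_patchtst = [0.05, 0.00, 0.06, -0.01, 0.02]
overlap = pearson(ic_timexer, ic_patchtst)
```

A high `overlap` would mean the two models tend to succeed and fail in the same periods, which is the sense in which one is redundant with the other.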