Mechanism diagram · Foundation Model
[Diagram: Foundation model, zero-shot. Input series (lookback) → Tokenise / patch (fixed vocab) → Pretrained Transformer (frozen weights) → Probabilistic forecast (quantile / next-token). Pretrained on a broad TS corpus; no HL fine-tuning yet.]

How it works

TimesFM is a decoder-only Transformer pretrained on a large corpus of public time series.

The input series is split into fixed-length patches, and each patch is embedded as a single token; the decoder then generates the forecast horizon autoregressively, one output patch at a time.

It operates on patches of raw real-valued inputs rather than on values quantised into a fixed token vocabulary.
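
The patch-then-decode loop described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not TimesFM's implementation: `make_patches`, `autoregressive_forecast`, and `toy_model` are hypothetical names, the patch lengths are arbitrary, and `toy_model` is only a stand-in for the frozen pretrained decoder.

```python
from typing import Callable, List, Sequence

Patch = List[float]


def make_patches(series: Sequence[float], patch_len: int) -> List[Patch]:
    """Split a series into non-overlapping patches, left-padding with zeros.

    Zero padding is a simplification; a real model would also carry a mask.
    """
    pad = (-len(series)) % patch_len
    padded = [0.0] * pad + [float(x) for x in series]
    return [padded[i:i + patch_len] for i in range(0, len(padded), patch_len)]


def autoregressive_forecast(
    context: Sequence[float],
    horizon: int,
    predict_next_patch: Callable[[List[Patch]], Patch],
    input_patch_len: int = 4,
) -> List[float]:
    """Forecast `horizon` steps by repeatedly predicting one output patch
    and appending it to the history before the next call."""
    history = [float(x) for x in context]
    forecast: List[float] = []
    while len(forecast) < horizon:
        next_patch = predict_next_patch(make_patches(history, input_patch_len))
        if not next_patch:
            raise ValueError("predict_next_patch returned an empty patch")
        forecast.extend(next_patch)
        history.extend(next_patch)
    return forecast[:horizon]


def toy_model(patches: List[Patch], output_patch_len: int = 8) -> Patch:
    """Hypothetical stand-in for the frozen decoder:
    repeats the mean of the most recent input patch."""
    last = patches[-1]
    mean = sum(last) / len(last)
    return [mean] * output_patch_len


forecast = autoregressive_forecast([1, 2, 3, 4, 5, 6, 7, 8], horizon=10,
                                   predict_next_patch=toy_model)
```

Letting the output patch be longer than the input patch, as in this sketch, reduces the number of decoding steps needed to cover a long horizon.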

Pros and cons on this universe

Pros

  • Zero-shot performance competitive with supervised models on standard benchmarks.
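  • Decoder-only design handles varying context and horizon lengths; probabilistic output comes from optional quantile heads rather than from the decoder itself, since the model is trained mainly as a point forecaster.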
  • Published by Google, with pretrained weights broadly available.

Cons / failure modes

  • No empirical HL IC yet; this is a frontier item gated on EXP-012b.
  • The extent of distribution shift between Google's pretraining corpus and crypto data is unknown.
  • Inference compute cost is still meaningful relative to the mixer arms.
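
For the missing IC measurement, a rough sketch of how it could be computed once forecasts exist is given below. It assumes that "IC" here means the rank information coefficient, i.e. the Spearman correlation between forecasts and realised returns; that reading, and the names `average_ranks`, `pearson`, and `rank_ic`, are assumptions for illustration and not part of any existing pipeline.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / (vx * vy) ** 0.5


def rank_ic(forecasts: Sequence[float], realised: Sequence[float]) -> float:
    """Spearman rank correlation between forecasts and realised returns."""
    if len(forecasts) != len(realised) or len(forecasts) < 2:
        raise ValueError("need two equal-length inputs with at least 2 points")
    return pearson(average_ranks(forecasts), average_ranks(realised))
```

In practice this would be evaluated per timestamp across the universe and then averaged over time; a single pooled number can hide regime dependence.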

References