Mechanism diagram · Transformer / Attention
[Diagram: Patch / attention transformer. Input series (lookback × N coins) → RevIN (norm + reverse) → Patching (L → P patches) → Linear proj (→ d-model) → Transformer encoder (multi-head self-attention + FFN, × L) → Flatten + head (→ horizon h) → Forecast (h × N coins)]
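
The first two stages of the diagram (RevIN and patching) can be sketched in NumPy as below. This is only an illustration under stated assumptions: the function names `revin_norm`, `revin_reverse`, and `make_patches` are hypothetical, the learnable affine parameters that RevIN normally carries are omitted, and the patch length and stride are arbitrary example values, not settings from this screen.

```python
import numpy as np

def revin_norm(x, eps=1e-5):
    """Normalize each series over the lookback axis (axis 0).

    x has shape (lookback, n_series). Returns the normalized array and the
    statistics needed to undo the normalization on the forecast.
    Assumption: no learnable affine scale/shift, unlike full RevIN.
    """
    mean = x.mean(axis=0, keepdims=True)
    std = x.std(axis=0, keepdims=True) + eps
    return (x - mean) / std, (mean, std)

def revin_reverse(y, stats):
    """Map model output back to the original scale of each series."""
    mean, std = stats
    return y * std + mean

def make_patches(x, patch_len, stride):
    """Split the lookback axis into overlapping patches.

    Returns shape (n_patches, n_series, patch_len), where
    n_patches = (lookback - patch_len) // stride + 1.
    """
    windows = np.lib.stride_tricks.sliding_window_view(x, patch_len, axis=0)
    return windows[::stride]

# Example: 32 lookback steps, 2 series, patches of 8 steps with stride 4.
x = np.arange(64, dtype=float).reshape(32, 2)
normed, stats = revin_norm(x)
patches = make_patches(normed, patch_len=8, stride=4)
```

Each patch would then be linearly projected to the model width before entering the encoder, and `revin_reverse` would be applied to the output of the forecast head.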

How it works

SoFTS applies a soft frequency-domain transform to the input before attention.

Spectral thresholding controls which frequency components dominate the representation.

An inverse transform maps the spectral hidden state back to the time domain for the forecast head.
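
The three steps described above (forward transform, spectral thresholding, inverse transform) can be sketched as follows. This is a minimal NumPy illustration of the description on this page, not code from any library's SOFTS implementation. It assumes a real FFT as the transform and soft shrinkage of coefficient magnitudes as the "soft" thresholding rule. The function name `soft_threshold_spectrum` and the parameter `tau` are hypothetical, and the attention stage between the transform and its inverse is left out.

```python
import numpy as np

def soft_threshold_spectrum(x, tau):
    """Shrink small frequency components of each series, then invert.

    x   : real array of shape (lookback, n_series)
    tau : hypothetical shrinkage threshold on coefficient magnitude
    """
    # Forward transform along the time axis.
    spec = np.fft.rfft(x, axis=0)
    mag = np.abs(spec)
    # Soft shrinkage: components with magnitude <= tau are zeroed,
    # larger ones are scaled by (1 - tau / magnitude).
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(mag, 1e-12))
    # Inverse transform back to the time domain.
    return np.fft.irfft(spec * scale, n=x.shape[0], axis=0)

# Example: a strong 4-cycle component plus a weak 20-cycle component.
t = np.arange(64)
clean = np.sin(2 * np.pi * 4 * t / 64)
weak = 0.01 * np.sin(2 * np.pi * 20 * t / 64)
x = np.stack([clean + weak, clean], axis=1)
filtered = soft_threshold_spectrum(x, tau=1.0)
```

With `tau=1.0` the weak component is removed entirely while the dominant one is only slightly attenuated, which is the sense in which the threshold controls which frequencies dominate the representation.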

Pros and cons on this universe

Pros

  • Frequency-domain priors can capture periodic structure missed by time-domain attention.
  • Rank #8 on the screen — comfortably positive.

Cons / failure modes

  • No dossier — added based on Nixtla arm naming.
  • Spectral methods are sensitive to non-stationarity in crypto data.
  • Cluster-redundant with TSMixer / iTransformer on this universe.

References