Mechanism diagram · RNN / GRU / LSTM
[Diagram: recurrent model (GRU / LSTM). Inputs from the lookback window (x_{t-2} onward) feed a chain of GRU cells; the hidden state passes h_{t-2} → h_{t-1} → h_t across the window, and a linear head reads h_t.]

How it works

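A GRU processes the lookback window one time step at a time. At each step it updates a hidden state: the reset gate controls how much of the previous state feeds the candidate state, and the update gate controls how much of that candidate replaces the previous state.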
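
For reference, one common way to write the per-step update is shown here. The symbols W, U and b (input weights, recurrent weights, biases) are notation introduced for this sketch, not taken from our code. The sign convention for z_t differs between papers and libraries, and some implementations apply the reset gate after the recurrent matrix product rather than before it.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) && \text{(update gate)} \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) && \text{(reset gate)} \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) && \text{(candidate state)} \\
h_t &= (1 - z_t) \odot \tilde{h}_t + z_t \odot h_{t-1} && \text{(new hidden state)}
\end{aligned}
```

With z_t close to 1 the previous state is carried forward almost unchanged, which is how the cell retains information across many steps.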

A linear head maps the final hidden state h_T, taken at the last step T of the lookback window, to the forecast horizon.
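
A minimal NumPy sketch of this forward pass follows: step through the window, keep only the last hidden state, and apply a linear head. The sizes (8 features, 16 hidden units, 60-step lookback, 5-step horizon), the random initialisation and the function names are assumptions for illustration only; this is not our training code.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def init_params(n_features, hidden, horizon, seed=0):
    """Random weights for one GRU layer plus a linear head (illustrative only)."""
    rng = np.random.default_rng(seed)
    s = 1.0 / np.sqrt(hidden)
    p = {}
    for g in ("z", "r", "h"):  # update gate, reset gate, candidate state
        p["W" + g] = rng.uniform(-s, s, (hidden, n_features))
        p["U" + g] = rng.uniform(-s, s, (hidden, hidden))
        p["b" + g] = np.zeros(hidden)
    p["W_out"] = rng.uniform(-s, s, (horizon, hidden))
    p["b_out"] = np.zeros(horizon)
    return p

def gru_forecast(window, p):
    """window: (T, n_features) array -> forecast: (horizon,) array."""
    h = np.zeros(p["Uz"].shape[0])
    for x in window:  # strictly sequential: step t needs h from step t-1
        z = sigmoid(p["Wz"] @ x + p["Uz"] @ h + p["bz"])
        r = sigmoid(p["Wr"] @ x + p["Ur"] @ h + p["br"])
        h_cand = np.tanh(p["Wh"] @ x + p["Uh"] @ (r * h) + p["bh"])
        h = (1.0 - z) * h_cand + z * h
    # Only the final hidden state reaches the linear head.
    return p["W_out"] @ h + p["b_out"]

params = init_params(n_features=8, hidden=16, horizon=5)
window = np.random.default_rng(1).normal(size=(60, 8))  # synthetic lookback window
forecast = gru_forecast(window, params)
print(forecast.shape)  # (5,)
```

The head sees nothing but h_T, so any information from early in the window has to survive every intermediate update to influence the forecast.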

The GRU is cheaper than an LSTM (two gates instead of three, and no separate cell state) and has been competitive in our experience.
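
As a rough check on the cost difference, the recurrent layer's parameters can be counted. A GRU has three weight blocks (two gates plus the candidate state) and an LSTM has four (three gates plus the cell candidate). The sketch below assumes one bias vector per block and uses hypothetical sizes; some libraries keep two bias vectors per block, which changes the totals but not the 3:4 ratio.

```python
def recurrent_param_count(n_features, hidden, n_blocks):
    """Parameters in one recurrent layer, assuming one bias vector per block."""
    per_block = hidden * n_features + hidden * hidden + hidden
    return n_blocks * per_block

n_features, hidden = 8, 64  # hypothetical sizes
gru = recurrent_param_count(n_features, hidden, n_blocks=3)   # z, r, candidate
lstm = recurrent_param_count(n_features, hidden, n_blocks=4)  # i, f, o, cell candidate
print(gru, lstm, gru / lstm)  # 14016 18688 0.75
```

Per-step compute scales the same way, so at equal hidden size the GRU costs about three quarters of the LSTM.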

Pros and cons on this universe

Pros

  • IC vector is approximately orthogonal to the modern mixer cluster, so it adds genuine diversification in a four-arm stack.
  • Cheap, well-understood, fast to train.
  • Useful baseline / sanity check.
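
One way to make the orthogonality claim concrete is the cosine similarity between two models' IC vectors. The sketch below assumes an IC vector holds one IC value per evaluation fold, which may not match how the vectors are actually built here, and the numbers are toy values chosen to be exactly orthogonal, not measured results.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two IC vectors (1 = aligned, 0 = orthogonal)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy vectors: model A earns its IC on folds 0 and 2, model B on folds 1 and 3.
ic_a = np.array([0.04, 0.00, 0.04, 0.00])
ic_b = np.array([0.00, 0.04, 0.00, 0.04])

cos_ab = cosine(ic_a, ic_b)
print(cos_ab)  # 0.0
```

A value near zero means the two models tend to be right in different places, which is what makes a low-ranked arm still worth stacking.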

Cons / failure modes

  • Ranks #21 by IC: the pre-audit figure of +0.105 was a misreport; the audited value is +0.061.
  • Sequential processing: time steps cannot be computed in parallel, so it is slow at long lookbacks compared to convolution or attention.
  • Many CPCV cards fail the Mistake #21 gate (training metric not tracking Sharpe).
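
The exact definition of the Mistake #21 gate is not given in this section. As a hypothetical illustration of "training metric not tracking Sharpe", the sketch below rank-correlates a per-epoch training metric with the Sharpe measured at the same epochs and fails the card when the correlation falls below a threshold. The function names, the 0.5 threshold and the toy numbers are all assumptions, and the simple ranking ignores ties.

```python
import numpy as np

def spearman(a, b):
    """Spearman rank correlation (simple version, does not average tied ranks)."""
    ra = np.argsort(np.argsort(a)).astype(float)
    rb = np.argsort(np.argsort(b)).astype(float)
    return float(np.corrcoef(ra, rb)[0, 1])

def tracks_sharpe(train_metric, sharpe, min_corr=0.5):
    """Hypothetical gate: pass only if the training metric moves with Sharpe."""
    return spearman(train_metric, sharpe) >= min_corr

# Toy per-epoch values, not real results.
metric = [0.10, 0.12, 0.15, 0.18, 0.20]
sharpe_good = [0.2, 0.3, 0.5, 0.6, 0.8]  # rises with the metric
sharpe_bad = [0.8, 0.6, 0.5, 0.3, 0.2]   # falls while the metric rises

print(tracks_sharpe(metric, sharpe_good), tracks_sharpe(metric, sharpe_bad))  # True False
```

A card that fails a check of this kind is optimising something other than what the strategy is paid on, whatever its final loss looks like.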

References