Mechanism diagram · CNN / TCN
[Diagram: CNN / TCN (dilated). Input series (lookback × N) → Conv 1×k, dilation 1 → Conv 1×k, dilation 2 → Conv 1×k, dilation 4 → Residual + LayerNorm (stacked depth-wise blocks) → Pool + head → Forecast]

How it works

ModernTCN uses depthwise separable convolutions with very large kernels (typically 31, 51, or 81).

Channel mixing (a pointwise 1×1 convolution) follows the depthwise convolution, similar to the ConvNeXt design from computer vision.
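The depthwise-then-pointwise pattern described above can be sketched in a few lines of NumPy. This is an illustrative toy under stated assumptions, not the ModernTCN implementation: the function names `depthwise_conv1d` and `pointwise_mix`, the random weights, the channel count, and the single residual connection are all hypothetical choices made for the example. The kernel size of 51 is one of the sizes quoted above.

```python
import numpy as np

def depthwise_conv1d(x, kernels):
    """Filter each channel with its own kernel, using 'same' zero padding.

    x:       array of shape (channels, time)
    kernels: array of shape (channels, k), with k odd
    """
    channels, _ = x.shape
    k = kernels.shape[1]
    pad = k // 2
    x_pad = np.pad(x, ((0, 0), (pad, pad)))
    out = np.empty_like(x, dtype=float)
    for c in range(channels):
        # Cross-correlation (no kernel flip), as deep-learning conv layers do.
        out[c] = np.correlate(x_pad[c], kernels[c], mode="valid")
    return out

def pointwise_mix(x, weight):
    """1x1 convolution: mix channels independently at every time step.

    weight: array of shape (out_channels, in_channels)
    """
    return weight @ x

rng = np.random.default_rng(0)
channels, time, k = 4, 128, 51                    # assumed toy sizes
x = rng.standard_normal((channels, time))
dw = rng.standard_normal((channels, k)) / k       # one kernel per channel
pw = rng.standard_normal((channels, channels)) / channels

# Depthwise filtering along time, then channel mixing, plus a residual path.
y = x + pointwise_mix(depthwise_conv1d(x, dw), pw)
print(y.shape)  # (4, 128)

# Weight counts: separable = C*k + C*C, full convolution = C*C*k.
print(channels * k + channels * channels, channels * channels * k)  # 220 816
```

Splitting the work this way is what keeps very large kernels affordable: the temporal filter costs C·k weights instead of C·C·k, and cross-channel interaction is handled separately by the cheap 1×1 step.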

Squeeze-and-excitation style residual blocks expand the effective receptive field.
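To make the receptive-field point concrete, the arithmetic below compares a dilated stack using the dilation schedule 1, 2, 4 from the mechanism diagram with a single large-kernel layer. The helper `receptive_field` and the kernel size of 3 for the dilated stack are assumptions made for illustration. The result is the theoretical receptive field of stride-1 layers; the effective receptive field a trained model actually uses is usually smaller.

```python
def receptive_field(kernel_size, dilations):
    """Input time steps visible to one output of stacked stride-1 dilated convs."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Assumed k=3 with the diagram's dilation schedule 1, 2, 4.
dilated_stack = receptive_field(3, [1, 2, 4])
# A single undilated layer with one of the large kernels quoted above.
large_kernel = receptive_field(51, [1])

print(dilated_stack, large_kernel)  # 15 51
```

Under these assumptions, one 51-tap layer already covers more history than three dilated 3-tap layers, which is the motivation for large kernels over deep dilation stacks.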

Pros and cons on this universe

Pros

  • Top of the 918-experiment benchmark on crypto, FX, and equities (RMSE rank #1-#2).
  • Modern design — depthwise separable, parallelisable, well-engineered.
  • Strong literature support.

Cons / failure modes

  • Empirical HL IC essentially zero (+0.0085) — rank #24 of 28.
  • Directional accuracy ≈ 50 percent in the crypto runs of the 918-experiment benchmark.
  • Best illustration in the program that literature RMSE leaderboards do not transfer to directional rank correlation.
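The gap between RMSE rank and directional skill can be reproduced on synthetic data. The sketch below is a hedged illustration only: the data are random numbers, not the benchmark results quoted above, and it assumes that IC means the Spearman rank correlation between forecast and realised return. The helper names `rmse`, `directional_accuracy`, and `rank_ic` are hypothetical.

```python
import numpy as np

def rmse(y, f):
    return float(np.sqrt(np.mean((y - f) ** 2)))

def directional_accuracy(y, f):
    """Share of observations where forecast and outcome have the same sign."""
    return float(np.mean(np.sign(y) == np.sign(f)))

def rank_ic(y, f):
    """Spearman rank correlation (assumes no ties, true for continuous data)."""
    rank_y = np.argsort(np.argsort(y))
    rank_f = np.argsort(np.argsort(f))
    return float(np.corrcoef(rank_y, rank_f)[0, 1])

rng = np.random.default_rng(0)
y = 0.01 * rng.standard_normal(2000)       # synthetic "returns"

# Forecast A: tiny values unrelated to y. Small errors, no ordering skill.
flat = 0.001 * rng.standard_normal(2000)
# Forecast B: perfect ordering and sign, but badly mis-scaled.
scaled = 3.0 * y

for name, f in [("flat", flat), ("scaled", scaled)]:
    print(name, rmse(y, f), directional_accuracy(y, f), rank_ic(y, f))
```

The near-zero forecast wins on RMSE because its errors are small in magnitude, while the mis-scaled forecast wins on sign and rank. A leaderboard sorted by RMSE would therefore place the forecast with no ordering skill first.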

References