ModernTCN — Axon Ridge Capital

Mechanism diagram · CNN / TCN

How it works

ModernTCN uses depthwise separable convolutions with very large kernels (typically 31, 51, or 81).

A pointwise (kernel size 1) convolution mixes channels after each depthwise convolution, similar to the ConvNeXt design from computer vision.
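The two steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the benchmarked implementation: the function names, the zero padding, the toy input, and the channel count of 64 are all hypothetical, and the toy kernel has size 3 rather than the 31, 51, or 81 quoted above so the numbers stay readable.

```python
def depthwise_conv1d(x, kernels):
    """Depthwise step: each channel is filtered by its own kernel only.

    x: list of channels, each a list of floats of equal length.
    kernels: one odd-length kernel per channel.
    Zero padding (an assumption) keeps the output as long as the input.
    """
    out = []
    for series, kernel in zip(x, kernels):
        half = len(kernel) // 2
        padded = [0.0] * half + list(series) + [0.0] * half
        out.append([
            sum(w * padded[t + j] for j, w in enumerate(kernel))
            for t in range(len(series))
        ])
    return out


def pointwise_mix(x, weights):
    """Pointwise step: mixes channels independently at every time step.

    weights[o][c] is the weight from input channel c to output channel o.
    """
    steps = len(x[0])
    return [
        [sum(w * x[c][t] for c, w in enumerate(row)) for t in range(steps)]
        for row in weights
    ]


def param_counts(channels, kernel_size):
    """Weight counts (biases ignored) for a dense conv vs. a separable one."""
    dense = channels * channels * kernel_size
    separable = channels * kernel_size + channels * channels
    return dense, separable


# Toy input: 2 channels, 5 time steps.
x = [[1.0, 2.0, 3.0, 4.0, 5.0], [0.0, 1.0, 0.0, -1.0, 0.0]]
kernels = [[1 / 3, 1 / 3, 1 / 3], [0.0, 1.0, 0.0]]  # moving average, identity
mixed = pointwise_mix(depthwise_conv1d(x, kernels), [[1.0, 1.0], [1.0, -1.0]])

# Hypothetical width of 64 channels with kernel size 51.
dense, separable = param_counts(channels=64, kernel_size=51)
print(dense, separable)
```

Splitting the temporal filter from the channel mixing is what keeps very large kernels affordable: at the assumed width, the separable layer needs 7,360 weights where a dense convolution of the same kernel size would need 208,896.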

Squeeze-and-excitation style residual blocks expand the effective receptive field.
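For the convolutional part, the reach of a stack of blocks can be put into numbers. The sketch below computes the theoretical receptive field, assuming one stride-1, undilated large-kernel convolution per block. The block counts are hypothetical, and the effective receptive field measured on a trained network is generally smaller than this upper bound.

```python
def theoretical_receptive_field(kernel_size, num_blocks):
    """Input time steps visible to one output position after stacking
    num_blocks stride-1, dilation-1 convolutions of the same kernel size."""
    return 1 + num_blocks * (kernel_size - 1)


# Kernel sizes quoted in the text; block counts 1 to 3 are assumptions.
fields = {
    k: [theoretical_receptive_field(k, n) for n in (1, 2, 3)]
    for k in (31, 51, 81)
}
for k, reach in fields.items():
    print(k, reach)

# For comparison, three blocks of a conventional size-3 kernel.
small_kernel_reach = theoretical_receptive_field(3, 3)
```

Under these assumptions a single block with kernel size 51 already sees 51 time steps, whereas three stacked size-3 convolutions see only 7.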

Pros and cons on this universe

Pros

Top of the 918-experiment benchmark on crypto, FX, and equities (RMSE rank #1-#2).
Modern design — depthwise separable, parallelisable, well-engineered.
Strong literature support.

Cons / failure modes

Empirical HL IC essentially zero (+0.0085) — rank #24 of 28.
Directional accuracy ≈ 50% in the crypto experiments of the 918-experiment benchmark.
The program's clearest illustration that a top position on literature RMSE leaderboards does not transfer to directional rank correlation.
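The gap between RMSE rank and rank correlation can be reproduced on toy data. The sketch below is illustrative only. It assumes IC means the Spearman rank correlation between forecasts and realised returns, and every number in it is invented for the example rather than taken from the benchmark or from the HL evaluation.

```python
import math


def rmse(pred, actual):
    """Root mean squared error between forecasts and outcomes."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))


def directional_accuracy(pred, actual):
    """Share of observations where forecast and outcome have the same sign."""
    hits = sum((p > 0) == (a > 0) for p, a in zip(pred, actual))
    return hits / len(actual)


def ranks(values):
    """1-based ranks; assumes no ties to keep the sketch short."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result


def rank_ic(pred, actual):
    """Spearman rank correlation (tie-free formula)."""
    n = len(pred)
    d2 = sum((p - a) ** 2 for p, a in zip(ranks(pred), ranks(actual)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Invented returns and two invented forecasts.
actual = [0.02, -0.01, 0.03, -0.02, 0.01, -0.03]
shrunk = [0.004, 0.003, -0.002, 0.002, 0.001, -0.001]  # near zero, poorly ordered
bold = [0.06, -0.05, 0.08, -0.07, 0.04, -0.09]         # right order, overshoots

for name, pred in (("shrunk", shrunk), ("bold", bold)):
    print(name, round(rmse(pred, actual), 4),
          directional_accuracy(pred, actual),
          round(rank_ic(pred, actual), 4))
```

In this toy case the forecast that stays close to zero has the lower RMSE, yet it gets the sign right only half the time and its rank correlation is close to zero. The overshooting forecast loses on RMSE while ordering every observation correctly.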

References