Recent Literature

  • Transformers Learn the Optimal DDPM Denoiser for Multi-Token GMMs (arXiv, 2026)

  • Learnable Multi-Scale Wavelet Transformer: A Novel Alternative to Self-Attention (arXiv, 2025)