Projects

A Minimal Model of Representation Collapse

We build a minimal dynamical model directly in representation space, abstracting away the details of network architecture and parameters. We use the concept of frustration from statistical physics to describe the core mechanism behind representation collapse, and show how stop-gradient breaks the symmetry of the dynamics, opening a non-collapsing subspace that preserves the geometric separation between classes.
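The symmetry-breaking role of stop-gradient can be illustrated with a hypothetical toy (not the model from this project): two class representations on an online branch are pulled toward per-class targets by an alignment loss. With symmetric gradients the targets are dragged back toward the online branch and class separation shrinks; with stop-gradient the target branch is frozen and its separation is preserved.

```python
import numpy as np

def run(stop_grad, lr=0.05, steps=400):
    # Toy alignment dynamics (illustrative only). Online representations
    # for two classes start near the origin, as for an untrained encoder;
    # target representations start well separated.
    z = np.array([[0.1, 0.0], [-0.1, 0.0]])   # online branch, one row per class
    t = np.array([[1.0, 1.0], [-1.0, -1.0]])  # target branch
    for _ in range(steps):
        g = 2 * (z - t)          # gradient of sum_i ||z_i - t_i||^2 w.r.t. z
        z = z - lr * g
        if not stop_grad:
            t = t + lr * g       # symmetric loss also pulls targets toward z
    # Geometric separation between the two classes on the target branch.
    return np.linalg.norm(t[0] - t[1])

print(run(stop_grad=False))  # symmetric: each (z_i, t_i) pair meets at its
                             # midpoint, so target separation shrinks
print(run(stop_grad=True))   # stop-gradient: targets fixed, separation kept
```

In the symmetric case the sum z_i + t_i is conserved while the difference decays, so both branches collapse onto the pair midpoints; stop-gradient removes the backward force on the targets and the class geometry survives.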

Spin Glass Model of In-Context Learning

We map in-context learning in a linear attention model to a spin glass with real-valued spins, and solve for the ground state, energy landscape, and phase behavior to show how task diversity drives a unique solution that enables in-context prediction in pre-trained transformers.
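How task diversity can select a unique ground state is easy to see in a stand-in quadratic energy (a sketch, not this project's Hamiltonian): take E(w) = ||Xw - y||^2 over real-valued "spins" w, where the rows of X are training tasks. The coupling matrix J = X^T X is rank-deficient for few tasks, giving a flat manifold of degenerate minima, and becomes full rank once the tasks are numerous and diverse, pinning a single ground state.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # dimension of the real-valued spin (weight) vector

def degenerate_ground_state(num_tasks):
    # Hypothetical quadratic energy E(w) = ||X w - y||^2: rows of X are
    # task vectors, and J = X^T X plays the role of the coupling matrix.
    X = rng.normal(size=(num_tasks, d))
    J = X.T @ X
    # The ground state is a flat manifold iff J has zero modes.
    return np.linalg.matrix_rank(J) < d

print(degenerate_ground_state(3))   # few tasks: flat directions, no unique minimum
print(degenerate_ground_state(32))  # many diverse tasks: unique ground state
```

With Gaussian tasks, full rank is reached almost surely once num_tasks >= d, mirroring the idea that sufficient task diversity removes the degeneracy of the energy landscape.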