Projects
A Minimal Model of Representation Collapse
We build a minimal dynamical model directly in representation space, abstracting away the details of network architecture and parameters. We use the concept of frustration from statistical physics to describe the core mechanism behind representation collapse, and analyze how Stop-Gradient breaks the symmetry and opens up a non-collapsing subspace that preserves the geometric separation between classes; a toy numerical sketch of this asymmetry follows below.
View Project
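As a rough illustration of the Stop-Gradient asymmetry (a hedged sketch, not the model from the project): a linear encoder `W` and linear predictor `P` trained on an alignment loss over noisy views, with and without `detach()` on the target branch. The two-class Gaussian data, dimensions, and learning rate are assumptions made purely for illustration. With full-rank augmentation noise, the symmetric loss's only global minimum is the fully collapsed `W = 0`, while the stop-gradient dynamics also admit non-collapsed fixed points.

```python
import torch

torch.manual_seed(0)
d, n = 8, 256
# Two classes, separated along the first coordinate (illustrative data).
labels = torch.randint(0, 2, (n,))
centers = torch.zeros(2, d)
centers[0, 0], centers[1, 0] = -2.0, 2.0
x = centers[labels] + 0.3 * torch.randn(n, d)

def train(stop_grad: bool, steps: int = 2000, lr: float = 0.05) -> float:
    torch.manual_seed(1)                      # same init for both runs
    W = torch.nn.Linear(d, d, bias=False)     # encoder (representation map)
    P = torch.nn.Linear(d, d, bias=False)     # predictor head
    opt = torch.optim.SGD([*W.parameters(), *P.parameters()], lr=lr)
    for _ in range(steps):
        v1 = x + 0.3 * torch.randn_like(x)    # two augmented "views"
        v2 = x + 0.3 * torch.randn_like(x)
        z1, z2 = W(v1), W(v2)
        target = z2.detach() if stop_grad else z2   # Stop-Gradient toggle
        loss = ((P(z1) - target) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    with torch.no_grad():
        z = W(x)
        # Geometric separation between the two classes in representation space.
        gap = (z[labels == 0].mean(0) - z[labels == 1].mean(0)).norm()
    return gap.item()

print("between-class gap with stop-gradient:   ", train(stop_grad=True))
print("between-class gap without stop-gradient:", train(stop_grad=False))
```

Collapse would show up here as the printed between-class gap shrinking toward zero, i.e. the representations losing exactly the geometric separation the project's subspace argument is about.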
Spin Glass Model of In-Context Learning
We map in-context learning in a linear attention model to a spin glass with real-valued spins, then solve for the ground state and characterize the energy landscape and phase behavior, showing how task diversity drives a unique solution that enables in-context prediction in pre-trained transformers; a toy sketch of the resulting quadratic "energy" follows below.
View Project
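As a rough illustration of why such a pre-training loss is spin-glass-like (a hedged sketch under assumed details, not the project's actual construction): if a one-layer linear attention model's prediction for a query takes the bilinear form ŷ = x_qᵀ G (Xᵀy), then the pre-training loss is quadratic in the entries of G, i.e. an energy in d² real-valued "spins" with couplings set by task statistics. The Gaussian task distribution, dimensions, and least-squares "ground state" below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_ctx = 4, 16

def make_task():
    w = rng.normal(size=d)                # one linear-regression task
    X = rng.normal(size=(n_ctx, d))       # in-context inputs
    y = X @ w                             # in-context labels
    xq = rng.normal(size=d)               # query input
    return X, y, xq, xq @ w               # ..., query label

def ground_state(n_tasks):
    # The loss sum_tasks (y_q - x_q^T G X^T y)^2 is quadratic in the d*d
    # real "spins" G_ij; its ground state is a linear least-squares problem.
    A = np.empty((n_tasks, d * d))
    b = np.empty(n_tasks)
    for t in range(n_tasks):
        X, y, xq, yq = make_task()
        A[t] = np.outer(xq, X.T @ y).ravel()   # coefficients of vec(G)
        b[t] = yq
    G, *_ = np.linalg.lstsq(A, b, rcond=None)  # min-norm ground state
    return G.reshape(d, d)

def icl_error(G, n_test=2000):
    # Squared in-context prediction error on tasks never seen in training.
    errs = [(yq - xq @ G @ (X.T @ y)) ** 2
            for X, y, xq, yq in (make_task() for _ in range(n_test))]
    return float(np.mean(errs))

for n_tasks in (4, 16, 64, 256):
    print(f"tasks={n_tasks:4d}  unseen-task ICL error={icl_error(ground_state(n_tasks)):.3f}")
```

With few pre-training tasks the quadratic energy has a degenerate ground-state manifold and the min-norm solution generalizes poorly; as task diversity grows the minimizer becomes unique and the held-out in-context error drops, mirroring the trend described above.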