Projects
A Free Energy Perspective for SFT and RL
From a free-energy perspective, RL and SFT can be unified under the same formal structure; their essential difference lies in the training signals they use. Model behaviors can be divided into four regimes, each corresponding to a different type of landscape. We characterize how the landscape evolves across these four regimes: from basin to tail, and then to barrier and singularity.
View Project
A Minimal Model of Representation Collapse
We build a minimal dynamical model directly in representation space, abstracting away the details of network architecture and parameters. We use the concept of frustration from statistical physics to describe the core mechanism behind representation collapse, and analyze how Stop-Gradient can break the symmetry and open up a non-collapsing subspace that preserves geometric separation between classes.
View Project
Spin Glass Model of In-Context Learning
We map in-context learning in a linear attention model to a spin glass with real-valued spins, and solve for its ground state, energy landscape, and phase behavior, showing how task diversity drives a unique solution that enables in-context prediction in pre-trained transformers.
View Project