Selected Publications
* denotes equal contribution
ICLR 2026
Block-Recurrent Dynamics in ViTs
Mozes Jacobs*, Thomas Fel*, Richard Hakim*, Alessandra Brondetta, Demba Ba, T. Andy Keller
We introduce the Block-Recurrent Hypothesis (BRH), arguing that trained ViTs admit a block-recurrent
depth structure. To validate this, we train recurrent surrogates called Raptor. A Raptor model
recovers 96% of DINOv2's ImageNet-1k linear-probe accuracy with only two blocks while maintaining
equivalent runtime. We then leverage the hypothesis to perform dynamical interpretability, revealing
directional convergence into class-dependent basins, token-specific trajectory dynamics, and
low-rank attractor structure in late layers.
CCN 2025 (Oral)
|
Traveling Waves Integrate Spatial Information Through Time
Mozes Jacobs, Robert C. Budzinski, Lyle Muller, Demba Ba, T. Andy Keller
blog / talk
We investigate how traveling waves of neural activity enable spatial information integration in
convolutional recurrent networks. Our models learn to generate traveling waves in response to visual
stimuli, effectively expanding the receptive fields of locally connected neurons. This mechanism
substantially outperforms local feed-forward networks on semantic segmentation tasks requiring
global spatial context, matching the performance of non-local U-Nets with significantly fewer
parameters.
Traveling Waves Integrate Spatial Information Into Spectral Representations
Mozes Jacobs, Robert C. Budzinski, Lyle Muller, Demba Ba, T. Andy Keller
ICLR 2025 Re-Align Workshop
HyperSINDY: Deep Generative Modeling of Nonlinear Stochastic Governing Equations
Mozes Jacobs, Bingni W. Brunton, Steven L. Brunton, J. Nathan Kutz, Ryan V. Raut
arXiv, 2023
Gradient Origin Predictive Coding
Mozes Jacobs, Linxing Preston Jiang, Rajesh N. P. Rao
Undergraduate senior thesis, 2022