Simple models of sudden learning | Department of Physics

Seminar

Department candidate

Off

Speaker

Yohai Bar-Sinai (Tel Aviv University)

Date

22/06/2026, 10:30 - 12:00 (Asia/Jerusalem)

Place

Physics (Building 202), Room 301

Abstract

A quantitative understanding of how and when neural networks learn from data is a fundamental problem with far-reaching practical consequences. An intriguing and still poorly understood phenomenon in modern machine learning is that models often learn to perform tasks in a sequence of sudden transitions, a behavior known as "grokking". Notably, in these cases the model generalizes to unseen data only long after it has completely fit the training set. This sharp transition from memorization to generalization has been observed across various synthetic and realistic scenarios. I will present a set of idealized models that exhibit this behavior and help explain its emergence "in the wild". These include linear and near-linear settings in both regression and classification, where the full training dynamics can be solved analytically and where grokking can be understood as a manifestation of critical slowing down. In more complex settings, similar phenomenology may arise from glassy escape dynamics in high-dimensional loss landscapes. I will conclude by discussing how these simplified models apply to realistic settings.

Last updated: 14/06/2026