• Starts: 11:00 am on Friday, February 28, 2025
  • Ends: 12:30 pm on Friday, February 28, 2025

ECE/CS Seminar: Eran Malach

Title: Learning Hard Problems with Neural Networks and Language Models

Abstract: Modern machine learning models, and in particular large language models, can now solve surprisingly complex mathematical reasoning problems. In this talk, I will explore how neural networks and auto-regressive language models learn to solve reasoning problems, and how the choice of data, learning paradigm, and architecture influences their behavior. I will begin by discussing computationally hard learning problems, analyzing the computational resources required for neural networks to learn them. Next, I will show how introducing step-by-step supervision through auto-regressive language models overcomes these computational barriers, enabling simple models trained on next-token prediction to efficiently learn any computationally tractable function. Finally, I will discuss how language model architectures influence model behavior across different reasoning tasks. These results serve as a basis for studying machine learning with language models, with implications for data structure, architecture design, and training paradigms.

Bio: Eran Malach is a postdoctoral Research Fellow at the Kempner Institute at Harvard University. Previously, he completed his PhD at the School of Computer Science and Engineering at the Hebrew University of Jerusalem, advised by Prof. Shai Shalev-Shwartz. His research focuses on machine learning, the theoretical foundations of deep learning, and language models; he is mainly interested in computational aspects of learning and optimization. He has also worked at Mobileye, where he developed machine learning and computer vision algorithms for driver-assistance systems and self-driving cars. His research is supported by the Rothschild Fellowship, the William F. Milton Fund, and the OpenAI Superalignment Fast Grant.

Location:
CDS 950