CISE Seminar: October 11, 2019 – Alexander (Sasha) Rakhlin, Massachusetts Institute of Technology (MIT)
8 St. Mary’s St., PHO 203
3:00pm-4:00pm
Alexander (Sasha) Rakhlin
Massachusetts Institute of Technology
Is Memorization Compatible with Learning?
One of the key tenets taught in courses on Statistics and Machine Learning is that fitting the data too well (or data memorization, data interpolation) inevitably leads to overfitting and poor prediction performance. Yet, over-parametrized neural networks appear to defy this “rule.” We will provide theoretical evidence that challenges the common wisdom. In particular, we will consider the minimum norm interpolant in a reproducing kernel Hilbert space and show its good generalization properties in certain high-dimensional regimes. Since gradient dynamics for wide randomly-initialized neural networks provably converge to a minimum-norm interpolant (with respect to a certain kernel), our results imply generalization and consistency for such neural networks. We will contrast our approach with the classical techniques based on uniform convergence and Rademacher averages and argue that these techniques are not sufficient for analyzing the memorization regime. This work was done jointly with Tengyuan Liang and Xiyu Zhai.
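As a rough illustration of the object at the center of the abstract: when the kernel matrix K(X, X) on the training points is invertible, the minimum-norm interpolant in the reproducing kernel Hilbert space has the closed form f(x) = K(x, X) K(X, X)^{-1} y. The sketch below computes it with NumPy. The Gaussian kernel, the bandwidth, the synthetic data, and the function names are all assumptions made for illustration; the abstract does not specify the kernel or the regimes analyzed in the talk.

```python
import numpy as np


def rbf_kernel(A, B, bandwidth):
    """Gaussian kernel matrix between rows of A and rows of B (assumed kernel choice)."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def min_norm_interpolant(X, y, bandwidth):
    """Return a predictor f with f(X) = y and minimal RKHS norm (no ridge penalty)."""
    K = rbf_kernel(X, X, bandwidth)
    alpha = np.linalg.solve(K, y)  # coefficients of f in the span of K(., x_i)

    def predict(X_new):
        return rbf_kernel(X_new, X, bandwidth) @ alpha

    return predict


# Hypothetical high-dimensional toy data: n = 50 points in d = 100 dimensions.
rng = np.random.default_rng(0)
n, d = 50, 100
X = rng.standard_normal((n, d)) / np.sqrt(d)
y = X[:, 0] + 0.1 * rng.standard_normal(n)  # noisy labels

f_hat = min_norm_interpolant(X, y, bandwidth=1.0)

# The interpolant fits the noisy training labels exactly (up to round-off).
train_residual = float(np.max(np.abs(f_hat(X) - y)))

# Predictions on fresh points drawn from the same distribution.
X_test = rng.standard_normal((20, d)) / np.sqrt(d)
test_predictions = f_hat(X_test)
```

Here the predictor memorizes the training labels, noise included; whether it still predicts well on new points is the question the talk addresses, and this toy example makes no claim either way.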
Alexander (Sasha) Rakhlin is an Associate Professor at MIT, with appointments in the Statistics & Data Science Center and the Department of Brain and Cognitive Sciences. Sasha is currently the Chair of the Interdisciplinary Doctoral Program in Statistics at MIT. His research is in Statistics, Machine Learning, and Optimization. He received bachelor’s degrees in mathematics and computer science from Cornell University and a doctoral degree in computational neuroscience from MIT. He was a postdoc at UC Berkeley EECS before joining the University of Pennsylvania, where he was an associate professor in the Department of Statistics and a co-director of the Penn Research in Machine Learning (PRiML) center. He is a recipient of the NSF CAREER Award, the IBM Research Best Paper Award, the Machine Learning Journal Award, and the COLT Best Paper Award.
Faculty Host: Yannis Paschalidis
Student Host: Athar Roshandelpoor