ECE Seminar with Daniel Hsu

12:00 pm on Thursday, March 21, 2013
Photonics Center, 8 Saint Mary’s St., Room 339
Fast Learning Algorithms for Discovering the Hidden Structure in Data

With Daniel Hsu
Postdoctoral Researcher
Microsoft Research New England

Faculty Host: Ioannis Paschalidis

Refreshments will be served outside Room 339 at 11:45 a.m.

Abstract: A major challenge in machine learning is to reliably and automatically discover hidden structure in data without any human intervention. Examples of such structure found in numerous applications include the stratification of a population into clusters, the thematic structure of document collections, and the dynamical processes governing complex time series. Many of the core statistical estimation problems of these types are, in general, provably intractable for both computational and information-theoretic reasons. However, much progress has been made over the past decade or so to overcome these hardness barriers by focusing on realistic cases that rule out the intractable instances.

In this talk, I’ll describe a general computational approach for correctly estimating a wide class of statistical models, including Gaussian mixture models, Hidden Markov models, Latent Dirichlet Allocation, Probabilistic Context Free Grammars, and several more. The key idea is to exploit the structure of low-order correlations that is present in high-dimensional data. The scope of the new approach extends beyond the purview of previous algorithms, and it leads to both new theoretical guarantees for unsupervised learning and fast, practical algorithms for large-scale data analysis.

About the Speaker: Daniel Hsu is a postdoc at Microsoft Research New England. Previously, he was a postdoc with the Department of Statistics at Rutgers University and the Department of Statistics at the University of Pennsylvania from 2010 to 2011, supervised by Tong Zhang and Sham M. Kakade. He received his Ph.D. in Computer Science in 2010 from UC San Diego, where he was advised by Sanjoy Dasgupta, and his B.S. in Computer Science and Engineering in 2004 from UC Berkeley. His research interests are in algorithmic statistics and machine learning.