ECE Colloquium with Michael Mahoney

4:00 pm on Wednesday, February 27, 2013
Photonics Center, 8 Saint Mary’s St., Room 339
Implementing Randomized Matrix Algorithms in Parallel and Distributed Environments With Michael Mahoney Stanford University Refreshments will be served outside Room 339 at 3:45 p.m. Faculty Host: Venkatesh Saligrama Abstract: Motivated by problems in large-scale data analysis, randomized algorithms for matrix problems such as regression and low-rank matrix approximation have been the focus of a great deal of attention in recent years. These algorithms exploit novel random sampling and random projection methods, and implementations of these algorithms have already proven superior to traditional state-of-the-art algorithms, as implemented in Lapack and high-quality scientific computing software, for moderately-large problems stored in RAM on a single machine. Here, we describe the extension of these methods to computing high-precision solutions in parallel and distributed environments that are more common in very large-scale data analysis applications. In particular, we consider both the Least Squares Approximation problem and the Least Absolute Deviation problem, and we develop and implement randomized algorithms that take advantage of modern computer architectures in order to achieve improved communication profiles. Our iterative least-squares solver, LSRN, is competitive with state-of-the-art implementations on moderately-large problems; and, when coupled with the Chebyshev semi-iterative method, scales well for solving large problems on clusters that have high communication costs such as on an Amazon Elastic Compute Cloud cluster. Our iterative least-absolute-deviations solver is based on fast ellipsoidal rounding, random sampling, and interior-point cutting-plane methods; and we demonstrate significant improvements over traditional algorithms on MapReduce. In addition, this algorithm can also be extended to solve more general convex problems on MapReduce. About the Speaker: Michael Mahoney is at Stanford University. His research interests center around algorithms for very large-scale statistical data analysis, including both theoretical and applied aspects of problems in scientific and Internet domains. His current research interests include geometric network analysis; developing approximate computation and regularization methods for large informatics graphs; applications to community detection, clustering, and information dynamics in large social and information networks; and the theory of randomized matrix algorithms and its application to genetics, medical imaging, and Internet problems. He has been a faculty member at Yale University and a researcher at Yahoo, and his PhD was in computational statistical mechanics at Yale University.