ECE Seminar with Dan Feldman
- 4:00 pm on Tuesday, March 5, 2013
- Photonics Center, 8 Saint Mary’s St., Room 339
Learning Patterns in Big Data From Small Data Using Core-Sets

Dan Feldman
Postdoctoral Researcher, The Distributed Robotics Lab
Massachusetts Institute of Technology

Faculty Host: Venkatesh Saligrama

Refreshments will be served outside Room 339 at 3:45 p.m.

Abstract: When we need to solve an optimization problem, we usually use the best available algorithm or software, or try to improve it. In recent years we have started exploring a different approach: instead of improving the algorithm, reduce the input data and run the existing algorithm on the reduced data to obtain the desired output much faster. A core-set for a given problem is a semantic compression of its input, in the sense that a solution to the problem with the (small) core-set as input yields a provably approximate solution to the problem with the original (Big) Data. Core-sets can usually be computed in one pass over a streaming input, using a manageable amount of memory, and in parallel. For real-time performance we use Hadoop, clouds, and GPUs.

In this talk I will describe how we applied this magical paradigm to obtain algorithmic achievements with performance guarantees in iDiary: a system that combines sensor networks, robotics, differential privacy, and text mining. It turns large signals collected from smartphones or robots into maps and textual descriptions of their trajectories. The system features a user interface similar to Google Search that allows users to type text queries about their activities (e.g., "Where did I have dinner last time I visited Paris?") and receive textual answers based on their signals.

About the Speaker: Dan Feldman is a postdoc at MIT in the Distributed Robotics Lab, where he develops systems for handling streaming big data from sensors, smartphones, images, and robots. He received his Ph.D. from Tel-Aviv University in 2010, under the supervision of Professor Micha Sharir and Professor Amos Fiat. He was then a postdoc at the Center for the Mathematics of Information at Caltech for a year and a half, where he began narrowing the gap between theoretical computational geometry and practical machine learning. He specialized in developing software for scalable data compression, based on core-set constructions with provable guarantees. In recent years, his core-sets have been implemented in several start-ups, banks, supermarkets, and internet search companies, to name just a few. When he is not working, Dan is building robots with his very own core-sets, Ariel and Eleanor.