Targeted Matrix Completion: Natali Ruchansky, BU

Starts:
11:00 am on Monday, December 9, 2013
Ends:
12:00 pm on Monday, December 9, 2013
Location:
MCS 148
Abstract: In many problems the input consists of a partially-observed matrix and the goal is to identify a low-rank matrix that best fits these observations. This task, known as matrix completion, has in recent years become recognized as fundamental in data analysis. Traditional statistical approaches formulate the matrix completion problem as an optimization one, assuming that the set of observed entries is sufficient for accurate reconstruction with high probability. A major drawback is that such methods give no indication of which entries have been accurately recovered and which not -- since they output some completion for any set of observations, even the smallest. On the other hand, the recent structural completion methods (such as the Information Cascade Matrix Completion method we adopt) evaluate the positions of observed entries and indicate precisely which missing entries can be recovered accurately. Here, we ask the following natural question: when the observations are not enough to recover all entries, how can we add observations to enable full completion? One strategy is to randomly reveal entries until a density at which statistical methods can output a good estimate. Unfortunately this density can be rather high, and so we wonder if there is a smarter way. Through a deep understanding of structural completion, we introduce an algorithm called Order&Extend that chooses entries strategically. Together with structural completion, our algorithm can identify exactly which entries can be recovered given the observation, and then specify how much and which additional information is needed for full completion. We show that Order&Extend requires less (almost minimal) information and leads to better recovery than statistical methods. Short Bio: Natali received a Bachelor of Art from Boston University in Mathematics and Computer Science. She is now a PhD student in the Data Management Lab at Boston University, working with Evimaria Terzi and Mark Crovella. Her research interests span many topics including algorithmic data mining and mathematics.