Naiad: a system for incremental, iterative and interactive parallel computation: Frank McSherry, MSR
- Starts: 11:00 am on Monday, December 17, 2012
- Ends: 12:30 pm on Monday, December 17, 2012
Abstract:
We are developing a new system for large-scale data analysis -- called
"Naiad" -- which has the goal of supporting complex iterative queries
over dynamic inputs at interactive timescales. Like many existing
systems, Naiad supports high-level declarative queries, data-parallel
execution, and transparent distribution. Unlike these systems, Naiad
can efficiently execute queries with multiple (possibly nested)
iterative loops, while simultaneously supporting low-latency
incremental changes to the query inputs. As a highlight of its
characteristics, Naiad can not only efficiently compute the strongly
connected component structure of a 24 hour sliding window of the
Twitter @mention graph (using a doubly nested fixed-point
computation), but also maintains the computation with sub-second
latencies in the face of Twitter's full volume of continuously
arriving tweets.
I will describe the computational model underlying Naiad, a
generalization of traditional incremental dataflow to partially
ordered logical times, and work through some of the (very friendly,
picture oriented) mathematical details. I will also highlight several
new distributed systems challenges faced in order to fully realize the
multiple orders-of-magnitude performance improvements Naiad presents.
This is joint work with Derek Murray, Rebecca Isaacs, Michael Isard
and Martìn Abadi.
Lunch will follow the talk in 135.
- Location:
- MCS 148
- Link:
- Learn More