Red Hat Colloquium: “Towards Tail Latency-Aware Caching in Large Web Services,” Daniel S. Berger (CMU)

Wednesday, September 5th, 2018, 12:00 PM – 1:00 PM
refreshments & networking at 11:30 AM
Hariri Institute for Computing
111 Cummington Mall, Room 180

Daniel S. Berger
Carnegie Mellon University

Towards Tail Latency-Aware Caching in Large Web Services

Abstract: Tail latency is of great importance in user-facing web services. However, achieving low tail latency is challenging, because typical user requests result in multiple queries to a variety of complex backends (databases, recommender systems, ad systems, etc.), where the request is not complete until all of its queries have completed.
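To make the fan-out effect above concrete, here is a minimal sketch, not taken from the talk. It assumes a request's queries are issued in parallel, so the request finishes when its slowest query finishes, and that queries are slow independently of one another. The function names and the numbers in the usage note are hypothetical.

```python
def request_latency(query_latencies_ms):
    """Latency of one request, assuming its queries run in parallel:
    the request completes only when the slowest query completes."""
    return max(query_latencies_ms)


def prob_request_slow(p_query_slow, num_queries):
    """Probability that at least one of `num_queries` queries is slow,
    assuming each is slow independently with probability `p_query_slow`."""
    return 1.0 - (1.0 - p_query_slow) ** num_queries
```

For example, request_latency([12, 5, 140]) is 140, and under these assumptions a request that issues 100 queries, each slow 1% of the time, is slow roughly 63% of the time.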

In this talk, Berger presents findings for the case of several large web services at Microsoft. He analyzes production system request structures and finds that requests vary greatly in the backends that they access and in the number of queries made to each backend. Furthermore, he finds that backend query latencies vary by more than two orders of magnitude across backends and vary widely over time, resulting in high request tail latencies.

This talk proposes a novel solution for maintaining low request tail latency: repurposing existing caches to mitigate the effects of backend latency variability. RobinHood, the solution developed by Berger’s team, dynamically reallocates cache resources from the cache-rich (backends that do not affect request latency) to the cache-poor (backends that do affect request latency). He evaluates RobinHood with production traces on a 50-server cluster with 20 different backend systems. He finds that, in the presence of load spikes, RobinHood meets a 150 ms SLO 99.7% of the time, whereas the next best policy meets this SLO only 70% of the time.
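The reallocation idea can be illustrated with a rough sketch. This is not RobinHood's actual algorithm, which the abstract does not specify. It assumes a hypothetical per-backend "blame" count (how often a backend was responsible for a slow request) and a fixed transfer step, and it treats every backend other than the most-blamed one as a donor. All names and parameters are illustrative.

```python
def rebalance(allocation, blame, step=1):
    """Move up to `step` cache units from every other backend to the
    backend blamed most often for slow requests.

    allocation: dict mapping backend name -> cache units currently assigned
    blame: dict mapping backend name -> number of slow requests attributed to it
    Both the blame metric and the fixed step are assumptions of this sketch.
    """
    new_alloc = dict(allocation)
    # The "cache-poor" backend: the one most often blamed for slow requests.
    poor = max(new_alloc, key=lambda b: blame.get(b, 0))
    if blame.get(poor, 0) == 0:
        return new_alloc  # no backend is hurting request latency
    for backend in allocation:
        if backend == poor:
            continue
        # Never take more cache than the donor backend currently holds.
        taxed = min(step, new_alloc[backend])
        new_alloc[backend] -= taxed
        new_alloc[poor] += taxed
    return new_alloc
```

Calling rebalance({"db": 10, "ads": 10, "recs": 10}, {"db": 0, "ads": 7, "recs": 1}, step=2) moves two units each from "db" and "recs" to "ads"; the total cache size stays constant.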

Bio: Daniel S. Berger is the 2018 Mark Stehlik Postdoctoral Fellow in the Computer Science Department at Carnegie Mellon University. His research interests intersect systems, mathematical modeling, and performance testing. Daniel’s research explores how caching can be used to reduce tail latency in large web services and CDNs. Daniel received his Ph.D. (2018) from the University of Kaiserslautern, Germany, and has made extended visits to CMU (2015-2017), Warwick University (2014), T-Labs Berlin (2013), ETH Zurich (2012), and the University of Waterloo (2011). Previously, Daniel worked as a data scientist at the German Cancer Research Center (2008-2010) and as a project scientist at CMU (2017-2018).