{"id":1433,"date":"2018-08-15T10:44:52","date_gmt":"2018-08-15T14:44:52","guid":{"rendered":"https:\/\/www.bu.edu\/rhcollab\/?p=1433"},"modified":"2018-09-20T21:16:31","modified_gmt":"2018-09-21T01:16:31","slug":"colloquium-towards-tail-latency-aware-caching-in-large-web-services","status":"publish","type":"post","link":"https:\/\/www.bu.edu\/rhcollab\/2018\/08\/15\/colloquium-towards-tail-latency-aware-caching-in-large-web-services\/","title":{"rendered":"Colloquium: Towards Tail Latency-Aware Caching in Large Web Services"},"content":{"rendered":"<p><script type=\"text\/javascript\" src=\"https:\/\/addevent.com\/libs\/atc\/1.6.1\/atc.min.js\" async defer><\/script><\/p>\n<div class=\"event-content\">\n<h3>Red Hat Collaboratory at Boston University Colloquium<\/h3>\n<h3 style=\"font-weight: normal;\">Daniel S. Berger<\/h3>\n<p>2018 Mark Stehlik Postdoctoral Fellow in the Computer Science Department at Carnegie Mellon University<\/p>\n<h3><i>Towards Tail Latency-Aware Caching in Large Web Services<\/i><\/h3>\n<h4>Abstract<\/h4>\n<p>Tail latency is of great importance in user-facing web services. However, achieving low tail latency is challenging, because typical user requests result in multiple queries to a variety of complex backends (databases, recommender systems, ad systems, etc.), where the request is not complete until all of its queries have completed.<\/p>\n<p>In this talk we present our findings for the case of several large web services at Microsoft. We analyze production system request structures and find that requests vary greatly in the backends that they access and in the number of queries made to each backend. 
Furthermore, we find that backend query latencies vary by more than two orders of magnitude across backends and vary widely over time, resulting in high request tail latencies.<\/p>\n<p>This talk proposes a novel solution for maintaining low request tail latency: repurpose existing caches to mitigate the effects of backend latency variability. Our solution, RobinHood, dynamically reallocates cache resources from the cache-rich (backends that don\u2019t affect request latency) to the cache-poor (backends that affect request latency). We evaluate RobinHood with production traces on a 50-server cluster with 20 different backend systems. We find that, in the presence of load spikes, RobinHood meets a 150 ms SLO 99.7% of the time, whereas the next best policy only meets this SLO 70% of the time.<\/p>\n<p><i>The team working on this project includes Benjamin Berg (CMU), Timothy Zhu (Penn State), Mor Harchol-Balter (CMU), and Siddhartha Sen (MSR). The work will appear at USENIX OSDI 2018.<\/i><\/p>\n<h4>Bio<\/h4>\n<p>Daniel S. Berger is the 2018 Mark Stehlik Postdoctoral Fellow in the Computer Science Department at Carnegie Mellon University. His research interests intersect systems, mathematical modeling, and performance testing. Daniel&#8217;s research explores how caching can be used to reduce tail latency in large web services and CDNs. Daniel received his Ph.D. (2018) from the University of Kaiserslautern, Germany, and spent extended visits at CMU (2015-2017), Warwick University (2014), T-Labs Berlin (2013), ETH Zurich (2012), and the University of Waterloo (2011). 
Previously, Daniel worked as a data scientist at the German Cancer Research Center (2008-2010) and as a project scientist at CMU (2017-2018).<\/p>\n<h3>Agenda<\/h3>\n<ul>\n<li>11:30 AM &#8211; 12:00 PM: <strong>Pizza &amp; Networking<\/strong><\/li>\n<li>12:00 &#8211; 1:00 PM: <strong>Talk and Discussion<\/strong><\/li>\n<\/ul>\n<h3>Questions?<\/h3>\n<p><a href=\"\/rhcollab\/get-involved\/contact-us\/\">Contact the Collaboratory<\/a> with any questions you may have about this event.<\/p>\n<h3>Recording of Event<\/h3>\n<p>This talk was held as scheduled. A recording can be accessed<span>\u00a0<\/span><a href=\"https:\/\/echo360.org\/media\/5049c693-4f73-454d-8e6a-f786fbc8b798\/public\">here<\/a>.\u00a0 Slides can be accessed\u00a0<a href=\"\/rhcollab\/files\/2018\/08\/Berger-Collaboratory-Talk-9.5.18.pdf\">here<\/a>.<\/p>\n<\/div>\n<aside class=\"event-meta\">\n<h4>Dates &amp; Times<\/h4>\n<p>Wednesday, September 5, 2018<br \/>\n11:30 AM &#8211; 1:00 PM<br \/>\n(Pizza &amp; networking until 12 PM)<\/p>\n<div title=\"Add to Calendar\" class=\"addeventatc\" style=\"margin-bottom: 30px;\">Add to Calendar<br \/>\n<span class=\"start\">09\/05\/2018 11:30 AM<\/span><span class=\"end\">09\/05\/2018 1:00 PM<\/span><span class=\"timezone\">America\/New_York<\/span><span class=\"title\">Colloquium: <i>Towards Tail Latency-Aware Caching in Large Web Services<\/i><\/span><span class=\"description\">Tail latency is of great importance in user-facing web services. However, achieving low tail latency is challenging, because typical user requests result in multiple queries to a variety of complex backends (databases, recommender systems, ad systems, etc.), where the request is not complete until all of its queries have completed.<br \/>\nIn this talk we present our findings for the case of several large web services at Microsoft. 
We analyze production system request structures and find that requests vary greatly in the backends that they access and in the number of queries made to each backend. Furthermore, we find that backend query latencies vary by more than two orders of magnitude across backends and vary widely over time, resulting in high request tail latencies.<br \/>\n<\/span><span class=\"location\">Hariri Institute for Computing<br \/>\n111 Cummington Mall, Seminar Room, Boston MA<\/span><\/div>\n<h4>Location<\/h4>\n<address style=\"font-style: normal;\">Hariri Institute for Computing<br \/>\n111 Cummington Mall, Seminar Room<\/address>\n<p><a href=\"https:\/\/goo.gl\/maps\/96msDgREAF32\">Directions<\/a><\/p>\n<\/aside>\n","protected":false},"excerpt":{"rendered":"<p>Red Hat Collaboratory at Boston University Colloquium Daniel S. Berger 2018 Mark Stehlik Postdoctoral Fellow in the Computer Science Department at Carnegie Mellon University Towards Tail Latency-Aware Caching in Large Web Services Abstract Tail latency is of great importance in user-facing web services. 
However, achieving low tail latency is challenging, because typical user requests result [&hellip;]<\/p>\n","protected":false},"author":14350,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[50,51,48,6],"tags":[],"_links":{"self":[{"href":"https:\/\/www.bu.edu\/rhcollab\/wp-json\/wp\/v2\/posts\/1433"}],"collection":[{"href":"https:\/\/www.bu.edu\/rhcollab\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.bu.edu\/rhcollab\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/rhcollab\/wp-json\/wp\/v2\/users\/14350"}],"replies":[{"embeddable":true,"href":"https:\/\/www.bu.edu\/rhcollab\/wp-json\/wp\/v2\/comments?post=1433"}],"version-history":[{"count":7,"href":"https:\/\/www.bu.edu\/rhcollab\/wp-json\/wp\/v2\/posts\/1433\/revisions"}],"predecessor-version":[{"id":1524,"href":"https:\/\/www.bu.edu\/rhcollab\/wp-json\/wp\/v2\/posts\/1433\/revisions\/1524"}],"wp:attachment":[{"href":"https:\/\/www.bu.edu\/rhcollab\/wp-json\/wp\/v2\/media?parent=1433"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.bu.edu\/rhcollab\/wp-json\/wp\/v2\/categories?post=1433"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.bu.edu\/rhcollab\/wp-json\/wp\/v2\/tags?post=1433"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}