Tail Latency Meets Caching - An Unusual Alliance


Microsoft Research


Published February 22, 2019, 18:14

In today’s world of interactive computing, web services need to achieve low latency for almost all user requests (e.g., low 99th-percentile latency). Reducing the latency tail is challenging because tail events are rare and often have complex causes. Hence, tail latency has been a recurring theme in academic and industry systems research for over a decade.

See more at microsoft.com/en-us/research/v...

In this talk, I will demonstrate that caching can be a strong ally in the pursuit of low tail latency. My talk will present two concrete examples, in datacenter and edge caching, where redesigned caching systems lead to order-of-magnitude reductions in tail latency. These results contradict the common belief that, due to their non-negligible miss ratios, caches are of little benefit for reducing tail latency. I will show how to overcome this common belief by exploiting analytical performance modeling to guide the design of caching systems.
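The "common belief" mentioned above comes down to simple percentile arithmetic, which the following minimal sketch illustrates. It is not the speaker's model or system: the hit and miss latencies, the request count, and the helper names `percentile` and `request_latencies` are all assumptions made for illustration. It shows that, in a toy workload where every miss is slow, the 99th percentile only improves once the miss ratio drops below 1%, even though the median is fast at every miss ratio.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def request_latencies(n_requests, miss_ratio, hit_ms, miss_ms):
    """Toy workload (hypothetical): a fixed fraction of requests miss the cache."""
    n_miss = round(n_requests * miss_ratio)
    return [miss_ms] * n_miss + [hit_ms] * (n_requests - n_miss)


# Assumed latencies, chosen only for illustration.
HIT_MS, MISS_MS = 1.0, 100.0

for miss_ratio in (0.05, 0.02, 0.005):
    latencies = request_latencies(10_000, miss_ratio, HIT_MS, MISS_MS)
    p50 = percentile(latencies, 50)
    p99 = percentile(latencies, 99)
    print(f"miss ratio {miss_ratio:.3f}: p50 = {p50} ms, p99 = {p99} ms")
```

Under these assumptions, the 5% and 2% miss ratios both leave the 99th percentile at the miss latency, while 0.5% brings it down to the hit latency. The talk argues that redesigned caching systems can reduce tail latency despite this arithmetic; the sketch only captures the naive view it sets out to contradict.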

While rooted in theory, my research is highly practical and conducted in continuous collaboration with industry partners. I will describe how collaboration and open-source prototypes have enabled production use of my research at a top-ten US website and deployment tests at several other companies.
