Toward Automated Debugging for Datacenter Applications

Published August 17, 2016, 1:19
Debugging data-intensive distributed applications running in a datacenter ("datacenter applications") is complex and time-consuming. Developers want a way to debug failed executions with little human effort, but no such tool exists today. In this talk, I will present ADDA, a system that significantly reduces the manual effort needed to debug datacenter applications. Specifically, ADDA enables developers to run powerful automated analyses (such as global invariant checks and distributed data flow) on the executions of large-scale distributed applications, removing the need to manually search and reason through those executions. The key challenge in building ADDA is performing such heavyweight analysis while incurring little in-production overhead. To address this, ADDA uses deterministic replay technology to offload expensive analyses to an offline replay execution. With deterministic replay, ADDA incurs low in-production overheads (at ~15) and automates, in large part, the debugging of real-world failures in applications like Hypertable and Cassandra. To conclude the talk, I will give a demo of ADDA in action and argue that ADDA brings us one step closer to the holy grail of fully automated datacenter debugging.
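
The record-then-replay idea described above can be illustrated with a toy sketch. This is not ADDA's implementation: the application (`run_counter`), the helper names (`record`, `replay`), and the invariant are all hypothetical, chosen only to show the general pattern in which a production run logs its nondeterministic inputs cheaply and an offline run re-executes from that log with an expensive check attached.

```python
import random
from typing import Callable, List, Optional, Tuple


def run_counter(next_input: Callable[[], int], steps: int,
                on_step: Callable[[int, int], None] = lambda i, total: None) -> int:
    """Toy application (hypothetical): sums `steps` nondeterministic inputs."""
    total = 0
    for i in range(steps):
        total += next_input()
        on_step(i, total)  # analysis hook; a no-op during the production run
    return total


def record(steps: int, seed: Optional[int] = None) -> Tuple[int, List[int]]:
    """Production run: log every nondeterministic input, perform no analysis."""
    rng = random.Random(seed)
    log: List[int] = []

    def next_input() -> int:
        value = rng.randint(-5, 10)  # stands in for network or disk input
        log.append(value)
        return value

    return run_counter(next_input, steps), log


def replay(log: List[int],
           invariant: Callable[[int], bool]) -> Tuple[int, List[Tuple[int, int]]]:
    """Offline run: feed the logged inputs back and check the invariant at every step."""
    inputs = iter(log)
    violations: List[Tuple[int, int]] = []

    def check(step: int, total: int) -> None:
        if not invariant(total):
            violations.append((step, total))

    result = run_counter(lambda: next(inputs), len(log), on_step=check)
    return result, violations


if __name__ == "__main__":
    result, log = record(steps=50)
    replayed, violations = replay(log, invariant=lambda total: total >= 0)
    assert replayed == result  # the replay reproduces the original execution
    print(f"result={result}, invariant violations={violations}")
```

Because the replay consumes exactly the inputs that the recorded run saw, it reaches the same final state, so the per-step check can be as costly as needed without affecting the production run. A real system would also have to capture sources of nondeterminism that this sketch ignores, such as thread scheduling and message ordering across nodes.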