Automatic Failure Diagnosis in Large-Scale Systems

Project Mosaic
Published September 6, 2016, 5:11
As modern computer systems grow in both size and complexity, so does the need for automatic analysis and computer-aided administration of these systems. With recent gains in computing power and more efficient algorithms, statistical machine learning methods have become increasingly practical for dealing with the deluge of data these systems generate. In this talk, I present statistical diagnostic platforms for several large-scale systems, focusing on the problem of selecting fault-related components from a long list of potential candidates. Examples include a distributed software monitoring system for automatic debugging and a probing system for detecting failures on clusters of networked computers.