Clustering

Published August 12, 2016, 2:09
Clustering is the problem of finding a "good" partition of a set of data points in d-dimensional space into clusters (groups, each consisting of "nearby" points). In the worst case, the problem is hard. This can be overcome in two ways. First, assuming a stochastic model of the data, the "correct" clustering can be found; the last decade has seen many detailed results in this setting. A second alternative is to make no stochastic assumptions, but settle for non-optimal solutions. The workhorse of this approach is Lloyd's k-means algorithm. After surveying both alternatives, I will describe a recent result (joint work with Amit Kumar) which, qualitatively, seems to provide the best of both worlds.
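For readers unfamiliar with Lloyd's k-means algorithm, the following is a minimal sketch of the standard textbook iteration, not the method from the talk or the joint result it describes. The function name `lloyd_kmeans`, its parameters, and the random-sample initialization are assumptions chosen for illustration. The iteration alternates between assigning each point to its nearest center and moving each center to the mean of its assigned points, which generally reaches a local optimum rather than the best partition.

```python
import random


def lloyd_kmeans(points, k, iters=100, seed=0):
    """Illustrative Lloyd iteration on a list of equal-length tuples.

    Hypothetical helper: initial centers are k distinct input points
    drawn at random, which is one common choice among several.
    """
    rng = random.Random(seed)
    centers = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to the center with the
        # smallest squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])),
            )
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the centers are stable
        labels = new_labels
        # Update step: move each center to the mean of its cluster.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centers, labels = lloyd_kmeans(data, k=2)
    print(centers, labels)
```

On well-separated data such as the two small groups above, the iteration recovers the intended grouping; on harder inputs the outcome depends on the initialization, which is the non-optimality the abstract refers to.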