Automatic Discovery of the Statistical Types of Variables in a Dataset

599
22.2
Опубликовано 19 мая 2017, 22:30
A common practice in statistics and machine learning is to assume that the statistical data types (e.g., ordinal, categorical or real-valued) of variables, and usually also the likelihood model, is known. However, as the availability of real-world data increases, this assumption becomes too restrictive. Data are often heterogeneous, complex, and improperly or incompletely documented. Surprisingly, despite their practical importance, there is still a lack of tools to automatically discover the statistical types of, as well as appropriate likelihood (noise) models for, the variables in a dataset. In this work, we fill this gap by proposing a Bayesian method, which accurately discovers the statistical data types in both synthetic and real data.

See more on this video at microsoft.com/en-us/research/v...
Случайные видео
264 дня – 2 7871:00
Rock is Life 🪨
12.09.23 – 3 4220:31
Bing x SportsCenter
21.09.21 – 5 057 73820:56
We Roasted Your WORST Setups
автотехномузыкадетское