Data Mining for Causal Inference

Microsoft Research

Published November 14, 2016, 19:20

As an increasing amount of daily activity, ranging from what we purchase to whom we talk to, shifts to online platforms, it is only natural to ask how those platforms affect our behavior. Take, for instance, online recommendation systems: how much activity do recommendations actually cause over and above what would have happened in their absence? Without randomized experiments, which may be costly or infeasible, estimating the impact of such systems is non-trivial. In this talk, I will argue that careful data mining can help answer relevant causal questions in a more general way than traditional observational approaches. In the first example, I will show how data mining can augment a popular technique, instrumental variables, by searching for large and sudden shocks in time series data. Applying this method to system logs for Amazon's "People who bought this also bought" recommendations, we are able to analyze over 4,000 unique products that experience such shocks. This leads to a more accurate estimate of the impact of the recommender system: at least 75% of recommendation click-throughs would likely occur even in the absence of recommendations, calling into question popular industry estimates based on observed click-through rates. In the second example, I will present a general data-driven identification strategy for finding natural experiments in time series data, inspired by the shock-based approach above. This method, too, reveals a similar overestimate of the impact of recommendation systems.
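To make the shock-based idea concrete, here is a minimal Python sketch. It is not the speaker's implementation. The function names, the shock threshold, and the daily counts are all hypothetical. It assumes that a sudden spike in visits to a focal product acts as an instrument, and that direct demand for the recommended product stays stable on the shock day, so the extra recommendation clicks per extra focal visit approximate the causal click-through rate.

```python
from statistics import median


def find_shock(visits, window=7, min_ratio=5.0):
    """Return the first day whose visits are at least min_ratio times the
    median of the preceding `window` days, or None if no day qualifies.
    The window and ratio are illustrative choices, not values from the talk."""
    for t in range(window, len(visits)):
        baseline = median(visits[t - window:t])
        if baseline > 0 and visits[t] >= min_ratio * baseline:
            return t
    return None


def causal_ctr(focal_visits, rec_clicks, t):
    """Wald-style ratio: extra recommendation clicks per extra focal visit
    on shock day t, measured against the previous day."""
    extra_visits = focal_visits[t] - focal_visits[t - 1]
    extra_clicks = rec_clicks[t] - rec_clicks[t - 1]
    return extra_clicks / extra_visits


# Made-up daily counts for one focal product and one product recommended on its page.
focal_visits = [100, 104, 98, 101, 99, 102, 100, 100, 1000, 300]
rec_clicks = [10, 11, 9, 10, 10, 10, 10, 10, 28, 14]

shock_day = find_shock(focal_visits)
observed = rec_clicks[shock_day - 1] / focal_visits[shock_day - 1]  # naive pre-shock CTR
causal = causal_ctr(focal_visits, rec_clicks, shock_day)
causal_share = causal / observed  # fraction of observed clicks attributable to recommendations

print(f"shock day: {shock_day}")
print(f"observed CTR: {observed:.3f}, causal CTR: {causal:.3f}")
print(f"share of click-throughs caused by recommendations: {causal_share:.0%}")
```

With these invented numbers the causal click-through rate comes out well below the observed one, which is the kind of gap the abstract describes. The figures here illustrate the arithmetic only and are not results from the talk.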

See more on this video at microsoft.com/en-us/research/v...
