Adversarial Benchmarks for Commonsense Reasoning

Microsoft Research

Published March 25, 2019, 19:46

Human intelligence involves comprehending new situations through a rich model of the world. Given a single image from a movie, or a paragraph from a novel, we can easily infer people’s intentions, mental states, and actions. However, enabling machines to perform this kind of commonsense reasoning remains elusive. Beyond the inherent difficulty of building models that reason, we lack robust benchmarks that evaluate AI reasoning ability.

In this talk, I will present two new large-scale benchmark datasets for commonsense reasoning, covering text (SWAG; rowanzellers.com/swag) and vision (VCR; visualcommonsense.com). These datasets pose new types of reasoning challenges: machines must abstract away from text and images, understand the entire situation, and then explain their predictions. Equally important is what these datasets don’t contain: they are adversarially constructed using a suite of new techniques so that they resist biases. In addition, I will introduce models for these datasets and discuss where the field might go next on the way to human-level commonsense reasoning.

See more at microsoft.com/en-us/research/v...
