Providing Richer Descriptions for Images

Microsoft Research330 тыс

Следующее

17.08.16 – 3031:24:14

Coping with Uncertain Data: Multi-Source Integration and Fuzzy Lookups

Популярные

160 дней – 6113:07:51

AI For All: Embracing Equity for All

295 дней – 1 54253:47

Effective Human-AI Decision-Making or Everyone: A Sisyphean Task?

Опубликовано 17 августа 2016, 1:11

Object recognition is now becoming a usable technology. When it is used in applications, fundamental questions arise: What should we recognize in images? What are the desirable outcomes of a recognition system? What should we say if we encounter an unfamiliar object? ... In this talk I focus on representational architectures that enable us to provide deeper and richer descriptions for images. These descriptions are in forms of properties of objects, their functions, and complex relationships between entities in images. I introduce visual attributes and show the benefits of adopting an attribute-centric framework in describing familiar and unfamiliar objects. I then explain a nonparametric approach that provides concise image descriptions in form of natural language sentences. This method uses the predictions of all objects, actions, and scenes to establish a scoring function between an image and a sentence. To enhance image descriptions, I introduce visual phrases; chunks of meanings bigger than objects and smaller than scenes. Further, I show how learning visual phrases directly helps recognition significantly. Finally, I explain a decoding algorithm that decides on the final outcome of our recognition system using predictions of objects and visual phrases.

Свежие видео