Providing Richer Descriptions for Images

Published August 17, 2016, 1:11
Object recognition is becoming a usable technology. Once it is used in applications, fundamental questions arise: What should we recognize in images? What are the desirable outcomes of a recognition system? What should we say when we encounter an unfamiliar object? ... In this talk I focus on representational architectures that enable deeper and richer descriptions of images. These descriptions take the form of object properties, object functions, and complex relationships between entities in images. I introduce visual attributes and show the benefits of adopting an attribute-centric framework for describing both familiar and unfamiliar objects. I then explain a nonparametric approach that provides concise image descriptions in the form of natural language sentences. This method uses the predictions of all objects, actions, and scenes to establish a scoring function between an image and a sentence. To enhance image descriptions, I introduce visual phrases: chunks of meaning bigger than objects and smaller than scenes. Further, I show how learning visual phrases directly helps recognition significantly. Finally, I explain a decoding algorithm that decides on the final outcome of our recognition system using the predictions of objects and visual phrases.
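As a rough illustration of the image-to-sentence scoring idea mentioned in the abstract, the sketch below ranks a pool of existing sentences against one image. It is not the speaker's method: the (object, action, scene) triplet representation of a sentence, the log-probability sum used as the score, and all names and numbers (`score`, `describe`, `image_preds`, `sentence_pool`) are assumptions made for this example only.

```python
import math

# Hypothetical floor so that labels missing from the predictions
# get a very low score instead of raising on log(0).
EPS = 1e-9


def score(image_preds, triplet):
    """Sum the log-probabilities the image predictions assign to a
    sentence's (object, action, scene) triplet. Assumed scoring rule."""
    obj, act, scn = triplet
    return (
        math.log(image_preds["objects"].get(obj, EPS))
        + math.log(image_preds["actions"].get(act, EPS))
        + math.log(image_preds["scenes"].get(scn, EPS))
    )


def describe(image_preds, sentence_pool):
    """Nonparametric description: return the pooled sentence whose
    triplet scores highest for this image."""
    best_sentence, _ = max(sentence_pool, key=lambda item: score(image_preds, item[1]))
    return best_sentence


# Made-up classifier outputs for a single image.
image_preds = {
    "objects": {"horse": 0.7, "dog": 0.2, "car": 0.1},
    "actions": {"ride": 0.6, "run": 0.3, "park": 0.1},
    "scenes": {"field": 0.8, "street": 0.2},
}

# Made-up pool of sentences, each paired with an assumed triplet.
sentence_pool = [
    ("A person rides a horse in a field.", ("horse", "ride", "field")),
    ("A dog runs down the street.", ("dog", "run", "street")),
    ("A car is parked on the street.", ("car", "park", "street")),
]

best = describe(image_preds, sentence_pool)
print(best)
```

The approach is nonparametric in the sense that no sentence is generated; the description is borrowed from whichever pooled sentence best agrees with the image's object, action, and scene predictions.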