Probabilistic Models for Parsing Images

180

Microsoft Research330 тыс

Следующее

06.09.16 – 1 7091:27:11

The Ellsberg Paradox and the Neural Foundations of Decision-Making under Uncertainty

Популярные

62 дня – 3961:44

AgriAdvisor Concept Video

148 дней – 545 0810:31

Join us for Research Forum on September 3, 2024

Опубликовано 6 сентября 2016, 6:15

A grand challenge of computer vision is to understand and parse natural images into boundaries, surfaces and objects. To solve this problem we would inevitably need to work with visual entities and cues of heterogeneous nature, such as brightness and texture at low-level, contour and region grouping at mid-level, and shape recognition at high-level. Learning to represent and incorporate these entities and cues, along with the complexity of the visual world itself, calls for probabilistic models for image parsing. Many previous efforts in this line suffer from issues such as lack of a compact representation, lack of scale invariance or lack of comprehensive experimentation. We describe a scale-invariant image representation using piecewise linear approximations of contours and the constrained Delaunay triangulation (CDT) for completing gradientless gaps. On top of the CDT graph we develop conditional random fields (CRF) for contour completion, figure/ground organization as well as object segmentation. Large datasets of human-annotated natural images are utilized for both training and evaluation. Our quantitative results are the first to demonstrate the working of mid-level visual cues in general natural scenes. The CDT/CRF framework enables efficient representation and inference of both bottom-up and top-down information, hence applicable to various vision problems. We extend our work to joint object recognition and segmentation, in particular finding people, in static images and video.

Свежие видео