Interleaved Visual Object Categorization and Segmentation
in Real-World Scenes
Bernt Schiele
Fachbereich Informatik, Multimodale Interaktive Systeme
TU Darmstadt
Friday, 9 July 2004, 11 a.m. c.t., T2-226
We present a method for object categorization in real-world scenes. Following a common consensus
in the field, we do not assume that a figure-ground segmentation is available prior to recognition.
However, in contrast to most standard approaches for object class recognition, our approach
effectively segments the object as a result of the categorization. This combination of recognition
and segmentation into one process is made possible by our use of an Implicit Shape Model, which
integrates both into a common probabilistic framework. In addition to the recognition and
segmentation result, the approach also generates a per-pixel confidence measure specifying the area that
supports a hypothesis and how much it can be trusted. We use this confidence to derive a natural
extension of the approach to handle multiple objects in a scene and resolve ambiguities between
overlapping hypotheses with an MDL-based criterion. We also present an extensive
evaluation of our method on a standard dataset for car detection and compare its performance to
existing methods from the literature. Our results show a significant improvement over previously
published methods. Finally, we present results for articulated objects, which show that the
proposed method can categorize and segment unfamiliar objects in different articulations and with
widely varying texture patterns. Moreover, it can cope with significant partial occlusion and
scale changes.
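
To illustrate how per-pixel confidences might be used to resolve overlapping hypotheses, here is a minimal sketch of a greedy selection in the spirit of an MDL-style criterion. It is not the method presented in the talk: the function name select_hypotheses, the model_cost penalty, the dictionary representation of confidence maps, and the toy data are all assumptions made for this example. A hypothesis is kept only if the confidence it adds on pixels not yet explained by earlier selections outweighs a fixed cost for adding another hypothesis.

```python
from typing import Dict, List, Set, Tuple

Pixel = Tuple[int, int]
ConfidenceMap = Dict[Pixel, float]  # hypothetical format: pixel -> confidence in [0, 1]


def select_hypotheses(hypotheses: List[ConfidenceMap],
                      model_cost: float = 1.0) -> List[int]:
    """Greedily select hypotheses by newly explained confidence minus a fixed cost.

    model_cost is an assumed penalty per accepted hypothesis; it stands in for
    the description-length cost of adding one more object to the scene.
    """
    selected: List[int] = []
    explained: Set[Pixel] = set()
    remaining = set(range(len(hypotheses)))

    def gain(i: int) -> float:
        # Only pixels not already claimed by a selected hypothesis count.
        new_support = sum(c for p, c in hypotheses[i].items() if p not in explained)
        return new_support - model_cost

    while remaining:
        # Ties are broken in favour of the lowest index.
        best = max(sorted(remaining), key=gain)
        if gain(best) <= 0:
            break  # no remaining hypothesis pays for its own cost
        selected.append(best)
        explained.update(hypotheses[best])
        remaining.remove(best)
    return selected


# Toy data (invented): A and B overlap heavily, C is a separate object.
hyp_a = {(0, c): 0.9 for c in range(4)}      # pixels (0,0)..(0,3)
hyp_b = {(0, c): 0.8 for c in range(2, 5)}   # pixels (0,2)..(0,4), overlaps A
hyp_c = {(2, c): 0.7 for c in range(3)}      # pixels (2,0)..(2,2)

result = select_hypotheses([hyp_a, hyp_b, hyp_c])
print(result)  # [0, 2]
```

In this toy run, A is accepted first, C second, and B is rejected because the single pixel it explains beyond A does not cover the per-hypothesis cost. A full MDL formulation would typically also account for pairwise interaction terms between hypotheses rather than a purely greedy pass.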