Compound Exemplar based Object Class Detection and Beyond with VARIS System
Thesis posted on 20.06.2014 by Kai Ma
Determining the location and scale of instances of a particular object class in a 2D image is usually referred to as object detection in computer vision. Object detection is a well-studied topic, and many successful algorithms have been proposed during the last two decades. However, recent experimental surveys reveal that state-of-the-art detection systems still perform poorly on images captured in unconstrained environments. The major causes are high intra-class variation and object self-occlusion. Here we present a novel exemplar-based object detection framework that outperforms state-of-the-art systems in terms of accuracy. The proposed method, Vector Array Recognition by Indexing and Sequencing (VARIS), is designed to fulfill two requirements in object detection: generalization and reliability. The core idea of VARIS is to dynamically assemble an object exemplar that maximizes the similarity to the input image. Experimental results show that VARIS achieves better results than its competitors even with a very compact training dataset. Meanwhile, a modified random forest module significantly increases computational speed, allowing the full system to run in real time on standard images. Beyond 2D object detection, I also explored the 3D computer vision domain. In cooperation with Siemens Corporate Research, I designed and implemented a novel framework, named the parametric deformable model (PDM), that estimates human body shape and pose simultaneously. PDM can recover the true human body pose and shape even when given a noisy and occluded 3D depth image as input. PDM enables many potential applications, such as better body joint estimation. Once the joint locations are determined, we can extend the 1D VARIS system to recognize human activity.