Abstract:
Object tracking is an essential task in application domains such as traffic
monitoring, automated surveillance, and robot navigation. Tracking is challenging
because of object fading, noise, background clutter, occlusion, and variations in the
appearance, orientation, scale, and velocity of the maneuvering object. The
work undertaken in this dissertation focuses mainly on developing a reliable
and robust tracking system that can track any object of interest in a video acquired
from a stationary or moving camera. A procedure for automatic target selection has
also been proposed and implemented; it selects the target from a single click on the
object of interest in a single frame. The proposed automatic target selection
algorithm is based on a segmentation technique and extracts a bounded object (hole)
around the point selected by the user. The steps involved in the algorithm are
smoothing, edge enhancement, and morphological operations. The proposed visual
tracking system has been implemented in the RGB color space. To reduce the
computational complexity of the system, an RGB-histogram-based feature set is
extracted. The Bhattacharyya coefficient is used as the similarity measure between the
template and each template-sized section of the search window. The varying appearance of the
object and short-term neighboring clutter are addressed by a dynamic template
updating scheme. This scheme updates the feature set values of the template iteratively
by taking a weighted sum of the template and the current best match. It
minimizes template drift and copes with transient occlusion and
background clutter. The search for the target is carried out in a dynamically generated
resizable search window instead of the whole frame, to reduce computational complexity
and false positives. The size of the search window is determined from the change
in position in the previous iteration. To cope with target scaling, a
scale adaptation algorithm has been implemented in which the size of the section at
the best-match location is varied and the RGB-histogram-based features of each resized
section are compared with those of the target. The proposed
system was tested on videos of diverse objects exhibiting the above-mentioned issues. The
results show that the proposed system can handle most of these issues.
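
The histogram matching and template update summarized above can be illustrated
with a minimal sketch. It assumes the feature set is a concatenated per-channel
RGB histogram; the abstract does not spell out the exact features, so this is an
assumption. The function names, the bin count of 8, and the update weight alpha
are likewise hypothetical choices made for illustration, not the author's
implementation.

```python
import math


def rgb_histogram(pixels, bins=8):
    """Normalized, concatenated per-channel histogram of (r, g, b) pixels (0-255).

    The layout (R bins, then G bins, then B bins) is an illustrative assumption.
    """
    hist = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel * bins + value * bins // 256] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def bhattacharyya_coefficient(p, q):
    """Similarity of two normalized histograms: 1.0 if identical, 0.0 if disjoint."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


def update_template(template, best_match, alpha=0.1):
    """Weighted sum of the template and the best-match features.

    alpha is a hypothetical update weight; a small value changes the template slowly.
    """
    return [(1.0 - alpha) * t + alpha * b for t, b in zip(template, best_match)]
```

In a tracker built along these lines, each template-sized section of the search
window would be scored with bhattacharyya_coefficient against the template, and
the highest-scoring section would then be passed to update_template. A small
alpha keeps the template stable under transient occlusion, at the cost of slower
adaptation to genuine appearance changes.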