Abstract:
This thesis presents a methodology for designing an embedded vision system that detects
upright humans in images and in video feeds from a camera. Our goal is to develop an
algorithm that extracts high-dimensional feature vectors from encoded image regions, which
can then substantiate object/non-object decisions. A simple learning framework is used to differentiate
between an object and a non-object in an image region, using a support vector machine (SVM)
as the classifier. A combination of bottom-up and top-down approaches
is used to design the system so that the algorithm draws on the benefits of
both approaches. A detailed study of existing state-of-the-art object detection
algorithms is carried out. These algorithms perform well on general-purpose computers with
large memory and adequate computational power, but they fail to perform well when implemented
on single-board computing devices with limited computational power
and memory resources. The algorithm is based on HOG (Histogram of Oriented Gradients)
features presented by N. Dalal and B. Triggs, optimized to run efficiently
on embedded hardware so that results can be obtained in real time. A dense grid
is overlaid on the image, and histograms of oriented gradients are computed over it
at a fixed resolution, resulting in a high-dimensional feature vector. The HOG
descriptors are robust to significant variations in illumination and colour, and to minute
variations in image contour locations and directions. ACF (Aggregated Channel Features),
HOG features from the literature, and our implementation of HOG are evaluated on the INRIA
Person Dataset, on an Intel Core i7 processor with 8 GB RAM and on a Raspberry Pi B+.
Almost all performed equally well on the dataset, but each has its own weaknesses. For
example, ACF performs well on the desktop machine, where it is relatively fast, but on the
single-board computer it is slower than HOG, which indicates that ACF is computationally
more expensive than HOG. The algorithm developed is first implemented in MATLAB; a Python
implementation is then completed on the desktop machine, which is then optimized and implemented