Abstract:
Computer vision techniques give machines and robots the ability to perceive and become
aware of their surroundings. Computer vision is becoming a vital area of research due to
its many applications in robotics and intelligent machines. Advanced sensors combined
with ML- and AI-based algorithms are being
used to give machines the ability to detect objects and manipulate their
surroundings. Microsoft Kinect is a low-cost sensor that provides both RGB
and depth data. This research explores two problems: human detection in 2D RGB frames
and the extraction of geometric objects from 3D pointcloud data. An
ML-based detection model is built to detect humans in pose-variant scenarios.
The detection model consists of two phases: training and testing.
Training uses a training set containing RGB images of humans in different
poses, scenarios and viewpoints. Feature extraction uses a novel approach
that combines HoG and LBP features into a single feature set, which
is then used to train an SVM-based classifier. Testing is done by detecting humans in
RGB frames of the test dataset: features are extracted as in the training phase, and the
trained model is used to detect humans in the frames. Depth data from the Kinect is used
to form pointclouds onto which the RGB stream is mapped. The resulting RGB pointclouds are
used to extract geometrically shaped objects using M-SAC, a variant of RANSAC