Abstract:
Crowd counting is a growing research area in machine learning and computer vision, driven mainly by the need for improved surveillance and for effective traffic and crowd management. In the past, many researchers have used standard regression-based density map techniques to build accurate crowd counting models. These models fail to learn the variability in the sizes and poses of people in a crowd and instead assume an estimate of shape and size, which reduces the accuracy of the prediction. Object detection algorithms are also used for counting; however, overlapping objects degrade their results. In this thesis, a state-of-the-art technique is proposed that uses both a classification approach and an object detection approach for crowd counting, choosing between them with a 'model decider' that learns the degree of overlap in an image. The technique achieves top results on counting datasets and better performance on individual images. Furthermore, the method treats each image in a dataset individually according to its overlap intensity, which ultimately increases the accuracy of the crowd count for that image. The results show that the proposed model decider gives the most accurate results on the ShanghaiTech, MTC and UCF_CC_50 datasets, with an improvement of more than 12% relative to earlier crowd counting approaches. The detection method uses weak supervision and semantic segmentation, which decreases memory consumption, works on every image size, and determines the exact location of humans in non-overlapping crowd images. The classification technique, on the other hand, uses a feature map that is divided recursively by a near-accurate division decider and then classifies the count, giving state-of-the-art results for overlapping crowds. The two techniques give different results on different images, depending on crowd density.
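
The per-image routing described above can be illustrated with a minimal sketch. It is not the thesis implementation: the class name `ModelDecider`, the callables `overlap_score`, `detection_counter` and `classification_counter`, and the fixed `threshold` are all hypothetical, and the fixed threshold only stands in for whatever decision rule the learned decider applies.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

# Placeholder image type for illustration: a 2-D grid of floats.
Image = Sequence[Sequence[float]]


@dataclass
class ModelDecider:
    """Hypothetical router that picks a counting branch per image."""

    overlap_score: Callable[[Image], float]        # assumed: learned overlap estimate
    detection_counter: Callable[[Image], int]      # assumed: detection-based count
    classification_counter: Callable[[Image], int] # assumed: classification-based count
    threshold: float = 0.5                         # assumed cut-off, not from the thesis

    def count(self, image: Image) -> Tuple[str, int]:
        """Return the chosen branch name and the count it produced."""
        score = self.overlap_score(image)
        if score >= self.threshold:
            # Heavily overlapping crowd: use the classification branch.
            return "classification", self.classification_counter(image)
        # Little overlap: use the detection branch.
        return "detection", self.detection_counter(image)


def mean_intensity(image: Image) -> float:
    """Stub overlap score for the demo: the mean pixel value."""
    total = sum(sum(row) for row in image)
    return total / (len(image) * len(image[0]))


# Stub counters that return fixed values, used only to show the routing.
decider = ModelDecider(
    overlap_score=mean_intensity,
    detection_counter=lambda image: 3,
    classification_counter=lambda image: 40,
)

sparse_image = [[0.0, 0.1], [0.2, 0.1]]
dense_image = [[0.9, 0.8], [0.7, 0.9]]
print(decider.count(sparse_image))
print(decider.count(dense_image))
```

In this sketch each image is scored once and sent to exactly one branch, which mirrors the idea of treating every image individually according to its overlap intensity.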