Abstract:
People perceive a massive amount of information about a scene within just a few seconds of
observing it, including objects, surfaces, object interactions and much more. They can also
reason about their surroundings, the correlations among the components of the scene, the
saliency of those components and further semantic knowledge, which gives humans the
ability to make sense of the world. Empowering machines with such an ability can improve
people's quality of life. Although humans understand and accurately recognize real-world
scenes at a glance, classifying scenes automatically is not an easy task for computers
because of the variability, ambiguity, and diverse illumination and scale conditions that a
natural real-world scene may possess. Scene classification is a fundamental problem in
computer vision and provides contextual information to guide other processes, such as
browsing, content-based image retrieval and object recognition. This problem has been
widely explored, but limited literature can be found on the large-scale Places dataset, which
was developed primarily to improve scene recognition. In this research work, this dataset is
used for classifying scene categories. A baseline model based on the traditional bag-of-words
model is used. The proposed approach is based on a novel idea of using a fine-to-coarse category
mapping for scene categories. Information fetched from the mapping is combined with
the fusion of feature descriptors resulting in a single feature representation. This extra
information enhances performance by exploiting the hierarchical relationships among the categories.
The effectiveness of the proposed approach is validated using the evaluation metrics considered
in this work. The proposed model performs considerably better than the given baseline
as well as several state-of-the-art methods. In addition, a suitable trade-off between speed and
accuracy is considered: once the coarse class of a test scene image is selected, the image is
classified quickly, because the number of comparisons is reduced to the categories within that
coarse category rather than the whole category set.
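
The coarse-then-fine lookup described above can be illustrated with a minimal sketch. The abstract does not state which classifier is used or how the coarse class is selected, so everything here is an assumption made for illustration: a nearest-centroid rule, coarse centroids computed as the mean of their member centroids, toy two-dimensional features, and hypothetical category names and helper functions.

```python
import math

# Hypothetical fine-to-coarse mapping; the category names are illustrative only.
FINE_TO_COARSE = {
    "kitchen": "indoor", "bedroom": "indoor", "office": "indoor",
    "forest": "outdoor", "beach": "outdoor", "mountain": "outdoor",
}

# Toy 2-D centroids standing in for real fused feature descriptors (assumption).
FINE_CENTROIDS = {
    "kitchen": [0.0, 0.0], "bedroom": [0.0, 1.0], "office": [1.0, 0.0],
    "forest": [5.0, 5.0], "beach": [5.0, 6.0], "mountain": [6.0, 5.0],
}


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def nearest(feature, centroids):
    """Label of the centroid closest to `feature` in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(feature, centroids[label]))


def classify(feature, fine_centroids, fine_to_coarse):
    """Pick a coarse class first, then compare only against its fine categories."""
    # Group fine categories under their coarse parent.
    members = {}
    for fine, coarse in fine_to_coarse.items():
        members.setdefault(coarse, []).append(fine)

    # Assumption: a coarse centroid is the mean of its members' centroids.
    coarse_centroids = {
        coarse: mean_vector([fine_centroids[f] for f in fines])
        for coarse, fines in members.items()
    }

    coarse = nearest(feature, coarse_centroids)
    candidates = {f: fine_centroids[f] for f in members[coarse]}
    fine = nearest(feature, candidates)

    # Distance computations performed: all coarse classes plus one coarse group.
    comparisons = len(coarse_centroids) + len(candidates)
    return coarse, fine, comparisons


coarse, fine, comparisons = classify([4.8, 5.9], FINE_CENTROIDS, FINE_TO_COARSE)
print(coarse, fine, comparisons, "of", len(FINE_CENTROIDS), "flat comparisons")
```

With only six toy categories the saving is small (five comparisons instead of six), but it grows with the number of fine categories per coarse class. In a two-stage scheme like this sketch, a wrong coarse decision cannot be corrected at the fine stage, which is one way the speed gain can trade against accuracy.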