Abstract:
Socioeconomic classification using satellite imagery is an emerging way of monitoring
the social classes that make up a heterogeneous urban society. Because it has proved
effective in addressing numerous social challenges, developing countries need
such low-overhead mechanisms to keep a close check on the growth of these social
groups. This research attempts this challenging task using machine
learning techniques, with Karachi, the economic hub of Pakistan, as the study area.
The study addresses the unavailability of very high resolution (VHR) satellite
data by using Google basemaps. Ground truth data were acquired from one of the
leading, legitimate real estate web portals.
The processed VHR raster data of Karachi are classified into three major social classes,
i.e., lower, middle, and upper class, using a grid-based approach with a random forest
classifier in a semi-supervised classification scheme. According to the findings,
citizens belonging to the middle, lower, and upper classes make up 10%, 11%, and 8%
of the populated areas of Karachi, respectively. Classification results demonstrate
that the lower class is concentrated in the West and Malir districts, while the upper
and middle classes predominate in the South and Central districts of Karachi. However,
all three socioeconomic strata contribute significantly to the population of the East
district. The results obtained
show an overall accuracy of 78.2%, which is higher than that reported for similar
earlier work. The study paves the way for automated socioeconomic classification,
opening doors for further experimentation in this domain.
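
As a rough illustration of how an overall accuracy figure and per-class shares can be derived from grid-cell labels, the following minimal sketch may help. It assumes each grid cell carries one reference class and one predicted class. The function names (`overall_accuracy`, `class_shares`) and the ten sample labels are hypothetical and purely illustrative; they are not the study's data or code.

```python
from collections import Counter

CLASSES = ("lower", "middle", "upper")


def overall_accuracy(truth, predicted):
    """Fraction of grid cells whose predicted class matches the reference class."""
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    if not truth:
        raise ValueError("at least one labeled cell is required")
    correct = sum(t == p for t, p in zip(truth, predicted))
    return correct / len(truth)


def class_shares(predicted):
    """Share of grid cells assigned to each socioeconomic class."""
    if not predicted:
        raise ValueError("at least one predicted cell is required")
    counts = Counter(predicted)
    return {c: counts[c] / len(predicted) for c in CLASSES}


# Hypothetical labels for ten grid cells (illustrative only, not the study's data).
truth = ["lower", "lower", "middle", "middle", "upper",
         "upper", "lower", "middle", "upper", "lower"]
predicted = ["lower", "middle", "middle", "middle", "upper",
             "lower", "lower", "middle", "upper", "lower"]

accuracy = overall_accuracy(truth, predicted)  # 8 of 10 cells agree
shares = class_shares(predicted)
print(f"overall accuracy: {accuracy:.1%}")
print({c: f"{s:.0%}" for c, s in shares.items()})
```

In this sketch, overall accuracy is simply the number of correctly classified cells divided by the number of cells with reference labels, and each class share is the number of cells assigned to that class divided by the number of classified cells.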