dc.description.abstract |
Crimes have both short-term and long-term effects on individuals and on society as a
whole. Modern law enforcement agencies use data analytics and
machine learning algorithms for predictive policing in order to prevent crimes. Using
historical data, data analysts can find key factors that contribute to crimes and predict the
occurrence of a particular crime type. This problem is commonly referred to as the crime
classification problem. Given the increasing need for more accurate systems, this thesis
presents an improved use of machine learning algorithms to predict crime types and
analyzes their performance given a common set of features. The dataset used for this work is
the Chicago Crime Dataset, which is available on the City of Chicago Data Portal. Related research
in the past has used data from multiple domains, such as geography, socioeconomics, and education.
These works handled data from multiple domains but had difficulty retrieving nonlinear relationships
and distinguishing between the multiple values of the feature set. In this research work we focused specifically on
extracting features with the help of visualization and clustering, guided by the results of
multiple iterations of running the algorithms. This study also captures the importance of
crime prediction systems in society, presents a literature survey of the systems in place,
and provides a methodology to predict crimes in advance using machine learning
algorithms. The results also show that the dataset suffers from a high number of distinct
values for features, class imbalance, and a lack of independent features. Among the three
algorithms, Naïve Bayes, Decision Tree, and Random Forest, Random Forest performed
the best, with 54% accuracy. Future studies would benefit from using a Random Forest classifier,
extracting more independent features from the data, and sampling the data in order to reduce
the class imbalance problem. This study also highlights the importance of crime reporting
in Pakistan for building an accurate crime prediction system. |
en_US |