NUST Institutional Repository

Crime Prediction Using Machine Learning and Data Analytics

dc.contributor.author Haider, Muhammad Naqi
dc.date.accessioned 2023-07-26T14:44:52Z
dc.date.available 2023-07-26T14:44:52Z
dc.date.issued 2020
dc.identifier.other 171469
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/35201
dc.description Supervisor: Dr. Rafia Mumtaz en_US
dc.description.abstract Crimes have both short-term and long-term effects on individuals and on society as a whole. Modern law enforcement agencies use data analytics and machine learning algorithms for predictive policing in order to prevent crimes. Using historical data, data analysts can find key factors that contribute to crimes and predict the occurrence of a particular crime type. This problem is commonly referred to as the crime classification problem. Given the increasing need for more accurate systems, this thesis presents an improved use of machine learning algorithms to predict crime types and analyzes their performance on a common set of features. The dataset used for this work is the Chicago Crime Dataset, which is available on the City of Chicago Data Portal. Related research has used data from multiple domains, such as geography, socio-economics, and education. These works handled data from multiple domains but had difficulty capturing nonlinear relationships and distinguishing among the many values of the feature set. In this research we focused specifically on extracting features with the help of visualization and clustering, guided by the results of multiple iterations of running the algorithms. This study also captures the importance of crime prediction systems in society, presents a literature survey of the systems in place, and provides a methodology to predict crimes in advance using machine learning algorithms. The results show that the dataset suffers from a high number of distinct values per feature, class imbalance, and a lack of independent features. Among the three algorithms evaluated (Naïve Bayes, Decision Tree, and Random Forest), Random Forest performed best, with 54% accuracy. For future studies, we recommend using the Random Forest classifier, extracting more independent features from the data, and sampling the data to reduce the class imbalance problem.
This study also highlights the importance of crime reporting in Pakistan for building an accurate crime prediction system. en_US
dc.language.iso en en_US
dc.publisher School of Electrical Engineering and Computer Science (SEECS), NUST en_US
dc.title Crime Prediction Using Machine Learning and Data Analytics en_US
dc.type Thesis en_US
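
The abstract recommends sampling the data to reduce class imbalance. As a rough illustration only, the sketch below shows one common approach, random undersampling, in which every class is reduced to the size of the rarest class. It is not the thesis's own procedure: the function name `undersample`, the toy crime-type labels, and the `hour` and `district` fields are all hypothetical and chosen for the example.

```python
import random
from collections import Counter, defaultdict


def undersample(rows, labels, seed=0):
    """Randomly downsample every class to the size of the rarest class.

    Illustrative helper (hypothetical, not taken from the thesis).
    """
    rng = random.Random(seed)

    # Group the feature rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    # The rarest class sets the per-class sample size.
    target = min(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Sample without replacement so no record is duplicated.
        for row in rng.sample(members, target):
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy, made-up records: three crime types with unequal counts (6, 3, 2).
labels = ["THEFT"] * 6 + ["BATTERY"] * 3 + ["ARSON"] * 2
rows = [{"hour": i % 24, "district": i % 5} for i in range(len(labels))]

balanced_rows, balanced_labels = undersample(rows, labels)
counts = Counter(balanced_labels)
print(counts)
```

Undersampling discards records from the larger classes, so it trades data volume for balance; oversampling the rare classes is the usual alternative when the dataset is small.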


This item appears in the following Collection(s)

  • MS [375]
