Abstract:
This thesis addresses the problem of predicting crime and crime types. Predicting crime is challenging because criminals do not follow predefined patterns. They look for places where the chance of being caught is low, and they are aware of areas with less police patrolling. Identifying potential criminal activity can therefore help police and other law enforcement agencies anticipate crime. Because most crime data includes a crime type attribute, we use supervised learning algorithms to predict crime. We also use spatial and temporal information to identify crime types. Neighborhood information is used to identify the race of the majority of the population in a given locality, and census information is incorporated into the crime data by adding attributes such as literacy rate and average income. These additions put us in a better position to predict crime and crime types. We also identify crime hotspots. The study can help law enforcement agencies plan patrols based on past crimes: police patrolling can be increased or decreased according to predefined crime rates in particular localities. It may also help police decide whether they need to hire more personnel or lay some off. We use Naïve Bayes, Decision Trees, Artificial Neural Networks, Support Vector Machines and Ensemble methods to predict crime. Testing uses 10-fold cross-validation, and the results show that Ensemble methods give the best predictions, with an accuracy of over 80%. In future work, we would like to add more features, including employment history and previous criminal records.
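
As a rough illustration of the 10-fold cross-validation protocol mentioned above, the following Python sketch splits a dataset into ten folds, trains on nine, and scores on the held-out fold. The helper names (`k_fold_indices`, `fit_majority`, `cross_validate`) and the majority-class placeholder model are assumptions made for illustration only; they are not the implementation used in this thesis, and any of the classifiers named above could be supplied as `fit` instead.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds.

    Assumes n_samples >= k so that no fold is empty.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index starting at position i,
    # so fold sizes differ by at most one.
    return [indices[i::k] for i in range(k)]


def fit_majority(train_features, train_labels):
    """Hypothetical placeholder model: always predicts the most common
    training label. A real classifier would be substituted here."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return lambda feature_row: most_common


def cross_validate(features, labels, fit, k=10, seed=0):
    """Return the mean held-out accuracy over k folds."""
    accuracies = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held_out_set = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held_out_set]
        # Train only on the k-1 folds that are not held out.
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        # Score only on the held-out fold.
        correct = sum(model(features[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


if __name__ == "__main__":
    # Made-up toy data: (hour of day, district id) -> crime type.
    toy_features = [(i % 24, i % 5) for i in range(100)]
    toy_labels = ["theft"] * 70 + ["assault"] * 30
    print(cross_validate(toy_features, toy_labels, fit_majority, k=10))
```

Because every record is held out exactly once, the reported accuracy is an average over predictions made on data the model did not see during training.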