ganesh292/TorontoCrimeDataAnalysis

Toronto Crime Data Analysis using Unsupervised Learning

Index Terms—Criminal Activities, Data Analysis, Knowledge Discovery, Clustering, K-Means, Agglomerative, DBSCAN.

Public safety is a pressing concern in large cities like Toronto. Law enforcement agencies there face the uphill task of identifying criminal activity, and considerable time and resources are spent on surveillance, investigations and manhunts to locate crime hot spots. Recently, techniques such as data analysis and knowledge discovery have played a major role in extracting unknown patterns and understanding hidden relationships in data across many applications. As the crime dataset grows every year, processing it becomes essential for extracting meaningful information.

Cluster analysis is a method of grouping a large pool of data items into smaller groups that share similar properties. This paper aims to apply different clustering techniques, namely K-Means, Agglomerative and DBSCAN, to Toronto’s Major Crime Indicator (MCI) Dataset and identify violent and non-violent neighbourhoods in the city of Toronto. It also performs data analysis to find out which crimes occur at what time of day and the geographic locations associated with them.
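As a rough illustration of the three techniques named above, the following sketch runs K-Means, Agglomerative and DBSCAN clustering with scikit-learn. It is not the project's code: the data is synthetic (generated with `make_blobs` as a stand-in for per-neighbourhood crime features), and the cluster count, linkage, `eps` and `min_samples` values are assumptions chosen only to make the example run.

```python
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
from sklearn.datasets import make_blobs
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for per-neighbourhood crime features (NOT the MCI data).
X, _ = make_blobs(n_samples=140, centers=3, n_features=5, random_state=0)
X = StandardScaler().fit_transform(X)  # put all features on a comparable scale

# Hyperparameters below are illustrative assumptions, not tuned values.
models = {
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0),
    "agglomerative": AgglomerativeClustering(n_clusters=3, linkage="ward"),
    "dbscan": DBSCAN(eps=0.8, min_samples=5),
}

# fit_predict returns one cluster label per row; DBSCAN uses -1 for noise.
labels = {name: model.fit_predict(X) for name, model in models.items()}

for name, lab in labels.items():
    n_clusters = len(set(lab) - {-1})
    n_noise = int(np.sum(lab == -1))
    print(f"{name}: {n_clusters} clusters, {n_noise} noise points")
```

K-Means and Agglomerative clustering need the number of clusters up front, whereas DBSCAN infers it from point density and may mark some points as noise, so its output depends heavily on `eps` and `min_samples`.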

Methodology

The study aims to identify violent and non-violent neighbourhoods in the city of Toronto while providing a clearer visualization for the public. It attempts to model the relationship between criminal patterns and the behaviour and degree of crime. The project clusters crime-prone areas with respect to the different major crimes that have occurred in the past.
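One possible way to prepare data for clustering areas by crime type is to count incidents per neighbourhood and category. The sketch below does this with `pandas.crosstab` on a few hypothetical rows; the column names `Neighbourhood` and `MCI` and the sample values are assumptions for illustration and are not taken from the Toronto Police export.

```python
import pandas as pd

# Hypothetical incident records; column names and values are assumptions.
records = pd.DataFrame({
    "Neighbourhood": ["A", "A", "A", "B", "B", "C"],
    "MCI": ["Assault", "Assault", "Auto Theft",
            "Break and Enter", "Assault", "Robbery"],
})

# One row per neighbourhood, one column per crime category, cells = counts.
counts = pd.crosstab(records["Neighbourhood"], records["MCI"])

# Row-normalise so each row gives the share of each crime type in that area.
profile = counts.div(counts.sum(axis=1), axis=0)

print(counts)
print(profile.round(2))
```

Either table could serve as the feature matrix for a clustering algorithm: raw counts emphasise overall crime volume, while the normalised profile emphasises the mix of crime types.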

The major challenge is to understand the varied data available from the Toronto Police public portal and to apply different pattern recognition techniques to produce a better crime heat map. Data analysis is performed to find the temporal and spatial distribution of crimes over the day. This analysis uses different clustering techniques, and the results of each clustering algorithm are compared using several internal validation measures. Finally, the clustering results are plotted over a map of Toronto. The plot presents the types of crimes that occur at various times of the day and week, by geographic location.
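The text does not say which internal validation measures were used. As a hedged example, the sketch below compares two clusterings using three measures available in scikit-learn: the silhouette coefficient, the Davies-Bouldin index and the Calinski-Harabasz index. The data is synthetic and the choice of measures is an assumption.

```python
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)

# Synthetic stand-in data (NOT the MCI dataset).
X, _ = make_blobs(n_samples=140, centers=3, n_features=5, random_state=0)

# Cluster labels from two algorithms, using an assumed cluster count of 3.
candidates = {
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X),
    "agglomerative": AgglomerativeClustering(n_clusters=3).fit_predict(X),
}

# Internal measures need only the data and the labels, no ground truth.
scores = {}
for name, lab in candidates.items():
    scores[name] = {
        "silhouette": silhouette_score(X, lab),                # higher is better
        "davies_bouldin": davies_bouldin_score(X, lab),        # lower is better
        "calinski_harabasz": calinski_harabasz_score(X, lab),  # higher is better
    }

for name, s in scores.items():
    print(name, {k: round(float(v), 3) for k, v in s.items()})
```

These measures require at least two clusters. For DBSCAN output, noise points labelled -1 would normally be excluded or handled explicitly before scoring.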
