Exploratory Data Analysis

Exploratory Data Analysis (EDA) is an essential step in analyzing data. Its primary goals are to summarize the dataset and to understand the relationships, patterns, and distributions within it. EDA also helps identify problems in the data that could affect the analysis and the interpretation of its findings. Analysts use a variety of statistical and visual tools for this, such as descriptive statistics, frequency distributions, box plots, and histograms. The process is iterative: the data is visualized and summarized over multiple rounds to uncover relationships, patterns, or outliers that could impact the analysis.

EDA also guides the selection of statistical models and methods for subsequent analysis. For instance, a quick look at the distribution of a variable can reveal that the data needs to be transformed or that a different statistical approach should be used. EDA also helps validate assumptions and hypotheses about the data, such as the normality of a distribution, which matters when choosing an appropriate statistical model. In short, EDA provides a thorough understanding of the data and guides the analysis so that decisions are based on correct and relevant information.
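
As a rough illustration of how a look at the distribution can suggest a transformation, the sketch below computes a simple skewness measure before and after a log transform. The `daily_counts` values and the `skewness` helper are made up for this example and are not taken from the crime dataset.

```python
import math

def skewness(values):
    """Population skewness: third central moment over std dev cubed."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical daily counts with one extreme day (illustrative only).
daily_counts = [1, 2, 2, 3, 3, 3, 4, 4, 5, 40]
log_counts = [math.log(v) for v in daily_counts]

raw_skew = skewness(daily_counts)
log_skew = skewness(log_counts)
print(f"skewness raw={raw_skew:.2f}, log-transformed={log_skew:.2f}")
```

A strongly positive skewness that shrinks after the log transform is one hint that a transformed variable may suit models that assume roughly symmetric data.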


The Historical Crime Dataset has both temporal and spatial properties in its records, so this page is organized by the plotting style used. Temporal plots show changes over time, and spatial plots show the geography and magnitude of occurrences.

General Aggregation Charts


Most recorded Crime Types

The chart above shows that Theft is the most reported type of crime in Chicago, followed by Battery. The yearly breakdown shows a similar pattern, with both types leading the number of reports.
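
The ranking behind a chart like this can be sketched with a simple count per crime type, overall and per year. The records below are hypothetical and only illustrate the aggregation; the field layout (year, crime type) is an assumption about how the data might be arranged.

```python
from collections import Counter

# Hypothetical records: (year, primary crime type). Values are illustrative.
records = [
    (2020, "THEFT"), (2020, "BATTERY"), (2020, "THEFT"),
    (2021, "THEFT"), (2021, "BATTERY"), (2021, "NARCOTICS"),
    (2021, "THEFT"), (2021, "BATTERY"),
]

# Overall ranking of crime types.
overall = Counter(crime_type for _, crime_type in records)

# Ranking within each year.
by_year = {}
for year, crime_type in records:
    by_year.setdefault(year, Counter())[crime_type] += 1

print(overall.most_common(2))
for year in sorted(by_year):
    print(year, by_year[year].most_common(2))
```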

Most Recorded Crime Types aggregated by Year.

District-wise Crimes Recorded in conjunction with Arrests Made.

The stacked bar graph shows the number of crimes reported and arrests made in each district. District 11 has the highest number of crimes reported, but a significantly lower number of arrests made compared to the other districts. This could have a variety of causes, such as a lack of resources or an inefficient law enforcement strategy in District 11. By contrast, Districts 1, 2, and 18 have a higher number of arrests made relative to the number of crimes reported, which may indicate a more effective law enforcement strategy in these areas.

However, the trend of a lower number of arrests made compared to the number of crimes reported is common in all districts. This suggests that there may be several factors that contribute to the lower arrest rate, including the complexity of investigations, the number of officers available to handle cases, and the quality of evidence available. It is also possible that some cases may not result in arrests due to lack of evidence, witness cooperation, or other reasons.
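
Comparing districts by arrests relative to reports, as done above, amounts to computing an arrest rate per district. The following is a minimal sketch under the assumption that each record carries a district number and an arrest flag; the records are invented for illustration.

```python
from collections import defaultdict

# Hypothetical records: (district, arrest_made). Values are illustrative.
records = [
    (11, False), (11, False), (11, True), (11, False),
    (1, True), (1, False),
    (18, True), (18, True), (18, False),
]

def arrest_rates(rows):
    """Return {district: (reports, arrests, arrests / reports)}."""
    reports = defaultdict(int)
    arrests = defaultdict(int)
    for district, arrested in rows:
        reports[district] += 1
        if arrested:
            arrests[district] += 1
    return {d: (reports[d], arrests[d], arrests[d] / reports[d]) for d in reports}

rates = arrest_rates(records)
for district, (n, a, rate) in sorted(rates.items()):
    print(f"District {district}: {n} reports, {a} arrests, rate {rate:.0%}")
```

Looking at the rate rather than the raw arrest count keeps districts with very different report volumes comparable.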

In order to improve the situation and reduce the number of crimes, it is essential that law enforcement agencies in Chicago analyze the data and develop targeted strategies to address the problem areas. For instance, they may increase patrols, allocate more resources, or implement new technologies to improve crime prevention and detection. Additionally, collaboration with the community, increased public awareness campaigns, and community policing initiatives can also help to reduce crime rates and improve public safety.


Density of crime by Community Area and District.

Hierarchical arrangement of Community Areas and the distribution of crime reports.

The tree plot is arranged in a hierarchical manner, with the crime types nested within the community area name. This allows the viewer to see the distribution of different crime types within a given community area, as well as how the crime patterns in that area compare to other areas in the city. In the tree plot, the size of the boxes representing the different crime types can also indicate the frequency of those crimes. Larger boxes indicate a higher frequency of crime, while smaller boxes indicate a lower frequency.
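
The nesting described above can be sketched as a two-level count, with each box sized by its share of all reports. The area and crime type pairs below are hypothetical counts used only to show the structure, and `box_share` is an illustrative helper name.

```python
from collections import Counter, defaultdict

# Hypothetical records: (community area, crime type). Illustrative only.
records = [
    ("Austin", "THEFT"), ("Austin", "BATTERY"), ("Austin", "THEFT"),
    ("Loop", "THEFT"), ("Loop", "THEFT"), ("Loop", "THEFT"),
    ("Loop", "DECEPTIVE PRACTICE"),
]

# Two-level hierarchy: community area -> crime type -> count.
tree = defaultdict(Counter)
for area, crime_type in records:
    tree[area][crime_type] += 1

total = len(records)

def box_share(area, crime_type):
    """Fraction of the whole plot area a leaf box would occupy."""
    return tree[area][crime_type] / total

for area, counts in tree.items():
    for crime_type, count in counts.most_common():
        print(f"{area} / {crime_type}: {count} ({box_share(area, crime_type):.0%})")
```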


Monthly Aggregation of Crimes reported in the Year of 2021

A violin plot is a type of chart that shows the distribution of a variable, here drawn once for each month. In this case, the variable is the number of crimes, and the x-axis represents the 12 months of the year. The y-axis represents the number of crimes. The violins for each month appear to have significant outliers, meaning that unusual numbers of crimes were reported on a few days of the month. These show up as points outside the main body of each violin. One possible reason could be an increase in criminal activity on those days, due to factors such as a temporary increase in gang activity, a special event that attracts criminals, or a temporary reduction in law enforcement presence in the area. Another possibility is that the outliers are due to a reporting error, such as a misclassification of the crime or an error in the data collection process.
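
One common way to flag such unusual days numerically is the 1.5 × IQR rule used for box plot whiskers. The sketch below applies it to one month of daily counts. The rule is an assumption of this example, not necessarily how the chart marks its points, and the counts are hypothetical.

```python
import statistics

def iqr_outliers(daily_counts):
    """Return values beyond 1.5 * IQR from the quartiles (Tukey's rule)."""
    q1, _, q3 = statistics.quantiles(daily_counts, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [c for c in daily_counts if c < low or c > high]

# Hypothetical daily counts for one month (illustrative only).
january = [600, 610, 620, 630, 640, 650, 660, 670, 680, 1200]
print(iqr_outliers(january))
```

Days flagged this way are candidates for a closer look, for example checking them against known events or data collection issues.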


Spatial Analysis 

Map of Chicago and Community Areas

Crime Records per District Region

Crime Records per Community Area

The two maps are spatial heatmaps that depict the distribution of crime in Chicago. They use a color scale to indicate the density of crime in different areas, with red areas indicating higher density and blue areas indicating lower density.

In both maps, the South Side of Chicago stands out as a red, high-crime area. This is in contrast to other areas of the city, which are mostly blue or green and have lower crime rates. The high crime rate on the South Side may be related to a number of factors, including poverty, gang activity, and limited access to resources and opportunities.

The spatial heatmaps show that crime is not evenly distributed throughout the city, but is concentrated in certain areas. By using heatmaps, the maps allow us to see patterns in crime distribution that would be difficult to discern from raw crime data alone.
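
The color coding of such a map can be sketched as a linear mapping from each area's record count to a point on a color scale. The example below uses a simplified two-color blue-to-red scale and invented area names and counts; the actual maps may use a different palette or binning.

```python
def to_color(count, lowest, highest):
    """Map a count to a hex color on a linear blue-to-red scale."""
    if highest == lowest:
        t = 0.0
    else:
        t = (count - lowest) / (highest - lowest)
    red, blue = round(255 * t), round(255 * (1 - t))
    return f"#{red:02x}00{blue:02x}"

# Hypothetical record counts per community area (illustrative only).
area_counts = {"Area A": 120, "Area B": 480, "Area C": 300}
lowest, highest = min(area_counts.values()), max(area_counts.values())
colors = {a: to_color(c, lowest, highest) for a, c in area_counts.items()}
print(colors)
```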

The information presented in the maps can be useful for policymakers, law enforcement agencies, and community organizations to understand the crime situation in different parts of the city. This information can help them to target resources and interventions to areas where crime is most concentrated and make a difference in reducing crime rates.

Overall, the two maps showing the spatial heatmaps of Chicago provide a valuable visual representation of crime in the city and can be used to inform efforts to improve public safety and reduce crime rates.

Major Crime Types recorded, with possibility of arrest as a coloration

This spatial heatmap of Chicago has four subplots, one for each of four crime types (Gun Violation, Theft, Narcotics, and Homicide), and provides a visual representation of crime patterns in the city. The general color of the chart shows the number of criminal records, with the coloration indicating whether an arrest has been made for the crime report.

The subplots for Gun Violation and Narcotics related offences show that both types of crimes have significantly higher arrests compared to the other two crime types (Theft and Homicide). The red areas in these subplots indicate that these crimes are more widespread and frequent in certain parts of the city, especially on the South and West sides.

One reason for the high number of arrests in Gun Violations and Narcotics related offences could be due to the efforts of law enforcement agencies to target these types of crimes. These crimes pose a serious threat to public safety, and law enforcement agencies may have increased their efforts to crack down on them.

Another reason could be the nature of these crimes, as both Gun Violation and Narcotics related offences often involve the use of firearms, which can lead to violent confrontations. Additionally, the illegal drug trade is highly profitable, which can result in more frequent incidents of Narcotics related offences.

While Theft also has a decent number of arrests, it is not considered an ideal scenario, as it is still a widespread crime in Chicago. This could be due to the lack of sufficient law enforcement efforts to address theft, or it could be a result of high poverty rates and limited access to resources in certain areas, which can lead to a higher incidence of theft.


Temporal Analysis

Temporal Chart of Crime Types representing both their Growth & Decline

The downward trend in the rolling sum of crimes reported for criminal sexual assault, battery, and gambling in the chart could be due to various reasons. The following are potential reasons for the decline.

Increased law enforcement efforts: If law enforcement has increased its efforts to prevent these crimes, this could have led to a reduction in the number of incidents reported.

Improved community safety measures: If the community has taken additional measures to improve safety, such as installing cameras or increasing patrols, this could have helped reduce the number of incidents.

Changes in reporting practices: If changes have been made to the reporting process, such as improved training for reporting personnel, this could have led to a reduction in the number of incidents reported.

A decline in demand: If the demand for these types of crimes has decreased, this could have contributed to the downward trend. For example, if gambling has become less popular, this could result in a decline in the number of incidents related to gambling.

Social and cultural factors: Social and cultural factors, such as changes in attitudes towards certain crimes, can also influence crime rates. For example, if society is becoming more aware of and intolerant of sexual assault, this could be contributing to the downward trend in criminal sexual assault.
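
The rolling sum plotted in the temporal chart can be sketched as a sliding window over a series of counts. The window length of 3 and the monthly counts below are assumptions made for illustration; the window used for the chart is not stated here.

```python
from collections import deque

def rolling_sum(values, window):
    """Sum of each run of `window` consecutive values."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    running = 0
    sums = []
    for value in values:
        if len(recent) == window:
            running -= recent[0]  # value about to be pushed out
        recent.append(value)
        running += value
        if len(recent) == window:
            sums.append(running)
    return sums

# Hypothetical monthly report counts for one crime type (illustrative only).
monthly = [30, 28, 27, 25, 22, 20]
print(rolling_sum(monthly, window=3))
```

Summing over a window smooths out month-to-month noise, which makes a sustained decline easier to see than in the raw counts.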


Temporal Map of Temperature °C and Crimes Recorded.

This chart is a small attempt at answering the question "Do factors like climate affect the rate of crimes?". It is a temporal plot of total crimes recorded in Chicago for the month of January 2018 against the temperatures reported on those days. In my view, it is not likely that the temperature itself is causing the increase in crime. There are many factors that contribute to crime, and temperature is only one of them. Other factors, such as poverty, unemployment, population density, and access to drugs and alcohol, can also play a role in determining crime rates.

It is possible that the increase in crime at 1 degree Celsius is a result of a confounding factor, such as a major event or a change in law enforcement practices, that is not directly related to the temperature. Alternatively, the increase in crime could be a result of an underlying relationship between temperature and crime that has not been fully explored.
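
One way to start exploring such a relationship is to compute the Pearson correlation between daily temperature and daily crime counts. The sketch below uses invented values, not the January 2018 data. A correlation would only describe how the two series move together and would not show that temperature causes crime.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical daily temperatures (°C) and crime counts (illustrative only).
temps = [-8, -5, -2, 0, 1, 3, 4]
crimes = [610, 640, 655, 690, 760, 700, 710]
print(f"r = {pearson(temps, crimes):.2f}")
```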