Here are the files for my presentation to the San Diego Crime Analysts Association:
A Rhythm Map is a heat map that displays time in both the X and Y dimensions. Rhythm maps are an excellent way to visualize repeating patterns in time, such as how crimes occur by hour and day of week. Here we look at some interesting patterns in burglaries in the City of San Diego.
First, here is the map for a range of crime types, compiled from the type, time and date of about 400K crime incidents in the City of San Diego from 2006 to 2012.
Each square is a crime type. The vertical axis is the hour of the day, and the horizontal axis is the day of the week, with Sunday being the column between 0 and 1. Darker red means there are more crimes than lighter red and yellow. The colors are not comparable across squares, only within a square. So, the dark red cell at 5:00PM on Friday in the Burglary square may represent a very different number of crime incidents than the dark red cell at 12:00AM on Thursday in Sex Crimes. Also note that these views combine citations, arrests and reported crimes, and there may be different patterns when the maps are broken out on that factor.
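As a rough illustration of how one square could be built, here is a minimal Python sketch. It counts incidents into an hour-by-day grid and then scales the counts by that grid's own maximum. The per-square scaling is an assumption made to match the "not comparable across squares" behavior described above, not a confirmed detail of how these maps were produced. The function names and the sample incidents are hypothetical.

```python
from datetime import datetime

def rhythm_matrix(timestamps):
    """Count incidents into a 24 x 7 grid: rows are hours, columns are days (Sunday = 0)."""
    grid = [[0] * 7 for _ in range(24)]
    for ts in timestamps:
        # datetime.weekday() uses Monday = 0, so shift to make Sunday = 0
        day = (ts.weekday() + 1) % 7
        grid[ts.hour][day] += 1
    return grid

def normalize(grid):
    """Scale counts to 0..1 by the grid's own maximum (assumed per-square color scale)."""
    peak = max(max(row) for row in grid)
    if peak == 0:
        return [[0.0] * 7 for _ in range(24)]
    return [[count / peak for count in row] for row in grid]

# Made-up incidents, for illustration only
incidents = [
    datetime(2012, 6, 1, 17, 30),  # a Friday, 5 PM hour
    datetime(2012, 6, 1, 17, 45),  # same Friday, 5 PM hour
    datetime(2012, 6, 3, 0, 10),   # a Sunday, 12 AM hour
]
counts = rhythm_matrix(incidents)
shades = normalize(counts)
```

Because each grid is divided by its own peak, the darkest cell in every square has the value 1.0 regardless of how many incidents it actually holds.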
There are a lot of interesting patterns here, but we’ll focus on Burglary. The first thing to notice is that there are two time ranges, visible as groups of darker red cells, when burglaries occur: during work hours on weekdays and on Friday evenings. ( The strong line at noon is most likely an artifact of crimes for which the time is not known being given that value arbitrarily. )
What accounts for the two separate time ranges? First, let’s break it out by community. This chart uses Clarinova Place Codes for the community names.
Here we see that some communities exhibit one pattern or the other, and sometimes both. Downtown ( SanDOW ), La Jolla ( SanLAJ ) and Mira Mesa ( SanMIR ) show the Friday pattern, while Southeastern ( SanSOT ), Greater North Park ( SanGRE ) and Midtown ( SanMID ) show the weekday pattern.
Community distinctions may explain some of the differences in the patterns, but there is a factor that is probably more important: residential vs commercial crime. So, let’s split out the maps on that factor.
Here is where the distinctions become the strongest. In Otay Mesa ( SanOAT ), Mira Mesa ( SanMIR ), University ( SanUNV ) and others, the Friday evening pattern completely splits from the weekday pattern. However, we also see a new weekday pattern in Clairemont ( SanCLA ), Uptown and Midtown, with commercial burglaries occurring across the weekday evenings.
Those features are consistent with exactly what you’d expect from burglary: the burglaries occur when businesses and homes are unoccupied. But that doesn’t explain why in many communities the commercial crimes would occur more frequently on Friday evenings. Another unusual pattern is that in Pacific Beach ( SanPCF ) there is a residential burglary cluster on Friday and Saturday evenings, with a similar but weaker pattern occurring in Uptown and College.
Rhythm maps are a powerful way to look for patterns in time-structured data, because they take advantage of the ways that human brains most quickly process visual information. However, they aren’t a complete solution; they are just a start. Before making any recommendations based on the data, we’d want to do a few statistical tests and, at the least, look at the absolute number of incidents per cell in the areas exhibiting patterns.
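One simple test of this kind, sketched here as an illustration rather than as the analysis actually performed, is a chi-square goodness-of-fit check on absolute counts: do incidents fall evenly across the seven days of the week? The counts below are made up, and the function name is hypothetical.

```python
def chi_square_uniform(counts):
    """Chi-square statistic for observed counts against an even split across categories."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

# Made-up absolute incident counts for Sunday through Saturday
day_counts = [20, 22, 19, 21, 23, 45, 18]

stat = chi_square_uniform(day_counts)

# Critical value of the chi-square distribution for 6 degrees of freedom at p = 0.05
CRITICAL_DF6_P05 = 12.592
is_uneven = stat > CRITICAL_DF6_P05
```

A statistic above the critical value suggests the day-to-day differences are unlikely to be chance alone, but it says nothing about why a particular day stands out, so the absolute counts still need to be read alongside it.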
For the last 5 months, SANDAG has been publishing its crime incident data to the web. The file it publishes only stores the last 180 days, and it is a bit hard to find, so we’re archiving the files to our data repository.
Here is an interactive data application that explores how crime incidents vary over the day of week and time of day. In the checkbox below, select one or more crime types, and the heatmap will show the relative intensity of those crime types over day of week and time of day. There are many interesting patterns here, some you would expect, some you might not. Things you might expect: DUI and Drugs violations are primarily committed in the evening and early mornings on weekends. Assaults are most frequent in the late evenings and early mornings on...
For the last few months, a team of geography students at SDSU have been working with the crime data provided by the Library, producing analyses and visualizations of the data. Elias Issa has been looking at Drugs and Alcohol violations in Downtown San Diego and East Village. He writes: The Hot Spot tool calculates the Gi* statistic for each feature in a dataset. The resultant z-scores and p-values tell you where features with either high or low values cluster spatially. To have a statistically significant hot spot, a feature will have...
Last year, the U.S. Department of Housing and Urban Development commissioned a report to study the feasibility of creating a nationwide database of parcel information. This is a difficult task because the parcels are usually maintained by the counties, and the US has about 3200 counties. The resulting report is remarkably thorough, including a description of the data collection process and the effort required to get the data, and the information contained in each county’s dataset. The effort required varied greatly, with 13% of the...
The Voice of San Diego is running a Q&A regarding Open Data, which just happens to involve an interview with the Director of the Library.
SANDAG, through its public safety division ARJIS, is now publishing crime data to the web. This is a major advance in accessibility, since previously crime incidents were only available through a Public Records Act request, and usually involved a fee. The download file includes crime incidents for the last 180 days. The Library will be archiving these files occasionally, and adding them to our crime incident datasets. The files are updated weekly. Unlike the data we acquired previously, this release does not include the ‘legend’...
We recently converted the SWITRS database of traffic collisions in California, extracting the records for San Diego County and creating a basic visualization in Tableau Public. Tableau Public is a fantastic data analysis tool, although it takes a bit of training to do complex things. Below is a simple visualization of the number of people killed and injured in San Diego county traffic collisions by day of week and hour of day, for the years 2002 to 2012, inclusive. The deaths line (orange) shows a familiar “bathtub” shape, with...
For the Dig Into Data meetup this Wednesday, we’ll be talking about tools to use for analyzing data. If you’d like to follow along in the meeting, you can install these tools before you arrive. The two applications are: Tableau Public, for analyzing tabular data. It is a great tool for basic data mining. QGIS, for geographic data. Tableau Public is the free, limited version of Tableau’s professional data mining tools. It is really easy to install; just visit their download page to get started. Tableau Public runs...