Exploring San Diego Crime Data using Python – Workshop

Tonight at Downtown Works, SCALE San Diego and Open San Diego are hosting a workshop on analyzing crime data with Python, Pandas and Matplotlib. Unlike the past analyses of San Diego crime that we have done at the Library, this one uses data from the San Diego Police rather than the county-wide data from ARJIS, so it is more focused and more detailed. If you’d like to go, visit the signup page to register.

It is an interesting dataset because, while the location addresses are often missing, and it doesn’t have UCR crime codes, it does have detailed call types and is linked to SDP beats. The 2.5 years of data that are published have more than 1.1M records. The San Diego Open Data Portal also has shapefiles for the beats, districts and neighborhoods, so we can make maps.

For instance, here is a choropleth that is colored according to the counts of calls for “LOUD PARTY” by beat. It should surprise no one where the hotspot is:
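The counting step behind a map like this can be sketched in a few lines of Pandas. This is only a minimal sketch, not the notebook's actual code: the column names `call_type` and `beat` are assumptions about how the calls file is laid out, and the sample rows are made up for illustration.

```python
import pandas as pd


def calls_per_beat(calls: pd.DataFrame, call_type: str) -> pd.Series:
    """Count the calls of one type in each beat.

    Assumes (hypothetically) that `calls` has a `call_type` column with
    the call description and a `beat` column with the beat number.
    """
    subset = calls[calls["call_type"] == call_type]
    return subset.groupby("beat").size().rename("n_calls")


# Tiny made-up sample, only to show the shape of the result.
sample = pd.DataFrame(
    {
        "call_type": ["LOUD PARTY", "LOUD PARTY", "BURGLARY", "LOUD PARTY"],
        "beat": [122, 122, 122, 614],
    }
)

party_counts = calls_per_beat(sample, "LOUD PARTY")
print(party_counts)
```

To turn the counts into a choropleth, one option is to load the beat shapefile with GeoPandas, join the counts onto it by beat number, and call the GeoDataFrame's `plot(column="n_calls")`. The name of the beat column in the shapefile would need to be checked against the actual file.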

The dataset also has very nicely formatted date/times, so Pandas has an easy time extracting time parts, allowing us to build Rhythm Maps. This heat map displays the count of LOUD PARTY incidents, over the whole dataset, organized by hour and month:

You can clearly see two significant patterns: Loud Party calls are, as expected, primarily made in the late night and early morning, and the calls are more frequent in the summer than in the winter.
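The table behind an hour-by-month heat map like this one can be built roughly as follows. Again, this is a sketch under assumptions rather than the notebook's code: the `date_time` and `call_type` column names are guesses at the file layout, and the sample rows are invented.

```python
import pandas as pd


def rhythm_table(calls: pd.DataFrame, call_type: str) -> pd.DataFrame:
    """Build a 24 x 12 table of call counts: rows are hours, columns are months.

    Assumes (hypothetically) a `date_time` column that Pandas can parse
    and a `call_type` column with the call description.
    """
    subset = calls[calls["call_type"] == call_type]
    when = pd.to_datetime(subset["date_time"])

    # Extract the time parts and count the calls in each (hour, month) cell.
    hour = when.dt.hour.rename("hour")
    month = when.dt.month.rename("month")
    table = when.groupby([hour, month]).size().unstack(fill_value=0)

    # Make sure every hour and month appears, even if it had no calls.
    table = table.reindex(index=range(24), columns=range(1, 13), fill_value=0)
    return table.rename_axis(index="hour", columns="month")


# Made-up rows, only to show how the table fills in.
sample = pd.DataFrame(
    {
        "date_time": [
            "2016-07-02 23:15:00",
            "2016-07-09 23:40:00",
            "2016-01-16 01:05:00",
            "2016-07-04 14:00:00",
        ],
        "call_type": ["LOUD PARTY", "LOUD PARTY", "LOUD PARTY", "BURGLARY"],
    }
)

party_rhythm = rhythm_table(sample, "LOUD PARTY")
```

The resulting table can be handed to Matplotlib's `imshow` or `pcolormesh` to draw the heat map, with hours on one axis and months on the other.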

There are certainly a lot of other interesting patterns to find in this dataset, and if you are interested in finding them, I hope to see you at tonight’s meeting.

BTW, if you’d like to see how these charts were generated, the Jupyter Notebook is on GitHub.

eric.