Once you’ve got the basic skills in programming and statistics, the best way to learn data analysis is to do it. So, we’re developing a practical experience program for aspiring data analysts. The program is in the pilot phase with a small set of students, but you can read about how it works on the … Read more Learn Data Analysis Techniques
We’ve often mentioned that people who work on data projects tell us that data preparation and cleaning frequently consume 80% of their projects, so it is interesting to get this data point from Kaggle: (2) How long is a typical project? When working with a top 0.5% data scientist, projects take just eight … Read more The Cost of Cleaning
Here are the files for my presentation to the San Diego Crime Analysts Association: PDF File PPT File
A Rhythm Map is a heat map that displays time in both the X and Y dimensions. Rhythm maps are an excellent way to visualize repeating patterns in time, such as how crimes occur by hour and day of week. Here we look at some interesting patterns in burglaries in the City of San Diego. First, here is … Read more Burglary Rhythm Maps
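As a rough illustration of the idea, the grid behind a rhythm map can be built by counting incidents per (day of week, hour) cell. This is only a minimal sketch, not the method used for the maps in the post: the function name `rhythm_grid` and the sample timestamps are hypothetical, and it assumes each incident has already been parsed into a Python `datetime`.

```python
from collections import Counter
from datetime import datetime


def rhythm_grid(timestamps):
    """Return a 7x24 grid of counts.

    Rows are days of week (Monday = 0 ... Sunday = 6),
    columns are hours of day (0 ... 23).
    """
    counts = Counter((t.weekday(), t.hour) for t in timestamps)
    return [[counts[(day, hour)] for hour in range(24)] for day in range(7)]


# Hypothetical incident times, for illustration only (not real crime data).
incidents = [
    datetime(2013, 3, 4, 9, 15),   # a Monday, 09:00 hour
    datetime(2013, 3, 4, 9, 50),   # same Monday, same hour
    datetime(2013, 3, 9, 23, 5),   # a Saturday, 23:00 hour
]

grid = rhythm_grid(incidents)
```

Each cell of `grid` can then be mapped to a color, with hour on one axis and day of week on the other.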
For the last 5 months, SANDAG has been publishing its crime incident data to the web. The file they publish only stores the last 180 days, and it is a bit hard to find, so we’re archiving the files to our data repository.
Here is an interactive data application that explores how crime incidents vary over the day of week and time of day. Using the checkboxes below, select one or more crime types, and the heatmap will show the relative intensity of those crime types over day of week and time of day. There are many interesting patterns … Read more Day/Time Crime Heatmaps
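One plausible way to compute "relative intensity" for a selection of crime types is to count the selected incidents per day/hour cell and scale by the busiest cell. The sketch below assumes that definition. The application itself may normalize differently, and the function name, crime type labels, and sample records are all hypothetical.

```python
from collections import Counter
from datetime import datetime


def relative_intensity(incidents, selected_types):
    """Return a 7x24 grid scaled to the range 0.0 to 1.0.

    incidents: iterable of (crime_type, datetime) pairs.
    selected_types: set of crime types to include, like ticked checkboxes.
    Assumed definition: each cell's count divided by the largest cell count.
    """
    counts = Counter(
        (when.weekday(), when.hour)
        for crime_type, when in incidents
        if crime_type in selected_types
    )
    peak = max(counts.values(), default=0)
    return [
        [counts[(day, hour)] / peak if peak else 0.0 for hour in range(24)]
        for day in range(7)
    ]


# Hypothetical records, for illustration only (not real crime data).
sample = [
    ("BURGLARY", datetime(2013, 3, 4, 9, 15)),  # Monday
    ("BURGLARY", datetime(2013, 3, 4, 9, 50)),  # Monday
    ("BURGLARY", datetime(2013, 3, 5, 14, 0)),  # Tuesday
    ("DUI", datetime(2013, 3, 9, 23, 5)),       # Saturday, not selected below
]

heat = relative_intensity(sample, {"BURGLARY"})
```

Scaling by the peak cell keeps the color range comparable when the selection changes, at the cost of hiding absolute counts.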
For the last few months, a team of geography students at SDSU has been working with the crime data provided by the Library, producing analyses and visualizations of the data. Elias Issa has been looking at Drugs and Alcohol violations in Downtown San Diego and East Village. He writes: The Hot Spot tool calculates the Gi* … Read more Drugs and Alcohol in Downtown and East Village
Last year, the U.S. Department of Housing and Urban Development commissioned a report to study the feasibility of creating a nationwide database of parcel information. This is a difficult task because the parcels are usually maintained by the counties, and the US has about 3200 counties. The resulting report is remarkably thorough, including a description of … Read more Why is Open Data Hard?
The Voice of San Diego is running a Q&A regarding Open Data, which just happens to involve an interview with the Director of the Library.
SANDAG, through its public safety division ARJIS, is now publishing crime data to the web. This is a major advance in accessibility, since crime incidents were previously available only through a Public Records Act request, which usually involved a fee. The download file includes crime incidents for the last 180 days. The Library will be … Read more SANDAG Is Now Publishing Crime Data Weekly