Along with SDSU and Teradata, the San Diego Regional Data Library is running a data analysis contest for San Diego-area high school and college students. The contest starts February 28. Top prize is $1200.
For complete details and to register, see the contest announcement page.
We’re looking for some programmers to visualize crime data and present it at our booth at the San Diego Magazine Big Ideas Party on Jan 21. The Data Library was one of the 25 Big Ideas covered in their January issue, so they’d like us to have a presentation at the party.
I’d like to have an interactive display, probably using D3, that shows a crime hot spot map for the region, as well as a collection of time-based Rhythm maps for selected areas. A visitor to the booth could select a neighborhood or city, see the hot spots in that area, and see how the crime incidents change in that area over time.
You’ll get a ticket to the party on the 21st, to share in the glory, get free food, and do some high-quality hobnobbing.
If you are interested, send me an email, with a link to your GitHub/Bitbucket/etc. account or portfolio, to firstname.lastname@example.org. We can use any number of volunteers, but I only have three free tickets.
Even little data, such as a list of crime incidents, locations of bus stops, or population growth estimates, is a big idea, at least according to San Diego Magazine’s 25 BIG Ideas.
Sponsor a Project in Our Student Data Analysis Contest
Give your nonprofit, agency or news organization valuable data-driven insights by sponsoring a project at our student data analysis contest. The San Diego Regional Data Library, the SDSU Society for Statisticians and Actuaries and Teradata are organizing a data analysis contest to aid nonprofits, journalists and government agencies in making better use of data, develop a broader regional capacity for data analysis, and introduce students interested in data analysis to future...
Data.gov is the top-level data search system for the US, with references to over 130,000 datasets from federal and state agencies. And yet, I’ve never successfully used it for finding data. Here is an example search for “Diabetes Rates”: So, we look for diabetes, and get births, 22-year-old mortality data from the US Geological Survey, and quality of service data as the first three hits. The first link at least points to the right agency, but you still have to click three times to get there. Here is the same search on...
Here is the Tableau workbook that we’ll be using for the SPJ Hacks/Hackers Data Show event tonight. You can also get links to all of the documentation from our data warehouse.
Voice of San Diego has published a story we supported, a fact-check about the density of veterans in San Diego County. You can get all of the data behind the story, including schemas and SQL queries, in our data warehouse.
Once you’ve got the basic skills in programming and statistics, the best way to learn data analysis is to do it. So, we’re developing a practical experience program for aspiring data analysts. The program is in the pilot phase with a small set of students, but you can read about how it works on the Internship Program’s wiki page. The goal of the program is to develop more experience with answering questions with data in San Diego and to make that experience available to nonprofits, government agencies, journalists and other...
We’ve frequently mentioned that people who work on data projects tell us that 80% of their project time is consumed by data preparation and cleaning, so it is interesting to get this data point from Kaggle: (2) How long is a typical project? When working with a top 0.5% data scientist, projects take just eight to 40 hours ($3k to $12k). Projects are finished in closer to eight hours for clean data and closer to 40 hours when the data requires cleaning. So, in this anecdote, with some squinty-eyed interpretation, data cleaning...
Here are the files for my presentation to the San Diego Crime Analysts Association: PDF File, PPT File