Data analysts should always be skeptical of their data, because it very often goes bad. The Quartz bad data guide is a summary of the most common data flaws, from automatic Excel conversions to inappropriate use of margins of error.
Big Data Hackathon on Oct 3 at 9AM at SDSU!
On October 3, the Center for Human Dynamics in the Mobile Age will host a Big Data Hackathon at San Diego State University. Contestants will use data analysis and programming to solve civic problems related to water conservation, disaster recovery and crime prevention. Visit the Devpost website for details and to register, or hit GitHub for the tech details and data.
Finally, our legislators are getting below the surface of the Open Data issue and addressing one of the deeper planes: Open Data is nearly useless when it is delivered in PDF. To address this problem, our very own Brian Maienschein (well, the inland “us”) has introduced AB 169, which mandates that when agencies publish open data, it be published in a machine-readable format.
Yay! One prayer answered! If you are in District 77, send Maienschein some love; if not, tell your assembly person to get with the program.
We completed our 2015 Data Contest with final presentations and winners at the awards ceremony on Tuesday. Here are the winners and their presentations:
- UCSD MAS Data Science, Time and Space Analysis of Food Distribution
- irHacker, California Suspensions
- Flash and Shadow, A Visual Geographical Study on Location, Availability, Public Transportations and Crime Exposure
- A Mathematical Modeling Team, Are Some Teachers Just “Meaner” than Others?
We also have two Honorable Mentions:
- Kearny Komets, CELDT and English Language Arts
- DS3, An Analysis of the Methods of Discipline at Monarch
Thank you all for participating! The submissions were very valuable for the non-profits that were involved, and we’re looking forward to the contest next year. Until then, if you’d like to get involved in other nonprofit data analysis projects, join the Practical Data Program for announcements about upcoming projects.
The SDSU Data Contest Kickoff is tomorrow! Better register if you haven’t already. Here are all of the last-minute details.
Time and Location:
- Register for the contest web app at http://sandiegodata.org/contest/register.
Also, our Twitter hashtag is #sddc15.
To get ready for the Data Contest, make sure your laptop already has all of the tools you’ll need installed. We use the same set of tools in most of our programs, and it will serve as a good base for your contest toolkit. These tools are:
- Tableau Public, for quick yet attractive visualizations.
- QGIS, for open source geographic analysis and maps.
- The Anaconda Python distribution, to get IPython Notebook and many other important Python libraries.
- RStudio Desktop, an excellent R environment.
- OpenRefine, which is particularly good for cleaning data sets that have names.
- OpenOffice, for working with CSV and Excel files.
- A GitHub account, to share code.
Additionally, we frequently use SQLite files for data storage and to sort and search through data using SQL. SQLite is already installed almost everywhere, but you may want to get a SQLite GUI to make it more like working with a spreadsheet.
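If you haven’t used SQLite before, here is a minimal sketch of what sorting and searching with SQL looks like from Python, using the `sqlite3` module that ships with Python (and so with Anaconda). The table name, column names and rows are made up for illustration; they are not contest data.

```python
import sqlite3

# Made-up rows for illustration only; these are not contest data.
SAMPLE_ROWS = [
    ("School A", "Suspension", 12),
    ("School B", "Suspension", 7),
    ("School C", "Detention", 20),
]


def build_db(path=":memory:"):
    """Create a small SQLite database and load the sample rows.

    ":memory:" keeps everything in RAM; pass a file name to write a
    SQLite file on disk instead.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE discipline (school TEXT, action TEXT, incidents INTEGER)"
    )
    conn.executemany("INSERT INTO discipline VALUES (?, ?, ?)", SAMPLE_ROWS)
    conn.commit()
    return conn


def search_sorted(conn, action):
    """Search for one kind of action, sorted by incident count, highest first."""
    cur = conn.execute(
        "SELECT school, incidents FROM discipline "
        "WHERE action = ? ORDER BY incidents DESC",
        (action,),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = build_db()
    for school, incidents in search_sorted(conn, "Suspension"):
        print(school, incidents)
    conn.close()
```

The `?` placeholders let SQLite handle quoting of the search value, which is safer than pasting the value into the SQL string yourself.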
If you’d like a visual introduction to these tools, we’ll be running Google Hangouts to demo them. Sign up and get contact information for these sessions at our Meetup page.
This same set of tools will serve you well if you come to the full-day data conference that the Python Meetup group is hosting, starting at 8:30 on the same day as the Contest. Come early, learn some useful skills, and put them to the test in the afternoon. Visit the Data Science FunConference registration page for details and to get a free ticket.