The SDSU Data Contest Kickoff is tomorrow! Better register if you haven’t already. Here are all of the last-minute details.
Time and Location:
- Register for the contest web app at http://sandiegodata.org/contest/register.
Also, our Twitter hashtag is #sddc15.
To get ready for the Data Contest, make sure your laptop already has all of the tools you’ll need installed. We use one set of tools in most of our programs, and it will serve as a good base for your contest toolkit. These tools are:
- Tableau Public, for quick yet attractive visualizations.
- QGIS, for open source geographic analysis and maps.
- The Anaconda Python distribution, to get IPython Notebook and many other important Python libraries.
- RStudio Desktop, an excellent R environment.
- OpenRefine, which is particularly good for cleaning data sets that have names.
- OpenOffice, for working with CSV and Excel files.
- A GitHub account, to share code.
Additionally, we frequently use SQLite files for data storage and to sort and search through data using SQL. SQLite is already installed almost everywhere, but you may want to get a SQLite GUI to make it more like working with a spreadsheet.
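If you’d rather try SQLite from Python before installing a GUI, here is a minimal sketch using the `sqlite3` module that ships with Python (and with Anaconda). The table name, column names and rows are made up for illustration; they are not contest data.

```python
import sqlite3

# Hypothetical incident records, invented purely for this example.
rows = [
    ("2015-01-03", "Burglary", "Area A"),
    ("2015-01-05", "Vandalism", "Area B"),
    ("2015-01-09", "Burglary", "Area B"),
    ("2015-01-12", "Theft", "Area A"),
]

# ":memory:" keeps the database in RAM; pass a file path to save it on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE incidents (date TEXT, type TEXT, area TEXT)")
conn.executemany("INSERT INTO incidents VALUES (?, ?, ?)", rows)

# Search: keep only burglaries. Sort: most recent first.
burglaries = conn.execute(
    "SELECT date, area FROM incidents WHERE type = ? ORDER BY date DESC",
    ("Burglary",),
).fetchall()

# Summarize: number of incidents per area.
counts = conn.execute(
    "SELECT area, COUNT(*) FROM incidents GROUP BY area ORDER BY area"
).fetchall()

conn.close()

print(burglaries)
print(counts)
```

The same `SELECT` statements can be pasted into a SQLite GUI once you have one installed.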
If you’d like a visual introduction to these tools, we’ll be running Google Hangouts to demo them. Sign up and get contact information for these sessions at our Meetup page.
This same set of tools will serve you well if you come to the full-day data conference that the Python Meetup group is hosting, starting at 8:30 on the same day as the Contest. Come early, learn some useful skills, and put them to the test in the afternoon. Visit the Data Science FunConference registration page for details and to get a free ticket.
The SDSU / Data Library Data Contest has teamed up with UCSD, the Python User Group and several data science user groups to offer a full-day event with two morning tutorials (R and Python), a mid-day exhibition with many data science projects and software demos, an afternoon machine learning challenge and the kickoff of the SDSU / Data Library Data Contest. Visit the signup page to join the contest, learn more about data science, and have a chance to win part of the $2,100 in prizes. Visit the conference Eventbrite page to register for the conference.
The Student Data Contest is in less than two weeks, so it’s time to get your tools ready. If you are a student and want a shot at $2,100 in prizes, sign up for the contest. One of the best tools available for quickly visualizing data is Tableau, and best of all, if you don’t need to connect to a database, Tableau Public is free. Tableau allows you to quickly produce beautiful charts and tables, and makes it easy to embed those visualizations on the web. Tableau runs on both Mac and Windows, but while it has a very well...
Probably the most common statistic that people deal with is the average, which can often be a good approximation of the typical or general case. However, there are many cases where the average fails, and the most extreme example I’ve seen in recent data is lawyers’ salary distributions. The NALP has been publishing salary distributions for a few years, and the blog Social Evolution Forum provides a good overview of the distribution. Since 2000 or so, the distribution is extraordinarily bimodal, making the average, as well as the...
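To see why the average can mislead on a bimodal distribution, here is a small sketch using Python’s standard `statistics` module. The salary figures are invented to have two clusters; they are not the NALP numbers.

```python
from statistics import mean, median

# Hypothetical salaries in thousands of dollars, invented for illustration:
# one cluster of lower salaries and one cluster of much higher ones.
low_cluster = [45, 50, 50, 55, 60]
high_cluster = [155, 160, 160, 165]
salaries = low_cluster + high_cluster

avg = mean(salaries)
mid = median(salaries)

# How far is the closest actual salary from the average?
nearest = min(salaries, key=lambda s: abs(s - avg))
gap = abs(nearest - avg)

print("mean:", avg)
print("median:", mid)
print("closest salary to the mean:", nearest, "gap:", gap)
```

In this made-up sample the mean lands in the empty space between the two clusters, so nobody actually earns anything close to the “average” salary. Plotting a histogram before summarizing is the quickest way to catch this.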
Along with SDSU and Teradata, the San Diego Regional Data Library is running a data analysis contest for San Diego area high school and college students. The contest starts February 28. Top prize is $1200. For complete details and to register, see the contest announcement page.
We’re looking for some programmers to visualize crime data and present it at our booth at the San Diego Magazine Big Ideas Party on Jan 21. The Data Library was one of the 25 Big Ideas covered in their January issue, so they’d like us to have a presentation at the party. I’d like to have an interactive display, probably using D3, that shows a crime hot spot map for the region, as well as a collection of time-based Rhythm maps for selected areas. A visitor to the booth could select a neighborhood or city, see the hot spots in...
Even little data, such as a list of crime incidents, locations of bus stops, or population growth estimates, is a big idea, at least according to San Diego magazine’s 25 BIG Ideas.
Sponsor a Project in Our Student Data Analysis Contest
Give your nonprofit, agency or news organization valuable data-driven insights by sponsoring a project at our student data analysis contest. The San Diego Regional Data Library, the SDSU Society for Statisticians and Actuaries and Teradata are organizing a data analysis contest to aid nonprofits, journalists and government agencies in making better use of data, develop a broader regional capacity for data analysis, and introduce students interested in data analysis to future...
Data.gov is the top-level data search system for the US, with references to over 130,000 datasets from federal and state agencies. And yet, I’ve never successfully used it for finding data. Here is an example search for “Diabetes Rates”: So, we look for diabetes, and get births, 22-year-old mortality data from the US Geological Survey, and quality of service data as the first three hits. The first link at least points to the right agency, but you still have to click three times to get there. Here is the same search on...