Social progress depends on data, but San Diego lacks a comprehensive source for data about our region and individual communities. We propose to create a nonprofit Data Intermediary to collect, store, distribute, and analyze a wide range of data about the San Diego region and the individual neighborhoods within the region. This regional data resource will ensure that San Diego’s civic, social, and government organizations have the data they need to manage their operations and pursue their civic and social missions.

The more nonprofits, civic organizations, and government agencies know about their communities, the better they can target their services, demonstrate need for grants, and develop innovative solutions. In many other cities, this need for data has encouraged the creation of nonprofit organizations specifically dedicated to collecting, analyzing, and distributing data. Working in partnership with but separate from universities and governments, these Data Intermediaries, operating in hundreds of communities nationwide, ensure that data flows from data producers to the organizations that need it most. They also provide training in GIS and data analysis and produce dedicated information systems, such as Neighborhood Information Systems and Community Indicator websites. Some examples of these organizations include:

Each of these groups, and hundreds more around the country, is dedicated to distributing the knowledge that drives progress. Creating this sort of Data Intermediary for the San Diego region will make our social programs, civic groups, and government organizations more effective; our citizens better informed; and our policy makers able to make better decisions. This proposal details what Data Intermediaries are, how San Diego will benefit, and how we should create a Data Intermediary for San Diego.

What Are Data Intermediaries?

As the name implies, a Data Intermediary is an organization that coordinates between data producers and data consumers. While many data consumers, such as land planners or demographers, are skilled at finding the data they need, most nonprofit, civic, and social data users are not. These users have difficulty acquiring even basic data, such as demographic profiles. Complex data—neighborhood-level crime reports, for example—are completely out of reach for many organizations because, while it is possible to acquire the data, doing so takes far too much effort, and properly extracting and analyzing it takes too much skill.

A Data Intermediary serves as a skilled broker that understands the needs of data users and knows what is available from the data producers. An intermediary has a specific mission to talk to regional nonprofit data users to understand what they need for their operations, to acquire the most valuable datasets, and to make them available in a form the nonprofit community needs.

Why Are Intermediaries Needed?

Data Intermediaries play an important role in the nonprofit landscape because they offer a set of services that other groups cannot. They have staff with high-value technical skills and the time to work on complex data projects; they are trustworthy, neutral third parties that can facilitate data sharing; and they have a focus on ensuring data consumers get what they need.

Time and skill. Because a data intermediary serves many data consumers, it can afford to have staff with very particular skills, like data analysis or GIS, that would be too expensive for most nonprofits to hire. And because an intermediary can split its costs across many clients, the intermediary can afford to dedicate time to acquiring and processing datasets outside a typical nonprofit’s price range.

Trust and sharing. Many community groups are suspicious of data efforts, particularly in communities where data has been used in the past to promote development that was detrimental to the community. These groups are more likely to be involved in data efforts that are seen as being independent and trustworthy.

Being a trusted third party also allows intermediaries to facilitate data sharing. A web of complex personal and corporate relationships, political issues, and longstanding grievances can impede the flow of information between groups. Impoverished communities may not trust developers, newspapers may not trust each other, companies are competitive with each other, and even agencies of the same municipal government have practical or legal reasons why they can’t share data. A data intermediary, as a trusted, neutral third party, can be a conduit between organizations that aren’t able to work directly with each other.

Focus on users. The first studies of Data Intermediaries from 1996 found that the most effective way to ensure that community organizations use data is to provide a range of data and services, rather than just providing the data. The best Data Intermediaries provide a knowledgeable human connection between producers and consumers. The Intermediary cannot be solely an online data repository; it must be a partner with both data producers and consumers.

What Will Our Library Do?

The San Diego Regional Data Library will:

  • Understand the data needs of social organizations, civic groups, and government agencies.
  • Maintain a repository of data to serve those data needs.
  • Involve volunteer and freelance data analysts, programmers, and policy makers in the analysis, presentation, and use of the data for the benefit of local data users.

The Library will be a nonprofit partnership that will work with many other organizations in San Diego, including data-oriented organizations like the Equinox Center, Open San Diego, universities, newspapers, government agencies, and other nonprofits.

Understand data needs. The Library staff will be directly involved with data-consuming organizations: talking to their staff, learning about their operations, and collecting their data questions. They will use traditional methods such as making phone calls, doing interviews, conducting surveys, and attending meetings, as well as running mailing lists and online forums.

Maintain a data repository. As Library staff learn about what data users need, they will work with data producers to acquire the data, organize it, and post it online. Sometimes acquiring data is as easy as visiting a government agency’s website, but sometimes more work is involved. Depending on the value of the data set, Library staff may issue FOIA or PRA requests, visit agencies in person, or work with universities to extract data from their collections. Once Library staff have the data, they will clean, organize, and catalog it, then post it in the online repository so that it is easy to find and use.

Build a community. One of the most important goals of the Library, and one that differentiates it from most Data Intermediaries, is community involvement. San Diego has a broad technical community, including data modeling experts, several predictive analytics companies, and tens of thousands of programmers. We also have many non-technical people who are looking for interesting ways to improve their home city. The Library will develop programs to allow these volunteers to be involved in collecting and using data for the benefit of local nonprofit organizations, and to better understand the important issues in the region.

What Will a Data Library Do for San Diego?

The most common benefits of Data Intermediaries are making the right data available, providing analysis services, and increasing the community capacity to use data. The San Diego Regional Data Library will pursue the same goals, although in a somewhat different way than other Data Intermediaries. The Library also intends to reduce data-related costs for local nonprofits and to use the San Diego technical community to inspire innovative expression and analysis.

Make the right data available. A regional data library will ensure that the greatest data needs are satisfied, with two very important outcomes: nonprofits will be better able to win grants, and they will improve their operational efficiency.

Provide analysis services. Most data analysis jobs are small, with the data user requiring a special demographic profile of an area, a thematic map, or a data-driven answer to a question of “How much?” or “How many?” These questions are too small and too infrequent for a nonprofit to hire a person with specialized skills, or even to find contractors who can take on small jobs, so these questions go unanswered. For larger jobs, most nonprofits don’t have the capacity to even determine whom to hire. A Regional Data Library will serve as a well-known resource that can provide the analysis services for small jobs and the guidance on how to complete the large ones.

Increase capacity to use data. For organizations with a persistent need for analysis and GIS services, the Library will offer training sessions in using the library, acquiring specialized data sets, performing basic data analysis and presentation, and using GIS tools.

Reduce costs. Many San Diego nonprofits have dataset and data analysis needs in common. Traffic data, walkability maps, special demographic profiles, and education statistics are all datasets applicable to many different nonprofit sectors. Because the Library staff will work across sectors, they can discover these opportunities for sharing data and analysis, reducing costs for everyone.

Inspire innovative expression and analysis. By working with a broad technical and civic community, the Library will recruit volunteers to acquire new data and develop new ways of analyzing data and presenting visualizations, similar to the work of DataKind or the Health Data Initiative. These data volunteer services have proven to be an effective way to give civic and social organizations the infusion of technical skill they need to solve old problems in new ways, while developing creative ways to display data to explain their mission and detail their successes.

Working with Other Organizations

Since all successful Data Intermediaries develop partnerships, the San Diego Regional Data Library must work with many other regional organizations. Some of these organizations are traditional partners of data intermediaries, some are specific to the San Diego area, and some are valuable because they broaden community involvement. These organizations include the Equinox Center, SANDAG, Open San Diego, universities, and city governments.

Equinox Center. The Equinox Center is a nonpartisan, not-for-profit research and policy center that publishes an annual Dashboard report of regional quality of life issues. Its mission and operation are very similar to those of a Community Indicators organization; like Community Indicator projects, the Equinox Center collects and reports on data about the San Diego region, but the Equinox Center does much more work on policy analysis and research than most indicator projects. The Equinox Center can both contribute to and benefit from the San Diego Regional Data Library, since the Library will benefit from the Equinox Center’s data analysis, and the Library can provide data to the Equinox Center.

SANDAG. SANDAG fulfills some of the role of a Data Intermediary, but it emphasizes land and transportation planning and provides data that is derived from the census. It does not serve as a clearinghouse for data in general, so it is not a source for data regarding health, the environment, children, education, or the other major topics of a data repository. Additionally, SANDAG is strongly affiliated with government, whereas most successful community-oriented data intermediaries are outside of government to avoid conflicts between community and government organizations.

SANDAG provides detailed demographic reports based on census data, as well as population and demographic estimates through 2050. The San Diego Regional Data Library can refer users to these profiles rather than replicate them internally, and can augment the datasets that SANDAG provides with additional data that SANDAG does not currently offer.

Open San Diego. Open San Diego is a community project that aims to make more data about San Diego freely available. The members of Open San Diego have deep technical experience, and the Open San Diego project can be a critical ally of any regional data library, as it has a very similar mission. While the San Diego Regional Data Library emphasizes the role of data consumers (nonprofits and civic and social groups) in determining what data is valuable to publish, Open San Diego has a strong connection to the technical community that can find innovative ways to use data, making the combination a powerful way to solve real social problems.

Universities. Universities are the most common partners for Data Intermediary organizations. SDSU and UCSD both have public affairs and urban planning departments that can be both producers and consumers of data. Both universities also have students interested in solving social problems. The Institute for Public Health at SDSU is an intermediary that connects health practitioners with academic research. Its work finds and generates data about health, most of it in the San Diego region. University programs such as these are natural partners of a Data Intermediary.

Building the San Diego Regional Data Library

The team working on this project has the technical capacity to build and operate the library, as well as connections with other Data Intermediaries that can guide us in the process, so the project is ready to recruit participants, build a coalition, and establish the initial management structure. The project needs to assemble three communities of participants, build the advisory and management boards, and raise money for the first phase. Over the next few months, we will be recruiting people to participate in the following steering committees and boards:

  • Data consumers, primarily nonprofit civic and social organizations.
  • Data producers, including city agencies, state agencies, and universities.
  • Technical volunteers, particularly analysts and programmers.
  • Advisory board, with members from the above three communities.
  • Management board, which will become the board of the Library after it becomes a nonprofit.

If you would like to help San Diego by increasing its civic and social organizations’ capacity to find and use data, please contact Eric Busboom at eric@civicknowledge.org or 858.386.4134.