Data Philanthropy: Inside Tech’s Version of the Legal Clinic
If some of the world’s most well-resourced companies are still struggling to become data-driven, where does that leave nonprofits?
One company thinking about that question is Two Sigma, a New York-based hedge fund that uses big data and artificial intelligence to inform its investing. In 2014, a group of Two Sigma employees founded the “Data Clinic,” a tech alternative to the legal clinic that offers nonprofits, government agencies and mission-driven organizations pro bono tech and data support.
The work of the Data Clinic can be classified as “data philanthropy,” a relatively new form of corporate giving in which companies share anonymized datasets with researchers, academics and nonprofits to benefit the public good. Two Sigma takes a more active approach to data philanthropy, donating its data scientists’ and engineers’ time and talent to projects and to the creation of open-source tools for cleaning and curating datasets.
Rachael Weiss Riley, the clinic’s director, said this hands-on approach is designed to encourage nonprofits and other organizations to adopt a new mindset toward data.
“We try to shift the way people think about data from something out of reach and a little intimidating to just another tool they can use,” said Weiss Riley.
Built In recently spoke with Weiss Riley to learn more about the clinic’s structure, projects and the future of data philanthropy.
Rachael Weiss Riley, Director of the Data Clinic at Two Sigma
How does the Data Clinic source projects and what does the scoping process entail?
The Data Clinic consists of a core team that works full time on sourcing, scoping, researching and managing client projects and our open-source tooling verticals. Projects are sourced through a variety of channels: employees, direct communication with an organization and referrals from past partners. We then schedule an intro call to learn more about the organization, its mission and challenges, and whether we can add value.
Scoping out a project is often the most time-consuming part of the process. The key is getting organizations to express what would help them make a bigger impact and then seeing if we can reframe that as a research question. We then need to determine if we can answer the question with data they have or with public and open data. Sometimes the data just doesn’t exist, and this is where we start a conversation about the information they need and how best to start collecting it. Once we align on the scope, we create a dedicated team of Two Sigma employees and Data Clinic core staff to support the project.
Does the Data Clinic offer training or post-project support for organizations after their partnership with Two Sigma ends?
After the handoff, we remain in contact with our partners to support them if they have additional questions, and we check in to learn more about how the deliverables were used and any challenges they encountered. Something we do throughout our projects is empower people to use data and tech to their advantage. We try to shift the way people think about data from something out of reach and a little intimidating to just another tool they can use.
A great example of this would be our work with the Environmental Defense Fund to use data to drive oil and gas well management. A significant part of the work was enabling a cultural shift within the EDF to convince them that data-driven strategies are not only useful but don’t have to be a huge resource drain. We used open data, and this project was in effect a low-lift exercise that kick-started their journey to use data for impact.
Can you talk about the work the Data Clinic has done during the pandemic? In addition to COVID-19 projects, the team has also done some work related to the census, right?
We began exploring census response rates to understand how the count was shaping up and to help in outreach efforts. However, once people started leaving New York City, we wanted to see if this exodus from higher-income neighborhoods was also reflected in the data, and sure enough, it was. Neighborhoods that typically had high response rates were falling short of their 2010 expectations, perhaps indicating that residents who had left were filling out their census info at a different or secondary address, or maybe not at all.
We have also been looking into subway accessibility and recently released several open-data products to make this data easier to combine and use in analysis. As NYC slowly began to reopen, we pivoted slightly to see if we could quantify subway train “crowdedness” to help people navigate their commute, and we expect to have a beta release of this tool in the next week or so. We are also in scoping discussions with several orgs on COVID-19 recovery work but cannot speak publicly just yet.
How do you see data philanthropy evolving in the next five years?
It’s been really encouraging to watch both the creation and expansion of different data and tech-for-good initiatives. There has been a recent and welcome shift away from calling our work “philanthropy” when really the value add exists in both directions. Yes, it is the right and socially responsible thing to do, and this type of skills-based volunteerism should be an integral part of any corporate social responsibility program. But this perspective misses the fact that it’s also good business. The benefits to Two Sigma include improved employee recruitment and retention, learning and development opportunities for volunteers, and the burnishing of our scientific and social reputations.
Ultimately, we hope initiatives like ours are not needed as data and tech are democratized and made accessible more broadly, but realistically we are clearly a long way off from that day. In the meantime, these types of public-private, corporate-nonprofit partnerships are needed to tackle complicated and entrenched social justice issues.
Responses have been edited for length and clarity.