Big Data in the Social Sector: Promises and Pitfalls

By Kriss Deiglmeier

Every day, we create 2.5 quintillion bytes of data, so much that ninety percent of the world’s data has been created in just the last two years. But data is only valuable if you can do something with it. And, like any tool, data can be used for good or bad.

Making data-informed decisions is nothing new. But the scale of today’s data ecosystem challenges us to think creatively about new ways data can inform our work, policies, and programs. I’m writing to highlight a couple of examples that inspired me, and to share tactics that socially minded organizations and innovators can adopt to tap into data’s potential for good.

[Image: Crisis Trends website]

Promise and Problems

Data is easier to acquire and cheaper to store than ever before. This trend is democratizing access to information and increasing transparency between the information haves and have-nots. Data can tell us whether our work is having the impact we desire, reveal patterns, and even predict future needs among vulnerable populations and environments.

However, the immensity and diffusion of data, and the ethics surrounding it, are daunting. Data without interpretation is meaningless, yet extracting knowledge can seem intrusive and expensive, requiring skills from both the social and computer sciences. Some parts of the world remain on the wrong side of the digital divide, without comprehensive legal frameworks to govern data access and storage. We are at a new frontier, navigating uncharted terrain where conflicting priorities around privacy and public good converge.

Crisis Text Line and the Institute of Public and Environmental Affairs have taken different approaches to using data for social impact, each offering its own lessons for our work:

Crisis Text Line

Nancy Lublin founded Crisis Text Line to provide free, 24/7 emotional support for those in crisis. She has trained volunteer counselors around the United States to respond to text messages sent by people in crisis, helping them transition from “a hot moment to a cool calm” in dealing with issues ranging from eating disorders to suicide. In just 2.5 years of operations, Crisis Text Line has processed more than 7 million messages. Their data has volume, velocity, and variety. Patterns in the anonymous texts help Nancy and her team understand user needs and improve their service without compromising privacy.

These patterns can also inform research, policy, and programming far beyond their own organization. According to their website, there has not been a comprehensive study on youth and mental health since 1997. Crisis Text Line launched CrisisTrends.org to help us better understand the crises that Americans face, offering unprecedented insight into where and when these crises occur. This interactive, aggregate dataset can inform the public and the media, shape government and school policies, and drive academic research to help people beyond those who use the Crisis Text Line service.

The Institute of Public and Environmental Affairs

China has many environmental protection laws, but lax enforcement by state and local governments has hindered their effectiveness. Ma Jun founded the Institute of Public and Environmental Affairs (IPE) in 2006 to gather and analyze publicly available but hard-to-find data on environmental violations across China. His organization makes this data accessible and easy to use, so that press, investors, and citizens can hold suppliers, multinational corporations, and local governments accountable. He and his team have aggregated and analyzed this data to rank multinational brands by their environmental impact and to rank Chinese cities according to their level of environmental information disclosure.

By making data on environmental violations publicly accessible and recently introducing a mobile app to monitor air quality, IPE has empowered Chinese citizens and international consumers to pressure factories into compliance and inspire local regulatory action. IPE also works with companies and factories found in violation to develop more responsible environmental practices. Ma’s work is a great example of how data can be accessed, interpreted, and shared to tangibly change the behavior of firms, governments, and individuals.

[Image: IPE Waste Map]

There are many other examples of innovative uses of data for social impact, from HP Earth Insights’ monitoring of biodiversity in tropical forests, to the United Nations’ tracking of global patterns in refugee movement, to Fundación Paraguaya’s mobile data capture to inform its microlending strategy.

What Social Innovators and Organizations Can Do

Most of us have limited resources for complex data capture or analysis projects. For those yet to prioritize such work, what can we do to move in the right direction?

  • Identify and respond to staff needs. Ask your front-line team members where their pain points are at work. Data for data’s sake can be a source of staff frustration. Find the places where staff struggle to deliver on your organization’s mission, and see where data can make their work easier or more meaningful.
  • Start small. It’s easy to feel paralyzed by the magnitude of your vision or the complexity of what is possible. Start by defining the problem you most want to solve, and then find a couple of metrics that will help you know if you are on the right track. The answers you get will reveal which question to pursue next, and your data will grow over time.
  • Ensure you can access your data. Beware of custom solutions that require outside expertise to administer. You don’t want your data to be the hostage of an obsolete database, so make sure you can export it at will. Open source or mainstream solutions that can be customized are often better than something developed just for you.
  • Talk to partners and others working on the same issues. What questions do they ask? How are they tracking their outcomes and which indicators do they measure? Collaborating to build parallel data sets is valuable because it creates the future potential to aggregate your information for greater statistical significance.
  • Plan the feedback loop. Since data without interpretation is worthless, allocate time to analyze results, learn from patterns, and adapt your strategies based on what you learn. Ideally this is a regular part of operations and programming, not a once-a-year event.
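
To make the point about parallel data sets concrete, here is a minimal sketch of why pooling helps. It assumes two partner organizations record the same outcome indicator on the same scale. The organizations, the scores, and the `standard_error` helper are all invented for illustration and do not come from any real program.

```python
import math
import statistics


def standard_error(values):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(values) / math.sqrt(len(values))


# Hypothetical scores on a shared 1-10 outcome indicator (illustrative only).
org_a = [6, 7, 5, 8, 6, 7]
org_b = [7, 6, 8, 5, 7, 6]

# Pooling is only meaningful because both partners measure the same thing
# in the same way.
pooled = org_a + org_b

se_a = standard_error(org_a)
se_b = standard_error(org_b)
se_pooled = standard_error(pooled)

print(f"Org A alone: n={len(org_a)}, standard error={se_a:.3f}")
print(f"Org B alone: n={len(org_b)}, standard error={se_b:.3f}")
print(f"Pooled:      n={len(pooled)}, standard error={se_pooled:.3f}")
```

With more observations, the estimate of the average outcome becomes less noisy, which is what makes smaller effects detectable. The benefit depends on partners agreeing on indicators up front. If each organization defines or measures an outcome differently, combining the numbers can mislead rather than inform.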

Words for the Wise

Despite all its promise, big data is no panacea. Society’s biggest challenges are left to the social sector because markets have failed, legislation is lacking, and incentives are twisted. The most pressing social and environmental problems of our time come with no barcodes, and PayPal is not accepted, so we have a tougher path than Google or Amazon does. We do this work because we care, not because it is easy. Remember:

  • Not all people and problems are “data-ified.” Many of the environmental and social services we provide happen offline and are impossible to codify in a stream of 1s and 0s, yet they are of vital importance as we work to build a more sustainable and just society. Many people and places are not just offline, they are living without access to regular electricity and clean water. We can’t overlook these people or forget the urgency of meeting basic needs.
  • There is no substitute for the human touch. Nancy Lublin of Crisis Text Line emphasized her decision to retain human counselors at the core of her service, despite the rich information in her data set. No algorithm or automated process can match the empathy our work requires.
  • Experts are expensive and hard to find. It’s going to be tough to recruit data scientists away from top private sector firms, despite the heroic effort of the engineers who left Google to resolve the Healthcare.gov debacle. In targeted, short-term cases, a team of pro bono data scientists might be able to help. However, for the longer term, we need to dedicate more funding for data science and invest in the statistical and analytic skill set of the people already working in our field.

At its best, big data can be interpreted to help us allocate our resources to maximize social impact. It can help us evolve from what “might be working” to what has “maximum likelihood to work the best.” It can help us to prevent future challenges, to predict shortages, and to avoid disasters. This insight can direct our time, resources, and services more wisely. But unlocking the power of big data is not easy and it’s not cheap. We cannot expect nonprofits to add this to their list of things to do with existing resources, so we must encourage new funding, learn from the pioneers doing this work, and continue the spirit of creativity, innovation, and optimism that is needed in this sector.

Note: Amanda Greco, a true partner, contributed to and edited this piece.

I’d also like to acknowledge the Data on Purpose conference at Stanford University and presenters Nancy Lublin, Andrew Means, and Jim Fruchterman for many of the ideas summarized above.
