Data-driven journalism for local newsrooms: it’s possible!

As local American newsrooms go through difficult times, plagued both by reduced staff and lower budgets, Big Local News collects, cleans and shares data so that journalists and researchers can write stories about policies that affect local communities and how institutions are operating.

By aggregating data-sets from many locations and then sharing that data, the Big Local News project sets out to address the phenomenon of ‘news deserts’ and help strengthen local newsrooms.

“The focus is increasingly on collaboration, especially if it can expand reach or impact. That collaborative mindset will enable journalism work that otherwise could not be achieved,” said Cheryl Phillips, Professor in Professional Journalism at Stanford University, in charge of Big Local News, and also a speaker at the GEN Summit 2019. In this interview, she answers questions on how the use of data is vital for journalism in all newsrooms, no matter their size.

GEN: What is Big Local News (BLN)? Where does it stem from and what are your goals? How would you explain it to non-American journalists? Do you want to solve the issue of “news deserts” in the US through this initiative?

Cheryl Phillips, in charge of Big Local News at the Stanford Journalism Program: Our project aims to collect, process and share governmental data that are hard to obtain and difficult to analyse; partner with local and national newsrooms on investigative projects across a range of topics; and make it easy to teach best practices for finding stories within the data.

The impetus for Big Local News came from my time as a local journalist, most recently at The Seattle Times. Over the years, I have seen time and again that once a data-driven accountability project publishes, it may have local impact, but the effort often ends there. As local news organisations become more strapped, it grows harder to continue to do the important journalism that helps shift policy, change laws and hold government accountable. What’s more, when that journalism is done, it is a challenge to ensure that five years, or even one year in the future, someone can return to the subject with new data and measure change.

Big Local News tries to address those pain points. We:

  • Collect local data from a variety of governmental sources.
  • Add specialised data acquisition and cleanup tools, making it easy to share and teach the processes for working with the data.
  • Make the data available to local, regional and national reporters through a sharing platform.
  • Archive all data via the Stanford Digital Repository, managed by the Stanford University Libraries, with a commitment to long-term stewardship.

Our goal is to help address the problem of news deserts and to help bolster journalism everywhere. By aggregating data-sets from many locations and then sharing that data, we think we can help journalism do better local reporting and regional and national reporting too.

Have any local newsrooms published stories with the help of the Big Local News project yet, or is it too early? If so, can you cite any of those stories?

The Stanford Open Policing Project was in essence the pilot for Big Local News, and as part of that project, local journalists reported on disparities and potential bias by state patrols in stops and searches across the nation.

Big Local News trained more than 100 journalists on how to analyse the data and numerous stories have been published. We are in the midst of releasing the next step in that project — more than 200 million records, this time around with data on police traffic stops in 50 U.S. cities.

We again will be training journalists in how best to analyse the data. We also have moved into other areas of coverage. We are collecting voter registration and history data, housing-related data and fiscal performance data on local governments and nonprofits.

In the fall, I taught a Big Local News class. As part of that class, one project team published an explanatory project related to the devastating 2018 forest fires. That project was published by the Bay City News, a local wire service, as well as multiple other journalism sites. Big Local News also worked with ProPublica, Newsy and the Center for Investigative Reporting/Reveal to archive data on rape clearances as part of that award-winning project.

With media losing advertising revenues to digital platforms, many local newsrooms no longer have the resources for investigative and public service reporting. How do you see this playing out?

I am increasingly concerned by the dwindling resources for investigative reporting. That’s why we created Big Local News. But I’m also hopeful. The landscape of how investigative and accountability journalism is accomplished has shifted dramatically. The focus is increasingly on collaboration, especially if it can expand reach or impact. That collaborative mindset will enable journalism work that otherwise could not be achieved.

Based on your experience, what are the main difficulties for smaller/local newsrooms when using and collecting data for an in-depth story? Financial constraints, lack of time, etc.? Is BLN based on money or on training and mentoring?

The main difficulty is time — or the lack thereof. Daily reporters have to turn around stories quickly and often. If they work on longer pieces, it’s because they are dedicated and put in extra hours. But many of these same journalists have little in the way of data journalism training. If they do have some training, it is still a challenge to devote time to one project. Even major regional news organisations often have only a handful of journalists who understand how to analyse data. A key sticking point is that data comes in messy formats and cleaning and processing the data is time-prohibitive. Big Local News works to help take care of that step — cleaning and normalising data as much as possible. We also are working to build in workshops where we can teach journalists about data we collect. And we are building a platform which will make it easier for journalists to share data pre-publication with collaboration partners.

Big Local News also is building out story recipes: how-to guides for analysing key data for journalistic purposes. We are working with Workbench (http://workbenchdata.com/), which is a platform where journalists can analyse data without having to code. For example, our fires analysis has been replicated in Workbench, with the goal being that this next fire season, reporters can follow our story recipe in Workbench and add invaluable context about the cost of wildfires in their region.

How do you teach journalists, or up-and-coming journalists, to work with data? Does it require specific training, mastering analytical tools for example? Are you facing any difficulties?

Our goal is to teach journalists how to use the data we collect, or partner with news organisations to collect. For example, with our policing data, we introduce journalists to the R programming language. For the fires story, we use Excel, Python and Workbench as ways into understanding the data. By focusing on the data we have collected, journalists are not just learning how to analyse the information but they are working on a possible story. It makes it more interesting, and it’s an easier sell to editors. The difficulties have to do with time. Journalists need time to learn and they need to keep learning. That’s one of the reasons we also are focusing on building out tools that will make it easier for journalists to collect and analyse data.

Is it easier to train a developer interested in journalism or a journalist with some development skills? What are your priorities?

I think this is a chicken and egg question. It’s easiest to train someone motivated to make a difference through journalism and news stories, regardless of whether they came into it from programming or beat reporting.

You gather data for Big Local News. How is it used? Newsrooms can access your gathered data for free, but are you also helping local newsrooms ‘learn’ how to get hold of their own statistics? How can newsrooms join your project and get access to the data?

We are partnering with newsrooms on specific projects where they are collecting data with the idea that we will then archive that data and make it available upon publication of the stories.

We are initiating our own data collection efforts, such as the policing work, and working with local journalists to help them collect similar data.

Lastly, when we see data journalism work out there, we ask to archive that data and any code with full credit, so that it isn’t lost to time or web site redesigns.

Have you received any feedback about the project? What is the main benefit you see for local newsrooms? How can data journalism be developed at a local level when news organisations are in big trouble?

We are a small organisation at the moment, but all three of the full-time data journalists are working at full tilt because we have had so much interest from news organisations who want to partner with us. We also have two graduate students working on projects and a growing cohort of interested volunteers. It’s this steady stream of interest that tells us we have tapped into a need. The journalists who have reached out to us have a desire to collaborate in a way that leverages the knowledge base of the local reporters and the data negotiation and analysis skills of the Big Local News team.

Will you only target smaller newsrooms or are bigger ones also joining? How can they work together? Is collaboration the keyword of BLN?

We work with all sizes of newsrooms as well as programs at other universities, such as the University of Maryland, for example. Students there played an important role in negotiating for local police data. The metric for us is whether the work starts with an eye toward local impact that can be extended. In many cases, this means that some of the local data we collect and analyse can also be aggregated for regional and national patterns and stories. And none of this can happen without collaboration.

James Hamilton, who is running the Stanford Journalism Program, has evaluated the cost of an in-depth story: $300,000 and 6 months of a reporter’s time to do a deep dive into public interest issues like crime and corruption. Is Big Local News enough to ensure proper coverage of sensitive issues and the ability of news outlets to perform their traditional ‘watchdog’ role?

No one organisation is enough. We hope Big Local News will help fill a gap in the news ecosystem, though, and that in doing so it will enable more accountability journalism.

Do you see Big Local News being implemented in other countries, or is this mainly a topic for American journalists and the American public system?

We’re starting in the U.S. but that doesn’t mean this concept doesn’t work elsewhere. DocumentCloud, which enables journalists to share documents and partners with MuckRock to request documents, is another example of how collaborative communities can change journalism.

Internationally, ICIJ’s work with the Panama Papers is all about sharing information among journalism partners. And look at what ICIJ did with NBC and USA TODAY on medical devices. Journalists want to change laws, policies, and lives. The best way to do that often is through figuring out how to marshal our resources to best effect. That means sharing the hard work of data collection and analysis and then letting the journalism organisations tell the best stories they can.


Cheryl Phillips has been teaching journalism at Stanford since 2014. She founded Big Local News, part of the Stanford Journalism and Democracy Initiative, in September 2018. She previously worked at The Seattle Times for 12 years. Her roles included serving as data innovation editor, deputy investigations editor, assistant metro editor and investigative reporter. She was twice on teams that were Pulitzer finalists. She taught data journalism and data visualisation at the University of Washington and Seattle University. She served for 10 years on the board of Investigative Reporters and Editors and is a former board president.

Cheryl Phillips is a jury member for this year’s Data Journalism Awards competition.