As part of the publicly financed Swiss public broadcaster SRF, SRF Data is dedicated to providing a real service to the public. Apart from covering news stories with data-driven explainers and uncovering misconduct or corruption through data investigations, SRF Data also publishes the large majority of the code and data behind its stories. This often concerns governmental data that was either previously unpublished or available only in a format that was hard to handle for people without technical expertise. All code and data are published on the overview page https://srfdata.github.io, which leads to the respective stories and GitHub repositories. With this service, SRF Data is a major producer of open data in Switzerland: because the data is licensed under Creative Commons, it can be re-used and re-published by other journalists and the public.
What makes this project innovative?
Unlike at many other news organizations, including big ones like 538.com and the NYT, each published code repository is described in great detail. The consistent use of RMarkdown and the provision of all necessary raw data allow our analysis scripts to be fully reproduced and understood by laypeople. In the last year, a lot of additional effort was put into "full" reproducibility, meaning that a script can be executed several years after publication, as if it were frozen in time. To our knowledge, this practice is unique in data journalism. We link to two examples of reproducible scripts. The first is the data analysis behind our big interactive story about Roger Federer's 20th Grand Slam title (also submitted). All the graphics that made it into our final online piece were recreated in R, so readers can see exactly how we arrived at these results. The second is the statistical learning project behind our Instagram fake followers story. As we are among the first to use machine learning for an investigative story, we think that others can greatly benefit from us making our approach transparent.
What was the impact of your project? How did you measure it?
After publication, we often receive emails from fellow journalists or interested readers who want to build on our work or examine some details. We always refer them to our open data platform. Last year, this project was on the shortlist for "Data Journalism Website of the Year", which made us very happy. Still, we think it should also be honored for its open data impact.
Source and methodology
This varies from project to project; see the respective projects.
R & RMarkdown, Git and GitHub Pages. Most of the R code on srfdata.github.io is based on the reproducible R template by Timo Grossenbacher (https://github.com/grssnbchr/rddj-template). That repository also explains how to easily set up a GitHub Pages site like ours.
Julian Schmidli, Angelo Zehr, Pascal Burkhard