Project description

I’ve been a data journalist at Spiegel Online since 2015. We’re a rather small team of generalists, which means I usually get to work on every piece of a project from start to finish. While this constantly challenges my skills (and I can imagine this might change if we grow one day), the approach minimizes friction throughout the project: you’re simply more familiar with the data when you’ve scraped, analyzed and visualized it yourself. During the past year, these were the most important projects I was involved in:

1. Football fan atlas
Knowledge about friendships and rivalries in German football is largely anecdotal. We collected 60,000 survey responses, giving us an unprecedentedly deep look into the field. The resulting story is rich in visualizations, and the collected data has since been shared with multiple scientists.

2. Black box Schufa
Schufa is the most influential credit bureau in Germany. We investigated its scoring algorithm using 2,000+ credit reports that consumers requested during a crowdsourcing project. We found that many people are classified as a credit risk through no fault of their own.

3. World Cup Squads
For the 2018 World Cup, I analyzed the squads of all participating teams and calculated highly informative new metrics such as “top-level experience per player”. The aggregated statistics were reported in a one-off analysis and at the same time formed the foundation for a widget we used 100+ times throughout the World Cup, showing basic stats as well as a continually updated evaluation of each team’s chances.

4. Shot Maps
Shot maps visualize every shot attempt during a football match and thus help readers understand the game. The underlying data is piped in from a live feed, and the widgets can be built and integrated into our CMS with just a few clicks. They are used hundreds of times a year, often within minutes of the final whistle.

5. Explanatory election maps
As a national news site, we regularly report on state elections. To convey the most important information at a glance, we developed a new approach: a dense static visualization that focuses on the key results and can be generated reproducibly with little effort.

6. The spatial distribution of ATMs
Cash payments are still very popular in Germany, but the number of ATMs is declining steadily. For this article I scraped and analyzed the locations of all ATMs throughout the country. It turned out that some operators report misleading figures about the number of their locations and that credit unions play a critical role in supplying rural areas.

7. How a speed limit could save lives
The potential effects of a speed limit in Germany remain a research gap. With a spatial analysis based on fine-grained open data, I was able to conduct a model calculation showing that up to 140 deaths per year could be avoided if Germany introduced a speed limit on the Autobahn.

8. How Bayern Munich dominates the Bundesliga
Bayern Munich has won the Bundesliga six years in a row. Directly after their victory in the title race, I published a visual explainer of their dominance, and of other European teams that dominate their domestic leagues.

What makes this project innovative?

1. Focus on narrative
Over the past years, data journalism has matured as a field. Opulent interactive visualizations and technical experiments have given way to well-told stories in which visualizations blend in naturally. For me, this meant that the two most data-heavy projects I was involved in last year were reported in completely different ways. The story on credit scoring included months of complex data analysis, yet in the interest of our readers we decided to focus strongly on intelligibility and narrative, using very few visualizations due to the complexity of the technical matter. For the story on friends and rivals in German football, it was the other way round: we wanted to present the depth of the data set in the most visual way possible and produced a wide variety of chart types, each explaining one aspect at a time, with only very short text sections in between.

2. Reusable content
Over the last year, I built five data-driven, dynamic visualizations that are now used in roughly 500 articles per year on our site. The topics range from football to e-mobility to politics, and all of these widgets provide meaningful context for our everyday reporting. Their repeated usage allows me to put a lot of effort into design and development, while including them in an article takes just a few clicks and can be handled by any reporter.

3. More than just beautiful maps
Maps have always played an important role in data journalism. They enjoy great popularity, and the tools required to build them have become much more accessible in recent years. However, having a background in spatial analysis, I’m under the impression that too many projects with a spatial component remain superficial: they simply show where things are instead of explaining or analyzing spatial context. Digging deeper in these cases can lead to unprecedented insights, as my stories on the speed limit and the spatial distribution of ATMs demonstrate.

What was the impact of your project? How did you measure it?

During the past year, the stories I published were read by three million unique visitors. The average engagement time was 2:30 minutes, an exceptionally high value for a news-driven site like Spiegel Online. My story on how a speed limit could save lives was one of the most discussed Spiegel Online stories of the past year, with 25,000 Facebook interactions. Sometimes our stories have political impact as well: after our investigation into the scoring algorithm of the German credit bureau Schufa, the Federal Minister of Justice and Consumer Protection called for more transparency for consumers.

Source and methodology

There are too many sources to name them all here, and every project is different. All of my bigger projects include at least a separate FAQ on sources and methodology. For politically sensitive issues, I try to publish all underlying data together with well-documented, completely reproducible code on GitHub whenever possible.

Technologies Used

I use the R programming language (relying heavily on tidyverse packages) for scraping, processing and analyzing data, as well as for producing preliminary visualizations. Simpler final graphics are produced in Adobe Illustrator (static versions) or Highcharts (simple interactive charts), whereas more advanced visualizations are realized with D3.js. For maps I work with a combination of QGIS (data processing, exploration and styling of static maps) and Mapbox.js / Mapbox GL (interactive maps).

Project members

For seven out of eight projects, I was solely responsible for every step, from data collection and scraping through data processing and analysis to front-end design and visualization. I did the reporting myself on four of them and teamed up with reporters from other departments on the other three. The story on credit scoring is an exception: it was a larger team effort, in which I was mainly responsible for data processing and analysis, and partially for design and reporting.


Additional links

