Stuttgart is not only home to the headquarters of Mercedes-Benz and Porsche; it is also the city where legal PM10 (fine dust, or Feinstaub in German) limits have been exceeded more often than anywhere else in the EU. State courts have even imposed driving bans for days with unfavourable weather conditions. Feinstaub is a highly controversial issue: many people work in the car manufacturing industry, which is protected by politicians of all parties. On the other hand, there have been huge street protests against pollution. (Along main roads, car traffic produces about 50 percent of fine dust emissions in Stuttgart.)

Our data project “Feinstaubradar”, for which we developed the first mockup during the GEN hackathon at Süddeutsche Zeitung in 2016, aims to close an information gap in this discussion. The state-run LUBW agency, which is in charge of officially measuring air pollution, has installed no more than 10 sensors in the greater Stuttgart area, most of them (in compliance with EU law) along main roads. These measured values do not allow for a sound judgement of PM10 pollution in the areas where people actually breathe all that bad air, i.e. in small towns and residential areas.

“Feinstaubradar” uses data from more than 750 privately installed PM10 sensors. The sensors were developed by the local Open Knowledge Foundation group, but it is hundreds of individual citizens who build and maintain them, mostly on the facades of their private homes. They provide live PM10 data 24/7 for almost all of Stuttgart’s city districts and the surrounding towns. “Feinstaubradar” makes these data accessible and understandable for the 2.3 million citizens in the area, who make up the audience we target with this project.

“Feinstaubradar” consists of two components: (1) a map where you can access live data, follow the development of PM10 levels in your community during the day and visualize immission levels for every day since the launch of the project.
(2) Additionally, we have trained a text automation software (provided by the Stuttgart start-up AX Semantics) to produce PM10 reports for 40 local and hyperlocal communities. These reports are enhanced with the official LUBW data plus a prognosis for PM10 levels.

“Feinstaubradar” therefore aims to fulfill three goals: (1) raise attention for PM10 pollution and broaden the scope of the discussion, (2) provide citizens with detailed information on pollution in the places where they live or work, (3) report on a hyperlocal and daily basis what level of air pollution can be expected during the day.

In the initial project, there was no monetisation plan apart from generating traffic for our ad-supported website. In the course of the project, however, we received an offer from the tourist destination Bad Hindelang, which has now committed to a 12-month advertising period on our website with additional poster campaigns throughout the city. This contributed to re-financing the costs of the project (~30 000€, staff costs excluded).
What makes this project innovative?
“Feinstaubradar” is, first and foremost, a cooperation between civil society and a media organization. We build on a citizen science project and use a massive amount of data generated by 750 individuals in the Stuttgart area, which is innovative in itself and makes for a fine example of “sensor journalism”. Because of the huge number of data requests and for reasons of scalability, we have used a serverless cloud architecture (provided by Amazon Web Services) to set up a flexible and reliable technical environment that can handle more than half a million API requests per day (plus a lot of data cleaning and rendering). We have gained valuable experience with this fast and powerful data architecture, which can be reused in our next projects.

A second innovative approach concerns the use of text automation software. We have used software developed by AX Semantics that allows editors to train a text engine that creates a potentially unlimited number of reports based on structured data. This software can intelligently recognise semantic structures and could even translate our reports into more than 20 foreign languages. We use “Feinstaubradar” as a use case to gain experience with this innovative way of reporting, which is particularly useful if you want to create personalized or hyperlocal content. In our opinion, the project also showed that text automation can meaningfully be used for “hot” journalistic topics, in contrast to the usual reports on stock markets or upcoming events.

We are planning to use text automation on a much larger scale in the future for a broad range of topics. In the “Feinstaubradar” case, the text automation software proved functional for our goal of producing hyperlocal pollution reports on a large scale (>2 500 reports / month).
What was the impact of your project? How did you measure it?
“Feinstaubradar” reached 30 000 users on the weekend it was launched. After that, it drew 1 000 to 2 500 users per day. We also received a tremendous amount of feedback from readers, who informed us about possible sources of fine dust in their neighbourhoods – particularly following the detailed analyses that we produced and published ourselves. Apart from these reactions, our project has massively contributed to the further spread of PM10 sensors in the Stuttgart region: when the project was launched in early November 2017, there were 300 of them; by February 2018, this number had increased to over 750.

The publicity of privately measured data (and our official requests) increased the pressure on the state agency LUBW to report more official air pollution data to the public. Apart from opening an API for those data, LUBW massively improved its website and the amount of official pollution data published there. This, in turn, contributed to a significant increase in press reports on the issue and therefore in public awareness.

A hackathon that we organised in January 2018 in the context of “Feinstaubradar” produced tremendous results. Among other things, big data experts checked the quality of our air pollution forecasts (it is pretty good), and the mockup of a gamification app was developed. The resulting “Feinstaub App” was published in the App Store in March 2018. Additionally, the hackathon was a major networking event.
In the aftermath, a colleague of ours started developing a PM10 Facebook chatbot based on our data, and the local public transport data officer got in touch with an AX Semantics representative, aiming to start a project on automated reports about the punctuality of the public transport system.

In terms of internal impact, the project produced substantial learnings on the use of big data in the newsroom – not only for those directly involved but also for leading figures in our hierarchy, including our CEO, who actively supported the project. We have also shared our experience with colleagues in media reports and within our news corporation, including the web developers of Süddeutsche Zeitung. “Feinstaubradar” will also be featured in the programmes of journalism conferences, among others Netzwerk Recherche and SCICAR.
Source and methodology
Our main data source is the PM10 database of OK Lab Stuttgart, a local group that is part of the Open Knowledge Foundation. We access PM10 levels measured by the OK Lab’s sensors through their Luftdaten API. Each sensor sends a PM10 measurement every few seconds. Every 5 minutes we retrieve data from each sensor, clean them (i.e. remove implausibly high or low PM10 values and flag sensors that are not working), accumulate and check the data for possible errors, and save them to our database, where we compute a range of PM10 levels, e.g. hourly and daily averages for the areas covered by our “Feinstaubradar”.

We also receive official PM10 data collected by the state-run LUBW agency. We have been granted access to their data on an FTP server, where we can retrieve it on an hourly basis. Our third data source is Kachelmannwetter, a weather data service. They are so far the first and only company to provide hyperlocal forecast data on air pollution. We licensed these forecasts for the area covered by our “Feinstaubradar”.

All these data are merged and accumulated on the basis of predefined areas and sub-areas, with historical data being archived for future analyses.
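The cleaning and aggregation step described above can be sketched as follows. This is a minimal illustration, not the production code: the plausibility bounds, the broken-sensor heuristic and the function names are assumptions.

```python
from statistics import mean

# Assumed plausibility bounds for raw PM10 readings in µg/m³;
# the thresholds actually used in production are not published.
PM10_MIN, PM10_MAX = 0.0, 1000.0

def clean_readings(readings):
    """Drop implausibly high or low PM10 values from one sensor's batch."""
    return [r for r in readings if PM10_MIN <= r <= PM10_MAX]

def sensor_looks_broken(readings):
    """Heuristic: a sensor stuck on one constant value is likely dead."""
    return len(set(readings)) <= 1

def hourly_average(readings):
    """Aggregate one sensor's cleaned readings into an hourly PM10 value."""
    cleaned = clean_readings(readings)
    if not cleaned or sensor_looks_broken(cleaned):
        return None  # discard this sensor for the current hour
    return round(mean(cleaned), 1)
```

Per-sensor values like these would then be averaged again per district to yield the area figures mentioned above.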
Technically, "Feinstaubradar" was mostly realized with Amazon Web Services. In preparation for this project, we quickly realized that we needed a solid, scalable data storage solution. In the beginning we had to cope with 300 PM10 sensors. Since we anticipated that their number would increase over time (as of today there are over 750 sensors), we wanted to be prepared to maintain scalability and short response times for the future of the project.

We also needed to access data from multiple applications (primarily the map and the text engine, with further applications to come). So we decided to use an API as a single point of truth (SPOT) to retrieve and store data, so that we could access the data through the same interface from different applications. This was realized with AWS API Gateway, which also enabled us to conveniently secure the API requests with API keys and provides extensive caching mechanisms. The data is then written to and retrieved from AWS DynamoDB through AWS Lambda functions.

Since we had to deal with a number of different data sources and formats, we needed a central worker instance able to consume, accumulate, compute and send data to our database through the API. This was realized with a PHP application hosted in a Virtual Private Cloud in an AWS Elastic Beanstalk instance. We didn't want this worker instance to be accessible over the internet, so the VPC was configured to only accept connections from our company's local network.

After collecting and accumulating the data, we send it to the text engine provided by AX Semantics. With the help of their proprietary backend and ATML3 syntax we created the templates for the reports, which we import into our own CMS.

The map visualization was created with AngularJS and Angular Material.
It consumes its data through our AWS API Gateway and is hosted in an AWS S3 bucket. In front of both the S3 bucket and the API Gateway we use AWS CloudFront as a CDN and to provide further caching for the API Gateway.
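The write path through API Gateway and Lambda into DynamoDB could look roughly like the sketch below. The handler name, item schema and key names are hypothetical illustrations (the real schema is not described above); the table object is injected so the handler can be exercised without AWS credentials.

```python
import json

def make_store_handler(table):
    """Build an API-Gateway-style Lambda handler that writes one PM10
    reading to `table` (e.g. a boto3 DynamoDB Table resource).
    The item layout below is an assumed example schema."""
    def store_measurement(event, context=None):
        body = json.loads(event["body"])  # API Gateway proxy payload
        table.put_item(Item={
            "sensor_id": body["sensor_id"],   # partition key (assumed)
            "timestamp": body["timestamp"],   # sort key (assumed)
            "pm10": body["pm10"],
        })
        return {"statusCode": 201, "body": json.dumps({"stored": True})}
    return store_measurement
```

In production the table would come from `boto3.resource("dynamodb").Table(...)`, and API Gateway would check the API key before the event ever reaches the handler.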
Christian Frommeld, web developer