Project description

With the election of a new president in France in May 2017, we asked ourselves many questions about the journalistic coverage of a full five-year mandate. Compared to five years earlier, many new data sources and content-production automation techniques had emerged.

The election of Emmanuel Macron gave us the opportunity, for the first time, to put an entire presidency under monitoring. Since the start of the mandate, we have been tracking and storing a large amount of information about French political life in a database: the agendas of the president and the prime minister, daily tweets from all MPs, their individual votes on each bill, their participation in parliamentary committees, etc.

We want to build the most complete database on French political life, updated live since the very beginning of the presidency. It feeds a series of eight data-driven investigations, “A Data sur la politique”, published by the French online news outlet “Les Jours”.

“Les Jours” is an independent subscription-based news outlet that covers the news through series of articles called “obsessions”. For our data-driven political “obsession”, we also created a new format, inspired by stories like those on Snapchat or Instagram, ideal for reading on mobile.

What makes this project innovative?

As journalists, we have often had to take stock of, for example, a political mandate. It was frequently at that point that we realized the data that would have allowed us to do so had disappeared over time, or was too massive to be realistically collected and analyzed. Our first innovation for this project was to reverse that logic: we carefully gather and store the data first and foremost, and let it feed our stories. The database produced the stories, not the reverse, as is so often the case.

We wanted to create the first data-driven tool dedicated to a presidency. From our database, continuously and automatically updated and reviewed by our journalists, we drew eight original angles for “Les Jours”, and many other angles remain possible. We have exploited data ranging from unwelcome interruptions during parliamentary debates to the affinities of MPs on Twitter, by way of the audience of their Wikipedia pages. It is a way to tell politics differently, to take the pulse of an eventful mandate on a regular basis, and to make a quantitative, factual assessment of the government's activity.

This innovative and deliberate approach required an equally innovative publication. The story format quickly imposed itself as the way to stage a series mainly built around data visualizations. We were largely inspired by a New York Times format ("Automated vehicles can not save the cities") and by practices established by social networks. We created a tool for producing these stories that a journalist without any coding knowledge can manage. The result is a mobile-first, highly visual format articulated around graphics, and a new way of writing for journalists.

What was the impact of your project? How did you measure it?

Les Jours, the publisher of "A data sur la politique", is a subscription-based online newspaper. In terms of social metrics, the first episode, which was free to access, accounted for over 80% of the 2,200 visits coming from Facebook. On Twitter, visits were more numerous (3,800) and better distributed across the different episodes of the series. The potential readership on Twitter, more technophile and receptive to innovative formats such as our story-style layout, was our target, so we expect that most of the new subscriptions to “Les Jours” were triggered from that platform.

"A data sur la politique" was the first series of articles published by “Les Jours” in a story format, and it was well received by the readership. According to the newsroom's analysts, both form and substance contributed to reader loyalty and conversion. The tool we developed to stage our stories has since been picked up by other journalists on the editorial staff of “Les Jours” for new series of articles. The series ranks among the most read on the “Les Jours” site, exceeded only by some of the site's historical “obsessions”. The reading and retention rates within the stories, satisfactory overall, improved after our decision to reduce the length and the number of slides. The format, reused for other series, is popular with Les Jours's readership. Qualitatively, "A data sur la politique" was very well received by the public, especially for its choice of simple, didactic graphics. The series helps establish the legitimacy of data-driven coverage of political news in France.

Source and methodology

Our data come from many sources. At the beginning, we focused on identifying the sources for which storing data as we went along had the greatest value. For example, the agenda of the President of the Republic, published weekly on the Élysée website, had no easily accessible archives after a few weeks (this has since changed). Our approach is to scrape and store this calendar every week, to keep simple access to the data for future stories. Other data sources are worth querying at a regular rhythm because of their volume: if we wanted to retrieve today all the tweets of the French MPs, the huge number produced in nearly two years of mandate would make the scraping very painful. By storing every tweet from MPs and government members, with its metadata, on a weekly basis, we anticipated this issue. We also collected more unexpected datasets that we had not necessarily identified at the start, or that did not need to be queried regularly. For example, we used the audience of MPs' Wikipedia pages to measure their popularity through this slightly offbeat indicator, and we scrape all the modifications and corrections made each day to MPs' Wikipedia pages to see whether the pages were “vandalized”, and by whom.
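As a rough illustration of this weekly-archiving approach, each snapshot can be appended to a CSV archive together with a capture date. This is a minimal sketch, not the project's actual code: the function name, fields, and file layout are hypothetical.

```python
import csv
import datetime
import pathlib

def archive_records(records, csv_path):
    """Append a weekly snapshot (a list of dicts) to a CSV archive.

    A capture date is added to every row so snapshots taken on
    different weeks remain distinguishable in the same file."""
    path = pathlib.Path(csv_path)
    # Assumes the set of keys stays stable between weekly runs.
    fieldnames = ["captured_at"] + sorted({k for r in records for k in r})
    write_header = not path.exists()
    today = datetime.date.today().isoformat()
    with path.open("a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if write_header:
            writer.writeheader()
        for record in records:
            writer.writerow({"captured_at": today, **record})
```

Appending rather than overwriting is what makes the archive useful later: the full history survives even when the source site only shows the latest week.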

Technologies Used

Most of the data sources we query regularly are collected by Python scripts that we have improved over time. Concretely, we use the following techniques: HTTP requests, page-consultation automation with machine-controlled browsers, and API queries where they exist. The data is stored in distinct CSV files organized by a meta-database. Analysis, data mining, and the search for story angles were done in Tableau. To build the story format, we created a semi-WYSIWYG editor that lets the non-developers on the team work on their own. We use common web technologies, mainly JavaScript and PHP. The editor produces a public version of each story that is easy to navigate, optimized, and responsive, ready to be uploaded to a server.
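A "meta-database" over a set of CSV files can be as simple as an index file that records what each dataset contains and when it was last refreshed. The sketch below is a hypothetical illustration of that idea, not the project's actual tooling; the file names and columns are assumptions.

```python
import csv
import datetime
import pathlib

def build_index(data_dir, index_name="_index.csv"):
    """Write a small index CSV describing every dataset CSV in a
    directory: file name, row count, and last-modified date."""
    data_dir = pathlib.Path(data_dir)
    entries = []
    for f in sorted(data_dir.glob("*.csv")):
        if f.name == index_name:  # don't index the index itself
            continue
        with f.open(encoding="utf-8", newline="") as fh:
            rows = sum(1 for _ in csv.reader(fh)) - 1  # minus header row
        mtime = datetime.date.fromtimestamp(f.stat().st_mtime).isoformat()
        entries.append({"file": f.name, "rows": max(rows, 0), "updated": mtime})
    with (data_dir / index_name).open("w", encoding="utf-8", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["file", "rows", "updated"])
        writer.writeheader()
        writer.writerows(entries)
    return entries
```

Rebuilding the index after every weekly scrape gives journalists a single, human-readable overview of which datasets exist and how fresh they are, without a full database server.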

Project members

Wedodata's team and Les Jours


Additional links

