As Italy was approaching a general election, a major scandal hit the Five Star Movement (FSM). After entering both Houses of Parliament for the first time in 2013, FSM’s representatives made a point of giving back part of their salaries to fund small businesses. A TV show called "Le Iene" found out that some of them were only pretending to give this money back.
We were able to scrape all the data from Tirendiconto.it, a website where MPs reported their expenses and the money they were giving back.
We found that since 2013 expenses had risen, while the amounts given back had dropped.
What makes this project innovative?
A source who asked to remain anonymous sent us Python code that allowed us to scrape the data. Still, we had to verify the data. So we ran the code ourselves for an eight-hour scrape, then compared the data we had scraped with the data our source had sent along with the code. They matched. But since neither my editor nor I know Python, we decided to be transparent. We not only made the data available on data.world (https://goo.gl/dka582), we also published the code on GitHub (https://goo.gl/d2RUxa) so that everyone could use it.
What was the impact of your project? How did you measure it?
The story was shared by 3,700 people on Facebook, and the dataviz was seen by at least 25k people (Wired.it averages 50k visitors per day).
Source and methodology
As I said, we received the scraping code from a source to whom we granted anonymity. To check the data, we ran the code and scraped the data ourselves. We also decided to go public with both the data, available on data.world, and the code, available on GitHub.
We used Scrapy to run the scraping code. OpenRefine and LibreOffice were used to clean and compare the data. Tableau Public was used to visualize it.
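We did the comparison by hand in OpenRefine and LibreOffice, but the same check could probably be scripted. The following is only a minimal sketch of that idea, not the code we published: the column names (mp, month, returned) and the rows are hypothetical placeholders, not the real Tirendiconto.it fields or figures. It indexes each dataset by a key and reports rows that are missing on either side or that differ.

```python
import csv
import io


def load_rows(csv_text, key_fields):
    """Index CSV rows by a tuple of key columns (hypothetical column names)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {tuple(row[k].strip() for k in key_fields): row for row in reader}


def compare_datasets(ours, theirs):
    """Return keys missing on either side and keys whose rows differ."""
    shared = set(ours) & set(theirs)
    return {
        "only_ours": sorted(set(ours) - set(theirs)),
        "only_theirs": sorted(set(theirs) - set(ours)),
        "differing": sorted(k for k in shared if ours[k] != theirs[k]),
    }


# Placeholder rows for illustration only, not real figures.
OUR_SCRAPE = "mp,month,returned\nMP A,2013-05,100\nMP B,2013-05,200\n"
SOURCE_COPY = "mp,month,returned\nMP B,2013-05,200\nMP A,2013-05,100\n"

KEY = ("mp", "month")  # assumed to identify one report uniquely
report = compare_datasets(load_rows(OUR_SCRAPE, KEY), load_rows(SOURCE_COPY, KEY))
print(report)
```

If all three lists come back empty, the two datasets match regardless of row order. In practice the files would be read from disk rather than from strings, and the key would have to be whatever combination of columns uniquely identifies a report.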
I must credit the anonymous source, as well as Andrea Borruso at OnData for helping me with GitHub.