FiveThirtyEight is the most prominent data journalism site in the U.S., but in our fourth year since relaunching in 2014, we had to reinvent ourselves somewhat for the Trump era. We took on more forms of storytelling and more urgent stories and, of course, did it all in an empirically sound, data-driven way.
You want data?
We did a 12-part, statistically driven investigation of what really happened in the presidential election. We used pitch-tracking data to show that, despite denials from Major League Baseball, the baseballs were juiced. We stole a technique from machine learning to profile President Trump’s most rabid online following. We helped create 12 new tests to measure Hollywood diversity. We used data to help readers pick a World Cup team to root for, and to show the loooooong recovery ahead for hurricane-struck areas. We created an NFL game that allowed users to try their own hand at modeling and compete against one another. We showed how the legacy of slavery still manifests itself in how people die in the United States. We gave readers real-time, updating data to help them make sense of their world: on the NFL, Congress, the NBA, Trump, the Oscars, the job market and more.
Through it all, we’ve learned and grown as a newsroom. We’ve focused more on where we really add value: politics and policy, sports and science. There’s still a long way to go, but we’d suggest (with oodles of humility, of course) that no other site consistently produces data-driven journalism on such a variety of topics and in such quality and quantity. Maybe we should try to show that with data, but we’ll leave that judgement to you instead.
What makes this project innovative?
What was the impact of your project? How did you measure it?
Source and methodology
For data analysis we mostly use R, but we also use Python, Ruby, Stata and Excel. For interactive visualizations we use D3 and Node.js. For static visualizations, we often use D3 or ggplot2 with Illustrator, as well as some internal web-based tools we built using Node.js and React. For databases and backend interfaces we often use Ruby on Rails with MySQL or Postgres. For mapping we mostly use QGIS.
And, of course, we do a ton of reporting and basic research.
We also have many different bots that help us with our work by keeping track of different data sources. They communicate with our various databases and predictive models and interact with our journalists via Slack.
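For readers curious what a bot like this might look like, here is a minimal sketch in Python. It is an illustration only, not the newsroom's actual code: the function names, the example data source and the URL are all hypothetical. It assumes the bot is handed the raw contents of a data source, compares a hash of those contents with the last one it saw, and, if they differ, sends a message through a Slack incoming webhook, which accepts a JSON payload with a "text" field.

```python
import hashlib
import json
import urllib.request


def fingerprint(content: bytes) -> str:
    """Return a stable hash of a data source's raw content."""
    return hashlib.sha256(content).hexdigest()


def has_changed(previous, content: bytes) -> bool:
    """True when the content differs from the last fingerprint we stored."""
    return previous != fingerprint(content)


def build_slack_message(source_name: str, url: str) -> dict:
    """Build the JSON payload for a Slack incoming webhook."""
    return {"text": f"Data source updated: {source_name} ({url})"}


def post_to_slack(webhook_url: str, message: dict) -> None:
    """POST the message to a Slack incoming webhook (performs network I/O)."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(message).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        response.read()


def check_source(source_name, url, content, previous, notify):
    """Call notify(message) if the content changed; return the new fingerprint.

    `notify` is any callable taking the message dict, so the Slack call can be
    swapped out for a list's append method when testing.
    """
    if has_changed(previous, content):
        notify(build_slack_message(source_name, url))
    return fingerprint(content)
```

In practice, `notify` could be something like `lambda m: post_to_slack(webhook_url, m)`, with the returned fingerprint stored in a database between runs. Keeping the network call in its own function means the change-detection logic can be checked without touching Slack at all.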