I am one of two data journalists at The Daily Telegraph in London and this is my personal portfolio. The links I’ve submitted are some of the best examples of my work this year, a year during which I have refined the way I do the basics of data journalism while pursuing innovation through new methods of analysis and presentation. These stories cover a huge breadth of topics and include a wide variety of techniques, from predicting the UK General Election with machine learning, to estimating football transfer values using regression modelling, to looking at State of the Union speeches through the lens of natural language processing. There is a mix of long- and short-term work, of reactive news, analysis and investigation.

Central to all this work is the use of visualisations to display the data in question. This year has seen me produce a wide array of map and chart types, both static and interactive (but always mobile first), all with the goal of helping readers better understand the story they are reading. These examples are designed to serve the Telegraph’s informed readership, offering them deeper analysis and new angles on the topics of most interest to them. In keeping with the Telegraph’s business model, some of these articles were Premium (i.e. behind a registration wall or paywall) when they were initially published.

The ten stories I would like to submit are as follows. While around half of these stories involved collaboration with other journalists, I was the originator of, or key editorial contributor for, all of them:
1) 2017 UK General Election seat predictor
2) ISIS, MS-13 and terrorism: how Donald Trump’s State of the Union stands out against presidents past
3) This uncanny chart shows the Bitcoin bubble could be about to burst (first published in January 2017)
4) Born equal. Treated unequally – On International Women’s Day, we explore the UK’s gender gap
5) Every Premier League club’s fans mapped – how local is your team’s support?
6) Cars to travel slower than bicycles on England’s clogged-up roads within a decade
7) Find out your Premier League value with our transfer fee generator
8) Gender inequality goes pop: Brit Awards showcase songwriting’s women problem
9) Star Wars: The Last Jedi is the most critic-skewed film in 2017
10) Top baby names in England and Wales revealed: Is your name in or out of fashion?
What makes this project innovative?
All these stories feature some kind of innovation, whether through the use of complex analytical methods, exclusive news lines or unusual visualisation. For example, the machine learning general election predictor was a first for the Telegraph and unmatched by any of our rivals on election night, while the tf-idf analysis of State of the Union speeches yielded a simple yet fascinating table that acts as a kind of elevated word cloud. In both cases the crucial thing is not that complex methods were used, but that they were communicated in an accessible fashion. The pursuit of this clarity is just as important as the pursuit of technological development. The Brit Awards gender investigation shone a light on inequality at the Brits, while the ideas behind the transfer value piece and the Bitcoin bubble piece were innovative in themselves.

I believe there is also value in executing otherwise routine news stories with a high level of production value. This can be seen in the most popular baby names piece, which was published within half an hour of the data being released but still delivered an experience that readers could personalise through interactive visualisations, as well as informing them of the news. Our treatment of the story stood out because of this.

My work is also innovative in how it fits in with the goals of the Telegraph as a whole. My analyses are often placed behind a paywall in recognition of the fact that they are ideal for driving registrations and subscriptions. We have also had success in initial experiments with registration walls inside otherwise open articles. In these situations people can register to see more granular breakdowns of data in bigger, more detailed graphics. This is an exciting avenue of development and crucial for the sustainability of the business. Our data team is still a new one, and because of this we are constantly breaking exciting new ground at the Telegraph.
What was the impact of your project? How did you measure it?
There are several key traffic metrics we use when gauging the impact of our stories. Most of my examples are high performers in this regard, with some attracting millions of page views. For example, the interactive on popular baby names had the most page views of any article on the site on the day it was published, while the Bitcoin article garnered the highest number of registrations of any article on its day of publication. Some of the stories I work on also appear prominently in print, as with the exclusive analysis of road speeds, which was the Saturday Telegraph’s front page lead. These metrics are crucial to the Telegraph when it comes to measuring success, and the fact that my stories do well is vital to securing continued investment in data journalism at the title.

Outside of the world of site metrics, our work on the Telegraph’s Women Mean Business campaign also resulted in recognition from the UK government. The day after my piece exposing the UK’s gender gap was published, the Government ordered the first ever serious review into the funding gap preventing women from becoming business leaders in Britain.

We also measure our success by comparing our coverage against that of our competitors and seeking to offer our readers something more ambitious. This can be through presenting data to a high standard, as in the baby names interactive, or by presenting something that is exclusive and unique to the Telegraph, as in the Brit Awards investigation or the transfer values interactive.

DDJ is still relatively new at the Telegraph, and the fact that it has become an integral part of the newsroom over the past 18 months is also an impact worth noting. My team is breaking new and exciting ground within the publication, both in the ambitious visual journalism we are producing and in the ways we are equipping the newsroom as a whole, through training and new tools, so that other journalists can create their own data journalism.
Source and methodology
The sources for these pieces are largely credited within the pieces themselves, but I’ll pick out a few examples that could do with a bit more explanation.
1) General Election predictor: Follow our live seat by seat forecast - The data used to power the model was a mixture of demographic data, obtained through various governmental departments and the 2011 Census, and historic polling figures compiled from various polling companies. We used this information to power a model which predicted the votes for each party in each constituency.
2) ISIS, MS-13 and terrorism: how Donald Trump's State of the Union stands out against presidents past - I scraped the speeches data from the American Presidency Project and then used R to conduct tf-idf analysis on the texts before using ggplot2 and Illustrator to create the main chart.
3) Find out your Premier League value with our transfer fee generator - This piece used actual transfer values from Premier League moves as recorded by Transfermarkt and Telegraph sources. These values were then modelled against the various attributes of the players as they appeared in the FIFA 17 video game.
4) Gender inequality goes pop: Brit Awards showcase songwriting's women problem - This was an investigative piece and involved me creating the data set through a mix of scraping and manual tagging of genders.
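For readers unfamiliar with tf-idf, the technique behind the State of the Union piece, the following is a minimal sketch of the idea in Python. It is an illustration only and not the script used for the story, which was written in R. The function name `tf_idf`, the whitespace tokenisation and the toy texts are all assumptions made for the example. A term scores highly for a speech when it is used often in that speech but appears in few of the others.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: score} dict per document.

    tf  = term count / total terms in the document
    idf = log(number of documents / number of documents containing the term)
    """
    # Naive tokenisation (lower-case, split on whitespace): an assumption
    # made for brevity, not how the published analysis tokenised speeches.
    tokenised = [doc.lower().split() for doc in documents]
    n_docs = len(tokenised)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenised for term in set(tokens))

    scores = []
    for tokens in tokenised:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


# Invented toy texts, not real State of the Union excerpts.
speeches = [
    "jobs economy taxes jobs",
    "economy security borders security",
    "economy health care health",
]

scores = tf_idf(speeches)

# The most distinctive term in each toy speech.
top_terms = [max(s, key=s.get) for s in scores]
print(top_terms)
```

In this toy example "economy" appears in every text, so its score is zero everywhere, which is why a tf-idf table surfaces what is distinctive about each speech rather than what is merely frequent.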
I do the majority of my data analysis in R, although I still frequently use Google Sheets/Excel. Scraping is done in R and with OutWit Hub. Visualisation is done using either ggplot2 in R or d3.js - although Adobe Illustrator is also a big component in terms of design prototyping.