Project description

New gender pay gap reporting regulations came into force in the UK in 2017, requiring UK companies with 250 or more employees to publish details of the difference in pay between their male and female employees by April 2018, and annually thereafter.

The FT’s coverage of this topic centred not only on ongoing analysis and visualisation of this new data source, but also on extensive reporting that revealed limitations and flaws in the methods of data collection and compliance enforcement.

Our analysis showed that one in 20 companies that submitted gender pay gap data reported numbers that were statistically improbable and therefore likely inaccurate. Some claimed they had paid their male and female employees exactly the same, reporting a gender pay gap of zero as measured by both the mean and the median.
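The original analysis was done in R; as a rough illustration of the kind of check involved, the sketch below flags employers whose mean and median hourly pay gaps are both exactly zero. The column names and sample rows are assumptions for illustration, modelled loosely on the government's published disclosure CSV.

```python
import pandas as pd

# Hypothetical sample mirroring the shape of the disclosure data
# (employer names and column names are illustrative assumptions).
df = pd.DataFrame({
    "EmployerName": ["Acme Ltd", "Bright Co", "Cobalt plc"],
    "DiffMeanHourlyPercent": [12.4, 0.0, -3.1],
    "DiffMedianHourlyPercent": [9.8, 0.0, -2.5],
})

# A mean AND median gap of exactly zero is statistically improbable
# for any employer with hundreds of staff, so such rows are flagged
# for closer scrutiny rather than treated as proof of equal pay.
suspect = df[
    (df["DiffMeanHourlyPercent"] == 0)
    & (df["DiffMedianHourlyPercent"] == 0)
]

print(suspect["EmployerName"].tolist())
```

In practice a flag like this would be one of several plausibility tests applied before contacting the companies concerned.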

In time for the first disclosure deadline in April 2018, we created a searchable database of companies that enriched the publicly available dataset with supplementary information joined from other sources.

This database was updated in 2019 with new functionality allowing users to see the change in companies’ disclosures across the two years and to maintain pressure on poor performers.

What makes this project innovative?

We were aware that the contents of this interesting new public dataset would be widely and comprehensively reported across British media immediately after the disclosure deadline. We therefore set out to add value to our version of this reporting through a close examination of the quality of the underlying data, which is ultimately self-reported by the businesses themselves, as well as regulators’ capacity to enforce accurate reporting.

What was the impact of your project? How did you measure it?

The story was a lead story both in print and online, ensuring that it was widely read. Our reporting on the limitations of this data disclosure exercise showed that the gender pay gap reporting process leaves substantial room for employers to misreport data or to avoid making the required disclosures. Our measure of success was therefore to reveal the limitations of the process, in terms of both reporting and enforcement, and to communicate our findings to readers in a thoughtful way.

After our reporting identified the companies that had submitted unlikely figures, some, including large, well-known brands, corrected their submissions. Officials said that employers were legally liable for the accuracy of the 14 data points required by law, and our journalists were asked to provide details of our analysis to Maria Miller, chair of the House of Commons select committee on Women and Equalities.

Our reporting also revealed that the government had no comprehensive list of employers required to report, and that it required employers to report as separate legal entities, making it easier to hide disparities between business units or to avoid reporting altogether. When we interviewed Maria Miller, she was unaware of these problems with the disclosure process and was critical of the Equality and Human Rights Commission, the body tasked with policing it. This pressure eventually led to the EHRC compiling a list of companies expected to disclose their data and changing the structure of the publicly available dataset to show which companies had not reported until after the deadline.

Source and methodology

The primary data source was the disclosure data itself, collected from a government website where it was regularly updated. Because our interactive database was published before the disclosure deadline, R scripts running on AWS ensured the database always used the latest version of the data. Because the number of employees at each company was not among the data points companies were required to disclose, we obtained this information from the corporate data provider DueDil, creating a sample covering 65 per cent of employers, large enough to offer meaningful conclusions. In addition, we used cross-national data gathered by the OECD to put the UK’s situation into international context.
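The join of headcount data onto the disclosures can be sketched as below. The original pipeline was written in R; this is a minimal Python analogue, and all field names and figures are illustrative assumptions rather than the actual DueDil or disclosure schema.

```python
import pandas as pd

# Simplified disclosure records and a hypothetical headcount extract
# from a second source; names and numbers are illustrative only.
disclosures = pd.DataFrame({
    "EmployerName": ["Acme Ltd", "Bright Co", "Cobalt plc", "Dune LLP"],
    "DiffMedianHourlyPercent": [9.8, 0.0, -2.5, 14.2],
})
headcounts = pd.DataFrame({
    "EmployerName": ["Acme Ltd", "Cobalt plc", "Dune LLP"],
    "Employees": [310, 1200, 450],
})

# Left-join so every disclosing employer is kept, with a headcount
# attached wherever the second source covers that employer.
merged = disclosures.merge(headcounts, on="EmployerName", how="left")

# Share of disclosing employers for which a headcount was found,
# i.e. the coverage of the supplementary sample.
coverage = merged["Employees"].notna().mean()
print(f"Coverage: {coverage:.0%}")
```

In this toy example three of four employers are matched, giving 75 per cent coverage; the same calculation on the real data is what yields the 65 per cent figure quoted above.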

Technologies Used

The data acquisition, cleaning and analysis, as well as some of the graphics production, were done in R. The interactive graphics were produced using D3.

Project members

Aleksandra Wisniewska, Billy Ehrenberg-Shannon, Sarah Gordon, Caroline Nevitt, and Cale Tilford


Additional links

