How using Excel may have caused thousands of unreported Covid cases
8 October 2020
The use of Microsoft’s Excel software may have caused nearly 16,000 coronavirus cases to go unreported – but how did it happen?
Public Health England (PHE) said that a “technical error” caused some 15,841 cases between 25 September and 2 October to be left out of daily figures.
Reports have suggested the glitch was caused by an Excel spreadsheet, which contained lab results, reaching its maximum size and therefore preventing new cases from being added to the file.
Speaking in the House of Commons on 5 October, health secretary Matt Hancock said the issue was caused by PHE using a “legacy system” and that a decision had been made in July to replace it.
He said contracts to replace the system had been awarded and work to upgrade it was already underway.
Why Excel didn’t work
Many experts have said that Excel was not designed to handle data in the volumes PHE was using it for.
The error was caused by the way PHE was bringing together information from commercial firms paid to analyse Covid tests, according to the BBC.
The way the files were submitted – as text-based lists called CSV files – caused no problems. It was the file format PHE’s developers were using that led to cases being missed.
The developers had opted to use an old file format known as XLS, which dates back to 1987. Microsoft replaced it with an updated version, XLSX, in 2007.
As a consequence of using the old format, each template could only handle about 65,000 rows of data rather than the one million-plus Excel is capable of managing.
Each test result created several rows of data, so in practice the template used was only capable of holding about 1,400 cases – far fewer than the actual number being reported back.
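The arithmetic behind that limit can be sketched in a few lines of Python. This is an illustration only, not PHE's actual pipeline: the function names are hypothetical, and the figure of 46 rows per case is an assumption inferred by dividing the roughly 65,000-row limit by the roughly 1,400 cases quoted above. The exact worksheet limits are 65,536 rows for XLS and 1,048,576 rows for XLSX. The sketch also shows a simple guard that raises an error when data would not fit, rather than letting rows drop off silently.

```python
import csv
import io

XLS_MAX_ROWS = 65_536       # row limit of a legacy .xls worksheet
XLSX_MAX_ROWS = 1_048_576   # row limit of an .xlsx worksheet


class RowLimitExceeded(Exception):
    """Raised when the input would not fit in a single worksheet."""


def count_csv_rows(text: str) -> int:
    """Count the records in a CSV string (any header row is included)."""
    return sum(1 for _ in csv.reader(io.StringIO(text)))


def check_fits(row_count: int, max_rows: int = XLS_MAX_ROWS) -> int:
    """Return the spare capacity, or raise if rows would be lost."""
    if row_count > max_rows:
        raise RowLimitExceeded(
            f"{row_count} rows exceed the {max_rows}-row limit; "
            f"{row_count - max_rows} rows would be lost"
        )
    return max_rows - row_count


def cases_per_sheet(rows_per_case: int, max_rows: int = XLS_MAX_ROWS) -> int:
    """How many whole cases fit in one worksheet."""
    return max_rows // rows_per_case


# ASSUMPTION: 46 rows per case, inferred from the article's approximate figures.
ROWS_PER_CASE = 46

if __name__ == "__main__":
    print(cases_per_sheet(ROWS_PER_CASE))                 # cases per XLS sheet
    print(cases_per_sheet(ROWS_PER_CASE, XLSX_MAX_ROWS))  # cases per XLSX sheet
```

Under that assumption, an XLS worksheet holds a little over 1,400 cases, in line with the figure reported, while the newer format would hold roughly 16 times as many. The more important point is the guard: a check like `check_fits` turns a silent loss of rows into a visible failure.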
Hancock confirmed the issue was being investigated, adding that it “should never have happened”.
As at 9am on 5 October only 51% of cases had been contacted a second time for contact-tracing purposes.
.@MattHancock says the Excel was a PHE legacy system and he had already commissioned its replacement.
— PARLY (@PARLYapp) October 5, 2020
The fallout
Concerns were immediately flagged about how the unreported cases may have affected local measures.
Labour’s shadow health secretary Jon Ashworth labelled the error “shambolic”. He told the Commons thousands of people had been unaware they’d been exposed to the virus at a time when infection rates and hospital admissions were climbing.
Each individual who tested positive had received their test result as normal and all were advised to self-isolate, but it means those who were in close contact with them were not given appropriate advice.
The issue was reported, and steps taken to rectify it, within a matter of days.
All outstanding cases had been transferred to contact-tracers by 1am BST on 3 October 2020, PHE confirmed.
The glitch means cases reported daily in the UK towards the end of last week were likely significantly higher than the near 7,000 reported.
The government’s coronavirus dashboard confirms that as of Sunday there were a further 22,961 lab-confirmed cases of Covid-19 in the UK. It brings the total to 502,978.
A note on the dashboard on Sunday (4 October) said: “The cases by publish date for 3 and 4 October include 15,841 additional cases with specimen dates between 25 September and 2 October – they are therefore artificially high for England and the UK.”
3 Comments
This episode is basically the IT chickens coming home to roost. The fragmented nature of NHS systems means that Excel has become the go-to method to transfer data, undertake ETL tasks and paper over the lack of robust interfaces.
I feel sorry for the techs/analysts that are taking the brunt of this as I expect that they had an impossible task dropped on their desks and were told to just ‘get it done’.
Ingesting millions of results from myriad systems that have no formal interfaces or integrated order and results infrastructure is not a trivial task – something that obviously passed the bureaucrats by when they were handing out contracts to the private sector.
A nationwide laboratory information system is what is required, but the decision makers bypassed that and went with the expedient solution of ignoring the complexities of healthcare Informatics.
You reap what you sow.
That description doesn’t tally with what I’ve heard and read. Besides, using Excel isn’t THE problem. Building an unreliable system that wasn’t properly tested and didn’t automatically detect errors is, to say nothing of blindly procuring it to do something it couldn’t do reliably. See my longer view here …. https://www.bmj.com/content/371/bmj.m3891
You’re spot on, Bill Gates (presumably not the Bill Gates?). PHE’s data collection and processing infrastructure is a cobbled-together Heath Robinson affair. Years of underinvestment because of organisational change, short-term cost-cutting, and no understanding or interest from senior management. As you say, the techies and analysts will be hung out to dry despite having bust a gut to keep things working over the past 6 months.
Utterly shameful.