Skin cancer apps ‘cannot be relied upon for accurate results’, study finds
Smartphone apps designed to detect the risk of skin cancer are poorly regulated and “frequently cannot be relied upon to produce accurate results”, according to a new analysis.
Researchers based in the University of Birmingham’s Institute of Applied Health Research, in collaboration with the Centre of Evidence-Based Dermatology at the University of Nottingham, have analysed a series of studies produced to evaluate the accuracy of six different apps.
Their results, published in The BMJ, reveal a concerning picture: only a small number of evaluation studies exist, and those that do show variable and unreliable test accuracy among the apps evaluated.
They found the apps may cause harm, either through failure to identify potentially deadly skin cancers or through over-investigation of false positive results, such as the unnecessary removal of a harmless mole.
Lead researcher, Dr Jac Dinnes, from the Institute of Applied Health Research, said: “This is a fast-moving field and it’s really disappointing that there is not better quality evidence available to judge the efficacy of these apps.
“It is vital that healthcare professionals are aware of the current limitations both in the technologies and in their evaluations.”
In an editorial on the findings in The BMJ, Jessica Morley, policy lead at the University of Oxford’s DataLab, Dr Ben Goldacre, DataLab director, and Luciano Floridi, professor of philosophy and ethics of information at the University of Oxford, said trust in artificial intelligence (AI) relies on the “myth” of an objective algorithm underpinning the technology.
But the systems for generating and implementing evidence have not yet adapted to the specific challenges these technologies pose, they added.
The study looked at nine evaluations of the six apps and found the evidence for their accuracy was “lacking”.
Two of those apps are currently available in the UK: SkinScan and SkinVision. Both are approved and regulated as Class I medical devices and both carry CE marks.
SkinScan was evaluated in a single study of 15 images, five of which were melanomas; the app did not identify any of them.
SkinVision was evaluated in two studies. One study of 108 moles, of which 35 were cancerous or precancerous, achieved a sensitivity of 88% and a specificity of 79%.
That means 12% of cancerous or precancerous moles would be missed, while 21% of harmless moles would be wrongly identified as potentially cancerous, the BMJ study found.
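As a rough illustration (not a calculation from the study itself), the sketch below applies those sensitivity and specificity figures to the 108 moles in that evaluation to show roughly how many cancers would be missed and how many false alarms raised:

```python
# Illustrative sketch only: applies the reported sensitivity/specificity
# to the 108-mole SkinVision study; the per-mole counts are an
# assumption derived from the quoted percentages.

cancerous = 35               # cancerous or precancerous moles in the study
benign = 108 - cancerous     # harmless moles

sensitivity = 0.88           # share of cancerous moles correctly flagged
specificity = 0.79           # share of harmless moles correctly cleared

missed = cancerous * (1 - sensitivity)     # false negatives
false_alarms = benign * (1 - specificity)  # false positives

print(f"Cancerous moles missed: ~{missed:.0f} of {cancerous}")
print(f"Harmless moles wrongly flagged: ~{false_alarms:.0f} of {benign}")
# Cancerous moles missed: ~4 of 35
# Harmless moles wrongly flagged: ~15 of 73
```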
Yet both apps are marketed with claims they can “detect skin cancer at an early stage” or “track moles over time with the aim of catching melanoma at an earlier stage of the disease”, researchers added.
Other studies have previously found SkinVision performs “well above” a GP in detecting skin cancer.
Steps to improvement
Dr Dinnes said: “As technologies continue to develop, these types of apps are likely to attract increasing attention for skin cancer diagnosis, so it’s really important that they are properly evaluated and regulated.”
The researchers made a number of recommendations for future studies of smartphone apps:
- Studies must be based on clinically relevant populations of smartphone users who may have concerns about their risk of skin cancer
- All skin lesions identified by smartphone users must be included – not just those identified as potentially problematic
- Clinical follow-up of benign lesions must be included in the study to provide more reliable and generalisable results
To improve the reliability of AI apps, regulatory, governance and cultural changes are needed, Morley, Goldacre and Floridi wrote.
Regulators accustomed to managing medicines will require new skills to evaluate digital technologies, and where a technology has not been properly evaluated this should be clearly flagged to patients and policymakers, they explained.
Clinicians, patients and commissioners also need to be better informed about how to evaluate an AI app, which will help drive better innovation, they added.