Artificial intelligence (AI) could lead to UK health services that disadvantage women and ethnic minorities, scientists are warning.
They are calling for biases in the systems to be rooted out before their use becomes commonplace in the NHS.
They fear that without that preparation AI could dramatically deepen existing health inequalities in our society.
i can reveal that a new government-backed study has found that artificial intelligence models built to identify people at high risk of liver disease from blood tests are twice as likely to miss disease in women as in men.
The researchers examined the state-of-the-art approach to AI used by hospitals worldwide and found it had a 70 per cent success rate in predicting liver disease from blood tests.
But they uncovered a wide gender gap underneath – with 44 per cent of cases in women missed, compared with 23 per cent of cases among men.
This is the first time bias has been identified in AI blood tests.
“AI algorithms are increasingly used in hospitals to assist doctors diagnosing patients. Our study shows that, unless they are investigated for bias, they may only help a subset of patients, leaving other groups with worse care,” said Isabel Straw, of University College London, who led the study, published in the journal BMJ Health & Care Informatics.
“We need to be really careful that medical AI doesn’t worsen existing inequalities.”
“When we hear of an algorithm that is more than 90 per cent accurate at identifying disease, we need to ask: accurate for who? High accuracy overall may hide poor performance for some groups.”
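The check Dr Straw describes can be sketched in a few lines of Python: split the test results by group and work out what share of true cases the model missed in each. This is only an illustrative sketch. The records are invented, not the study's data, and `miss_rate_by_group` and `overall_accuracy` are hypothetical helper names.

```python
from collections import defaultdict


def miss_rate_by_group(records):
    """Return {group: share of true disease cases the model missed}.

    Each record is a tuple (group, has_disease, model_flagged).
    """
    cases = defaultdict(int)
    missed = defaultdict(int)
    for group, has_disease, model_flagged in records:
        if has_disease:
            cases[group] += 1
            if not model_flagged:
                missed[group] += 1
    return {group: missed[group] / cases[group] for group in cases}


def overall_accuracy(records):
    """Share of all records where the model's flag matches the truth."""
    correct = sum(1 for _, has_disease, flagged in records if has_disease == flagged)
    return correct / len(records)


# Invented records for illustration only (not figures from the study).
records = (
    [("women", True, True)] * 6 + [("women", True, False)] * 4
    + [("men", True, True)] * 8 + [("men", True, False)] * 2
    + [("women", False, False)] * 10 + [("men", False, False)] * 10
)

rates = miss_rate_by_group(records)
accuracy = overall_accuracy(records)
```

On these invented records the overall accuracy is 85 per cent, yet the model misses 40 per cent of cases in women against 20 per cent in men, which is the kind of gap a single headline figure can hide.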
Other experts, not involved in the study, say it helps shine a light on the threat posed to health equality as AI use, already quite common in the US, starts to take off in the UK.
Brieuc Lehmann, a UCL health data science specialist and co-founder of an expert panel on Data for Health Equity, says the use of AI in healthcare in the UK is “very much in its infancy but is likely to grow rapidly in the next five to 10 years”.
“It’s absolutely crucial that people get a handle on AI bias in the next few years. With the ongoing squeeze on NHS budgets, there will be growing pressure to use AI to reduce costs,” he said.
“If we don’t get a hold on biases, there will be a temptation to deploy AI tools before we’ve adequately assessed their impact, which carries with it the risk of worsening health inequalities.”
Lauren Klein, co-author of the book Data Feminism and an academic at Emory University in Atlanta in the US, said the liver disease study showed how important it was to get AI systems right.
“Examples like this demonstrate how a failure to consider the full range of potential sources of bias can have life or death consequences,” she said.
“AI systems are predictive systems. They make predictions about what’s most likely to happen in the future on the basis of what’s most often happened in the past. Because we live in a biased world, those biases are reflected in the data that records past events.
“And when that biased data is used to predict future outcomes, it predicts outcomes with those same biases.”
She gave the example of a major tech firm that developed a CV screening system as part of its recruitment process.
But because the examples of “good” CVs came from existing employees, who were predominantly men, the system developed a preference for the CVs of male applicants, disadvantaging women and perpetuating the gender imbalance.
“AI systems, like everything else in the world, are made by humans. When we fail to recognise that fact, we leave ourselves open to the false belief that these systems are somehow more neutral or objective than we are,” Dr Klein added.
It is not the AI in itself which is biased – as it only learns from the data it is given, experts stress – but rather the information it is given to work with.
David Leslie, director of ethics and responsible innovation research at the Alan Turing Institute, is concerned that AI may make things worse for minority groups.
In an article for the British Medical Journal last year, he warned that: “The use of AI threatens to exacerbate the disparate effect of Covid-19 on marginalised, under-represented, and vulnerable groups, particularly Black, Asian, and other minoritised ethnic people, older populations, and those of lower socioeconomic status.”
“AI systems can introduce or reflect bias and discrimination in three ways: in patterns of health discrimination that become entrenched in datasets, in data representativeness [with sample sizes in many groups often very small], and in human choices made during the design, development, and deployment of these systems,” he said.
Honghan Wu, associate professor in health informatics at University College London, who also worked on the study about blood test inequalities, agrees that AI models can not only replicate existing biases but also make them worse.
“Current AI research and developments would certainly bake in existing biases – from the data they learnt from – and, even worse, potentially induce more biases from the way they were designed,” he said.
“These biases could potentially accumulate within the system, which lead to more biased data that is later used for training new AI models. This is a scary circle.”
He has just completed a study looking at four AI models based on more than 70,000 ICU admissions to hospitals in Switzerland and the US, due to be presented at the European Conference on Artificial Intelligence in Austria this month.
This found that women and non-white people with kidney problems had to be considerably more ill than men and white people to be admitted to an ICU ward or recommended for an operation, respectively.
And it found “the AI models exacerbated ‘data embedded’ inequalities significantly in three out of eight assessments, one of which was more than nine times worse”.
“AI models learn their predictions from the data,” Dr Wu said. “We say a model exacerbates inequality when inequalities induced by it were higher than those embedded in the data where it learned from.”
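Dr Wu's definition can be put as a simple comparison: measure the gap between two groups in the recorded data, measure the same gap in the model's predictions, and see which is larger. The sketch below assumes one particular measure of inequality, the absolute difference in the rate of a positive outcome between two groups. The article does not say which measure the study used, and the function names and figures here are invented for illustration.

```python
def rate(outcomes):
    """Share of positive outcomes (1 = admitted, 0 = not) in a list."""
    return sum(outcomes) / len(outcomes)


def gap(outcomes_by_group, group_a, group_b):
    """Absolute difference in positive-outcome rate between two groups."""
    return abs(rate(outcomes_by_group[group_a]) - rate(outcomes_by_group[group_b]))


def exacerbation_ratio(recorded, predicted, group_a, group_b):
    """Gap in the model's predictions divided by the gap in the data.

    A value above 1 means the model widens the gap already in the data.
    """
    data_gap = gap(recorded, group_a, group_b)
    model_gap = gap(predicted, group_a, group_b)
    if data_gap == 0:
        return float("inf") if model_gap > 0 else 1.0
    return model_gap / data_gap


# Invented outcomes for illustration only.
recorded = {"men": [1, 1, 1, 0], "women": [1, 1, 0, 0]}
predicted = {"men": [1, 1, 1, 1], "women": [1, 1, 0, 0]}

ratio = exacerbation_ratio(recorded, predicted, "men", "women")
```

Here the recorded gap is 25 percentage points and the predicted gap is 50, so the ratio is 2: on this invented example the model doubles the inequality it learned from.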
But some experts say there are also reasons for optimism, because AI can also be used to actively combat bias within a health system.
Ziad Obermeyer, of the University of California at Berkeley, who worked on a landmark study that helped to explain how AI could introduce racial bias (see box below), said he had also shown in separate research that an algorithm can “find causes of pain in Black patients that human radiologists miss”.
“There’s increasing attention from both regulators who oversee algorithms and – just as importantly – from the teams building algorithms,” he told i.
“So I am optimistic that we are at least moving in the right direction.”
Dr Wu, at UCL, is working on ways to solve AI bias but cautions “this area of research is still in its infancy”.
“AI could lead to a poorer performing NHS for women and ethnic minorities,” he warns.
“But the good news is, AI models haven’t been used widely in the NHS for clinical decision-making, meaning we still have the opportunity to make them right before ‘the poorer performing NHS’ happens.”
How inequalities can be built into AI at the design stage
Using the wrong proxy, or variable, to predict risk is probably the most common way in which AI models can magnify inequalities, experts say.
This is demonstrated in a landmark study, published in the journal Science, which found “that a category of algorithms that influences health care decisions for over a hundred million Americans shows significant racial bias”.
In this case, the algorithms used by the US healthcare system to decide who gets into care management programmes took how much patients had cost the healthcare system in the past as a measure of how at-risk they were from their current illness.
But because Black people typically use healthcare less in America, in part because they are more likely to distrust doctors, the algorithm design meant they had to be considerably more ill than a white person to be eligible for the same level of care.
However, by tweaking the US healthcare algorithm to use other variables – or proxies – to predict patient risk the researchers were able to correct much of the bias that was initially built into the AI model, reducing it by 84 per cent.
And by correcting for the health disparities between Black and white people, the researchers found that the percentage of Black people in the ‘automatic enrollee’ group jumped from 18 per cent to 47 per cent.
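The proxy problem can be shown with a toy example in Python. Two groups of patients are equally ill, but one group generates less spending for the same level of illness. Ranking by past cost then under-selects that group, while ranking by a direct measure of need does not. Everything here is an assumption made for illustration: the `Patient` fields, the need scores, the 0.6 spending factor and the `share_selected` helper are invented, and the sketch is not the algorithm examined in the Science study.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    group: str
    need: float       # hypothetical score for how ill the patient is
    past_cost: float  # what the system previously spent on the patient


def share_selected(patients, key, k, group):
    """Share of the top-k patients (ranked by `key`) who belong to `group`."""
    top = sorted(patients, key=key, reverse=True)[:k]
    return sum(p.group == group for p in top) / k


# Both groups have the same spread of need, but group B generates
# only 60% of the spending for the same need (an invented factor).
needs = [9, 7, 5, 3, 1]
patients = (
    [Patient("A", n, n * 1.0) for n in needs]
    + [Patient("B", n, n * 0.6) for n in needs]
)

share_by_cost = share_selected(patients, lambda p: p.past_cost, k=4, group="B")
share_by_need = share_selected(patients, lambda p: p.need, k=4, group="B")
```

Ranked by past cost, group B fills one of the four places (25 per cent). Ranked by need, it fills two (50 per cent), matching its share of the sickest patients.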
What the NHS is doing to tackle the problem of AI bias:
The NHS is aware of the problem and is taking a number of steps. These include:
- NHS AI Lab has partnered with the Health Foundation to fund £1.4m in research to address algorithmic bias, with a particular focus on countering racial and ethnic health inequalities that could arise from the ways in which AI is developed and deployed. This includes funding for a project which will ensure that diabetic screening technologies work effectively for different patient populations. It also includes funding for an international consensus-based approach to developing standards related to the inclusivity and generalisability of datasets used to train and test AI.
- NHS AI Lab has also worked with the Ada Lovelace Institute to develop a model for an algorithmic impact assessment, which is a tool that can be used to assess possible societal impacts of an AI system before it is used. This includes identifying risks of algorithmic bias at an early stage when there is greater flexibility to make adjustments.
- The NHS believes it’s important that training data be reflective of the whole population to avoid building biased AI systems (if the training data contains any errors or biases, these will also be present in the AI system).
- The NHS says AI systems should also be validated to test whether the system can perform effectively for different patient groups. This means that the system must be tested using examples that it has never seen before (i.e., testing on different data than it was trained on). Validation should happen as part of the development process, but AI systems should also be tested once development has been completed. Ongoing monitoring is recommended.
- There is a move towards including patients and the public in addressing ethical concerns, such as algorithmic bias. For example, as part of our algorithmic impact assessment, there is a participatory element, which entails involving members of the public in exploring the legal, social, and ethical implications of an AI system. These members of the public would inform the decision-making process for granting access to data used to train and test AI systems.
- The NHS AI Lab's partnership with the Health Foundation includes funding for projects that could use AI to help close gaps in health outcomes. For example, we’re funding a project that will use an AI-driven chatbot which provides advice about sexually transmitted infections (STIs) to raise the uptake of STI/HIV screening among minority ethnic communities. The research will also inform the development and implementation of chatbots designed for minority ethnic populations within the NHS and more widely in public health.
- Another project NHS AI Lab is funding will develop an AI system that can help investigate factors that contribute to adverse maternity incidents among Black women, who are four times more likely to die in pregnancy or childbirth than white women. The reasons for this are not well understood. This research will provide a way of understanding how a range of causal factors could lead to maternal harm. The aim is to inform the design of more effective, targeted interventions that could improve maternal health outcomes for Black women.
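The validation step in the list above, testing a system on examples it has never seen and for different patient groups, depends on every group actually appearing in the held-out data. One way to arrange that is to split the records group by group. The Python sketch below is an assumption about how such a split might be done, not a description of NHS practice, and `stratified_holdout` is a hypothetical helper.

```python
import random
from collections import defaultdict


def stratified_holdout(records, group_of, test_fraction=0.25, seed=0):
    """Split records into (train, test), holding out a share of every group.

    `group_of` maps a record to its patient group. At least one record
    per group goes into the test set, so no group is left unvalidated.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for record in records:
        by_group[group_of(record)].append(record)

    train, test = [], []
    for members in by_group.values():
        members = members[:]          # copy so the caller's data is untouched
        rng.shuffle(members)
        n_test = max(1, round(len(members) * test_fraction))
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Invented records: (group, patient id).
records = [("a", i) for i in range(8)] + [("b", i) for i in range(4)]
train, test = stratified_holdout(records, group_of=lambda r: r[0])
```

The model would be trained only on `train`. Its performance would then be measured on `test`, group by group, so that a small group is not hidden behind an overall average.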