The Dark Side of the Current State of Medical AI
Thank you for reading NewHealthcare Platforms' newsletter. With a massive value-based transformation of the healthcare industry underway, this newsletter will focus on its impact on the medical device industry, reflected in the rise of value-based medical technologies and platform business models that are significantly transforming payer and provider healthcare organizations. I will occasionally share updates on our company's unique services to accelerate and de-risk the transition!
DISCLAIMER: This newsletter contains opinions and speculations and is based solely on public information. It should not be considered medical, business, or investment advice. The banner and other images included in this newsletter are AI-generated and created for illustrative purposes only unless another source is provided. All brand names, logos, and trademarks are the property of their respective owners. At the time of publication of this newsletter, the author has no business relationships, affiliations, or conflicts of interest with any of the companies mentioned except as noted. OPINIONS ARE PERSONAL AND NOT THOSE OF ANY AFFILIATED ORGANIZATIONS!
Hello again friends and colleagues,
While my writing usually highlights the promise and potential of AI in medicine and healthcare, from enhancing diagnostic accuracy to automating administrative tasks, I also recognize that some actual and proposed uses go beyond the technology's current capabilities. In today's newsletter, we will discuss some of these shortcomings and show that the issues are not merely technical or theoretical but have real-world consequences for patients and providers alike.
Overreach and Mistrust
The hype around medical AI is palpable, with some current implementations over-representing their capabilities and sidelining critical concerns.
Misleading Diagnostic Accuracy
Cutting-edge large language models have been praised for their advanced diagnostic reasoning, but their performance in real-world scenarios exposes troubling limitations. Despite claims of improved accuracy and reduced hallucinations, these models have sometimes confidently rationalized incorrect diagnoses, behavior that is not only misleading but potentially dangerous in clinical settings where accuracy and reliability are paramount.
These issues have partly been blamed on flaws in benchmarks such as MedQA, which are used to evaluate these tools but fail to reflect the complexities of real-world healthcare scenarios. Some benchmarks prioritize theoretical accuracy over practical applicability, leading to inflated perceptions of AI performance while ignoring critical gaps in functionality.
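To make the benchmark critique concrete, here is a minimal sketch of how a MedQA-style multiple-choice evaluation typically works: the model picks one lettered option and is scored on exact-match accuracy. Everything here, the `ask_model` function and the sample question alike, is a hypothetical stand-in for illustration, not the actual benchmark code:

```python
# Minimal sketch of a MedQA-style multiple-choice evaluation harness.
# `ask_model` is a hypothetical stand-in for any LLM API call; the
# sample item is invented for illustration, not taken from MedQA itself.

from typing import Callable

def evaluate(questions: list[dict], ask_model: Callable[[str], str]) -> float:
    """Score exact-match accuracy on single-best-answer questions."""
    correct = 0
    for q in questions:
        choices = "\n".join(f"{k}. {v}" for k, v in q["options"].items())
        prompt = (
            f"{q['stem']}\n{choices}\n"
            "Answer with the single letter of the best option."
        )
        reply = ask_model(prompt).strip().upper()[:1]  # keep first letter only
        correct += reply == q["answer"]
    return correct / len(questions)

sample = [{
    "stem": "A 58-year-old presents with crushing chest pain radiating "
            "to the left arm. What is the most appropriate first step?",
    "options": {"A": "Obtain an ECG", "B": "Order a chest X-ray",
                "C": "Schedule a stress test", "D": "Reassure and discharge"},
    "answer": "A",
}]

# evaluate(sample, ask_model=my_llm_call) returns 1.0 or 0.0 here.
```

Notice what this metric never measures: whether the model would recognize an atypical presentation, ask for missing history, or admit uncertainty. A model can score impressively on such a harness while still confidently rationalizing a wrong answer on an open-ended case, which is exactly the gap described above.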
Erroneous Denial of Care
UnitedHealthcare and other insurers’ use of an AI model for claims adjudication highlights a different, but equally concerning, misuse of AI. Designed to predict the length of post-acute care stays, the model was reportedly used to deny insurance claims, often overriding physicians’ recommendations. Lawsuits allege that the model had a 90% error rate, yet its recommendations were routinely followed because only a small fraction of patients appealed denied claims. This has led to instances where patients were deprived of medically necessary care, demonstrating the ethical and practical dangers of deploying unvalidated AI systems in high-stakes environments. Some have speculated that these practices and the resulting harm may have contributed to the recent tragic events involving UnitedHealthcare’s CEO.
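The arithmetic behind that allegation is worth making explicit. The figures in the sketch below are illustrative assumptions, not data from the lawsuits, but they show how a high error rate combined with a low appeal rate lets nearly all erroneous denials stand:

```python
# Back-of-envelope sketch: how many erroneous denials go uncorrected?
# All numbers are illustrative assumptions, not figures from the litigation.

denials = 10_000        # claims denied by the model
error_rate = 0.90       # alleged share of denials that are wrong
appeal_rate = 0.01      # assumed share of patients who appeal (lawsuits
                        # suggest only a small fraction ever do)
overturn_rate = 0.90    # assumed share of appealed errors that get reversed

erroneous = denials * error_rate
corrected = erroneous * appeal_rate * overturn_rate
uncorrected = erroneous - corrected

print(f"Erroneous denials:   {erroneous:,.0f}")   # 9,000
print(f"Corrected on appeal: {corrected:,.0f}")   # 81
print(f"Left standing:       {uncorrected:,.0f}") # 8,919
```

Even with generous assumptions about appeals, the overwhelming majority of wrongful denials would never be reviewed, which is the structural problem the lawsuits describe.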
Inaccurate Medical Transcriptions
AI transcription tools highlight another critical issue: hallucinations, where AI systems fabricate information. The phenomenon has been widely documented in chatbots, and cutting-edge speech-to-text models, trained to convert audio into text, extend it to medical transcription. A recent study found that 1.4% of these transcriptions included fabricated sentences, and many of those hallucinations contained harmful or offensive content.
The problem stems from the diversity of speech patterns globally and the limitations of available training data. Silences, filler words, and irregular speech patterns are often misinterpreted by the language model, leading to the creation of fictional sentences. For medical professionals using these tools to transcribe patient notes, such hallucinations could introduce dangerous inaccuracies into medical records, potentially altering the course of care.
Researchers have called for more rigorous validation and better training datasets to mitigate these issues. Recent updates to transcription models, aimed at addressing these hallucinations by skipping silences and retranscribing audio when errors are detected, represent a step forward. However, until these systems achieve greater reliability, the need for manual oversight remains critical.
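As a rough illustration of the silence-skipping mitigation described above, the sketch below drops near-silent audio chunks before they ever reach a transcription model, so the model is never asked to "transcribe" silence it might hallucinate over. The simple RMS-energy threshold is an assumption for illustration; production systems use trained voice-activity detection, and this is not any specific vendor's actual logic:

```python
# Rough sketch of silence-skipping before transcription, assuming mono
# audio normalized to [-1, 1]. Real systems use trained voice-activity
# detection; the RMS-energy threshold here is a simplification.

import numpy as np

def drop_silent_chunks(samples: np.ndarray, sample_rate: int,
                       chunk_ms: int = 200, rms_threshold: float = 0.01
                       ) -> np.ndarray:
    """Return audio with near-silent chunks removed."""
    chunk = int(sample_rate * chunk_ms / 1000)
    kept = []
    for start in range(0, len(samples), chunk):
        piece = samples[start:start + chunk].astype(np.float64)
        rms = np.sqrt(np.mean(piece ** 2)) if piece.size else 0.0
        if rms >= rms_threshold:          # keep only audible speech
            kept.append(samples[start:start + chunk])
    return np.concatenate(kept) if kept else samples[:0]
```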
Creating Better Standards for AI in Medicine
To address these challenges, the healthcare industry must prioritize rigorous validation and oversight of AI systems. Enhanced performance benchmarks are essential to bridge the gap between theoretical and practical accuracy. These benchmarks should reflect real-world complexities, such as diverse patient populations and incomplete datasets, and they should involve domain experts, including clinicians, in their design and evaluation.
Transparency is another critical requirement. AI models should provide probabilistic outputs rather than deterministic answers, allowing clinicians to weigh recommendations appropriately. Developers must disclose known limitations and avoid overhyping AI capabilities.
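As a toy example of what probabilistic output could look like at the interface level, the sketch below wraps a hypothetical, already-calibrated scoring function so that it surfaces a probability and explicitly abstains in the uncertain band, leaving the call to the clinician. The names and the 0.80 threshold are illustrative assumptions, not any deployed system's design:

```python
# Toy sketch of surfacing a probability instead of a bare verdict.
# `model_probability` is a hypothetical, already-calibrated scoring
# function; the 0.80 threshold is an arbitrary illustrative choice.

from dataclasses import dataclass
from typing import Callable

@dataclass
class Recommendation:
    label: str          # e.g. "likely positive" or "defer to clinician"
    probability: float  # calibrated estimate shown alongside the label

def recommend(features: dict,
              model_probability: Callable[[dict], float],
              threshold: float = 0.80) -> Recommendation:
    p = model_probability(features)
    if p >= threshold:
        return Recommendation("likely positive", p)
    if p <= 1 - threshold:
        return Recommendation("likely negative", p)
    # In the uncertain band, say so explicitly rather than guessing.
    return Recommendation("uncertain: defer to clinician", p)
```

Showing the number and the abstention, rather than a flat yes/no, is what lets a clinician weigh the recommendation instead of rubber-stamping it.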
Human oversight remains a cornerstone of safe AI deployment in healthcare. AI systems should augment human decision-making, not replace it. Clinicians must remain the final arbiters in patient care, and organizations must establish accountability mechanisms to ensure that errors are identified and rectified promptly.
Provider education and up-skilling are also crucial to ensure that clinicians and healthcare workers can effectively engage with AI systems. Training programs should focus on understanding the limitations of AI, interpreting its outputs, and identifying potential errors.
Finally, ethical deployment practices are non-negotiable. Companies should refrain from deploying unsupervised AI systems in critical areas like insurance claims or medical diagnostics until the technology has been thoroughly vetted and proven reliable. Regulators and policymakers must enforce stricter guidelines to protect patients from harm and uphold the integrity of the healthcare system.
A Balanced Path Forward
AI in medicine is undeniably transformative, but the current trajectory of its implementation is fraught with risks. Misleading diagnostic claims, unethical applications, and concerning hallucinations, such as those seen in transcription tools, underscore the urgent need for improved validation, transparency, and oversight. By addressing these issues and fostering education and up-skilling for healthcare providers, we can harness the potential of AI to enhance healthcare while safeguarding against its misuse. The future of medical AI depends on striking the right balance between innovation and responsibility.
If you enjoyed today's newsletter, please Like, Comment, and Share.
See you next week,
Sam
Author, Healthcare Compliance Consultant, Attorney
NEW RELEASE: Angels of Deception, a medical thriller!