Improving the detection of behavioral health conditions through positive and unlabeled learning: opioid use disorder #OHDSISocialShowcase #JoinTheJourney
Lead: Praveen Kumar
Co-Author: Christophe Lambert
Background: Opioid use disorder (OUD) is a chronic behavioral health condition marked by prolonged opioid use that leads to significant distress or impairment of brain structure and function. The opioid crisis remains a significant public health problem worldwide. Globally, OUD affects over 16 million people, including more than 2.1 million in the US alone, and opioids contribute to more than 120,000 deaths annually worldwide. In 2020, 91,799 drug overdose deaths occurred in the US, with opioids contributing to 74.8% of those deaths.
Accurately detecting behavioral health conditions such as OUD, and estimating their prevalence, is crucial for identifying at-risk individuals, determining treatment needs, tracking prevention and intervention efforts, and finding treatment-naive individuals for clinical trials. With increased data availability and improved machine learning (ML) frameworks, researchers have recently started applying ML models to healthcare data to analyze various aspects of the opioid crisis. Nevertheless, underdiagnosis and undercoding of these conditions are common in electronic health records (EHRs) and claims data, and the resulting missing data can compromise the reliability of analytics and inferences drawn from these sources.
Our study employs a novel Positive and Unlabeled (PU) machine learning method to estimate both the probability that an individual patient has OUD and the overall prevalence of OUD among individuals who have been exposed to at least one opioid in their lifetime. We also examine how diagnosed OUD rates differ from our imputed estimates across US states using administrative claims data. The Selected Completely At Random (SCAR) assumption is often invalid in healthcare data because coded cases may not be representative of undetected cases (e.g., severe cases may be more likely to generate a healthcare encounter). We therefore applied our novel PU learning algorithm, “Positive Unlabeled Learning Selected Not At Random (PULSNAR),” to estimate the proportion of individuals with OUD among those without a coded diagnosis. PULSNAR can also generate a calibrated estimate of the probability that each patient has a given condition, assuming other patient healthcare data (i.e., conditions, procedures, drugs) are correlated with the condition of interest. The full details of our PU learning algorithm are available in a preprint.
https://lnkd.in/esWG4S9g
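To make the SCAR assumption concrete, here is a minimal sketch of the classic SCAR-based PU estimate (the Elkan-Noto approach). It is not PULSNAR and not the method used in this study; it only illustrates the baseline that PULSNAR is meant to improve on when SCAR fails. The function name, its inputs, and the toy scores are hypothetical: the scores stand in for the output of any classifier trained to separate coded (labeled) patients from uncoded (unlabeled) ones.

```python
from statistics import mean


def elkan_noto_scar(scores_labeled, scores_unlabeled):
    """SCAR-based PU estimates (Elkan-Noto style); NOT the PULSNAR algorithm.

    Both inputs are hypothetical classifier scores g(x) = P(coded | x),
    from a model trained to separate coded from uncoded patients.
    """
    # Under SCAR, the mean score on coded positives estimates the label
    # frequency c = P(coded | truly positive).
    c = mean(scores_labeled)

    # For an uncoded patient: P(positive | x) = ((1 - c) / c) * g / (1 - g),
    # clipped to 1.0 (also guards against division by zero when g == 1).
    probs = [
        1.0 if g >= 1.0 else min(1.0, (1 - c) / c * g / (1 - g))
        for g in scores_unlabeled
    ]

    # Estimated fraction of true positives hidden among uncoded patients.
    alpha = mean(probs)
    return c, alpha, probs


# Toy scores, invented for illustration only.
c, alpha, probs = elkan_noto_scar([0.5, 0.5], [0.5, 0.2, 0.0])
print(round(c, 3), round(alpha, 3))
```

If coded cases are systematically more severe than uncoded ones, the estimate of c is biased, and the resulting prevalence estimate is biased with it. That is the failure mode that motivates dropping the SCAR assumption.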