External validation of a machine learning classifier to identify unhealthy alcohol use in hospitalized patients

journal contribution
posted on 11.11.2022, 22:32 authored by Yiqi Lin, Brihat Sharma, Hale M Thompson, Randy Boley, Kathryn Perticone, Neeraj Chhabra, Majid Afshar, Niranjan S Karnik
BACKGROUND AND AIMS: Unhealthy alcohol use (UAU) is one of the leading causes of global morbidity. A machine learning approach to alcohol screening could accelerate best practices when integrated into electronic health record (EHR) systems. This study aimed to validate externally a natural language processing (NLP) classifier developed at an independent medical center.
DESIGN: Retrospective cohort study.
SETTING: The site for validation was a midwestern United States tertiary-care, urban medical center that has an inpatient structured universal screening model for unhealthy substance use and an active addiction consult service.
PARTICIPANTS/CASES: Unplanned admissions of adult patients between October 23, 2017 and December 31, 2019, with EHR documentation of manual alcohol screening were included in the cohort (n = 57 605).
MEASUREMENTS: The Alcohol Use Disorders Identification Test (AUDIT) served as the reference standard. AUDIT scores ≥5 for females and ≥8 for males served as cases for UAU. To examine error in manual screening or under-reporting, a post hoc error analysis was conducted, reviewing discordance between the NLP classifier and AUDIT-derived reference. All clinical notes excluding the manual screening and AUDIT documentation from the EHR were included in the NLP analysis.
FINDINGS: Using clinical notes from the first 24 hours of each encounter, the NLP classifier demonstrated an area under the receiver operating characteristic curve (AUCROC) and precision-recall area under the curve (PRAUC) of 0.91 (95% CI = 0.89-0.92) and 0.56 (95% CI = 0.53-0.60), respectively. At the optimal cut point of 0.5, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were 0.66 (95% CI = 0.62-0.69), 0.98 (95% CI = 0.98-0.98), 0.35 (95% CI = 0.33-0.38), and 1.0 (95% CI = 1.0-1.0), respectively.
CONCLUSIONS: External validation of a publicly available alcohol misuse classifier demonstrates adequate sensitivity and specificity for routine clinical use as an automated screening tool for identifying at-risk patients.
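
As an illustration of the measurements described above, the following minimal Python sketch shows how an AUDIT-derived reference label and the four reported screening metrics could be computed at a cut point of 0.5. It is not the study's code: the function names `audit_reference_label` and `screening_metrics`, the string values for sex, and the choice to treat a probability equal to the cut point as positive are all assumptions made for this example.

```python
from typing import Dict, Sequence


def audit_reference_label(audit_score: int, sex: str) -> int:
    """Return 1 (UAU case) or 0 using the AUDIT cutoffs stated in the abstract.

    Cutoffs: >= 5 for females, >= 8 for males. The string encoding of sex
    is an assumption of this sketch; other values are rejected.
    """
    cutoffs = {"female": 5, "male": 8}
    if sex not in cutoffs:
        raise ValueError(f"unsupported sex value: {sex!r}")
    return int(audit_score >= cutoffs[sex])


def screening_metrics(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    cut_point: float = 0.5,
) -> Dict[str, float]:
    """Compute sensitivity, specificity, PPV, and NPV at a given cut point.

    A predicted probability >= cut_point counts as a positive screen
    (assumption of this sketch).
    """
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = int(prob >= cut_point)
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif not predicted and truth:
            fn += 1
        else:
            tn += 1

    def ratio(numerator: int, denominator: int) -> float:
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positives among all cases
        "specificity": ratio(tn, tn + fp),  # true negatives among all non-cases
        "ppv": ratio(tp, tp + fp),          # cases among positive screens
        "npv": ratio(tn, tn + fn),          # non-cases among negative screens
    }
```

Calling `screening_metrics(labels, probabilities)` with reference labels built by `audit_reference_label` and classifier probabilities returns the four metrics as a dictionary. PPV and NPV depend on how common UAU is in the cohort, so a classifier with high specificity can still show a modest PPV when cases are rare.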


Great Lakes Node of the Drug Abuse Clinical Trials Network | Funder: National Institutes of Health (National Institute on Drug Abuse) | Grant ID: UG1DA049467

Employing eSBI in a Community-based HIV Testing Environment for At-risk Youth | Funder: National Institute on Drug Abuse | Grant ID: R01DA041071



Lin, Y., Sharma, B., Thompson, H. M., Boley, R., Perticone, K., Chhabra, N., Afshar, M., & Karnik, N. S. (2021). External validation of a machine learning classifier to identify unhealthy alcohol use in hospitalized patients. Addiction, 117(4), 925-933.