1313. A data-driven approach to predict daily risk of Clostridium difficile infection at two large academic health centers
Session: Poster Abstract Session: HAI: C. difficile Risk Assessment and Prevention
Friday, October 6, 2017
Room: Poster Hall CD

Background: An estimated 293,300 healthcare-associated cases of Clostridium difficile infection (CDI) occur annually in the United States. Prior research on risk-prediction models for CDI has focused on a small number of risk factors with the goal of developing a model that works well across hospitals. We hypothesize that risk factors are, in part, hospital-specific. We applied a generalizable machine learning approach to discovering, or “learning”, hospital-specific risk-stratification models using electronic health record (EHR) data collected during the course of patient care from the Massachusetts General Hospital (MGH) and the University of Michigan Health System (UM).

Methods: We used EHR data from 115,958 adult inpatient admissions from 2012-2014 (MGH) and 258,050 adult inpatient admissions from 2010-2016 (UM) (Fig 1). We extracted patient demographics, admission details, patient history, and daily hospitalization details, resulting in 2,964 and 4,739 features in the MGH and UM models, respectively. We used L2-regularized logistic regression to learn the models and measured the discriminative performance of the models on a year of held-out data from each hospital.
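The modeling step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the five binary features, the hyperparameters (`lam`, `lr`, `epochs`), and all function names are hypothetical, and holding out the final calendar year is an assumption (the abstract says only that a year of data was held out). It fits an L2-regularized logistic regression by gradient descent and reports AUROC on the held-out year.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l2_logreg(X, y, lam=0.1, lr=0.5, epochs=300):
    """Batch gradient descent on mean log-loss + (lam/2)*||w||^2 (bias unpenalized)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        # The L2 penalty adds lam * w to the gradient of the weights.
        w = [wj - lr * (gj / n + lam * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_risk(w, b, X):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_synthetic_admissions(n=600, seed=0):
    # Hypothetical stand-in for EHR-derived features; not real patient data.
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        year = rng.choice([2012, 2013, 2014])
        x = [float(rng.random() < 0.3) for _ in range(5)]
        logit = -2.0 + 1.5 * x[0] + 1.0 * x[1] - 0.8 * x[2]
        label = 1 if rng.random() < sigmoid(logit) else 0
        rows.append((year, x, label))
    return rows


rows = make_synthetic_admissions()
# Assumption: train on earlier years, evaluate on the final year.
train = [(x, y) for yr, x, y in rows if yr < 2014]
test = [(x, y) for yr, x, y in rows if yr == 2014]

w, b = fit_l2_logreg([x for x, _ in train], [y for _, y in train])
test_scores = predict_risk(w, b, [x for x, _ in test])
test_auroc = auroc([y for _, y in test], test_scores)
print(f"held-out AUROC: {test_auroc:.2f}")
```

Splitting by time rather than at random mimics how such a model would be used prospectively, since it is always evaluated on admissions that occur after the training period.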

Results: The MGH and UM models achieved AUROCs of 0.74 (CI: 0.73-0.75) and 0.77 (CI: 0.75-0.80), respectively. The relative importance of risk factors varied significantly across hospitals. In particular, in-hospital locations appeared in the set of top risk factors at one hospital and in the set of protective factors at the other. On average, both models were able to predict CDI five days in advance of clinical diagnosis (Fig 2).
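An interval around an AUROC such as those reported above can be estimated in several ways. The abstract does not state which method was used, so the percentile bootstrap below is an assumption, shown only as one common option. The labels and scores are synthetic, and `bootstrap_auroc_ci` and its parameters are hypothetical names.

```python
import random


def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auroc_ci(y_true, scores, n_boot=300, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample cases with replacement, recompute AUROC."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # AUROC needs both classes in the resample
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Synthetic example: 40 cases and 160 non-cases with overlapping risk scores.
rng = random.Random(1)
y_true = [1] * 40 + [0] * 160
scores = [rng.gauss(0.6, 0.2) for _ in range(40)] + [rng.gauss(0.4, 0.2) for _ in range(160)]

point = auroc(y_true, scores)
lower, upper = bootstrap_auroc_ci(y_true, scores)
print(f"AUROC {point:.2f} (bootstrap interval: {lower:.2f}-{upper:.2f})")
```

With few positive cases the interval widens, which is one reason a rare outcome like CDI calls for a large held-out set.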

Conclusion: We used EHR data to generate a daily estimate of the risk of CDI for each inpatient hospitalization. We applied a generalizable data-driven approach to existing data from two large institutions with different patient populations and different data formats and content. In contrast to approaches that focus on learning models that apply generally across hospitals, our proposed approach yields risk-stratification models tailored to an institution’s EHR system and patient population. In turn, these hospital-specific models could allow for earlier and more accurate identification of high-risk patients.

Maggie Makar, MS (1,2); Jeeheh Oh, MS (3); Christopher Fusco, BS (4); Joseph Marchesani, BS (4); Robert McCaffrey, BS (4); Krishna Rao, MD, MS (5); Erin E. Ryan, MPH, CCRP (6,7); Laraine Washer, MD (5); Lauren R. West, MPH (6,7); Vincent B. Young, MD, PhD (5); John Guttag, PhD (1); David C. Hooper, MD (6,7,8); Erica S. Shenoy, MD, PhD (7,8,9,10); and Jenna Wiens, PhD (3)

(1) Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA; (2) Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA; (3) Computer Science and Engineering, University of Michigan, Ann Arbor, MI; (4) Information Systems, Partners HealthCare, Boston, MA; (5) Department of Internal Medicine, Division of Infectious Diseases, University of Michigan Medical School, Ann Arbor, MI; (6) Division of Infectious Diseases, Massachusetts General Hospital, Boston, MA; (7) Infection Control Unit, Massachusetts General Hospital, Boston, MA; (8) Harvard Medical School, Boston, MA; (9) Department of Medicine, Division of Infectious Diseases, Massachusetts General Hospital, Boston, MA; (10) Medical Practice Evaluation Center, Department of Medicine, Massachusetts General Hospital, Boston, MA


M. Makar, None

J. Oh, None

C. Fusco, None

J. Marchesani, None

R. McCaffrey, None

K. Rao, None

E. E. Ryan, None

L. Washer, None

L. R. West, None

V. B. Young, None

J. Guttag, None

D. C. Hooper, None

E. S. Shenoy, None

J. Wiens, None

Findings in the abstracts are embargoed until 12:01 a.m. PDT, Wednesday Oct. 4th with the exception of research findings presented at the IDWeek press conferences.