Improving risk prediction of Clostridium Difficile Infection using temporal event-pairs

Monsalve, Mauricio; Pemmaraju, Sriram; Johnson, Sarah; Polgreen, Philip M.; Balakrishnan, P; Srivatsava, J; Fu, WT; Harabagiu, S; Wang, F

Abstract

Clostridium Difficile Infection (CDI) is a contagious healthcare-associated infection that imposes a significant burden on the healthcare system. In 2011 alone, half a million patients suffered from CDI in the United States, with 29,000 dying within 30 days of diagnosis. Determining which hospital patients are at risk of developing CDI is critical to helping healthcare workers take timely measures to prevent, or detect and treat, this infection. We improve the state of the art of CDI risk prediction by designing an ensemble logistic regression classifier that, given partial patient visit histories, outputs the risk of patients acquiring CDI during their current hospital visit. The novelty of our approach lies in the representation of each patient visit as a collection of co-occurring and chronologically ordered pairs of events. This choice is motivated by our hypothesis that CDI risk is influenced not just by individual events (e.g., being prescribed a first-generation cephalosporin antibiotic), but also by the temporal ordering of individual events (e.g., antibiotic prescription followed by transfer to a certain hospital unit). While this choice explodes the number of features, we use a randomized greedy feature selection algorithm followed by BIC minimization to reduce the dimensionality of the feature space while retaining the most relevant features. We apply our approach to a rich dataset from the University of Iowa Hospitals and Clinics (UIHC), curated from diverse sources, consisting of 200,000 visits (30,000 per year, 2006-2011) involving 125,000 unique patients, 2 million diagnoses, 8 million prescriptions, and 400,000 room transfers, spanning a hospital with 700 patient rooms and 200 units. Our approach to classification produces better risk predictions (AUC) than existing risk estimators for CDI, even when trained just on data available at patient admission. It also identifies novel risk factors for CDI that are combinations of co-occurring and chronologically ordered events.
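
The event-pair representation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `event_pair_features`, the `(timestamp, label)` input format, the `a->b` and `a&b` feature names, and the rule that events with equal timestamps count as co-occurring are all assumptions made for illustration.

```python
from itertools import combinations


def event_pair_features(events):
    """Turn one visit's events into a set of pair features.

    `events` is a list of (timestamp, label) tuples (hypothetical format).
    A pair of distinct events yields "a->b" when a happens strictly
    before b, and "a&b" (labels sorted) when both share a timestamp.
    """
    # Sort by timestamp so every pair from combinations() is in time order.
    ordered = sorted(events, key=lambda e: e[0])
    features = set()
    for (t1, a), (t2, b) in combinations(ordered, 2):
        if a == b:
            continue  # skip pairs of the same event label
        if t1 < t2:
            features.add(f"{a}->{b}")  # chronologically ordered pair
        else:
            features.add("&".join(sorted((a, b))))  # co-occurring pair
    return features


# Illustrative visit; the event labels are made up, not taken from the dataset.
visit = [
    (1, "cephalosporin_rx"),
    (3, "transfer_unit_A"),
    (3, "diagnosis_X"),
]
pairs = event_pair_features(visit)
```

Each pair could then serve as a binary feature for a classifier. The number of possible pairs grows quadratically with the number of distinct event types, which is why the abstract pairs this representation with a feature selection step.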

More information

WOS ID: WOS:000380399000018
Journal title: 2015 IEEE INTERNATIONAL CONFERENCE ON HEALTHCARE INFORMATICS (ICHI 2015)
Publisher: IEEE
Publication date: 2015
Start page: 140
End page: 149
DOI: 10.1109/ICHI.2015.24

Notes: ISI