Methods: Inpatient hospitalizations in 2016 with ICD-10 codes for UTI at a children’s hospital were identified. Records of inpatients with positive urine cultures in 2016 were reviewed to identify missed cases. Notes from inpatient hospitalizations in 2016 were processed using an NLP pipeline. The pipeline receives real-time data, accounts for institution-specific document structure, performs named-entity recognition on clinical problems and symptoms, and matches these terms to concept unique identifiers (CUIs) in the Unified Medical Language System (UMLS). We used the UMLS CUI for urinary tract infections (C0042029) to identify notes of interest. To minimize false positives, we set the threshold for case positivity at the mean number of UTI CUI mentions per patient during 2016.
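The thresholding step can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the input shape (one `(patient_id, cuis)` pair per note), the function names, the choice to average only over patients with at least one UTI mention, and the use of "greater than or equal to" for positivity are all assumptions made for illustration.

```python
from collections import Counter
from statistics import mean

UTI_CUI = "C0042029"  # UMLS CUI for urinary tract infection, as given in the abstract


def count_uti_mentions(notes):
    """Count UTI CUI mentions per patient.

    `notes` is a hypothetical iterable of (patient_id, list_of_cuis) pairs,
    one pair per clinical note.
    """
    counts = Counter()
    for patient_id, cuis in notes:
        counts[patient_id] += sum(1 for cui in cuis if cui == UTI_CUI)
    return counts


def mean_mention_threshold(counts):
    """Mean mentions per patient, over patients with at least one mention (assumption)."""
    mentioned = [n for n in counts.values() if n > 0]
    return mean(mentioned)


def flag_cases(counts, threshold):
    """Return patients whose mention count meets or exceeds the threshold (assumption)."""
    return sorted(p for p, n in counts.items() if n >= threshold)


# Toy data, not study data.
example_notes = [
    ("A", [UTI_CUI, UTI_CUI, "C0000000"]),
    ("A", [UTI_CUI]),
    ("B", [UTI_CUI]),
    ("C", ["C0000000"]),
    ("D", [UTI_CUI] * 8),
]
example_counts = count_uti_mentions(example_notes)
example_threshold = mean_mention_threshold(example_counts)
example_flagged = flag_cases(example_counts, example_threshold)
```

With the toy data, patients A, B and D have 3, 1 and 8 mentions, so the threshold is 4 and only D is flagged. A higher threshold trades sensitivity for fewer charts to review.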
Results: Among 10,681 hospitalized patients, 181 unique patients were identified with UTI using ICD-10 codes. An additional 85 UTI cases were identified by chart review of positive urine cultures (n=409). The NLP pipeline screened a total of 289,344 notes to identify UTI patients. Using the predefined threshold (n=6), the NLP-based method detected all UTI cases identified by ICD-10 screening. Of the additional cases missed by ICD-10 codes, 84/85 (98.8%) were positive by the NLP-based method. To identify these 84 true cases, an additional 275 charts without UTI that were flagged as positive by the NLP method would have to be reviewed (a ratio of approximately 1:3).
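The performance figures can be derived from the reported counts, as the short Python sketch below shows. The variable and function names are illustrative. The overall sensitivity assumes the reference standard is the union of ICD-10 and chart-review cases, and the positive predictive value applies only to NLP-flagged charts outside the ICD-10 set; neither figure is reported in the abstract.

```python
def summarize_nlp_performance(icd10_cases, culture_only_cases,
                              nlp_hits_culture_only, nlp_false_positives):
    """Derive summary metrics from the counts reported in the Results."""
    # Sensitivity among cases that ICD-10 coding missed.
    sens_missed = nlp_hits_culture_only / culture_only_cases
    # Overall sensitivity, assuming every ICD-10 case was detected by NLP
    # and that ICD-10 plus chart review together form the reference standard.
    sens_overall = (icd10_cases + nlp_hits_culture_only) / (icd10_cases + culture_only_cases)
    # Charts without UTI reviewed per additional true case found.
    false_per_true = nlp_false_positives / nlp_hits_culture_only
    # Positive predictive value among NLP-flagged charts outside the ICD-10 set.
    ppv_extra = nlp_hits_culture_only / (nlp_hits_culture_only + nlp_false_positives)
    return {
        "sens_missed": sens_missed,
        "sens_overall": sens_overall,
        "false_per_true": false_per_true,
        "ppv_extra": ppv_extra,
    }


# Counts taken from the Results paragraph.
metrics = summarize_nlp_performance(
    icd10_cases=181,
    culture_only_cases=85,
    nlp_hits_culture_only=84,
    nlp_false_positives=275,
)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```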
Conclusion: We demonstrate the use of an NLP-based pipeline to enhance IDS surveillance. Combining NLP-based surveillance with other methods could facilitate case detection and outbreak control for IDS that lack microbiologic data or have novel presentations. Further work will aim to improve the specificity of NLP-based case-finding methods and apply them to other IDS.
K. Natarajan, None