Background: Medical research publications on sepsis have increased at an exponential rate, whereas our capacity to absorb and understand them has remained limited. We used topic modeling, a method that allows machines to distill large amounts of text into their elemental themes, to help us infer the discourse that led to the present model and understanding of sepsis. By using this model to augment our understanding of sepsis, an evolving, networked, and complex disease, we aimed to recognize connections that could be explored further and aid knowledge discovery.
Methods: We extracted all abstracts from PubMed containing the terms “sepsis”, “septic shock”, and “septicemia” published between 1890 and 2017 and retained the most informative words. Using topic modeling approaches based on Latent Dirichlet Allocation, we trained dynamic topic models with 5 topics on the corpus. We conducted a thematic analysis of topics across publication periods by examining the 30 most frequent words in each topic for each decade. We then fit a static topic model to abstracts from the last 5 years. We compared the respective themes and their relatedness, and compared the frequency of each topic between the first and second halves of the century.
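The final comparison in the Methods (topic frequency in the first versus the second half of the century) can be illustrated with a minimal sketch. It assumes each abstract has already been given a single dominant topic by a fitted model. The input records, the generic topic labels, the 1950 split year, and the helper names `period_of` and `topic_shares` are all hypothetical and chosen for illustration; none of them are the study's data or code.

```python
from collections import Counter, defaultdict

# Hypothetical toy input: (publication year, dominant topic label) per abstract.
# These records are made up for illustration and are NOT the study's data.
docs = [
    (1905, "topic_0"),
    (1912, "topic_0"),
    (1938, "topic_1"),
    (1947, "topic_0"),
    (1963, "topic_2"),
    (1978, "topic_3"),
    (1985, "topic_3"),
    (1996, "topic_2"),
]


def period_of(year, split=1950):
    """Assign a year to a half-century; the split year is an assumption."""
    return "first_half" if year < split else "second_half"


def topic_shares(records):
    """Return {period: {topic: share of that period's abstracts}}."""
    counts = defaultdict(Counter)
    for year, topic in records:
        counts[period_of(year)][topic] += 1
    shares = {}
    for period, counter in counts.items():
        total = sum(counter.values())
        shares[period] = {t: n / total for t, n in counter.items()}
    return shares


shares = topic_shares(docs)
for period in ("first_half", "second_half"):
    # List topics from most to least frequent within each period.
    for topic, share in sorted(shares[period].items(), key=lambda kv: -kv[1]):
        print(f"{period:12s} {topic:8s} {share:.2f}")
```

Normalizing by the number of abstracts in each period, rather than comparing raw counts, keeps the comparison meaningful when publication volume grows sharply over time.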
Results: Five themes emerged overall: surgery, physiology, microbiology, neonatal/maternal health, and cellular and endothelial responses to infection. When the analysis was limited to the last 5 years, the topics were: acute organ failure and ICU management, early sepsis management and cost, cellular and endothelial response, biomarkers and viruses, and neonatal infection. In the first half of the 20th century, the bulk of research focused on microbiology, whereas in the latter half of the century attention to the host response increased.
Conclusion: When visualizing the frequency of each topic over the last 100 years, we found that the focus has shifted from the pathogen to the host response, from both a cellular and a physiologic perspective. In the last 5 years, biomarkers, early recognition, and system management emerged as new themes. Reasons for this may include the evolution of scientific tools, treatments, and statistical abilities; an increasing focus on healthcare cost; and ultimately an incorporation of the individual host response into the disease model.
A. D. Bostwick,
B. Jones, None
R. Paine, None
M. Samore, None
M. Jones, None