2474. Early Feedback From A Pilot Of A Cognitive Computing System To Analyze Immunization Data
Session: Poster Abstract Session: Vaccine Policy and Hesitancy
Saturday, October 6, 2018
Room: S Poster Hall

Immunization programs maintain and improve vaccination coverage to prevent diseases. Text data from immunization programs provide the contextual information needed to better understand vaccination coverage. However, analyzing text data can be labor-intensive. Cognitive computing systems address this challenge by systematically processing large volumes of text data.


Publicly available data were used. Formal data were gathered with scrapers and parsers that extracted information from immunization-related websites, journals, and legislation. Informal data were collected from Twitter feeds through Sysomos, a social media search platform. All data were preprocessed to remove irrelevant text. Existing algorithms analyzed the data, retrieved the words or paragraphs most closely related to each query, and produced similarity scores. Additionally, the Word2vec and GloVe algorithms were used to assess similarity and frequency of occurrence between queried and retrieved information.
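The similarity scoring described above can be illustrated with a minimal sketch. It assumes that each text is represented by the average of its word vectors and that similarity is measured by cosine; the abstract does not state how the pilot combined vectors or computed scores. The three-dimensional vectors in `TOY_VECTORS` are invented for illustration and stand in for embeddings that a trained Word2vec or GloVe model would supply, and the names `embed`, `cosine`, and `rank` are hypothetical.

```python
import math

# Toy 3-dimensional word vectors, invented for illustration only.
# A real system would load vectors from a trained embedding model.
TOY_VECTORS = {
    "vaccine": [0.9, 0.1, 0.0],
    "immunization": [0.8, 0.2, 0.1],
    "measles": [0.6, 0.5, 0.1],
    "outbreak": [0.3, 0.8, 0.2],
    "school": [0.1, 0.2, 0.9],
}


def embed(text):
    """Average the vectors of the known words; return None if none are known."""
    words = [w for w in text.lower().split() if w in TOY_VECTORS]
    if not words:
        return None
    dims = len(next(iter(TOY_VECTORS.values())))
    return [sum(TOY_VECTORS[w][i] for w in words) / len(words) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query, passages):
    """Return (score, passage) pairs sorted from most to least similar."""
    q = embed(query)
    scored = []
    for passage in passages:
        v = embed(passage)
        score = cosine(q, v) if q is not None and v is not None else 0.0
        scored.append((score, passage))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `rank("vaccine", ["school", "immunization", "outbreak"])` with these toy vectors places "immunization" first, because its vector points in nearly the same direction as the query's.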


The system searches by query, date, and jurisdiction. A query can range from a single word to a whole document. The system models similarities between words, sentences, paragraphs, and documents and retrieves text based on its similarity to the query. Results are supplemented by similarity scores, dates, jurisdictions, web links, and usernames (where applicable). Similarity scores allow for quantitative analysis of text data.
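A search of this kind, filtered by date and jurisdiction and ranked by similarity, might look like the sketch below. The `Record` fields mirror the result attributes listed above, but the class, the `search` function, and its parameters are hypothetical. Jaccard word overlap is used here only as a simple stand-in for the embedding-based similarity the pilot actually used.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:
    """One retrievable text item; field names are assumptions for illustration."""
    text: str
    published: date
    jurisdiction: str
    url: str
    username: Optional[str] = None  # only present for social media items


def jaccard(a, b):
    """Share of distinct words that two texts have in common (stand-in score)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def search(records, query, start=None, end=None, jurisdiction=None, top_k=5):
    """Filter by date range and jurisdiction, then rank by similarity to the query."""
    hits = []
    for record in records:
        if start is not None and record.published < start:
            continue
        if end is not None and record.published > end:
            continue
        if jurisdiction is not None and record.jurisdiction != jurisdiction:
            continue
        hits.append((jaccard(query, record.text), record))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits[:top_k]
```

Each hit pairs a similarity score with the full record, so the date, jurisdiction, link, and username remain available alongside the score for later quantitative analysis.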


The pilot cognitive computing system used algorithms to quickly search immunization text data from both formal and informal sources. The formal data can help identify program activities associated with changes in vaccination coverage. The informal data can help assess the information being shared through social media during an outbreak or other emergency. The system will stay relevant as long as new data are continuously incorporated to update the algorithms.

Sarah Ball, MPH, ScD1, Marija Stanojevic, ME, BE2, Cindi Knighton, BS3, William Campbell, MPH1, Alison Thaung, MBA1, Alison Fisher, MPH3, Alexandra Bhatti, JD, MPH3, Yoonjae Kang, MPH3, Pam Srivastava, MS3, Fang Zhou, PhD2, Zoran Obradovic, PhD2 and Stacie Greby, DVM, MPH3, (1)Abt Associates, Cambridge, MA, (2)Temple University, Philadelphia, PA, (3)Centers for Disease Control and Prevention, Atlanta, GA


S. Ball, None

M. Stanojevic, None

C. Knighton, None

W. Campbell, None

A. Thaung, None

A. Fisher, None

A. Bhatti, None

Y. Kang, None

P. Srivastava, None

F. Zhou, None

Z. Obradovic, None

S. Greby, None

Findings in the abstracts are embargoed until 12:01 a.m. PDT, Wednesday Oct. 3rd with the exception of research findings presented at the IDWeek press conferences.