Methods: We obtained, in real time, a random 10-15% sample of the 4.4 billion tweets posted on the micro-blogging service Twitter between June 13 and December 6, 2011. The tweets were anonymized, and a dataset of English-language conversations was built using lists of vaccine- and vaccine safety-related terms. Conversations unrelated to vaccines were removed. The documents were clustered using a proprietary program (http://in-spire.pnnl.gov) and their relatedness was compared using a Pearson chi-square test. Key summary terms were terms highly associated with a set of documents; those that increased significantly (p<0.05) within a time period were called surprising terms. Trending themes were topics that increased throughout a timeline. The inter-relatedness of themes was compared by the co-occurrence of other terms within and across the themes’ summary term centroids. Visualization of the top themes permitted consideration of the inter-relations and frequencies of topics as well as the rapidity with which issues emerged.
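As an illustration of the "surprising term" criterion described in the Methods, the following is a minimal sketch of a Pearson chi-square comparison of a term's document frequency in two time periods. It is not the proprietary program's implementation, whose internals are not described here. The function names, the 2x2 table layout, and the example counts are all hypothetical, and the sketch assumes a test with one degree of freedom and no continuity correction.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction)
    for the 2x2 contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # A zero margin makes the statistic undefined; treat it as no evidence.
    if 0 in (row1, row2, col1, col2):
        return 0.0
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def p_value_1df(chi2):
    """Upper-tail p-value of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def is_surprising(term_now, docs_now, term_before, docs_before, alpha=0.05):
    """Flag a term whose share of documents rose significantly.

    term_now / term_before: documents containing the term in each period.
    docs_now / docs_before: total documents in each period.
    Returns (flag, chi-square statistic, p-value).
    """
    if docs_now == 0 or docs_before == 0:
        return False, 0.0, 1.0
    chi2 = chi_square_2x2(
        term_now, docs_now - term_now,
        term_before, docs_before - term_before,
    )
    p = p_value_1df(chi2)
    # The test is two-sided, so also require that the share went up.
    increasing = term_now / docs_now > term_before / docs_before
    return increasing and p < alpha, chi2, p


# Hypothetical counts for illustration only (not data from this study).
flag, chi2, p = is_surprising(120, 1000, 60, 1000)
print(flag, round(chi2, 2), p < 0.05)
```

Because the chi-square test itself is two-sided, the sketch adds an explicit check that the term's share increased, which matches the "significantly increasing" wording above but is an assumption about how direction was handled.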
Results: There were 189,539 English-language conversations about vaccines and vaccine safety. Themes and surprising terms came and went rapidly. A particularly useful data subset emerged, consisting of 52,378 tweets and re-tweets about vaccine safety that also mentioned a person and a question. These consisted of observations, strong opinions about topical issues, conspiracy theories and newly emerging issues. These topics rapidly expanded into trending themes, with new surprising terms following press releases and statements by prominent people.
Conclusion: Emerging vaccine safety issues can be quickly identified, quantified and tracked in real time among conversations in social media such as Twitter. Using these types of tools should permit the assessment of new strategies to counter evolving concerns about vaccines.
S. Rose, None
M. Myers, None