Methods: We developed a modular open-source software suite, PathogenDB, that implements the major functions needed for genomic clinical microbiology and pathogen surveillance. A central laboratory information management system runs on a standard open-source Linux/Apache/MySQL/PHP stack. A modular genomics workflow, PathogenDB-pipeline, was publicly released in 2014. It automates de novo assembly of reads with HGAP, circularizes contigs with Circlator, annotates genes with Prokka, and predicts epigenetic motifs. The pipeline also post-processes assemblies to evaluate their quality and provides visualizations through a custom genome browser (ChromoZoom). A comparative genomics module, PathogenDB-comparison, performs semi-automated phylogenetic analysis with Mugsy and RAxML.
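As an illustration of the modular design described in the Methods, the ordered steps (assembly, circularization, annotation, motif prediction) can be sketched as a list of interchangeable stages that each read from and add to a shared context. This is a minimal sketch and not the PathogenDB-pipeline code: the `Stage` class, the `make_placeholder` and `run_pipeline` helpers, the file-name suffixes, and the choice to feed the circularized assembly into motif prediction are all assumptions made for illustration. No real tool is invoked; each stage is a stand-in that only derives an output name from its input.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Context = Dict[str, str]


@dataclass
class Stage:
    """Hypothetical stage record; not taken from the PathogenDB-pipeline source."""
    name: str                           # step label
    tool: str                           # tool the abstract associates with the step
    run: Callable[[Context], Context]   # returns the new outputs of the step


def make_placeholder(output_key: str, input_key: str, suffix: str) -> Callable[[Context], Context]:
    """Build a stand-in step that derives an output name from one input.

    A real stage would call the external tool here; this sketch does not.
    """
    def step(ctx: Context) -> Context:
        if input_key not in ctx:
            raise KeyError(f"missing input '{input_key}' for output '{output_key}'")
        return {output_key: ctx[input_key] + suffix}
    return step


# Stage order follows the abstract; the suffixes are illustrative assumptions.
STAGES: List[Stage] = [
    Stage("assemble", "HGAP", make_placeholder("contigs", "reads", ".assembled")),
    Stage("circularize", "Circlator", make_placeholder("circular", "contigs", ".circ")),
    Stage("annotate", "Prokka", make_placeholder("annotation", "circular", ".gff")),
    # Assumption: motif prediction consumes the circularized assembly.
    Stage("motifs", "motif prediction", make_placeholder("motifs", "circular", ".motifs")),
]


def run_pipeline(reads: str, stages: List[Stage] = STAGES) -> Context:
    """Run the stages in order, merging each stage's outputs into the context."""
    ctx: Context = {"reads": reads}
    for stage in stages:
        ctx.update(stage.run(ctx))
    return ctx


if __name__ == "__main__":
    for key, value in run_pipeline("sample01").items():
        print(f"{key}: {value}")
```

Because each stage declares only the input it needs and the output it produces, a stage can be swapped or reordered without touching the others, and a stage placed before its input exists fails immediately with a `KeyError` rather than running on missing data.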
Results: PathogenDB-pipeline has been used to assemble and annotate 232 genomes from 7 species, and runs end-to-end in under 12 hours. At an urban tertiary-care hospital, PathogenDB-comparison has been used to genomically characterize one MRSA outbreak, two transmissions via solid organ transplant, and pseudo-outbreaks of S. maltophilia and B. cepacia. Both software packages are freely available on GitHub.
Conclusion: We have created modular, open-source software that automates significant portions of a genomic clinical microbiology workflow and can characterize transmissions within an outbreak. Further work could add visualizations based on epidemiological trend data and geospatial analysis, allowing rapid insight into transmission events and potential outbreaks occurring within an NGS-equipped hospital.
O. Attie, None
E. Webster, None
A. Kasarskis, None
H. Van Bakel, None
A. Bashir, None