Publications / SAND Report

Synthetic data generators for the evaluation of biosurveillance outbreak detection algorithms

Levin, Drew L.; Finley, Patrick D.

Developing new algorithmic and statistical methods for outbreak detection is an ongoing research priority in biosurveillance. Early detection of emergent disease outbreaks is crucial for effective treatment and mitigation. To be evaluated properly, new detection methods must be compared with established approaches. This comparison requires biosurveillance test data that accurately reflects the complexity of the real-world data to which the methods will be applied. While new detection methods are best tested and evaluated on real data, such data is often impractical to obtain because it is either proprietary or limited in scope. Thus, scientists must turn to synthetic data generation to provide enough data to properly evaluate new detection methodologies. This paper evaluates three such synthetic data sources: the WSARE dataset, the Noufaily equation-based approach, and the Project Mimic data generator.