Daniel B. Neill

Detection of Spatial and Spatio-Temporal Clusters

Degree Type: Ph.D. in Computer Science
Advisor(s): Andrew Moore
Graduated: August 2006

Abstract:

This thesis develops a general and powerful statistical framework for the automatic detection of spatial and space-time clusters. Our "generalized spatial scan" framework is a flexible, model-based framework for accurate and computationally efficient cluster detection in diverse application domains. Through the development of the "fast spatial scan" algorithm and new Bayesian cluster detection methods, we can now detect clusters hundreds or thousands of times faster than previous approaches. More timely detection of emerging clusters (with high detection power and low false positive rates) was made possible by the development of "expectation-based" scan statistics, which learn baseline models from past data and then detect regions that are anomalous given these expectations. These cluster detection methods were applied to two real-world problem domains: the early detection of emerging disease epidemics, and the detection of clusters of activity in fMRI brain imaging data. One major contribution of this work is the development of the SSS system for nationwide disease surveillance, currently used in daily practice by several state and local health departments. This system receives data (including emergency department records and medication sales) from over 20,000 stores and hospitals nationwide, automatically detects emerging clusters of disease, and reports these results to public health officials. Through retrospective case studies and semi-synthetic testing, we have shown that our system can detect outbreaks significantly faster than previous disease surveillance methods.

Thesis Committee:
Andrew Moore (Chair)
Tom Mitchell
Jeff Schneider
Gregory Cooper (University of Pittsburgh)
Andrew Lawson (University of South Carolina)

Jeannette Wing, Head, Computer Science Department
Randy Bryant, Dean, School of Computer Science

Keywords:
Cluster detection, data mining, algorithms, biosurveillance, fMRI

CMU-CS-06-142.pdf (2.62 MB) (158 pages)