Ronald Taylor Locke
Anomaly Detection with Applications in Environmental and Cyber Security
Committee Members: Advisor: Yannis Paschalidis, SE/ECE; Christos Cassandras, SE/ECE; David Castañón, SE/ECE; Bobak Nazer, SE/ECE; Appointed Chair: Sean Andersson, SE/ME
Abstract: Two approaches to detecting anomalous behavior within a sequence of random observations are presented. One approach is stochastic in nature, using large deviations techniques to form a Hoeffding decision test. Scenarios in which sequential observations can be considered independent and identically distributed (iid) or adhere to a first-order Markov chain are both considered. The Markovian case is explored further and asymptotic performance results are developed for using the generalized likelihood ratio test (GLRT) to identify a Markov source. After a presentation of binary and multi-class Support Vector Machines (SVM), a deterministic anomaly detection method based on the so-called one-class SVM is also presented. The presented methodologies are then applied to detection and localization of Chemical, Biological, Radiological, or Nuclear (CBRN) events in an urban area using a network of sensors. In contrast to earlier work, these approaches do not solve an inverse dispersion problem but rely on data obtained from a simulation of the CBRN dispersion to obtain descriptors of sensor measurements under a variety of CBRN release scenarios. To assess the problem of environmental monitoring, CBRN event-free conditions are assumed to be iid and a corresponding stochastic anomaly detector is relied on to detect a CBRN event. Conditional on such an event, subsequent sensor observations are assumed to follow a Markov process. Accordingly, the presented Markov source identification methodology is used to map sensor observations to a source location chosen out of a discrete set of possible locations. A multiclass SVM approach to CBRN localization is also developed, and the two techniques are compared using three-dimensional CBRN release simulations. Also addressed is the problem of optimally placing sensors to minimize the localization probability of error. The anomaly detection approaches are then applied to detection of data exfiltration-style attempts on a network server. Two one-class SVM approaches are presented. In both, data packet transmissions are captured and compiled into network flows. In a flow-by-flow network anomaly detector, features are extracted from individual flows and their novelty is tested. If a flow’s features differ too greatly from nominal flow features, as determined by the SVM, that flow is declared an anomaly. In a network-wide anomaly detector, the novelty of a time sequence of flows is tested. The stochastic anomaly detectors are applied to sequences of flows as well, under the contexts of subsequent network flows either being iid or following a Markov process.These techniques are evaluated on simulated network traffic.