The impressive performance that deep neural networks demonstrate on a range of seismic monitoring tasks depends largely on the availability of event catalogs that have been manually curated over many years or decades. However, the quality, duration, and availability of seismic event catalogs vary significantly across the range of monitoring operations, regions, and objectives. Semisupervised learning (SSL) enables learning from both labeled and unlabeled data and provides a framework to leverage the abundance of unreviewed seismic data for training deep neural networks on a variety of target tasks. We apply two SSL algorithms (mean-teacher and virtual adversarial training) as well as a novel hybrid technique (exponential average adversarial training) to seismic event classification to examine how unlabeled data with SSL can enhance model performance. In general, we find that SSL can perform as well as supervised learning with fewer labels. We also observe in some scenarios that almost half of the benefits of SSL are the result of the meaningful regularization enforced through SSL techniques and may not be attributable to unlabeled data directly. Lastly, the benefits from unlabeled data scale with the difficulty of the predictive task when we evaluate the use of unlabeled data to characterize sources in new geographic regions. In geographic areas where supervised model performance is low, SSL significantly increases the accuracy of source-type classification using unlabeled data.
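The mean-teacher idea named above can be illustrated with a minimal sketch. It assumes the standard formulation: a teacher whose weights are an exponential moving average of the student's, plus a consistency penalty on unlabeled inputs. The function names, the `alpha` and `weight` parameters, and the use of raw logit arrays in place of a real network are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np

def ema_update(teacher, student, alpha=0.99):
    """Move each teacher weight toward the student weight (exponential moving average)."""
    return {k: alpha * teacher[k] + (1.0 - alpha) * student[k] for k in teacher}

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(logits, labels):
    """Supervised loss on the labeled batch (labels are integer class indices)."""
    p = softmax(logits)
    return float(-np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12)))

def consistency_loss(student_logits, teacher_logits):
    """Mean squared difference between student and teacher class probabilities."""
    return float(np.mean((softmax(student_logits) - softmax(teacher_logits)) ** 2))

def mean_teacher_loss(labeled_logits, labels,
                      student_unlabeled_logits, teacher_unlabeled_logits,
                      weight=1.0):
    """Supervised term on labeled data plus weighted consistency term on unlabeled data."""
    supervised = cross_entropy(labeled_logits, labels)
    consistency = consistency_loss(student_unlabeled_logits, teacher_unlabeled_logits)
    return supervised + weight * consistency
```

In a training loop under these assumptions, the student would be updated by gradient descent on `mean_teacher_loss`, and `ema_update` would be called after each step. The consistency term needs no labels, which is how unreviewed data can contribute to training.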
Long-term seismic monitoring networks are well positioned to leverage advances in machine learning because of the abundance of labeled training data that curated event catalogs provide. We explore the use of convolutional and recurrent neural networks to discriminate explosive from tectonic sources at local distances. Using a 5-year event catalog generated by the University of Utah Seismograph Stations, we train models to produce automated event labels using 90-s event spectrograms from three-component and single-channel sensors. Both network architectures replicate analyst labels with accuracy above 98%. Most commonly, model error is the result of label error (70% of cases). After accounting for mislabeled events (~1% of the catalog), accuracy for both models increases to above 99%. Classification accuracy remains above 98% for shallow tectonic events, indicating that spectral characteristics controlled by event depth do not play a dominant role in event discrimination.
The quality of automatic signal detections from sensor networks depends on individual detector trigger levels (TLs) from each sensor. The largely manual process of identifying effective TLs is painstaking and does not guarantee optimal configuration settings, yet the quality of automatic signal detection, and ultimately event detection, depends closely on these parameters. We present a Dynamic Detector Tuning (DDT) system that automatically adjusts effective TL settings for signal detectors to the current state of the environment by leveraging cooperation within a local neighborhood of network sensors. After a stabilization period, the DDT algorithm can adapt in near-real time to changing conditions and automatically tune a signal detector to detect signals from only events of interest. Our current work focuses on reducing false signal detections early in the seismic signal processing pipeline, which leads to fewer false events and significantly reduces analyst time and effort. This system provides an important new method to automatically tune detector TLs for a network of sensors and is applicable both to boosting the performance of existing sensors and to deploying new sensors. With ground truth on detections from a local neighborhood of seismic sensors within a network monitoring the Mount Erebus volcano in Antarctica, we show that DDT reduces the number of false detections by 18% and the number of missed detections by 11% when compared with optimal fixed TLs for all sensors.
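The abstract does not give the DDT update rule, so the following is only a hypothetical sketch of what neighbor-corroborated trigger-level tuning could look like. The function name, the agreement threshold `min_agree`, the step size, and the clamping bounds are all assumptions made for illustration and should not be read as the published algorithm.

```python
def tune_trigger_level(level, detected, neighbor_detections,
                       step=0.1, min_agree=2, lo=1.5, hi=10.0):
    """Nudge one sensor's trigger level using its neighbors' detections.

    level               -- current trigger level of this sensor
    detected            -- True if this sensor triggered in the time window
    neighbor_detections -- booleans, one per neighboring sensor, for the same window
    All parameters and the update rule itself are hypothetical.
    """
    agree = sum(bool(d) for d in neighbor_detections)
    if detected and agree < min_agree:
        # Uncorroborated trigger: treat as a likely false detection, be stricter.
        level += step
    elif not detected and agree >= min_agree:
        # Neighbors saw a signal this sensor missed: be more sensitive.
        level -= step
    # Keep the level inside a sane operating range.
    return min(hi, max(lo, level))
```

Applied repeatedly over successive windows, a rule of this kind lets each sensor's level drift with local noise conditions instead of staying fixed, which is the general behavior the abstract describes.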
Rigorous characterization of the performance and generalization ability of cyber defense systems is extremely difficult, making it hard to gauge uncertainty, and thus, confidence. This difficulty largely stems from a lack of labeled attack data that fully explores the potential adversarial space. Currently, performance of cyber defense systems is typically evaluated in a qualitative manner by manually inspecting the results of the system on live data and adjusting as needed. Additionally, machine learning has shown promise in deriving models that automatically learn indicators of compromise that are more robust than analyst-derived detectors. However, to generate these models, most algorithms require large amounts of labeled data (i.e., examples of attacks). Algorithms that do not require annotated data to derive models are similarly at a disadvantage, because labeled data is still necessary when evaluating performance. In this work, we explore the use of temporal generative models to learn cyber attack graph representations and automatically generate data for experimentation and evaluation. Training and evaluating cyber systems and machine learning models requires a significant amount of annotated data, which is typically collected and labeled by hand for one-off experiments. Automatically generating such data helps to derive and evaluate detection models and ensures reproducibility of results. Experimentally, we demonstrate the efficacy of generative sequence analysis techniques on learning the structure of attack graphs, based on a realistic example. These derived models can then be used to generate more data. Additionally, we provide a roadmap for future research efforts in this area.
Neural machine learning methods, such as deep neural networks (DNN), have achieved remarkable success in a number of complex data processing tasks. These methods have arguably had their strongest impact on tasks such as image and audio processing, data processing domains in which humans have long held clear advantages over conventional algorithms. In contrast to biological neural systems, which are capable of learning continuously, deep artificial networks have a limited ability to incorporate new information into an already trained network. As a result, methods for continuous learning are potentially highly impactful in enabling the application of deep networks to dynamic data sets. Here, inspired by the process of adult neurogenesis in the hippocampus, we explore the potential for adding new neurons to deep layers of artificial neural networks in order to facilitate their acquisition of novel information while preserving previously trained data representations. Our results on the MNIST handwritten digit dataset and the NIST SD 19 dataset, which includes lower and upper case letters and digits, demonstrate that neurogenesis is well suited for addressing the stability-plasticity dilemma that has long challenged adaptive machine learning algorithms.
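As a rough illustration of adding neurons to a trained layer without disturbing what it already computes, the sketch below widens one hidden layer of a small feed-forward network. The zero-initialized outgoing weights, the ReLU activation, and all function and parameter names are assumptions chosen for the example; the abstract does not specify how new neurons are initialized or trained.

```python
import numpy as np

def forward(x, W_in, b, W_out):
    """One hidden ReLU layer followed by a linear readout."""
    hidden = np.maximum(0.0, x @ W_in + b)
    return hidden @ W_out

def grow_hidden_layer(W_in, b, W_out, n_new, rng, scale=0.01):
    """Append n_new hidden neurons while leaving the existing weights untouched.

    New incoming weights are small random values so the new units can learn;
    new outgoing weights start at zero so the network output is initially
    unchanged (an illustrative choice, not the paper's stated procedure).
    """
    n_inputs, n_outputs = W_in.shape[0], W_out.shape[1]
    new_in = rng.normal(0.0, scale, size=(n_inputs, n_new))
    new_b = np.zeros(n_new)
    new_out = np.zeros((n_new, n_outputs))
    return (np.hstack([W_in, new_in]),
            np.concatenate([b, new_b]),
            np.vstack([W_out, new_out]))
```

Because the old parameters are copied verbatim, previously learned representations are preserved at the moment of growth. Training on new classes can then adjust mainly the added units, which is one simple way to trade off stability against plasticity.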
Biological neural networks continue to inspire new developments in algorithms and microelectronic hardware to solve challenging data processing and classification problems. Here, we survey the history of neural-inspired and neuromorphic computing in order to examine the complex and intertwined trajectories of the mathematical theory and hardware developed in this field. Early research focused on adapting existing hardware to emulate the pattern recognition capabilities of living organisms. Contributions from psychologists, mathematicians, engineers, neuroscientists, and other professions were crucial to maturing the field from narrowly-tailored demonstrations to more generalizable systems capable of addressing difficult problem classes such as object detection and speech recognition. Algorithms that leverage fundamental principles found in neuroscience such as hierarchical structure, temporal integration, and robustness to error have been developed, and some of these approaches are achieving world-leading performance on particular data classification tasks. In addition, novel microelectronic hardware is being developed to perform logic and to serve as memory in neuromorphic computing systems with optimized system integration and improved energy efficiency. Key to such advancements was the incorporation of new discoveries in neuroscience research, the transition away from strict structural replication and towards the functional replication of neural systems, and the use of mathematical theory frameworks to guide algorithm and hardware developments.
Given a set of observations within a specified time window, a fitness value is calculated at each grid node by summing station-specific conditional fitness values. Assuming each observation was generated by a refracted P wave, these values are proportional to the conditional probabilities that each observation was generated by a seismic event at the grid node. The node with the highest fitness value is accepted as a hypothetical event location, provided that its fitness exceeds a minimum threshold, and all arrivals within a longer time window that are consistent with that event are associated with it. During the association step, a variety of different phases are considered. Once associated with an event, an arrival is removed from further consideration. While unassociated arrivals remain, the search for other events is repeated until none are identified. Results are presented in comparison with analyst-reviewed bulletins for three datasets: a two-week ground-truth period, the Tohoku aftershock sequence, and the entire year of 2010. The probabilistic event detection, association, and location algorithm missed fewer events and generated fewer false events on all datasets compared to the associator used at the International Data Center (51% fewer missed and 52% fewer false events on the ground-truth dataset when using the same predictions).
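The greedy search described above can be sketched in a simplified form. This toy version assumes a single phase, straight-line travel times at a constant velocity, a Gaussian residual weight as the station-specific fitness, and candidate origin times derived from each remaining arrival. None of these choices, nor the names `min_fitness`, `tol`, and `sigma`, come from the abstract.

```python
import math

def travel_time(node, station, velocity=8.0):
    """Straight-line travel time in seconds for positions in km (toy assumption)."""
    return math.dist(node, station) / velocity

def fitness(node, origin, arrivals, stations, sigma=1.0):
    """Sum over stations of the best Gaussian-weighted residual at that station."""
    total = 0.0
    for name, pos in stations.items():
        predicted = origin + travel_time(node, pos)
        residuals = [t - predicted for s, t in arrivals if s == name]
        if residuals:
            total += max(math.exp(-0.5 * (r / sigma) ** 2) for r in residuals)
    return total

def associate(arrivals, stations, grid, min_fitness=2.5, tol=2.0):
    """Repeatedly pick the best-fitting grid node, associate its arrivals, and remove them."""
    remaining = list(arrivals)
    events = []
    while remaining:
        best = None
        for node in grid:
            # Each remaining arrival proposes one candidate origin time for this node.
            for s, t in remaining:
                origin = t - travel_time(node, stations[s])
                f = fitness(node, origin, remaining, stations)
                if best is None or f > best[0]:
                    best = (f, node, origin)
        if best[0] < min_fitness:
            break  # no remaining hypothesis is strong enough
        _, node, origin = best
        picked = [(s, t) for s, t in remaining
                  if abs(t - origin - travel_time(node, stations[s])) <= tol]
        events.append({"node": node, "origin": origin, "arrivals": picked})
        # Associated arrivals are removed from further consideration.
        remaining = [a for a in remaining if a not in picked]
    return events
```

The loop terminates because every accepted event consumes at least the arrival that proposed its origin time. The method in the abstract also considers multiple phases and a longer association window, which this sketch omits.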