    Machine learning for the automatic detection of anomalous events

    Name:
    Fisher_mines_0052E_11222.pdf
    Size:
    4.775Mb
    Format:
    PDF
    Author
    Fisher, Wendy D.
    Advisor
    Camp, Tracy
    Date issued
    2017
    Keywords
    BSR detection
    machine learning
    earth dam and levee
    anomaly detection
    
    URI
    https://hdl.handle.net/11124/170968
    Abstract
    In this dissertation, we describe our research contributions for a novel approach to the application of machine learning for the automatic detection of anomalous events. We work in two different domains to ensure a robust data-driven workflow that could be generalized for monitoring other systems. Specifically, in our first domain, we begin with the identification of internal erosion events in earth dams and levees (EDLs) using geophysical data collected from sensors located on the surface of the levee. As EDLs across the globe reach the end of their design lives, effectively monitoring their structural integrity is of critical importance. The second domain of interest is related to mobile telecommunications, where we investigate a system for automatically detecting non-commercial base station routers (BSRs) operating in protected frequency space. The presence of non-commercial BSRs can disrupt the connectivity of end users, cause service issues for the commercial providers, and introduce significant security concerns. We provide our motivation, experimentation, and results from investigating a generalized novel data-driven workflow using several machine learning techniques. In Chapter 2, we present results from our performance study, which uses popular unsupervised clustering algorithms to gain insight into our real-world problems, and we evaluate our results using internal and external validation techniques. Using EDL passive seismic data from an experimental laboratory earth embankment, results consistently show a clear separation of events from non-events in four of the five clustering algorithms applied. 
The results from experimenting with our BSR data, using various system information (SI) and system information blocks (SIBs), show we can make a clear distinction between commercial and non-commercial scans in both Universal Mobile Telecommunications System (UMTS) and Long Term Evolution (LTE); more work is needed to understand whether non-commercial BSRs can be discovered in the Global System for Mobile Communications (GSM) analysis. We also investigate and provide results on using ASN.1 encoded LTE data as input to our machine learning algorithms; we use encoded data to eliminate the need for extensive feature selection and manual analysis that could potentially introduce bias. Chapter 3 uses a multivariate Gaussian machine learning model to identify anomalies in our experimental data sets. For the EDL work, we used experimental data from two different laboratory earth embankments. Additionally, we explore five wavelet transform methods for signal denoising. The best performance is achieved with the Haar wavelets. We achieve up to 97.3% overall accuracy and less than 1.4% false negatives in anomaly detection. Using the BSR scans, we continue to see that the GSM broadcast messages are not suitable for our anomaly detection system. However, the multivariate Gaussian approach with the UMTS, LTE, and ASN.1 encoded LTE scans was successful in separating commercial from non-commercial BSRs with 100% overall accuracy. In Chapter 4, we research using two-class and one-class support vector machines (SVMs) for an effective anomaly detection system. We again use the two different EDL data sets from experimental laboratory earth embankments (each having approximately 80% normal and 20% anomalies) to ensure our workflow is robust enough to work with multiple data sets and different types of anomalous events (e.g., cracks and piping). We apply Haar wavelet-denoising techniques and extract nine spectral features from decomposed segments of the time series data. 
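
As a rough illustration of the Haar wavelet denoising mentioned above, the sketch below applies one level of the Haar transform, soft-thresholds the detail coefficients, and reconstructs the signal. It is not the dissertation's implementation: the single decomposition level, the soft-threshold rule, the threshold value, and all function names are assumptions chosen for illustration.

```python
import math


def haar_step(signal):
    """One Haar level: scaled pairwise sums (approx) and differences (detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold` (assumed rule)."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def inverse_haar_step(approx, detail):
    """Invert haar_step, interleaving the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def haar_denoise(signal, threshold):
    """Single-level Haar denoising of an even-length sequence of floats."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    approx, detail = haar_step(signal)
    return inverse_haar_step(approx, soft_threshold(detail, threshold))


if __name__ == "__main__":
    # Hypothetical samples, not data from the dissertation.
    noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.8, 5.1]
    print(haar_denoise(noisy, threshold=0.3))
```

With a threshold of zero the function returns the input unchanged (up to floating-point rounding), which is a quick check that the forward and inverse steps match. A practical pipeline would typically use several decomposition levels and a data-driven threshold.
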
The two-class SVM with 10-fold cross-validation achieved over 94% overall accuracy and 96% F1-score. The F1-score is a measure of the algorithm's predictive performance and is the harmonic mean of precision and recall. Experiments with the one-class SVM (no labeled data for anomalies) using the top features selected by our automatic feature selection algorithm increase our overall results from 83% accuracy and 89% F1-score to over 91% accuracy and 95% F1-score. The two-class SVM experiments with our BSR detection workflow, using the top two features for each data set, highlight the ability to make a distinction between commercial and non-commercial BSRs with 83.3% overall accuracy and 89.8% F1-score for GSM and an impressive 100% overall accuracy for UMTS, LTE, and the ASN.1 encoded LTE data. As expected, using labels for only normal data, the one-class SVM resulted in a lower overall performance. The overall accuracy for GSM, UMTS, and LTE dropped to 73.3%, 74.5%, and 91.1%, with F1-scores of 81.0%, 82.2%, and 95.0%, respectively. Our approach provides a means for automatically identifying anomalous events using various machine learning techniques. Detecting internal erosion events in aging EDLs, earlier than is currently possible, can allow more time to prevent or mitigate catastrophic failures. Results show that we can successfully separate normal from anomalous data observations in passive seismic data, and provide a step towards techniques for continuous real-time monitoring of EDL health. Our lightweight non-commercial BSR detection system also has promise in separating commercial from non-commercial BSR scans without the need for prior geographic location information, extensive time-lapse surveys, or a database of known commercial carriers.
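
For readers unfamiliar with the F1-score reported above, the following minimal sketch computes precision, recall, and their harmonic mean from confusion-matrix counts. The function name and the example counts are hypothetical and are not taken from the dissertation's experiments.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive,
    and false-negative counts. Undefined ratios are reported as 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a classifier cannot reach a high F1-score by maximizing only precision or only recall. On imbalanced data the F1-score can also differ noticeably from overall accuracy, which is one reason to report both.
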
    Rights
    Copyright of the original work is retained by the author.
    Collections
    2017 - Mines Theses & Dissertations
