University of Tasmania

Machine learning for geological mapping : algorithms and applications

thesis
posted on 2023-06-14, 12:27 authored by Matthew Cracknell
Machine learning algorithms are designed to efficiently identify and accurately predict patterns within multivariate data. They provide analysts with computational tools to aid predictive modelling and the interpretation of interactions between data and the phenomena under investigation. The analysis of large volumes of disparate multivariate geospatial data using machine learning algorithms therefore offers great promise to industry and research in the geosciences. Geoscience data are frequently characterised by a restriction in the number and distribution of direct observations, irreducible noise in these data and a high degree of intraclass variability and interclass similarity. The choice of machine learning algorithm, or algorithms, and the details of how algorithms are applied must therefore be appropriate to the context of geoscience data. With this knowledge, I aim to employ machine learning as a means of understanding the spatial distribution of complex geological phenomena. I conduct a rigorous and comprehensive comparison of machine learning algorithms, representing the five general machine learning strategies, for supervised lithology classification applications. I also develop and test a novel method for obtaining robust estimates of the uncertainty associated with machine learning algorithm categorical predictions. The insights gained from these experiments lead to the further development and comparison of new methods for the incorporation of spatial-contextual information into machine learning supervised classifiers. In using machine learning algorithms for geoscience applications, I have developed best-practice methodologies that address the challenges facing geoscientists for geospatial supervised classification. Guidelines are established that detail the preparation and integration of disparate spatial data, the optimisation of trained classifiers for a given application and the robust statistical and spatial evaluation of outputs. 
I demonstrate, through a case study in a region that is prospective for economic mineralisation, the combination of supervised and unsupervised machine learning algorithms for the critical appraisal of pre-existing geological maps and formulation of meaningful interpretations of geological phenomena. The experiments conducted as part of my research confirm the efficacy of machine learning algorithms to generate accurate geological maps representing a variety of terranes. I identify and explore key aspects of the spatial and statistical distributions of geoscience data that affect machine learning algorithm performance. My research clearly identifies Random Forests™ as a good first-choice algorithm for the prediction of classes representing lithologies using commonly available multivariate geological and geophysical data. Furthermore, Random Forests prediction uncertainty is shown to be closely related to ambiguous and/or erroneous classifications and thus provides a practical means of indicating variable levels of confidence. Spatial-contextual information is best incorporated into machine learning supervised classifiers via the pre-processing of input variables and/or the post-regularisation of classifications. My findings indicate that a trade-off between optimal predictive models and interpretable explanatory models exists, whereby intuitively interpretable models are not necessarily the most accurate. The practical application of machine learning algorithms requires the implementation of three key stages: (1) data pre-processing; (2) algorithm training; and (3) prediction evaluation. This methodology provides the foundation for generating accurate and geologically meaningful predictions with minimal user intervention and assists in the formulation of robust interpretations of complex geological phenomena. For example, classifications obtained by Random Forests are useful for critically appraising interpreted geological maps. 
Clusters produced by Self-Organising Maps indicate the presence of discrete, spatially contiguous and geologically significant sub-classes within individual lithological units, which represent regions of contrasting primary composition and alteration styles. My results may be widely applied to a broad range of practical geoscience challenges such as ore deposit targeting, geo-hazard risk assessment, engineering and construction projects, hydrological and environmental modelling and ecological studies. The applications of machine learning algorithms detailed in this thesis align well with state-of-the-art Big Data online infrastructure and virtual laboratories currently emerging in Australia.
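
The abstract refers to two ideas that a short sketch may help make concrete: an uncertainty value attached to each categorical prediction, and the post-regularisation of a classified map. The Python below is an illustration only and is not taken from the thesis. It assumes that uncertainty is measured as the normalised Shannon entropy of the votes cast by an ensemble (such as the trees of a Random Forest), and that post-regularisation is a simple 3x3 majority filter over a grid of class labels. The function names `vote_entropy` and `majority_filter`, and both of these choices, are hypothetical stand-ins for whatever methods the thesis actually develops.

```python
import math
from collections import Counter


def vote_entropy(votes, n_classes):
    """Normalised Shannon entropy of ensemble votes for one sample.

    Returns 0.0 when every ensemble member agrees and 1.0 when the votes
    are spread evenly over all n_classes (assumed measure, n_classes >= 2).
    """
    total = len(votes)
    entropy = 0.0
    for count in Counter(votes).values():
        p = count / total
        entropy -= p * math.log(p)
    return entropy / math.log(n_classes)


def majority_filter(grid):
    """Relabel each cell with the most frequent class in its 3x3 window.

    A cell keeps its own label unless another class is strictly more
    frequent in the window; ties among other classes are broken by
    sorted label order. The input grid is not modified.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            # Collect labels in the window, clipped at the grid edges.
            window = [
                grid[i][j]
                for i in range(max(0, r - 1), min(rows, r + 2))
                for j in range(max(0, c - 1), min(cols, c + 2))
            ]
            counts = Counter(window)
            best = max(counts.values())
            if counts[grid[r][c]] < best:
                out[r][c] = min(lbl for lbl, n in counts.items() if n == best)
    return out
```

Under these assumptions, cells with high vote entropy would be candidates for the ambiguous or erroneous classifications mentioned in the abstract, and the majority filter removes isolated single-cell classes that disagree with their neighbours.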

History

Publication status

  • Unpublished

Rights statement

Copyright 2014 the author.

Chapter 4 appears to be the equivalent of a post-print version of an article published as: Cracknell, M. J., Reading, A. M., 2014. Geological mapping using remote sensing data: a comparison of five machine learning algorithms, their response to variations in the spatial distribution of training data and the use of explicit spatial information, Computers and geosciences, 63, 22-33. The article is published under a Creative Commons Attribution 3.0 Unported (CC BY 3.0) License https://creativecommons.org/licenses/by/3.0/

Chapter 5 appears to be the equivalent of a post-print version of an article published as: Cracknell, M. J., Reading, A. M., 2013. The upside of uncertainty: identification of lithology contact zones from airborne geophysics and satellite data using random forests and support vector machines, Geophysics, 78(3), WB113-WB126. Copyright 2013 Society of Exploration Geophysicists. Reuse is subject to SEG terms of use and conditions.

Chapter 6 appears to be the equivalent of an Accepted Manuscript of an article published by Taylor & Francis in Australian journal of Earth sciences on 26 November 2013, available online: http://www.tandfonline.com/10.1080/08120099.2014.858081

Repository Status

  • Open
