Publication

Imputing missing soil properties using machine learning

Lee, Zachary J.
Date Issued
2023
Embargo Expires
2025-06-24
Abstract
Imputing missing feature values in datasets with a high proportion of missing data is a challenging and important problem, both for understanding the data and for reducing bias in models that use it. In this work, we propose three novel imputation methods designed to work well with highly missing data and to address the shortcomings of more general state-of-the-art imputation methods. Our models draw on diffusion processes, adversarial methods, and adaptive training. As an application, we focus on imputing missing soil property values (pH, organic carbon, nitrogen, etc.) in a specified area from the observed properties. We evaluate baseline imputation methods on example datasets and on our application dataset to illustrate their shortcomings, and we compare them to our novel methods. We find that our methods match or outperform the baselines on various metrics, and we discuss the results and the limitations of our methods.
Rights
Copyright of the original work is retained by the author.