Environmental Transformations and Interactions
Machine Learning Analyses Enhance the Prediction of Soil Respiration by Microbes
Advanced analysis of a large and structured soil organic matter dataset collected through the Molecular Observation Network (MONet) enhances continental-scale understanding of soil microbial respiration.
Machine learning extracts key molecular classes (small circles in the left figure) from large, structured, and molecule-rich high-resolution soil organic matter data to explain potential soil respiration better than traditional biogeochemical parameters. (Image courtesy of Nathan Johnson | Pacific Northwest National Laboratory)
The science
Soil organic matter (SOM) contains thousands of individual organic molecules that, in aggregate, record the metabolic history of the soil microbiome while also providing the fuel that sustains future soil health. Although scientists have suggested that studying the entire suite of organic molecules in soil could improve the ability to predict microbial respiration, initial efforts have produced mixed results because the approach requires not only advanced data analysis methods but also large quantities of standardized, structured data. In this study, a multi-institutional team of researchers led by scientists at the Environmental Molecular Sciences Laboratory (EMSL), a Department of Energy Office of Science user facility located at Pacific Northwest National Laboratory, used machine learning to analyze detailed SOM data from across the United States. Drawing on structured data from the Molecular Observation Network (MONet), an open science network developed by EMSL, the team found that applying machine learning to SOM composition data could improve predictions of soil respiration.
The impact
This study shows how feature-rich data from region-wide soil sampling can be analyzed with AI to reveal soil microbial signals that previously could not be detected. These signals can be used in multiple ways, including enhancing the extraction of critical minerals from soils or improving bioenergy crop yields. Additionally, these signals can be used in land use models to improve the reliability and security of energy infrastructure.
Summary
Knowing how microbes break down SOM is important for understanding not only nutrient cycling in soils, but also soil health generally. Most current models that predict soil respiration rely on non-biological soil and weather data, but their estimates carry a high degree of uncertainty because of the complexity of soil microbial metabolism. The research team hypothesized that considering the vast array of molecules in the soil as an ensemble might help improve these models. As part of this multi-institutional study, data from MONet were used to analyze the molecular composition of SOM from 66 soil samples from across the United States. The significant advancement in this research was the use of a machine learning model to classify the full breadth of molecules into subsets, greatly simplifying the analysis of the complex SOM found in each soil sample. The analysis clearly shows that understanding the molecular composition of SOM is important for predicting soil respiration. Incorporating this approach into models of nutrient cycling has the potential to enhance the management of regional resources.
Contact
Emily Graham
Environmental Molecular Sciences Laboratory | Pacific Northwest National Laboratory
Funding
Soil data were provided by the Molecular Observation Network at the Environmental Molecular Sciences Laboratory, a Department of Energy Office of Science user facility sponsored by the Biological and Environmental Research program. Work was also conducted with capabilities provided by the Joint Genome Institute, another Department of Energy Office of Science user facility located at Lawrence Berkeley National Laboratory. Soil samples collected for the project were obtained through the National Ecological Observatory Network, a program sponsored by the National Science Foundation and operated under a cooperative agreement by Battelle.
Publication
S. Cheng et al., “Scaling High-Resolution Soil Organic Matter Composition to Improve Predictions of Potential Soil Respiration Across the Continental United States,” Geophysical Research Letters 52, e2024GL113091 (2025). [DOI: 10.1029/2024GL113091]