Computing, Analytics, and Modeling

Systems Modeling

The Environmental Molecular Sciences Laboratory (EMSL) has led scientific computing, software creation, and model development for more than 25 years, solidifying our leadership in computational science. One of our primary science missions is to advance the prediction and control of biological and environmental systems, using first-principles models based on laws of physics as well as the latest advances in artificial intelligence/machine learning (AI/ML). Our computational scientists develop approaches for modeling and simulating a wide range of biological and environmental processes, ranging from the molecular scale to watersheds. We build approaches to connect models to EMSL’s experiments through the model–experiment (ModEx) paradigm. 

The science 

EMSL’s Systems Modeling Integrated Research Platform uses computational models of protein structure and function, metabolic modeling, and ML approaches to associate genotype with phenotype, understand the biological processes that control nutrient flux, and enable predictive approaches to biodesign and biofuel/bioproduct production. We also use modeling and simulation to trace the flow of materials, like carbon, nutrients, and contaminants, in the environment to see how biological and hydrobiogeological processes change ecosystem function. 

How we do the science 

EMSL’s Systems Modeling expertise combines experimental data with multiscale models to integrate these processes across temporal and spatial scales. This integrated research platform 

  • Delivers computational expertise for developing a predictive understanding of biological and environmental systems. 
  • Integrates experimental and modeling approaches and develops molecular- to system-scale models of biological and environmental processes. 
  • Leverages AI models to enhance spatial and temporal pattern recognition that contributes to scale-aware models of complex systems. 
  • Develops and deploys software to analyze and visualize the extensive, complex data generated by EMSL’s analytical instruments and to model the systems being studied, enabling comparison of theory and experiment. Current software expertise includes, but is not limited to, bioinformatics (BLAST, DIAMOND, HMMER/MaxRebo), computational chemistry (NWChem, LAMMPS, NAMD, GROMACS, ADF, Molpro, VASP, and WRF-Chem), and the environmental/hydrological code PFLOTRAN. 
  • Supports the development of sophisticated models of biological and environmental processes by leveraging EMSL’s data repository. These models include but are not limited to electronic structure methods (quantum chemistry), classical molecular dynamics, continuum models, systems biology and metabolic models, and bioinformatics. 
  • Includes our 0.93-petaflop scientific computing system (Tahoma) and EMSL’s data repository, which provides large-scale, long-term storage in an 80+ PB archive (Aurora) and can create DOIs for datasets when they are made public. 

Research in action

Molecular simulations 

An essential plant stress response mechanism called cellular metal chelation allows plants to thrive in metal-polluted soils. A multi-institutional research team led an EMSL user project that investigated the copper ion binding mechanism in the metal-chelating BURP domain protein AhyBURP using advanced computational and experimental techniques. Computational tools such as AlphaFold and molecular dynamics simulations revealed the structure and dynamic behavior of AhyBURP’s Cu-binding sites. The findings could contribute to understanding metal stress in crops and to applications in crop engineering and phytoremediation. 

Molecular modeling 

A researcher from the University of Kansas Medical Center worked with EMSL to develop a computational prototype that synthesizes and visualizes structural models of higher-order protein complexes. The models are derived from existing empirical data and from the relative protein abundance of these assemblies across many organisms. The prototype visualizes the spatial arrangement and topological hierarchy of the protein assemblies, which are essential for a comprehensive understanding of cell biology, and it will advance the technical knowledge required to interpret the fundamentals of protein complexes from targeted omics data.