Computing, Analytics, and Modeling

Systems Modeling

The foundation for advancing scientific discovery is data. Improved means of producing, storing, and analyzing data are the key to expanding our knowledge of the world. EMSL has led mid-range scientific computing and visualization, software creation, and modeling development for more than 25 years. One of our primary science missions is to advance the prediction and control of biological and environmental systems. Our computing experts develop approaches for advanced data analysis, data integration, multiscale modeling, and simulation of processes across scales.

The science

The Systems Modeling Integrated Research Platform at the Environmental Molecular Sciences Laboratory (EMSL) has two focuses. The first uses computational models of protein structure and function, metabolic modeling, and machine learning approaches to associate genotype with phenotype, understand the biological processes that control nutrient flux, and enable predictive approaches to biodesign and biofuel/bioproduct production. The second traces the flow of materials such as carbon, nutrients, and contaminants through the environment to see how biological and hydrobiogeological processes change ecosystem function.

How we do the science

EMSL’s Systems Modeling experts combine experimental data with multiscale models to connect biological and environmental processes across temporal and spatial scales. This integrated research platform:

  • Delivers computational and data analytics expertise for developing a predictive understanding of biological and environmental systems.
  • Integrates experimental and computational approaches and develops molecular- to system-scale models of biological and environmental processes.
  • Advances methods to analyze and visualize data in their spatial and temporal context, enhancing pattern recognition and statistical validation and contributing to scale-aware models of complex systems.
  • Deploys and develops software to analyze and visualize the extensive, complex data generated by EMSL’s analytical instruments and to model the systems being studied, enabling comparison of theory and experiment. Current software expertise includes but is not limited to bioinformatics (BLAST, Diamond, HMMer/MaxRebo), computational chemistry (NWChem, LAMMPS, NAMD, GROMACS, ADF, MolPro, VASP, and WRF-Chem), and the environmental/hydrological code PFLOTRAN.
  • Supports the development of sophisticated models of biological and environmental processes by leveraging EMSL’s data repository. These models include but are not limited to electronic structure methods (quantum chemistry), classical molecular dynamics, continuum models, systems biology and metabolic models, and bioinformatics.
  • Offers a suite of web services for visualization and analysis of ’omics and environmental data (FREDA and P-Mart), cloud orchestration capabilities, and our 0.93-petaflop high-performance parallel computing system (Tahoma). EMSL’s data repository offers large-scale, long-term storage (housed in Aurora, a 30+ PB storage archive) and can create DOIs for data sets when they are made public.

Research in action

Screening compound libraries for coronavirus therapeutics

When SARS-CoV-2 first emerged, there was an urgent need to speed up the development of new antiviral drugs to combat the virus. Researchers from the University of Washington School of Medicine, Pacific Northwest National Laboratory, and EMSL screened more than 13,000 compounds from existing drug libraries for the ability to inhibit a SARS-CoV-2 nonstructural protein called nsp15. This protein is commonly found in coronaviruses and has no counterpart in host cells, making it an interesting target for drug development. Through their screen, the scientists identified three hits against the nsp15 protein and used EMSL’s mass spectrometry capabilities to confirm that one of them bound to the protein. Though this hit, a compound called Exebryl-1, did not show sufficient antiviral activity in cell-based assays, the compound can be optimized using artificial intelligence and in silico molecular docking calculations.

Improving proteomics data analysis 

Histone proteins can be modified to enhance or suppress gene expression. A single histone can carry multiple chemical modifications at the same time, and these modifications can be recognized by other proteins in the cell. Interpreting this “histone code” can be a challenge. Researchers at EMSL developed two tools to help. IsoForma is a robust, automated software tool that helps analyze modified proteins such as histones. PSpecteR is a proteomics-focused visualization application that helps scientists understand protein fragmentation patterns.