Omics Data Integration via Interpretable Machine Learning Models
EMSL Project ID
60116
Abstract
The ultimate motivation for generating omics data in an experiment is often a better understanding of the biological system, and multiple omics datasets are generated for the same study in the hope of a more holistic understanding. However, biochemical relationships between variables are often complex and are further obscured by confounding factors such as censored or missing data, so traditional statistical metrics (e.g., correlation) are insufficient for discovering complicated biological relationships in large datasets. This project will develop novel models and metrics of association (relationship) between biomolecules observed in multi-omics datasets by leveraging random forest statistical learning models and previously untapped information in the model structure.
Project Details
Start Date
2021-10-01
End Date
2023-10-01
Status
Closed
Released Data Link
Team
Principal Investigator
Team Members