Development of an Environmental-Focused DIA Pipeline for HTP Proteomics
EMSL Project ID
60905
Abstract
Currently, metaproteomics workflows are hindered by sample complexity, which necessitates fractionation, so a single sample can require 12 to 24 MS analyses. Data-dependent acquisition (DDA) workflows focus on the MS/MS fragmentation of one peptide at a time, resulting in high levels of missing data and a lack of reproducibility. Data-independent acquisition (DIA) methods overcome these limitations by simultaneously fragmenting all the peptides eluting from the LC, thus collecting significantly more data in the same amount of analysis time. However, DIA proteomics approaches rely on a well-annotated genome, which allows the peptide fragmentation patterns used for peptide identification to be predicted a priori. These predictions are hindered by the presence of isoforms in the genome (as in polyploid plant species) or by the redundancy of protein and peptide sequences across species (as in metagenomes). Additionally, the shallow sequencing coverage of many metagenomes creates ambiguity in the peptide sequence predictions.
We will develop a DIA proteomics pipeline for environmental proteomics research, focusing specifically on workflows that enable whole-proteome coverage for plants and microbial communities. We will test the current methods, which focus on human and bacterial species, to determine the extent of the hindrance caused by genome ambiguity, and we will research workflows that overcome these limitations. Several DIA benchmark studies have reported significant variation in the peptides and protein groups identified from the same DIA data by different DIA analysis workflows and software, with a significant impact on the overlap of protein groups identified by all the software packages used in each study.
This observation means that a prospective DIA user must examine every component of the DIA pipeline in order to develop a workflow that delivers proteomics data at the highest confidence level. The objective of the proposed benchmark study is to directly compare DDA proteomics with the major schemes available for DIA analysis, using a sample set of varying biological complexity, and to quantify the gains that DIA can deliver for environmental and botanical proteomics research.
Project Details
Start Date
2023-10-01
End Date
N/A
Status
Active
Released Data Link
Team
Principal Investigator
Co-Investigator(s)
Team Members