Fragmentation and Characteristic Study of Peptides Missed by SEQUEST
EMSL Project ID
15899
Abstract
Peptide sequencing/protein identification has become an important application of mass spectrometry to biological research [1]. Proteins are separated by 1D or 2D gel electrophoresis and digested in-gel with enzymes, producing a mixture of peptides in each gel band. Liquid chromatography coupled with tandem mass spectrometry then produces thousands of fragmentation spectra of peptide ions from a complex mixture of peptides [2]. Peptides, and thus proteins, are then identified using one of the available algorithms, such as SEQUEST, MASCOT and X! Tandem. Although quite a few peptides can be identified correctly from their tandem mass spectra, many are not identified with enough confidence, and in SEQUEST a protein will be included in the search results even when only one identified peptide passes the search criteria [2]. Even when a particular protein is identified, many peptides within that protein are still not identified by the software. It is interesting to probe the reason behind this: is it because these peptides are unidentifiable by the algorithm applied, or simply because their spectra do not exist in the set of spectra collected? If this question can be answered, the ability and accuracy of search algorithms in peptide/protein identification could possibly be improved.
To investigate this question, the first step is to obtain tandem spectra for those peptides in the identified proteins that the software either does not identify or identifies with insufficient confidence. To do that, the digested peptide mixtures of a previously studied bacterium (e.g., Shewanella oneidensis) will be split into two parts: one goes to HPLC and a QIT (quadrupole linear trap) mass spectrometer for acquiring MS/MS spectra in data-dependent mode as peptides elute from the HPLC column; the other goes to HPLC and a Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometer for obtaining the accurate masses of the peptides fragmented in the QIT. SEQUEST, a commonly used search engine, will then be used for peptide sequencing based on the tandem spectra of the peptides. For each peptide identified by SEQUEST, if the retention times from the two HPLCs are the same and the accurate mass from the FT-ICR MS matches that of the identified sequence, the peptide is assumed to be the true one. Proteins in the sample are then identified by SEQUEST. Theoretical digestions with the same enzyme used in the previous experiments, typically trypsin, will be performed on the identified proteins to find the peptide sequences that were not identified in the previous search. With the list of expected but missing peptides, and using the retention time predictor, the QIT mass spectrometer will be set to intentionally fragment those peptides previously missed by SEQUEST by selecting the appropriate theoretical m/z values. After the data are obtained, the spectra will be studied to determine whether their fragmentation differs from that of a large set of previously characterized spectra.
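The theoretical digestion and m/z selection steps described above can be illustrated with a minimal Python sketch. It is not the project's actual software: the function names (tryptic_digest, peptide_mz, missing_peptides) are hypothetical, the cleavage rule is the commonly used simplification for trypsin (cleave after K or R, except before P), and missed cleavages and residue modifications are ignored. The residue masses are standard monoisotopic values.

```python
import re

# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # mass of H2O added to the residue sum
PROTON = 1.00728   # mass of one proton (charge carrier)


def tryptic_digest(protein, min_len=1):
    """Split a protein sequence after K or R unless the next residue is P.

    Simplified rule: no missed cleavages are generated.
    """
    pieces = re.split(r"(?<=[KR])(?!P)", protein)
    # Drop the empty string produced when the sequence ends in K or R.
    return [p for p in pieces if p and len(p) >= min_len]


def peptide_mz(peptide, charge=1):
    """Theoretical m/z of an unmodified peptide at the given charge state."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def missing_peptides(protein, identified):
    """Expected tryptic peptides that are absent from the identified set."""
    return [p for p in tryptic_digest(protein) if p not in identified]


if __name__ == "__main__":
    # Toy sequence for illustration only, not a real protein.
    protein = "AKRPGKLR"
    for pep in missing_peptides(protein, identified={"AK"}):
        print(pep, round(peptide_mz(pep, charge=2), 4))
```

In this sketch the list returned by missing_peptides plays the role of the "expected but missing" peptides, and peptide_mz gives the theoretical m/z values that could be targeted for fragmentation. Retention time prediction is left out because the source does not describe the predictor.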
Project Details
Project type
Exploratory Research
Start Date
2005-07-28
End Date
2007-06-28
Status
Closed
Released Data Link
Team
Principal Investigator