
Structural Proteomics: annotating the genome using 3D structure


EMSL Project ID
2427

Abstract

Most genomes are annotated based on sequence homology, but many genome-encoded proteins have sequences with no homology to proteins of known function; these are presently annotated as 'hypothetical proteins'. A protein's biochemical function is often dictated by its three-dimensional shape. A major goal of structural proteomics is to determine the structures of all the proteins in a genome. We hope that the structures of these hypothetical proteins will yield valuable clues to their functions. The wealth of structural information produced by the structural proteomics project could also enrich our understanding of protein folding and of the evolutionary relationships between genomes.

The current experimental methods for determining structure are X-ray crystallography and NMR spectroscopy. In spite of the protein size limitation of NMR spectroscopy, our initial study showed that it can play a significant role in structural proteomics (Christendat et al., 2000). Over the past two years, we have used NMR spectroscopy as part of our structural proteomics efforts. We aim to determine the structures of hypothetical proteins and of proteins with known functions but no sequence homology to known structures.

Project Update: Our collaboration with Dr. Kennedy has resulted in the determination of 7 structures so far, and 3 more structures are currently in progress. Of the structures solved as part of this collaboration, we were able to deduce the functions of 3 hypothetical proteins based on structural homology; the functions inferred from these structures still need to be confirmed by biochemical means. Another 3 hypothetical protein structures failed to yield functional clues because they have structural homology to a very common fold and we were not able to find a known active site.
Yet another protein studied has a known function but no sequence homology to known structures; the structure determined suggested a possible mechanism for how this protein functions.

In continuation of our structural proteomics project, we have screened proteins from different organisms for those that appear to be well folded and amenable to structure determination by NMR spectroscopy. Attached are the 15N-HSQC spectra of these proteins. TM1809 is a hypothetical protein from Thermotoga maritima. It is conserved among lower microorganisms only, including Mycobacterium tuberculosis. This protein may be a drug target against M. tuberculosis, and its 3D structure would facilitate drug design. TM1489 is the 50S ribosomal protein L24 from Thermotoga maritima, and it has sequence homologues in higher organisms such as mouse and A. thaliana. The structure of this T. maritima protein could be used to model the eukaryotic homologues, whose structures are difficult to determine.

This research is part of the Northeast Structural Genomics Consortium (NESG), an NIH-funded center for structural genomics, and the Ontario program in Structural Proteomics.

Publications which made use of EMSL spectrometers:
Yee et al., NMR approach to structural proteomics. PNAS (2002), in press.
Pineda-Lucena et al., Solution structure of the hypothetical protein Mth0637 from Methanobacterium thermoautotrophicum. J. Biomol. NMR (2002), submitted.
Pineda-Lucena et al., NMR structure of the hypothetical protein encoded by the YjbJ gene from Escherichia coli. Proteins: Structure, Function and Genetics (2002), accepted.
Cort et al., J. Mol. Biol. 302, 189-203 (2000).
Christendat et al., Nature Struct. Biol. 7, 903-909 (2000).
Cort et al., Journal of structural genomics 1, 15-25 (2000).

Project Details

Project type
Capability Research
Start Date
2002-04-01
End Date
2002-11-18
Status
Closed

Team

Principal Investigator

Cheryl Arrowsmith
Institution
University of Toronto