
High performance sequence alignment: OMB software effectiveness metric studies for FY06


EMSL Project ID
18896

Abstract

ScalaBLAST is a new software extension to the NCBI BLAST distribution that enables very high throughput for biological sequence alignments. OMB has selected ScalaBLAST as a test code expected to demonstrate at least a 50% improvement over the state-of-the-art sequence alignment technology available at the beginning of FY06: MPI BLAST. This study will focus on two specific test problems, each requiring significant time (~12 hours) on at least half of MPP2:
1) Exhaustive sequence alignment of the environmental sequence database, and
2) Identification of the highest-scoring parent proteins from the nonredundant protein database for the entire mass tag database (AMT) for Salmonella (for peptides of 10 residues or more).

These two problems are representative grand-challenge sequence alignment problems. They enable downstream analysis critical for understanding the molecular ancestry of organisms (problem 1) and protein-protein interactions in host-pathogen systems (problem 2).

Expected usage is 48 hours on 1000 processors (two baseline 12-hour runs and two FY06 Q4 12-hour runs, each on 1000 processors), plus time for setup and small benchmarks.
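
The usage estimate above can be checked with a short calculation. This is only an illustrative sketch: the function name `requested_allocation` and its parameter names are hypothetical, and setup and small benchmark time are left out because the abstract does not quantify them.

```python
def requested_allocation(baseline_runs=2, q4_runs=2, hours_per_run=12, processors=1000):
    """Return (wall-clock hours, processor-hours) for the planned runs.

    Defaults mirror the figures stated in the abstract; setup and
    benchmark time are not included (assumption: they are unquantified).
    """
    total_runs = baseline_runs + q4_runs          # 2 baseline + 2 FY06 Q4 runs
    wall_hours = total_runs * hours_per_run       # hours spent on the machine
    processor_hours = wall_hours * processors     # aggregate processor-hours
    return wall_hours, processor_hours


wall_hours, processor_hours = requested_allocation()
print(f"{wall_hours} hours on 1000 processors = {processor_hours} processor-hours")
```

With the stated figures, the four 12-hour runs account for the full 48 hours, which corresponds to 48,000 processor-hours before setup and benchmarking.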

Project Details

Project type
Exploratory Research
Start Date
2006-04-06
End Date
2007-01-11
Status
Closed

Team

Principal Investigator

Kenneth Roche
Institution
Oak Ridge National Laboratory

Team Members

Christopher Oehmen
Institution
Pacific Northwest National Laboratory