|
Virtual Seminar on Genomics and Bioinformatics
www.virtualgenomics.org
Thursday, March 04, 2004, Noon - 1PM US Eastern time
Yang Zhang and Jeffrey Skolnick
Center of Excellence in Bioinformatics, University at Buffalo
Protein Structure Prediction on a Genomic Scale
Despite considerable effort, predicting the native structure of a protein from its amino acid sequence remains an outstanding unsolved problem. In this postgenomic era, because protein structure can assist in functional annotation, the need for progress is even more pressing. In this talk, we present recent structure prediction results obtained with TASSER on a large-scale benchmark. TASSER is a new hierarchical approach to protein structure prediction: templates are first identified by threading, and tertiary structures are then assembled by rearranging continuous template fragments under the guidance of an optimized potential based on C-alpha atoms and side chains. The benchmark set comprises 1489 medium-size proteins that cover the whole Protein Data Bank (PDB) library at the level of 35% sequence identity. Starting from structure templates identified by our threading algorithm PROSPECTOR_3, with homologous proteins excluded, we can fold about two thirds of the proteins (990/1489), meaning that at least one of the top five models has a root-mean-square deviation (RMSD) from the native structure below 6.5 Angstroms. When using the best possible templates, identified by structurally aligning the native structure against the PDB, TASSER can fold all but 2 of the 1489 proteins with accuracy comparable to that of low-resolution experimental structures. With templates from either threading or structure alignment, the TASSER models improve on the initial templates. The difference in final model quality between these two simulations highlights the urgent need for progress in fold recognition and alignment algorithms, which would bring the protein structure prediction problem closer to an eventual solution.
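The success criterion quoted above (at least one of the top five models within 6.5 Angstroms RMSD of the native structure) can be illustrated with a minimal Python sketch. This is not TASSER code: the function names, the target names, and the RMSD values are hypothetical, and the sketch assumes that each model's RMSD to native has already been computed after optimal superposition and that models are listed in rank order.

```python
from typing import Dict, List

RMSD_CUTOFF = 6.5  # Angstroms; the threshold quoted in the abstract
TOP_N = 5          # only the five top-ranked models are considered


def is_foldable(model_rmsds: List[float],
                cutoff: float = RMSD_CUTOFF,
                top_n: int = TOP_N) -> bool:
    """Return True if any of the first `top_n` models is below `cutoff`.

    `model_rmsds` is assumed to be ordered by model rank (best-ranked first).
    """
    return any(rmsd < cutoff for rmsd in model_rmsds[:top_n])


def success_rate(targets: Dict[str, List[float]]) -> float:
    """Fraction of targets with at least one acceptable top-ranked model."""
    if not targets:
        return 0.0
    folded = sum(is_foldable(rmsds) for rmsds in targets.values())
    return folded / len(targets)


# Hypothetical RMSD values (Angstroms), for illustration only.
example = {
    "target_A": [7.2, 5.9, 8.1, 9.0, 6.6],        # second model passes
    "target_B": [10.4, 9.8, 11.2, 7.5, 8.3],      # no model passes
    "target_C": [3.1, 4.0, 2.8, 3.5, 5.2, 1.0],   # sixth model is ignored
}

if __name__ == "__main__":
    print(f"example success rate: {success_rate(example):.3f}")
    # The fraction reported in the abstract, 990 of 1489 targets:
    print(f"reported success rate: {990 / 1489:.3f}")
```

Counting a target as folded when any top-ranked model passes, rather than only the first, matches the "at least one model among the top five" wording of the abstract.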
|