Protein structure prediction
From Bioinformatics.Org Wiki
Protein structure prediction methods attempt to determine the native, in vivo structure of a given amino acid sequence. To do so, knowledge of the determinants of protein structure is critical: the hydrophobicity and hydrophilicity of residues, electrostatic interactions, hydrogen and covalent bonds, van der Waals interactions, bond angle stresses, and enthalpy and entropy.
There are two important facts about these determinants. First, information about them, and thus about a protein's structure, is contained entirely within the sequence (together with knowledge of the solvent). Second, they are all measurements of physical properties (energies, in fact). Assuming a protein can adopt its native conformation in solvent without the aid of chaperone proteins, we have, in principle, enough information to predict protein structures ab initio (from first principles). In practice, however, many of the determinants are not known precisely enough, and the calculations may be too compute-intensive (computationally intractable).
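To illustrate what "determinants as energies" can mean in practice, here is a minimal Python sketch that sums two of the terms named above, van der Waals (as a Lennard-Jones potential) and electrostatic (as a Coulomb term), over all atom pairs. The function names, the atom representation, and the parameter values (`epsilon`, `sigma`, `dielectric`) are illustrative assumptions, not values taken from any particular force field.

```python
import math
from itertools import combinations

# Conversion constant so that charges in units of e and distances in
# Angstroms give energies in kcal/mol.
COULOMB_K = 332.06


def lennard_jones(r, epsilon=0.1, sigma=3.5):
    """Van der Waals energy of one atom pair at distance r (Angstroms).

    epsilon (well depth) and sigma (zero-energy distance) are
    illustrative defaults, not fitted parameters.
    """
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, q1, q2, dielectric=80.0):
    """Electrostatic energy of two point charges; dielectric=80 roughly
    stands in for water as the solvent (an assumption)."""
    return COULOMB_K * q1 * q2 / (dielectric * r)


def total_energy(atoms):
    """Sum both terms over every atom pair.

    atoms is a list of (x, y, z, charge) tuples. The loop visits
    n*(n-1)/2 pairs, which is one reason such calculations get costly.
    """
    energy = 0.0
    for (x1, y1, z1, q1), (x2, y2, z2, q2) in combinations(atoms, 2):
        r = math.dist((x1, y1, z1), (x2, y2, z2))
        energy += lennard_jones(r) + coulomb(r, q1, q2)
    return energy
```

A real ab initio method would evaluate a far richer function (bonded terms, solvation, entropy) for an enormous number of candidate conformations, which is where the computational cost becomes prohibitive.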
In the absence of feasible ab initio methods, protein structure prediction has turned to knowledge-based methods, of which homology modeling and protein fold recognition are the two major, complementary approaches.
Low accuracy (usually in the range of 50-70%) is a disadvantage of both methods. Another is that both require known structures to be available, so they fail when predicting the structure of a truly novel protein. They also tend to fail on structures that are particularly sensitive to sequence differences, such as random coils.
In homology modeling, the amino acid sequence of a protein with unknown structure is aligned against the sequences of proteins with known structures. A high degree of homology (very similar sequences along the length of the proteins) can be used to determine the global structure of the protein with unknown structure and to place it into a fold category. Lower degrees of homology may still be used to determine local structures, an example being the Chou-Fasman method for predicting secondary structure. An advantage of homology modeling methods is that they do not depend on knowledge of the physical determinants.
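The alignment step can be sketched as follows. This is a minimal Python example, assuming a plain Needleman-Wunsch global alignment with a flat match/mismatch/gap scheme. The scoring values, the function names, and the template names in the usage lines are hypothetical; real homology modeling tools use substitution matrices and more refined gap penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align sequences a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal path.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aln_a, aln_b):
    """Identical, non-gap columns as a percentage of alignment length."""
    same = sum(1 for x, y in zip(aln_a, aln_b) if x == y and x != "-")
    return 100.0 * same / len(aln_a)


def best_template(query, templates):
    """Return (name, identity) of the template most similar to query.

    templates maps a template name to the sequence of a protein whose
    structure is known.
    """
    scored = ((name, percent_identity(*needleman_wunsch(query, seq)))
              for name, seq in templates.items())
    return max(scored, key=lambda item: item[1])


# Hypothetical usage: the template names and sequences are made up.
templates = {"template_A": "ACDEFGHIK", "template_B": "MMMMWWWW"}
name, identity = best_template("ACDEFGHLK", templates)
```

The template with the highest identity would then supply the structural model; at low identity, only local conclusions would be justified.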
Fold recognition methods take a complementary approach. With these, the alignment is made against structures rather than sequences. In the method called "threading," the sequence of a protein with unknown structure is forced to adopt the conformation of the backbone (the protein without its side chains) of a protein with known structure. The more favorable the physical determinants are for each attempt, the better the score for the alignment. These methods tend to be more compute-intensive than homology modeling methods, but they give more confidence in the physical viability of the results.
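A heavily simplified sketch of the threading idea, in Python: the template backbone is reduced to a buried/exposed flag per position, and a placement of the query sequence is rewarded when hydrophobic residues land in buried positions and hydrophilic residues in exposed ones. The hydropathy values are the Kyte-Doolittle scale; the burial pattern, the function names, and the ungapped sliding placement are assumptions made for illustration. Actual threading programs allow gaps and use richer, often pairwise, potentials.

```python
# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def thread_score(seq, burial):
    """Score seq mounted residue-by-residue on template positions.

    burial[i] is True if template position i is buried. Hydrophobic
    residues score positively when buried and negatively when exposed;
    hydrophilic residues do the opposite.
    """
    return sum(KD[aa] if buried else -KD[aa]
               for aa, buried in zip(seq, burial))


def best_threading(seq, burial):
    """Try every ungapped offset of seq along the template.

    Returns (offset, score) for the highest-scoring placement.
    """
    if len(seq) > len(burial):
        raise ValueError("sequence is longer than the template")
    best = None
    for offset in range(len(burial) - len(seq) + 1):
        s = thread_score(seq, burial[offset:offset + len(seq)])
        if best is None or s > best[1]:
            best = (offset, s)
    return best


# Hypothetical template: alternating exposed/buried positions.
template_burial = [False, True, False, True, False]
offset, score = best_threading("LKL", template_burial)
```

Repeating this over a library of template structures and ranking the best scores gives a crude fold recognition procedure; the need to evaluate many placements per template is part of why threading costs more than sequence alignment alone.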
- LIBELLULA - a neural network based web server to evaluate fold recognition results