[ssml] Howto generate position specific score matrix (PSSM) in the way of psi-blast ?

Dan Bolser dmb at mrc-dunn.cam.ac.uk
Sat Jan 8 11:50:54 EST 2005

On Sat, 8 Jan 2005, Jarod wrote:

>I am trying to write a short Perl program that generates a position-specific
>scoring matrix (PSSM) from a multiple sequence alignment. I know there are
>several methods for this, but I want to do it the way PSI-BLAST does.
>Unfortunately, the original PSI-BLAST article did not make the method clear
>to me, and the NCBI source code is too difficult to read.
>Can anyone explain the principle behind it and how PSI-BLAST works?

If you are finding the original papers confusing (I know I did), the best
next step is probably to read a textbook on sequence analysis. You can also
find plenty of online tutorials and descriptions of PSSMs. If you find any
good ones, please post them to the list :)

There is a short section specifically about PSSM in the book 'Biological
sequence analysis' by Durbin, Eddy, Krogh and Mitchison (section 5.1),
which gives a very clear description of a simple (PSSM) scoring system.

While the focus of that book is profile HMMs, it might help your
understanding to know that a PSSM is exactly like a profile HMM except
without insert or delete *states*. It is just a string of match states.
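
To make the "string of match states" picture concrete, here is a minimal
Python sketch of scoring a sequence against a PSSM with no gaps allowed. The
function names, the toy four-letter alphabet and the score values are all made
up for illustration; a real PSSM would have one column of 20 amino acid scores
per alignment position.

```python
def score_window(pssm, window):
    """Sum the per-position scores of one ungapped window.

    pssm is a list of columns; each column maps a residue to its score.
    """
    return sum(column[residue] for column, residue in zip(pssm, window))


def best_hit(pssm, sequence):
    """Slide the PSSM along the sequence and return (best_score, start)."""
    width = len(pssm)
    if len(sequence) < width:
        raise ValueError("sequence is shorter than the PSSM")
    hits = [
        (score_window(pssm, sequence[start:start + width]), start)
        for start in range(len(sequence) - width + 1)
    ]
    # max() keeps the first window when several share the top score.
    return max(hits, key=lambda hit: hit[0])


# Hypothetical three-column PSSM over a toy alphabet (A, C, D, E).
toy_pssm = [
    {"A": 2, "C": -1, "D": -1, "E": -1},
    {"A": -1, "C": 3, "D": -1, "E": -1},
    {"A": -1, "C": -1, "D": 1, "E": 1},
]

print(best_hit(toy_pssm, "EEACDAA"))
```

Because there are no insert or delete states, every window has exactly the
width of the PSSM and the score is just a sum over columns.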

The two main issues (as I understand them) are:

1) calculating the probabilities of finding a particular amino acid at a
particular position, and

2) translating the score a particular PSSM gives to a particular sequence
into a measure of significance.
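
For issue 1, a simple scheme of the kind the textbook describes can be
sketched as follows. Note that this is not what PSI-BLAST itself does:
PSI-BLAST weights the aligned sequences and derives its pseudocounts from a
substitution matrix, whereas this sketch assumes unweighted counts, a flat
add-one pseudocount and a uniform background distribution. The name
build_pssm and its parameters are my own.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, background=None, pseudocount=1.0):
    """Return a list of columns, each mapping residue -> log2-odds score.

    alignment is a list of equal-length aligned sequences. Characters that
    are not in the background (for example '-' for a gap) are ignored.
    """
    if background is None:
        # Assumption: uniform background frequencies.
        background = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    pssm = []
    for pos in range(length):
        counts = Counter(row[pos] for row in alignment if row[pos] in background)
        total = sum(counts.values()) + pseudocount * len(background)
        column = {}
        for aa, q in background.items():
            p = (counts[aa] + pseudocount) / total  # smoothed probability
            column[aa] = math.log2(p / q)           # log-odds vs background
        pssm.append(column)
    return pssm


pssm = build_pssm(["ACD", "ACE", "AC-"])
print(round(pssm[0]["A"], 3), round(pssm[0]["C"], 3))
```

Residues seen more often than the background predicts get positive scores and
the rest get negative ones; the pseudocount stops unseen residues from scoring
minus infinity.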

The suggested book has lots of background reading on different methods for
doing both of the above.
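
For issue 2, BLAST-style programs use an analytical (Karlin-Altschul) theory
to turn scores into E-values. As a much simpler stand-in, and only as a
sketch, one can estimate significance empirically by shuffling the sequence
and asking how often a shuffled version scores at least as well. The function
names, the default of 1000 shuffles and the fixed seed below are assumptions
made for the example.

```python
import random


def best_score(pssm, sequence):
    """Best ungapped window score of the PSSM anywhere in the sequence.

    Raises ValueError if the sequence is shorter than the PSSM.
    """
    width = len(pssm)
    return max(
        sum(col[res] for col, res in zip(pssm, sequence[i:i + width]))
        for i in range(len(sequence) - width + 1)
    )


def empirical_p_value(pssm, sequence, n_shuffles=1000, seed=0):
    """Fraction of shuffled sequences scoring >= the real sequence."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    observed = best_score(pssm, sequence)
    residues = list(sequence)
    at_least_as_good = 0
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        if best_score(pssm, "".join(residues)) >= observed:
            at_least_as_good += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (at_least_as_good + 1) / (n_shuffles + 1)


# Hypothetical PSSM over a toy alphabet, as a list of residue -> score maps.
toy_pssm = [
    {"A": 2, "C": -1, "D": -1, "E": -1},
    {"A": -1, "C": 3, "D": -1, "E": -1},
    {"A": -1, "C": -1, "D": 1, "E": 1},
]

print(empirical_p_value(toy_pssm, "EEACDAAEEDDEE"))
```

Shuffling keeps the composition of the sequence but destroys its order, so a
small value suggests the match depends on the arrangement of residues rather
than on composition alone.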

I really like the idea of a Perl implementation of psi-blast (for
educational purposes). If you are really ambitious perhaps your code could
be the first contribution to a new open source project at

All the best,

>         Jarod
