[Biococoa-dev] starting BCAlignment
Alexander Griekspoor
a.griekspoor at nki.nl
Thu Mar 10 16:20:47 EST 2005
On 10 Mar 2005, at 22:05, Philipp Seibel wrote:
>> One more thing about matrices, to explain a bit more to John:
>> You can imagine that in the DNA world a (very simple) scoring scheme
>> can be:
>> a match positive, e.g. +1
>> a mismatch negative, e.g. -1
>> A simple char comparison is all it takes to get the score.
>> But in the protein world there's more information, as a change from
>> amino acid X to Y can matter more or less depending on whether they
>> belong to the same chemical class. Based on analysis of mutations
>> in many sequences, people have created substitution matrices with
>> this in mind (examples are PAM and BLOSUM). Because these matrices
>> are accessed for every score, for performance reasons they are
>> usually of type int** (or char**, which works the same way).
>>
> I think we should use an int* instead of an int** because it's faster.
> Take a look at my BCScoringMatrix.
You're the expert! ;-)
I came across this example code, which I thought was quite elegant:
Generation of a (DNA) scoring matrix:
match = 1;
mismh = -1;

/* set match and mismatch weights */
for (i = 0; i < 128; i++)
    for (j = 0; j < 128; j++)
        if (i == j) v[i][j] = match;
        else v[i][j] = mismh;

/* N is an unknown base, so N against N does not count as a match */
v['N']['N'] = mismh;
v['n']['n'] = mismh;

/* upper and lower case of the same base count as a match */
v['A']['a'] = v['a']['A'] = match;
v['C']['c'] = v['c']['C'] = match;
v['G']['g'] = v['g']['G'] = match;
v['T']['t'] = v['t']['T'] = match;
So you simply build a 128x128 char matrix, using the fact that chars
are small integers and can index the array directly.
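
To make the idea concrete, here is a minimal sketch of the same table
wrapped in two functions. The names build_dna_matrix, dna_score and
ALPHABET are made up for illustration and are not from the example
code or from BioCocoa. Masking the index to 7 bits is my own
assumption, so that stray non-ASCII bytes cannot index outside the
table.

```c
#include <assert.h>

#define ALPHABET 128  /* 7-bit ASCII, as in the example above */

/* Fill v with `match` on the diagonal and `mismatch` elsewhere,
   then apply the 'N' and upper/lower case special cases. */
void build_dna_matrix(int v[ALPHABET][ALPHABET], int match, int mismatch)
{
    static const char bases[] = "ACGT";
    int i, j;

    /* A match has to score higher than a mismatch. */
    assert(match > mismatch);

    for (i = 0; i < ALPHABET; i++)
        for (j = 0; j < ALPHABET; j++)
            v[i][j] = (i == j) ? match : mismatch;

    /* 'N' is an unknown base, so N against N is not a match. */
    v['N']['N'] = mismatch;
    v['n']['n'] = mismatch;

    /* Upper and lower case of the same base count as a match. */
    for (i = 0; bases[i] != '\0'; i++) {
        int up = bases[i];
        int lo = up + ('a' - 'A');
        v[up][lo] = v[lo][up] = match;
    }
}

/* Look up the score for one aligned pair of characters.
   Assumption: indices are masked to 7 bits to stay inside the table. */
int dna_score(int v[ALPHABET][ALPHABET], char a, char b)
{
    return v[(unsigned char)a & 0x7F][(unsigned char)b & 0x7F];
}
```

After build_dna_matrix(v, 1, -1), a call such as dna_score(v, 'A', 'a')
gives the match score and dna_score(v, 'N', 'N') gives the mismatch
score.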
Next, to calculate the score:
char a = A[++i];   // character i in sequence A
char b = B[++j];   // character j in sequence B
*c++ = (a == b || (isdna && v[a][b] == MATCHSC)) ? '|' : ' ';
// writes a | in the case of a match and
// a space in the case of a mismatch
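
The same step applied to two whole aligned sequences could look
roughly like this. It is a sketch only: match_line, NCHARS and the
parameter list are hypothetical, and I assume both sequences have the
same length and that the caller supplies the output buffer. The range
check before the table lookup is my addition.

```c
#include <assert.h>
#include <stddef.h>

#define NCHARS 128  /* matrix dimension, matching the 128x128 table */

/* Fill `out` with '|' where a[k] and b[k] match and ' ' where they
   do not. `out` must have room for len + 1 chars. */
void match_line(const char *a, const char *b, size_t len,
                int v[NCHARS][NCHARS], int isdna, int matchsc, char *out)
{
    char *c = out;
    size_t k;

    assert(a != NULL && b != NULL && out != NULL);

    for (k = 0; k < len; k++) {
        unsigned char x = (unsigned char)a[k];
        unsigned char y = (unsigned char)b[k];
        /* Only consult the matrix for characters inside its range. */
        int in_range = x < NCHARS && y < NCHARS;

        *c++ = (x == y || (isdna && in_range && v[x][y] == matchsc))
                   ? '|' : ' ';
    }
    *c = '\0';
}
```

Note that, as in the original expression, identical characters always
get a '|', even a pair like N/N that the matrix scores as a mismatch.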
Again, my experience is pretty limited, so I readily believe you that
a simple int array is faster than a matrix, and certainly much
simpler!!
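
For comparison, a flat int* version might look something like the
sketch below. I have not modelled this on BCScoringMatrix, so
flat_matrix_new, flat_score and DIM are hypothetical names and the
layout (row-major, 128 columns) is an assumption. The point is only
that one contiguous block is indexed as row * DIM + column, whereas
an int** built from separately allocated rows first has to load the
row pointer and then the element.

```c
#include <assert.h>
#include <stdlib.h>

#define DIM 128  /* assumed alphabet size, as in the 128x128 example */

/* Allocate one contiguous DIM*DIM block instead of DIM separate rows.
   Returns NULL if the allocation fails; the caller frees the result. */
int *flat_matrix_new(int match, int mismatch)
{
    int *m = malloc(DIM * DIM * sizeof *m);
    int i, j;

    if (m == NULL)
        return NULL;

    for (i = 0; i < DIM; i++)
        for (j = 0; j < DIM; j++)
            m[i * DIM + j] = (i == j) ? match : mismatch;

    return m;
}

/* Row-major lookup: compute the offset, then read a single element. */
int flat_score(const int *m, unsigned char a, unsigned char b)
{
    assert(m != NULL && a < DIM && b < DIM);
    return m[a * DIM + b];
}
```

A caller would do something like int *m = flat_matrix_new(1, -1);
then flat_score(m, 'A', 'C'), and free(m) when done.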
Cheers,
Alex
*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam
Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
E-mail: a.griekspoor at nki.nl
AIM: mekentosj at mac.com
Web: http://www.mekentosj.com
EnzymeX - To cut or not to cut
http://www.mekentosj.com/enzymex
*********************************************************