[Biococoa-dev] BCPairwiseAlignment & BCScoreMatrix

Alexander Griekspoor a.griekspoor at nki.nl
Fri Mar 11 16:31:24 EST 2005


On 11 Mar 2005, at 17:47, Philipp Seibel wrote:

> Hi everybody,
>
> I just made some modifications to the alignment code. I followed
> Alex's advice and made the scoring matrix char-based: every symbol is
> cast to a char and used as a numeric index into the matrix. This
> approach costs some extra memory, but it is much faster because
> we no longer need to ask the NSArray for the symbol index every time.
>
> I also copied some of Alex's code (sorry for that, Alex ;-) ) to
> provide a short overview of the global alignment.
Absolutely no problem!
Just to make things clear for everyone: with alignments we're talking
about two kinds of matrices. The first kind holds the scores; these are
also known as substitution matrices, although you can implement them as
plain arrays as well, like Phil demonstrated before.
These are different from the matrices used during the actual
alignment (with the 3 phases, as you might remember). For the first kind
we create the scoring matrix objects; the second kind is probably only
used internally in the algorithm implementation.

So Koen, in this light your remark:
> I am thinking how this will be used. The end user probably wants to 
> try out one type of alignment, see the result, then try another one, 
> compare the results, etc. So if we make a BCNeedlemanWunsch, and then 
> a BCSmithWaterman where is the actual matrix that is used to 
> calculate. I think it is a good idea if we have just one matrix, that 
> is used as a basis for each different calculation. It would be a waste 
> if for every calculation the starting matrix has to be re-calculated. 
> Or maybe that's where BCMatrix comes in place?
The actual matrix used for calculation is the second one. Keeping that
matrix around only saves you the memory allocation: different
alignments fill it differently, so it has to be refilled with scores
that depend on the algorithm, penalty scores, gap costs etc. Most of
the time goes into filling the matrix and tracing it back after the
fill, so there is little to gain from reusing it. Also, most algorithms
with subquadratic memory requirements chop up the matrix and use a
divide-and-conquer approach, because it's the storage of a
complete-sized matrix that forms the memory problem.
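
A rough C sketch of both points, under stated assumptions: the function
alignment_score and the helper pair_score are hypothetical, a flat
match/mismatch score stands in for a substitution-matrix lookup, and gaps
carry a linear penalty. The same two row buffers serve a global
(Needleman-Wunsch-style) or a local (Smith-Waterman-style) fill, but every
cell has to be recomputed either way. Only the score is returned; recovering
the alignment itself in linear space needs the divide-and-conquer step, which
is not shown.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a substitution-matrix lookup. */
static int pair_score(char a, char b) { return a == b ? 1 : -1; }

static int max2(int x, int y) { return x > y ? x : y; }

/* Optimal alignment score of a and b, keeping only two rows in memory.
   gap   : non-negative penalty subtracted per gap position
   local : 0 = global fill, non-zero = local fill (cells floored at 0)
   Returns INT_MIN if the row buffers cannot be allocated. */
int alignment_score(const char *a, const char *b, int gap, int local)
{
    assert(gap >= 0);
    size_t n = strlen(a), m = strlen(b);
    int *prev = malloc((m + 1) * sizeof *prev);
    int *curr = malloc((m + 1) * sizeof *curr);
    if (!prev || !curr) { free(prev); free(curr); return INT_MIN; }

    /* Row 0: the boundary condition already differs between the two fills. */
    for (size_t j = 0; j <= m; j++)
        prev[j] = local ? 0 : -(int)j * gap;

    int best = 0;   /* best cell seen so far; only used by the local fill */
    for (size_t i = 1; i <= n; i++) {
        curr[0] = local ? 0 : -(int)i * gap;
        for (size_t j = 1; j <= m; j++) {
            int s = max2(prev[j - 1] + pair_score(a[i - 1], b[j - 1]),
                         max2(prev[j] - gap, curr[j - 1] - gap));
            if (local) { s = max2(s, 0); best = max2(best, s); }
            curr[j] = s;
        }
        int *tmp = prev; prev = curr; curr = tmp;   /* reuse the two buffers */
    }
    if (!local) best = prev[m];   /* global score sits in the last cell */
    free(prev);
    free(curr);
    return best;
}
```

With a gap penalty of 2, "XXACGTXX" against "YYACGTYY" scores 0 globally but
4 locally, so nothing filled in for one run carries over to the other.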
Does this make any sense?
Alex

>
> @Charles: Perhaps we could discuss your symbol-to-int mapping in more
> detail; I didn't get the idea. ;-)
>
> Phil
>
> _______________________________________________
> Biococoa-dev mailing list
> Biococoa-dev at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/biococoa-dev
>
>
*********************************************************
                     ** Alexander Griekspoor **
*********************************************************
               The Netherlands Cancer Institute
               Department of Tumorbiology (H4)
          Plesmanlaan 121, 1066 CX, Amsterdam
                    Tel:  + 31 20 - 512 2023
                   Fax:  + 31 20 - 512 2029
                   AIM: mekentosj at mac.com
                  E-mail: a.griekspoor at nki.nl
               Web: http://www.mekentosj.com

               4Peaks - For Peaks, Four Peaks.
        2004 Winner of the Apple Design Awards
                Best Mac OS X Student Product
              http://www.mekentosj.com/4peaks

*********************************************************