[Biococoa-dev] BCPairwiseAlignment & BCScoreMatrix
Alexander Griekspoor
a.griekspoor at nki.nl
Fri Mar 11 16:31:24 EST 2005
On 11 Mar 2005, at 17:47, Philipp Seibel wrote:
> Hi everybody,
>
> I just made some modifications to the Alignment stuff. I followed
> Alex' advice and made the Scoring Matrix char-based: every symbol is
> cast to a char and used as a numeric key into the matrix. With this
> approach we have some memory overhead, but we're much faster, because
> we need not ask the NSArray for the Symbol index every time.
>
> I also copied some of Alex' code (sorry for that, Alex ;-) ) to
> provide a short overview of the global alignment.
Absolutely no problem!

Just to make things clear for everyone: with alignments we're talking
about two kinds of matrices. The first kind holds the scores; these are
also known as substitution matrices, although you can implement them as
plain arrays as well, like Phil demonstrated before.
These are different from the matrices that are filled during the actual
alignment (with the 3 phases, as you might remember). For the first
kind we create the scoring matrix objects; the second kind is probably
only used internally in the algorithm implementation.
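
To make the first kind concrete, here is a minimal sketch of a char-keyed
substitution matrix next to the index-lookup alternative it replaces. It is
written in Python rather than BioCocoa's Objective-C, and everything in it
is an assumption for illustration: the four-letter alphabet, the +1/-1
scores, the 128-entry table size and all function names are hypothetical
and not taken from the BioCocoa code.

```python
SYMBOLS = "ACGT"  # hypothetical alphabet, for illustration only


def build_char_matrix(symbols, match=1, mismatch=-1, size=128):
    """Return a size x size table indexed directly by character codes."""
    table = [[0] * size for _ in range(size)]
    for x in symbols:
        for y in symbols:
            table[ord(x)][ord(y)] = match if x == y else mismatch
    return table


def build_compact_matrix(symbols, match=1, mismatch=-1):
    """Return a len(symbols) x len(symbols) table indexed by symbol position."""
    return [[match if x == y else mismatch for y in symbols] for x in symbols]


def score_by_char(table, x, y):
    # Direct lookup: the character code itself is the index.
    return table[ord(x)][ord(y)]


def score_by_index(symbols, compact, x, y):
    # Slower alternative: search the symbol list for each position first.
    return compact[symbols.index(x)][symbols.index(y)]
```

The char-keyed table spends 128 x 128 cells on a 4-symbol alphabet, which
is the memory overhead mentioned above, but each lookup skips the search
through the symbol list.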
So Koen, in this light your remark:
> I am thinking how this will be used. The end user probably wants to
> try out one type of alignment, see the result, then try another one,
> compare the results, etc. So if we make a BCNeedlemanWunsch, and then
> a BCSmithWaterman, where is the actual matrix that is used to
> calculate? I think it is a good idea if we have just one matrix that
> is used as a basis for each different calculation. It would be a waste
> if for every calculation the starting matrix has to be re-calculated.
> Or maybe that's where BCMatrix comes into play?
The actual matrix used for the calculation is the second one. Keeping
that matrix around only saves you the memory allocation: different
alignments fill it differently, so it has to be refilled anyway with
scores that depend on the algorithm, penalty scores, gap costs etc.
Since most of the time goes into filling the matrix and tracing it back
after the fill, reusing it gains you next to nothing. Also, most
algorithms with subquadratic memory requirements chop up the matrix
and use a divide-and-conquer approach, because it's the storage of a
complete-sized matrix that forms the memory problem.
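
As a rough illustration of why the filled matrix cannot be shared, the
sketch below fills the same-sized matrix once the global way
(Needleman-Wunsch) and once the local way (Smith-Waterman). It is Python,
not BioCocoa code, and it assumes a simple linear gap penalty and a +1/-1
scoring function; all names are hypothetical. The last function keeps only
two rows, which is the building block of the divide-and-conquer methods
mentioned above. It yields the score only; the divide-and-conquer step
that recovers the alignment itself is not shown.

```python
def simple_score(x, y):
    # Hypothetical stand-in for a substitution-matrix lookup.
    return 1 if x == y else -1


def fill(a, b, score, gap, local):
    """Fill the (len(a)+1) x (len(b)+1) dynamic-programming matrix."""
    rows, cols = len(a) + 1, len(b) + 1
    m = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment: leading gaps are penalised in row 0 and column 0.
        for i in range(1, rows):
            m[i][0] = i * gap
        for j in range(1, cols):
            m[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            best = max(
                m[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # (mis)match
                m[i - 1][j] + gap,                            # gap in b
                m[i][j - 1] + gap,                            # gap in a
            )
            # Local alignment: a cell never drops below zero.
            m[i][j] = max(best, 0) if local else best
    return m


def global_score_two_rows(a, b, score, gap):
    """Global alignment score using two rows instead of the full matrix."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cur[j] = max(
                prev[j - 1] + score(a[i - 1], b[j - 1]),
                prev[j] + gap,
                cur[j - 1] + gap,
            )
        prev = cur
    return prev[-1]
```

Calling `fill("AC", "AC", simple_score, -1, local=False)` and the same with
`local=True` gives matrices that already differ in the first row and
column, so one cannot stand in for the other. What both calls do share is
the scoring function, i.e. the substitution matrix.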
Does this make any sense?
Alex
>
> @Charles: Perhaps we could discuss your symbol-to-int mapping in more
> detail; I didn't get the idea. ;-)
>
> Phil
*********************************************************
** Alexander Griekspoor **
*********************************************************
The Netherlands Cancer Institute
Department of Tumorbiology (H4)
Plesmanlaan 121, 1066 CX, Amsterdam
Tel: + 31 20 - 512 2023
Fax: + 31 20 - 512 2029
AIM: mekentosj at mac.com
E-mail: a.griekspoor at nki.nl
Web: http://www.mekentosj.com
4Peaks - For Peaks, Four Peaks.
2004 Winner of the Apple Design Awards
Best Mac OS X Student Product
http://www.mekentosj.com/4peaks
*********************************************************