Hello, Theodore!

> Is there ever a case where two inserts, could be less costly than a
> substitution? In fact it's not even just "two inserts", it's an insert
> followed by a delete.

I don't know of any such case, but it would be possible. The program
BLAST allows the user to specify the gap penalty on the command line
(though I have never ever seen anyone do that)...

> This would require that an insert followed by a deletion, is cheaper
> than a replacement.
>
> Can this be true?
>
> I've inspected the BLOSUM62 matrix (-4 is the worst penalty for
> replacement), and the default gap penalties (-11,-1), and it seems
> like using BLOSUM62, it's impossible to get "insert+delete" instead
> of one replacement.

...so a user could specify a gap penalty which is greater than -4. It is
not wise to restrict your algorithm to one specific matrix, since users
will claim that they want to use other matrices. (Even if they won't.)
The BLAST algorithm would not even find this sort of thing.

> So this would be a good case for me, as it would simplify my
> algorithm. I'm happier if insert+delete is always worse than
> replacement.
>
> But does this pattern hold across all biological uses? Can there be a
> case where an insert followed by a delete, is better than a
> replacement? If so, would it be something so contrived and unlikely
> and not really useful, that I can just ignore it and tell my users
> that my software has a design restriction that insert+delete must
> always score worse than a substitution?

If you tell your users that, they will not understand the special case
the restriction applies to. So they will only remember "has some sort of
restriction" and not use it, out of fear of losing relevant alignments.

To handle your special case, it would suffice to scan the matrix (which
is small) and check whether any substitution value carries a greater
penalty than the insert+delete penalty. If it does, your program
terminates and blames the user or the matrix for it.
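
A rough sketch of that matrix scan in Python, in case it helps. The
function names, the toy matrix and the gap-cost convention (a gap of k
positions costs open + k * extend) are my assumptions for illustration,
not something your program has to follow; adjust the formula if your
opening penalty already covers the first gapped position.

```python
def single_gap_score(gap_open, gap_extend, length=1):
    """Score (negative) of one gap spanning `length` positions.

    Assumed convention: opening cost plus one extension cost per position.
    """
    return -(gap_open + gap_extend * length)


def violating_substitutions(matrix, gap_open, gap_extend):
    """Return (a, b, score) for every mismatch that scores worse than an
    insertion followed by a deletion (two separate one-position gaps)."""
    indel_pair = 2 * single_gap_score(gap_open, gap_extend)
    return [
        (a, b, score)
        for a, row in matrix.items()
        for b, score in row.items()
        if a != b and score < indel_pair
    ]


# Toy 3-letter matrix, made up for illustration (this is not BLOSUM62).
toy_matrix = {
    "A": {"A": 4, "C": -1, "W": -4},
    "C": {"A": -1, "C": 9, "W": -2},
    "W": {"A": -4, "C": -2, "W": 11},
}

# Gap costs 11/1: insert+delete scores -24, no substitution is worse.
strict = violating_substitutions(toy_matrix, gap_open=11, gap_extend=1)

# Very cheap gaps 1/0.5: insert+delete scores -3, so A<->W (-4) is worse.
cheap = violating_substitutions(toy_matrix, gap_open=1, gap_extend=0.5)
```

If the returned list is not empty, the program can stop and report the
offending pairs together with the gap penalties that were given.
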
> Also, if an insert is followed by a delete, does it count as -11-11
> (-22) or -11-1 (-12)?

It is <cost for opening a gap> + <cost for deletion>. The insertion and
the deletion are gaps in different sequences, so the deletion opens a
new gap of its own. The -1 is the penalty for extending an existing gap,
which is not the case here.

> Thanks to all who can help! My algorithm is really close to
> finishing. I can see it almost in my hands. I will be very proud to
> finish it. Whether or not the algorithm is fast enough to be of use,
> is another matter, but it will function correctly at least!

Announce it here when you are done!

Greetings,
Michael.

-- 
-----------------------------------------------------------
Dipl.-Inform. Michael Nuhn
Bioinformatik
Zentrum für Nanostrukturtechnologie und
Molekularbiologische Technologie
+49 (0)631 - 205 4334
nuhn at rhrk.uni-kl.de
http://nbc3.biologie.uni-kl.de/
-----------------------------------------------------------