[Biophp-dev] Fasta filetype parser updated

Serge Gregorio biophp-dev@bioinformatics.org
Wed, 07 May 2003 11:44:56 +0800


Hello all!

>I thought that clustalx is simply the X-windows interface to
>clustalw. Fileformats should be identical.

Pretty impressive work... btw, if I don't appear to be writing
too much, it's because of the difference in our time zones...
By the time I read your postings, questions, etc. you've most
likely resolved or answered the issues already...  =)

However, just so you know, I've made a mental note of Sean's
suggestion of having "aliases" for some of the attributes or
properties of classes/objects (e.g. id = label = name).  I
realize the dilemma of making the user memorize all of them,
but I'm holding my comments till I've formed an opinion on it.

My WinCVS still isn't set up for bioinfo.org... in the
meantime, I notice my code is still without the "banner" Andres
was mentioning...

   /* **********************************************
   seqdb.inc.php 

   Description - etc.
   Written by: Serge Gregorio
   Date: March/April 2003
   ********************************************** */

Could someone (Sean?) do something about this?  I don't know
what needs to be done, if it means recreating an existing
directory, etc. but it seems only proper to do so.  I can
send the *.inc.php files with the "banners" tomorrow.  
 
>I added seqlength to both the clustal and fasta parsers.  Also, it is (a
>little bit) better to quote with single quotes if you do not need
>variable interpolation, that way PHP does not have to invoke its parser.
>The clustal parser output has a whole lot of dashes in the sequences.
>Can you have a look at those? (reg expressions are not my strong point as 
>you might have noticed)

Dashes are "gaps"... they are meant to say that Sequence A has
no letter at that position to pair with the letter in the
corresponding position in Sequence B.  When measuring sequence
similarity, usually inserting the first gap has a "high cost"
(the gap-opening penalty), the 2nd and succeeding gaps a slightly
lower cost (the gap-extension penalty), and non-gaps have the
lowest (but differing) costs...
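
To make the cost idea above concrete, here is a minimal sketch in PHP.
It assumes two already-aligned sequences of equal length, with '-' as
the gap character.  The function name alignment_cost() and all of the
cost values (10 to open a gap, 5 to extend it, 2 for a mismatch, 0 for
a match) are made up for illustration; they are not part of BioPHP and
are not ClustalW's defaults.

```php
<?php
// Hypothetical helper (not BioPHP code): total cost of a pairwise
// alignment where the first gap position in a run costs $gapOpen and
// each following gap position in the same run costs $gapExtend.
// All default cost values are illustrative assumptions.
function alignment_cost($seqA, $seqB, $gapOpen = 10, $gapExtend = 5,
                        $mismatch = 2, $match = 0)
{
    $len = strlen($seqA);
    if ($len !== strlen($seqB)) {
        return false; // aligned sequences must have the same length
    }

    $cost   = 0;
    $inGapA = false; // was the previous position a gap in sequence A?
    $inGapB = false; // was the previous position a gap in sequence B?

    for ($i = 0; $i < $len; $i++) {
        $gapA = ($seqA[$i] === '-');
        $gapB = ($seqB[$i] === '-');

        if ($gapA && $gapB) {
            continue; // gap in both sequences: nothing to compare here
        }

        if ($gapA) {
            $cost += $inGapA ? $gapExtend : $gapOpen;
        } elseif ($gapB) {
            $cost += $inGapB ? $gapExtend : $gapOpen;
        } else {
            // two letters: a match is cheaper than a mismatch
            $cost += ($seqA[$i] === $seqB[$i]) ? $match : $mismatch;
        }

        $inGapA = $gapA;
        $inGapB = $gapB;
    }

    return $cost;
}

echo alignment_cost('ACGT--A', 'ACGTTTA'), "\n"; // one gap of length 2: 10 + 5 = 15
echo alignment_cost('A-C-G', 'ATCTG'), "\n";     // two separate gaps: 10 + 10 = 20
echo alignment_cost('ACGT', 'AGGT'), "\n";       // one mismatch, no gaps: 2
```

With these numbers, one long gap is cheaper than several short ones
covering the same number of positions, which is the point of charging
more for the first gap than for the ones that follow it.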

Cheers!

Serge

