[Bioclusters] BLAST - Collecting output after formatdb

Joseph Landman bioclusters@bioinformatics.org
09 Nov 2002 09:55:33 -0500


On Sat, 2002-11-09 at 09:46, Mario Belluardo wrote:
> Hi Blasters,
> I'm working on improving BLAST performance by splitting databases.
> It works fine using formatdb (with the -v option), but I get a single, big
> file that is useful but not easy to read.

Which file are you talking about?  The results of the BLAST at the end
of the run?  Some other file?

> I would like to know how people solve this problem, whether you wrote your
> own scripts or used existing software.

For BLAST parsing, BioPerl is hard to beat (http://bioperl.org).  It can
help you with "primitive" analytics, such as iterating over the HSP
matches in a report.
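
If Perl is not an option, the same kind of HSP iteration can be sketched
in Python using only the standard library.  This is a minimal sketch, not
a replacement for BioPerl: it assumes the BLAST run produced XML output
and that the report uses the usual Hit / Hsp element names.  The embedded
sample report and the function name iter_hsps are made up for
illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed BLAST XML report, for illustration only.
SAMPLE_XML = """<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_hits>
        <Hit>
          <Hit_id>example_subject_A</Hit_id>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>95.1</Hsp_bit-score>
              <Hsp_evalue>2e-20</Hsp_evalue>
              <Hsp_query-from>1</Hsp_query-from>
              <Hsp_query-to>60</Hsp_query-to>
            </Hsp>
            <Hsp>
              <Hsp_bit-score>40.3</Hsp_bit-score>
              <Hsp_evalue>0.001</Hsp_evalue>
              <Hsp_query-from>80</Hsp_query-from>
              <Hsp_query-to>110</Hsp_query-to>
            </Hsp>
          </Hit_hsps>
        </Hit>
        <Hit>
          <Hit_id>example_subject_B</Hit_id>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>33.0</Hsp_bit-score>
              <Hsp_evalue>0.5</Hsp_evalue>
              <Hsp_query-from>5</Hsp_query-from>
              <Hsp_query-to>30</Hsp_query-to>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>"""


def iter_hsps(xml_text):
    """Yield one dict per HSP, tagged with the ID of the hit it belongs to."""
    root = ET.fromstring(xml_text)
    for hit in root.iter("Hit"):
        hit_id = hit.findtext("Hit_id")
        for hsp in hit.iter("Hsp"):
            yield {
                "hit_id": hit_id,
                "bit_score": float(hsp.findtext("Hsp_bit-score")),
                "evalue": float(hsp.findtext("Hsp_evalue")),
                "query_from": int(hsp.findtext("Hsp_query-from")),
                "query_to": int(hsp.findtext("Hsp_query-to")),
            }


# Keep only HSPs below an (arbitrary) E-value cutoff.
significant = [h for h in iter_hsps(SAMPLE_XML) if h["evalue"] <= 1e-3]
for h in significant:
    print(h["hit_id"], h["evalue"], h["query_from"], h["query_to"])
```

For a real report you would read the XML from the BLAST output file
rather than from a string, and pick the fields your next stage needs.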

> I would also like to know your policy for the most significant results: do
> you keep everything in a single HTML result page, even if you are BLASTing
> more than one sequence at a time?

I think returning a structured document (XML), or some sort of database
load, would be preferable.  Think of it as a pipeline, and ask how you
are going to get the data out to the next stage.  First you will have to
decide what data to send, based in part on how it will be used, and then
on the format that the recipient will want (human eyes, computer
program, etc).
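
As one possible shape for the "database load" stage, here is a minimal
Python sketch that reads tab-delimited hit lines and loads them into
SQLite, so the next stage can ask for the best hits per query.  It
assumes a 12-column tabular hit format (query, subject, percent
identity, alignment length, mismatches, gap opens, query start/end,
subject start/end, E-value, bit score); the table name, function names
and sample rows are hypothetical.  If the rows come from separate
searches against database fragments, their E-values are only comparable
when each search used the same effective database size.

```python
import sqlite3

# Assumed column order of the 12-field tabular hit line.
COLUMNS = ("query", "subject", "pct_identity", "aln_length", "mismatches",
           "gap_opens", "q_start", "q_end", "s_start", "s_end",
           "evalue", "bit_score")

# Hypothetical sample rows, for illustration only.
SAMPLE_LINES = [
    "# hypothetical comment line",
    "q1\tsubjA\t98.50\t200\t3\t0\t1\t200\t11\t210\t1e-100\t370.0",
    "q1\tsubjB\t80.00\t150\t30\t0\t5\t154\t1\t150\t2e-30\t130.0",
    "q2\tsubjC\t91.00\t100\t9\t0\t1\t100\t1\t100\t5e-40\t160.0",
]


def parse_tabular(lines):
    """Yield one typed tuple per hit line, skipping blanks and comments."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        if len(f) != len(COLUMNS):
            raise ValueError("expected 12 tab-separated fields: %r" % line)
        yield (f[0], f[1], float(f[2]), int(f[3]), int(f[4]), int(f[5]),
               int(f[6]), int(f[7]), int(f[8]), int(f[9]),
               float(f[10]), float(f[11]))


def load_hits(conn, rows):
    """Create the hits table if needed and insert all rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS hits ("
        "query TEXT, subject TEXT, pct_identity REAL, aln_length INTEGER, "
        "mismatches INTEGER, gap_opens INTEGER, q_start INTEGER, "
        "q_end INTEGER, s_start INTEGER, s_end INTEGER, "
        "evalue REAL, bit_score REAL)"
    )
    conn.executemany(
        "INSERT INTO hits VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)", rows
    )
    conn.commit()


def top_hits(conn, query, limit=5):
    """Return (subject, evalue, bit_score) for a query, best E-value first."""
    cur = conn.execute(
        "SELECT subject, evalue, bit_score FROM hits WHERE query = ? "
        "ORDER BY evalue ASC, bit_score DESC LIMIT ?",
        (query, limit),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
load_hits(conn, parse_tabular(SAMPLE_LINES))
for subject, evalue, bit_score in top_hits(conn, "q1"):
    print(subject, evalue, bit_score)
```

Once the hits are in a table, an HTML page for human eyes becomes just
one more consumer of the same data, generated from a query rather than
parsed back out of a report.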

HTML is great for visual inspection of the pipeline output, but it is
hard for other programs to parse and extract data from.

> 
> Any suggestion will be welcome!
> 
> Thanks

-- 
Joseph Landman, Ph.D
Scalable Informatics LLC
email: landman@scalableinformatics.com
  web: http://scalableinformatics.com
phone: +1 734 612 4615