[Bioclusters] Details on a local blast cluster question

Sergio Ahumada N bioclusters@bioinformatics.org
Mon, 27 Jan 2003 14:42:06 -0300


> ... and also for the record on our RedHat 7.2 based system (kernel
> 2.4.2-2smp?), files greater than 2GB have to be piped into formatdb, rather
> than supplied as an argument

I wrote and sent to this list a Perl script for cutting a large database into 
pieces .. I tested it with "nt" (~6 GB), and the pieces can then be supplied 
to formatdb as normal arguments. It's not great code, but it works :)
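
The script works more or less like this sketch (the name splitdb.pl, the 
usage, and the piece-size argument are just illustration; the real script is 
in the list archive):

    #!/usr/bin/perl -w
    # splitdb.pl -- cut a large FASTA database into pieces no larger than
    # a given size, cutting only at sequence boundaries ('>' header lines)
    # so that no sequence ends up split across two pieces.
    # Usage: splitdb.pl <database.fasta> <max_piece_size_in_MB>
    use strict;

    my ($db, $mb) = @ARGV;
    die "usage: $0 <database.fasta> <max_MB>\n"
        unless defined $db and defined $mb;

    my $max   = $mb * 1024 * 1024;   # piece size limit, in bytes
    my $piece = 0;                   # index of the piece being written
    my $bytes = 0;                   # bytes written to the current piece

    open my $in,  '<', $db          or die "cannot read $db: $!\n";
    open my $out, '>', "$db.$piece" or die "cannot write $db.$piece: $!\n";

    while (my $line = <$in>) {
        # open the next piece only when we reach a new FASTA header
        if ($line =~ /^>/ and $bytes > 0 and $bytes + length($line) > $max) {
            close $out;
            $piece++;
            $bytes = 0;
            open $out, '>', "$db.$piece"
                or die "cannot write $db.$piece: $!\n";
        }
        print {$out} $line;
        $bytes += length $line;
    }
    close $out;
    close $in;
    print "wrote ", $piece + 1, " pieces\n";

Each piece stays under the 2 GB limit, so it can be given to formatdb as a 
normal argument (e.g. formatdb -i nt.0 -p F, and so on for each piece).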

Furthermore, Tim Harsch says that splitting the database reduces the cost in 
physical memory ... I don't think that is a good idea, because then the 
results are not right (the E-values end up computed against each piece 
instead of the whole database) ... We are testing a local BLAST cluster 
(9 Dell PowerEdge servers), and the best performance (in both time and disk 
usage) comes from splitting the input files (FASTA format, obtained via 
phredPhrap) into pieces the size of the physical memory available and sending 
separate jobs to each node in the cluster, as in the sketch below ... I hope 
you find this useful (for beginners, I guess)
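
The dispatch itself is nothing fancy; something along these lines, where the 
host names, paths, and blastall options are only placeholders for our real 
setup:

    #!/usr/bin/perl -w
    # dispatch.pl -- send one blastall job per query piece to the nodes,
    # round-robin.  The pieces come from the same splitter as above,
    # using each node's physical RAM as the piece-size limit.
    use strict;

    my @nodes  = map { "node$_" } 1 .. 9;   # the nine PowerEdge boxes
    my @pieces = glob 'queries.fasta.*';    # output of the splitter
    my $db     = '/shared/db/nt';           # database visible on every node

    for my $i (0 .. $#pieces) {
        my $node  = $nodes[$i % @nodes];    # round-robin assignment
        my $piece = $pieces[$i];
        my $cmd   = "blastall -p blastn -d $db -i $piece -o $piece.out";
        # start the job on the remote node and move on to the next piece;
        # rsh works just as well as ssh here
        system("ssh $node '$cmd' &");
    }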

Greetings

> Andy

PS: I am so sorry for my bad English :/
-- 
Sergio Antonio Ahumada Navea                mailto:san@inf.utfsm.cl
Centro de Bioinformatica - UTFSM
http://www.biotec.utfsm.cl/