[Bioclusters] BioPerl 1.2.3 and memory handling

Michael Cariaso cariaso at yahoo.com
Sun Nov 28 12:49:57 EST 2004

Michael Maibaum wrote:
> On 10 Nov 2004, at 18:25, Al Tucker wrote:
>> Hi everybody.
>> We're new to the Inquiry Xserve scientific cluster and trying to iron 
>> out a few things.
>> One thing we seem to be coming up against is an out-of-memory error 
>> when getting large sequence analysis results (5,000 sequences and 
>> above) back from BTblastall. The problem seems to be with BioPerl.
>> Might anyone here know if BioPerl knows enough not to try to 
>> access more than 4 GB of RAM in a single process (an OS X limit)? I'm 
>> told Blastall and BTblastall do and will chunk problems accordingly, 
>> but we're not certain if BioPerl does when called to merge large Blast 
>> results back together. It's the default version 1.2.3 that's supplied, 
>> btw, and OS X 10.3.5 with all current updates just short of the latest 
>> 10.3.6 update.
> BioPerl tries to slurp up the entire result set from a BLAST query and 
> builds objects for each little bit of it, so it uses lots of 
> memory. It doesn't have anything smart at all about breaking up the job 
> within the result set, afaik.
> I ended up stripping out results that hit a certain threshold size to 
> run on a different, large-memory Opteron/Linux box, and I'm experimenting 
> with replacing BioPerl with BioPython etc.
> Michael

You may find that the BPLite parser works better when dealing with 
large blast result files. It's not as clean or as well maintained, but it 
does the job nicely for my current needs, which overloaded the usual parser.
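
For anyone weighing the BioPython route mentioned above, here is a rough 
sketch of the general idea behind a lighter parser: read the report one 
record at a time and keep only what you need, rather than building objects 
for the whole result set. This is not BPLite and not any BioPerl or 
BioPython API. It assumes the results were written in BLAST tabular format 
(blastall -m 8, or -m 9 with comment lines), and the names Hit, iter_hits 
and best_hit_per_query are made up for illustration.

```python
import io
from collections import namedtuple

# Subset of the 12 tab-separated columns of BLAST tabular output (-m 8).
Hit = namedtuple("Hit", "query subject identity length evalue bitscore")


def iter_hits(handle):
    """Yield one Hit per data line; only the current line is held in memory."""
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and -m 9 comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # skip lines that are not complete tabular records
        yield Hit(
            query=fields[0],
            subject=fields[1],
            identity=float(fields[2]),
            length=int(fields[3]),
            evalue=float(fields[10]),
            bitscore=float(fields[11]),
        )


def best_hit_per_query(handle):
    """Keep only the highest-scoring hit per query while streaming the file."""
    best = {}
    for hit in iter_hits(handle):
        current = best.get(hit.query)
        if current is None or hit.bitscore > current.bitscore:
            best[hit.query] = hit
    return best


# Tiny made-up sample standing in for a large results file.
SAMPLE = (
    "# BLASTN comment line\n"
    "q1\ts1\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-100\t370\n"
    "q1\ts2\t99.00\t210\t2\t0\t1\t210\t9\t218\t1e-110\t410\n"
    "q2\ts3\t85.00\t60\t9\t0\t4\t63\t1\t60\t2e-08\t55.5\n"
)

best = best_hit_per_query(io.StringIO(SAMPLE))
for query, hit in sorted(best.items()):
    print(query, hit.subject, hit.bitscore)
```

With a real file you would pass an open file handle instead of the 
StringIO. Memory then grows with the number of queries you keep a hit for, 
not with the total number of hits in the report.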
