[Bioclusters] blast on opteron

Joe Landman bioclusters@bioinformatics.org
27 Jun 2003 13:46:18 -0400


Hi Hunter:

  Because BLAST reads its databases through mmap, and because of the way
mmap is implemented in Linux, BLAST is indeed IO bound in low-memory or
otherwise memory-constrained situations.
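
  As a rough illustration of the mmap behaviour (a minimal Python sketch,
not BLAST's actual code; the name count_records is made up for this
example): a memory-mapped file is faulted in page by page as it is
touched. Pages already in the page cache are served from memory, and the
rest are read from disk, so a scan of a file larger than available
memory keeps going back to the disk.

```python
import mmap

def count_records(path, marker=b">", chunk=1 << 20):
    """Count occurrences of `marker` by scanning a memory-mapped file.

    Hypothetical helper for illustration only. The file must be
    non-empty, since a zero-length file cannot be mapped.
    """
    total = 0
    with open(path, "rb") as fh:
        # Length 0 maps the whole file; nothing is read until pages are touched.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for start in range(0, len(mm), chunk):
                # Slicing touches the pages in this range and copies them out.
                total += mm[start:start + chunk].count(marker)
    return total
```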

  You basically have a number of options:

1) segment the database (use the -v switch on formatdb)
2) run on a big memory machine entirely in ram
3) run on a big memory machine with a ramdisk

Option 1 works today; it simply requires some process adjustment.  If
this is possible, you should be in good shape.  Option 2 may be possible,
but I had some issues with the quality of the Opteron OSes I played with.
The compiler and library issues on these machines are annoying right
now, but they should be resolved with AMD's help.  There are lots of
distros that "run" on the Opteron, though I am not sure if they really
support large memory directly, or if they use the PAE hack.  If it is
the latter (the same way large memory is supported on Xeons), it is
worth avoiding.
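
For option 1, the planning arithmetic is simple. The sketch below is
Python, volumes_needed is a hypothetical helper, and it assumes that -v
gives the volume size in millions of letters, which is worth confirming
against your formatdb's help output. It estimates how many volumes a
database would be split into:

```python
def volumes_needed(total_letters, v_millions):
    """Estimate the number of volumes for a segmented database.

    total_letters: total residues/bases in the database (an input you supply).
    v_millions:    assumed meaning of formatdb -v, in millions of letters.
    """
    if total_letters < 0 or v_millions <= 0:
        raise ValueError("total_letters must be >= 0 and v_millions > 0")
    per_volume = v_millions * 1_000_000
    # Ceiling division: a partly filled last volume still counts as a volume.
    return -(-total_letters // per_volume)
```

With a hypothetical database of 6 billion letters and -v 1000,
volumes_needed(6_000_000_000, 1000) gives 6. The idea is to pick a
volume size small enough that one volume fits comfortably in the memory
of the machine searching it.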

Option 3 is possible, though you will be paying for memory-to-memory
transfers, which are not free: they cost time, and they use up memory
bandwidth.  This option is related to option 2, in that if you have a
Linux with support for real extended addressing (not PAE), then you
should not have to worry about option 3.

Joe

On Fri, 2003-06-27 at 13:33, Hunter Matthews wrote:
> A researcher here is looking for where to host their blast searches. One
> question I asked was "how big is this database" and the reply was 4-6GB
> total.
> 
> I'm no blast expert, but my understanding is that blast is heavily IO
> bound. What about buying an opteron with 8GB of ram and simply creating
> a 6GB ram disk? That would leave 2GB for the OS and blast itself, and
> then the entire IO would be in memory.
> 
> Is that insane? The machine I specifically had in mind was a dual CPU
> opteron with 8GB of ram and a single disk (for the OS and the
> "permanent" copy of the database). Several vendors quoted about $5500
> for this machine, which was the researcher's target budget anyway.
-- 
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman@scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615