[Bioclusters] mpi-blast performance on a 256-core cluster
George Magklaras
georgios at biotek.uio.no
Fri Jan 4 14:59:24 EST 2008
Li Liu wrote:
> Dear biocluster members,
>
> We've been struggling with the performance of mpi-blast 1.4 on our
> 256-core cluster for almost a month. What we tried to do is to run
> blastx search against NCBI NR database for 600K 454 reads. It ran for
> a few days and stopped without giving any error message. I'm
> wondering if any of you have any suggestions on
Any hints as to the point at which it stopped? What state were your
result files in? If possible, did your sysadmin notice anything odd in
the I/O monitoring tools (preferably vmstat/iostat) while the job was
running?
>
> 1. Any alternative parallel blast program?
> 2. Did anybody observe frequent I/O operations with mpi-blast? Isn't
> it supposed to load the database into memory and access the memory
> from then on, rather than keep asking disk for database fragment?
>
If you mean the filesystem buffer cache, yes, that is normally what it
should do, provided that the database fragment AND the input sequences
together do not exceed the RAM per node (I assume you are not running
anything else on the cluster nodes while the job executes). We do not
run such large jobs here, but on our small setup (10 nodes, 40 cores) I
have seen buffer cache invalidation force heavy I/O, simply because
other jobs on the same node were writing large amounts of data to files
(more than 700 MB) and so evicting whatever buffer cache we had on 4 GB
RAM nodes. To answer your question I need to know what exactly you
class as frequent I/O, how much RAM you have per node/core, whether you
are running anything else on each node, and how exactly your nodes
access the blast db fragment data (local disk, FC, some other network
FS? We just copy the fragments to local SATA-II disks on each node).
Preferably, show some rows of vmstat output from a few of the nodes, so
we can see what sort of disk activity and cache state they are in while
the job runs.
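
As a rough way to check the "fragment plus input must fit in RAM"
condition on a node, something like the following sketch might help. It
is only an illustration, not part of mpi-blast: the function names, the
25% headroom reserved for the OS and other processes, and the idea of
summing the on-disk sizes of the fragment files are all my assumptions.
It reads the MemTotal field from /proc/meminfo, so it is Linux-only.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only well-formed "Name:   12345 kB" style lines.
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def fits_in_ram(fragment_bytes, query_bytes, meminfo, reserve_fraction=0.25):
    """Return True if fragment + query fit in MemTotal minus a reserve.

    reserve_fraction is an assumed safety margin for the kernel and
    other processes, not a measured value.
    """
    usable_kb = meminfo["MemTotal"] * (1.0 - reserve_fraction)
    needed_kb = (fragment_bytes + query_bytes) / 1024.0
    return needed_kb <= usable_kb


def check_node(fragment_paths, query_path, meminfo_path="/proc/meminfo"):
    """Check the files given against the RAM of the node this runs on."""
    with open(meminfo_path) as handle:
        meminfo = parse_meminfo(handle.read())
    fragment_bytes = sum(os.path.getsize(p) for p in fragment_paths)
    query_bytes = os.path.getsize(query_path)
    return fits_in_ram(fragment_bytes, query_bytes, meminfo)
```

Calling check_node() with the files of one database fragment and the
query file on a compute node gives a quick yes/no. It says nothing
about other jobs competing for the cache, which is why the vmstat rows
are still the more useful evidence.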
--
George Magklaras
Senior Computer Systems Engineer/UNIX Systems Administrator
EMBnet Technical Management Board
The Biotechnology Centre of Oslo,
University of Oslo
http://www.biotek.uio.no/
EMBnet Norway: http://www.no.embnet.org/