[Bioclusters] mpiblast and its database

Thu, 15 May 2003 08:31:04 -0700 (PDT)

Hi Jeremy,

I have two questions for you.
(1) After you ran mpiformatdb -N 44, how many fragments
did you get: 44 or 45?
(2) When you run mpiblast, how many CPUs or nodes do you
specify? (That is, what number do you use for xx in
mpirun -np xx?)
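
To make question (1) concrete, here is the arithmetic I am assuming
lies behind -N 44. This is only a rough sketch in Python: the function
names are made up for illustration, and I am guessing that "1 dual
master" in your message means 2 extra CPUs on top of the 42.

```python
def fragment_count(worker_cpus: int, master_cpus: int) -> int:
    """Fragments to request if every CPU, master included, gets one.

    Assumption: one database fragment per CPU, as the quoted message
    seems to describe. mpiblast itself may divide the work differently.
    """
    return worker_cpus + master_cpus


def mpiformatdb_command(n_fragments: int) -> str:
    # Only the -N flag appears in this thread; the database name and
    # any other arguments are deliberately left out.
    return f"mpiformatdb -N {n_fragments}"


n = fragment_count(worker_cpus=42, master_cpus=2)
print(mpiformatdb_command(n))
```

If the formatter actually produced 45 fragments rather than the 44
computed here, that would change how I pick the -np value.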

Thanks,

-Yong Liu

--- Jeremy Mann <jeremy@bioc09.v19.uthscsa.edu> wrote:
> 
> Questions have come up on this list about mpiblast and how it copies
> database segments to individual nodes. Here, we have a 20-node,
> dual-CPU Beowulf. Previously, I was using mpiformatdb -N 20. Then it
> hit me: even though I start lamboot with cpus=2 for each node, in
> reality blast only runs one instance on each node. Now I get one
> segment per node. But what happens when I start another job? Maybe
> mpiblast doesn't use the same CPU for this next job, so it copies
> another segment, and so forth...
> 
> So this morning, I ran mpiformatdb -N 44 (42 cpus + 1 dual master, as
> Jason Gans suggested) for protein nr. Then I ran the test I used
> previously: a simple protein nr sequence which takes about 3 mins on
> one node. I started top on various nodes, and ls in the local
> storage. The very first time I ran it, of course, the segments (this
> time 2 per node) were copied to local storage. Speed wasn't great
> (nfsd was the problem), but it was obviously faster than one node:
> total time for the first run was 1 min, 23 secs.
> 
> With the same top and ls windows open, I ran it again. This time, no
> copying and each node still had its segment. Total time for the
> second run was 5.9 seconds!
> 
> Then I thought, "Ok, maybe it caches the sequence so it didn't need
> to copy segments anymore." So I took a different nr protein sequence
> and ran this in mpiblast. To my surprise, there was no copying, and
> each node still had its segments. Total time for the new sequence
> was 5.9 seconds.
> 
> I can only assume after this little experiment that formatting a db
> to the exact number of nodes/cpus is the key.
> 
> Now, on to my next problem: keeping the segments in local storage so
> another user doesn't have to go through the copy process.
> 
> I'll keep you all posted.
> 
> -- 
> Jeremy Mann
> jeremy@biochem.uthscsa.edu
> 
> University of Texas Health Science Center
> Bioinformatics Core Facility
> http://www.bioinformatics.uthscsa.edu
> Phone: (210) 567-2672
> 
> 
> _______________________________________________
> Bioclusters maillist  -  Bioclusters@bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters
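
Putting rough numbers on the timings quoted above, here is a small
Python sketch of the speedups they imply. The single-node figure is
given only as "about 3 mins" in the message, so every ratio here is
approximate, and the variable names are just for illustration.

```python
def speedup(baseline_s: float, run_s: float) -> float:
    """Ratio of the baseline wall-clock time to a given run's time."""
    return baseline_s / run_s


single_node_s = 3 * 60   # "about 3 mins" on one node (approximate)
first_run_s = 60 + 23    # first run, segments copied over NFS
second_run_s = 5.9       # second run, segments already in local storage

print(f"first run vs one node:  {speedup(single_node_s, first_run_s):.1f}x")
print(f"second run vs one node: {speedup(single_node_s, second_run_s):.1f}x")
print(f"second run vs first:    {speedup(first_run_s, second_run_s):.1f}x")
```

The gap between the first and second runs is the cost of copying the
segments, which is why keeping them in local storage matters so much.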
