> How many nodes are you running mpiblast on? Notice that splitting up a
> database via mpiformatdb -N X ... produces X + 1 database fragments. If
> you are running on fewer than X + 2 nodes (X + 1 workers and 1 master),
> then at least one node will have to process more than one database
> fragment.

The cluster is 20 nodes. The primary database is protein nr, so I have
nr.00 through nr.19 in shared storage.

> I was under the impression that mpiblast first checks to see if the
> correct database fragment is present in the local storage directory. If
> you run mpiblast with the --debug flag it should tell you what fragments
> are being staged out to the worker nodes.

I thought so too, but if you run another sequence, that node's segments
get copied as well. For example, our node2 has nr.01, nr.03, nr.08 and
nr.11. I thought only that node's segment, nr.01, would be present, but
it's not.

> Restrict each node to run only one blast job at a time (via PBS, condor,
> etc). Multiple blast jobs running on the same node will compete for
> memory and cpu resources and remove the super linear scaling of
> mpiblast.

We are not at this point yet; we are mainly in the testing stage. So I am
the testing user, being provided samples by another researcher. The
problem is, if I run sequences, I own those copied segments.

-- 
Jeremy Mann
jeremy@biochem.uthscsa.edu

University of Texas Health Science Center
Bioinformatics Core Facility
http://www.bioinformatics.uthscsa.edu
Phone: (210) 567-2672