Since my question seems to have sparked considerable and very useful responses, both on and off the list, I'm going to try to summarize the feedback I've gotten.

A RAM cache of the db will be a big help, but a Linux process can only use 2 or 3GB [1]. So the job may need to be spread across several smaller machines, which is what mpiBLAST is intended for.

mpiBLAST uses NCBI BLAST, and therefore the CPU effects should be proportional between them.

Determining the optimal size of the database per node will be important, but will come down to trial and error.

I'll probably need more nodes, each with less memory, than I had originally anticipated, which will increase the total price :-(

A RAID 0 should help minimize disk I/O, which is suspected to be the next bottleneck.

[1] I've heard 2 and 3 from different responders. No definitive answer yet.

I'm playing email tag with NCBI in hopes of learning more about the 2/3GB memory limit, and what benefits a 64-bit CPU might provide.

This cluster is intended exclusively for BLAST, and will not support on-demand queries.

At present I'm leaning toward a cluster of rackmounts, each with 4GB and dual 2.4GHz Xeons.

Several people have contacted me to suggest alternative suppliers, and I'm eager to hear more such responses. I'm pleased to say all of those responses were made privately, not to the general list.

I'll start with perhaps 4 machines and profile performance against truncated versions of the nr database, keeping an eye out for a serious performance hit as the db size grows. Then I'll establish how many additional machines might be necessary for the full nr collection and anticipated growth.

I'm still not sure if there should be a master node or a cluster of equals.

Since there will be a certain amount of reliance on profiling and benchmarking, shared experiences with tools and techniques would be helpful.
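
As a rough illustration of the sizing question above (how many machines are needed once the per-process memory ceiling is known), here is a minimal Python sketch. The function name, the 10GB database size, and the growth factor are hypothetical placeholders, not measurements. The 2GB and 3GB ceilings are simply the two figures reported by responders. It assumes the database is split evenly across nodes and that each node must hold its whole fragment within what one process can address, which is a simplification; the real numbers would have to come from profiling.

```python
import math


def nodes_needed(db_gb, usable_gb_per_node, growth_factor=1.0):
    """Estimate how many nodes are needed to hold the database in RAM.

    db_gb              -- current database size in GB (hypothetical input)
    usable_gb_per_node -- memory one process can use for the db fragment
    growth_factor      -- multiplier for anticipated growth (1.0 = none)

    Assumes an even split of the database across nodes.
    """
    if db_gb <= 0 or usable_gb_per_node <= 0 or growth_factor <= 0:
        raise ValueError("all sizes and factors must be positive")
    projected_gb = db_gb * growth_factor
    # Round up: a partial fragment still needs a whole node.
    return math.ceil(projected_gb / usable_gb_per_node)


if __name__ == "__main__":
    db_gb = 10  # placeholder size, not the real size of nr
    for limit_gb in (2, 3):  # the two ceilings reported by responders
        for growth in (1.0, 1.5):
            count = nodes_needed(db_gb, limit_gb, growth)
            print(f"limit={limit_gb}GB growth={growth}: {count} nodes")
```

Comparing the 2GB and 3GB rows shows how much the unresolved memory limit affects the node count, and so the price.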