Let me preface my answer with a disclaimer: I'm not an expert on caching in the various NFS implementations.

That said, if you were to set the local storage path in the mpiBLAST configuration file to a directory on shared storage, mpiBLAST should be able to work without any local storage. In that case, database fragments would be cached both locally on each node (in its buffer cache) and on the server.

The current implementation of mpiBLAST assigns database fragments to workers based on which fragments each worker has in its 'local storage' directory (in this case the shared NFS directory). Because all workers will appear to have all fragments available 'locally', the master will have no preference for assigning the same fragment to the same node during consecutive executions of mpiBLAST. If your worker nodes are dedicated and have enough RAM to cache the entire database, there won't be a problem. Otherwise the master may assign a worker database fragments that it doesn't have cached locally, and even if your NFS server has them cached there will be some performance impact due to the latency of accessing data over the network.

Of course, this situation could be remedied by slightly modifying the mpiBLAST scheduler algorithm to store some persistent state information about which nodes have most recently searched each fragment, so that the workers' buffer caches are exploited effectively.

In designing mpiBLAST we opted to copy fragments to local storage because in practice it significantly reduces the cost of a buffer-cache miss. Reading a block from local storage is much faster (in both latency and bandwidth) than reading it from the average NFS server, and it eliminates the potential server contention that arises when several nodes make requests to the NFS server simultaneously.

-Aaron

On Mon, 2 Feb 2004, Joydeep Sen Sarma wrote:

> Hi folks,
>
> I work on file systems and am doing some research into
> NFS issues when running Blast.
> I have read a number of posts on the bioclusters mailing
> list regarding usage of local disks being better.
>
> However, after reading the mpiBlast white paper, I
> got the impression that mpiBlast would avoid nfs read
> io after the server caches are warmed up. (Of course
> as long as the data fits in the server memory pool).
>
> So I guess I am a little curious as to whether people
> still feel nfs is not suited for mpiBlast and if so,
> why? Do you have multiple databases against which
> searches are performed (so that the cache is purged
> periodically?). Or does the database not fit into the
> combined memory of a typical cluster?
>
> thanks in advance for your response,
>
> Joydeep
>
> _______________________________________________
> Bioclusters maillist - Bioclusters@bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters
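
To make the scheduler change suggested in the reply more concrete (persisting which node most recently searched each fragment, so that the same fragment tends to go back to the node whose buffer cache may still hold it), here is a minimal Python sketch. It is not mpiBLAST code and does not reflect how the mpiBLAST master is actually implemented; the function names, the JSON state file, and the fragment and node names are all hypothetical, chosen only for illustration.

```python
import json
import os


def load_affinity(path):
    """Return the {fragment: worker} map saved by a previous run, or {}."""
    if not os.path.exists(path):
        return {}
    with open(path) as fh:
        return json.load(fh)


def save_affinity(path, affinity):
    """Persist the {fragment: worker} map so the next run can reuse it."""
    with open(path, "w") as fh:
        json.dump(affinity, fh)


def assign_fragments(fragments, workers, affinity):
    """Assign every fragment to exactly one worker.

    Hypothetical policy: first give each fragment back to the worker that
    searched it last time (if that worker still exists and is below its
    fair share), then spread whatever is left over the least-loaded workers.
    """
    load = {w: 0 for w in workers}
    cap = -(-len(fragments) // len(workers))  # ceiling of fragments/workers
    assignment = {}

    # Pass 1: honor remembered placements while the worker is under its cap.
    for frag in fragments:
        preferred = affinity.get(frag)
        if preferred in load and load[preferred] < cap:
            assignment[frag] = preferred
            load[preferred] += 1

    # Pass 2: place the remaining fragments on the least-loaded workers.
    for frag in fragments:
        if frag not in assignment:
            worker = min(workers, key=lambda w: load[w])
            assignment[frag] = worker
            load[worker] += 1

    return assignment
```

In this sketch a master would call `load_affinity` at startup, pass the result to `assign_fragments`, and call `save_affinity` with the returned assignment when the search finishes. The per-worker cap is an assumption added here so that remembered placements cannot pile every fragment onto one node; a real scheduler would also have to account for nodes of different speeds and for fragments being handed out dynamically.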