Michael Gutteridge <mgutteri <at> fhcrc.org> writes:

> I'm getting ready to roll out another cluster that will be running a
> PVM based application. I've run into what I think is a bit of an issue
> with regards to the PVM daemon startup, NFS mounted home directories,
> and scalability with large node-sets.
>
> We've got 62 nodes that will each host 6-10 pvmd's. The problem I'm
> foreseeing is that when the PVM is started, that'll initiate a login to
> each of those 62 nodes, a login process which will mount the user's
> home directory from an NFS server. So that could mean quite a few
> mounts (62 * 10+ unique users), which I suspect would have a
> detrimental effect on the NFS server. Either way, I feel this is an
> approach that won't scale.
>
> For this application, each of the slave pvmd's doesn't need access to
> anything, really, as all of the necessary data is passed to the slave
> via a PVM message. Doesn't even really need to run as any particular
> user, just needs to be able to spawn a slave PVM process that can
> connect to the master.
>
> I don't believe this problem to be specific to PVM, but could be an
> issue with any parallel machine using large node sets. I'm curious as
> to strategies anyone else has used to mitigate the problem I've
> described, especially for circumstances such as this, where the slave
> nodes are merely compute donors.

Just mount /home statically rather than automounting individual home
directories, and you should be fine. Each node then holds a single NFS
mount no matter how many users log in.

You could also run the pvmd's as a user that neither has nor needs an
NFS-mounted home and uses local scratch instead.

Lastly, you could port to MPICH on a BProc system like Scyld and get
rid of the pvmd's altogether.

Michael
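
To put rough numbers on the static-mount suggestion, here is a back-of-the-envelope sketch in Python. It assumes the worst case, where every user's home ends up mounted on every node, and it treats the "10+" users from the quoted message as exactly 10. The function names are made up for illustration.

```python
# Rough mount-count comparison; the figures come from the quoted message.
NODES = 62
USERS = 10  # "10+ unique users" in the original; 10 is a lower bound


def automount_mounts(nodes: int, users: int) -> int:
    """Worst case: every user's home gets mounted on every node."""
    return nodes * users


def static_mounts(nodes: int) -> int:
    """One /home mount per node, independent of the number of users."""
    return nodes


if __name__ == "__main__":
    print("automounted homes:", automount_mounts(NODES, USERS))
    print("static /home:     ", static_mounts(NODES))
```

Under those assumptions the server sees at most 620 per-user mounts with automounting versus 62 with a static /home, and the static figure does not grow as users are added.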