Glen,

Thanks for the info. As it happens, I've just discovered what the problem is with ssh. I configured the cluster while it was installed inside the company network. Subsequently we moved into a DMZ with highly restrictive iptables rules. The cluster has no access to DNS, but I did configure nameservers for the frontend/compute nodes, so name lookups hang until they time out. If I ssh to nodes directly by IP address the connection completes quickly (likewise if I do "ssh -4 c0-0"). So it seems I must either relax the DMZ firewall rules or turn off DNS lookups.

I also discovered that mpirun spawns the tasks via ssh, but intertask communication doesn't appear to use ssh (maybe I'm not correct here -- one of my Linpack jobs at a small problem size made only two ssh connections during its run: one to spawn the job, and another at the end).

Anyway, it appears I'd made an unwarranted conclusion that the ssh slowness is directly related to my HPL weirdness. They MAY be related, but they may not be...

Thanks,
Bill

On Fri, 2003-08-29 at 11:31, Glen Otero wrote:
> Bill-
>
> You've come across two common slowdowns on the cluster: 1) the use of
> ssh, and 2) mpirun using ssh to start jobs. This is a real pain, and
> something I will change in the future, but it takes a whole
> reorganization of MPI. One thing I can recommend trying is to add rsh
> functionality to the cluster so jobs will launch faster. Instructions
> for doing so are in the Rocks User's Guide.
>
> WRT HPL, I'm not surprised by the weirdness you're seeing; I hear
> about it all the time. But I'm not an expert at tuning HPL, so I
> can't offer any hints. If I come across anything, I'll let you know.
>
> Glen

-- 
Bill Barnard <bill at barnard-engineering.com>
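
P.S. In case it helps anyone else on the list, here's a sketch of the "turn off DNS lookups" workaround I'm trying. I haven't verified which OpenSSH release the nodes are running, so treat the keyword as an assumption: current OpenSSH sshd supports "UseDNS", while some older releases have only the related "VerifyReverseMapping" setting.

```
# /etc/ssh/sshd_config on each node -- tell sshd not to do reverse
# DNS lookups on connecting clients (the lookups are what hang when
# the configured nameservers are unreachable from the DMZ):
UseDNS no

# Then restart sshd on the node for the change to take effect.

# An alternative I'm considering: remove the "nameserver" lines from
# /etc/resolv.conf on the nodes, so the resolver fails immediately
# (falling back to /etc/hosts) instead of waiting for a timeout.
```

This doesn't fix mpirun's per-job ssh startup cost, of course; it just removes the DNS timeout from each connection.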