Hi Dan,

The root user's home directory is not shared cluster-wide on an iNquiry cluster. If you are running your MPI example out of /private/var/root/, this is probably why it is failing.

Getting passwordless SSH to work for non-root users is generally a very simple operation. Drop a line to support at bioteam.net and you'll get a faster response than on the iNquiry users mailing list.

-Chris

Dan Swan wrote:
> Hi all,
>
> I originally posted this to the iNquiry list but I'm not sure people
> are using it that heavily. I'm new to Xserve/iNquiry; my background is
> Linux/Condor.
>
> I wish to run the MPI version of MrBayes. I followed the MPICH
> install instructions from:
>
> http://bioteam.net/faq/index.php?sid=&lang=en&action=artikel&cat=4&id=33&artlang
>
> which are very straightforward. The problem is that when I execute the
> cpi test program with mpirun and -np > 1, mpirun hangs for about 5 minutes
> before spitting back:
>
> -su-2.05b# /usr/local/mpich-1.2.6/ch_p4/bin/mpirun -np 2
> /usr/local/mpich-1.2.6/ch_p4/examples/cpi
> p0_3247: p4_error: Timeout in making connection to remote process on
> node01.cluster.private: 0
> Killed by signal 2.
>
> My machines.* file looks like this:
>
> node01.cluster.private
> node02.cluster.private
> node03.cluster.private
> node04.cluster.private
> node05.cluster.private
> node06.cluster.private
> node07.cluster.private
> node08.cluster.private
> node09.cluster.private
> node10.cluster.private
> node11.cluster.private:2
> node12.cluster.private:2
> node13.cluster.private:2
> node14.cluster.private:2
> node15.cluster.private
> node16.cluster.private:2
>
> If I start mpirun and log into node01, a ps -ax shows:
>
> 1162 ?? Ss 0:00.01 /usr/local/mpich-1.2.6/ch_p4/examples/cpi
> biocluster
>
> so the job appears to arrive (at least on one node). I'm a bit stumped.
>
> Googling for that error, people suggest it might be to do with ssh
> requiring passwordless authentication (although I am currently doing this
> as root and am not sure this is the issue yet). But I started looking
> into it and turned up another problem :/
>
> Passwordless ssh works fine for the root user from the head node to the
> cluster nodes, but not for any other user on the machine. And no matter
> how I try, I cannot get passwordless authentication working for a non-root
> user on the cluster.
>
> Is there some MacOS X voodoo I am not aware of? I cannot ssh to the
> cluster nodes as a non-root user, so I assume there is some "deeper"
> authentication issue here? (Passwordless ssh is normally trivial to set
> up.) User home directories are being exported from the head node to the
> nodes, although a root user on the node seems unable to ls -l any user's
> ~/.ssh/. Any hints gratefully appreciated.
>
> Cheers!
>
> Dan
>

--
Chris Dagdigian, <dag at sonsorol.org>
BioTeam - Independent life science IT & informatics consulting
Office: 617-665-6088, Mobile: 617-877-5498, Fax: 425-699-0193
PGP KeyID: 83D4310E iChat/AIM: bioteamdag Web: http://bioteam.net
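
One way to test the passwordless-SSH theory raised in this thread is to probe every host in the machines file as the user who will launch mpirun. The sketch below is an illustration only and not something from the original exchange. The helper names (parse_machines, ssh_probe_command, probe) are hypothetical. It assumes Python 3.7 or later is available on the head node, and that the machines file uses the host[:count] layout shown in Dan's message, with # starting a comment.

```python
import subprocess
import sys


def parse_machines(text):
    """Parse machines-file text into (host, slots) pairs.

    Assumes one entry per line in the form host or host:count,
    with blank lines and '#' comments ignored.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        host, sep, count = line.partition(":")
        entries.append((host, int(count) if sep else 1))
    return entries


def ssh_probe_command(host):
    """Build an ssh command that fails rather than prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",      # never fall back to interactive prompts
        "-o", "ConnectTimeout=5",   # give up quickly on unreachable nodes
        host,
        "true",                     # trivial remote command
    ]


def probe(host):
    """Return True if a non-interactive ssh login to host succeeds."""
    result = subprocess.run(ssh_probe_command(host), capture_output=True)
    return result.returncode == 0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python3 probe_nodes.py <machines file>
    with open(sys.argv[1]) as handle:
        for host, slots in parse_machines(handle.read()):
            status = "ok" if probe(host) else "FAILED"
            print(f"{host} (slots={slots}): {status}")
```

Running it once as root and once as an ordinary user should show whether the failure is specific to non-root accounts. Any host reported as FAILED for the launching user is a likely candidate for the p4_error connection timeout.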