I'm working with a cluster that shows unexplained high load values on the portal (hovering between 1 and 2 with the system sitting idle). It's a 32-node, 64-CPU Opteron cluster running SUSE with a 2.6 kernel.

When I turn off Ganglia's gmond daemon on the portal, the load drops down to an ordinary rest state (0.1-ish). After some debugging to isolate the behavior, there's clearly a causal link between gmond on the portal and these high loads. Gmond does not appear to be taking very much CPU time, doesn't show up near the top of "top", and otherwise doesn't seem to be the real problem.

If I turn off the gmond processes on all of the cluster nodes, the load drops some, but not all the way to a rest state. The system is sluggish when the load reports high, but not as sluggish as I would expect from those numbers.

Has anyone seen this before? It's more annoying than anything else. I'm tempted to blame "something in the kernel" and "multicast," but I would love to have a more robust explanation.

-Chris Dwan
 The BioTeam
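
P.S. One way to narrow down a load that does not match CPU usage: on Linux, the load average counts tasks in uninterruptible sleep (state D) as well as runnable ones (state R), so a process can raise the load without appearing in "top". The following is only a minimal sketch of how that could be checked, assuming the standard Linux /proc layout. The function names are hypothetical, and it reads only each process's main thread, so threads listed under /proc/<pid>/task are not covered.

```python
import os


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    fields = text.split()
    return tuple(float(value) for value in fields[:3])


def parse_stat(text):
    """Return (pid, command, state) from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so split on the first '(' and the last ')'.
    start = text.index("(")
    end = text.rindex(")")
    pid = int(text[:start])
    command = text[start + 1:end]
    state = text[end + 1:].split()[0]
    return pid, command, state


def load_contributors(proc="/proc"):
    """List (pid, command, state) for processes in state R or D."""
    found = []
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc, entry, "stat")) as handle:
                pid, command, state = parse_stat(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited during the scan, or entry unreadable
        if state in ("R", "D"):
            found.append((pid, command, state))
    return found


def main():
    # Assumption: a Linux-style /proc is mounted; do nothing otherwise.
    if not os.path.exists("/proc/loadavg"):
        return
    with open("/proc/loadavg") as handle:
        print("load averages:", parse_loadavg(handle.read()))
    for pid, command, state in load_contributors():
        print(pid, command, state)


if __name__ == "__main__":
    main()
```

Running this a few times while the load sits between 1 and 2 would show whether gmond, or something else, keeps turning up in state D. A single snapshot can easily miss short-lived states, so repeated sampling is probably needed before drawing any conclusion.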