Hi Bonnie:

Without heat sink compound (I use Arctic Silver III), my 2000+ runs in the 50-60C range, which is outside of spec. With the compound and the same fan, it stays in the low 40C region. You need good thermal contact for the cooling to work. Also, look into heat spreaders (about 25-50 cents each) for Athlons. They are small metal templates that help spread the heat out, which helps with the cooling.

Machines dropping could be due to heat or to bad power (not the power supplies themselves, but overloaded electrical outlets or poor power conditioning). Try running the ganglia monitor with the appropriate heat sensor enabled, as well as the voltage sensors.

Also, look into running Memtest86 (see http://www.memtest86.org). It is one of the better memory testers. I have had more problems with substandard memory on Athlons than I care to admit. Athlons require high-quality memory; the cheap stuff might or might not work. Memtest86 running without errors for a day is a good sign.

Joe

On Sun, 2002-12-29 at 21:49, BHurwitz@twt.com wrote:
> Hi William,
>
> Thank you so much. Our power supply in each node is Lemacs (model
> P1M6400P). I will look into the power supply and see if that might be the
> culprit.
>
> PS. The fans seem to be making good contact with the CPU's. But, we
> haven't tried a heat sink compound yet. Maybe that will be our next step?
>
> -Bonnie
>
> William Park <opengeometry@yahoo.ca>
> Sent by: bioclusters-admin@bioinformatics.org
> 12/29/2002 08:26 PM
> Please respond to bioclusters
>
> To: bioclusters@bioinformatics.org
> cc:
> Subject: Re: [Bioclusters] problems with AMD Athlon 2000+ MP
>
> On Sun, Dec 29, 2002 at 08:01:02PM -0600, BHurwitz@twt.com wrote:
> > Hello,
> >
> > We have a 20 CPU cluster which has AMD Athlon 2000+ MP processors. The
> > cluster is relatively new and almost every week we lose one or two nodes
> > due to the processors failing. I think the issue may be heat related, so
> > we tried separating the nodes, increasing the airflow with a fan, and
> > dropping the temp in the server room. None of these attempts have helped.
> > We have also tried to replace the fans. Has anyone ever had a similar
> > problem? How did you resolve it? Are Pentium III more stable?
> >
> > Thank you in advance!
> > Bonnie
>
> - Check that the CPU fans are making good contact with the CPUs. Use
>   heatsink compound, if you have to.
>
> - I think your culprit may be the power supply. A good case and power
>   supply cannot be over-emphasized. I recommend Antec or Enlight.
>
> - Intel runs a lot cooler. P3 is more stable, but doesn't have the
>   crunching power of AMD 2000+. If you have the money, go for
>   quad-Xeon.
>
> --
> William Park, Open Geometry Consulting, <opengeometry@yahoo.ca>
> Linux solution for data management and processing.
> _______________________________________________
> Bioclusters maillist - Bioclusters@bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/bioclusters

--
Joseph Landman, Ph.D
Scalable Informatics LLC
email: landman@scalableinformatics.com
web: http://scalableinformatics.com
phone: +1 734 612 4615
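
P.S. As a rough illustration of the "watch the heat sensors" advice above, here is a minimal Python sketch that flags hot sensors on a node. It is not part of ganglia and not what anyone in this thread ran. It assumes a Linux kernel that exposes sensors under /sys/class/hwmon, where temp*_input files hold millidegrees Celsius. The function names and the 50C limit are hypothetical; the limit is simply borrowed from the 50-60C "outside of spec" remark above, not from an AMD datasheet.

```python
from pathlib import Path

# Assumption: warn at 50C, taken from the "50-60C is outside of spec"
# remark in this thread. Check the datasheet for your actual CPU.
WARN_CELSIUS = 50.0


def read_temps(root="/sys/class/hwmon"):
    """Return {sensor_file: degrees C} for every temp*_input file under root.

    Assumes the Linux hwmon sysfs layout, where each temp*_input file
    contains an integer number of millidegrees Celsius.
    """
    temps = {}
    for sensor in sorted(Path(root).glob("hwmon*/temp*_input")):
        try:
            temps[str(sensor)] = int(sensor.read_text().strip()) / 1000.0
        except (OSError, ValueError):
            # Unreadable or malformed sensor file: skip it.
            continue
    return temps


def too_hot(temps, limit=WARN_CELSIUS):
    """Return only the sensors reading at or above the limit."""
    return {name: t for name, t in temps.items() if t >= limit}
```

Calling too_hot(read_temps()) from cron on each node and mailing any non-empty result would give a crude early warning. A real cluster monitor would also track the voltage sensors mentioned above.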