This is all great feedback. Thanks.

As you have stated, data access and network bandwidth issues are definitely a difficult problem to solve, with no silver bullet in sight (fiber or at least gigabit would be a good start, though). I certainly understand the idea of just building a dedicated cluster and calling it a day if that gives the results back in the time needed. It certainly minimizes the amount of management housekeeping that needs to be done. As stated, some problems may not be well suited to being distributed to 100 administrative assistants' desktops.

However, for the problems that do work well with the cycle-stealing solutions that y'all are using:

1) Is the management such a royal pain in the rear that ultimately you say screw it?
2) Are the political issues of harvesting from a couple hundred or a couple thousand machines a nightmare? If so, where do the security issues rank?

Thanks again.

Message: 2
From: Chris Dwan <cdwan@mail.ahc.umn.edu>
Subject: Re: [Bioclusters] RE: non linear scale-up issues?
Date: Tue, 11 May 2004 18:56:46 -0500
To: bioclusters@bioinformatics.org
Reply-To: bioclusters@bioinformatics.org

> Slinging a terabyte or two of traffic over the same worm-rotten,
> occasionally-managed corporate network that handles things like
> payroll, HR, business apps etc. just to get some CPU cycles from a
> bunch of cheap $900 desktop CPUs can be, um... problematic.

I agree with this completely. I try to treat data motion as an "out of band" problem which is completely decoupled from the CPU scheduling and access problem.

I have found that we can get good use out of those $900 desktops provided that I'm allowed to reserve 20GB (or so) for my target set and that I can populate that 20GB with my target data via cron / rsync / whatever on an automatic basis. All the scheduler really needs to know is whether or not the data is already on a particular node.

This comes back to a very old saw indeed: Not all problems are suited to parallel computing.
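As a rough illustration of the point above (the scheduler only needs to know whether the target data is already on a node), here is a minimal sketch. Everything in it is hypothetical and not taken from any real scheduler: the `Node` class, the `eligible_nodes` function, the host names, and the dataset name `target_db` are made up for the example. It assumes some out-of-band job (cron / rsync / whatever) has already staged the data and recorded which nodes hold it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical record of one desktop in the cycle-stealing pool."""
    name: str
    idle: bool
    # Datasets already staged on local disk by an out-of-band sync job.
    datasets: set = field(default_factory=set)


def eligible_nodes(nodes, dataset):
    """Return idle nodes that already hold `dataset` locally.

    No data is moved here: staging is assumed to happen separately,
    so scheduling reduces to a simple membership check.
    """
    return [n for n in nodes if n.idle and dataset in n.datasets]


# Hypothetical pool: only desk-01 is both idle and already populated.
nodes = [
    Node("desk-01", idle=True, datasets={"target_db"}),
    Node("desk-02", idle=True),
    Node("desk-03", idle=False, datasets={"target_db"}),
]

print([n.name for n in eligible_nodes(nodes, "target_db")])  # ['desk-01']
```

The design choice this reflects is that the job dispatcher never triggers a transfer itself; a node without the data is simply skipped until the sync job catches up.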
-Chris Dwan

--__--__--

_______________________________________________
Bioclusters maillist - Bioclusters@bioinformatics.org
https://bioinformatics.org/mailman/listinfo/bioclusters

End of Bioclusters Digest