[Bioclusters] Anyone interested in clustering transmeta
cpus?
chris dagdigian
dag@sonsorol.org
Sat, 25 Aug 2001 11:46:31 -0400
At 08:56 PM 8/24/01 -0400, Ivo Grosse wrote:
>... and even 1 GB per node might often be too small to run useful BLAST
>jobs.
Blackstone and others have tackled this problem by breaking up the databases
into pieces that are small enough to be dynamically shipped peer-to-peer
style around the network and cached in local RAM. This was a huge priority
project at our company back when DRAM prices were very, very high :) (Memory
used to account for 50% of the total cost of some of the servers we bought).
Now that memory is pretty darn cheap it is not as beneficial except in
cases where you are forced to deal with low-memory hardware like the RLX
blades. The process to do this is not that difficult assuming you can get
the statistics correct when you merge your split result sets back together.
This approach is not suitable for scientists doing one-off searches
against many databases...it works best when you know that you have to do
many queries against a large db.
It does work though: we did some blazing-fast searching on nodes with 256 MB
RAM using this approach.
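To make the "get the statistics correct" step concrete, here is a rough sketch of one way the merge could look. It is not Blackstone's code. It assumes each fragment search reports E-values computed against that fragment's size only, and that E-values scale linearly with database length, so a hit's E-value against the whole database is roughly the fragment E-value times (total letters / fragment letters). Real BLAST also applies edge-effect length corrections, which this ignores. The names Hit, rescale_evalue and merge_fragment_hits are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    subject_id: str
    bit_score: float
    evalue: float  # E-value as reported against a single database fragment


def rescale_evalue(evalue, fragment_letters, total_letters):
    """Scale a per-fragment E-value up to the full database size.

    Assumption: E-values grow linearly with database length
    (edge-effect corrections are ignored in this sketch).
    """
    return evalue * (total_letters / fragment_letters)


def merge_fragment_hits(hits_by_fragment, fragment_sizes, evalue_cutoff=10.0):
    """Merge hits from all fragments into one list ranked by corrected E-value.

    hits_by_fragment: dict mapping fragment name -> list of Hit
    fragment_sizes:   dict mapping fragment name -> fragment size in letters
    """
    total_letters = sum(fragment_sizes.values())
    merged = []
    for fragment, hits in hits_by_fragment.items():
        for hit in hits:
            corrected = rescale_evalue(
                hit.evalue, fragment_sizes[fragment], total_letters
            )
            # Apply the cutoff after correction, not before.
            if corrected <= evalue_cutoff:
                merged.append(Hit(hit.subject_id, hit.bit_score, corrected))
    # Best (smallest) E-value first; break ties on the higher bit score.
    merged.sort(key=lambda h: (h.evalue, -h.bit_score))
    return merged
```

One consequence: each fragment search has to be run with a looser cutoff than you finally want, otherwise hits that would still pass after rescaling never show up in the per-fragment output.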
> > o No possibility of a PCI slot; this rules out Myrinet and other high
> > speed interconnect technologies
>
>... is a fast communication between the nodes really important in
>bioinformatics applications, which are typically embarrassingly parallel?
No for bioinformatics; yes for other life science areas.
There is no need for high speed interconnect for bioinformatics and
sequence analysis. As you said most of those apps are embarrassingly
parallel and most in fact are rate limited by things like RAM and disk I/O.
Once you start having researchers who want to do computational chemistry,
molecular modeling, QSAR and virtual screening then you start to see more
and more emphasis on parallel code. Some PVM stuff but more and more
commercial applications are coming out as MPI-aware. It also seems that
many of the scientific software developers in these fields are deciding to
start with MPI and parallelism. Vertex Pharmaceuticals is an example of
this case; they just replaced their 128-node SGI system with a 112-CPU
Myrinet-enabled Linux cluster. The system is expected to go to 300+ CPUs
within a year and they are pretty much going to use it entirely for
proprietary parallel code that their researchers have cooked up in-house.
-Chris