[Bioclusters] cluster hardware question

Joe Landman bioclusters@bioinformatics.org
16 Jan 2003 11:15:19 -0500


On Thu, 2003-01-16 at 11:00, Jeff Layton wrote:

>    Any comments on the size of "typical" databases? (pick whatever

You can go to bio-mirror.net and see some of them.  The Gene-Expression
databases can be huge or small, depending upon what data is stored (raw
vs processed, image vs histograms, etc.)  The now redundant nr protein
database is rapidly approaching 1M sequences and 1 GB in size.  The nt
database is approaching 10 GB (currently ~5-6 GB).  Others vary in size.  Single
chromosome assembly databases are in the mid 10s of MB.  These are
usually fully assembled, so you need to look at some sort of
pre-segmentation if you are going to use them in a BLAST-like manner.
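
As a rough sketch of what pre-segmentation could look like: cut the
assembled sequence into overlapping windows so that a hit spanning a
window boundary is still contained whole in one window.  The function
name, chunk size and overlap below are illustrative assumptions, not
values taken from BLAST or any particular tool.

```python
def segment_sequence(seq, chunk_size=100_000, overlap=1_000):
    """Yield (start_offset, piece) windows that cover seq.

    Consecutive windows share `overlap` characters. Both defaults are
    hypothetical; pick values that suit your query lengths.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    if not seq:
        return
    step = chunk_size - overlap
    start = 0
    while True:
        yield start, seq[start:start + chunk_size]
        # Stop once the current window reaches the end of the sequence.
        if start + chunk_size >= len(seq):
            break
        start += step


if __name__ == "__main__":
    toy = "ACGT" * 25  # 100 bases standing in for a chromosome
    for offset, piece in segment_sequence(toy, chunk_size=40, overlap=10):
        print(offset, len(piece))
```

Keeping the start offset with each piece lets you map a hit in a window
back to coordinates on the full assembly.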

> you want for "typical"). This shows my ignorance of Bio codes.
>    However, I've been looking at using the extra memory on the latest
> Xeon motherboards as a RAM-disk. For instance some of the Supermicro
> boards can handle up to 16 Gig for a dual CPU (32 Gig for a Quad).

Neat idea.

> If you assume that you are running on one instance of your app per
> CPU and that you can only address 3.5 Gig of memory per CPU, then
> that leaves you with around 8 Gigs to play with (giving a generous
> 1 Gig for the OS). While RAM is expensive compared to disk, this
> idea is also much faster than disk. Would 8 Gigs be enough for some,
> many, lots of people?

Some.  I think the PAE stuff lets you address up to 64 GB, but I don't
know how well it works.  I have heard it slows things down a bit.  Using
it as a RAM disk for swap space (a la the old Crays with the SSD) could
be interesting.
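
The arithmetic in the quoted estimate can be written out as a small
sketch.  The 3.5 GB per process and 1 GB for the OS come from the
question above; the database sizes are the rough figures mentioned
earlier in this message (nr near 1 GB, nt taken as 6 GB), and the
function names are made up for illustration.

```python
def ramdisk_budget_gb(total_ram_gb, cpus, per_process_gb=3.5, os_gb=1.0):
    """RAM left for a RAM disk after one process per CPU and the OS."""
    return total_ram_gb - cpus * per_process_gb - os_gb


def databases_that_fit(budget_gb, db_sizes_gb):
    """Return the names of databases no larger than the budget."""
    return sorted(name for name, size in db_sizes_gb.items() if size <= budget_gb)


if __name__ == "__main__":
    # Approximate sizes in GB, assumed from the figures in this thread.
    db_sizes_gb = {"nr": 1.0, "nt": 6.0}
    for label, ram, cpus in (("dual", 16, 2), ("quad", 32, 4)):
        budget = ramdisk_budget_gb(ram, cpus)
        print(label, budget, databases_that_fit(budget, db_sizes_gb))
```

Note that this only checks each database on its own; holding several at
once means summing their sizes against the same budget.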

> 
> Thanks!
> 
> Jeff
> 
> 
> 
> --
> 
> Jeff Layton
> Senior Engineer
> Lockheed-Martin Aeronautical Company - Marietta
> Aerodynamics & CFD
> 
> "Is it possible to overclock a cattle prod?" - Irv Mullins
-- 
Joseph Landman, Ph.D
Scalable Informatics LLC,
email: landman@scalableinformatics.com
web  : http://scalableinformatics.com
phone: +1 734 612 4615