[Bioclusters] Xserve G5 memory

Tue, 05 Oct 2004 15:12:32 -0500

[I just finished composing this as Joe's post showed up]
Under most modern operating systems, BLAST databases automatically get 
cached in memory by the OS as they are searched.  The operating system 
typically uses all available memory (unused by currently active 
applications) to cache frequently used data from files on disk.  The 
memory cache for data stored on disk is referred to as a "buffer 
cache".  Different operating systems have different policies regarding 
which data gets stored in the buffer cache, but the common theme is that 
they all attempt to cache the most frequently used (MFU) data in the 
buffer cache.
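
One rough way to check whether the buffer cache is helping on a given machine is to read the same file twice and compare wall-clock times. The sketch below is only an illustration: the function names are made up for this example, and it reads a file sequentially rather than doing what blastall does during a search.

```python
import sys
import time


def timed_read(path, chunk_size=1 << 20):
    """Read the whole file sequentially; return (seconds, bytes_read)."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return time.perf_counter() - start, total


def compare_passes(path):
    """Read the file twice and return the size and the timing of each pass."""
    first_seconds, size = timed_read(path)
    second_seconds, _ = timed_read(path)
    return {"bytes": size, "first": first_seconds, "second": second_seconds}


if __name__ == "__main__" and len(sys.argv) > 1:
    # The path is supplied by the user, e.g. one of the formatted DB files.
    result = compare_passes(sys.argv[1])
    print("bytes read : %d" % result["bytes"])
    print("first pass : %.3f s" % result["first"])
    print("second pass: %.3f s" % result["second"])
```

If the second pass is much faster than the first, the OS is serving the file from memory. The first pass is only a true cold read if the file has not been touched recently (for example, right after a reboot), and a file larger than free memory may not stay cached between passes.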

To the best of my knowledge, NCBI's blastall code uses memory-mapped I/O 
whenever possible.  Although I am not aware of any, there may be some 
Darwin-specific limitation that prevents memory-mapped files from 
getting added to the buffer cache.  This may be an avenue for further 
investigation.
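
For anyone who wants to poke at this, here is a minimal sketch of reading a file through a memory mapping. It is not NCBI's code; touch_pages is a hypothetical helper that maps a file read-only and touches one byte per page, so each page has to be faulted in from disk or from the cache. Timing two consecutive calls on the same file would show whether mapped pages stay resident on a particular OS.

```python
import mmap
import os


def touch_pages(path):
    """Map the file read-only and touch one byte per page.

    Returns the number of pages touched.
    """
    size = os.path.getsize(path)
    if size == 0:
        # An empty file cannot be mapped, so there is nothing to touch.
        return 0
    touched = 0
    with open(path, "rb") as handle:
        # Length 0 maps the whole file.
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as view:
            for offset in range(0, size, mmap.PAGESIZE):
                _ = view[offset]  # forces the page to be resident
                touched += 1
    return touched
```

If the second call is no faster than the first on the Xserve but clearly faster on the other machines, that would point toward the mapped pages not being retained.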

A quick hack to ensure the DB stays in memory would be to create a 
RAM disk and copy the DB onto it.  I think OS X supports RAM disks, 
though I have not verified this.  Unfortunately, the memory consumed by 
the RAM disk can't be used for anything else when you aren't running 
BLAST.
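
As a sketch of what the RAM disk setup might look like, the helper below only builds the command lines and does not run anything. It assumes the hdiutil and diskutil tools as found on later OS X releases, where a ram:// URL takes a size in 512-byte sectors; I have not checked this on the Xserve in question. The function name, the volume name "blastdb", and the "/dev/diskN" placeholder are all made up for illustration; the real device name is whatever the first command prints.

```python
SECTOR_BYTES = 512  # assumption: ram:// sizes are given in 512-byte sectors


def ramdisk_commands(size_mb, volume_name="blastdb", device="/dev/diskN"):
    """Return the two command lines for creating and formatting a RAM disk.

    'device' is a placeholder; substitute the device that the first
    command reports when it is actually run.
    """
    sectors = size_mb * 1024 * 1024 // SECTOR_BYTES
    return [
        ["hdiutil", "attach", "-nomount", "ram://%d" % sectors],
        ["diskutil", "erasevolume", "HFS+", volume_name, device],
    ]


if __name__ == "__main__":
    # Print the commands for a 1 GB RAM disk so they can be reviewed first.
    for command in ramdisk_commands(1024):
        print(" ".join(command))
```

Once the volume is mounted, the formatted DB files would be copied onto it and blastall pointed at that location.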

-Aaron

Victor M.Ruotti wrote:

> Hi Juan,
> How exactly do you hold your databases in memory? Do you do it through 
> programming? It may help to describe how exactly this is done. I am 
> also curious to know how you do it.
>
> Victor
> On Oct 5, 2004, at 12:09 PM, Juan Carlos Perin wrote:
>
>> I have been running benchmarks with blastall on several different
>> machines.  We've come to realize that one of the biggest differences
>> affecting search times is how the machines actually maintain the
>> search databases in memory.
>>
>> E.g., on our IBM 8-way machine, the databases are held in memory,
>> which seems to be an effect of the architecture of the machine, and
>> search times become incredibly fast after an initial run, which
>> stores the database in memory.  The same effect seems to take place
>> on our Dual Xeon Dell (PE 1650), which also outpaces the Xserves
>> significantly after an initial run to populate the db in memory.
>>
>> It would appear that the Xserves dump the db from memory after each
>> search, even when submitting batch jobs with multiple sequences in a
>> file.  Is anyone aware of how this functions, and how this effect
>> might be changed to allow the db to stay in memory longer?  Thanks
>>
>> Juan Perin
>> Child. Hospital of Philadelphia
>>
>> _______________________________________________
>> Bioclusters maillist  -  Bioclusters@bioinformatics.org
>> https://bioinformatics.org/mailman/listinfo/bioclusters