[BiO BB] My Protein Sequence analysis tool is taking a lot of time to complete a single database similarity search

Fri Mar 12 04:57:32 EST 2004

Is this 10 ms the true execution time, or just the
measured time between call and return?
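
One way to tell the two apart from inside Java is to record wall-clock
time and per-thread CPU time around the same call. The following is only
a minimal sketch: `AlignmentTiming`, `dummyAlignment` and `measure` are
hypothetical names, the loop is a stand-in for the real alignment call,
and it assumes a JVM that provides `java.lang.management` (Java 5 or later).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class AlignmentTiming {
    // Written to so the JIT cannot discard the dummy work.
    static volatile long sink;

    // Hypothetical stand-in for one pairwise alignment; replace with the real call.
    static void dummyAlignment() {
        long acc = 0;
        for (int i = 0; i < 2_000_000; i++) {
            acc += i ^ (acc >>> 3);
        }
        sink = acc;
    }

    /** Returns {wallNanos, cpuNanos}; cpuNanos is -1 if the JVM cannot measure it. */
    static long[] measure(Runnable work) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = threads.isCurrentThreadCpuTimeSupported();
        long cpuStart = cpuSupported ? threads.getCurrentThreadCpuTime() : -1;
        long wallStart = System.nanoTime();

        work.run();

        long wallNanos = System.nanoTime() - wallStart;
        long cpuNanos = cpuSupported
                ? threads.getCurrentThreadCpuTime() - cpuStart
                : -1;
        return new long[] { wallNanos, cpuNanos };
    }

    public static void main(String[] args) {
        long[] t = measure(AlignmentTiming::dummyAlignment);
        System.out.printf("wall: %.3f ms, cpu: %.3f ms%n", t[0] / 1e6, t[1] / 1e6);
    }
}
```

If the wall-clock figure is much larger than the CPU figure, the thread
spent most of the call waiting (on disk, the database, or the scheduler)
rather than computing.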

~2 GHz means that the CPU executes instructions on a
nanosecond time scale. So any single operation
that takes 1 ms has either done quite a lot of work,
or it was just waiting around for a while.
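
To put rough numbers on this, the sketch below uses the figures from the
original post (1.9 GHz, about 10 ms per alignment, 2,83,366 = 283,366
sequence entries). It assumes one alignment per entry on a single thread
and treats a clock cycle as a crude proxy for an instruction; the class
and method names are made up for illustration.

```java
public class SearchTimeEstimate {
    /** Clock cycles available during one alignment. */
    static double cyclesPerAlignment(double clockHz, double alignmentSeconds) {
        return clockHz * alignmentSeconds;
    }

    /** Total time if every sequence costs one alignment, run back to back. */
    static double totalSeconds(long sequences, double alignmentSeconds) {
        return sequences * alignmentSeconds;
    }

    public static void main(String[] args) {
        double clockHz = 1.9e9;        // 1.9 GHz, from the original post
        double perAlignment = 0.010;   // about 10 ms, from the original post
        long sequences = 283_366L;     // 2,83,366 entries, from the original post

        System.out.printf("cycles per alignment: %.0f%n",
                cyclesPerAlignment(clockHz, perAlignment));
        System.out.printf("estimated total: %.1f minutes%n",
                totalSeconds(sequences, perAlignment) / 60.0);
    }
}
```

That works out to about 19 million cycles per alignment, and to roughly
47 minutes for the whole database if the 10 ms figure held for every
entry. Under these assumptions, a 9 hour run would suggest that most of
the time goes somewhere other than the alignment itself.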

It might be that your machine is swapping and spending its
time on paging (disk I/O /IS/ slow). The standard
solution for this is to put more RAM in the machine.

You mention that 2 threads run faster than 4. This might
just be a coincidence (dependent on current load, whether you
use lightweight threads, available system resources, etc.),
but it might also suggest that you are low on memory and the
operating system is forced to swap memory pages when doing
context switches. (This can be true even for lightweight threads
if they use a lot of private memory for data.)

But you say you are using Java. And for me that explains it all. :)

In my experience (from watching Java processes execute), Java is
really bad at handling garbage collection when it gets
stressed. This is because the decision on when
to do garbage collection is left to the machine, and so
memory might not be released at the most opportune
time. That might very well be your case. You can
figure this out by watching the memory allocation/deallocation
statistics of your application.
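
A minimal sketch of how such statistics could be read from inside the
application, assuming a JVM with `java.lang.management` (Java 5 or
later). `GcWatch` and its methods are hypothetical names, and the loop in
`main` only creates throwaway garbage to give the collector something to do.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcWatch {
    /** Heap currently in use, in bytes. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Number of collections so far, summed over all collectors. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long n = gc.getCollectionCount();   // -1 if undefined for this collector
            if (n > 0) total += n;
        }
        return total;
    }

    /** Approximate time spent collecting so far, in milliseconds. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long ms = gc.getCollectionTime();   // -1 if undefined for this collector
            if (ms > 0) total += ms;
        }
        return total;
    }

    static void report(String label) {
        System.out.printf("%s: used heap %d KB, %d collections, %d ms in GC%n",
                label, usedHeapBytes() / 1024, totalGcCount(), totalGcMillis());
    }

    public static void main(String[] args) {
        report("before");
        for (int i = 0; i < 200_000; i++) {
            byte[] junk = new byte[1024];       // short-lived garbage
            junk[0] = 1;
        }
        report("after");
    }
}
```

Calling something like `report` before and after a batch of alignments
shows whether GC time grows along with the run time. Starting the JVM
with `-verbose:gc` logs each collection without any code changes.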

In any case, you may be able to handle this in two ways: write
your application in C++ and take control of the memory handling,
or just buy enough memory so the Java app never runs
out of memory.

On Thu, 2004-03-11 at 10:23, prathibha bharathi wrote:
> Hai all,
>  
>          My protein sequence analysis tool is taking a lot of time to
> complete a single request for database similarity search. My database
> is a relational database in MySQL which contains 16 tables and
> 2,83,366 sequence entries.
>  
> My Sequence analysis tool is currently running on a Local intranet
> server with 1.9GHz processor and 256MB RAM.
>  
> For a single pairwise alignment it takes around 10 ms, depending
> on the length of the query sequence, and it was taking more than 24 hours to
> complete a single request with 4 threads working on 4 partitions. By
> keeping only 2 threads alive at a time, working on 2 partitions (I
> partitioned my database into 8 based on sequence checksum), it now
> takes around 9 hours to complete a single request for database
> similarity search.
>  
> Is it really possible to reduce the time further with a hardware
> configuration of 1.9 GHz and 256 MB RAM,
> or do I have to go for a more powerful hardware configuration?
> I am now using MySQL database server and Apache HTTP server with the JRun
> application server. Do I have to go for a more powerful application server
> than JRun?
> My implementation platform is Java, and the algorithm being used is the
> "SMITH-WATERMAN LOCAL ALIGNMENT" algorithm.
>                   Thanking You,
>                                                           Prathibha.