[Bioclusters] BLAST job time estimates

Lucas Carey bioclusters@bioinformatics.org
Tue, 8 Jun 2004 16:05:36 -0400


If you're randomly generating your queries, you will indeed find that runtime isn't strongly affected by the query. Do you also find that you don't get many long matches? The most time-consuming part of the BLAST algorithm is the hit extension, so a query that produces few hits leaves little extension work to do. For example, if you run blastp against nr with two equal-length queries, one being the E. coli RNA polymerase sigma factor (few long matches) and the other an equal-length segment of the E. coli HSP70 protein (many long matches), I'm willing to bet that the HSP70 query will take longer to complete.
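
A rough way to check this is to time the two runs directly. The sketch below is only an illustration and makes several assumptions: that the BLAST+ blastp executable is on the PATH (the legacy blastall binary takes -p blastp -d -i instead), that a local nr database is configured, and that the helper names time_command, blastp_command and compare_queries, like the FASTA file names in the usage note, are made up for this example.

```python
import os
import subprocess
import sys
import time


def time_command(cmd):
    """Run cmd (a list of arguments) and return its wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def blastp_command(query_fasta, db="nr"):
    """Build a BLAST+ blastp command line (assumed install); results are discarded."""
    return ["blastp", "-query", query_fasta, "-db", db, "-out", os.devnull]


def compare_queries(query_files, db="nr", repeats=2):
    """Time each query several times and keep the fastest run.

    The first run against a database often pays for reading it from disk,
    so the minimum over a few repeats is a fairer comparison.
    """
    timings = {}
    for path in query_files:
        cmd = blastp_command(path, db)
        timings[path] = min(time_command(cmd) for _ in range(repeats))
    return timings
```

Calling compare_queries(["sigma.fa", "hsp70_segment.fa"]) with two hypothetical equal-length query files would return a dictionary of wall times in seconds, one per query, which can then be compared against the timings for randomly generated queries of the same length.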
-Lucas

On Tuesday, June 08, 2004 at 09:57 +0100, Micha Bayer wrote:
> Since I wrote the original message I have run a few test runs myself and
> I have actually found that the size of the target database is a much
> stronger predictor of the wall time the job takes than the query size.
> Times seem pretty consistent across different length queries run against
> the same target (I randomly generate my test queries now).
> 
> cheers
> Micha