Chris,

You might be interested in what we are working on --
http://www.lifeformulae.com

Pam

>From: Chris Dwan <cdwan at bioteam.net>
>Reply-To: "Clustering, compute farming & distributed computing in life
>science informatics" <bioclusters at bioinformatics.org>
>To: "Clustering, compute farming & distributed computing in life science
>informatics" <bioclusters at bioinformatics.org>
>Subject: Re: [Bioclusters] sensitivity & blast
>Date: Wed, 6 Apr 2005 16:58:36 -0400
>
>
>BLAST is not a black box, and its function need not be determined by
>experiment:
>
>- An excellent reference on the algorithm:
>http://www.ncbi.nlm.nih.gov/BLAST/tutorial/Altschul-1.html
>- The source code: ftp://ftp.ncbi.nih.gov/toolbox/ncbi_tools/ncbi.tar.Z
>- O'Reilly published an entire book on BLAST, whose author is active on
>this list.
>
>Yes, the search space defaults to the product of the query length (m) and
>the target set length (n). The -Y option overrides that search space.
>
>Alignment Score depends only on the alignments and the substitution matrix.
>Bit score normalizes for values specific to the substitution matrix.
>Expect value normalizes out query and target set size.
>
>Keep in mind as well: BLAST is an heuristic algorithm with no knowledge of
>any structure beyond primary sequence. If increased sensitivity is the
>goal, you will get much greater mileage by using an algorithm which takes
>structure into account, or one which utilizes more than pairwise
>alignments.
>
>However, taken very literally, your answer is correct. If the goal is to
>remove query length as a factor in E value, the "-Y" option is the way to
>go.
>
>-Chris Dwan
> The BioTeam
>
>On Apr 6, 2005, at 4:39 PM, Pamela Culpepper wrote:
>
>>orks as follows.
>>In the absence of -Y, the "effective search space" is the product of the
>>query sequence length and the total database length. It affects the
>>calculation of the expectation value but not the score.
>>It will thus vary with the query sequence length.
>>Using "-Y 12345" sets the above "effective search space" to 12345,
>>constant for each query sequence. To make the
>
>_______________________________________________
>Bioclusters maillist - Bioclusters at bioinformatics.org
>https://bioinformatics.org/mailman/listinfo/bioclusters
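
For reference, the relationship discussed in the quoted thread can be sketched in a few lines of Python. This is a minimal sketch, assuming the standard Karlin-Altschul form E = (search space) * 2^(-bit score) described in the Altschul tutorial linked above. The function names, the 40-bit score, and the database length are illustrative assumptions and not BLAST's own code; real BLAST also applies length corrections for edge effects, which are ignored here.

```python
import math


def bit_score(raw_score, lam, k):
    """Normalize a raw alignment score into bits.

    lam and k are the statistical parameters of the scoring system
    (substitution matrix plus gap costs); callers must supply them.
    """
    return (lam * raw_score - math.log(k)) / math.log(2)


def default_search_space(query_len, db_len):
    """Default search space: query length (m) times target set length (n)."""
    return query_len * db_len


def expect_value(bits, search_space):
    """Expect value for a given bit score and search space size."""
    return search_space * 2.0 ** (-bits)


# Illustrative numbers only (assumptions, not taken from the thread).
bits = 40.0
db_len = 1_000_000
fixed_space = 12345  # the value used with -Y in the quoted example

for query_len in (100, 1000):
    e_default = expect_value(bits, default_search_space(query_len, db_len))
    e_fixed = expect_value(bits, fixed_space)
    print(f"m={query_len}: default E={e_default:.3g}, fixed-space E={e_fixed:.3g}")
```

With the default search space, the same bit score gives an E value that grows in proportion to the query length; with a fixed search space, the E value is the same for every query, which is the effect attributed to -Y in the thread.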