[ssml] error from BLASTCLUST

Dan Bolser dmb at mrc-dunn.cam.ac.uk
Thu Sep 2 19:03:42 EDT 2004


This is a simple problem that is often encountered when beginning to use
blastclust.

The problem arises because blastclust applies low-complexity sequence
filtering by default, and the filtered sequence it creates on the fly
(which you may never actually see) can be completely masked by the filter,
i.e. the whole sequence is replaced by X's.

This prevents anything useful being done with the sequence and leads to
the error seen (I am not sure of the exact process that leads to the
error in these cases).
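To make the 'whole sequence replaced by X's' case concrete, here is a
rough sketch in Python. It is NOT the filter that blastclust actually
runs; the windowed-entropy rule, the window size and the threshold are
all assumptions chosen only for illustration, and the function names are
made up for this example.

```python
import math
from collections import Counter


def window_entropy(window: str) -> float:
    """Shannon entropy (bits per residue) of a short window."""
    n = len(window)
    counts = Counter(window)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def crude_mask(seq: str, window: int = 12, threshold: float = 2.2) -> str:
    """Replace every residue covered by a low-entropy window with 'X'.

    Illustrative stand-in only: the real filter uses a different algorithm,
    and window/threshold here are arbitrary assumptions.
    """
    seq = seq.upper()
    if len(seq) < window:
        return seq  # too short to evaluate; leave unmasked
    masked = list(seq)
    for start in range(len(seq) - window + 1):
        if window_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                masked[i] = "X"
    return "".join(masked)


def fully_masked(seq: str) -> bool:
    """True if nothing but X's survives the crude mask."""
    masked = crude_mask(seq)
    return len(masked) > 0 and set(masked) == {"X"}
```

A sequence such as a long poly-Q run comes back as all X's, which leaves
nothing to compute statistics on; running something like this over your
input can help you spot which entries are likely to trigger the error.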

The short answer is that you can safely ignore these errors, or you can
switch off low-complexity filtering at the risk of picking up a few
seemingly significant matches (matches over low-complexity regions are
probably not as unlikely as the random-sequence approximation makes them
seem).

Sorry that isn't a very clear description...

The best thing to do is understand low-complexity sequences (very simple
sequence repeats) and why / how those are filtered.

The standard filter program for low-complexity regions in protein
sequences is SEG (DUST is the nucleotide counterpart). Coiled-coil
sequences, which can also be highly repetitive, are removed by a separate
filter.

A random sequence is maximally complex. A continuous repeat of one
character is minimally complex.
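One simple way to put a number on 'complexity' is the Shannon entropy of
the residue composition. This is only a sketch of the idea and is not
the exact measure used by the filter programs; the function name is made
up for this example.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy of the residue composition, in bits per residue.

    0.0 means a single repeated character (minimally complex); the maximum,
    log2(alphabet size), is reached when all residues are equally frequent.
    """
    if not seq:
        return 0.0
    n = len(seq)
    counts = Counter(seq.upper())
    return sum((c / n) * math.log2(n / c) for c in counts.values())
```

For example, shannon_entropy("AAAAAAAA") is 0.0, while
shannon_entropy("ACGTACGT") is 2.0, the maximum for a four-letter
alphabet. Note that this only looks at composition, so an ordered repeat
like ACGTACGT scores as high as a shuffled sequence with the same counts.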


On Thu, 2 Sep 2004, Manoj Tyagi wrote:

>Thanks for the reply to my BLAST query.
>This time I have another query about BLASTCLUST which I am trying to use to
>cluster my dataset. In the documentation it says by default it uses BLOSUM
>with gap penalties etc.
>Now I want to use default options so I just simply give my dataset as input
>file & give output file names.
>Problem is it throws warning & error saying
>"[NULL_Caption] WARNING: SetUpBlastSearch failed.
>[NULL_Caption] ERROR: BLASTSetUpSearch: Unable to calculate Karlin-Altschul params, check query sequence"
>it means it didn't find lambda & K values in precomputed tables so giving
>warning & errors. normally it should be there anyway I can provide that va[lue]
>the question is HOW? in BLASTCLUST there is no option for providing these
>Could you help me out, what to do in this case. & why it is giving error?
>Quoting Kevin Karplus <karplus at soe.ucsc.edu>:
>> The matrix is not the whole set of parameterization for BLAST.
>> There are also the gap costs and the lambda and K values used for
>> computing E-values.
>> Changing the matrix without correcting the other parameters leads to
>> uninterpretable results.
>> Kevin Karplus    karplus at soe.ucsc.edu    http://www.soe.ucsc.edu/~karplus
>> Senior member, IEEE    Board of Directors, ISCB (starting Jan 2005)
>> Professor of Biomolecular Engineering, University of California, Santa Cruz
>> Undergraduate and Graduate Director, Bioinformatics
>> Affiliations for identification only.
> Manoj TYAGI
> Laboratoire de Biochimie et Génétique Moléculaire
> Université de La Réunion
> BP 7151, 15 avenue René Cassin
> 97715 Saint Denis Messag Cedex 09
> La Réunion
> Tel : +262 262 938641
> Fax : +262 262 938237