[ssml] Accurate searches in nucleotide databases

Kevin Karplus karplus at soe.ucsc.edu
Tue Oct 25 13:24:08 EDT 2005


I have forwarded your request to the genome browser people at UCSC,
since it seems like their sort of problem.

If I were faced with the problem, I would probably do a two-step search:
	1) prefilter with blast using a very permissive E-value cutoff (like 1000)
	   (probably taking the union of searches with each of the
	    known examples)
	2) run the output of the prefilter through SAM's hmmscore,
	   with an HMM built from the known examples, and with the
	   dbsize set to the number of sequences in the original database
	   (not the smaller prefiltered set), so that the reported E-values
	   reflect the full search.
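
The bookkeeping around these two steps could be sketched roughly as follows.
This is only an illustration, not a tested pipeline. It assumes the blast
runs were saved in the standard 12-column tab-separated tabular format
(subject ID in column 2, E-value in column 11), and it assumes E-values scale
linearly with the number of sequences searched. The function names
union_of_hits and rescale_evalue are made up for this example and are not
part of blast or SAM.

```python
def union_of_hits(tabular_outputs, max_evalue=1000.0):
    """Return the union of subject IDs over several BLAST tabular reports.

    Each element of tabular_outputs is the text of one report (one search
    per known example). Assumes the 12-column tab-separated layout.
    """
    hits = set()
    for report in tabular_outputs:
        for line in report.splitlines():
            line = line.strip()
            # Skip blank lines and comment lines.
            if not line or line.startswith("#"):
                continue
            fields = line.split("\t")
            if len(fields) < 12:
                continue  # not a hit line in the assumed layout
            subject_id = fields[1]
            evalue = float(fields[10])
            if evalue <= max_evalue:
                hits.add(subject_id)
    return hits


def rescale_evalue(evalue, scored_count, full_db_count):
    """Rescale an E-value computed over scored_count sequences so that it
    refers to a database of full_db_count sequences.

    Assumption: the E-value is proportional to the database size.
    """
    return evalue * full_db_count / scored_count
```

The set returned by union_of_hits is what would be extracted from the
database and handed to hmmscore. rescale_evalue only shows why dbsize
matters: if hmmscore were left to assume that the prefiltered set was the
whole database, its E-values would be too optimistic by the ratio of the
two sizes.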

The genome browser folks may very well have a better answer for you.

Kevin Karplus
