If you are searching for an exact match to a 5-mer, the approximate-match tools are a poor choice. You're much better off just reading the flat file and scanning each sequence with a standard string-match algorithm. The built-in regular-expression search in Perl would probably do, and should be reasonably fast for a single search. The BioPerl wrappers for reading the files should make this a pretty trivial program to write, though they might make things a little too slow for heavy-duty use. If you need more speed, you could write a C or C++ program to do the I/O and use the GNU regular-expression package to do the searching.

If you have many different 5-mers to search for, you could build an index, listing for each 5-mer all the sequences that contain it. Building the index would take only one pass over the data and would allow very fast lookup. Again, one could build a prototype quickly in Perl and reimplement it in a faster language if that turns out to be necessary.

Kevin Karplus
karplus at soe.ucsc.edu
http://www.soe.ucsc.edu/~karplus
life member (LAB, Adventure Cycling, American Youth Hostels)
Effective Cycling Instructor #218-ck (lapsed)
Professor of Biomolecular Engineering, University of California, Santa Cruz
Undergraduate and Graduate Director, Bioinformatics
Affiliations for identification only.
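
For illustration, here is a rough sketch of the 5-mer index idea described in the message above. It is written in Python rather than the Perl the message suggests, and it is only one way the index might look, not the author's code. The names `build_kmer_index` and `lookup`, the `(seq_id, sequence)` input format, and the toy sequences are all assumptions made for the example; reading the actual flat file is left out.

```python
from collections import defaultdict

K = 5  # k-mer length; 5 matches the 5-mers discussed in the message


def build_kmer_index(records, k=K):
    """Map each k-mer to the set of sequence IDs that contain it.

    `records` is any iterable of (seq_id, sequence) pairs. How they are
    read from the flat file is not shown (hypothetical input format).
    """
    index = defaultdict(set)
    for seq_id, seq in records:
        seq = seq.upper()
        # Slide a window of width k along the sequence; each record is
        # visited once, so the whole index is built in a single pass.
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(seq_id)
    return index


def lookup(index, kmer):
    """Return the sorted IDs of sequences containing `kmer` (empty if none)."""
    return sorted(index.get(kmer.upper(), ()))


# Toy data, made up for illustration only.
records = [
    ("seq1", "ACGTACGTTT"),
    ("seq2", "TTTTTACGTA"),
    ("seq3", "GGGGGGGG"),
]
index = build_kmer_index(records)
hits = lookup(index, "ACGTA")  # IDs of the sequences that contain ACGTA
```

Each lookup is then a single dictionary access instead of a scan over all the data. The trade-off is memory: the index holds one entry per distinct 5-mer plus the ID sets, so for a large database it may need to live on disk rather than in a dictionary.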