How about turning the filter off with the -F F option (this disables
low-complexity filtering of the query)? Doing this should give you exact
matches.

On 4/6/07, Phil Princely <phil.princely at gmail.com> wrote:
>
> Hi all,
>
> I'm working on a script to compare all genes in a genome against a
> full sequence in a blast database. Both have around 2000 genes. My
> script takes the test genome, extracts one amino acid sequence and
> runs it through blast. It then filters the output to grab only the
> name of the gene with the best match and the similarity (in percent).
> For example, from these lines:
>
> >Contig 165-147: 171558..172979 (reverse), 474 amino acids
>  Identities = 471/473 (99%), Positives = 471/473 (99%)
>
> it grabs the text Contig 165-147 and the percent 99%.
>
> My problem comes when sequences have a lower similarity, and blast
> uses only a section of the input gene. For example:
>
> >Contig 158-62: 61482..62750 (direct), 423 amino acids
>  Identities = 15/46 (32%), Positives = 27/46 (58%), Gaps = 2/46 (4%)
>
> Here, it has only used 46 of the amino acids, where the full gene
> sequence has 347.
>
> Is there a way I can force blast to use the full 347 amino acids for
> comparison? The researchers in my lab are most interested in places
> with low similarities, since they are trying to find the portions
> which make this organism virulent.
>
> Thanks again
>
> Phil P.
> _______________________________________________
> Biodevelopers mailing list
> Biodevelopers at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/biodevelopers
>
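
On the parsing side, here is a rough sketch (not a tested part of anyone's
pipeline) of how the script could flag partial matches like the 15/46 case
quoted above. It pulls the hit name and the Identities fraction out of the
text report and relates the alignment length to the full query length. The
function name summarize_hit and the query_len argument are made up for
illustration; the query length (347 here) has to come from your own input
sequence, and the regular expressions assume the report lines look exactly
like the two examples in the original message.

```python
import re

# Hit header, e.g. ">Contig 158-62: 61482..62750 (direct), 423 amino acids"
HEADER_RE = re.compile(r"^>(.+?):", re.MULTILINE)
# Identities line, e.g. "Identities = 15/46 (32%), ..."
IDENT_RE = re.compile(r"Identities\s*=\s*(\d+)/(\d+)\s*\((\d+)%\)")


def summarize_hit(report, query_len):
    """Return name, percent identity and approximate query coverage.

    query_len is the length of the full query protein (an input you
    supply yourself; it is not read from the report here).
    Returns None if the expected lines are not found.
    """
    header = HEADER_RE.search(report)
    ident = IDENT_RE.search(report)
    if header is None or ident is None:
        return None
    matches, align_len, pct = (int(g) for g in ident.groups())
    return {
        "name": header.group(1).strip(),
        "percent_identity": pct,
        "align_len": align_len,
        # Alignment length counts gap columns too, so this is only an
        # approximate (upper-bound) fraction of the query that was aligned.
        "coverage": align_len / query_len,
        # Identical residues relative to the whole query sequence.
        "identity_over_query": matches / query_len,
    }


example = """>Contig 158-62: 61482..62750 (direct), 423 amino acids
 Identities = 15/46 (32%), Positives = 27/46 (58%), Gaps = 2/46 (4%)
"""

hit = summarize_hit(example, query_len=347)
print(hit["name"], hit["percent_identity"], round(hit["coverage"], 2))
```

A hit with low coverage could then be reported separately from a hit with
low identity over most of the gene, which may be closer to what the
researchers want to look at.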