Hi Phil,

It sounds like you want a global search, not a local one like BLAST.
Try 'needle' from the EMBOSS package; I think that may work for you.

Marty

On 4/9/07, Phil Princely <phil.princely at gmail.com> wrote:
> Hi all,
>
> I'm working on a script to compare all genes in a genome against a
> full sequence in a blast database. Both have around 2000 genes. My
> script takes the test genome, extracts one amino acid sequence and
> runs it through blast. It then filters the output to grab only the
> name of the gene with the best match and the similarity (in percent).
> For example, from these lines:
>
> >Contig 165-147: 171558..172979 (reverse), 474 amino acids
> Identities = 471/473 (99%), Positives = 471/473 (99%)
>
> it grabs the text Contig 165-147 and the percent 99%.
>
> My problem comes when sequences have a lower similarity, and blast
> uses only a section of the input gene. For example:
>
> >Contig 158-62: 61482..62750 (direct), 423 amino acids
> Identities = 15/46 (32%), Positives = 27/46 (58%), Gaps = 2/46 (4%)
>
> Here, it's only used 46 of the amino acids, where the full gene
> sequence has 347.
>
> Is there a way I can force blast to use the full 347 amino acids for
> comparison? The researchers in my lab are most interested in places
> with low similarities, since they are trying to find the portions
> which make this organism virulent.
>
> Thanks again
>
> Phil P.
> _______________________________________________
> Biodevelopers mailing list
> Biodevelopers at bioinformatics.org
> https://bioinformatics.org/mailman/listinfo/biodevelopers
>

--
Martin Gollery
Associate Director
Center For Bioinformatics
University of Nevada at Reno
Dept. of Biochemistry / MS334
775-784-7042
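
As a rough illustration of the problem described in the quoted message, here is a minimal Python sketch that reads the two BLAST report lines shown above and reports how much of the query the local alignment actually covered. It is not a global alignment and not the script from the thread; the function name `summarize_hit`, the dictionary keys, and the idea of passing the query length in by hand are all assumptions made for this example. A hit with low coverage could then be re-aligned with a global tool such as needle.

```python
import re

# Subject header, e.g. ">Contig 158-62: 61482..62750 (direct), 423 amino acids"
HEADER_RE = re.compile(r"^>\s*(?P<name>[^:]+):")
# Identity line, e.g. "Identities = 15/46 (32%), Positives = ..."
IDENT_RE = re.compile(r"Identities\s*=\s*(?P<ident>\d+)/(?P<aligned>\d+)")


def summarize_hit(report_lines, query_length):
    """Return name, local identity and approximate query coverage of the first hit.

    `query_length` is the full length of the query protein (hypothetical
    parameter; the caller must supply it, e.g. 347 in the example above).
    """
    name = None
    for line in report_lines:
        line = line.strip()
        header = HEADER_RE.match(line)
        if header:
            name = header.group("name").strip()
            continue
        ident = IDENT_RE.search(line)
        if ident and name is not None:
            identical = int(ident.group("ident"))
            aligned = int(ident.group("aligned"))  # alignment columns, gaps included
            return {
                "name": name,
                # Identity over the aligned region only (what BLAST prints).
                "local_identity_pct": 100.0 * identical / aligned,
                # Approximate share of the query covered by the alignment.
                "query_coverage_pct": 100.0 * aligned / query_length,
                # Identical residues relative to the whole query.
                "full_length_identity_pct": 100.0 * identical / query_length,
            }
    return None


example = [
    ">Contig 158-62: 61482..62750 (direct), 423 amino acids",
    "Identities = 15/46 (32%), Positives = 27/46 (58%), Gaps = 2/46 (4%)",
]
hit = summarize_hit(example, query_length=347)
print(hit)
```

Because the aligned length counts gap columns, the coverage figure is only approximate; it is meant as a filter for deciding which genes deserve a full-length comparison, not as a replacement for one.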