[ssml] Finding Matches using N-term & C-term sequences

Tristan Fiedler tfiedler at rsmas.miami.edu
Tue Dec 9 17:16:30 EST 2003

I am interested in finding homologs of a protein I am working on.
However, I have only an N-terminal sequence of about 15 amino acids and 3
internal peptides from tryptic digests.

I have used the default scoring matrices, gap existence & extension
penalties, and word sizes of both the standard NCBI blastp web page and the
'search for short nearly exact matches' page, running the searches through
the blastcl3 client-server interface:

../blastcl3 -p blastp -e 10 -d swissprot -F T -T T -M BLOSUM62 -G 11 -E 1 -W 3
../blastcl3 -p blastp -e 10 -d nr -F T -T T -M BLOSUM62 -G 11 -E 1 -W 3

../blastcl3 -p blastp -e 20000 -d swissprot -F F -T T -M PAM30 -G 9 -E 1 -W 2
../blastcl3 -p blastp -e 20000 -d nr -F F -T T -M PAM30 -G 9 -E 1 -W 2
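
For anyone who wants to repeat these searches over all four peptides, the
two parameter sets above could be assembled by a small script along these
lines. This is only a sketch: the "-i" query-file flag and the FASTA file
names are assumptions (the commands above do not show how the query was
supplied), and nothing is executed, the command lines are only built and
printed.

```python
import shlex

# The two parameter sets quoted above, copied flag for flag.
STANDARD = ["-e", "10", "-F", "T", "-T", "T",
            "-M", "BLOSUM62", "-G", "11", "-E", "1", "-W", "3"]
SHORT = ["-e", "20000", "-F", "F", "-T", "T",
         "-M", "PAM30", "-G", "9", "-E", "1", "-W", "2"]


def build_command(query_file, database, params, client="../blastcl3"):
    """Return one blastcl3 command line as an argument list (not executed)."""
    # Assumption: "-i" names the query file; it does not appear in the
    # commands quoted in this post.
    return [client, "-p", "blastp", "-d", database, *params, "-i", query_file]


def build_all(query_files, databases=("swissprot", "nr")):
    """Build every combination of query file, database and parameter set."""
    commands = []
    for query_file in query_files:
        for params in (STANDARD, SHORT):
            for database in databases:
                commands.append(build_command(query_file, database, params))
    return commands


if __name__ == "__main__":
    # Hypothetical file names, one FASTA file per peptide.
    peptides = ["nterm.fasta", "peptide1.fasta",
                "peptide2.fasta", "peptide3.fasta"]
    for command in build_all(peptides):
        print(shlex.join(command))
```

Printing the commands rather than running them makes it easy to check them
against the ones listed above before anything is sent to the server.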

Although many 'hits' were returned, none had an E-value below 0.1.

What is the threshold for 'significance' with such short peptides?  Is
there a preferred method to find homologs when dealing with these short
peptides?

Tristan J. Fiedler, Ph.D.
Postdoctoral Research Fellow - Walsh Laboratory
NIEHS Marine & Freshwater Biomedical Sciences Center
Rosenstiel School of Marine & Atmospheric Sciences
University of Miami

tfiedler at rsmas.miami.edu
t.fiedler at umiami.edu (alias)
