[ssml] Finding Matches using N-term & C-term sequences

Dan Bolser dmb at mrc-dunn.cam.ac.uk
Wed Dec 10 04:35:18 EST 2003

One thing I would check first is the status of your organism in the database. That
tells you whether you should expect to find an exact hit or not. If not (your organism
has no sequence data), you could use the taxonomic distance between your species and
the species you are hitting to guesstimate how similar you would expect the sequences
to be. A low E-value on a short region from a very different organism is probably a
false positive (unless you find the same hit to the same protein in lots of different
organisms). If you know anything about your protein, it is probably best to check
whether the hits you get 'look right' in a functional sense. You may not get any
significant hits, as the N- and C-terminal regions can vary widely. Just like
betting, there is no way to be sure that the bets you make will come off in your
favour - the odds are just a guideline.
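
To put rough numbers on those odds, here is a minimal sketch of the usual BLAST
relation between bit score and E-value, E = m * n * 2^(-S'), where m is the query
length, n is the total database length in residues and S' is the bit score. The
database size below is a made-up placeholder, not the real size of swissprot or nr,
and the sketch ignores the effective-length (edge-effect) correction that BLAST
applies, which matters for short queries, so treat the output as an
order-of-magnitude guide only.

```python
import math


def expected_hits(bit_score, query_len, db_len):
    """Chance hits expected at this bit score or better: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)


def bits_needed(e_value, query_len, db_len):
    """Bit score required to reach a target E-value (inverse of expected_hits)."""
    return math.log2(query_len * db_len / e_value)


if __name__ == "__main__":
    query_len = 15          # the N-terminal peptide from the original question
    db_len = 50_000_000     # hypothetical database size in residues (assumption)

    for bits in (20, 25, 30, 35, 40):
        e = expected_hits(bits, query_len, db_len)
        print(f"bit score {bits:2d} -> E-value about {e:.3g}")

    target = 0.1
    print(f"bits needed for E <= {target}: "
          f"{bits_needed(target, query_len, db_len):.1f}")
```

A 15-residue query can only score so many bits even on a perfect match, so it is
worth comparing that ceiling with the bits needed before reading much into the
E-values you get back.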


++ Tristan Fiedler--
> I am interested in finding any homologs to a protein I am working on, however, I
> have only an N-terminal sequence of about 15 amino acids, and 3 internal peptides
> from tryptic digests.
> I have used the default scoring matrices, gap existence & extension penalties, and
> word sizes for the NCBI blastp web interface as well as for the 'search short
> nearly exact matches' using the blastcl3 client-server interface :
> ../blastcl3 -p blastp -e 10 -d swissprot -F T -T T -M BLOSUM62 -G 11 -E 1 -W 3
> ../blastcl3 -p blastp -e 10 -d nr -F T -T T -M BLOSUM62 -G 11 -E 1 -W 3
> ../blastcl3 -p blastp -e 20000 -d swissprot -F F -T T -M PAM30 -G 9 -E 1 -W 2
> ../blastcl3 -p blastp -e 20000 -d nr -F F -T T -M PAM30 -G 9 -E 1 -W 2
> Although many 'hits' were returned, none had e-values less than 0.1.
> What is the threshold for 'significance' with such short peptides?  Is there a
> preferred method to find homologs when dealing with these short fragments?
> Cheers,
> Tristan
> --
> Tristan J. Fiedler, Ph.D.
> Postdoctoral Research Fellow - Walsh Laboratory
> NIEHS Marine & Freshwater Biomedical Sciences Center
> Rosenstiel School of Marine & Atmospheric Sciences
> University of Miami
> tfiedler at rsmas.miami.edu
> t.fiedler at umiami.edu (alias)
> 305-361-4626
