[ssml] Unusual amino-acid composition ?
Dan Bolser
dmb at mrc-dunn.cam.ac.uk
Thu Jun 16 11:56:15 EDT 2005
On Thu, 16 Jun 2005, Gerard DVD Kleywegt wrote:
>
>hi all,
>
>we are writing up the structure determination of a dimeric human enzyme. while
>going through the model (~750 residues per monomer), i noticed that the
>protein contains rather few lysines (1.8%) and isoleucines (2.7%), and rather
>many prolines (7.5%) and phenylalanines (6.5%). (if i remember correctly,
>there are no low-complexity regions in the sequence.)
>
>i would be grateful for any clues or literature references that might tell us
>if this is statistically to be expected or unusual and -if the latter- what
>could explain it, and whether or not it might have any significance. also, a
>pointer to a table of the average amino-acid composition of soluble human
>proteins (or enzymes) would be useful.
>
>thanks in advance for any input !
I don't know of any tables or literature offhand (I am sure there are
plenty), but you can quite easily generate the statistics from a
non-redundant set of sequences (for example UniParc).
Use this 'background' set to generate your 'expected' frequency for
each amino acid, then compare this to the 'observed' frequency from
your protein.
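As a rough sketch of that step (not a definitive recipe): the function
below counts residues over a set of sequences and returns per-residue
fractions. The name composition() and the toy sequences are made up for
illustration; in practice the background would be read from a
non-redundant sequence file.

```python
from collections import Counter

# The 20 standard amino acids; anything else (X, B, Z, gaps) is ignored.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(sequences):
    """Return the fraction of each standard amino acid over all sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no standard residues found")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}

# Toy input only: these sequences are invented, not real proteins.
background = ["MKKLIPF", "MAPLKIF"]
expected = composition(background)      # 'expected' frequencies
observed = composition(["MPPFFKL"])     # 'observed' frequencies for one protein
```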
The stats are simply a case of comparing the observed and expected
frequencies to get some measure of 'unusual' (along with a significance).
People often quote a log-likelihood score, derived from the log-odds
ratio of the observed to the expected frequency.
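One way this comparison might look, as a minimal sketch: a log2 odds
ratio per residue, plus a z-score from the normal approximation to the
binomial as a crude significance. The function names are my own, and the
expected frequency of 0.05 used below is a placeholder, not a measured
background value. Only the length (~750) and the lysine fraction (1.8%,
roughly 14 residues) come from the original question.

```python
import math

def log_odds(observed_freq, expected_freq):
    """log2 of observed/expected; negative means depleted, positive enriched."""
    return math.log2(observed_freq / expected_freq)

def binomial_z(count, length, expected_freq):
    """z-score of an observed count under a binomial null model
    (normal approximation; assumes residues are independent)."""
    mean = length * expected_freq
    sd = math.sqrt(length * expected_freq * (1.0 - expected_freq))
    return (count - mean) / sd

def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Placeholder background frequency for lysine: an assumption for illustration.
expected_lys = 0.05
length = 750          # residues per monomer, from the question
lys_count = 14        # about 1.8% of 750

score = log_odds(lys_count / length, expected_lys)
z = binomial_z(lys_count, length, expected_lys)
p = two_sided_p(z)
```

If all 20 residues are tested this way, the p-values should be corrected
for multiple testing, and for small counts an exact binomial test is a
safer choice than the normal approximation.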
It rapidly gets more complicated (technically) when you try to consider
different 'populations' of amino acids, for example surface amino acids
(which are known to have a different distribution from core amino
acids). However, the basic idea is the same.
Dan.
>--gerard
>
>******************************************************************
> Gerard J. Kleywegt
> [Research Fellow of the Royal Swedish Academy of Sciences]
>Dept. of Cell & Molecular Biology University of Uppsala
> Biomedical Centre Box 596
> SE-751 24 Uppsala SWEDEN
>
> http://xray.bmc.uu.se/gerard/ mailto:gerard at xray.bmc.uu.se
>******************************************************************
> The opinions in this message are fictional. Any similarity
> to actual opinions, living or dead, is purely coincidental.
>******************************************************************
>
>_______________________________________________
>ssml-general mailing list
>ssml-general at bioinformatics.org
>https://bioinformatics.org/mailman/listinfo/ssml-general
>