[BioCocoa-dev] Peptides...

Anastassis Perrakis a.perrakis at nki.nl
Sun Mar 20 06:55:48 EST 2005


>> Some comments on the mass calculation. In mass spectrometry a 
>> molecule can only be detected if it has a charge. With most modern MS 
>> equipment the charge of peptides and proteins is obtained by addition 
>> of one or more protons from the solvent. Usually this is denoted as 
>> [M+H]+ or [M+2H]2+. Especially for peptides, the charge in general is 
>> 2+; a peptide of MW 2000, will therefore be observed as [2000 + 2* 
>> protonmass]/2 = 1001. This is referred to as the mass-over-charge
>> ratio (m/z). So when a mass spectrum shows a peptide of 1001, the 
>> peptide can actually have an uncharged mass of 1000 or 2000. Looking 
>> at the spectrum will reveal if a species is 'singly-charged' or 
>> 'doubly-charged'. So for our search program, we need to take into 
>> account what the charge of the peptide is that we are looking at. The 
>> mass of a proton is defined in BCFoundationDefines.h: H_mono and 
>> H_ave. This code probably needs to be added to BCMassCalculator tool 
>> - I will look into that and will start with just protons, but we need 
>> to keep in mind that there are more possibilities, e.g. sodium.
> I already hoped you could give me some more insight, thanks! Again, 
> you have the knowledge to further improve the app a lot, feel free to 
> do so if you're interested... A simple popup button of expected 
> modifications, or at least a matrix for singly or doubly charged, would 
> be easy to add, and the charge could then be compensated for...
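The m/z arithmetic quoted above can be sketched as follows. This is a minimal illustration, assuming protonation only (no sodium or other adducts). The constant PROTON_MASS and the function names are hypothetical; the value is a commonly used proton mass and is not read from BCFoundationDefines.h.

```python
# Assumed proton mass in Da; NOT taken from BCFoundationDefines.h.
PROTON_MASS = 1.007276


def mz_from_mass(neutral_mass, charge):
    """Return the observed m/z of [M + zH]z+ for an uncharged mass M."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def mass_from_mz(mz, charge):
    """Invert mz_from_mass: recover the uncharged mass from m/z and charge."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return charge * (mz - PROTON_MASS)


# The same peak read as singly or doubly charged gives two candidate masses.
candidates = {z: mass_from_mz(1001.007276, z) for z in (1, 2)}
```

With these definitions a peak near 1001 corresponds to an uncharged mass of about 1000 if singly charged and about 2000 if doubly charged, which is the ambiguity described in the quoted text.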

Although all of the above is 100% correct, the application I suggested to 
Alex was to find an already corrected mass in the sequence. Thus the charge 
correction is not needed: the mass typically comes from 8-14 peaks 
and is then corrected.
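A minimal sketch of such a search, assuming the target is an uncharged mass and that each fragment carries one added water (about 18 Da). The function name, the tolerance parameter and the three-residue mass table are hypothetical and only for illustration; this is not the BioCocoa code nor the f77 version.

```python
WATER = 18.010565  # Da, monoisotopic H2O added once per fragment

# Illustrative monoisotopic residue masses for three amino acids only.
RESIDUE_MASSES = {"G": 57.02146, "A": 71.03711, "S": 87.03203}


def find_fragments(sequence, residue_masses, target_mass, tolerance):
    """Return (start, end, mass) for every contiguous fragment whose mass
    is within `tolerance` of `target_mass`. Positions are 1-based, inclusive."""
    masses = [residue_masses[aa] for aa in sequence]
    hits = []
    for start in range(len(masses)):
        total = WATER
        for end in range(start, len(masses)):
            total += masses[end]
            if total > target_mass + tolerance:
                break  # residue masses are positive, so longer is only heavier
            if abs(total - target_mass) <= tolerance:
                hits.append((start + 1, end + 1, total))
    return hits


hits = find_fragments("GASGA", RESIDUE_MASSES, 176.0797, 0.01)
```

Here the only fragment of "GASGA" matching 176.0797 within 0.01 Da is "AS", reported as residues 2 to 3. The early break keeps the inner loop short for small target masses.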

What would of course be cool is to read the scan, find the peaks, 
get the MW and correct it.
They sell such software for 12,000 Euro - believe it or not !
I have an Excel sheet and a little C application using GSL. They took 
30 mins each to make, but for both you need to type in the peaks.
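How the Excel sheet and the C program get the MW from the typed-in peaks is not stated here. One common approach, sketched below under the assumption that the peaks are consecutive protonated charge states of a single species, derives the charge from each pair of neighbouring peaks and averages the resulting mass estimates. The function name and the proton mass constant are hypothetical.

```python
PROTON_MASS = 1.007276  # Da, assumed value


def deconvolute(peaks):
    """Estimate the uncharged mass from m/z peaks of consecutive charge states.

    For neighbouring peaks hi > lo with charges z and z + 1:
        z = (lo - p) / (hi - lo)   and   M = z * (hi - p)
    """
    ordered = sorted(set(peaks), reverse=True)  # highest m/z = lowest charge
    if len(ordered) < 2:
        raise ValueError("need at least two distinct peaks")
    estimates = []
    for hi, lo in zip(ordered, ordered[1:]):
        z = round((lo - PROTON_MASS) / (hi - lo))  # charge of the `hi` peak
        estimates.append(z * (hi - PROTON_MASS))
    # Every peak except the highest-charge one contributes one estimate.
    return sum(estimates) / len(estimates)
```

A real spectrum would need a check that the rounded charges actually increase by one from peak to peak; this sketch trusts the input.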

Identifying peaks is trivial - I just need to get my 3-D code down to 1-D ;-)
I can prototype that in C or f77 and then let Alex slow it down by a 
factor of 50 or so ...
Can be a fun project if I get some time.
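As a rough idea of what 1-D peak picking can look like, here is a minimal sketch that reports strict local maxima above an intensity threshold. It is an assumption about the simplest possible approach, not the 3-D code mentioned above; the function name and the threshold parameter are hypothetical.

```python
def find_peaks(intensities, threshold=0.0):
    """Return indices of points that are strictly higher than both
    neighbours and at least `threshold`. Flat-topped peaks are skipped."""
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        if y >= threshold and y > intensities[i - 1] and y > intensities[i + 1]:
            peaks.append(i)
    return peaks


scan = [0, 1, 5, 1, 0, 2, 8, 3, 0]
strong = find_peaks(scan, threshold=2)
```

For the toy scan above this picks indices 2 and 6. Noisy data would normally be smoothed first, which this sketch leaves out.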

Do you guys have any MS tools for proteins already though ?

>
>> As discussed a while ago, for our BCSequenceView, it might eventually 
>> be better to roll our own view, including a NSLayoutManager, etc. The 
>> code that is currently used is sufficient for displaying a simple 
>> sequence in an NSTextView.
> Absolutely true, I think there are two possibilities here however. 
> First we can further improve the current BCSequenceView quite a bit 
> with a few relatively simple additions to take the spacing into 
> account. The second is indeed a far more difficult one, to create a 
> "native" BCSequence view. Ideally one that can be further extended to 
> display alignments as well.
>
> Finally, here are the initial comments from Tassos, the guy who 
> challenged me to create the project. It's interesting to see that he 
> indeed managed to get the thing working at 50x the speed our framework 
> manages to get (at least he claims to ;-).

f77 available on request ;-)
... hmm, but I was so lazy that I hardcoded the mass you look for .. 
another 10 mins of programming to fix it.

	A.

> Shark tells me that most time (65%) is already lost to object 
> messaging in the symbol counter, so perhaps that could deserve some 
> optimization ;-) The kudos for the added water molecule go to you! And 
> the comment about the framework to us all!
>
>
... actually, it's not a water molecule:
a residue has N/CA/C/O atoms, plus the side chain.
When it's cleaved it will get an additional OH group at the C+ that's 
created.
And, you need to count one extra H at the N-term ;-)
... well, here is your 'water' ;-)
(Alex has been doing too much FRET lately and he needs to be reminded of 
his chemistry)
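The accounting above can be checked from atomic masses: residues plus one OH plus one H gives the same total as residues plus one water. A minimal sketch, assuming standard monoisotopic atomic masses and residue formulas for glycine and alanine only; all names are hypothetical.

```python
# Monoisotopic atomic masses in Da.
ATOM = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}

# Elemental composition of two residues (backbone N/CA/C/O plus side chain).
RESIDUE_FORMULA = {
    "G": {"C": 2, "H": 3, "N": 1, "O": 1},
    "A": {"C": 3, "H": 5, "N": 1, "O": 1},
}


def formula_mass(formula):
    """Sum atomic masses over an element -> count mapping."""
    return sum(ATOM[element] * count for element, count in formula.items())


def fragment_mass(sequence):
    """Mass of a cleaved fragment: residues + OH at the C-term + H at the N-term."""
    residues = sum(formula_mass(RESIDUE_FORMULA[aa]) for aa in sequence)
    c_term_oh = ATOM["O"] + ATOM["H"]
    n_term_h = ATOM["H"]
    return residues + c_term_oh + n_term_h


extra = ATOM["O"] + 2 * ATOM["H"]  # the 'water', about 18.0106 Da
```

For a single glycine this gives about 75.0320 Da, the mass of free glycine, which is the glycine residue (about 57.0215 Da) plus the 18 Da discussed in this thread.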

Ciao guys !

	A.


> nice ... very nice ;-)
>
>  did not look at the code yet, but for the same sequence size I can do
>  it in less than 1 second instead of 30 secs, really curious to see
>  what you screw up in the coding ;-)
>  my speed test was looking for the same lengths as you did, which - due
>  to bad advice from me -
>  was excessive by far. checking +/-15 is far enough.
>
>  (do I sound like Victor ?)
>
>  but, very nice, you added 18 to all fragments ;-) an obvious error to
>  make in a first implementation, avoided ...! Only one little bug: you
>  display the wrong aa range (one aa to the left) but you highlight the
>  correct one ;-)
>
>  scientifically now, it's clear that the MS accuracy you need cannot be
>  achieved ....
>  so, I can have fun coding a feature to input 1-3 aa from N-term
>  sequencing, while
>  I try and speed up the code ... and maybe fix the bugs ;-)
>
>  really thanx, it's a really nice framework to play with !
>
>          A.
>
> *********************************************************
>                       ** Alexander Griekspoor **
> *********************************************************
>                 The Netherlands Cancer Institute
>                 Department of Tumorbiology (H4)
>           Plesmanlaan 121, 1066 CX, Amsterdam
>                     Tel:  + 31 20 - 512 2023
>                     Fax:  + 31 20 - 512 2029
>                    AIM: mekentosj at mac.com
>                     E-mail: a.griekspoor at nki.nl
>                 Web: http://www.mekentosj.com
>
>           LabAssistant - Get your life organized!
>           http://www.mekentosj.com/labassistant
>
> *********************************************************