[Proteopedia] Atom serial numbers vs. Jmol/Proteopedia

Eric Martz emartz at microbio.umass.edu
Sun Mar 29 22:21:35 EDT 2009

It appears to me that one major source of problems with existing 
scenes in Proteopedia, following the March 17 remediation of the wwPDB 
(PDB format 3.20), is that atom serial numbers changed. Jmol's state 
scripts (used to save scenes in Proteopedia) use atom serial numbers 
extensively in select commands. The re-ordering of serial numbers in 
the 3.20 remediation causes the wrong atoms to be selected in 
Proteopedia scenes.

An example is some scenes of the nucleosome,

Here, five of six green links now produce seriously damaged scenes.

The order of chains was changed in 3.20. In a 2005 version of 
1aoi.pdb that I have (which also antedates the August 1, 2007 
remediation), protein chains are first, followed by DNA: atom serial 
1 is in protein chain A. In the current version, DNA is first: atom 
serial 1 is in DNA chain I. Thus, scripts that depend on serial 
numbers will produce scrambled results. If this problem is limited to 
PDB files containing nucleic acids, this would explain our impression 
that mostly scenes with DNA are damaged. I have not checked 
multiple-chain, protein-only files.
Old 1aoi chain order: ABCDEFGH IJ.
New 1aoi chain order: IJ ABCDEFGH.
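
A minimal sketch of how the chain order can be read from a PDB file, assuming the standard fixed-column layout (chain ID in column 22 of ATOM records). The function name chain_order is hypothetical, and only the first model of a multi-model file is examined.

```python
def chain_order(pdb_lines):
    """Return the chain IDs in order of first appearance in ATOM records.

    Assumes fixed-column PDB format: the chain ID is column 22
    (index 21). Only the first model of an NMR file is read.
    """
    seen = []
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # stop after the first model
        if line.startswith("ATOM"):
            chain = line[21:22]
            if chain and chain not in seen:
                seen.append(chain)
    return "".join(seen)
```

Passing an open file object works, since it iterates over lines, e.g. chain_order(open("1aoi.pdb")) with whatever local copy is at hand. HETATM records are deliberately ignored so that ligands do not count as chains.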

I examined a few other cases where I happen to have old versions of 
PDB files, since the snapshots ftp server appears to be down today. 
The results are organized more systematically in a separate message I 
am sending to the pdb-l.

  - 1lbg (used in the damaged scene in the Lac Repressor page). I do 
not have an old copy of this file.

  - 1osl (NMR multi model file). Order of chains is protein, then DNA 
-- both before Mar 17, and currently. So here, unlike with 1aoi, the 
DNA was not put before the protein.

  - 1d66. In my 2006 copy, the first chain is DNA chain D, followed 
by another DNA chain E, and 2 protein chains A and B. This order is 
not changed in the current version. In both files, the protein ends with
ATOM   1710  CD2 LEU B  64

  - 1fzp (My copy is from 2001.) Old version has protein chains D, B 
followed by DNA chains W, K. Current version has chain order reversed 
(WKDB), so different atom serial numbers.

  - 1hcr (My copy is from 2001.) Old version has protein chain A 
followed by DNA chains B, C. Current version has chain order reversed 
(B, C, A), so different atom serial numbers.

  - 1qln (My copy is from 2002.) Both old and current versions have 
chain order protein, nucleic. However, the order of chains is 
changed. Old: chains A, T, N, R. A is protein, T and N are DNA, and R 
is RNA. The last atom in the old file is
"ATOM   7508  C4    G R   3". The order in the current file is
A, N, R, T, which differs from the order given in the COMPND records.
The last atom in the current file is "ATOM   7508  C4   DA T  22".
Thus, the serial numbers are changed.

  - 1flo (My copy is from 2004.) Old file has 4 protein chains 
followed by 8 DNA chains, ABCD, EGIK, FGHL. In the new file, the 
order is EFGHIJKL, ABCD. The DNA chain order for ATOM records differs 
from the order in the COMPND records. Thus the serial numbers are changed.

  - 1e3m (My copy is from 2002.) Old file has protein chains A, B, 
then HETATM chain C (a single residue ADP "chain"), then DNA chains 
E, F. The new file has the same order of true chains A, B, E, F, 
followed by ADP deemed to be part of chain A. Thus, some serial 
numbers are changed. The last atom in chain F is (old file) 
"ATOM  12915  C6    T F  30", and (new file) "ATOM  12888  C7   DT F  30".
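
The comparisons above were done by eye. A rough sketch of how two versions of a file could be compared serial by serial is given below, assuming fixed-column PDB format and purely numeric serial numbers. The function names are hypothetical. Residue names are left out of the comparison because they differ between versions (e.g. T vs. DT); an atom that was merely renamed would still be reported as changed.

```python
def atoms_by_serial(pdb_lines):
    """Map atom serial -> (chain ID, residue number + insertion code, atom name).

    Reads ATOM and HETATM records of the first model only.
    """
    table = {}
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # first model only
        if line.startswith(("ATOM", "HETATM")):
            serial = int(line[6:11])          # columns 7-11
            chain = line[21:22]               # column 22
            resnum = line[22:27].strip()      # columns 23-27 (resSeq + iCode)
            name = line[12:16].strip()        # columns 13-16
            table[serial] = (chain, resnum, name)
    return table


def changed_serials(old_lines, new_lines):
    """Return the sorted old serials that no longer refer to the same atom."""
    old = atoms_by_serial(old_lines)
    new = atoms_by_serial(new_lines)
    return sorted(s for s in old if new.get(s) != old[s])
```

An empty result would suggest that scripts selecting by serial number still pick the same atoms; a long list would suggest the scenes built on that file need repair.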

At http://www.wwpdb.org in the 3.20 documentation I did not find any 
specification for the order of chains, but I may well have missed it.

It may be worth suggesting a change in Jmol to avoid using atom 
serial numbers in state scripts, in case a future remediation again 
scrambles the serial numbers.
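
As an illustration of what a serial-independent selection might look like, the sketch below turns serial numbers from an old file into expressions of the form residue:chain.atom (e.g. 64:B.CD2). The function name is hypothetical, and the expression form is my assumption about what Jmol's select command accepts; it should be checked against the Jmol documentation. Insertion codes and alternate locations are ignored here.

```python
def serials_to_expressions(pdb_lines, serials):
    """Map each requested serial to a 'resno:chain.atom' string.

    Assumes fixed-column PDB format and numeric serials; reads only
    the first model. Serials not found in the file are omitted.
    """
    wanted = set(serials)
    found = {}
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # first model only
        if line.startswith(("ATOM", "HETATM")):
            serial = int(line[6:11])
            if serial in wanted:
                resno = line[22:26].strip()   # columns 23-26
                chain = line[21:22]           # column 22
                name = line[12:16].strip()    # columns 13-16
                found[serial] = "%s:%s.%s" % (resno, chain, name)
    return found
```

Such expressions would survive a re-ordering of chains, but not a renaming of chains or atoms, so they are not a complete remedy.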

In my separate message to the PDB, I am asking whether the changes in 
chain order were intentional, what the specification for chain order 
is (it is not obvious to me), and whether these changes will be 
permanent (requiring repairs in Proteopedia).


