[BiO BB] What is the difference between remediated PDB and PDB?
dan.bolser at gmail.com
Sat Jun 14 07:38:44 EDT 2008
On 09/06/2008, Roberto Mosca <roberto.mosca at embl-hamburg.de> wrote:
> From the document:
> "3.5 Resolution macromolecular sequence conflicts
> Differences in entity sequence assignment between RCSB PDB and MSD-EBI
> have been resolved. Any remaining differences between the chemical sequence
> and the macromolecular sequence have also been resolved."
> I think that this refers only to the DBREF and SEQRES records in the
> file. Of course an alanine cannot become a valine or a tyrosine when
> going from the original to the remediated PDB.
I think it can.
Lots of 'strange' format inconsistencies have been cleaned up, including
the unusual practice of labelling amino acids such as tyrosine as
alanine if the majority of the side chain is 'unobserved' in the
structure. In such cases the tyrosine may look like an alanine
(chemically), and was sometimes labelled alanine in the ATOMSEQ.
Tons of minor changes were made throughout the PDB files when going
from PDB to PDB-REMEDIATED. However, these changes were not
documented... Amazing, isn't it? Stupefying? I would be very happy if
someone could prove me wrong on this point.
It would be possible to get a database of differences by dumping the
mmCIF files into XML and then doing an XML diff on all the pairs of
files. The PDB's view seems to be 'just start using the remediated
files', which you should do if possible.
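As a rough illustration of the kind of check this implies, here is a
minimal Python sketch that compares residue labels between an original
and a remediated file. It is not the mmCIF-to-XML diff described above:
it assumes both files are in the fixed-column PDB text format and that
chain IDs and residue numbers are unchanged between the two versions.
The function names (residue_names, relabelled_residues) are made up for
this example.

```python
def residue_names(pdb_text):
    """Map (chain, resSeq, iCode) -> residue name, using ATOM/HETATM records.

    Assumes fixed-column PDB format; only the first model is read.
    """
    residues = {}
    for line in pdb_text.splitlines():
        if line.startswith("ENDMDL"):
            break  # stop after the first model of a multi-model file
        if line.startswith(("ATOM  ", "HETATM")):
            res_name = line[17:20].strip()   # columns 18-20
            chain = line[21:22]              # column 22
            res_seq = int(line[22:26])       # columns 23-26
            i_code = line[26:27].strip()     # column 27
            # Keep the first label seen for each residue position.
            residues.setdefault((chain, res_seq, i_code), res_name)
    return residues


def relabelled_residues(original_text, remediated_text):
    """List positions present in both files whose residue name differs."""
    old = residue_names(original_text)
    new = residue_names(remediated_text)
    shared = old.keys() & new.keys()
    return sorted((key, old[key], new[key])
                  for key in shared if old[key] != new[key])
```

Calling relabelled_residues() on the text of the two versions of an
entry returns tuples of the form ((chain, number, insertion code),
old name, new name), so a residue labelled alanine in one file and
tyrosine in the other would show up directly. Residues that were
renumbered or moved to another chain would be missed under the
assumptions above.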
It is also surprising to learn that data in the PDB files, such as
links to sequence databases, are not being kept up to date. For
up-to-date sequence data you additionally need to use the SIFTS
(Structure integration with function, taxonomy and sequence) database:
Hrm... I should add SIFTS to MetaBase ;-)
> I'm not sure what kind of errors have been fixed but I think there
> could have been inconsistencies between the sequence reported in the
> SEQRES records and the sequence corresponding to the ATOM records in
> the PDB file itself.
> Dr. Roberto Mosca
> EMBL c/o DESY
> Notkestr. 85
> 22607 Hamburg
> Email: roberto.mosca at embl-hamburg.de
> Tel: +49 (0)40 89902 131
> Web: http://www.embl-hamburg.de/~rmosca
> On Jun 8, 2008, at 04:03, Xue Li wrote:
>> Hello all,
>> From remediated pdb web site http://remediation.wwpdb.org/index.html,
>> I read
>> "Updated references to databases and taxonomies and Resolved
>> between chemical and macromolecular sequences".
>> Does it mean that some proteins will have different sequences in
>> remediated pdb and ordinary pdb?
>> Xue, Li
>> Bioinformatics and Computational Biology program @ ISU
>> Ames, IA 50010
>> BBB mailing list
>> BBB at bioinformatics.org