[Owl-devel] HETATMs support

Thu May 19 10:44:09 EDT 2011

So, more major changes to owl. This time we get a long-awaited new feature:
support for HETATMs and nucleotides.

Basically the whole content of PDB entries will now be read: standard amino
acids of protein chains (as before), non-standard amino acids within protein
chains (HETATMs in protein chains), nucleotide chains, and HETATMs that are
not part of any chain (ligands, cofactors, buffer molecules or any other
molecule). I've excluded waters for now: most uses don't need them, so we
save some memory there. If needed we can change that easily; it's just a
flag that ignores waters.
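To illustrate what such a flag amounts to, here is a minimal sketch. It is not owl's actual parser: the class and method names (WaterFilterSketch, isWaterRecord, selectRecords) are made up for this example, and it assumes waters appear under the residue name HOH in columns 18-20 of the fixed-width PDB coordinate records.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of owl: shows what a "skip waters" flag could look like.
class WaterFilterSketch {

    // True if the line is an ATOM/HETATM record whose residue name is HOH.
    static boolean isWaterRecord(String line) {
        if (line.length() < 20) {
            return false; // too short to carry a residue name
        }
        boolean coordRecord = line.startsWith("ATOM  ") || line.startsWith("HETATM");
        // The residue name sits in columns 18-20 (1-based) of the PDB format.
        return coordRecord && line.substring(17, 20).equals("HOH");
    }

    // Returns the lines to keep; waters are dropped only when ignoreWaters is set.
    static List<String> selectRecords(List<String> lines, boolean ignoreWaters) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (ignoreWaters && isWaterRecord(line)) {
                continue;
            }
            kept.add(line);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "ATOM      1  N   ALA A   1",
            "HETATM 1234  O   HOH A 301",
            "HETATM 2000 FE   HEM A 200");
        System.out.println("kept with flag on:  " + selectRecords(lines, true).size());
        System.out.println("kept with flag off: " + selectRecords(lines, false).size());
    }
}
```

Flipping the boolean is all it takes to read waters again, at the cost of keeping those extra records in memory.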

The main change has been in the Residue class. Now we have a Residue
interface and 3 implementations of it: AaResidue for standard amino acids
(what used to be called simply Residue), NucResidue for nucleotide residues
and HetResidue for all other molecules.
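Roughly, the new layout looks like the sketch below. Only the four type names (Residue, AaResidue, NucResidue, HetResidue) are the real ones; the methods, the AbstractResidue base class and the ResidueFactorySketch factory are assumptions made up for illustration, and the real classes carry much more. The classification by residue name assumes the standard 20 amino acid codes and the usual PDB nucleotide names (A, C, G, U, DA, DC, DG, DT).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Sketch only: the members below are hypothetical, not owl's actual interface.
interface Residue {
    String getName();  // residue name as in the PDB file, e.g. "ALA", "DG", "HEM"
    int getSerial();   // residue serial number within its chain
}

// Hypothetical shared base so the three implementations stay short.
abstract class AbstractResidue implements Residue {
    private final String name;
    private final int serial;

    AbstractResidue(String name, int serial) {
        this.name = name;
        this.serial = serial;
    }

    public String getName() { return name; }
    public int getSerial() { return serial; }
}

// Standard amino acids (what used to be called simply Residue).
class AaResidue extends AbstractResidue {
    AaResidue(String name, int serial) { super(name, serial); }
}

// Nucleotide residues.
class NucResidue extends AbstractResidue {
    NucResidue(String name, int serial) { super(name, serial); }
}

// Everything else: ligands, cofactors, non-standard amino acids, ...
class HetResidue extends AbstractResidue {
    HetResidue(String name, int serial) { super(name, serial); }
}

// Hypothetical factory that picks the implementation from the residue name.
class ResidueFactorySketch {
    private static final Set<String> AMINO_ACIDS = new HashSet<>(Arrays.asList(
        "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
        "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL"));
    private static final Set<String> NUCLEOTIDES = new HashSet<>(Arrays.asList(
        "A", "C", "G", "U", "DA", "DC", "DG", "DT"));

    static Residue create(String name, int serial) {
        String key = name.trim().toUpperCase();
        if (AMINO_ACIDS.contains(key)) {
            return new AaResidue(key, serial);
        }
        if (NUCLEOTIDES.contains(key)) {
            return new NucResidue(key, serial);
        }
        return new HetResidue(key, serial);
    }

    public static void main(String[] args) {
        for (String name : new String[] {"ALA", "DG", "MSE", "HEM"}) {
            Residue r = create(name, 1);
            System.out.println(name + " -> " + r.getClass().getSimpleName());
        }
    }
}
```

Code that only needs a name or a serial can keep working against the interface, while code that is specific to amino acids has to check for AaResidue first.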

In the process of doing this I've learned that the PDB, as usual, decided to
do strange things with the format. Basically there are two kinds of chains in
PDB entries: polymer (polypeptide or nucleic acid chains) and non-polymer
(ligands and so on). Classically the non-polymer chains were not treated
separately; instead they were given the chain code of the polymer chain to
which they were related: ligands with the chain they bind to, and so on. In
the CIF format they decided to assign separate chain codes (CIF chain codes)
to the non-polymer chains, which in my opinion makes a lot of sense. But for
some reason they decided to keep the old chain codes in PDB files.

What I've tried to do with PDB files is to treat the non-polymer chains
separately, assigning them chain codes that stay as close as possible to the
CIF assignment. To distinguish the two kinds of chains (polymer/non-polymer)
use the new method PdbChain.isNonPolyChain().
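As a usage sketch, code that only cares about the protein or nucleic acid chains can filter on that method. The example below does not use the real PdbChain class: ChainStub is a made-up stand-in, and only the method name isNonPolyChain() is taken from owl; the constructor, getCode() and the ChainSplitSketch helper are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-in for owl's PdbChain; only isNonPolyChain() mirrors the real method name.
class ChainStub {
    private final String code;
    private final boolean nonPoly;

    ChainStub(String code, boolean nonPoly) {
        this.code = code;
        this.nonPoly = nonPoly;
    }

    String getCode() { return code; }

    // True for ligand/cofactor chains, false for polypeptide or nucleic acid chains.
    boolean isNonPolyChain() { return nonPoly; }
}

// Hypothetical helper that separates the two kinds of chains.
class ChainSplitSketch {

    // Returns the chains whose isNonPolyChain() value equals wantNonPoly.
    static List<ChainStub> select(List<ChainStub> chains, boolean wantNonPoly) {
        List<ChainStub> selected = new ArrayList<>();
        for (ChainStub chain : chains) {
            if (chain.isNonPolyChain() == wantNonPoly) {
                selected.add(chain);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<ChainStub> chains = Arrays.asList(
            new ChainStub("A", false),  // a protein chain
            new ChainStub("B", false),  // a second polymer chain
            new ChainStub("C", true));  // a ligand given its own chain code
        for (ChainStub chain : select(chains, false)) {
            System.out.println("polymer chain " + chain.getCode());
        }
        for (ChainStub chain : select(chains, true)) {
            System.out.println("non-polymer chain " + chain.getCode());
        }
    }
}
```

Existing code that loops over all chains of an entry and expects only polymers will probably want a check like this.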

I've been trying to track all calls to methods that could be affected and
have tried to keep compatibility. But of course this is a major change that
can introduce some bugs, so we will need some time before things are more
stable again. I've created a tag to mark the code before this change
(owl-1.9.4).

Anyway hopefully the change is for the better as now we will be able to do a
lot more with the PDB data.

Enjoy

Jose