[Pdbwiki-devel] PDB version 4

Henning Stehr stehr at molgen.mpg.de
Thu Jul 28 08:59:50 EDT 2011

Hi Jose,

Amazing work!

> Can those be taken from rsync somehow?

Look at the scripts 'update.sh' and Dan's modified version of
'rsyncPDB... ' in the directory 'pdbrsync', which I just uploaded to the
repository. The information we need will be in the rsync log file. If
you can make pdbase update only the files which changed, then we have
solved almost all our problems. We only have to make sure to update
pdbase after every rsync to keep it consistent. If we miss one update
round, we'll have to recreate pdbase from scratch. But even that
should not be a real problem. Then the pdbwiki update should work
without any modifications, shouldn't it? Or does that rely on the
unzipped files in any way? In that case we would have to modify it to
unzip on-the-fly and delete afterwards.
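
To make the rsync log idea concrete, here is a minimal sketch of how the
lists of changed and deleted entries could be pulled out of it. It assumes
the log contains rsync's verbose output, where each transferred file appears
as a bare relative path and each removal as "deleting <path>", and that
entry files are named <4-character ID>.cif.gz. The function name
parse_rsync_log and the sample log lines are made up for illustration; the
real log written by our scripts may need a different pattern.

```python
import re

# Assumption: entry files are named <4-char PDB ID>.cif.gz, ID starting
# with a digit. Adjust the pattern if the mirror uses another layout.
ENTRY_RE = re.compile(r"(?:^|/)([0-9][0-9a-z]{3})\.cif\.gz$", re.IGNORECASE)


def parse_rsync_log(lines):
    """Split verbose rsync output into (changed, deleted) sets of PDB IDs.

    'changed' covers both new and updated files, since plain verbose
    output does not distinguish between the two.
    """
    changed, deleted = set(), set()
    for raw in lines:
        line = raw.strip()
        is_delete = line.startswith("deleting ")
        path = line[len("deleting "):] if is_delete else line
        match = ENTRY_RE.search(path)
        if not match:
            continue  # header, directory, or summary line
        pdb_id = match.group(1).upper()
        (deleted if is_delete else changed).add(pdb_id)
    return changed, deleted


# Hypothetical log excerpt, for illustration only.
sample_log = [
    "receiving incremental file list",
    "deleting mmCIF/102m.cif.gz",
    "mmCIF/",
    "mmCIF/103d.cif.gz",
    "mmCIF/1abc.cif.gz",
    "",
    "sent 1234 bytes  received 5678 bytes  100.00 bytes/sec",
]

changed, deleted = parse_rsync_log(sample_log)
print("changed:", ",".join(sorted(changed)))
print("deleted:", ",".join(sorted(deleted)))
```

The comma-joined output has the same shape as the entries= argument in the
command quoted below, so the deleted list (and the changed list, before
reloading those entries) could be passed to the loader directly.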

Thanks for the great work

On Thu, Jul 28, 2011 at 12:56 PM, Jose M. Duarte
<jose.m.duarte at gmail.com> wrote:
>> IIRC, the 'loader' also supports a delete command, where you pass it a
>> list of PDB entries to delete. We just need to work out how to
>> generate the list of files to delete (and then update) without
>> recourse to the file time stamps (which is what we use currently
>> IIRC).
> Indeed there was a "delete entry" option! It seems that Henning implemented
> it in the batch runner. The weird thing is that in the delete method from
> the original code no records were being deleted from the db because the
> execute statement call was missing! So I fixed that one and now it seems to
> be working.
> The command looks like:
> java -cp OpenMMSbatch.jar org.rcsb.openmms.apps.rdb.PDBase LenientParse \
> data=/nfs/data/dbs/pdb/data/structures/all/mmCIF \
> manifest=file:///nfs/data/dbs/pdb/ls-lR \
> log=PDBASE.LOG \
> exclude=ExcludeStructureIDs.list \
> entries=102M,102D,103D,105D \
> pdblist=allpdb_ex1.list \
> dbUrl=jdbc:mysql://localhost/pdbase dbDrv=com.mysql.jdbc.Driver \
> dbUsr=user dbPwd=pwd \
> action=DeleteSingleEntry
>> We discussed tracking the data in the OBSOLETE file, but rejected that
>> idea because it changes weekly, so we need to make sure not to skip a
>> week... However, in comparison to building the whole db from scratch
>> every week, it doesn't seem like too much of a compromise.
> I agree there, in the worst case we would need a full reload from time to
> time
>> Once we have a reliable way to generate the list of deletes, updates,
>> and additions, we can then pass those to the loader appropriately.
> Can those be taken from rsync somehow?
> Jose
> _______________________________________________
> Pdbwiki-devel mailing list
> Pdbwiki-devel at bioinformatics.org
> http://www.bioinformatics.org/mailman/listinfo/pdbwiki-devel
> http://www.pdbwiki.org
