[ViewVC] Log of: owl/branches/aglappe-jung/proteinstructure/CiffilePdb.java

Revision 448
Modified Tue Dec 4 15:19:15 2007 UTC (16 years, 9 months ago) by lpetzo
File length: 29585 byte(s)
Diff to previous 441

class *Pdb:
- implemented new exception handling scheme for function Getter.get() -> getChains() and getModels() throw all possible exceptions encapsulated in GetterError exceptions
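The wrapping scheme described above could look roughly like the following. This is only a minimal sketch: the body of `GetterError`, the `ChainSource` interface and the `GetterSketch` class are hypothetical stand-ins written for illustration, not the code from this revision.

```java
import java.io.IOException;

// Hypothetical stand-in for the GetterError named in the log message.
class GetterError extends Exception {
    GetterError(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical source of chain codes that can fail in more than one way.
interface ChainSource {
    String[] readChains() throws IOException;
}

class GetterSketch {
    // Wraps any failure of the underlying source in a single GetterError,
    // so callers only have to handle one exception type.
    static String[] getChains(ChainSource source) throws GetterError {
        try {
            return source.readChains();
        } catch (IOException | RuntimeException e) {
            throw new GetterError("could not read chains: " + e.getMessage(), e);
        }
    }
}
```

The original cause stays reachable through `getCause()`, so no diagnostic information is lost by the wrapping.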

Revision 441
Modified Thu Nov 29 15:19:37 2007 UTC (16 years, 9 months ago) by duarte
File length: 29410 byte(s)
Diff to previous 419

Now Pdb constructors don't load data but rather initialise the pdbCode; loading occurs upon call of load(pdbChainCode,modelSerial)
New methods getChains() and getModels() in all Pdb classes
New Exception PdbLoadError
New tester for getChains and getModels: testGetChains
Changed all calls to Pdb construction accordingly (includes changing exceptions)
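The split between construction and loading can be sketched as follows. This is an illustration under stated assumptions: `LazyPdbSketch`, its fields and its validation rules are hypothetical, and `PdbLoadError` here is only a stand-in for the exception named above.

```java
// Hypothetical stand-in for the PdbLoadError named in the log message.
class PdbLoadError extends Exception {
    PdbLoadError(String message) {
        super(message);
    }
}

class LazyPdbSketch {
    private final String pdbCode;
    private String pdbChainCode; // stays null until load() succeeds
    private int modelSerial = -1;

    // The constructor only records the code; it performs no I/O.
    LazyPdbSketch(String pdbCode) {
        this.pdbCode = pdbCode;
    }

    // Loading is deferred to an explicit call. A real class would read the
    // file or database here; this sketch only validates the arguments
    // (the validation rules are an assumption).
    void load(String pdbChainCode, int modelSerial) throws PdbLoadError {
        if (pdbChainCode == null || pdbChainCode.isEmpty()) {
            throw new PdbLoadError("no chain code given for " + pdbCode);
        }
        if (modelSerial < 1) {
            throw new PdbLoadError("invalid model serial " + modelSerial);
        }
        this.pdbChainCode = pdbChainCode;
        this.modelSerial = modelSerial;
    }

    boolean isLoaded() {
        return pdbChainCode != null;
    }
}
```

Deferring the load lets a caller construct the object cheaply and inspect the available chains and models before choosing which one to load.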

Revision 419
Modified Thu Nov 22 14:09:18 2007 UTC (16 years, 10 months ago) by duarte
File length: 29335 byte(s)
Diff to previous 402

Re-branching for JUNG2 development

Revision 402
Modified Thu Nov 15 16:15:32 2007 UTC (16 years, 10 months ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 29335 byte(s)
Diff to previous 400

Forgot CiffilePdb!: not forcing upper casing of chainPdbCode. pdb chain codes are case sensitive!!!

Revision 400
Modified Tue Nov 13 15:55:30 2007 UTC (16 years, 10 months ago) by filippis
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 29361 byte(s)
Diff to previous 397

-pdb_seq_num is used instead of auth_seq_num in get_ressers_mappings. Now resser2pdbresser contains all residues even the unobserved ones. This has been changed in PdbasePdb and CiffilePdb classes.

-pdb_strand_id not needed in the query in read_seq in PdbasePdb class. Asym_id and pdb_strand_id have 1-1 mapping in case sensitive mode.

-comments have been added about the entity_id-asym_id-pdb_strand_id 1-1-1 mapping and some filters in queries that are no longer necessary.

Revision 397
Modified Thu Nov 8 16:04:21 2007 UTC (16 years, 10 months ago) by filippis
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 29490 byte(s)
Diff to previous 357

Pdb code forced to lowercase in CiffilePdb

Revision 357
Modified Wed Oct 17 08:01:13 2007 UTC (16 years, 11 months ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 29476 byte(s)
Diff to previous 356

Fixed bug: now handles quoting correctly for all 3 cases: "", '' or ;;. Also parses multi-line ;; quoted fields correctly. Passed testing on cullpdb 90 (more than 9000 pdb entries)
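The three quoting forms can be illustrated with a small tokeniser. This is a sketch written for illustration, not the tokeniseFields method from this file. It assumes the usual CIF conventions: a closing quote only counts when followed by whitespace or end of line, and a `;` text field opens and closes with `;` in the first column. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class CifTokenizerSketch {

    // Splits the given lines into tokens, handling '...', "..." and
    // multi-line ;...; fields.
    static List<String> tokenise(List<String> lines) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < lines.size()) {
            String line = lines.get(i);
            if (line.startsWith(";")) {
                // Text field: collect lines until one starts with ';'.
                StringBuilder text = new StringBuilder(line.substring(1));
                i++;
                while (i < lines.size() && !lines.get(i).startsWith(";")) {
                    text.append('\n').append(lines.get(i));
                    i++;
                }
                tokens.add(text.toString());
                // Further tokens may follow the closing ';' on its line.
                if (i < lines.size()) {
                    tokeniseLine(lines.get(i).substring(1), tokens);
                }
                i++;
            } else {
                tokeniseLine(line, tokens);
                i++;
            }
        }
        return tokens;
    }

    private static void tokeniseLine(String line, List<String> tokens) {
        int n = line.length();
        int pos = 0;
        while (pos < n) {
            char c = line.charAt(pos);
            if (Character.isWhitespace(c)) {
                pos++;
            } else if (c == '\'' || c == '"') {
                // Scan to a matching quote that ends the token.
                int end = pos + 1;
                while (end < n && !(line.charAt(end) == c
                        && (end + 1 == n || Character.isWhitespace(line.charAt(end + 1))))) {
                    end++;
                }
                tokens.add(line.substring(pos + 1, end));
                pos = end + 1;
            } else {
                int end = pos;
                while (end < n && !Character.isWhitespace(line.charAt(end))) {
                    end++;
                }
                tokens.add(line.substring(pos, end));
                pos = end;
            }
        }
    }
}
```

Because a quote inside a value is only treated as closing when whitespace follows it, atom names containing a prime survive quoting intact.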

Revision 356
Modified Sat Oct 13 15:55:41 2007 UTC (16 years, 11 months ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 28335 byte(s)
Diff to previous 355

Fixed bug: wasn't tokenising correctly when the first field in a line was quoted

Revision 355
Modified Fri Oct 12 18:42:37 2007 UTC (16 years, 11 months ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 28315 byte(s)
Diff to previous 336

FIXED BUG: now doesn't fail with records that are delimited with \n; ;\n
Method tokeniseFields is now completely rewritten: it does all the magic of parsing the oddities of the mmCIF format
Now using RandomAccessFile to open the file only once and then seek to the positions we need to scan at each point. Might be slower because RandomAccessFile does no buffering, and perhaps also because the new tokenisation is not very efficient
Now parseCifFile does the whole parsing, calling the submethods itself instead of having them called in the constructor
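The open-once-then-seek idea can be sketched with the standard `java.io.RandomAccessFile` API. This is a hypothetical illustration, not the code from this revision: it assumes one indexing pass that records the byte offset where each category (the part of a field name before the dot) first appears, and the class and method names are invented.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.util.LinkedHashMap;
import java.util.Map;

class CifOffsetsSketch {

    // Single pass over the file: remember the byte offset of the first
    // line of each category, e.g. "_atom_site".
    static Map<String, Long> indexCategories(RandomAccessFile raf) throws IOException {
        Map<String, Long> offsets = new LinkedHashMap<>();
        raf.seek(0);
        long lineStart = raf.getFilePointer();
        String line;
        while ((line = raf.readLine()) != null) {
            if (line.startsWith("_")) {
                int dot = line.indexOf('.');
                if (dot > 0) {
                    offsets.putIfAbsent(line.substring(0, dot), lineStart);
                }
            }
            lineStart = raf.getFilePointer();
        }
        return offsets;
    }

    // Jump straight to a recorded offset and read the line found there.
    static String lineAt(RandomAccessFile raf, long offset) throws IOException {
        raf.seek(offset);
        return raf.readLine();
    }
}
```

`RandomAccessFile.readLine()` reads unbuffered, which matches the speed concern raised in the log message.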

Revision 336
Modified Tue Oct 2 16:14:20 2007 UTC (16 years, 11 months ago) by stehr
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 27667 byte(s)
Diff to previous 326

extracted constant NULL_CHAIN_CODE from ...Pdb classes, added copy() methods to NodeSet and EdgeSet, added some functionality to NodesAndEdges, new class SimilarityGraph

Revision 326
Modified Thu Sep 20 14:49:55 2007 UTC (17 years ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 27654 byte(s)
Diff to previous 319

Removed class AA and replaced it with AAinfo, which reads contact types from the separate file contactTypes.dat
New class ContactType which contains atoms for each contact type and residue. A static object for each contact type is loaded into AAinfo upon reading the contactTypes.dat file
Changed all references accordingly

Revision 319
Modified Mon Sep 17 16:10:32 2007 UTC (17 years ago) by stehr
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 27885 byte(s)
Diff to previous 317

added constructors for loading from online pdb

Revision 317
Modified Thu Sep 13 16:09:10 2007 UTC (17 years ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 22872 byte(s)
Diff to previous 315

Fixed some comments

Revision 315
Modified Thu Sep 13 08:13:40 2007 UTC (17 years ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 22939 byte(s)
Diff to previous 314

Now parsing each element in different methods (re-opening the file). Parsing first pdbx_poly_seq_scheme so we get the chainCode that we can use for reading the rest
Now taking care of cases where struct_sheet_range is not a loop element
In tokeniseFields now also unquoting double-quoted strings
Tested on a set of 12000 entries

Revision 314
Modified Wed Sep 12 14:50:48 2007 UTC (17 years ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 21070 byte(s)
Diff to previous 311

Checking number of fields per line in loop elements and throwing exception if count is not correct
Doing tokenisation of lines through new function that takes care of possible quoted string with spaces
New exception CiffileFormatError
Checking 1st line of cif file has correct format: data_1xxx, if not throwing exception
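The two format checks mentioned above could be sketched like this. The regular expression encodes an assumption (a PDB code is one digit followed by three alphanumeric characters), and `CifChecksSketch` and its methods are hypothetical. `CiffileFormatError` is a stand-in for the exception named in the log.

```java
import java.util.regex.Pattern;

// Hypothetical stand-in for the CiffileFormatError named in the log message.
class CiffileFormatError extends Exception {
    CiffileFormatError(String message) {
        super(message);
    }
}

class CifChecksSketch {
    // Assumption: first line looks like data_1xxx (digit + 3 alphanumerics).
    private static final Pattern FIRST_LINE =
            Pattern.compile("data_\\d[0-9A-Za-z]{3}");

    static void checkFirstLine(String line) throws CiffileFormatError {
        if (line == null || !FIRST_LINE.matcher(line.trim()).matches()) {
            throw new CiffileFormatError("unexpected first line: " + line);
        }
    }

    // Every data line of a loop must have one token per declared field.
    static void checkFieldCount(String[] tokens, int expected, int lineNumber)
            throws CiffileFormatError {
        if (tokens.length != expected) {
            throw new CiffileFormatError("line " + lineNumber + ": expected "
                    + expected + " fields but found " + tokens.length);
        }
    }
}
```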

Revision 311
Modified Thu Aug 30 17:31:38 2007 UTC (17 years ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 18680 byte(s)
Diff to previous 310

Fixed bug: sometimes struct_conf can be a non-loop element, now also taking care of that particular case

Revision 310
Modified Thu Aug 30 16:00:08 2007 UTC (17 years ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 17003 byte(s)
Diff to previous 309

Bug with '?' in auth_seq_num was not really fixed. Now should be fine: behaviour is the same as PdbasePdb

Revision 309
Modified Thu Aug 30 15:55:53 2007 UTC (17 years ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 16999 byte(s)
Diff to previous 308

Fixed bug: needed to read alt locs in advance in another scan of the file because the order of the elements in the cif file is not guaranteed. Since reading atom_site needs the alt locs, we must parse atom_sites_alt first

Revision 308
Modified Thu Aug 30 14:54:38 2007 UTC (17 years ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 16090 byte(s)
Diff to previous 307

Fixed bugs:
- was reading HETATM lines as well as ATOM in atom_site
- auth_seq_num with '?' not taken now when populating the pdbresser2resser map (same behaviour as in PdbasePdb)
- now using chainCodeStr and auth_asym_id to identify chains in pdbx_poly_seq_scheme, struct_conf and struct_sheet_range. atom_site is not guaranteed to appear in file before all the others so we can't rely on having read a chainCode (asym_id) when parsing the other elements

Revision 307
Modified Thu Aug 30 10:41:55 2007 UTC (17 years ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 15526 byte(s)
Diff to previous 306

Now taking indices for fields from parsed field names. Still only minimal testing
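Taking column indices from the parsed field names, rather than hard-coding positions, might look like the sketch below. The class and method names are hypothetical and the code only illustrates the idea; it is not the implementation from this revision.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class LoopFieldsSketch {

    // Maps each field name declared in a loop header to its column index,
    // in order of appearance.
    static Map<String, Integer> indexFields(List<String> headerLines) {
        Map<String, Integer> indices = new HashMap<>();
        int next = 0;
        for (String header : headerLines) {
            String name = header.trim();
            if (name.startsWith("_")) {
                indices.put(name, next);
                next++;
            }
        }
        return indices;
    }

    // Looks up a value in a tokenised data row by field name.
    static String value(String[] row, Map<String, Integer> indices, String field) {
        Integer index = indices.get(field);
        if (index == null) {
            throw new IllegalArgumentException("unknown field: " + field);
        }
        return row[index];
    }
}
```

Looking fields up by name keeps the parser working when a file lists its columns in a different order.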

Revision 306
Added Thu Aug 30 09:09:24 2007 UTC (17 years ago) by duarte
Original Path: trunk/proteinstructure/CiffilePdb.java
File length: 12814 byte(s)

First implementation of mmCIF file parser. Tested minimally.
