On Wednesday, January 15, 2003, at 09:51 PM, Joe Landman wrote:

> The other problem for structured documents of this nature is that the
> size of them almost precludes real parsing efforts. A parser is going
> to build up data structures which represent the content of the
> document, and these structures should be of comparable size to the
> document in various cases.
>
> We probably need to start looking at things differently in the file
> systems, and handling the output somewhat differently (and more
> succinctly).

Part of my interest is that I've been working on event-parsing schemes
for XML that should be of good use in this area. There are lots of
useful things you can do in an event-oriented environment where you
only look at small subtrees at any point in time. This would then allow
you to traverse a large document (e.g. genome data), doing whatever you
do, without having to "load" it into some data structure first.

I've just found BSML [1], so I'm going to take a look at that to see if
it is any better.

[1] http://www.bsml.org  Bioinformatic Sequence Markup Language

Alex Milowski
FAX: (707) 598-7649
alex at milowski.com

"The excellence of grammar as a guide is proportional to the paucity of
the inflexions, i.e. to the degree of analysis effected by the language
considered."

  Bertrand Russell in a footnote of Principles of Mathematics
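
The event-oriented idea described in the message above (look at one small subtree at a time instead of loading the whole document) can be illustrated with a minimal sketch. It is not the author's own scheme: it assumes Python's standard `xml.etree.ElementTree.iterparse`, and the `genome` and `sequence` element names and the helper names `stream_subtrees` and `sequence_lengths` are hypothetical, chosen only for illustration (they are not taken from BSML).

```python
import io
import xml.etree.ElementTree as ET


def stream_subtrees(source, tag):
    """Yield each completed <tag> element, then empty it to free memory."""
    # "end" events fire once an element and all of its children are parsed,
    # so each yielded element is a small, complete subtree.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield elem
            # The caller is done with this subtree; drop its text, attributes
            # and children so the whole document is never held at once.
            elem.clear()


def sequence_lengths(source):
    """Map each (hypothetical) sequence id to the length of its text."""
    return {
        elem.get("id"): len(elem.text or "")
        for elem in stream_subtrees(source, "sequence")
    }


# Tiny stand-in for a large document; a real input would be a file on disk.
SAMPLE = b"""<genome>
  <sequence id="s1">ACGT</sequence>
  <sequence id="s2">GGCCA</sequence>
</genome>"""

if __name__ == "__main__":
    print(sequence_lengths(io.BytesIO(SAMPLE)))
```

Each subtree must be consumed before the next one is requested, since it is emptied as soon as the generator resumes. The emptied elements stay attached to their parent, so for very large inputs one would likely also detach them from the parent; that step is left out here to keep the sketch short.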