[Pipet Devel] Getting rid of modification of xml definition files
Brad Chapman
chapmanb at arches.uga.edu
Sat Jul 29 18:36:10 EDT 2000
Hello all;
Jean-Marc and I have been having a lot of debate back and forth
over the way Piper should represent the description of the workflow
diagram. The way that Piper has worked up until this point is that it
stores the description of each locus/node added to a network as a
separate XML file in the filesystem (a *.def file). These files are
stored in a hierarchical system, with each composite locus (network)
having a directory with all of the xml files describing the children
loci/nodes inside of it. Every time something is modified in the user
interface, the xml file for representing the modified object gets
changed to reflect this. Similarly, when we need to retrieve
information about an object, we get this from the XML file.
Recently in addition to this "permanent" storage layer, I added an
"in-memory" storage layer on top of this. This was due to the fact
that all of the file accesses were really slowing things down, and it
was a *huge* speed improvement to add this "in-memory" layer on top.
Even more recently (well, I'm still working on it right now :-) I've
been merging this in-memory layer with the Overflow UI* library, and
using this library as the in-memory layer.
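To make the layering described above concrete, here is a minimal
sketch of an in-memory layer sitting on top of per-locus files. It is
only an illustration under stated assumptions: the LocusStore class,
its methods, and the "net1/adder" style paths are hypothetical names
invented for this example, not Piper's or Overflow's actual code.

```python
import os


class LocusStore:
    """Hypothetical write-through cache over per-locus *.def files."""

    def __init__(self, root):
        self.root = root
        self._cache = {}  # locus path -> XML text held in memory

    def _filename(self, locus_path):
        # Assumed layout: "net1/adder" maps to <root>/net1/adder.def
        return os.path.join(self.root, *locus_path.split("/")) + ".def"

    def read(self, locus_path):
        # Touch the filesystem only the first time a locus is requested.
        if locus_path not in self._cache:
            with open(self._filename(locus_path), encoding="utf-8") as handle:
                self._cache[locus_path] = handle.read()
        return self._cache[locus_path]

    def write(self, locus_path, xml_text):
        # Write through: update the file on disk, then the in-memory copy.
        filename = self._filename(locus_path)
        os.makedirs(os.path.dirname(filename), exist_ok=True)
        with open(filename, "w", encoding="utf-8") as handle:
            handle.write(xml_text)
        self._cache[locus_path] = xml_text
```

Reads after the first one are served from memory, which is where the
speedup would come from. Every write still hits the disk, and that is
the cost the proposal below wants to remove.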
So anyways (you knew I would have to get to a point somewhere in
here, didn't you :-), Jean-Marc has been suggesting that we get rid of
the "permanent" storage layer and manipulation of the XML files, and
instead use the Overflow '*.n' file format as the permanent storage
format for Piper. When we hashed this back and forth we came up with a
number of reasons to do this:
1. Speed improvement by not having to access the filesystem so much.
2. The "individual XML files for each locus" format will not scale
well to large networks (Jean-Marc has some with *tons* of nodes).
3. The *.n file format is more compact and thus easier to transport
between computers.
4. The naming of the *.def files is ugly and hard to handle,
especially when there are tons of nodes.
5. Storing these *.def files could get to be a beast, especially if
you are working with tons of different networks at once.
On the other side, the best reason to retain the '*.def' file
manipulation strategy is that if we start having tons and tons of
options that could be set in *.def files (i.e. having to do with XML
descriptions of user interface components) this could make the *.n
files get huge, bloated and hard to comprehend.
After a lot of discussion, I guess I'm on the side of Jean-Marc
(which is why I'm writing this mail, after all) and think it might be
best to do away with the copying and manipulation of *.def files.
These *.def files will still be used to define a node (as they are
used right now in Overflow and Piper), but will just be like
C header files -- descriptions. Internally, the dl will manipulate
the information about the work flow diagram in the "in-memory" layer
and use the *.n format for permanent storage. Since the dl runs as a
separate process from the UI, crash recovery will still be possible
since we can just save a *.n file as a crash file, and then re-load
from that.
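As a rough illustration of the crash-file idea, the sketch below
saves the whole in-memory network to one file and reloads it after a
restart. Everything in it is an assumption made for the example: JSON
stands in for the real *.n format (whose layout is not described
here), and save_snapshot, load_snapshot, and the dictionary structure
are hypothetical names, not part of the dl.

```python
import json
import os
import tempfile


def save_snapshot(network, filename):
    """Write the whole network to a single file, replacing it atomically."""
    directory = os.path.dirname(os.path.abspath(filename))
    # Write to a temporary file first so a crash in the middle of the
    # save never leaves a half-written snapshot behind.
    fd, temp_name = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(network, handle)
        os.replace(temp_name, filename)
    except BaseException:
        os.unlink(temp_name)
        raise


def load_snapshot(filename):
    """Return the saved network, or None if no snapshot exists."""
    if not os.path.exists(filename):
        return None
    with open(filename, encoding="utf-8") as handle:
        return json.load(handle)
```

A process holding the network in memory could call save_snapshot
after each change (or on a timer) and call load_snapshot at startup
to pick up where it left off.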
So at any rate, this is a proposal to do away with the *.def
file manipulation in the dl. What do people think about this? Are
there any proponents of leaving it the old way? Other things that
should be considered? Comments are very welcome!
Brad