[Pipet Devel] Resend: workflow diagram data model and databases

J.W. Bizzaro bizzaro at geoserve.net
Sun Jan 9 14:31:41 EST 2000

Brad Chapman wrote:
> Hello all! I actually sent this a couple of days ago but I just realized
> that I don't think it ever made it to the list (at least, it never made it
> back to me), although it somehow managed to get in the archives. Weirdo!
> Anyways, my apologies for the re-send if I'm the only one who failed to get
> it.

Sendmail went down a while ago.  That might have been the problem.

> So basically, the overall plan is that every workspace gets a unique
> directory created containing baselocus.xml, an xml file with links to each
> of the loci in the workspace, and xml files for each locus in the workspace.

As Gary mentioned, we will want to use a real database for all this.  What
you've done is a step in the right direction though.

>         This is much like the ugly loci-file window did before, except that
> now things are done in DOM trees. Unfortunately, dealing with DOM trees
> also has led to a big slow-down in the time it takes to walk through a
> directory tree and write it as xml. To sort-of counteract this, the
> directory structure will only be parsed to a certain depth (currently it is
> set to something like 3). I'll try to think up speed-ups, but dealing with
> DOM trees slows things down. Sorry!

I would favor parsing only the highest-level (depth = 1) directory.  Would
that speed things up?  It seems to me that parsing to depth = 3 or more would
give you much more to parse, and it would be done every time a new directory is
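
To make the depth question concrete, here is a minimal Python sketch of a
depth-limited directory walk that builds a DOM tree.  It is not the actual
Loci code: the function name dir_to_dom, the max_depth parameter, and the
"directory"/"file" element names are all assumptions made for illustration.

```python
import os
from xml.dom.minidom import Document


def dir_to_dom(root, max_depth=1):
    """Describe `root` as a DOM tree, listing entries at most `max_depth` levels down.

    Element names ("directory", "file") are hypothetical, not the Loci schema.
    """
    doc = Document()

    def add_entries(parent, path, depth):
        # Stop descending once the depth limit is exceeded.
        if depth > max_depth:
            return
        for name in sorted(os.listdir(path)):
            full = os.path.join(path, name)
            is_dir = os.path.isdir(full)
            node = doc.createElement("directory" if is_dir else "file")
            node.setAttribute("name", name)
            parent.appendChild(node)
            if is_dir:
                add_entries(node, full, depth + 1)

    top = doc.createElement("directory")
    top.setAttribute("name", os.path.basename(os.path.abspath(root)))
    doc.appendChild(top)
    add_entries(top, root, 1)
    return doc
```

With max_depth=1 only the entries directly inside the directory are listed;
subdirectories show up as empty elements and their contents are not read.
Calling dir_to_dom(path).toprettyxml() gives the XML text.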

> Right, excellent point! When a workspace/composite locus is created within
> another workspace, the newly created workspace directory should be inside
> the previous workspace directory. I'll try to make my xml model do this.

A WFD represents any contiguously connected group of loci.  It can be as small
as a single, disconnected locus, and it has no upper size limit.  The
boundaries of a WFD are the unconnected connectors.
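
Read this way, the WFDs in a workspace are the connected components of the
graph whose nodes are loci and whose edges are connections.  A minimal Python
sketch under that reading follows; the function name find_wfds and the plain
string identifiers are assumptions for illustration, not part of Loci.

```python
from collections import defaultdict


def find_wfds(loci, connections):
    """Group loci into WFDs, i.e. sets of loci joined by connections.

    `loci` is an iterable of locus ids; `connections` is an iterable of
    (locus_a, locus_b) pairs.  Both representations are hypothetical.
    """
    neighbors = defaultdict(set)
    for a, b in connections:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    wfds = []
    for start in loci:
        if start in seen:
            continue
        # Depth-first search collects everything reachable from `start`.
        group = set()
        stack = [start]
        while stack:
            locus = stack.pop()
            if locus in group:
                continue
            group.add(locus)
            stack.extend(neighbors[locus] - group)
        seen |= group
        wfds.append(group)
    return wfds
```

A single disconnected locus comes back as a WFD of size one, matching the
lower bound described above.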

A WFD wholly enclosed by its boundaries (the WFD is unconnected) should be
represented by its own portable data structure (a database).

A WFD may be represented on a higher level as a composite locus.  The result
is nested structures: WFDs inside of WFDs and Workspaces (the windowlet view)
inside of Workspaces.  But by connecting a composite locus to other loci, its
WFD becomes connected.

A WFD not wholly enclosed by its boundaries (connected via a higher-level
composite locus) may still be its own portable data structure, but this would
require connectivity between these structures (is this possible?).  Otherwise,
the connection of WFDs via composites will require integration of data

> I think making it xml makes it intrinsically portable. Once you create an
> xml workspace, you can zip it up (or use an xml compression tool) and send
> it around to your heart's content.

Gary and I figured that if we wanted portability, a single XML file would be
best.  But if we had one giant XML file, it would be difficult to
hash/search.  This is why we settled on an XML-capable database.

> I think the directory structure that I currently have could be shoved into
> a  database in the following way:
> directories             -> main databases
> xml files               -> sub-databases within the main database
> info in xml files       -> the column/row info within the sub-database

Looks good.  But we wouldn't use the filesystem as an intermediate, right?
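
As a rough illustration of loading locus XML straight into a database without
going through files on disk, here is a small Python sketch.  SQLite stands in
for whatever DBMS is eventually chosen, and the table layout, the sample XML,
and the names open_workspace and store_locus are all assumptions.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical locus description; the real Loci XML format may differ.
LOCUS_XML = '<locus id="152415624" name="example"><connector id="1325634"/></locus>'


def open_workspace():
    """Create an in-memory database for one workspace (no filesystem involved)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE loci (id TEXT PRIMARY KEY, name TEXT)")
    db.execute(
        "CREATE TABLE connectors ("
        "id TEXT PRIMARY KEY, locus_id TEXT REFERENCES loci(id))"
    )
    return db


def store_locus(db, xml_text):
    """Parse one locus description and store its fields as rows."""
    locus = ET.fromstring(xml_text)
    db.execute(
        "INSERT INTO loci (id, name) VALUES (?, ?)",
        (locus.get("id"), locus.get("name")),
    )
    for conn in locus.findall("connector"):
        db.execute(
            "INSERT INTO connectors (id, locus_id) VALUES (?, ?)",
            (conn.get("id"), locus.get("id")),
        )
```

Here the workspace plays the role of the "main database", each table the role
of a "sub-database", and the XML attributes become column values.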

> Okay. I've connected loci using the xml:link linking language. How does
> this sound? Once we get the ability to disconnect links working, I think it
> shouldn't be too hard to disconnect the xml:links.

That might work.

> >The DBMS should operate as client/server processes in order to
> >accommodate distributed processing requirements.
> Do we want to have a DBMS as a client/server process separate from the Loci
> client/server stuff, or as a part of it?

Heh.  Actually, Gary and I spent a lot of time discussing how we can split
Loci up into functional layers, running as separate processes.  It turns out
that nearly all of the use of XML will belong to the 'middleware' process. 
The Workspace/Desktop will be a 'thin-client, frontend' process that
communicates with the middleware via a high-level API.  (So, about half of
your work will eventually run from separate processes, Brad.)

For the sake of allowing other frontends to be used with Loci (for example, a
Web interface), the API will be a 'dialog' between the processes, where the
frontend requests that the middleware do something, and the middleware
responds to that request.  For example:

    frontend: connect <locus=id#152415624 connector=id#1325634>
        <locus=id#8256356 connector=id#2832637>
    middleware: connect <locus=id#152415624 connector=id#1325634>
        <locus=id#8256356 connector=id#2832637>
    frontend: composite <locus id#152415624> <locus id#8256356>
    middleware: composite <locus id#152415624> <locus id#8256356>
    middleware: xml description of the composite gui follows

And the workspace (frontend) won't actually make connections, etc. until it
gets the command echoed back, indicating it is okay to proceed.  I'm thinking
that the frontend will be pretty dumb, doing only what the user and middleware
say to do.
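
The echo-back handshake could look something like the following Python sketch.
This is only an illustration: the class names, the "error" reply, and the
simplified "locus:connector" endpoint strings are assumptions, and the two
sides are plain objects here rather than separate processes.

```python
class Middleware:
    """Hypothetical middleware: checks a request and echoes it back if accepted."""

    def __init__(self):
        self.connections = set()

    def handle(self, request):
        command, *args = request.split()
        if command == "connect" and len(args) == 2:
            self.connections.add(tuple(args))
            return request  # an exact echo means "okay to proceed"
        # The "error" prefix is an assumed convention for rejected requests.
        return "error " + request


class Frontend:
    """Hypothetical thin client: draws a connection only after the echo arrives."""

    def __init__(self, middleware):
        self.middleware = middleware
        self.drawn = []

    def connect(self, a, b):
        request = f"connect {a} {b}"
        reply = self.middleware.handle(request)
        if reply == request:
            self.drawn.append((a, b))
            return True
        return False
```

For example, Frontend(Middleware()).connect("152415624:1325634",
"8256356:2832637") records the connection on both sides, while any request the
middleware does not echo leaves the frontend unchanged.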

So, you can see how various frontends can be made for Loci, including the
mysterious 'NLI'.  Maybe Gary will explain this some more ;-)

> >The DBMS should be able to quickly provide an XML description of
> >information stored inside the database.
> Okay, so we need xml to database and database to xml converters, right?



                      |           J.W. Bizzaro           |
                      |                                  |
                      | http://bioinformatics.org/~jeff/ |
                      |                                  |
                      |           THE OPEN LAB           |
                      |    Open Source Bioinformatics    |
                      |                                  |
                      |    http://bioinformatics.org/    |
