[Pipet Devel] BioML vs BSML

Tue Jan 26 02:21:33 EST 1999

> I don't think my opinion is so relevant; my field of work is rather
> different from the Loci project. I work on structures, and BioML
> does not seem to have any provision for structures at all. Which is
> fine, of course, not everything has to be designed for my needs ;-)
> My complaint with CML is that it claims to handle biomolecular
> structures and does it badly.

Does BSML not fulfill all of the requirements Loci needs?
I'm guessing so, since CML was also planned.
If so, what's missing?

A visualization program is going to have to know the format of the data it
gets back from the analysis program (obviously), so the XML translation
wrappers will have to be consistent. Now, we could use two different
languages, but a viewer may want data from two different tools, each with
a different ML (markup language).

Also, we'll be wanting to chain several tools together, which is going to
require tools taking input data from a ML, right?

But we also want control information tagging along with the object? And
that would also be XML data? 

Furthermore, I'd like it if this thing could query/update databases, too
(i.e., a glyph for submitting my new protein structure to Brookhaven, or
getting the sequence for some gene out of the GDB, etc.)

Now let me see if I understand the system so far.
Paos is the network transport layer. But which end does the server run on?
Jeff made a comment earlier implying the Paos server runs on the user's
machine. One client is the GCL/viewer/monitor and one is on the actual
machine running the analysis tool. But how would a connection be made
between the server and the analysis client? Doesn't the Paos server have
to be on the analysis end?

Also, a workflow/batch control system is in charge of directing the
movements of the object (via Paos). In case of failure, the Paos object is
updated with some exception, and the workflow system is notified and deals
with it appropriately.

Throughout this process, the workflow system is also updating the Paos
object with current status and the analysis programs update the object
(or create new ones?), which the monitor client is displaying for the
user. When complete, the visualization/viewer program is notified, takes
the Paos object and renders it for the user.

Am I close?
If so, it makes sense to use the Paos object to store control, exception,
and status info. Data for analysis and analyzed data are stored in
separate attributes. The gatekeeper takes the data from the appropriate
attribute (as told by relevant control information), modifies it as
necessary for the analysis tool, and runs that tool.
Output is then committed to the Paos object (after conversion to the
appropriate XML dialect by the gatekeeper), and the workflow system
decides what to do next (depending on control info), until eventually, it
is handed back to the user's client.
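
To make the gatekeeper step concrete, here is a rough Python sketch.
Everything in it is made up for illustration: JobObject is only a stand-in
for a Paos object (not the real Paos API), the control keys "input_attr" and
"output_attr" are assumed names, and the "tool" is a trivial function that
reverses a string.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class JobObject:
    """Hypothetical stand-in for a Paos object (not the real Paos API)."""
    control: dict = field(default_factory=dict)  # which attributes to read/write
    data: dict = field(default_factory=dict)     # input and output, keyed by attribute
    status: list = field(default_factory=list)   # progress messages for the monitor
    exception: Optional[str] = None              # set when a step fails


def gatekeeper(job, tool, to_tool_format, to_xml):
    """Run one analysis step: read input, convert, run the tool, commit output."""
    source = job.control["input_attr"]
    target = job.control["output_attr"]
    try:
        tool_input = to_tool_format(job.data[source])  # XML -> tool's native format
        tool_output = tool(tool_input)
        job.data[target] = to_xml(tool_output)         # native format -> XML
        job.status.append(f"{target}: done")
    except Exception as err:
        # Record the failure on the object so the workflow system can react.
        job.exception = f"{target}: {err}"
        job.status.append(f"{target}: failed")
    return job


# Toy usage: the "analysis tool" just reverses the sequence string.
job = JobObject(
    control={"input_attr": "sequence", "output_attr": "reversed"},
    data={"sequence": "<seq>ACGT</seq>"},
)
gatekeeper(
    job,
    tool=lambda s: s[::-1],
    to_tool_format=lambda xml: xml[5:-6],      # strip the <seq>...</seq> wrapper
    to_xml=lambda s: "<seq>" + s + "</seq>",
)
print(job.data["reversed"], job.status)
```

The point is only that the tool itself never sees XML or control info; the
gatekeeper does the conversion in both directions and records any failure
on the object.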

In this model, the workflow system is a Paos server/client combo. It
would get the original object from the user, hand that to an analysis
server, but keep a local copy updated, which the user (status monitor)
would access for updates. When one analysis step is done (and it has
resynced its copy of the remote object) it would delete the object on the
analysis server (remote object), and then repeat the whole process (i.e.,
give the object to the next analysis server, ...)
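
The hand-off loop I have in mind could look roughly like the Python below.
This is a sketch under assumptions, not a design: AnalysisServer and
run_batch are invented names, the "remote" servers are plain in-memory
objects, and the job object is just a dict.

```python
import copy


class AnalysisServer:
    """Hypothetical remote analysis server; holds one object per batch ID."""

    def __init__(self, name, step):
        self.name = name
        self.step = step      # function that takes and returns the object
        self.objects = {}     # batch_id -> remote copy of the object

    def submit(self, batch_id, obj):
        # The server works on its own copy, never on the caller's object.
        self.objects[batch_id] = self.step(copy.deepcopy(obj))

    def fetch(self, batch_id):
        return copy.deepcopy(self.objects[batch_id])

    def delete(self, batch_id):
        del self.objects[batch_id]


def run_batch(batch_id, obj, servers):
    """Pass the object through each server, keeping a local copy in sync."""
    local = copy.deepcopy(obj)
    for server in servers:
        server.submit(batch_id, local)
        local = server.fetch(batch_id)   # resync the local copy
        server.delete(batch_id)          # remove the remote object
        if local.get("exception"):
            local.setdefault("status", []).append(f"{server.name}: failed")
            break
        local.setdefault("status", []).append(f"{server.name}: done")
    return local


def align(obj):
    obj["data"]["alignment"] = "<alignment/>"
    return obj


def fold(obj):
    obj["data"]["structure"] = "<structure/>"
    return obj


servers = [AnalysisServer("align", align), AnalysisServer("fold", fold)]
result = run_batch(7, {"data": {}}, servers)
print(result["status"])
```

The local copy is what the status monitor would read, so the user never has
to talk to an analysis server directly.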

All the user client tools access the workflow system directly, which deals
with the individual analysis servers. The workflow system runs as a
separate process, so you might have a dedicated server running it. The
user starts up the Loci GCL program on a networked computer anywhere,
builds the analysis batch, starts it, gets an ID number, and can close the
program and walk away. Then, from any other computer with Loci (or via the
web when that interface is done), the user enters the batch ID and can see
everything that has happened to it so far along with its current status.
When it's done, the user can
save the object locally for future reference (or maybe it's moved to a
networked Loci archive system [just a Paos server]).
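
The "start a batch, get an ID, check back later" part might be as simple as
the Python sketch below. WorkflowRegistry and its methods are hypothetical
names of mine, and a real version would need persistent storage rather than
an in-memory dict.

```python
import itertools


class WorkflowRegistry:
    """Hypothetical batch registry kept by the workflow system."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._batches = {}

    def start(self, steps):
        """Register a new batch and return the ID the user walks away with."""
        batch_id = next(self._ids)
        self._batches[batch_id] = {
            "steps": list(steps),
            "history": [],
            "state": "running",
        }
        return batch_id

    def record(self, batch_id, message, state=None):
        """Called by the workflow system as the batch progresses."""
        batch = self._batches[batch_id]
        batch["history"].append(message)
        if state is not None:
            batch["state"] = state

    def status(self, batch_id):
        """What any client sees when it asks about a batch ID."""
        batch = self._batches[batch_id]
        return batch["state"], list(batch["history"])


registry = WorkflowRegistry()
batch_id = registry.start(["align", "fold"])
registry.record(batch_id, "align: done")
registry.record(batch_id, "fold: done", state="complete")
print(batch_id, registry.status(batch_id))
```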

Of course, the workflow process could be run locally as well, along with
all of the analysis tools. Also, the workflow system could implement more
than just Paos network connection to the analysis programs, such as CORBA,
COM, IRC (biobots!), etc., all of which would be transparent to the client
tools.

So is that what everyone was already thinking?

Also, whenever I said "analysis tool/server", that could be replaced with
"database query/update". 

Now what does the query language look like, and how do we embed info from
analysis and db access early in the batch into later queries? Especially
if we have multiple XML dialects that the tools speak in. Ugh. Well, I have
2.5 hours of day-dreaming/class tomorrow to come up with something.

Sorry for the ramblingness...

Justin Bradford
justin at ukans.edu