[Pipet Devel] Idl

J.W. Bizzaro bizzaro at geoserve.net
Tue Mar 28 00:52:56 EST 2000

Brad Chapman wrote:
> For the SUBNET interface:
>     These are for accessing whole subnets or programs (so GMS type
> nodes) after they have been sent a workflow/xml_structure to process
> (ie. a whole bunch of connected nodes), right?
>     My question is, do we really need this level of control so that
> someone can
> manipulate a process so much while it is running? The way I was
> thinking of the GUI and processing engine working together is that
> the user spends some time getting a nice process of connected nodes
> set up, and then hits run and waits for the output to come out at the
> specific nodes/subnets they are interested in. If this is the case,
> then why and when would they need to access a process to remove,
> activate, destruct, add, etc. nodes and subnets?

I for one am very much in favor of this level of control.  As computational
biologists, we may be working on analyses involving terabytes of data and days
of calculations.  Should the network just sit there unusable during these
analyses?  Should entire jobs be scrapped if it turns out one node needs to be

I think this level of control is akin to the use of threading in an
application.  If threading is not used for compute-intensive tasks, the GUI
freezes up and the application is unusable.  Sure, it's still running, and you
can always stop it.  But the loss in usability is both perceived by the user
and quite real.

>     Am I being naive here in thinking that we don't need all of this
> control of a running process? Perhaps what we do need though is a
> method for cancelling a run and getting as much information as you
> can from it...

Well, it may be simpler to not worry about it at this stage.

> For the following functions--do we need the processor to know about
> the description of the program? Can all of this type of information be
> stored in xml in the "middle scripting thingy"?

This is mostly useful to the user and should be kept as XML meta-data.  I
don't know what use the core has for it.

> For the following functions: I don't really understand the set status
> (how does the scripting engine set the status of a process?). I tried
> to include GETstatus type stuff in my query_process function. But
> maybe we should split this querying to be specific for a subnet or
> node?

The back-end or applications should set the status, which is reported back to
the user through the datastream.

> For the following, I'm not really positive why the middle would need
> to know this stuff? Do you want the GUI to restrict only allowed
> things to be connected? This seems *really* hard to maintain--maybe it
> would be better to have the processing engine generate errors and then
> propagate them back to the GUI so the user can fix them? The idl
> probably does need a better way to return back error messages.

Hmmm.  I have been in favor of the GUI allowing only compatible connections
(same data type).  If we assign types (e.g., mime) to data, we can determine
what will and won't work.
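
A minimal sketch of such a check, assuming each output and input port
carries a MIME-style type string.  The function names and the wildcard rule
are hypothetical and are not part of Loci, Overflow, or the proposed IDL:

```python
# Hypothetical sketch: the port/type model below is an assumption made
# for illustration only.

def split_mime(mime_type):
    """Split 'type/subtype' into a lowercase (type, subtype) pair."""
    major, sep, minor = mime_type.strip().lower().partition("/")
    if not sep or not major or not minor:
        raise ValueError("not a MIME type: %r" % mime_type)
    return major, minor


def can_connect(output_type, input_type):
    """Return True if data of output_type may flow into an input that
    accepts input_type.  A '*' on the input side acts as a wildcard."""
    out_major, out_minor = split_mime(output_type)
    in_major, in_minor = split_mime(input_type)
    return in_major in ("*", out_major) and in_minor in ("*", out_minor)
```

The GUI could call something like can_connect() before drawing a connector
and refuse the connection when it returns False.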

> For the following, do we still need the processing engine to store a
> log/history of events, or should this be handled by the middle
> scripting part, which can maintain logs for all its communication?

We can make a 'front' that records the datastream to a file.
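
As a rough sketch of that idea, assuming the datastream arrives as a
sequence of text messages (the class name, method name, and message format
here are hypothetical):

```python
import io


class RecordingFront:
    """Hypothetical front that appends every datastream message to a log."""

    def __init__(self, log):
        # 'log' is any writable text file object, e.g. open(path, "a").
        self.log = log

    def receive(self, message):
        # Write one message per line and flush so the log survives a crash.
        self.log.write(message.rstrip("\n") + "\n")
        self.log.flush()


def demo():
    """Record two made-up messages into an in-memory log and return it."""
    buffer = io.StringIO()
    front = RecordingFront(buffer)
    front.receive("<status node='n1'>running</status>")
    front.receive("<status node='n1'>done</status>")
    return buffer.getvalue()
```

In practice the in-memory buffer would be replaced by a real file opened in
append mode.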

>     I think the same kind of comments I made above apply here: how
> about if we just send all of this information to the middle as XML. I
> think that the current Overflow XML will be a good point to start with
> and we can work from there.

Okay by me.

> What I need to get working is a way to
> turn the directory/hierarchy based file storage system that Loci
> currently has into the type of flat XML that Overflow uses, so I can
> pass this. While the hierarchical structure is best for dealing with
> the GUI (since I have to make lots of changes to it as the GUI
> changes, it is easier to break it up into smaller chunks for
> manageability and speed), the flat file format is what we need so that
> we can pass a single file to the processing engine to process (and to
> stay compatible with what Overflow does since they have a good thing
> going).

You pretty much summed up Gary's and my own thoughts on this network structure
defined with XML.  A flat file is easy to store and pass but difficult to
query.  Breaking it up makes it easy to query but difficult to store and
pass.  The solution seemed to be the use of a real database.
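
To illustrate the trade-off, here is a minimal sketch that keeps the network
in a small database for querying and exports a single flat XML document for
passing around.  It uses Python's sqlite3 module purely as a stand-in for
"a real database"; the table layout and the element names are assumptions
and do not follow the actual Overflow format:

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_db():
    """Create an in-memory database holding a tiny example network."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE node (id TEXT PRIMARY KEY, kind TEXT)")
    db.execute("CREATE TABLE link (src TEXT, dst TEXT)")
    db.executemany("INSERT INTO node VALUES (?, ?)",
                   [("n1", "reader"), ("n2", "aligner"), ("n3", "viewer")])
    db.executemany("INSERT INTO link VALUES (?, ?)",
                   [("n1", "n2"), ("n2", "n3")])
    return db


def downstream(db, node_id):
    """Query: which nodes receive data directly from node_id?"""
    rows = db.execute("SELECT dst FROM link WHERE src = ? ORDER BY dst",
                      (node_id,))
    return [row[0] for row in rows]


def to_flat_xml(db):
    """Export the whole network as one flat XML string."""
    root = ET.Element("network")
    for node_id, kind in db.execute("SELECT id, kind FROM node ORDER BY id"):
        ET.SubElement(root, "node", {"id": node_id, "kind": kind})
    for src, dst in db.execute("SELECT src, dst FROM link ORDER BY src, dst"):
        ET.SubElement(root, "link", {"src": src, "dst": dst})
    return ET.tostring(root, encoding="unicode")
```

Queries such as downstream(db, "n1") stay cheap, while to_flat_xml(db)
produces the single file that can be stored or handed to the processing
engine.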

> We also will not want to pass all of the info that is stored
> in XML in the middle (like descriptions of programs, etc.) to the
> processing engine, so some kind of transformation will be needed. I
> need to start trying to think about how to do this, but I hope it
> shouldn't be too bad...

Descriptions of GUIs, libraries of widgets, etc. are not put into this
datastream but given a link so that the fronts and backs can get the stuff
themselves.  This way, the system remains neutral about protocol, and things
can be spread throughout the Internet.
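
One way to picture the transformation, as a sketch only: drop each embedded
description and leave a link in its place.  The element names, the
"description-href" attribute, and the example.org URL are all hypothetical:

```python
import xml.etree.ElementTree as ET


def strip_for_engine(middle_xml, base_url):
    """Return a copy of the XML with each <description> child of a <node>
    removed and replaced by a link attribute on that node."""
    root = ET.fromstring(middle_xml)
    # Take a list first so the tree is not modified while iterating over it.
    for node in list(root.iter("node")):
        for desc in node.findall("description"):
            node.remove(desc)
            node.set("description-href",
                     base_url + node.get("id", "") + "/description")
    return ET.tostring(root, encoding="unicode")


EXAMPLE = ('<network><node id="n1">'
           '<description>Aligns sequences</description>'
           '</node></network>')
```

Calling strip_for_engine(EXAMPLE, "http://example.org/meta/") yields a
document in which the node carries only a link, so a front or back that
wants the description can fetch it separately.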

                      |           J.W. Bizzaro           |
                      |                                  |
                      | http://bioinformatics.org/~jeff/ |
                      |                                  |
                      |        BIOINFORMATICS.ORG        |
                      |           The Open Lab           |
                      |                                  |
                      |    http://bioinformatics.org/    |
