[Pipet Devel] and still more infrastructure things

J.W. Bizzaro bizzaro at bc.edu
Sun Feb 28 06:44:03 EST 1999


We're both late night or early morning people, huh? :-)

Justin Bradford wrote:

> I think I need a clarification on the meaning of a locus. My understanding
> was that a locus is a term covering an instance of Porta/Gatekeeper/analysis
> tool(s) on a computer somewhere. It's just a place where analysis is done,
> and that's it. The wfs system worries about direction of the whole object.

"Locus" just means any program or object, and I mean _any_.  The name "Loci"
then emphasizes that this is a distributed system to the extreme.  But usually I
mean a client or server process.

> I agree. I had intended to make a generic C/BS/BioML2 format first. Then
> this would be what's under the data sections, so LociML would just
> encapsulate that portion of it.

Fine with me.

> As for the algorithm and statistics stuff, I was thinking of that as
> something potentially useful to keep in with sequence/structure/relation
> data. For instance, it could be useful to know a structure was derived
> using some particular X-ray crystallography technique. That stuff is
> related to Loci.

Hmmm.  It almost lies between biological and workflow data.  I suppose it could
go either place, but the workflow stuff is just temporary really.  When the data
is to be archived, we don't need to keep old status and query data around.

> > > Status is information concerning the data returned at each analysis step.
> > ...what was collected along the way
> 
> More specifically, how the collection went. Actual data would get stuck
> back in a block under <data>.

Okay.

> > Nice.  But how will Paos handle this?  Are we looking at some major changes to
> > Paos itself?
> 
> I don't think so. My intention was to have the wfs only send what that
> specific analysis needed. Input, output, and status each have an attribute
> on the object. The wfs sends input once, reads output once (and merges the
> new data with the full object), and gets constant updates on the status
> attribute. So whenever the analysis tool changes status, the wfs knows,
> and the benchtop can be updated (assuming any are paying attention at the
> moment).

This brings up a question I wrote at the end of this e-mail.
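
A minimal sketch of the input/output/status scheme quoted above, in modern Python. This is not the Paos API: `AnalysisObject`, `run_step`, and `reverse_tool` are hypothetical names, and the status callback only stands in for however the wfs would actually be notified.

```python
from typing import Callable, Optional


class AnalysisObject:
    """Hypothetical stand-in for a Paos object with three attributes."""

    def __init__(self, on_status: Optional[Callable[[str], None]] = None):
        self.input = None       # written once by the wfs
        self.output = None      # read once by the wfs
        self._status = "idle"
        self._on_status = on_status

    @property
    def status(self) -> str:
        return self._status

    @status.setter
    def status(self, value: str) -> None:
        self._status = value
        if self._on_status is not None:
            self._on_status(value)   # the wfs hears about every change


def run_step(full_object: dict, needed_keys: list, tool) -> list:
    """Send only the needed input, merge the output, collect status updates."""
    seen = []
    obj = AnalysisObject(on_status=seen.append)
    obj.input = {k: full_object[k] for k in needed_keys}   # send input once
    tool(obj)                                              # tool sets status and output
    full_object.update(obj.output or {})                   # merge output once
    return seen


def reverse_tool(obj: AnalysisObject) -> None:
    """Toy analysis tool (an assumption): reverses a sequence string."""
    obj.status = "running"
    obj.output = {"reversed": obj.input["sequence"][::-1]}
    obj.status = "done"


record = {"sequence": "ACGT", "notes": "not needed by the tool"}
updates = run_step(record, ["sequence"], reverse_tool)
```

Here the tool never sees `notes`, and `updates` is what a benchtop could display if it were paying attention.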

> > Right.  That'd save time, but be difficult to manage.  Now we're talking about
> > concurrency.
> >
> > Hmmm.  Now are we dealing with the whole forking/sewing issue here?  Once an
> > XML object is split up, will it have to be put back together again?
> 
> Concerning the dependency scheduling, it wouldn't be difficult to manage
> this from a central server, as I was envisioning the wfs. If an object
> roamed independently, it would be difficult to manage, unless we had
> all of the threads regroup when data needed to be rejoined.

Of course we can deal with this after we are comfortable with the basic wfs.
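
For later reference, a rough sketch of the fork-and-rejoin idea under central control, in modern Python. It assumes the branches are independent and that each returns a dictionary to merge into the whole object; `fork_and_rejoin`, `gc_content`, and `length` are hypothetical names, and a thread pool is just one possible way to run the branches.

```python
from concurrent.futures import ThreadPoolExecutor


def gc_content(seq: str) -> dict:
    """Toy analysis: fraction of G and C in the sequence."""
    return {"gc": (seq.count("G") + seq.count("C")) / len(seq)}


def length(seq: str) -> dict:
    """Toy analysis: sequence length."""
    return {"length": len(seq)}


def fork_and_rejoin(record: dict, analyses) -> dict:
    """Run independent analyses concurrently, then merge results centrally."""
    merged = dict(record)
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(fn, record["sequence"]) for fn in analyses]
        for future in futures:           # rejoin point: wait for every branch
            merged.update(future.result())
    return merged


result = fork_and_rejoin({"sequence": "ACGT"}, [gc_content, length])
```

Because the merge happens in one place, nothing has to be "sewn" together by the branches themselves.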

> Yes, it needs to be restructured. Many of the ID numbers would be assigned
> by the GCL to XML query translator.

Okay.

> > > The wfs identifies queries it can currently run
> >
> > How?  By the database of available loci/clients?
> 
> However GCL defines it. I imagine explicitly naming a server as one
> option, or just specifying a type of analysis, where the wfs will use a
> list of some kind to find one available.
> But before it contacts the server, it has to make sure it has all of the
> data available for its query (check dependencies).

Yes.  We define dependencies as data, servers, and clients (loci).
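
The dependency check could look roughly like this in modern Python. The layout of `queries` and the names `runnable_queries`, `blast_server`, and `tree_server` are assumptions for illustration; GCL may define all of this differently.

```python
def runnable_queries(queries: dict, available_data: set, available_loci: set) -> list:
    """Return IDs of queries whose data and locus dependencies are all met."""
    ready = []
    for query_id, deps in queries.items():
        data_ok = set(deps["data"]) <= available_data   # all needed data present
        loci_ok = set(deps["loci"]) <= available_loci   # all needed loci reachable
        if data_ok and loci_ok:
            ready.append(query_id)
    return ready


queries = {
    "q1": {"data": ["sequence"], "loci": ["blast_server"]},
    "q2": {"data": ["sequence", "alignment"], "loci": ["tree_server"]},
}
ready = runnable_queries(queries, {"sequence"}, {"blast_server", "tree_server"})
```

In this example `q2` has to wait because no `alignment` data exists yet, even though its server is available.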

> > > giving it only the portions
> > > of the xml file necessary for it to run (query and relevant data
> > > sections).
> >
> > Yeah, this is where I see Porta Internet or Gatekeeper filtering out stuff
> > the server-side algorithms/databases don't need.
> 
> I had imagined the wfs server doing that, but I imagine our difference is
> in semantics. Basically, the analysis tool just gets what it needs.

Right, we agree on the end but not the means... We'll sort that out.
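
Whichever component ends up doing it, the filtering step might be sketched like this in modern Python with the standard `xml.etree.ElementTree` module. The element and attribute names (`locus`, `query`, `needs`, `data`, `name`) are invented for the example and are not a proposed LociML schema.

```python
import xml.etree.ElementTree as ET

DOCUMENT = """
<locus>
  <query id="q1" needs="sequence"/>
  <data name="sequence">ACGT</data>
  <data name="structure">placeholder</data>
  <status><message>Ok</message></status>
</locus>
"""


def subset_for_query(xml_text: str, query_id: str) -> ET.Element:
    """Build a new tree holding only the query and the data blocks it names."""
    root = ET.fromstring(xml_text.strip())
    query = root.find(f"query[@id='{query_id}']")
    needed = set(query.get("needs", "").split())
    subset = ET.Element(root.tag)
    subset.append(query)
    for block in root.findall("data"):
        if block.get("name") in needed:
            subset.append(block)
    return subset


subset = subset_for_query(DOCUMENT, "q1")
names = [child.get("name") for child in subset.findall("data")]
```

The analysis tool would then receive `subset` and never see the structure or status sections.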

> At the very least, I can pass blocks of XML through attributes on the paos
> object. It would be interesting to see if the Paos object could be a
> mirror of the XML, however.
> So:
> <status>
>  <message>Ok</message>
> </status>
> Becomes:
> paos_object.status.message = 'Ok'
> 
> But I can work without that.
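
One way the mirroring above might be done, as a sketch in modern Python rather than anything Paos provides. The `mirror` function and the `<paos>` wrapper element are hypothetical.

```python
import xml.etree.ElementTree as ET
from types import SimpleNamespace


def mirror(element: ET.Element):
    """Turn an element into nested attributes; leaf elements become their text."""
    children = list(element)
    if not children:
        return (element.text or "").strip()
    return SimpleNamespace(**{child.tag: mirror(child) for child in children})


root = ET.fromstring("<paos><status><message>Ok</message></status></paos>")
paos_object = mirror(root)
```

After this, `paos_object.status.message` holds the string `Ok`. The sketch ignores XML attributes, and repeated sibling tags would overwrite one another, so a real mirror would need rules for both.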

That brings up a big question I had, and where I've been getting confused...

Is there really any such thing as an "XML object"?  I mean, XML is a way to save
structured data as a _file_.  Python objects, on the other hand, are data
structures in memory.  We would just be going back and forth between file and
object using XML.

So, where do we really need XML?  Could the data just be a Python object?  If we
need to save the object, I think it can just be "pickled"?
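
For comparison, a minimal pickle round trip in modern Python. The record layout is made up for illustration.

```python
import pickle
from types import SimpleNamespace

# A small in-memory object with the same shape as the XML example.
record = SimpleNamespace(sequence="ACGT", status=SimpleNamespace(message="Ok"))

blob = pickle.dumps(record)      # bytes that could be written to a file
restored = pickle.loads(blob)    # an equal but separate object
```

Two things to weigh: a pickle is readable only from Python, and unpickling data from an untrusted source can execute arbitrary code, which matters if objects travel between machines.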

Konrad?


Guten Morgen!
-- 
J.W. Bizzaro                  Phone: 617-552-3905
Boston College                mailto:bizzaro at bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--


