[Pipet Devel] from Justin

Humberto Ortiz Zuazaga hortiz at neurobio.upr.clu.edu
Tue Jun 15 15:12:32 EDT 1999

> > I think Lincoln Stein's boulder file format is a good example here.
> Do you have a reference or URL for this?  I do understand what you're
> suggesting.

David already posted the link; I hope everyone has a chance to look at it. 
Boulderio is a much simpler solution to the same problem we're looking at: how 
to pipeline biological data through a set of command-line programs. It was 
designed for CGI-based apps, and is written in Perl.
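
To give a feel for that style of pipelining, here is a minimal sketch in 
Python (not Boulderio's own Perl API). It assumes only the simplest flat form 
of a boulder stream: TAG=VALUE lines, with a line holding a lone "=" ending 
each record. Nested records are ignored, and the function name parse_records 
and the sample tags are made up for illustration.

```python
import io

def parse_records(stream):
    """Yield one dict per record from a flat TAG=VALUE stream.

    Assumption: a line containing only "=" ends a record. Repeated tags
    are collected in a list; nested records are not handled.
    """
    record = {}
    for line in stream:
        line = line.rstrip("\n")
        if line == "=":
            # End of record: hand it to the next stage of the pipeline.
            yield record
            record = {}
        elif "=" in line:
            tag, value = line.split("=", 1)
            record.setdefault(tag, []).append(value)
    if record:
        # Tolerate a final record that lacks the closing "=" line.
        yield record

# Hypothetical two-record stream, as one program might write it to stdout.
sample = io.StringIO("NAME=seq1\nLENGTH=120\n=\nNAME=seq2\nLENGTH=98\n=\n")
records = list(parse_records(sample))
```

Each program in the chain reads records like this from its input, adds or 
changes tags, and writes them back out for the next program.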

> There is a difference, however, in how we see the locus functioning.  I think
> you're saying the locus waits until _runtime_ to check for availability, when
> the next component in the workpath is needed.  It then decides on its own what
> to do next.
> But I'm saying that this should be decided at _design_time_, and ultimately by
> the locus developer.

Yes, I was thinking of runtime checking for needed loci, when a user wishes to 
use a locus he's never used before.

>  There are several advantages to checking availability at
> design time:
>     (1) The workpath and all components can be displayed in the WFD
>     (2) The workflow won't be interrupted constantly at runtime
>     (3) The developer has more control
>     (4) The locus will require less "AI" to make decisions.

But you can't put together a locus that depends on the availability of a 
network resource (the net may not be up).

I think we're talking about two different steps:

Packaging (developing) a composite locus:
you do want "compile time" checking for all needed subcomponents to transform 
your stated input to the correct output.

Using a locus (composite or native):
you want run time checking for any matching inputs and outputs.

This way, you're not limited by the ways the developer thought his locus could 
be used, as long as you can find a shim from your outputs to its inputs.
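
A rough sketch of what that run time search for a shim could look like, in 
Python. Everything here is hypothetical: the SHIMS registry, the format names, 
and the function find_shim_path are invented for illustration, and a real 
locus would describe its inputs and outputs in more detail than a single 
format name.

```python
from collections import deque

# Hypothetical registry of available shims: (from_format, to_format).
SHIMS = [("genbank", "fasta"), ("fasta", "boulder"), ("embl", "genbank")]

def find_shim_path(have, need, shims=SHIMS):
    """Breadth-first search for a chain of shims from `have` to `need`.

    Returns the list of formats visited, starting at `have` and ending
    at `need`, or None if no chain of shims connects them.
    """
    queue = deque([[have]])
    seen = {have}
    while queue:
        path = queue.popleft()
        if path[-1] == need:
            return path
        for src, dst in shims:
            if src == path[-1] and dst not in seen:
                seen.add(dst)
                queue.append(path + [dst])
    return None
```

Breadth-first search returns a shortest chain, so the user is offered the 
conversion with the fewest intermediate steps; a result of None means no 
combination of the known shims fits.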

This is one of the biggest problems with current bioinformatics programs: why 
can't I just take the output of program X and use it as input to program Y?

> So the chain of command would be more like this:
>     (1) The locus finds all immediately available solutions
>     (2) The locid finds all remotely available solutions
>     (3) The developer is given a list and decides what to do next

Replace "developer" with "user", do this at run time when you run a locus, and 
you've got my idea down too.
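
As a sketch of that chain of command at run time, in Python: the names 
gather_solutions and choose, and the dictionaries mapping a locus name to the 
format it produces, are assumptions made for this example, not anything we 
have designed yet.

```python
def gather_solutions(need, local_loci, remote_loci):
    """List the loci that can produce `need`, local ones first.

    `local_loci` and `remote_loci` are hypothetical dicts mapping a
    locus name to the format it outputs.
    """
    local = [(name, "local") for name, out in local_loci.items() if out == need]
    remote = [(name, "remote") for name, out in remote_loci.items() if out == need]
    return local + remote

def choose(candidates, pick=lambda options: options[0]):
    """Let the user decide; `pick` stands in for an interactive prompt."""
    if not candidates:
        return None
    return pick(candidates)

# Hypothetical loci known to the local locus and to the remote locid.
local_loci = {"blast2fasta": "fasta", "seqstats": "table"}
remote_loci = {"gb2fasta": "fasta"}

options = gather_solutions("fasta", local_loci, remote_loci)
selected = choose(options)
```

The default pick simply takes the first (local) candidate; in practice it 
would be replaced by whatever the Workspace uses to ask the user.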

> What I'd like to see is the locus respond ("I need v3.2 foo objects as input")
> to the Workspace at design time to let the developer know what is needed.  Sure,
> it's simpler not to get the user involved in this, but we are talking about a
> developer constructing a command-line, something that may not be done by
> everyone.

But that's the point: everyone is going to be constructing a command line, 
every time they run an analysis. In many cases, it will be the trivial command 
line input -> canned locus -> output, but often it won't be so simple. The 
people who really need help running the tools aren't the developers, they're 
the bench scientists.

Humberto Ortiz Zuazaga
Bioinformatics Specialist
Institute of Neurobiology
hortiz at neurobio.upr.clu.edu
