[Pipet Users] making more CLI programs Piper-compliant

Roland Walker walker at ncbi.nlm.nih.gov
Mon May 14 20:49:40 EDT 2001

[jeff writes]
> > If there is some format you need to export
> > command-line options, I'd be glad to support it.
> When CLI programs are piped together, in Piper or elsewhere, they're something
> like linked libraries: whoever or whatever does the linking needs to know
> something about the I/O parameters.  The problem is, CLI programs don't
> provide that information prior to execution.  They *DO* however have
> human-readable help output (by typing the --help or -h flag).  In Piper,
> programs can be wrapped/ported by hand.  But we'd like to automate the
> process, and probably the best way to do this would be to parse the help
> output.  There seems to be some semblance of a standard, when you look at the
> output, but I don't believe that there is any.  Anyway, if we find the least
> common denominator, we may try promoting it for CLI programs.
> Unless someone can think of a better way.  We've also considered parsing man
> files and source code.

There is absolutely a better way.  Don't parse the help output.  This
is harder than doing things the nominally hard way.

   1 First, you must wrap some CLI elements by hand.  This is needed
     to get started, yes? and you are doing it already, yes?  Wrapping
     basically means standardizing the handling of STDIN, STDOUT,
     and STDERR, trying to tame buffering, and providing some
     formalization of the options.  This is not too hard; it's my own
     stock in trade.

   2 Provide an _easy_ way for CLI authors to export options, making
     their programs drop-in elements for you.  This means not
     writing yet another standard, which authors will just ignore, but
     giving authors the needed code.  This is not as hard as it seems.

     One can write a perl module that reads the simple argument spec
     given in Getopt::Long, and translates it to Piper-spec.  4 out of
     10 interesting programs are written in perl already.  8 out of 10
     coders are amateur perl coders.  8 out of 10 amateur perl coders
     have used Getopt::Long.  So conquer Getopt::Long.  You will get
     many elements.  Then conquer the PPT (Perl Power Tools) project,
     which duplicates most basic Unix commands.

     You won't have to conquer SEALS, as I'll just give it to you.
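
To make step 1 concrete, here is a rough sketch of what "wrapping"
might look like.  It is written in Python purely for illustration; it
is not SEALS or Piper code, and the names run_wrapped and Result are
made up for this example.  It feeds STDIN in one piece and collects
STDOUT and STDERR to end-of-file, which sidesteps most pipe-buffering
trouble for short-lived programs.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class Result:
    """Everything a caller needs to know about one wrapped run."""
    stdout: str
    stderr: str
    returncode: int


def run_wrapped(argv, stdin_text=""):
    """Run a CLI program with all three standard streams under our control.

    subprocess.run() writes the input, then reads both output streams
    until the child exits, so the child cannot stall on a full pipe.
    """
    proc = subprocess.run(
        argv,
        input=stdin_text,
        capture_output=True,   # collect STDOUT and STDERR separately
        text=True,             # work with str rather than bytes
        check=False,           # report the exit status, do not raise on it
    )
    return Result(proc.stdout, proc.stderr, proc.returncode)


if __name__ == "__main__":
    # Stand-in child process: upper-cases whatever arrives on STDIN.
    child = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]
    print(run_wrapped(child, "acgt\n").stdout, end="")
```

Formalizing the options is the remaining part of wrapping, and that is
what step 2 is about.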

I assume you have written an XML spec to define the options?  That's
what I would have done if XML was around when I started my thing.
Give me your spec and give me a couple of hours, and I can give you
a Getopt::Piper that is drop-in compatible with Getopt::Long.
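
As an illustration of how little translation is involved, the sketch
below parses a few Getopt::Long-style option specifiers ("timeout=i",
"save!", "hint=s@") and emits an XML description.  Assumptions: it is
in Python rather than perl, it covers only a subset of the specifier
syntax, the XML element and attribute names are invented here (the
real Piper spec may look nothing like this), and the types given to
the gi2fasta options in the example are guesses.

```python
import re
import xml.etree.ElementTree as ET

# Subset of the Getopt::Long specifier syntax: name|alias, then one of
# "!" (negatable), "+" (counter), or =type / :type with an optional
# "@" (list) or "%" (hash) destination.
SPEC_RE = re.compile(
    r"^(?P<names>[\w-]+(?:\|[\w?-]+)*)(?P<rest>!|\+|[=:][siof][@%]?)?$"
)
TYPES = {"s": "string", "i": "integer", "o": "extended-integer", "f": "float"}
CONTAINERS = {"@": "list", "%": "hash"}


def parse_spec(spec):
    """Turn one option specifier into a plain dictionary."""
    m = SPEC_RE.match(spec)
    if m is None:
        raise ValueError(f"unsupported option specifier: {spec!r}")
    names = m.group("names").split("|")
    rest = m.group("rest") or ""
    opt = {"name": names[0], "aliases": names[1:], "type": "flag",
           "value": "none", "negatable": False, "container": "scalar"}
    if rest == "!":
        opt["negatable"] = True
    elif rest == "+":
        opt["type"] = "counter"
    elif rest:
        opt["value"] = "required" if rest[0] == "=" else "optional"
        opt["type"] = TYPES[rest[1]]
        opt["container"] = CONTAINERS.get(rest[2:], "scalar")
    return opt


def to_xml(program, specs):
    """Emit a hypothetical XML description of a program's options."""
    root = ET.Element("program", name=program)
    for spec in specs:
        opt = parse_spec(spec)
        ET.SubElement(
            root, "option",
            name=opt["name"],
            aliases=",".join(opt["aliases"]),
            type=opt["type"],
            value=opt["value"],
            negatable="yes" if opt["negatable"] else "no",
            container=opt["container"],
        )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_xml("gi2fasta", ["timeout=i", "save!", "hint=s@", "help|h"]))
```

A drop-in module would hand the same specifier list to the normal
option parser as well, so authors would not have to write anything
twice.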

I'm not claiming this is perfect, just useful as heck.  Cram new
programs into your system as hard as you can.  If you reach a certain
critical mass, all kinds of stuff will start to come your way.

There are dozens of scripts here at NCBI that use the same library for
argument processing.  For some reason argument processing and CLI
usability is an obsessive interest of mine.  Anyway, I build a data
structure that holds more than anyone could want to know about
processing the options for a CLI script.  Then I like to do more with
that than merely process the options.  For instance, each script can
generate programmable completions for tcsh based on its expected
options:

   [thorin] {/home/walker/downloads:1751} gi2fasta -tcsh_completions
   complete gi2fasta {n,c}/{-,--,}{index_method,master_db,http_server}{=,}/x:"<value>"/ {n,c}/{-,--,}{hint,only_hint}{=,}/x:"<list>"/ {n,c}/{-,--,}{dollop,timeout,tries}{=,}/x:"<integer>"/ \
   c/-{-,}/"(index_method= master_db= http_server= hint= only_hint= dollop= timeout= tries= save append delete feedback defline_as_hint swallow_failures secure insist show_spacers no_save no_append no_delete no_feedback no_defline_as_hint no_swallow_failures no_secure no_insist no_show_spacers argsets help version tcsh_completions)"/ 'c/@/`argsets --terse $:0`/'

This lets you do

   [thorin] {/home/walker/downloads:1751} gi2fasta -htt<tab>
   [thorin] {/home/walker/downloads:1751} gi2fasta -http_server= 

and generally makes -long_option_names more tolerable.
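
Generating such a line from an option table is mostly string
assembly.  The sketch below (Python, for illustration only; the
function name and its arguments are invented, and it is not the SEALS
code) produces a single, much simpler rule than the real output
above: it completes option names after one leading dash and appends
"=" to options that take a value.

```python
def tcsh_completion(program, valued, flags):
    """Build one tcsh `complete` command from two lists of option names.

    valued: names of options that take a value (completed with a
            trailing "=")
    flags:  names of options that take no value
    """
    words = [name + "=" for name in valued] + list(flags)
    # 'c/-/(...)/' means: when the current word starts with "-",
    # complete the rest of it from the parenthesized word list.
    return "complete {} 'c/-/({})/'".format(program, " ".join(words))


if __name__ == "__main__":
    print(tcsh_completion("gi2fasta",
                          ["http_server", "timeout"],
                          ["save", "no_save", "help"]))
```

The printed line can be pasted into a tcsh session or sourced from a
startup file.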

I'd generate bash and zsh completions too if I had the time; I'd
make the time if someone had the need.

Once, someone at NCBI wanted to drive SEALS scripts via CGI off of
a separate GUI.  We said, no problem, just show us what kind of
spec you want to see for the options.  The SEALS side must have 
taken about a day.

So to you Piperians, I say, show me your spec, I'll throw it into
the main SEALS library so that all scripts can be queried, say,
like this

   gi2fasta -piper_options

to return your spec, giving you a sizable wad of new CLI elements
for your system.  There is certainly some duplication, but duplication
can even be good.  I am not into the idea of a single perfect
system, or a single best way, but rather into opening up new ways,
which are new paths for users to wander.
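
For a feel of what the script side of such a query could look like,
here is a minimal sketch.  Everything in it is an assumption: it is
Python rather than the perl the SEALS library is written in, the
option table is a toy, the function name is made up, and JSON is only
a stand-in for whatever format the Piper spec actually calls for.

```python
import json
import sys

# Toy option table; a real library would build this from the same
# data structure it already uses to process the options.
OPTIONS = [
    {"name": "timeout", "type": "integer"},
    {"name": "save", "type": "flag"},
]


def maybe_describe(argv, options):
    """Return a description of the options if the query flag was given.

    Returns None when the flag is absent, so the caller can carry on
    with ordinary option processing.
    """
    if "-piper_options" in argv[1:]:
        return json.dumps({"program": argv[0], "options": options}, indent=2)
    return None


if __name__ == "__main__":
    text = maybe_describe(sys.argv, OPTIONS)
    if text is not None:
        print(text)
        sys.exit(0)
    # Ordinary option processing and the real work would follow here.
```

Because the check lives in the shared library, every script that uses
the library answers the query without being touched.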

Furthermore, you may have our module if you find it useful.  I'd say
the module is wonderful, though idiosyncratic.  Fixing it for general
use means mostly indenting and deleting.

In addition, the research side at the NCBI is a powerfully active
testbed.  If you could get folks using Piper here, you would get
a _lot_ of feedback.

Think critical mass.

So I think that the only hitch between us and a world of excitement
is the lamentable lack of a packaging system that can tame the SEALS
monster.  We have multiple revisions we need to reconcile, and
tangles of code that we rather urgently need to clean up and push
out the door in some usable form.

Please feel free to write to Janos Murvai at

   murvai at ncbi.nlm.nih.gov

if you are interested in the status of the SEALS packaging project.

