[Pipet Devel] Re: hi, it's Tim

Mon Dec 28 06:34:02 EST 1998

> First, since someone interested in this project has likely worked on visual
> representations of algorithms, a nice idea for the "to-do" list would be "find
> code that looks like what we're trying to do".

I have been accumulating links to other bioinformatics programs.  I sent some of
these to the team before you joined, so I'll have to send them again soon.  I have
been particularly interested in examples of any sort that use Python and GTK,
for obvious reasons, but they are pretty rare for now.

> 
> Second, I'm sending my half-assed version of Codon to you for posting as soon
> as I get back to Burlington.  It sucks, but will take a file of base pairs and
> turn them into strings of AA's (broken into pieces at the stop codons).  If I
> can find out how the ABI sequencers feed their data to the colorful little
> "genomics" programs out there, Codon can be an interface widget.  More
> importantly, if it works okay in that context and we have stuff to build up
> around it, then we can spit out some useful code and get users (and thus
> feedback).  So look in your mailbox around January 3rd or 4th, I will tarball
> up all the stuff I have been using to work on Codon (not much!) and hopefully
> you can post it on the website. 

Great.  As you may have gathered from my recent e-mail about collaborating with
the EMBOSS project, we now have a client-server model, and complex analyses will
go on the server side.  What you may call trivial could go on the client side,
with the visualization tools.
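
To make the Codon idea concrete, here is a rough Python sketch of the
translation step Tim describes: read bases, translate codon by codon, and break
the result at the stop codons.  This is not Tim's code.  The function name
translate(), the use of the standard genetic code, reading frame 1 only, and
"X" for unrecognized codons are all assumptions made for illustration.

```python
from itertools import product

# Hypothetical sketch, not the actual Codon program.
BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG codon order; '*' is a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate DNA in frame 1 and return the peptides between stop codons."""
    dna = "".join(dna.split()).upper()  # drop whitespace and line breaks
    residues = []
    for i in range(0, len(dna) - 2, 3):
        # Codons containing anything other than T, C, A, G become 'X'.
        residues.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    # Split at stops and discard the empty pieces left by adjacent stops.
    return [piece for piece in "".join(residues).split("*") if piece]


if __name__ == "__main__":
    print(translate("ATGGCCTAAATGTGA"))
```

A real tool would also need the other five reading frames and a decision about
incomplete trailing codons, which this sketch silently ignores.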

>  (It would be neat if we set up a web code browser.)

Yes.  I've been thinking the CGI model will allow us to have a rather static
interface to some of the server-side tools.  But this would be in *addition* to
the dynamic client-side interfaces, not a substitute (see my list of tools
below).

> 
> Opinion:  There needs to be less design talk and more componentry for the
> project or it will never mature into anything.  A lot of this stuff can be
> retrofitted (i.e. if we realize "oh shit, this is a lousy data representation"
> then the major version number gets incremented and some parser code gets
> rewritten -- big deal).  So from now on I'm just going to send in pieces of
> code that do useful things, even trivial things, and hope that a big enough
> pile of useful widgets accumulates.

In other words, "less talk, more action!" :-)  Our talk so far hasn't focused at
all on data visualization; it's been just about all on tool communication.  I
think I have a much clearer picture as to what we need to do regarding this. 
The next step is to build a list, not of analysis tools, but of dynamic
client-side interfaces.  EMBOSS will take care of the big analysis tools for
us.  For now, here is a brief list of some dynamic "loci" I've been thinking
of.  Please feel free to add to this:

  (1)  Benchtop/workspace.  GUI representation of all data objects
       (files, documents, graphs) and possibly various loci.  Also
       may be used for automation of analyses (recall GCL?).

  (2)  File translation interface:  to read in various DNA/protein
       document formats and convert them to XML.  Also may be used
       to query databases and sort/compile documents.

  (3)  Sequence visualization/editing tool:  to manipulate DNA/protein
       sequences

  (4)  Sequence comparison tool:  to show multiple sequences aligned
       or translated.  May also perform some functions of (6)

  (5)  3D visualization tool:  to display molecules as 3D structures, with
       emphasis on a schematic/cartoon representation.

  (6)  Graphing tool:  to display plots against sequences and to make simple
       graphs.  Some may argue this isn't needed, but I need it for my
       programs, so others may too ;-)

  (7)  HTML browser implementation:  separate from the other tools, this
       would be a way for anyone with a browser to access analysis loci.

The best approach may be for each of us to pick a single tool to concentrate
on.  And remember, most of these tools will be XML browsers of a sort.
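
As a rough idea of what the file translation locus in (2) might do, here is a
small Python sketch that turns FASTA-style text into XML.  We have not agreed on
any XML vocabulary, so the element and attribute names ("sequences",
"sequence", "name") and the function name fasta_to_xml() are placeholders, not
a proposed format.

```python
import xml.etree.ElementTree as ET


def fasta_to_xml(text):
    """Convert FASTA-formatted text to an XML string (placeholder schema)."""
    root = ET.Element("sequences")
    record = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line starts a new record; the rest is its name.
            record = ET.SubElement(root, "sequence", name=line[1:].strip())
            record.text = ""
        elif record is not None:
            # Sequence lines are concatenated onto the current record.
            record.text += line
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(fasta_to_xml(">seq1 example\nACGT\nTTGA\n"))
```

Other input formats would each need their own reader, but they could all feed
the same XML-building step.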

We should also make a list of "trivial" analysis/conversion tools for the
client-side.

> -- device driver for Linux, FreeBSD, etc for ABI sequencers (not sure
> about this)

Device driver?  Hmmm.  This could be a third-party add-on; it wouldn't be used
by enough people to put it in the core.

> -- "fuzzy" CLIPS-style inference engine for "expert" per-base
> decision-assistance
> - assistance in recognizing which type of NA a base is even when
> it looks like the peaks in the chromatogram could go either way 

Hmmm.  I haven't put much thought into chromatogram analysis, but it would be
useful to many experimentalists...and we want to target experimentalists in this
project too.
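
To show the sort of per-base decision involved, here is a much simpler stand-in
written in Python.  It is a plain threshold rule, not the CLIPS-style fuzzy
inference engine Tim proposes: when the two tallest peaks are close in height,
it reports the IUPAC two-base ambiguity code instead of guessing.  The input
layout (a dict of peak heights per base), the name call_base(), and the 0.7
ratio are assumptions for illustration only.

```python
# IUPAC codes for two-base ambiguities.
IUPAC_PAIRS = {
    frozenset("AG"): "R",
    frozenset("CT"): "Y",
    frozenset("CG"): "S",
    frozenset("AT"): "W",
    frozenset("GT"): "K",
    frozenset("AC"): "M",
}


def call_base(peaks, min_ratio=0.7):
    """Call one base from peak heights, e.g. {"A": 100, "C": 5, "G": 90, "T": 3}.

    Returns the tallest base, a two-base IUPAC code when the runner-up is at
    least min_ratio of the tallest peak, or "N" when there is no signal.
    """
    ranked = sorted(peaks.items(), key=lambda item: item[1], reverse=True)
    (best, best_height), (second, second_height) = ranked[0], ranked[1]
    if best_height <= 0:
        return "N"
    if second_height / best_height >= min_ratio:
        return IUPAC_PAIRS[frozenset((best, second))]
    return best


if __name__ == "__main__":
    print(call_base({"A": 100, "C": 5, "G": 90, "T": 3}))
```

This ignores three-way ambiguities and neighboring peaks, which is exactly the
kind of context an inference engine would be expected to use.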

> -- a tutorial

Yes.

> I still don't entirely understand how GTK is designed -- if the GTK
> gurus on this project were willing to offer some assistance, I might
> progress beyond simple "Sign your timecard today!" boxes that I run from
> cron twice a month. ;-)  Part of this is that I'm a crappy C programmer
> and haven't applied myself to the tutorial, but when I follow along with

Only Jay is a GTK guru, and we haven't heard from him in a while.  We are
otherwise all pretty new to it.  I would, though, stress learning Python/GTK over
C/GTK.  I want the client-side loci to be 90% Python and 10% C (0% if
possible).  Yes, it can be done.  Here is the Web site for Python/GTK:

  http://www.daa.com.au/~james/pygtk/

Also, look into the GLADE/gIDE projects.  There seems to be some work toward a
Python/GTK code generator for these...We'll see.  For now, they are C/GTK only.

  GLADE: http://glade.pn.org/
  gIDE:  http://gide.pn.org/

> I also realized that (and this is probably irrelevant to TULIP, but
> still) I need someone academic and respected to goon for me if I am
> going to hold a real job yet stay involved in protein folding / nucleic
> acid folding for real.  If anyone on this list knows such a person at
> UVM, Middlebury, Dartmouth, or UNH, please let me know; this is really
> important to me personally.

I don't know anyone at those locations.  Anybody else here that can help Tim?

> My mailbox is full of TULIP messages -- I will now go read them.
> ps.  What does that stand for?

"The Loci Project" -> T.L.P. -> TULIP.  It was "The Lowell Package" when I
started, but I want it to be a bit less "local," no pun intended ;-)

Jeff
-- 
J.W. Bizzaro                  Phone: 617-552-3905
Boston College                mailto:bizzaro at bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--