[Pipet Devel] Another XML proposal.

Humberto Ortiz Zuazaga hortiz at neurobio.upr.clu.edu
Mon Mar 29 23:17:35 EST 1999

I've been reading the mail archive for the list, and have seen
BSML-vs-bioML-vs-cml threads in the biowidgets and bioperl lists.

I'd like to propose that Loci use many small XML DTDs instead of
trying for a kitchen sink DTD.

This is in keeping with the philosophy of having many small loci for
analysis and display.  The requirement is that any of our DTDs must be
able to contain objects in any of the others.


We could use a simplified BSML for sequence information (and only
sequences).

A different XML dialect could be used for structure information
(including structural annotations in sequences).

Separate XML DTDs could be defined for references, options to be passed
to a program, and work paths.

So, here's one way this proposal could work.

I sit down at my computer and start up my Workspace.

I retrieve a nucleotide sequence from GenBank, in GenBank format.  It
is parsed into several XML objects: a nucleotide sequence object, several
bibliographic reference objects, and a protein sequence object for the
"/translation=" feature found in the original GenBank file.

Each XML object is displayed on the benchtop by the appropriate locus.
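
The parsing step above could be sketched roughly as follows, in
present-day Python.  This is only an illustration: the element names
(nucleotide-sequence, reference, protein-sequence), the attribute names,
and the contents of the sample record are all made up here, and no real
GenBank parser is shown.  The record is assumed to have been parsed into
a dictionary already.

```python
import xml.etree.ElementTree as ET

# Hypothetical, already-parsed record; every value here is a placeholder.
record = {
    "accession": "X00000",
    "sequence": "atggcctaa",
    "translation": "MA",
    "references": ["Example reference 1", "Example reference 2"],
}

def record_to_objects(rec):
    """Split one parsed record into several small XML objects."""
    objects = []

    # One object for the nucleotide sequence itself.
    nuc = ET.Element("nucleotide-sequence", {"id": rec["accession"]})
    nuc.text = rec["sequence"]
    objects.append(nuc)

    # One object per bibliographic reference.
    for i, title in enumerate(rec["references"], start=1):
        ref = ET.Element("reference", {"id": f"{rec['accession']}-ref{i}"})
        ref.text = title
        objects.append(ref)

    # One object for the translated protein, if the record carries one.
    if rec.get("translation"):
        prot = ET.Element("protein-sequence", {"source": rec["accession"]})
        prot.text = rec["translation"]
        objects.append(prot)

    return objects

objects = record_to_objects(record)
for obj in objects:
    print(ET.tostring(obj, encoding="unicode"))
```

Each element in the returned list is a separate small document, so each
could be handed to a different locus.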

Now I click on the button to perform a restriction map of my sequence.

The workspace contacts the restriction map locus, which returns an XML
object describing the parameters and options this restriction map
locus requires or supports.  An option handling locus can then prompt
me for the enzymes I want to cut with, the output format I prefer,
etc.  The sequence object and the options are then passed back to the
restriction map locus for the analysis.
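
As a rough sketch of that exchange: the option descriptor below is a
made-up format (the options/option/choice element names and the
required/default attributes are assumptions, not an existing DTD), and
the "prompting" is replaced by a plain dictionary of answers so the
example stays self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical descriptor the restriction map locus might return.
OPTIONS_XML = """
<options locus="restriction-map">
  <option name="enzymes" type="list" required="yes"/>
  <option name="output-format" type="choice" required="no" default="text">
    <choice>text</choice>
    <choice>graphic</choice>
  </option>
</options>
"""

def collect_options(descriptor_xml, answers):
    """Fill in option values from `answers`, falling back to defaults.

    Raises ValueError if a required option has neither an answer
    nor a default.
    """
    root = ET.fromstring(descriptor_xml)
    values = {}
    for opt in root.findall("option"):
        name = opt.get("name")
        if name in answers:
            values[name] = answers[name]
        elif opt.get("default") is not None:
            values[name] = opt.get("default")
        elif opt.get("required") == "yes":
            raise ValueError(f"missing required option: {name}")
    return values

# Stand-in for what the user would type into the option handling locus.
chosen = collect_options(OPTIONS_XML, {"enzymes": ["EcoRI", "BamHI"]})
print(chosen)
```

The point of the design is that the option handling locus needs to know
only the descriptor format, not anything about restriction maps.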

The restriction map locus can now return the results as several XML
objects:  a bibliographic reference object describing the algorithm
used to perform the analysis; a result object containing the requested
results; and a locus object containing the gnome-python source code for
a gui-locus that can display the results.

The workspace can check whether it already has a gui-locus that can
display the results.  If it does, it passes the results to that locus;
otherwise it downloads the code and generates the gui-locus.

As loci are loaded into the workspace, they can register the ability
to handle a particular DTD or set of DTDs.
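
A minimal sketch of such a registry, assuming DTDs are identified by a
simple name string.  The Workspace class, its method names, and the
locus names are all hypothetical.

```python
class Workspace:
    """Keeps track of which loaded loci can handle which DTDs."""

    def __init__(self):
        self._handlers = {}  # DTD name -> list of locus names

    def register(self, locus_name, dtds):
        """Record that `locus_name` can handle every DTD in `dtds`."""
        for dtd in dtds:
            self._handlers.setdefault(dtd, []).append(locus_name)

    def find_locus(self, dtd):
        """Return the first locus registered for `dtd`, or None."""
        handlers = self._handlers.get(dtd)
        return handlers[0] if handlers else None

ws = Workspace()
ws.register("sequence-viewer", ["nucleotide-sequence", "protein-sequence"])
ws.register("reference-viewer", ["reference"])

print(ws.find_locus("protein-sequence"))
print(ws.find_locus("restriction-map-result"))
```

When find_locus returns None, the workspace has no display for that
kind of object yet, which is the case where it would fall back to
downloading the gui-locus code supplied with the results.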

We also need not pass around the entire XML object each time.  For
example, only a URL for a reference need be included in the results
from an analysis, not the entire paper.
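
One way this could look, purely as an illustration: the result object
carries a reference element with only an href attribute.  The element
and attribute names are assumptions, and the URL is a placeholder on
example.org.

```python
import xml.etree.ElementTree as ET

# Placeholder URL, used only for illustration.
REF_URL = "http://example.org/refs/restriction-algorithm.xml"

# A result object that links to its reference instead of embedding it.
result = ET.Element("result", {"locus": "restriction-map"})
ET.SubElement(result, "reference", {"href": REF_URL})
ET.SubElement(result, "data").text = "(analysis output goes here)"

def referenced_urls(obj):
    """List the URLs of references that are linked rather than embedded."""
    return [ref.get("href") for ref in obj.iter("reference") if ref.get("href")]

print(ET.tostring(result, encoding="unicode"))
print(referenced_urls(result))
```

A locus that wants the full reference can fetch it from the URL on
demand; one that does not can ignore it.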

Humberto Ortiz Zuazaga
Bioinformatics Specialist
Institute of Neurobiology
hortiz at neurobio.upr.clu.edu
