[Pipet Devel] Another XML proposal - Part 3.

Humberto Ortiz Zuazaga hortiz at neurobio.upr.clu.edu
Wed Apr 7 14:05:23 EDT 1999


> Rahul Jain wrote:
> 
> > >   LocalAppBroker:   I don't have that.  Send it embedded in the LocusML.
> > 
> > Hmmm... the way I see it, the last step is not necessary.
> 
> That's a good point.
> 
> Netscape does deal with the same problem and uses MIME.

Gnome has another HTML browser called Express, designed to be expanded with 
plugins.  It looks a little like what we're talking about: 
http://www.cse.unsw.edu.au/~conradp/express/

When I first proposed my idea, I was in fact thinking of some kind of 
MIME-like typing of the data stream.

The problem I see with MIME is how to handle multiparts, especially of 
different types.  I've been looking at the XML specifications, and it does 
look like namespaces can be used for what we want.  In order to encode all the 
information we want, we may have to do some tricks.

Namespaces allow you to associate a prefix like

bicml:

with a URI like

http://bioinformatics.org/loci/

so that element and attribute names carrying the prefix are qualified by the 
URI.  You can also declare a default namespace, which applies to unprefixed 
element names within its scope.

Specifically, I propose that we set up our xml files so that the labels and 
uri encode each of the little languages we want to use:

<?xml version="1.0"?>
 <!-- both namespace prefixes are available throughout -->
 <bicml1:stuff xmlns:bicml1='http://bioinformatics.org/loci/bicml/v1.0/'
          xmlns:nucseq1='http://bioinformatics.org/loci/bicml/nucleotide-sequence-ml/v1.0/'>
     <bicml1:name>Some file of stuff</bicml1:name>
     <nucseq1:name>My new sequence</nucseq1:name>
     <nucseq1:sequence>ACGT</nucseq1:sequence>
 </bicml1:stuff>

So nucseq1:name is different from bicml1:name, and nucseq1 is an abbreviation 
for http://bioinformatics.org/loci/bicml/nucleotide-sequence-ml/v1.0/.  Note 
that the definition of namespaces does not say anything about any content at 
the page http://bioinformatics.org/loci/bicml/nucleotide-sequence-ml/v1.0/; 
it only uses the literal string in the URI part to decide whether two names 
are the same.
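
To make the string-comparison point concrete, here is a minimal sketch using 
Python's standard xml.etree.ElementTree module, which expands each prefixed 
name to "{uri}localname".  The second document and its "x" prefix are made up 
for illustration; only the URIs come from the example above.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<bicml1:stuff xmlns:bicml1='http://bioinformatics.org/loci/bicml/v1.0/'
              xmlns:nucseq1='http://bioinformatics.org/loci/bicml/nucleotide-sequence-ml/v1.0/'>
    <bicml1:name>Some file of stuff</bicml1:name>
    <nucseq1:name>My new sequence</nucseq1:name>
    <nucseq1:sequence>ACGT</nucseq1:sequence>
</bicml1:stuff>"""

# Hypothetical second document: a different prefix bound to the same URI.
OTHER = "<x:name xmlns:x='http://bioinformatics.org/loci/bicml/v1.0/'>hi</x:name>"

root = ET.fromstring(DOC)

# Each tag comes back as "{namespace-uri}localname"; the prefix is gone.
tags = [child.tag for child in root]
for tag in tags:
    print(tag)

# The two "name" elements differ because their namespace URIs differ.
print(tags[0] != tags[1])

# A different prefix with the same URI string yields the same expanded name.
other_tag = ET.fromstring(OTHER).tag
print(other_tag == tags[0])
```

Nothing is fetched from bioinformatics.org here; the parser only compares the 
URI strings.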

I think we should at least put a description of the file format we use at the 
referenced URL, although strictly it isn't necessary.  See the file format 
used in gnumeric for an example (the file is compressed, so use zless, or 
rename and uncompress it first).

We could use the URI part of the namespace as a kind of MIME type that 
includes the sublanguage and version of a file format.
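
As a rough sketch of reading a namespace URI as a type label, the function 
below splits the URI path into a sublanguage and a version.  It assumes the 
layout used in the example URIs, ".../<sublanguage>/v<version>/"; that layout 
and the name describe_namespace are my assumptions, not anything the 
namespace specification requires.

```python
from urllib.parse import urlparse


def describe_namespace(uri):
    """Return (sublanguage, version) for a URI ending in <sublanguage>/v<version>/.

    The path layout is an assumed convention for this sketch only.
    """
    parts = [p for p in urlparse(uri).path.split("/") if p]
    if len(parts) < 2 or not parts[-1].startswith("v"):
        raise ValueError("URI does not follow the assumed layout: %r" % uri)
    sublanguage = parts[-2]
    version = parts[-1][1:]  # drop the leading "v"
    return sublanguage, version


for ns in (
    "http://bioinformatics.org/loci/bicml/v1.0/",
    "http://bioinformatics.org/loci/bicml/nucleotide-sequence-ml/v1.0/",
):
    print(describe_namespace(ns))
```

An appbroker could use a pair like this as a lookup key, much as a mail 
client uses a MIME type.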

The BICML language per se could specify how to package things together in sets 
or lists (multiple sequences, sequence + structure); then each sublanguage 
could deal with a smaller domain, like a single sequence.

The human-readable description at the URI could also contain information about 
where to get a browser, but the actual URI could be passed to the appbroker to 
query for display loci.

> But what about binaries?  When Netscape asks if you want to get a viewer, it
> gets a binary program.  For Loci we were talking about a Python script.  That'd
> sure help with portability issues.  But if we are now talking about a
> once-in-a-while situation where a new viewer is needed, could we just have
> binary viewers downloadable and skip the script?

When I was writing the proposal, I thought it would be neat to have a file 
type like "multifile python module" that contained the source to a viewer.  
We could also have a type like "i386/libc6/rpm" for a viewer.  The app broker 
can return a list like

python - ftp://bar.baz/pub/viewer.py
compressed-multifile-python-tarfile - ftp://bar.baz/pub/fancy-viewer.tar.gz
i386/libc6/rpm - ftp://rpmfind.net/libc6/i386/viewer1.0-2.rpm

and the user can select which one he wants to retrieve.
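
A minimal sketch of how a client might handle such a list, assuming the 
broker replies with plain "type - url" lines exactly as written above.  The 
function names parse_reply and choose, and the idea of a preference order, 
are hypothetical; no appbroker protocol has been defined yet.

```python
BROKER_REPLY = """\
python - ftp://bar.baz/pub/viewer.py
compressed-multifile-python-tarfile - ftp://bar.baz/pub/fancy-viewer.tar.gz
i386/libc6/rpm - ftp://rpmfind.net/libc6/i386/viewer1.0-2.rpm
"""


def parse_reply(text):
    """Split each non-empty 'type - url' line into a (type, url) pair."""
    offers = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        kind, sep, url = line.partition(" - ")
        if not sep:
            raise ValueError("malformed broker line: %r" % line)
        offers.append((kind.strip(), url.strip()))
    return offers


def choose(offers, preferred):
    """Return the first offer whose type appears in the preference list."""
    by_kind = dict(offers)
    for kind in preferred:
        if kind in by_kind:
            return kind, by_kind[kind]
    return None


offers = parse_reply(BROKER_REPLY)
# Prefer the portable script, fall back to the binary package.
print(choose(offers, ["python", "i386/libc6/rpm"]))
```

An interactive client would show the parsed offers and let the user pick 
instead of using a fixed preference list.
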
-- 
Humberto Ortiz Zuazaga
Bioinformatics Specialist
Institute of Neurobiology
hortiz at neurobio.upr.clu.edu



