From bizzaro at bc.edu Thu Mar 25 19:36:34 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:06 2006
Subject: [Pipet Devel] interesting XML article
Message-ID: <36FAD692.D4AD2758@bc.edu>

Some good points are made in this article about XML and how it is more than HTML++:

http://www.linuxworld.com/linuxworld/lw-1999-03/lw-03-xml.html?03-25

Jeff
bizzaro@bc.edu

From hortiz at neurobio.upr.clu.edu Mon Mar 29 23:17:35 1999
From: hortiz at neurobio.upr.clu.edu (Humberto Ortiz Zuazaga)
Date: Fri Feb 10 19:18:06 2006
Subject: [Pipet Devel] Another XML proposal.
Message-ID: <199903300417.AAA03711@chimbo.neurobio.upr.clu.edu>

I've been reading the mail archive for the list, and have seen BSML-vs-bioML-vs-cml threads in the biowidgets and bioperl lists.

I'd like to propose that Loci use many small XML DTDs instead of trying for a kitchen sink DTD.

This is in keeping with the philosophy of having many small loci for analysis and display. The requirement is that any of our DTDs must be able to contain objects in any of the others.

Specifically:

we could use a simplified BSML for sequence information (and just sequences).

a different XML dialect could be used for structure information (including structural annotations in sequences).

separate XML DTDs could be defined for references, options to be passed to a program, and work paths.

So, here's one way this proposal could work.

I sit down at my computer and start up my Workspace.

I retrieve a nucleotide sequence from Genbank, in genbank format, it is parsed into several XML objects: a nucleotide sequence object, several bibliographic reference objects, a protein sequence object for the "/translation=" feature found in the original genbank file.

Each xml object is displayed on the benchtop by the appropriate locus. Now I click on the button to perform a restriction map of my sequence.
The workspace contacts the restriction map locus, which returns an XML object describing the parameters and options this restriction map locus requires or supports. An option handling locus can then prompt me for the enzymes I want to cut with, the output format I prefer, etc. The sequence object and the options are then passed back to the restriction map locus for the analysis.

The restriction map locus can now return the results as several xml objects: a bibliographic reference object describing the algorithm used to perform the analysis; a result object containing the requested results; a locus object containing the gnome-python source code for a gui-locus that can display the results.

The workspace can check if it already has a gui-locus that can display the results, and passes the results to it, or downloads the code and generates the gui-locus.

As loci are loaded into the workspace, they can register the ability to handle a particular DTD or set of DTDs.

We also need not pass around the entire XML object each time, for example only a url for a reference need be included in the results from an analysis, not the entire paper.

--
Humberto Ortiz Zuazaga
Bioinformatics Specialist
Institute of Neurobiology
hortiz@neurobio.upr.clu.edu

From justin at ukans.edu Tue Mar 30 03:19:20 1999
From: justin at ukans.edu (Justin Bradford)
Date: Fri Feb 10 19:18:06 2006
Subject: [Pipet Devel] Another XML proposal.
In-Reply-To: <199903300417.AAA03711@chimbo.neurobio.upr.clu.edu>
Message-ID:

> I'd like to propose that Loci use many small XML DTDs instead of
> trying for a kitchen sink DTD.

I agree, and that is basically the way I had been thinking. Specific descriptions of data should be as small and modular as possible (sequence, structure, phylogeny, etc). LocusML should also be able to describe relationships between those pieces of data, if necessary, however. We might need specific DTDs for relationships (i.e.
a restriction map, which contains a number of short sequence components), as a lot of relationships will be very hard to express generically.

> we could use a simplified BSML for sequence information (and just
> sequences).

I don't like how BSML is structured, but I do like the detail it allows. I prefer the inner sections of BioML for their "flow". I had planned to merge the two, looking more like BioML but with the versatility of BSML. Also, BSML doesn't cover amino acid sequences (if I remember correctly), while BioML does. The two different structures probably merit unique DTDs anyway, though.

> a different XML dialect could be used for structure information
> (including structural annotations in sequences).

Yes. I'm not sure where to begin on structure. Someone here had ideas on this, but I'm not sure who or what became of them.

Also, it sounds like we'll need a DTD for phylogeny, too. There are probably others as well, but the concept remains the same. Describe just the relevant data, and use a unique ID to find reference and data relations elsewhere.

> separate XML DTDs could be defined for references, options to be passed
> to a program, and work paths.

A generic reference DTD is fairly simple. Describing relationships between data will take a little more thought. Loci specific information will probably be filled in as we go further into development.

Although, just so no one is confused, the XML format is really only for transfer (data) and storage (both data and Loci info).
Actual Loci info will be kept in the Paos object as attributes rather than an XML stream that has to be parsed all the time. It will be written out to XML for non-Paos storage.
Generic data will always be handled by the Loci framework as XML (since it's basically meaningless to it), and the data specific tools will handle it internally in whatever way is appropriate (hash table, binary tree, etc).
But a DTD is a good way to describe the data Loci uses.
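
The idea of describing just the relevant data and using a unique ID to find related data elsewhere can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the element names (`sequence`, `reference`, `ref`), the `id` and `target` attributes, and the IDs `seq1` and `ref1` are not taken from any Loci, BSML, or BioML DTD.

```python
import xml.etree.ElementTree as ET

# Two small, separate documents (hypothetical element and attribute names).
SEQUENCE_XML = """
<sequence id="seq1" type="nucleotide">
  <residues>GAATTCGGATCC</residues>
  <ref target="ref1"/>
</sequence>
"""

REFERENCE_XML = """
<reference id="ref1">
  <title>Example paper title</title>
</reference>
"""

def index_by_id(documents):
    """Parse each XML string and index its root element by its id attribute."""
    index = {}
    for text in documents:
        root = ET.fromstring(text.strip())
        index[root.get("id")] = root
    return index

def resolve_refs(element, index):
    """Return the elements that `element` points to through <ref target="...">."""
    return [index[ref.get("target")] for ref in element.iter("ref")]

index = index_by_id([SEQUENCE_XML, REFERENCE_XML])
linked = resolve_refs(index["seq1"], index)
print([el.tag for el in linked])  # prints ['reference']
```

The sequence document stays small because it carries only a pointer to the reference, not the reference itself.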
> I retrieve a nucleotide sequence from Genbank, in genbank format, it
> is parsed into several XML objects: a nucleotide sequence object, several
> bibliographic reference objects, a protein sequence object for the
> "/translation=" feature found in the original genbank file.

Exactly. I believe the translation component is the gatekeeper, in Loci terminology.

> Each xml object is displayed on the benchtop by the appropriate locus.
> Now I click on the button to perform a restriction map of my sequence.

I haven't thought about the UI much yet.

> The workspace contacts the restriction map locus, which returns an XML
> object describing the parameters and options this restriction map
> locus requires or supports.

_That_ is an interesting idea. I had just been assuming a generic interface for types of loci (for example, a restriction map locus has three arguments and it doesn't vary), but rather than having a bunch of hardcoded loci types, we can query the locus for its interface (of course we'll want to cache interfaces).

> An option handling locus can then prompt
> me for the enzymes I want to cut with, the output format I prefer,
> etc.

Going back to Jeff's idea about embedding python in XML, a locus could return an interface description with UI code to handle the query configuration (probably optional for exotic cases; most of the time it would be generic fields with default UI handlers).

> The restriction map locus can now return the results as several xml
> objects: a bibliographic reference object describing the algorithm
> used to perform the analysis; a result object containing the requested
> results; a locus object containing the gnome-python source code for
> a gui-locus that can display the results.

Before we go overboard with passing interface code around though, I'd like to strongly encourage the presence of powerful, high-level widgets in the workspace app. We don't want to be passing around a generic sequence viewer all the time.
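
Querying a locus for its interface and caching the answer, as discussed above, might look roughly like the following Python sketch. The XML layout (`interface`, `option`, and their attributes), the `InterfaceCache` class, and the `fake_fetch` function are all hypothetical; a real locus would be contacted over whatever transport Loci ends up using.

```python
import xml.etree.ElementTree as ET

# Hypothetical interface description a restriction map locus might return.
INTERFACE_XML = """<interface locus="restriction-map">
  <option name="enzyme" type="list" required="yes"/>
  <option name="output-format" type="string" required="no" default="text"/>
</interface>"""

class InterfaceCache:
    """Ask a locus for its interface once, then answer from the cache."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable: locus name -> XML string
        self._cache = {}
        self.fetch_count = 0     # how many times a locus was really queried

    def options(self, locus_name):
        if locus_name not in self._cache:
            self.fetch_count += 1
            root = ET.fromstring(self._fetch(locus_name))
            self._cache[locus_name] = {
                opt.get("name"): {
                    "type": opt.get("type"),
                    "required": opt.get("required") == "yes",
                    "default": opt.get("default"),
                }
                for opt in root.findall("option")
            }
        return self._cache[locus_name]

def fake_fetch(locus_name):
    """Stand-in for contacting the locus; always returns the same description."""
    return INTERFACE_XML

cache = InterfaceCache(fake_fetch)
first = cache.options("restriction-map")
second = cache.options("restriction-map")   # served from the cache
```

A generic option-handling UI could then build its prompts from the parsed dictionary without knowing anything specific about restriction maps.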
> The workspace can check if it already has a gui-locus that can display
> the results, and passes the results to it, or downloads the code and
> generates the gui-locus.

Like I said just above, I'd like to see a nice API (from the loci perspective) for the UI stuff. Ranging from low-level building block widgets to higher-level generic viewers, as well as the ability to plug in additional generic viewers. That way if you're always using some non-standard locus gui, you can just load the script locally (and even replace it with faster compiled code).

> As loci are loaded into the workspace, they can register the ability
> to handle a particular DTD or set of DTDs.

Possibly even more than that -- for instance, a locus to handle a specific relationship between sets of DTDs (I don't have a good example, though).

> We also need not pass around the entire XML object each time, for
> example only a url for a reference need be included in the results
> from an analysis, not the entire paper.

Yes. It was my intention for the workflow system to just give a locus what it needs (probably by creating a second Paos object). It should present it with the necessary data and control information, rather than sending the whole object with potentially extraneous data and control info.
The locus updates the object with status information (recorded to the master Paos object, which the gui can get info from). And then transmits the generated data back via Paos. That's consolidated into the master object and fed to the gui client.

Also, for the Paos representation of the Loci XML info, I was imagining a DOM-like interface. The XML is represented in a tree.
So, this Loci info:

restriction map

becomes this Paos object:

query.id = "aaaa"
query.action = "restriction map"
query.option{distinguish enzyme cuts} = "yes"
query.data{template} = "#sequence_id"
query.data{restriction enzyme} = "EcoRI"
query.data{restriction enzyme} = "BamHI"

where #sequence_id means the XML can be extracted from the Paos data attribute under the key "sequence_id"

Or something along these lines. This example is missing a lot of things. I'm not sure how python handles hashes either. This is actually perl/c-ish here.

(note: Ok, now I'm going to ramble some...)

Although, perhaps we don't even need to bother trying to express the internal Loci data stuff as XML. Will we ever need to write it out to XML? Possibly only the actual biological data needs XML expression, just to facilitate interaction between Loci derived data and non-Loci tools. Theoretically, we don't need XML for anything, since structures in Paos could hold all of the biological data, too. It just seems like a good way to describe things for stuff that isn't entirely internal to Loci. But on similar grounds, we will need to define the internal Loci info interface adequately for tools to make use of it, and perhaps an XML representation of that would make it more clear.

I'd really like to rig up a working demo. Does anyone have a pretty simple analysis tool we could use for an example? In particular, the view of the resulting data should be simple (that's probably where the most programming is). Actually a restriction map wouldn't be too bad...

Justin Bradford
justin@ukans.edu

From hortiz at neurobio.upr.clu.edu Tue Mar 30 09:22:35 1999
From: hortiz at neurobio.upr.clu.edu (Humberto Ortiz Zuazaga)
Date: Fri Feb 10 19:18:06 2006
Subject: [Pipet Devel] Another XML proposal.
In-Reply-To: Your message of "Tue, 30 Mar 1999 02:19:20 CST."
Message-ID: <199903301422.KAA05393@chimbo.neurobio.upr.clu.edu>

> > I'd like to propose that Loci use many small XML DTDs instead of
> > trying for a kitchen sink DTD.

> LocusML should also be able to
> describe relationships between those pieces of data, if necessary,
> however. We might need specific DTDs for relationships (i.e. a restriction
> map, which contains a number of short sequence components), as a lot of
> relationships will be very hard to express generically.

Yes, I propose a DTD for each kind of relationship, where for example a structural alignment could have a structural alignment DTD, and that DTD allowed for embedding a multi-sequence alignment entity that in turn contained several protein sequence entities, structure entities, each protein sequence could contain a set of reference entities. Each entity could be in a different DTD.

We also need a DTD for page or canvas composition of multiple display loci, for embedding a figure in a figure, for example.

> I don't like how BSML is structured, but I do like the detail it allows.

I didn't mean BSML specifically, just that a sequence DTD should stick to describing only sequence information.

> > a different XML dialect could be used for structure information
> > (including structural annotations in sequences).
>
> Yes. I'm not sure where to begin on structure. Someone here had ideas
> on this, but I'm not sure who or what became of them.

With my proposal, we can defer defining a structure DTD until we actually have more of a clue.

> > The workspace contacts the restriction map locus, which returns an XML
> > object describing the parameters and options this restriction map
> > locus requires or supports.
>
> _That_ is an interesting idea.
> I had just been assuming a generic
> interface for types of loci (for example, a restriction map locus has
> three arguments and it doesn't vary), but rather than having a bunch of
> hardcoded loci types, we can query the locus for its interface (of course
> we'll want to cache interfaces).

The gatekeeper can also handle finding appropriate loci:

workspace says I have a BICML nucleotide sequence v4.1 object, I want to perform a restriction map with these enzymes and see the sizes of the digested fragments.

A tacg locus on server.example.com can reply saying, I can do the analysis, please send me the sequence, and the enzymes you want off of this list, to view the output you need a locus that can display v3.5 digest files, here is a url for a gnome-python locus for a compatible viewer.

> > An option handling locus can then prompt
> > me for the enzymes I want to cut with, the output format I prefer,
> > etc.
>
> Going back to Jeff's idea about embedding python in XML, a locus could
> return an interface description with UI code to handle the query
> configuration (probably optional for exotic cases; most of the time it
> would be generic fields with default UI handlers).

Again, we don't have to pass back the UI code, just a URL to it, the workspace may well already have a copy locally. I think it's a bad idea to embed the python code in the xml. It violates the principle that the DTDs should stick to the point, and it really gets ugly when you consider the security implications. Locus will ship with loci for displaying many kinds of DTDs, and a site manager may well not allow the workspace to download untrusted code. With my proposal, the workspace just has to locate any locus that can display the result DTD, you may well have several sequence viewers on your machine already.
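
Locating "any locus that can display the result DTD" amounts to a lookup in a registry keyed by DTD. A minimal Python sketch follows; the `LocusRegistry` class, the locus names, and the DTD names are all made up for illustration and do not come from any existing Loci code.

```python
from collections import defaultdict

class LocusRegistry:
    """Map each DTD name to the loci that registered the ability to handle it."""

    def __init__(self):
        self._viewers = defaultdict(list)

    def register(self, locus_name, dtds):
        """Record that `locus_name` can handle every DTD in `dtds`."""
        for dtd in dtds:
            self._viewers[dtd].append(locus_name)

    def viewers_for(self, dtd):
        """Return the loci able to display `dtd`, in registration order."""
        return list(self._viewers.get(dtd, []))

registry = LocusRegistry()
registry.register("simple-sequence-viewer",
                  ["nucleotide-sequence", "protein-sequence"])
registry.register("digest-viewer", ["restriction-digest"])
registry.register("fancy-sequence-viewer", ["nucleotide-sequence"])

# Several viewers may handle the same DTD; an empty list means none is installed.
print(registry.viewers_for("nucleotide-sequence"))
print(registry.viewers_for("structure"))
```

Only when the lookup comes back empty would the workspace need to consider the viewer URL that an analysis locus supplied.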
> > The restriction map locus can now return the results as several xml
> > objects: a bibliographic reference object describing the algorithm
> > used to perform the analysis; a result object containing the requested
> > results; a locus object containing the gnome-python source code for
> > a gui-locus that can display the results.
>
> Before we go overboard with passing interface code around though, I'd like
> to strongly encourage the presence of powerful, high-level widgets in the
> workspace app. We don't want to be passing around a generic sequence
> viewer all the time.

That's what I mean. An analysis locus can just say my output is in BICML v3.2 format, here is the url for a viewer if you don't have one. The workspace then chooses whether or not to retrieve the UI code.

> Although, perhaps we don't even need to bother trying to express the
> internal Loci data stuff as XML. Will we ever need to write it out to XML?
> Possibly only the actual biological data needs XML expression, just to
> facilitate interaction between Loci derived data and non-Loci tools.

I argue that all our data structures should be representable as XML. This would let people write loci in any language, export individual components for other tools, and facilitate exchange of data.

Storing data in python specific or binary formats restricts your options.

Hopefully, we'll soon be able to embed Loci figures in our gnome word processor papers!

--
Humberto Ortiz Zuazaga

From hinsen at cnrs-orleans.fr Tue Mar 30 11:52:03 1999
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Fri Feb 10 19:18:06 2006
Subject: [Pipet Devel] Another XML proposal.
In-Reply-To: <199903300417.AAA03711@chimbo.neurobio.upr.clu.edu> (message from Humberto Ortiz Zuazaga on Tue, 30 Mar 1999 00:17:35 -0400)
References: <199903300417.AAA03711@chimbo.neurobio.upr.clu.edu>
Message-ID: <199903301652.SAA20878@dirac.cnrs-orleans.fr>

> I'd like to propose that Loci use many small XML DTDs instead of
> trying for a kitchen sink DTD.

I agree. The only reason for having one big DTD is that any combination of information could easily be integrated into one file. This is useful for hand-typed material, but hardly matters for computer-generated data. Let's rather stay flexible.

> This is in keeping with the philosophy of having many small loci for
> analysis and display. The requirement is that any of our DTDs must be
> able to contain objects in any of the others.

That should always be possible by using namespaces, if I understood them correctly (which may not be the case!)

Konrad.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax:  +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From hinsen at cnrs-orleans.fr Tue Mar 30 11:56:36 1999
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Fri Feb 10 19:18:07 2006
Subject: [Pipet Devel] Another XML proposal.
In-Reply-To: (message from Justin Bradford on Tue, 30 Mar 1999 02:19:20 -0600 (CST))
References:
Message-ID: <199903301656.SAA25516@dirac.cnrs-orleans.fr>

> > a different XML dialect could be used for structure information
> > (including structural annotations in sequences).
>
> Yes. I'm not sure where to begin on structure. Someone here had ideas
> on this, but I'm not sure who or what became of them.
I am still in contact with people at EMBL who are working on a DTD that is equivalent to mmCIF plus a converter in both directions (written in Python). There have been some delays (caused by real work ;-) but they are optimistic about having it ready soon. This seems the most promising structure DTD to me.

> Although, perhaps we don't even need to bother trying to express the
> internal Loci data stuff as XML. Will we ever need to write it out to XML?

I'd say everything that is saved to a file for archiving or non-immediate reuse should be XML, if only to give users a chance to understand what's inside without any special program!

Konrad.

From bizzaro at bc.edu Wed Mar 31 03:24:26 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:07 2006
Subject: [Pipet Devel] Another XML proposal - Part 1.
References:
Message-ID: <3701DBBA.8CF11553@bc.edu>

Justin Bradford wrote:

> Although, just so no one is confused, the XML format is really only for
> transfer (data) and storage (both data and Loci info).
> Actual Loci info will be kept in the Paos object as attributes rather than
> an XML stream that has to be parsed all the time. It will be written out
> to XML for non-Paos storage.
> Generic data will always be handled by the Loci framework as XML (since
> it's basically meaningless to it), and the data specific tools will handle
> it internally in whatever way is appropriate (hash table, binary tree,
> etc).

We have gone back and forth on this point.
I just want to leave open the possibility that "workflow data", which is the "Loci info" you are referring to, may be _actively_ transferred as XML.

If the workflow data is only for archiving, I guess there is some point along the pathway where the decision is made to start reading/writing workflow as XML. Is this something then that we want turned on and off along the path? If it is kept "on", it will make for a more robust system, in case of a Loci crash or OS crash. Just a thought.

> > I retrieve a nucleotide sequence from Genbank, in genbank format, it
> > is parsed into several XML objects: a nucleotide sequence object, several
> > bibliographic reference objects, a protein sequence object for the
> > "/translation=" feature found in the original genbank file.
>
> Exactly. I believe the translation component is the gatekeeper, in Loci
> terminology.

The "Gatekeeper" is the "Internet Application Broker" or "Locus IAB". You're talking about the "Document Translator" or "Locus DT". (Don't you love these names? ;-)

I'd actually like to take on the GenBank translation part of this, since I made a GenBank parser once.

> > Each xml object is displayed on the benchtop by the appropriate locus.
> > Now I click on the button to perform a restriction map of my sequence.
>
> I haven't thought about the UI much yet.

Each separable XML object (biodata) will be displayed on the benchtop...as a box or button. And yes, you can click on it (maybe even a right-mousebutton click) to bring up a list of loci/tools that can perform work on that type of data (this is where we need a local database of available loci and what they can do).

> > The workspace contacts the restriction map locus, which returns an XML
> > object describing the parameters and options this restriction map
> > locus requires or supports.
>
> _That_ is an interesting idea.
> I had just been assuming a generic
> interface for types of loci (for example, a restriction map locus has
> three arguments and it doesn't vary), but rather than having a bunch of
> hardcoded loci types, we can query the locus for its interface (of course
> we'll want to cache interfaces).

Ohhh yes! This is the database I was just talking about. Maybe it's a part of the benchtop, but it keeps track of all loci available to the user and what they can do. But instead of the locus being queried when it is about to be used, it is queried when it becomes accessible to the workspace. This is important for hot-plugging loci. The user can add loci while Loci is running. Maybe at a certain time interval, the workspace (database part) queries all accessible loci, and the loci return values informing the workspace what they can do.

> > An option handling locus can then prompt
> > me for the enzymes I want to cut with, the output format I prefer,
> > etc.
>
> Going back to Jeff's idea about embedding python in XML, a locus could
> return an interface description with UI code to handle the query
> configuration (probably optional for exotic cases; most of the time it
> would be generic fields with default UI handlers).

Yep.

> > The restriction map locus can now return the results as several xml
> > objects: a bibliographic reference object describing the algorithm
> > used to perform the analysis; a result object containing the requested
> > results; a locus object containing the gnome-python source code for
> > a gui-locus that can display the results.
>
> Before we go overboard with passing interface code around though, I'd like
> to strongly encourage the presence of powerful, high-level widgets in the
> workspace app. We don't want to be passing around a generic sequence
> viewer all the time.

Absolutely correct! I did not want to be passing 100k Python modules to recreate common code.
I think we should have high-level widgets that may really be mostly C-GTK binaries wrapped in Python. Python-GTK bindings (by James Henstridge) can be used where the user may not have a particular C-GTK megawidget in their Loci library.

> > The workspace can check if it already has a gui-locus that can display
> > the results, and passes the results to it, or downloads the code and
> > generates the gui-locus.
>
> Like I said just above, I'd like to see a nice API (from the loci
> perspective) for the UI stuff. Ranging from low-level building block
> widgets

...PyGTK...

> to higher-level generic viewers,

...C-GTK...

> as well as the ability to plug in
> additional generic viewers. That way if you're always using some
> non-standard locus gui, you can just load the script locally (and even
> replace it with faster compiled code).

I'm not exactly sure what you mean here.

TO BE CONTINUED....

Cheers,
Jeff

--
J.W. Bizzaro                  mailto:bizzaro@bc.edu
Boston College Chemistry      http://www.uml.edu/Dept/Chem/Bizzaro/

I have always appreciated your ability to ________, whenever there
has been a blank to fill.
--

From bizzaro at bc.edu Wed Mar 31 03:40:58 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:07 2006
Subject: [Pipet Devel] Another XML proposal - Part deux.
References:
Message-ID: <3701DF9A.F0D8000D@bc.edu>

Justin Bradford wrote:

> > As loci are loaded into the workspace, they can register the ability
> > to handle a particular DTD or set of DTDs.
>
> Possibly even more than that -- for instance, a locus to handle a specific
> relationship between sets of DTDs (I don't have a good example, though).

Okay. But again, the Workspace continually updates a database with this information.

> > We also need not pass around the entire XML object each time, for
> > example only a url for a reference need be included in the results
> > from an analysis, not the entire paper.
>
> Yes.
> It was my intention for the workflow system to just give a locus what
> it needs (probably by creating a second Paos object). It should present it
> with the necessary data and control information, rather than sending the
> whole object with potentially extraneous data and control info.
> The locus updates the object with status information (recorded to the
> master Paos object, which the gui can get info from). And then transmits
> the generated data back via Paos. That's consolidated into the master
> object and fed to the gui client.

This reminds me of the Notebook, which we have talked very little about. It will give the user a written log, in HTML, of the analyses, but where large amounts of data have been developed, the HTML gives only a link to the file (archived XML). At a later point, the user can view the log using the Notebook, click on a link, and the archived XML will be brought back to life and sent to the appropriate view.

Damn this project is complex, but fun! ;-)

> Or something along these lines. This example is missing a lot of things.
> I'm not sure how python handles hashes either. This is actually perl/c-ish
> here.

Python doesn't use symbols to type variables.

> (note: Ok, now I'm going to ramble some...)
>
> Although, perhaps we don't even need to bother trying to express the
> internal Loci data stuff as XML. Will we ever need to write it out to XML?
> Possibly only the actual biological data needs XML expression, just to
> facilitate interaction between Loci derived data and non-Loci tools.
> Theoretically, we don't need XML for anything, since structures in Paos
> could hold all of the biological data, too. It just seems like a good way
> to describe things for stuff that isn't entirely internal to Loci. But on
> similar grounds, we will need to define the internal Loci info interface
> adequately for tools to make use of it, and perhaps an XML representation
> of that would make it more clear.

Ahh. The old PAOS vs. XML argument.
It's a good argument. I think we could go 100% PAOS or 100% XML, and both ways would work. But I think the combination of the two can give us some advantages. The way I see it, and I guess you do too:

PAOS -> for active communication about the internals of Loci
XML -> for archiving and translating bio data

Actually, I'll say it again: I want PAOS to have built-in XML parsing to make this more uniform.

Cheers,
Jeff

From bizzaro at bc.edu Wed Mar 31 04:15:30 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:07 2006
Subject: [Pipet Devel] Another XML proposal - Part 3.
References: <199903301422.KAA05393@chimbo.neurobio.upr.clu.edu>
Message-ID: <3701E7B2.4BF2F65A@bc.edu>

Humberto Ortiz Zuazaga wrote:

> Yes, I propose a DTD for each kind of relationship, where for example a
> structural alignment could have a structural alignment DTD, and that DTD
> allowed for embedding a multi-sequence alignment entity that in turn contained
> several protein sequence entities, structure entities, each protein sequence
> could contain a set of reference entities. Each entity could be in a
> different DTD.

I am assuming that multiple DTDs doesn't mean multiple files, one per DTD.

> We also need a DTD for page or canvas composition of multiple display loci,
> for embedding a figure in a figure, for example.

Okay.

> The gatekeeper can also handle finding appropriate loci:

Okay, now we _are_ talking about Locus IAB ;-)

> workspace says I have a BICML nucleotide sequence v4.1 object, I want to
> perform a restriction map with these enzymes and see the sizes of the digested
> fragments.
>
> A tacg locus on server.example.com can reply saying, I can do the analysis,
> please send me the sequence, and the enzymes you want off of this list, to
> view the output you need a locus that can display v3.5 digest files, here is a
> url for a gnome-python locus for a compatible viewer.

That's an excellent point. I have been assuming that the Workspace will only use loci (including Locus IAB) that have the appropriate capabilities. On the remote end, as with Locus IAB, the programs may be more up to date than those on the user's machine. I guess Locus IAB can tell the Workspace (locus database) that it has native v4.1 capabilities, but can provide a script to handle the older v3.5. If v3.5 is the only option for the user's version of Loci, it will have to use it (but it would use v4.1 if at all available).

> Again, we don't have to pass back the UI code, just a URL to it, the workspace
> may well already have a copy locally.

Hmmmm. Either way, we're still talking about transferring a script.

> I think it's a bad idea to embed the
> python code in the xml. It violates the principle that the DTDs should stick
> to the point, and it really gets ugly when you consider the security
> implications.

I wasn't talking about anything that would require a DTD, just a marker to say .

> Locus will ship with loci for displaying many kinds of DTDs,
> and a site manager may well not allow the workspace to download untrusted
> code.

Security is an issue for executing _any_ code from a remote source, whether or not it happens to be in the same file as the XML.

> With my proposal, the workspace just has to locate any locus that can
> display the result DTD, you may well have several sequence viewers on your
> machine already.

It's true that there may be more than one locus (viewer) to handle the same job. It will be an interesting challenge to get the Workspace to figure this out without the user's help.
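
The v4.1-versus-v3.5 case above is a small negotiation problem: use the newest format version that both sides support. A hedged Python sketch follows; the function names and the "v4.1"-style version strings are assumptions for illustration, since no version scheme has actually been defined.

```python
def parse_version(text):
    """Turn 'v4.1' or '4.1' into a comparable tuple such as (4, 1)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))

def pick_format_version(offered, supported):
    """Return the newest version string found in both lists, or None.

    `offered` is what the remote locus can produce; `supported` is what
    the local workspace has a viewer for.
    """
    supported_set = {parse_version(v) for v in supported}
    common = [v for v in offered if parse_version(v) in supported_set]
    if not common:
        return None
    return max(common, key=parse_version)

# The remote locus speaks v4.1 natively and can fall back to v3.5.
print(pick_format_version(["v4.1", "v3.5"], ["v3.5"]))          # prints v3.5
print(pick_format_version(["v4.1", "v3.5"], ["v3.5", "v4.1"]))  # prints v4.1
print(pick_format_version(["v4.1"], ["v3.5"]))                  # prints None
```

Comparing parsed tuples rather than raw strings keeps a version such as v3.10 ordered after v3.5.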
But I don't see how pointing to a URL for the GUI script will be any more secure.

> That's what I mean. An analysis locus can just say my output is in BICML v3.2
> format, here is the url for a viewer if you don't have one. The workspace
> then chooses whether or not to retrieve the UI code.

Oh. But how about the UI code is in the same file, as originally planned, but Loci/Workspace will have a setting to not execute the UI code if it comes from the Internet or from an unknown URL.

Won't this work too? I just want to keep everything in the same data stream.

> I argue that all our data structures should be representable as XML. This
> would let people write loci in any language, export individual components for
> other tools, and facilitate exchange of data.

Add that to my list:

PAOS -> for active communication about the internals of Loci
XML -> for archiving and translating bio data AND LOCI INTERNALS

> Storing data in python specific or binary formats restricts your options.

Yes.

> Hopefully, we'll soon be able to embed Loci figures in our gnome word
> processor papers!

Hmmmm. I suppose, if someone writes a translator :-)

BTW, I have been including the GNOME API to take advantage of GNOME features like ORBit and even the GNUmeric spreadsheet, which can be accessed via ORBit/CORBA.

Jeff

From rahul at photino.sid.rice.edu Wed Mar 31 15:57:45 1999
From: rahul at photino.sid.rice.edu (Rahul Jain)
Date: Fri Feb 10 19:18:07 2006
Subject: [Pipet Devel] Another XML proposal - Part deux.
In-Reply-To: <3701DF9A.F0D8000D@bc.edu>
Message-ID:

On Wed, 31 Mar 1999, J.W. Bizzaro wrote:

> > Or something along these lines. This example is missing a lot of things.
> > I'm not sure how python handles hashes either. This is actually perl/c-ish
> > here.
> > Python doesn't use symbols to type variables. Glib implements hashes. We should be able to use it since we're using gtk anyway. However, we might not want to require the remote loci to have glib installed. -- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From hortiz at neurobio.upr.clu.edu Wed Mar 31 13:23:55 1999 From: hortiz at neurobio.upr.clu.edu (Humberto Ortiz Zuazaga) Date: Fri Feb 10 19:18:07 2006 Subject: [Pipet Devel] Another XML proposal - Part 3. In-Reply-To: Your message of "Wed, 31 Mar 1999 09:15:30 GMT." <3701E7B2.4BF2F65A@bc.edu> Message-ID: <199903311823.OAA24194@chimbo.neurobio.upr.clu.edu> > > Yes, I propose a DTD for each kind of relationship > > I am assuming that multiple DTDs doesn't mean multiple files, one per DTD. I hope not, but don't know enough about XML yet. > > Again, we don't have to pass back the UI code, just a URL to it, the workspace > > may well already have a copy locally. > > Hmmmm. Either way, we're still talking about transferring a script. Perhaps not. Say I'm doing an analysis on a locus that returns multiple alignments. I get my results back with the url for a multiple sequence viewer, but I already have one installed from a trusted source. I won't need to get a new one. > > I think it's a bad idea to embed the > > python code in the xml. It violates the principle that the DTDs should stick > > to the point, and it really gets ugly when you consider the security > > implications. > > I wasn't talking about anything that would require a DTD, just a marker to say > . But that's what I mean.
What does have to do with describing nucleotide sequences? > > Locus will ship with loci for displaying many kinds of DTDs, > > and a site manager may well not allow the workspace to download untrusted > > code. > > Security is an issue for executing _any_ code from a remote source, whether or > not it happens to be in the same file as the XML. Presumably, any loci shipped with the Locus distribution are safe. > > With my proposal, the workspace just has to locate any locus that can > > display the result DTD, you may well have several sequence viewers on your > > machine already. > > It's true that there may be more than one locus (viewer) to handle the same > job. It will be an interesting challenge to get the Workspace to figure this out > without the user's help. > > But I don't see how pointing to a URL for the GUI script will be any more > secure. No, say I want to view a multiple alignment. If I have any appropriate display locus already installed from a trusted source, then I won't need to execute _any_ remote source. It's like with HTML. If I write valid HTML 4.0, then it doesn't matter what browser a user views it with, or where he got it from, it's sufficient that the user has a compliant browser. I say all loci should return results that say "This is valid bicml nucleotide sequence v4.2". I don't want a locus to say "Best viewed with Internet Genome Explorer 10.3". > Oh. But how about the UI code is in the same file, as originally planned, but > Loci/Workspace will have a setting to not execute the UI code if it comes from > the Internet or from an unknown URL. > > Won't this work too? I just want to keep everything in the same data stream. No, I specifically think putting the UI code into the data stream is bad. After the first time I get the UI, why should I keep downloading it? Imagine having to download Netscape every time you wanted to view an HTML file.
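The lookup argued for above (a result only declares its format and version, and the workspace picks any trusted display locus that is already installed) could be sketched roughly as follows. This is only an illustration: the class `LocusRegistry`, its methods, the format string and the viewer names are all hypothetical and are not part of any existing Loci code.

```python
# Hypothetical sketch of a local registry of display loci, keyed by the
# format and version that a result declares. Nothing here is real Loci code.

class LocusRegistry:
    def __init__(self):
        # (format, version) -> list of (locus name, trusted flag)
        self._viewers = {}

    def register_viewer(self, name, fmt, version, trusted=False):
        """Record that the locus `name` can display `fmt` at `version`."""
        self._viewers.setdefault((fmt, version), []).append((name, trusted))

    def find_viewer(self, fmt, version, trusted_only=True):
        """Return the first matching locus name, or None if there is none."""
        for name, trusted in self._viewers.get((fmt, version), []):
            if trusted or not trusted_only:
                return name
        return None


registry = LocusRegistry()
registry.register_viewer("local_sequence_viewer",
                         "bicml-nucleotide-sequence", "4.2", trusted=True)
registry.register_viewer("downloaded_viewer",
                         "bicml-nucleotide-sequence", "4.2", trusted=False)

# A result carries only its declared format, never UI code.
result = {"format": "bicml-nucleotide-sequence", "version": "4.2"}
viewer = registry.find_viewer(result["format"], result["version"])
print(viewer)
```

With a registry of this kind, the paranoid setting is just `trusted_only=True`, and a missing viewer (`None`) is the point at which the workspace would decide whether to fetch one.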
As a matter of fact, an analysis locus doesn't even have to publish the url for a display locus in every result, we can use the AppBroker (can we call her the Matchmaker instead? she facilitates the exchange of loci) to keep track of where to get a display locus for the results. So if I write a new analysis locus and want to make it available to the community I can publish the location of the analysis locus (source or service) and the output formats. Display loci publish the input formats they support, and the AppBroker can make sure your workspace has the appropriate display locus, or help you locate one, or find a translation locus that can do the conversion for you. -- Humberto Ortiz Zuazaga Bioinformatics Specialist Institute of Neurobiology hortiz@neurobio.upr.clu.edu From hortiz at neurobio.upr.clu.edu Wed Mar 31 11:28:40 1999 From: hortiz at neurobio.upr.clu.edu (Humberto Ortiz Zuazaga) Date: Fri Feb 10 19:18:07 2006 Subject: [Pipet Devel] Another XML proposal - Part 1. In-Reply-To: Your message of "Wed, 31 Mar 1999 08:24:26 GMT." <3701DBBA.8CF11553@bc.edu> Message-ID: <199903311628.MAA24038@chimbo.neurobio.upr.clu.edu> > > but rather than having a bunch of > > hardcoded loci types, we can query the locus for it's interface (of course > > we'll want to cache interfaces). > > Ohhh yes! This is the database I was just talking about. Maybe it's a part of > the benchtop, but it keeps track of all loci available to the user and what they > can do. But instead of the locus being queried when it is about to be used, it > is queried when it becomes accessible to the workspace. > > This is important for hot-plugging loci. The user can add loci while Loci is > running. Maybe at a certain time interval, the workspace (database part) > queries all accessible loci, and the loci return values informing the workspace > what they can do. Yes, this is good. 
Installing a display locus (from wherever, more on this later) should register the locus as able to display a certain set of DTDs, or an analysis locus can register the type of analysis performed and the input and output formats it handles. Once registered, the workspace can find these locally and dispatch them immediately. The app broker locus can also be queried at run time to find display loci or analysis loci meeting certain requirements. Perhaps the workspace could ask the app broker to find source for a widget to display frobnicated sequences. This source could then be downloaded and registered locally. I could set up my app broker to only accept source loci from "trusted" sources, or to only send my data to trusted analysis loci. If I were really paranoid, I could turn off the app broker, and only run locally registered loci. Is the locus database different from the app broker, or can they be merged? The trick then is how the app broker finds out about loci at remote sites. -- Humberto Ortiz Zuazaga Bioinformatics Specialist Institute of Neurobiology hortiz@neurobio.upr.clu.edu From bizzaro at bc.edu Mon Mar 22 18:45:38 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:09 2006 Subject: [Pipet Devel] SGML book Message-ID: <36F6D622.2331D42F@bc.edu> Locians, I want to mention a book I bought the other day. Most references to XML deal with "Web applications" for the language. Of course, Loci at its core has nothing to do with the Web. But I came across this book that addresses the use of SGML (XML's parent) for non-Web software development: McGrath, Sean PARSEME.1ST Prentice Hall PTR, 1998 ISBN 0-13-488967-3 For those who are interested in the use of XML in this project, you may want to check your library for it. There are many useful examples (I think), and some are even in Python. Jeff -- J.W. 
Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Mon Mar 1 03:58:45 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:19 2006 Subject: [Pipet Devel] and still more infrastructure things References: Message-ID: <36DA56C5.C708C804@bc.edu> Carlos, Thank you for the e-mail and the GIF! Carlos Maltzahn wrote: > > I totally agree that Paos shouldn't shuffle around real data. I see the > role of Paos as a coordination tool but not as a database management > system. Yeah. I'm ironing out how I think Paos and XML fit into this project. It does appear that they are both needed, being complementary. XML I think is best as a manager for the biological data, while Paos is best as "guide" for the path taken. > I attached a GIF picture to this mail. This picture contains Gnome > clients, Paos server, and Tool Manager (excuse me if I introduce yet > another set of terms). Gnome clients and Tool Manager are Paos clients. A > Gnome client consists of a GCL editor and progress monitor, among other > things. A Tool Manager > > - parses XML data and forwards it to the actual tool, What should be the ratio of tool managers to tools? I didn't see the actual tool represented in the GIF, so I'm assuming it is 1:1. If so, could each tool manager be _embedded_ in the code of a tool?...at least using the "include" command. > - turn the result of a tool into XML data and send it to another tool > manager Hmmm. You see again here that the tool manager does what I'd expect the tool to do. Maybe we're thinking the same way about this. > - sends status information to a Paos server (e.g. processing started or > completed, or processing ran out of memory), > - receives notifications from a Paos server (e.g. "suspend", "abort", > or status query), > - queries a Paos server about where to send results to, Okay.
> The thin lines are communicating Python objects, the thick > lines communicate XML structures. Note that the destination of Tool > Manager can also be a Gnome client which is used to visualize results. ...the XML is sent back to the user at the end? > Another question in the discussion was whether to use Python objects for > communication or XML. XML is safer because it is an accepted and > extensible standard. However, transfering serialized objects was the > performance bottleneck in the Chautauqua workflow system (which uses > Paos) and I introduced a bit of trickery to reduce this overhead. What really worried me was the thought of a _single_ Paos process managing _everything_, including every read and write for every single XML, just to get workflow information. Agreed it would be best to leave active workflow information to RAM. > So I > would recommend sticking with Python objects for Paos communications but > use XML for everything else. > So Justin, do you agree then that our new XML should not include workflow information? The top of the XML, however, may still need an ID# to better track it through the system. ...and good luck on your exam this morning! I'm on spring break this week, which I'll use to get a bunch of things taken care of, including work on this project. Jeff -- J.W. 
Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From Thomas.Sicheritz at molbio.uu.se Mon Mar 1 04:20:07 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] more nice interfaces In-Reply-To: <199902261451.PAA15154@dirac.cnrs-orleans.fr> References: <36D34556.673C4A2B@bc.edu> <14038.43467.520971.536971@beagle.bmc.uu.se> <199902261451.PAA15154@dirac.cnrs-orleans.fr> Message-ID: <14042.19355.294867.863525@beagle.bmc.uu.se> > > Questions: > > * how can I combine a python module with a python class definition > > I want to add python code to the c-module ... > > Sorry, I don't understand what you are trying to do. Something > with Python and C and modules... Could you give a more detailed > description? I have a c-module with functions bb_sequence.reverse, bb_sequence.complement etc. and I want to create a class in python called bb_sequence ... the original question was about how to add the c-module into the class ... but I think I am going to rename the c-functions and wrap them in the python code. > > * how can I implement this tcl code in python ? > > foreach i "reverse complement antiparallel" { > > puts [eval bb_sequence.$i $seq] > > } > > I'd have to know what the Tcl code means! I suppose it's a loop > over three strings, which in Python is > > for i in ["reverse", "complement", "antiparallel"]: > .... > > But I don't understand the stuff with "puts" etc.
I'd like to evaluate variables with function names, like: for i in ["reverse", "complement", "antiparallel"]: print i,": ",bb_sequence.i where bb_sequence.bb_sequence.i should be substituted/bound to reverse, complement and antiparallel. puts = print, and [eval bb_sequence.$i $seq] tells the tcl interpreter to substitute all variables before evaluating the expression, e.g. [eval bb_sequence.$i $seq] -> bb_sequence.reverse "actgactagctagcatcgatcgat" Do we have access to a mailing list archive ? - I have been away too long from this list to keep an overview ... thx -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ... From hinsen at cnrs-orleans.fr Mon Mar 1 04:43:09 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] Loci markup language and infrastructure things In-Reply-To: (message from Justin Bradford on Sat, 27 Feb 1999 17:24:27 -0600 (CST)) References: Message-ID: <199903010943.KAA15272@dirac.cnrs-orleans.fr> > Also, for structure, there don't appear to be any MLs even attempting to > do this, with the exception of CML. So, my idea is to take the PDB file > format and XMLize it. If any of you know any glaring holes in PDB let > me know, and we can work around those. During a visit at EMBL last week, I talked to some people who are working on an mmCIF to XML converter, using an XML version of the mmCIF dictionary (no DTD yet, but it is planned). The goal is to save everybody else the work of writing a parser for the STAR format that CIF and mmCIF are based on. XML parsers are much more widely available.
I can't judge how eager the crystallography community is to move to mmCIF, but from what I heard at EMBL, it seems that mmCIF is getting more and more attention - the PDB will finally accept submitted mmCIF files which contain information that cannot easily be converted to PDB format. So I think it's worth waiting for the mmCIF/XML stuff (which is supposed to be ready soon) instead of doing our own format based on the less flexible PDB format. As for glaring holes in PDB, I think there are many, although mostly related to the rather loose interpretation of the format description that most programs have applied. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Mon Mar 1 04:55:37 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] and still more infrastructure things In-Reply-To: <36D92C03.1AF788EF@bc.edu> (bizzaro@bc.edu) References: <36D92C03.1AF788EF@bc.edu> Message-ID: <199903010955.KAA27218@dirac.cnrs-orleans.fr> > That brings up a big question I had, and where I've been getting confused... > > Is there really any such thing as an "XML object"? I mean, XML is a way to save The confusion seems to be widespread. XML is of course a file format (or rather metaformat), but it is particularly useful to store plain-text file representations of objects. That's the philosophy behind DOM (which defines a standard OO interface to XML documents), and also XML-RPC. In the end it's just a difference of point of view; files store data and objects store data! > structured data as a _file_. 
Python objects, on the other hand, are data > structures in memory. We would just be going back and forth between file and > object using XML. Right. > So, where do we really need XML? Could the data just be a Python > object? If we need to save the object, I think it can just be > "pickled"? Right, in principle. But there are advantages to using XML instead of Python's pickling format, and these are the same advantages that XML has compared to any other format: readability (plain ASCII) and standard syntax. A Python pickle file looks like garbage in an editor, and processing it without using Python requires significant effort. There have been discussions of implementing a pickle-compatible Python module that uses XML files. I don't know how far implementation has progressed, but I definitely expect this to happen soon. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From Thomas.Sicheritz at molbio.uu.se Mon Mar 1 05:03:38 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] more nice interfaces In-Reply-To: <36DA6529.9A4EE77@bc.edu> References: <36D34556.673C4A2B@bc.edu> <14038.43467.520971.536971@beagle.bmc.uu.se> <199902261451.PAA15154@dirac.cnrs-orleans.fr> <14042.19355.294867.863525@beagle.bmc.uu.se> <36DA6529.9A4EE77@bc.edu> Message-ID: <14042.25294.567808.856677@beagle.bmc.uu.se> Hej Jeff, > One major part of the Loci Project is to create a "library" of Python modules > (and C wrapped in Python) that handle common sequence and structure > manipulations. 
The library for structure is something Konrad will hopefully > contribute, along the line of MMTK. > > If you are writing or rewriting code to manipulate sequences (like reversing or > complementing), the code should become part of this library. > > Tim, a guy we haven't heard from in a while, was going to write code to > convert codons into amino acids. Tim, this also needs to be in the > library, in Python or Python/C. I see ... that would make life easier. How are we going to do this practically ? Are we using different namespaces in the library ? How shold I rewrite the rewrite of my lib ? > If you want, I can forward you e-mails from any time span, since I save Loci > e-mail on my computer. No thanks - I have already an overfull INBOX ... :-) What are the latest news concerning markup languages ? -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ... From hinsen at cnrs-orleans.fr Mon Mar 1 05:16:02 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:20 2006 Subject: [Fwd: [Pipet Devel] more nice interfaces] In-Reply-To: <36DA6B68.F8F71BFF@bc.edu> (bizzaro@bc.edu) References: <36DA6B68.F8F71BFF@bc.edu> Message-ID: <199903011016.LAA23134@dirac.cnrs-orleans.fr> > I have a c-module with functions bb_sequence.reverse, > bb_sequence.complement etc. and I want to create a class in python called > bb_sequence ... the original question was about how to add the c-module > into the class ... but I think I am going to rename the c-functions and > wrap them in the python code. That's the best solution. 
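The renaming-and-wrapping approach endorsed here might look roughly like the sketch below. It is only an illustration under stated assumptions: `_bb_sequence_c` stands in for the renamed C extension module and is faked in pure Python so that the example runs, and the class name `BBSequence` and its methods are hypothetical rather than taken from Thomas's library.

```python
import types

# Stand-in for the renamed C extension module (hypothetical name). In the
# real library these two functions would be implemented in C.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")

_bb_sequence_c = types.SimpleNamespace(
    reverse=lambda seq: seq[::-1],
    complement=lambda seq: seq.translate(_COMPLEMENT),
)


class BBSequence:
    """Thin Python class that delegates the heavy lifting to the C functions."""

    def __init__(self, seq):
        self.seq = seq

    def reverse(self):
        # Call the low-level function and wrap the result in a new object.
        return BBSequence(_bb_sequence_c.reverse(self.seq))

    def complement(self):
        return BBSequence(_bb_sequence_c.complement(self.seq))

    def __str__(self):
        return self.seq


s = BBSequence("actg")
print(s.reverse())
print(s.complement())
```

The Python class owns the object-oriented interface, while the C module stays a flat collection of functions, so no C-level type has to be written.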
You *could* write some C-level type and inherit from it in a Python class by using the ExtensionClass package, but there's no point unless you really want/need to provide a generally useful C type. > I'd like to evaluate variables with function names, like: > for i in ["reverse", "complement", "antiparallel"]: > print i,": ",bb_sequence.i That just needs a slight rewrite: for i in ["reverse", "complement", "antiparallel"]: print i,": ", getattr(bb_sequence, i) getattr() works for method names as well, it simply returns a method object, which can be called like a function. From my current understanding of your original Tcl example, the Python equivalent would be: for i in ["reverse", "complement", "antiparallel"]: print i,": ", getattr(bb_sequence, i)(seq) Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From bizzaro at bc.edu Mon Mar 1 05:26:48 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:20 2006 Subject: [Fwd: [Pipet Devel] more nice interfaces] Message-ID: <36DA6B68.F8F71BFF@bc.edu> From Thomas... -------------- next part -------------- An embedded message was scrubbed... From: Thomas.Sicheritz@molbio.uu.se Subject: Re: [Pipet Devel] more nice interfaces Date: Mon, 1 Mar 1999 10:20:07 +0100 (MET) Size: 3494 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990301/3ee52c0e/attachment.mht From bizzaro at bc.edu Mon Mar 1 05:28:07 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:20 2006 Subject: [Fwd: [Pipet Devel] more nice interfaces] Message-ID: <36DA6BB7.AD59D5EA@bc.edu> My reply to Thomas. For some reason we left the mailing list...
-------------- next part -------------- An embedded message was scrubbed... From: "J.W. Bizzaro" Subject: Re: [Pipet Devel] more nice interfaces Date: Mon, 01 Mar 1999 10:00:09 +0000 Size: 2242 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990301/7b12f1ba/attachment.mht From bizzaro at bc.edu Mon Mar 1 05:28:40 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:20 2006 Subject: [Fwd: [Pipet Devel] more nice interfaces] Message-ID: <36DA6BD8.A0733627@bc.edu> Again from Thomas... -------------- next part -------------- An embedded message was scrubbed... From: Thomas.Sicheritz@molbio.uu.se Subject: Re: [Pipet Devel] more nice interfaces Date: Mon, 1 Mar 1999 11:03:38 +0100 (MET) Size: 2944 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990301/6c6ab03a/attachment.mht From hinsen at cnrs-orleans.fr Mon Mar 1 05:48:16 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] libraries - was more nice interfaces In-Reply-To: <36DA7242.76312BC7@bc.edu> (bizzaro@bc.edu) References: <36DA6BD8.A0733627@bc.edu> <36DA7242.76312BC7@bc.edu> Message-ID: <199903011048.LAA12448@dirac.cnrs-orleans.fr> > I think ExtensionClass is great in that the OO paradigm of Python is brought to > the C module. I would like to see this become standard Python in a future > release. But is it worth the effort to use a new Python package for the > library? If we need it, yes ;-) It's too early to decide, in my opinion. > Here is the URL? > > http://www.digicool.com/releases/ExtensionClass/ > > Konrad, do you know the license? I can't find it. Here's the file COPYRIGHT.txt from the source distribution: Copyright (C) 1996-1998, Digital Creations, Fredericksburg, VA, USA. All rights reserved. 
Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: o Redistributions of source code must retain the above copyright notice, this list of conditions, and the disclaimer that follows. o Redistributions in binary form must reproduce the above copyright notice, this list of conditions, and the following disclaimer in the documentation and/or other materials provided with the distribution. o Neither the name of Digital Creations nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY DIGITAL CREATIONS AND CONTRIBUTORS *AS IS* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL DIGITAL CREATIONS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. Looks OK to me. Konrad. 
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From Thomas.Sicheritz at molbio.uu.se Mon Mar 1 05:55:46 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] libraries - was more nice interfaces In-Reply-To: <36DA7242.76312BC7@bc.edu> References: <36DA6BD8.A0733627@bc.edu> <36DA7242.76312BC7@bc.edu> Message-ID: <14042.28588.223446.421175@beagle.bmc.uu.se> Hej all, > Well, I havent really thought about naming conventions for the libraries. Since > you are the first to do this, you get the honor of inventing the namespace. You > can start everything with "locus_", if that is along the line of your question. ? Que ? locus_ ? ... have I missed something ? Why locus ? > I think we have to consider though just how much we will be using C to speed > things up in Loci. Much of what we intend to do will be very > compute-intensive. So maybe. Think about it. > > Here is the URL? > > http://www.digicool.com/releases/ExtensionClass/ Seems as a slight overkill to me ... I'll stick to simple wrapping for the sequence stuff. > > Konrad, do you know the license? I can't find it. http://www.digicool.com/releases/ExtensionClass/COPYRIGHT.html And don't expect any great news from my part ... I have to put most of my time on writting my thesis and keep our genome projects going. Right now brewing more coffee feels like a luxury waste of time ... ;-) > Have you been getting all of the e-mails from the mailing list? > > I count ~20 messages from this weekend and today. Yes ... I have just scanned them. 
I also have saved all other mails from the list, but I am very short in time right now, so I save the fun part until later. c ya -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ... From bizzaro at bc.edu Mon Mar 1 05:56:02 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] libraries - was more nice interfaces References: <36DA6BD8.A0733627@bc.edu> Message-ID: <36DA7242.76312BC7@bc.edu> Thomas wrote: > Hej Jeff, > > > One major part of the Loci Project is to create a "library" of Python modules > > (and C wrapped in Python) that handle common sequence and structure > > manipulations. The library for structure is something Konrad will hopefully > > contribute, along the line of MMTK. > > > > If you are writing or rewriting code to manipulate sequences (like reversing or > > complementing), the code should become part of this library. > > > > Tim, a guy we haven't heard from in a while, was going to write code to > > convert codons into amino acids. Tim, this also needs to be in the > > library, in Python or Python/C. > > I see ... that would make life easier. How are we going to do this > practically ? Are we using different namespaces in the library ? Well, I havent really thought about naming conventions for the libraries. Since you are the first to do this, you get the honor of inventing the namespace. You can start everything with "locus_", if that is along the line of your question. > How shold I rewrite the rewrite of my lib ? 
Quoting Konrad: You *could* write some C-level type and inherit from it in a Python class by using the ExtensionClass package, but there's no point unless you really want/need to provide a generally useful C type. I think ExtensionClass is great in that the OO paradigm of Python is brought to the C module. I would like to see this become standard Python in a future release. But is it worth the effort to use a new Python package for the library? I think we have to consider though just how much we will be using C to speed things up in Loci. Much of what we intend to do will be very compute-intensive. So maybe. Think about it. Here is the URL? http://www.digicool.com/releases/ExtensionClass/ Konrad, do you know the license? I can't find it. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From carlosm at moet.cs.colorado.edu Mon Mar 1 15:07:17 1999 From: carlosm at moet.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] and still more infrastructure things In-Reply-To: <36DA56C5.C708C804@bc.edu> Message-ID: [Carlos Maltzahn] > I attached a GIF picture to this mail. This picture contains Gnome > clients, Paos server, and Tool Manager (excuse me if I introduce yet > another set of terms). Gnome clients and Tool Manager are Paos clients. A > Gnome client consists of a GCL editor and progress monitor, among other > things. A Tool Manager > > - parses XML data and forwards it to the actual tool, [J.W. Bizzaro] What should be the ratio of tool managers to tools. I didn't see the actual tool represented in the GIF, so I'm assuming it is 1:1. If so, could each tool manager be _embedded_ in the code of a tool?...at least using the "include" command. I used "Tool Managers" because I know very little about the nature of the tools you are planning to use. 
In a 1:1 scenario, the tools allow you to import a Python module (e.g., the tools are Python programs or run an embedded Python interpreter). In this case the Tool Manager is a Python module that uses the Paos Client module. But if the tools are supposed to be more independent, the Tool Manager could be something like a remote Unix shell which communicates with Paos and that can control a variety of tools and has access to system information such as memory or CPU usage or process status information. It might make sense to support both solutions. > - turn the result of a tool into XML data and send it to another tool > manager Hmmm. You see again here that the tool manager does what I'd expect the tool to do. Maybe we're thinking the same way about this. See above. > The thin lines are communicating Python objects, the thick > lines communicate XML structures. Note that the destination of Tool > Manager can also be a Gnome client which is used to visualize results. ...the XML is sent back to the user at the end? Not necessarily. At some point the user wants to see the results, of course -- but this could either happen "on-line" (i.e. while the processing is going on), or "off-line" (i.e. after the results are archived). For long-running processing it might be useful to see the result as it emerges (e.g. histograms, scatter plots, etc). This might be interesting not only at the end of a processing pipe but also at intermediate steps. At any rate, I think XML is the way to go in all cases where you want to communicate domain-specific data. > Another question in the discussion was whether to use Python objects for > communication or XML. XML is safer because it is an accepted and > extensible standard. However, transferring serialized objects was the > performance bottleneck in the Chautauqua workflow system (which uses > Paos) and I introduced a bit of trickery to reduce this overhead.
What really worried me was the thought of a _single_ Paos process managing _everything_, including every read and write for every single XML, just to get workflow information. Agreed it would be best to leave active workflow information to RAM. What is RAM? Carlos From justin at ukans.edu Mon Mar 1 15:51:30 1999 From: justin at ukans.edu (Justin Bradford) Date: Fri Feb 10 19:18:20 2006 Subject: [Fwd: [Pipet Devel] more nice interfaces] In-Reply-To: <36DA6BB7.AD59D5EA@bc.edu> Message-ID: >> Do we have access to a mailing list archieve ? - I have been away to >> long from this list to keep an > to use overview ... > > Justin is managing the majordomo account at UKansas. I know there are > some utilities to convert e-mails to HTML. I've set up a mhonarc to archive our email to the web. It's at http://toaster.sped.ukans.edu/tulip-list/ > If you want, I can forward you e-mails from any time span, since I save > Loci e-mail on my computer. I have all of the mail since I joined the project up on it, but if you can send me the mail prior to Janurary 4th, I'll but that up, too. Justin Bradford justin@ukans.edu From bizzaro at bc.edu Mon Mar 1 17:29:24 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] mialing list - was more nice interfaces References: Message-ID: <36DB14C4.18FBD7CD@bc.edu> Justin Bradford wrote: > > I've set up a mhonarc to archive our email to the web. > It's at http://toaster.sped.ukans.edu/tulip-list/ That's great! Thank you! Question: Will the headers always appear on a single page, or can they be split up by month? > > > If you want, I can forward you e-mails from any time span, since I save > > Loci e-mail on my computer. > > I have all of the mail since I joined the project up on it, but if you can > send me the mail prior to Janurary 4th, I'll but that up, too. > What format? I have everything under Netscape Mail...so it's one text file. I can send it that way, unless you need it some other way. 
Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Mon Mar 1 17:41:01 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] and still more infrastructure things References: Message-ID: <36DB177D.6F25852B@bc.edu> Carlos Maltzahn wrote: > I used "Tool Managers" because I know very little about the nature of the > tools you are planning to use. In a 1:1 scenario, the tools allow you to > import a Python module (e.g., the tools are Python programs or run an > embedded Python interpreter). In this case the Tool Manager is a Python > module that uses the Paos Client module. Okay. That's what I was thinking. > But if the tools are supposed to > be more independent, the Tool Manager could be something like a remote > Unix shell which communicates with Paos and that can control a variety of > tools and has access to system information such as memory or CPU usage or > process status information. Okey dokey. I guess in either case, though, the "tool" would lock up during communication...if communication were 2-way and it waited for a reply. > It might make sense to support both solutions. I think so. We'll have more flexibility that way. > Not necessarily. At some point the user wants to see the results, of > course -- but this could either happen "on-line" (i.e. while the > processing is going on), or "off-line" (i.e. after the results are > archived). For long-running processing it might be useful to see the > result as it emerges (e.g. histograms, scatter plots, etc). This might be > interesting not only at the end of a processing pipe but also at > intermediate steps. In any rate, I think XML is the way to go in all cases > where you want to communicate domain-specific data. ..."domain-specific data" meaning the scientific data. 
> What really worried me was the thought of a _single_ Paos process > managing _everything_, including every read and write for every > single XML, just to get workflow information. Agreed it would be > best to leave active workflow information to RAM. > > What is RAM? :-) Random Access Memory, as in 64 MB RAM, system memory, not on disk. How's your thesis coming along? Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From justin at ukans.edu Mon Mar 1 17:44:03 1999 From: justin at ukans.edu (Justin Bradford) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] mailing list and some confusion In-Reply-To: <36DB14C4.18FBD7CD@bc.edu> Message-ID: > Question: Will the headers always appear on a single page, or can they > be split up by month? I'm sure I can split them by month; it might just take a small script. I'll get it working eventually. > What format? I have everything under Netscape Mail...so it's one text > file. I can send it that way, unless you need it some other way. I'm not postive mhonarc will read netscape mail files. I can always try it, and if it doesn't work we can go from there. Also, would it be possible for you to explain your vision for how the various components of Loci interact again. I'll somewhat fuzzy on how some things interconnect at the network level. For instance, network connections occur between what points? What decides the path? What receives status updates? What happens when problems arise? Does something always have to be running on the user's side, and if so, what does it do? I had a vision for the structure, which I don't think is what you had, and all of the terminology we've been using has gotten muddled in my mind. 
Justin Bradford justin@ukans.edu From carlosm at moet.cs.colorado.edu Mon Mar 1 18:15:12 1999 From: carlosm at moet.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] and still more infrastructure things In-Reply-To: <36DB177D.6F25852B@bc.edu> Message-ID: > But if the tools are supposed to > be more independent, the Tool Manager could be something like a remote > Unix shell which communicates with Paos and that can control a variety of > tools and has access to system information such as memory or CPU usage or > process status information. Okey dokey. I guess in either case, though, the "tool" would lock up during communication...if communication were 2-way and it waited for a reply. Yes -- but 2-way communication can also be non-blocking (and should be) -- this is the beauty of notification requests: "tools" can register notification requests that are designed in such a way that a Paos server can effectively query tools. This requires that tools maintain some sort of event loop that periodically checks for events either from Paos or from the actual processing. The Paos Client module supports multiple ways of implementing this: (1) the Client module forks a separate process that listens to the Paos server; upon receiving a notification it interrupts the main process and forwards the notification to the main process (the necessary signal handlers and pipes are all installed by the Client module), (2) Client uses a pre-defined pipe to receive notifications; this is useful if the application does its own event management, (3) same as (1) but it assumes that the application has installed its own signal handler (this is useful if the actual event processing is done in a language other than Python). > It might make sense to support both solutions. I think so. We'll have more flexibility that way. A shell approach has the additional advantage of being more universal. > Not necessarily. 
At some point the user wants to see the results, of > course -- but this could either happen "on-line" (i.e. while the > processing is going on), or "off-line" (i.e. after the results are > archived). For long-running processing it might be useful to see the > result as it emerges (e.g. histograms, scatter plots, etc). This might be > interesting not only at the end of a processing pipe but also at > intermediate steps. In any rate, I think XML is the way to go in all cases > where you want to communicate domain-specific data. ..."domain-specific data" meaning the scientific data. Yes. > What really worried me was the thought of a _single_ Paos process > managing _everything_, including every read and write for every > single XML, just to get workflow information. Agreed it would be > best to leave active workflow information to RAM. > > What is RAM? :-) Random Access Memory, as in 64 MB RAM, system memory, not on disk. Huh? What has physical memory to do with this? How's your thesis coming along? Thesis writing sucks! I wish I had more time to finish the Paos tutorial (it's half done). Carlos From carlosm at moet.cs.colorado.edu Mon Mar 1 18:20:33 1999 From: carlosm at moet.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] mailing list and some confusion In-Reply-To: Message-ID: [Justin Bradford] Also, would it be possible for you to explain your vision for how the various components of Loci interact again. I'll somewhat fuzzy on how some things interconnect at the network level. For instance, network connections occur between what points? What decides the path? What receives status updates? What happens when problems arise? Does something always have to be running on the user's side, and if so, what does it do? I had a vision for the structure, which I don't think is what you had, and all of the terminology we've been using has gotten muddled in my mind. I agree. 
A new design diagram or a glossary would be very useful at this point. Carlos From bizzaro at bc.edu Mon Mar 1 18:55:37 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] some confusion References: Message-ID: <36DB28F9.CD61F34C@bc.edu> Justin Bradford wrote: > Also, would it be possible for you to explain your vision for how the > various components of Loci interact again. I'll somewhat fuzzy on how some > things interconnect at the network level. > For instance, network connections occur between what points? > What decides the path? What receives status updates? What happens when > problems arise? Does something always have to be running on the user's > side, and if so, what does it do? > > I had a vision for the structure, which I don't think is what you had, and > all of the terminology we've been using has gotten muddled in my mind. > I guess your looking for an updated "diagram or gloassary", as Carlos just mentioned. I'll get that out shortly. As far as the details of network communication is concerned, Paos will be more heavily involved in this than I originally planned and should handle all the networking for us. I expect networked communication between tools to be almost identical to communication between tools on the same computer. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Mon Mar 1 19:01:45 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] and still more infrastructure things References: Message-ID: <36DB2A69.F097EEA@bc.edu> Carlos Maltzahn wrote: > > > What really worried me was the thought of a _single_ Paos process > > managing _everything_, including every read and write for every > > single XML, just to get workflow information. Agreed it would be > > best to leave active workflow information to RAM. > > > > What is RAM? 
> > :-) Random Access Memory, as in 64 MB RAM, system memory, not on disk. > > Huh? What has physical memory to do with this? > I mean workflow information is best left to being stored in a data structure, which I'm assuming is kept in physical memory. I guess I'm worng. No big deal. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From carlosm at moet.cs.colorado.edu Mon Mar 1 19:11:19 1999 From: carlosm at moet.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:20 2006 Subject: [Pipet Devel] some confusion In-Reply-To: <36DB28F9.CD61F34C@bc.edu> Message-ID: [J.W. Bizzaro] As far as the details of network communication is concerned, Paos will be more heavily involved in this than I originally planned and should handle all the networking for us. I expect networked communication between tools to be almost identical to communication between tools on the same computer. Except the transfer of XML from tool to tool. It might make sense to make a distinction between streamed tool input and input files. If a tool has to have access to the entire result of the previous tool before it can do anything useful and the tool runs on a host that shares the same NFS with the host of the previous tool, it doesn't make sense to transfer any data: all the tool needs is a pointer to a file that contains the result of the previous tool. On the other hand, if the tool is able to process a stream of data, the entire process can be pipelined which saves a lot of time. Carlos From bizzaro at bc.edu Mon Mar 1 20:58:56 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:21 2006 Subject: [Pipet Devel] some confusion References: Message-ID: <36DB45E0.157BD528@bc.edu> Carlos Maltzahn wrote: > > Except the transfer of XML from tool to tool. Yep, we're going to get killed on the terminology in this project. 
When you say "transfer", you mean reading the XML file from disk and then "streaming" it to the next tool, without writing back to disk? As you wrote below, we shouldn't have to do this if both tools are on the same NFS. But we will need some mechanism for XML transfers between NFS's. > It might make sense to make a distinction between streamed tool input and > input files. Right. So workflow info is streamed and biological data (XML) is read/written from/to a file. > If a tool has to have access to the entire result of the > previous tool before it can do anything useful and the tool runs on a host > that shares the same NFS with the host of the previous tool, it doesn't > make sense to transfer any data: all the tool needs is a pointer to a file > that contains the result of the previous tool. Yes. > On the other hand, if the tool is able to process a stream of data, the > entire process can be pipelined which saves a lot of time. Now by "pipelined", you mean one tool starts getting input before the other tool is even finished with the data...for serialized analyses? I agree we need a glossary. I'll get started on one, but some of the terminology I use may not be "correct" and will need to be changed. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Mon Mar 1 21:55:00 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:21 2006 Subject: [Fwd: [Pipet Devel] libraries - was more nice interfaces] Message-ID: <36DB5304.665B61FC@bc.edu> My reply to Thomas... -------------- next part -------------- An embedded message was scrubbed... From: "J.W. Bizzaro" Subject: Re: [Pipet Devel] libraries - was more nice interfaces Date: Tue, 02 Mar 1999 02:06:15 +0000 Size: 1485 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990302/c464e2f9/attachment.mht From bizzaro at bc.edu Mon Mar 1 23:57:08 1999 From: bizzaro at bc.edu (J.W. 
Bizzaro)
Date: Fri Feb 10 19:18:21 2006
Subject: [Pipet Devel] glossary
Message-ID: <36DB6FA4.51E260D9@bc.edu>

Attached is a glossary of the terms we've been using to describe Loci. Please
let me know of any confusion, changes, or additions.

You will find some names that you haven't seen before. For example, I'd like
to call our XML "BICML", as in "The BIC Group", instead of LociML.

Jeff
--
J.W. Bizzaro                  Phone: 617-552-3905
Boston College                mailto:bizzaro@bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--
-------------- next part --------------
Glossary of Terms for The Loci Project
Version 0.1; 01 Mar 1999

analytical tool
    A small Python module, or program with a Python interface, with the
    ability to work on biological data of type sequence or structure. This
    is non-graphical. See also Library.

Benchtop
    Part of the Workspace. A highly graphical client that is the primary
    user interface for managing data and the Work Flow System.

BICML (formerly LociML or LocusML)
    The Biomolecular Informatics and Computation Markup Language. The XML
    describing the biological data.

client
    Any tool that runs under its own process, usually a graphical tool or
    the Workspace.

collaboratory
    The system by which multiple users, on separate computers, can
    collaborate on a research project.

command/command-line
    The characters typed to start a program that would normally run in a
    console.

Gatekeeper
    The application broker that resides on a remote computer and translates
    Loci data streams and files into commands (input) for remote algorithms
    and queries for remote databases. Output from the algorithms and
    databases is translated back into Loci data streams and files.

Graphical Command Language (GCL) (now deprecated)
    A representation of piped commands and files using pictures. See Work
    Flow Diagram.

graphical/gui tool
    A type of client and type of tool. What is actually seen and used by the
    user to work on the biological data. This doesn't include the Workspace,
    which is not a "tool".

hub
    A remote computer that connects Loci to registered Gatekeepers on the
    Internet.

library
    Used in the common sense. Loci can be considered a library of tools,
    both graphical and analytical.

Notebook
    Part of the Workspace. Keeps a running log, written in HTML, of all work
    performed using Loci. The notebook can take inserted text from the user
    but no deletions. This is an electronic version of a laboratory
    notebook.

object
    Used in the common sense. Objects are data that can be streamed or
    stored. BICML files are not considered objects.

tool
    A Python module or program wrapped in Python. Can be either graphical or
    analytical, but is used for work on biological data.

local
    On the user's very own computer or NFS.

loci
    Plural of locus.

locus
    Any real part of Loci: modules, tools, clients, and libraries, including
    hubs and remote programs. Usually doesn't include data. "Locus" can in
    fact appear before every other name in this glossary beginning with a
    capital letter. The loci are represented on the Work Flow Diagram as
    boxes.

Locus AA1DV
    Amino Acid 1-Dimensional Viewer. A graphical tool.

Locus AA1DE
    Amino Acid 1-Dimensional Editor. A graphical tool.

Locus AA2DV
    Amino Acid 2-Dimensional Viewer. A graphical tool.

Locus AA3DV
    Amino Acid 3-Dimensional Viewer. A graphical tool.

Locus NA1DE
    Nucleic Acid 1-Dimensional Editor. A graphical tool.

Locus NA1DV-L
    Nucleic Acid 1-Dimensional Viewer for linear strands. A graphical tool.

Locus NA1DV-Ci
    Nucleic Acid 1-Dimensional Viewer for circular strands. A graphical
    tool.

Locus NA1DV-Ch
    Nucleic Acid 1-Dimensional Viewer for chromosomes. A graphical tool.

Locus NA3DV
    Nucleic Acid 3-Dimensional Viewer. A graphical tool.

path/pathway
    The progress of work performed by the user, manually or automatically,
    that results in one locus calling on another and another, so that the
    progress can be traced. The path is represented on the Work Flow Diagram
    as a line connecting boxes (loci).

Paos
    The "active object server" written by Carlos Maltzahn. Paos acts as the
    communication backbone for the Loci system and a guide through the work
    path.

porta
    Any connection between the local Loci system and another system, be it
    the Gatekeeper, CORBA system, or whatever. Porta can be local or remote.

Porta Internet
    The connection between the local Loci system and the Gatekeeper.

Porta CORBA
    The connection between the local Loci system and a CORBA system.

Python
    The de-facto programming/scripting language of Loci.

query
    A database query sent to the remote database via data stream.

remote
    Not on the user's very own computer or NFS. Across a network or the
    Internet.

remote algorithm
    A command-line program for complex biological analyses, which resides on
    a remote computer.

remote database
    A biological database that resides on a remote computer.

remote program
    A remote algorithm or database that resides on a remote computer.

server
    Used in the common sense. Any program serving or controlling a client.

stream
    Active passing of objects from one locus to another, without writing to
    disk. Usually done via Paos.

Translator
    Client that converts common formats for biological data (such as PDB or
    GenBank) into BICML, and vice versa.

transfer
    Passing files across a porta.

workflow
    The flow of all work being performed on the Loci system.

Work Flow Diagram (WFD)
    The representation or choreography of work. Part of the Workspace. The
    WFD is a dynamic flow chart where loci are represented as boxes and
    paths are represented as lines between boxes.

Work Flow System (WFS)
    Paos control and monitoring of workflow.

Workspace or Locus Primus
    The client(s) that provide user control and monitoring of workflow. This
    includes the Benchtop and Notebook. The Workspace is not considered a
    "tool".
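
The glossary describes the Work Flow Diagram as boxes (loci) joined by lines (paths), with the path recording which locus called which so that progress can be traced. That reads like a small directed graph. What follows is only a minimal sketch under that assumption: the names WorkFlowDiagram, add_path and trace are hypothetical, nothing here talks to Paos, and the locus names are simply borrowed from the glossary for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WorkFlowDiagram:
    """Hypothetical in-memory model of a WFD: loci are boxes, paths are lines."""

    loci: set = field(default_factory=set)
    # Maps a calling locus to the loci it called, in call order.
    paths: dict = field(default_factory=dict)

    def add_path(self, caller, callee):
        """Record that `caller` called on `callee` (one line on the diagram)."""
        self.loci.update((caller, callee))
        self.paths.setdefault(caller, []).append(callee)

    def trace(self, start):
        """Follow paths depth-first from `start`; return loci in visit order."""
        seen, order, stack = set(), [], [start]
        while stack:
            locus = stack.pop()
            if locus in seen:
                continue
            seen.add(locus)
            order.append(locus)
            # Reverse so that the first callee recorded is visited first.
            stack.extend(reversed(self.paths.get(locus, [])))
        return order


wfd = WorkFlowDiagram()
wfd.add_path("Translator", "NA1DE")
wfd.add_path("NA1DE", "Gatekeeper")
wfd.add_path("NA1DE", "NA1DV-L")
print(wfd.trace("Translator"))
```

A plain dictionary of lists is used here because the glossary only asks that paths be traceable; a real Work Flow System would presumably keep this state wherever Paos keeps its objects.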
From Thomas.Sicheritz at molbio.uu.se Tue Mar 2 04:17:45 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:21 2006 Subject: [Pipet Devel] libraries - was more nice interfaces In-Reply-To: <36DB4796.BE47E7D4@bc.edu> References: <36DA6BD8.A0733627@bc.edu> <36DA7242.76312BC7@bc.edu> <14042.28588.223446.421175@beagle.bmc.uu.se> <36DB4796.BE47E7D4@bc.edu> Message-ID: <14043.39828.639775.924162@beagle.bmc.uu.se> > > Que ? locus_ ? ... have I missed something ? Why locus ? > > You are talking about how modules and objects will be named, right? To give > each a unique name so there is no confusion, I think names could start with > "locus", as in _one_ location, singular for loci. PyGTK classes/objects are > named "gtk_whatever". I c > > What do you mean by "Que"? They don't show "Fawlty Towers" oversea ? - ?Que? is the spanish quote from the a "continental cretin" ... (http://www.metronet.co.uk/cultv/fawlty.htm) > > What do you think should be done regarding "namespace"? This basic library - "loci" - should we build it as a shared library - load on demand, or will it contain all vital structures/functions statically linked to the core ... hmm ... in that case what is the core ? A question concerning addon modules: If I am going to write new phylogenomic tools in e.g. pyhton - what do I need to think about to make the programs loci-compatible ? What should I tell other programmers ? Do we need a "locus style guide" ? -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ... From bizzaro at bc.edu Tue Mar 2 23:18:48 1999 From: bizzaro at bc.edu (J.W. 
Bizzaro) Date: Fri Feb 10 19:18:21 2006 Subject: [Pipet Devel] libraries - was more nice interfaces References: <36DA6BD8.A0733627@bc.edu> <36DA7242.76312BC7@bc.edu> <14042.28588.223446.421175@beagle.bmc.uu.se> <36DB4796.BE47E7D4@bc.edu> <14043.39828.639775.924162@beagle.bmc.uu.se> Message-ID: <36DCB828.6E8D962C@bc.edu> Thomas.Sicheritz@molbio.uu.se wrote: > > What do you mean by "Que"? > They don't show "Fawlty Towers" oversea ? - ?Que? is the spanish quote from > the a "continental cretin" ... (http://www.metronet.co.uk/cultv/fawlty.htm) I've seen it on Public Television here. What is the character just before Que? I can't read it on Netscape Mail. > > > > What do you think should be done regarding "namespace"? > This basic library - "loci" - should we build it as a shared library - load > on demand, or will it contain all vital structures/functions statically > linked to the core ... hmm ... in that case what is the core ? There is no "core" other than Paos, which is called on demand and is not a single process. The library of analytical tools (see the glossary) will be shared, only loaded if and when needed, using "include". So really, they're Python modules. To get a better idea, take a look at the modules for Konrad's "Molecular Modeling Tollkit". This may be pretty much what Konrad will contribute to Loci: http://starship.python.net/crew/hinsen/mmtk_manual/examples.html > A question concerning addon modules: > If I am going to write new phylogenomic tools in e.g. pyhton - what do I > need to think about to make the programs loci-compatible ? Good point. With the development of the first graphical tools for Loci, such as your sequence editor, we will be constructing a standard for Loci tools. So, there will be a certain amount of Python code in each tool that is the same for all tools, even new ones. And eventually we should be able to provide other programmers with a nonfunctioning core to which they can add their code. 
What should the standard/core of each graphical tool be able to do?

(1) Read workflow data from Paos
(2) Get bio data from Paos or directly from file
(3) Convert bio data to layout for graphics
(4) Have drawing capability (links to gnome-canvas)
(5) Send work flow data to Paos
(6) Send bio data to Paos or write to file
(7) Find available tools to use or launch (from Paos?)

Anything else?

> What should I tell other programmers ? Do we need a "locus style guide" ?

In a sense. We should have clear instructions on how to add to the core of
the graphical tool, and what the programmer should know about how Loci
functions. This is all very important, because Loci will have to be very
easy to expand for it to get expanded by others at all.

Jeff
--
J.W. Bizzaro                  Phone: 617-552-3905
Boston College                mailto:bizzaro@bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--

From bizzaro at bc.edu Tue Mar 2 23:20:27 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:21 2006
Subject: [Fwd: [Pipet Devel] libraries - was more nice interfaces]
Message-ID: <36DCB88B.1A0DB35C@bc.edu>

From Thomas...
-------------- next part --------------
An embedded message was scrubbed...
From: Thomas.Sicheritz@molbio.uu.se
Subject: Re: [Pipet Devel] libraries - was more nice interfaces
Date: Tue, 2 Mar 1999 10:17:45 +0100 (MET)
Size: 3005
Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990303/dd8fa99d/attachment.mht

From bizzaro at bc.edu Tue Mar 2 23:26:21 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:21 2006
Subject: [Pipet Devel] libraries - was more nice interfaces
References: <36DA6BD8.A0733627@bc.edu> <36DA7242.76312BC7@bc.edu> <14042.28588.223446.421175@beagle.bmc.uu.se> <36DB4796.BE47E7D4@bc.edu> <14043.39828.639775.924162@beagle.bmc.uu.se> <36DCB828.6E8D962C@bc.edu>
Message-ID: <36DCB9EC.FDB3AA9F@bc.edu>

"J.W. Bizzaro" wrote:
>
> all tools, even new ones.
And eventually we should be able to provide other
> programmers with a nonfunctioning core to which they can add their code.
>
> What should the standard/core of each graphical tool be able to do?

To avoid confusion, we should call this a code "skeleton", not a "core".

Jeff
--
J.W. Bizzaro                  Phone: 617-552-3905
Boston College                mailto:bizzaro@bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--

From bizzaro at bc.edu Tue Mar 2 23:36:03 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:21 2006
Subject: [Pipet Devel] libraries - was more nice interfaces
References: <36DA6BD8.A0733627@bc.edu> <36DA7242.76312BC7@bc.edu> <14042.28588.223446.421175@beagle.bmc.uu.se> <36DB4796.BE47E7D4@bc.edu> <14043.39828.639775.924162@beagle.bmc.uu.se> <36DCB828.6E8D962C@bc.edu>
Message-ID: <36DCBC33.7609717F@bc.edu>

"J.W. Bizzaro" wrote:
> The library of analytical tools (see the glossary) will be shared, only loaded
> if and when needed, using "include". So really, they're Python modules.

Of course I mean "import". "include" is C, see? :-)

Jeff
--
J.W. Bizzaro                  Phone: 617-552-3905
Boston College                mailto:bizzaro@bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--

From rahul at photino.sid.rice.edu Wed Mar 3 00:20:14 1999
From: rahul at photino.sid.rice.edu (Rahul Jain)
Date: Fri Feb 10 19:18:21 2006
Subject: [Pipet Devel] and still more infrastructure things
In-Reply-To:
Message-ID:

On Sun, 28 Feb 1999, Justin Bradford wrote:
> I miss enclosed blocks, but otherwise I'm doing ok.
> {
> whitespace usage should
> be random . you can just parse around
>
> it.
> }

Regarding this problem, I think we could hack up a preprocessor in Perl to
convert stuff like this to standard Python, sorta like cpp but customized
for Python. There could also be a line we put at the top that defines the
format of the file. We could also make deprocessors to convert from one
format to the canonical style of the other.

Just some of my many ramblings...
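
The preprocessor idea above can be sketched in a few lines. This is a rough illustration only, written in Python rather than the Perl suggested in the message, and it assumes a made-up dialect: a block opens with "{" at the end of a line, closes with "}" on a line of its own, and statements may end in ";". The function name braces_to_python is hypothetical. The sketch does not handle braces inside strings or dict literals, "} else {" on one line, or empty blocks.

```python
def braces_to_python(source, indent="    "):
    """Convert a toy brace-delimited dialect into indented Python source."""
    depth = 0          # current block nesting level
    out = []
    for raw in source.splitlines():
        line = raw.strip()      # original whitespace carries no meaning
        if not line:
            continue
        if line == "}":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced '}'")
            continue
        opens_block = line.endswith("{")
        if opens_block:
            # "if x > 0 {" becomes "if x > 0:"
            line = line[:-1].rstrip() + ":"
        else:
            # Drop optional trailing semicolons.
            line = line.rstrip(";").rstrip()
        out.append(indent * depth + line)
        if opens_block:
            depth += 1
    if depth != 0:
        raise ValueError("unbalanced '{'")
    return "\n".join(out) + "\n"


example = """
def sign(x) {
      if x > 0 {
  return 1;
      }
return 0;
}
"""
print(braces_to_python(example))
```

Because the input's whitespace is discarded and the indentation is regenerated from the nesting depth alone, whitespace in the source really can be "random", which is the point of the joke in the quoted message. A "deprocessor" in the other direction would walk the indentation levels and emit braces instead.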
-- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| -----BEGIN GEEK CODE BLOCK----- Version: 3.1 GED/S/CS/MD/M/P/O/U/! d->-- s:-(--) a--->? C++(+++)$ UL++++$ P+++$>++++ L+++$>++++ !E--(----)? W++$>+++ N+(--) o>++++$ K? !w---()>? !O? M+ !V--? !PS+? PE+() Y+(++) PGP>+ t !5-->? !X-- R>+ !tv-(+) b+>++ DI+(+++)>++++ D++@>$ G e(*)>++++>+++++>$ h-()>++ r? y? ------END GEEK CODE BLOCK------ See also: http://www.hewgill.com/ogr/ http://www.douglasadams.com Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From bizzaro at bc.edu Wed Mar 3 00:42:44 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:21 2006 Subject: [Pipet Devel] and still more infrastructure things References: Message-ID: <36DCCBD4.9FF39A95@bc.edu> Rahul Jain wrote: > > On Sun, 28 Feb 1999, Justin Bradford wrote: > > > I miss enclosed blocks, but otherwise I'm doing ok. > > { > > whitespace usage should > > be random . you can just parse around > > > > it. > > } > > Regarding this problem, I think we could hack up a preprocessor in perl to > convert stuff like this to standard python, sorta like cpp but customized > for python. There could also be a line we put that the top that defines > the format of this file. We could also make deprocessors to convert from > one format to the cannonical style of the other. > To help ease someone's transition into Python, it might be an interesting project, although Python programmers won't understand it being written in Perl ;-) You would have to replace not only {} but ; Jeff -- J.W. 
Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From rahul at photino.sid.rice.edu Wed Mar 3 01:26:15 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:21 2006 Subject: [Pipet Devel] sorta related school project Message-ID: Hi guys, For a programming class (Visualization in Science and Engineering), I need to write a Mathematica module that will essentially be a guided exploration to implement a specifc algorithim in Mathematica. I want to do something bioinformatics-related, so I was wondering what algorithim lends itself to a simple, but non-trivial, exploration. Mathematica can do some pretty intense mathematical stuff (That's what it was made for, duh), but it's also great at visualization. This project is meant to be for students in a variety of science and engineering fields, so shouldn't rely on more than high school biology. It should be a derivation of the algorithim from first principles for a special case and then a generalization. Check out http://www.owlnet.rice.edu/~comp260/ for some examples of the kind of stuff he wants us to write (You need Mathematica to view them). Does any body have any ideas as to what algorithim I might use? I haven't done much in bioinformatics, so I'll need an explanation of the algorithim or a reference to a book or journal article. Thanks for the help, -- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| -----BEGIN GEEK CODE BLOCK----- Version: 3.1 GED/S/CS/MD/M/P/O/U/! d->-- s:-(--) a--->? C++(+++)$ UL++++$ P+++$>++++ L+++$>++++ !E--(----)? W++$>+++ N+(--) o>++++$ K? !w---()>? !O? M+ !V--? !PS+? 
PE+() Y+(++) PGP>+ t !5-->? !X-- R>+ !tv-(+) b+>++ DI+(+++)>++++ D++@>$ G e(*)>++++>+++++>$ h-()>++ r? y? ------END GEEK CODE BLOCK------ See also: http://www.hewgill.com/ogr/ http://www.douglasadams.com Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From bizzaro at bc.edu Wed Mar 3 01:58:01 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:21 2006 Subject: [Pipet Devel] sorta related school project References: Message-ID: <36DCDD79.78BF4AD9@bc.edu> Rahul, There is a list of biology computation projects, some with Mathematica code, in this book: Richard E. Crandall Projects in Scientific Computation Springer-Verlag, N.Y., 1994 Pages 79-92 Jeff Rahul Jain wrote: > > Hi guys, > > For a programming class (Visualization in Science and Engineering), I need > to write a Mathematica module that will essentially be a guided > exploration to implement a specifc algorithim in Mathematica. I want to do > something bioinformatics-related, so I was wondering what algorithim lends > itself to a simple, but non-trivial, exploration. Mathematica can do some > pretty intense mathematical stuff (That's what it was made for, duh), but > it's also great at visualization. > > This project is meant to be for students in a variety of science and > engineering fields, so shouldn't rely on more than high school biology. It > should be a derivation of the algorithim from first principles for a > special case and then a generalization. Check out > http://www.owlnet.rice.edu/~comp260/ for some examples of the kind of > stuff he wants us to write (You need Mathematica to view them). > > Does any body have any ideas as to what algorithim I might use? I haven't > done much in bioinformatics, so I'll need an explanation of the algorithim > or a reference to a book or journal article. 
> > Thanks for the help, > > -- > -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- > -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- > -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- > -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- > |--|--------|--------------|----|-------------|------|---------|-----|-| > -----BEGIN GEEK CODE BLOCK----- > Version: 3.1 > GED/S/CS/MD/M/P/O/U/! d->-- s:-(--) a--->? C++(+++)$ UL++++$ P+++$>++++ > L+++$>++++ !E--(----)? W++$>+++ N+(--) o>++++$ K? !w---()>? !O? M+ !V--? > !PS+? PE+() Y+(++) PGP>+ t !5-->? !X-- R>+ !tv-(+) b+>++ DI+(+++)>++++ > D++@>$ G e(*)>++++>+++++>$ h-()>++ r? y? > ------END GEEK CODE BLOCK------ > See also: http://www.hewgill.com/ogr/ http://www.douglasadams.com > Version 11.423.999.210000101.23.50110101.042 > (c)1996-1999, All rights reserved. Disclaimer available upon request. -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From rahul at photino.sid.rice.edu Wed Mar 3 03:13:57 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:21 2006 Subject: [Pipet Devel] inter-locus communication In-Reply-To: <36DCB828.6E8D962C@bc.edu> Message-ID: What exactly is the role of Paos in Loci? As I understand it, Paos is simply a way of moving objects back and forth. I think that Justin's original suggestion was good, with some details that need to be worked out. With an XML parser, it should be trivial to convert a LociML file to an object in any language (Python, Perl, C++, etc.). Each of the four sections should be accessible separately, and the part of that section corresponding to a specific step should be equally easy to access. The state is the section that will need to be passed around frequently, the others should be static, except for the data. 
That can have one part for the original data, and then subsequent parts for the results of the analyses. The main difference from Justin's original model that I'm suggesting regarding the structure is that _all_ sequence and related data be stored in the section and then referred to in the queries. Parts would be appended to that with the output of a specific step, containing chunks of data surrounded in an identifying XML tag, such as , , , etc. with whatever identifiers would seem appropriate for that step. Whatever acts as the master controller for this analysis sequence will be in charge of putting all of these pieces together and sending them back to the Workspace. The status section could be sent over an open socket every time the client requests it (so that the updating occurs as fast as the client and server can handle it, but not faster). The data can be streamed over a separate socket for each step or even for each part of each step. So for a generalized example: ** indicates a "URL" that is accessed " indicates data that is transmitted over the actual socket ** lociwfs://some.wfs.server/my-analysis?create would be used to create a query with the name my-analysis + a unique suffix. This name would be sent back to the client and would be the 'session ID'. The client would send all the relevant information at this time, the and the . The server would return " " " [more steps] " " [more steps] " after figuring out what servers will do which step or indicate that it doesn't yet know. The connection can now be closed or left open. This could be specified in the request or not specified at all (if the client closes the connection, then it's closed). ** lociwfs://some.wfs.server/sessionID?status This would be the status socket, which could be closed and then reconnected at any time. Upon connection, the server would send a control section and then a status section " " " " Analysis finished. " " " Analysis failed. Error: ... 
" " " Aborted by user " " " " " " Processing... Reading Sequence... " " " Waiting for output from step q4 " " " Searching for available server... " " The client would then send " to get another status section. Every time the control section is updated (a new server is found), a new control section would be sent. This could also be requested explicitly by " if the client sends " at any time, the analysis would be aborted, " should also be possible. The socket would be closed by the server when all of the steps are either aborted, failed, or finished. It could be closed by the client at any time. ** lociwfs://some.wfs.server/sessionID?cancel[.stepid] would do the same as the 'command'. The server would then send a complete and section. ** lociwfs://some.wfs.server/sessionID?data[.stepid[.blockid#offset]] would send all available data (or from a specific set or block of a set). Offsets only make sense for the specific blocks of the step, as they change as the output grows. The data would be streamed to the client as more is amde available if the request was for a specific block. Actually, now that I think about it, the data for a step or all the data shouldn't be made available like this until all of the parts/steps are finished processing. Partial data should only be available for a specific part. ** lociwfs://some.wfs.server/sessionID?reject I think we should include this to prevent the server from being loaded up with unnecessary sessions. However, that brings up an important point: Should we implement a login/password system and the ability to control readability permissions. I guess permissions should be rw for the creator of the session and ro for others. If a wfs wants to keep certain data unreadable to certain people, I don't think we need to implement that, they should either allow general access or access only to those with accounts. 
This information should also be available, possibly in a separate section that would be sent in the final type of request: ** lociwfs://some.wfs.server/sessionID?info ** lociwfs://some.wfs.server/sessionID?report ** lociwfs://some.wfs.server/sessionID?fullreport would send an entire report of the session, complete with all control information, the final status information (to keep error messages available), and the query. If ...?report is specified, then the server would send output data as well. If ...?fullreport is specified, input data would also be sent. This would only be accessible after the analysis is complete and the session is closed. The structure of the data section would be as follows: [data block] [data block] [more steps] A data block would be structured as follows (I'm open to better ideas here): [data] likewise, you can have and data blocks. Of course we'll have to devise a protocol-level error reporting system to report, for example, that a sessionID is non-existent or that data is not available, or that the offset is larger that the current data. I included a lot of detail here, but I'm open to discussion, I just put the details in so we had a concrete example on the table to work with/argue about. Nothing here is set in stone, especially since I haven't a clue what I'm talking about :). This is only for the communication between the wfs server and the workspace. The communication between the wfs server and the loci can be done in a different way. That can and maybe should involve Paos specifically. We can worry about that later. Also, I think that the definition for transfer in the glossary should also include objects. Whatever, it's 2AM and I feel like nitpicking. There, I feel much better. :)P Wheew, my brain is _tired_. Now to get some sleep... -- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." 
- HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| -----BEGIN GEEK CODE BLOCK----- Version: 3.1 GED/S/CS/MD/M/P/O/U/! d->-- s:-(--) a--->? C++(+++)$ UL++++$ P+++$>++++ L+++$>++++ !E--(----)? W++$>+++ N+(--) o>++++$ K? !w---()>? !O? M+ !V--? !PS+? PE+() Y+(++) PGP>+ t !5-->? !X-- R>+ !tv-(+) b+>++ DI+(+++)>++++ D++@>$ G e(*)>++++>+++++>$ h-()>++ r? y? ------END GEEK CODE BLOCK------ See also: http://www.hewgill.com/ogr/ http://www.douglasadams.com Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From bizzaro at bc.edu Thu Mar 4 04:20:03 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:21 2006 Subject: [Pipet Devel] Paos article Message-ID: <36DE5043.6B4D7C24@bc.edu> Locians, Attached is the Linux Magazin article on Paos, translated to English. I did this with the help of Babelfish, but it still required/s some cleanup. I made this effort because there is much confusion about just what Paos will do for Loci. I hope it helps. Enjoy! Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- -------------- next part -------------- Persistent Objects and Workflow-Management with Paos by Carlos Maltzahn ---------------------------------------------------------------------------- Paos is a system for the remote-network and consistent administration of Python objects. Carlos leads us in today's issue of Python Tools in providing background to this interesting tool. As the larger application developed besides the Workflow Managment system, Chautauqua thereby is introduced and occupies the possibilities by Python in real applications. ---------------------------------------------------------------------------- The Python module shelve supports a storing of Python objects into a file. 
Once a shelve file is open, any object can be stored in it under a name:

    >>> import shelve
    >>> db = shelve.open('database')
    >>> db['first object'] = [1, 2, 3]
    >>> db['second object'] = ('hallo', [])
    >>> db['first object']
    [1, 2, 3]
    >>> db.close()

The stored objects survive the end of this interpreter session. The next time Python is started they can be loaded again with shelve.open('database'), i.e. these objects are persistent.

The implementation of shelve can be configured when the Python interpreter is compiled. Several database C libraries are available, e.g. dbm, gdbm and bsddb. These libraries implement data structures that give fast access to the stored data.

shelve offers no support in two important respects, however. It has no concurrency control: if several processes access persistent objects in the same shelve file, the file can become inconsistent. It also provides no query language.

Paos (Python Active Object Server) builds on shelve and implements a client/server architecture with concurrency control and a simple query language.

http://www.cs.colorado.edu/~carlosm/paos1_arch.gif

In addition, Paos provides a notification service through which Python applications can have the server inform them about certain conditions. Applications define these conditions as queries and register them with the server's notification service. Every time an application stores something on the server, the server applies the registered queries to the stored objects. If the result of a query is not empty, the server sends the result to the application that registered the query.

An example and somewhat more (or too much?) detail

The following example illustrates how an application connects to the server, issues a query and accesses attributes of a loaded persistent object.
We assume that a Paos server is running on the machine cheesy.cs.colorado.edu and waiting for requests on port 5000. The example creates some objects of the class Person, stores them and executes a query.

    import Client
    import ExampleSchema

    # opens a connection to the Paos server
    conn = Client.Connection('cheesy.cs.colorado.edu', 5000, 'example')

    # creates objects
    john = ExampleSchema.Person()
    john.name = 'John'
    sue = ExampleSchema.Person()
    sue.name = 'Sue'
    john.loves = sue
    bill = ExampleSchema.Person()
    bill.name = 'Bill'
    sue.loves = bill
    bill.loves = sue

    # registers the objects with the server
    conn.register_objs([john, sue, bill])

    # stores the objects
    conn.commit([john, sue, bill])

    # gets all instances of 'Person' who love Sue
    answer = conn.get('r', 'Person', [('loves', '==', sue)])

    # for each object in the answer, prints its name and the name of the loved one
    for obj in answer:
        if hasattr(obj, 'loves'):
            print obj.name, obj.loves.name

First we import the module Client in order to be able to open a connection to the Paos server. Then we import the module ExampleSchema, which defines the class Person (see further below). Finally we open a connection by instantiating the class Connection. We give the host name and the port on which the Paos server runs. The third argument can be any name for the application; this name then shows up in the corresponding server log entries.

We then create three Person instances and assign them attribute values. Before we can store these objects, they must first be registered with the server. Registration assigns a unique database number to each new object. This helps us in the query that follows the commit: "give me all objects of the class Person that love the object sue." If sue had not been registered, the server could not compare this object with the stored objects.
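To make the query format concrete, here is a minimal sketch of how a list of (attribute, operator, value) properties might be matched against a set of objects. It is not Paos code: the names OPERATORS, matches and query are made up for illustration, the objects live in a plain list instead of a server, and treating the attribute value as the left operand is an assumption. The operator names and their meanings follow the description in the Paos README.

```python
# Hypothetical sketch, not taken from the Paos source.

def _all_in(a, b):
    # every element of a is also in b (true for an empty a)
    return all(x in b for x in a)

def _some_in(a, b):
    # at least one element of a is in b (false for an empty a)
    return any(x in b for x in a)

# Assumption: the left operand is the attribute value, the right one
# is the value given in the property.
OPERATORS = {
    '==': lambda a, b: a == b,
    '!=': lambda a, b: a != b,
    'in': lambda a, b: a in b,
    'not in': lambda a, b: a not in b,
    'has': lambda a, b: b in a,
    'has not': lambda a, b: b not in a,
    'all in': _all_in,
    'not all in': lambda a, b: not _all_in(a, b),
    'some in': _some_in,
    'none in': lambda a, b: not _some_in(a, b),
}

def matches(obj, properties):
    """True if obj satisfies every (attribute, operator, value) property."""
    for attr, op, value in properties:
        if not hasattr(obj, attr):
            return False
        if not OPERATORS[op](getattr(obj, attr), value):
            return False
    return True

def query(objects, class_name, properties):
    """Return the objects of the named class that match all properties."""
    return [o for o in objects
            if type(o).__name__ == class_name and matches(o, properties)]

class Person:
    def __init__(self, name):
        self.name = name

john, sue, bill = Person('John'), Person('Sue'), Person('Bill')
john.loves = sue
sue.loves = bill
bill.loves = sue

answer = query([john, sue, bill], 'Person', [('loves', '==', sue)])
names = [p.name for p in answer]
print(names)
```

With an empty left-hand list, 'all in' holds while 'some in' does not, which is the distinction the README draws between 'some in' and 'not all in'.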
The first argument 'r' in the query means that the application will only read the objects in the answer. If we want to modify objects, we must either pass 'rw' instead of 'r' in the query, or afterwards acquire write locks on the objects to be modified with the method conn.lock. We can acquire a write lock only if no other client holds it. While we hold a write lock, no other client can modify the corresponding objects. With every conn.commit, and when the program terminates, we lose all acquired write locks.

The query returns a list of two new objects, which are equivalent to john and bill. In the loop that follows, we print the name of the respective loved one (both times 'Sue'). This harmless-looking loop is more involved than it seems: the Client module guarantees that sue, john.loves and bill.loves point to the same object. This is made possible by the registration of sue and by a resolution process built into the attribute access of john.loves and bill.loves. The resolution process is implemented through the built-in Python method __getattr__, which allows attribute access to be redefined. This extended attribute access also loads objects dynamically when they do not yet exist in the client application. (The attribute access is implemented in Schema.py in the class DBobject. The method register_objs of the class Connection in Client.py installs this attribute access for each object in the argument list.)

It is important to understand that this resolution process can only ensure referential consistency among registered objects. In the above example the variables john and bill point to objects that are not contained in the answer. It is the programmer's responsibility to detect when variables point to outdated objects.
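The resolution process described above can be illustrated with a small sketch. This is not the DBobject implementation from Schema.py; the classes Ref and PersistentObject and the class-level cache are hypothetical stand-ins that only show the idea: references to other persistent objects are stored as database numbers and resolved through a cache on every attribute access via __getattr__.

```python
# Hypothetical sketch of cache-based attribute resolution; not Paos code.

class Ref:
    """Placeholder stored in an attribute instead of a direct pointer."""
    def __init__(self, db_id):
        self.db_id = db_id

class PersistentObject:
    # Assumption: one shared cache, mapping db_id to the newest loaded version.
    cache = {}

    def __init__(self, db_id, **attrs):
        # Write to __dict__ directly so that __setattr__ is bypassed.
        self.__dict__['db_id'] = db_id
        self.__dict__['_attrs'] = {}
        for name, value in attrs.items():
            setattr(self, name, value)
        PersistentObject.cache[db_id] = self

    def __setattr__(self, name, value):
        # References to persistent objects are stored by database number.
        if isinstance(value, PersistentObject):
            value = Ref(value.db_id)
        self._attrs[name] = value

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for stored attributes.
        attrs = self.__dict__.get('_attrs', {})
        if name not in attrs:
            raise AttributeError(name)
        value = attrs[name]
        if isinstance(value, Ref):
            # Resolve through the cache: always the newest version.
            return PersistentObject.cache[value.db_id]
        return value

stale_sue = PersistentObject(2, name='Sue')
john = PersistentObject(1, name='John', loves=stale_sue)

# A newer version of object 2 is loaded and replaces the cache entry.
new_sue = PersistentObject(2, name='Sue (updated)')

print(john.loves.name)   # resolved through the cache
print(stale_sue.name)    # the plain variable still holds the old version
```

In this sketch john.loves follows the cache to the newest version of object 2, while the plain variable stale_sue keeps pointing at the old one until it is looked up again in the cache by its db_id.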
With a simple trick, variables can be "refreshed": john = conn.cache[john.db_id]. For this one needs to know that the Connection object maintains a cache of loaded objects, and that this cache is indexed by the database numbers of the registered objects. Each registered object has the attribute "db_id" holding its unique database number. The cache contains the most recently loaded version of every loaded object.

Paos's most interesting feature, however, is the notification service. The following example shows how this service is used:

    import Client
    import ExampleSchema
    import Utilities
    import os
    import pickle

    # creates a pipe for notifications
    (read_pipe_fd, write_pipe_fd) = os.pipe()

    # opens a connection to the server
    conn = Client.Connection('cheesy.cs.colorado.edu', 5000, 'example',
                             (read_pipe_fd, write_pipe_fd))

    # registers a query with the notification service
    request_id = conn.register('Person', [('name', '==', 'Sue')])

    while 1:
        # waits for a notification and reads it
        data = Utilities.READ(read_pipe_fd, 10000)

        # unpacks the notification
        (req_id, obj_list, other_client) = pickle.loads(data)

        # unpacks the identification of the other client
        (other_host, other_pid, other_uid, other_name) = other_client

        # does something with

Compared with the first example, three additional modules must be loaded. Utilities is a module of helper functions that are used in all Paos modules. os and pickle are built-in Python modules that provide operating system functions and functions for converting objects into a string (serialization), respectively.

First we create a pipe, which will later serve as the receiver for notifications. We then open a connection to the Paos server. The Connection call takes the pipe as its fourth argument, so that the pipe can be associated with the connection object.
Then we register a query that makes the server send us all Person objects with the name 'Sue' as soon as such objects are stored in the database again. The conn.register(...) call returns a registration number.

We receive the notification through the pipe. To read it we use a helper function that guarantees that the full length of the notification is read from the pipe. The notification is sent over the network as a string and must be converted back into a Python object on the receiving side. This is done with the call pickle.loads(data).

A notification is a triple consisting of

  * the registration number of the query,
  * the answer to the query in the form of a non-empty object list, and
  * the identification of the application that stored the objects and thereby triggered the notification.

This identification in turn consists of

  * the host name,
  * the process ID of the application,
  * the user ID of the application's user, and
  * the third argument of the Client.Connection call in the application.

Chautauqua: A larger application with Paos

Paos is a spin-off of a workflow research project. One of the results of this project is the experimental workflow system Chautauqua. With its notification service, Paos provides the communication infrastructure for the different Chautauqua system components. Chautauqua users interact with the system through a web browser and a graph editor. The web browser displays dynamically generated "to-do" lists for each coworker and is used for filling out forms. The graph editor displays the structure of an office process and the status of the various jobs. For example, if an office coworker stores the contents of a form in Paos, Paos notifies the Chautauqua workflow manager, which delegates the form to the next office coworker.
Each user can follow this process on their graph editor, since each editor receives the relevant notifications and immediately converts them into graphical representations. What is special about Chautauqua is that it lets users change the structure of the office processes while jobs are running. The following snapshot shows the structure of an office process:

http://www.cs.colorado.edu/~carlosm/paos2_ICN.gif

Coworkers are represented by asterisks, office roles by squares, and activities by sets and triangles. Small dots above and to the right of activities represent "tokens", which indicate the status of the work and move through the graph as the work progresses. With the editor it is possible to change any part of the graph. If activities are deleted, tokens can lose their location. Chautauqua offers mechanisms to collect these lost tokens and assign them new locations in the changed graph.

Information

Paos and Chautauqua are written entirely in Python and are freely available at:

ftp://ftp.cs.colorado.edu/users/carlosm/paos-1.4.tar.gz
ftp://ftp.cs.colorado.edu/users/carlosm/chautauqua-1.4.tar.gz (Chautauqua contains Paos)

More detailed documentation for Paos and Chautauqua is currently in preparation and will be announced in the newsgroup comp.lang.python.

----------------------------------------------------------------------------
Carlos Maltzahn is currently a computer science student in the Ph.D. program of the University of Colorado in Boulder. His research interests at the moment concentrate on Internet caches and distribution indicating. In his spare time he either roams somewhere in the fantastically beautiful Rocky Mountains or builds mobile robots from Fischertechnik. To reach him use carlosm@cs.colorado.edu
----------------------------------------------------------------------------
Copyright © Linux Magazin

From bizzaro at bc.edu Thu Mar 4 04:32:33 1999 From: bizzaro at bc.edu (J.W.
Bizzaro) Date: Fri Feb 10 19:18:22 2006 Subject: [Pipet Devel] Paos README Message-ID: <36DE5331.A33471D6@bc.edu> For more information on Paos, attached in the README file, in English. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- -------------- next part -------------- # # Copyright 1995 Carlos Maltzahn # # Permission to use, copy, modify, distribute, and sell this software # and its documentation for any purpose is hereby granted without fee, # provided that the above copyright notice appear in all copies and that # both that copyright notice and this permission notice appear in # supporting documentation, and that the name of Carlos Maltzahn or # the University of Colorado not be used in advertising or publicity # pertaining to distribution of the software without specific, written # prior permission. Carlos Maltzahn makes no representations about the # suitability of this software for any purpose. It is provided "as is" # without express or implied warranty. # # CARLOS MALTZAHN AND THE UNIVERSITY OF COLORADO DISCLAIMS ALL WARRANTIES # WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF # MERCHANTABILITY AND FITNESS, IN NO EVENT SHALL THE UNIVERSITY OF COLORADO # BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY # DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER # IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING # OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. # # Author: # Carlos Maltzahn # Dept. of Computer Science # Campus Box 430 # Univ. of Colorado, Boulder # Boulder, CO 80309 # # carlosm@cs.colorado.edu # Paos ==== DISTRIBUTION ------------ Paos (Python active object server) is an active multi-user object server with a simple query language. All software is written in Python. 
The distribution consists of the following files:

Store.py - implements storing and locking of objects, the query language and registration of notifications.

Server.py - implements the network interface of Store.py. Server.py imports Store.py and is started by "python Server.py "

Client.py - implements the network interface of a client. It is used by importing it into a Python program.

Schema.py - defines the class DBobject. All objects that are to be stored in the object server need to be of this class or of a class that inherits from it directly or indirectly.

Utilities.py - contains a number of functions that are used in the above modules.

example/
--------

Producer.py - implements a producer that accepts input lines and stores them to the object server. Started by "python Producer.py "

Consumer.py - implements a consumer that prints out lines produced by a producer and is started by "python Consumer.py "

Talk.py - implements two-way communication (accepts input lines and prints out lines received from the server as notifications). Uses the select call and the new pipe feature.

ExSchema.py - contains the schema necessary for Talk.py, Producer.py and Consumer.py

INSTALLATION
------------

Look at http://www.python.org/ for information on how to get and install Python. During installation make sure that you include at least one database module of either dbhash, gdbm, dbm, or macdb. I would recommend dbhash or dbm with the ndbm library because these do not limit the length of records (which gdbm and the default library of dbm do; I don't know anything about macdb).

Second, you need to add the home directory of Paos and all your application directories to the environment variable PYTHONPATH. In this case the application directory would be /example. Sometimes this environment variable is not accessible to the Python application (e.g. in CGI programs for a WWW server). Then your application programs need to import the module "sys" and set the variable "sys.path" appropriately.
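For the case where PYTHONPATH is not available, setting sys.path could look like the following sketch. The directory /usr/local/lib/paos is a hypothetical install location, not a path from the README, and placing the example directory under the Paos home directory is likewise an assumption; substitute wherever Paos and your application actually live.

```python
import os
import sys

# Hypothetical locations; adjust to the actual installation.
PAOS_HOME = '/usr/local/lib/paos'
APP_DIR = os.path.join(PAOS_HOME, 'example')

# Put both directories on the module search path before importing
# Client, Schema or the application's own schema module.
for directory in (PAOS_HOME, APP_DIR):
    if directory not in sys.path:
        sys.path.insert(0, directory)
```

This has to run before the first "import Client", since Python consults sys.path at import time.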
STARTING THE SERVER
-------------------

You start the server by "python Server.py []". The database file name is optional. The default database file name is "database". The server then looks for a file with that name and the extension .db. If it does not exist, the server creates a new file of this name.

CONNECTING TO THE SERVER
------------------------

The client can be either a standalone or an embedded Python program. It needs to import Client.py. This module defines a class called "Connection" which is instantiated as follows:

    import Client
    conn = Client.Connection(, , [, ])

If host and port are correctly specified this creates a TCP connection to the server. The third argument can be an arbitrary string which is only useful for debugging purposes and possible future extensions. The fourth argument is optional. If it is a callback function, this function is called whenever the client receives a notification from the object server (see below on how to register notification requests).

NEW in v0.2: Instead of a callback function you can now pass a pipe (a tuple of a read and write file descriptor returned by os.pipe()). You can use select.select(...) on the read descriptor of the pipe. Use Utilities.READ(...) and pickle.loads(...) to receive the notification (see below for the format of a notification). You also need to apply conn.register_objs(...) to the notification's object list. See the example application.

All interactions with the server are defined as methods of the Connection instance. Note also that you can have multiple connections to the same or different servers. However, currently each object server has a separate object ID name space. Also, each client registers with a client-specific name, not a connection-specific name. Therefore, the client programmer has to take care of possible name collisions. A future version will introduce client naming that is unique over all connections and object ID naming that is unique over all Paos object servers.

Use conn.close() to close the connection.
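The pipe-based way of receiving notifications can be sketched with the standard library alone. The sketch below does not use Paos: the notification tuple is made up and written into the pipe by the script itself, standing in for what the Client module would deliver, and a single os.read replaces Utilities.READ, which is acceptable here only because the message is small. It assumes a Unix-like system, where select.select works on pipe descriptors.

```python
import os
import pickle
import select

# The pipe that would be passed to the connection as (read_fd, write_fd).
read_fd, write_fd = os.pipe()

# Made-up notification: (request id, object list, client identification).
notification = (1, ['<object list>'], ('some.host', 1234, 501, 'example'))

# Stand-in for the Client module delivering a notification.
os.write(write_fd, pickle.dumps(notification))

# Wait up to one second for the read end to become readable.
ready, _, _ = select.select([read_fd], [], [], 1.0)

received = None
if read_fd in ready:
    data = os.read(read_fd, 10000)
    received = pickle.loads(data)
    req_id, obj_list, other_client = received
    print('request', req_id, 'triggered by', other_client[3])

os.close(read_fd)
os.close(write_fd)
```

Because select.select accepts several descriptors at once, the same loop can also watch standard input or sockets, which is presumably what the Talk.py example uses it for.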
QUERYING THE OBJECT SERVER
--------------------------

In order to query the object server you use

    answer = conn.get(, , )

answer is a list of objects.

The first argument can be either 'r' for read-only access or 'rw' for write-locking all objects contained in the answer. If some of the objects contained in answer are already write-locked by another client then the answer is None. Note the difference to an empty list, which merely indicates that there is no object in the object server that matches the query. Note that each failure to acquire write-locks results in the loss of all write-locks acquired so far!

The second argument can be either a list of persistent object references or a class name. A persistent object reference is a tuple as follows: ('__db', ). Its second element is an integer issued to each object that is stored in the object server.

The third argument is a list of properties. A property is a tuple as follows: (, , ). Its first element is a string specifying the name of an attribute of the objects specified by the second argument. Its second element is an operator, which can be '==', '!=', 'in', 'not in', 'has', 'has not', 'all in', 'not all in', 'some in', or 'none in'.

The meaning of '==', ..., 'not in' is the same as in Python.

    A list 'has' element iff element 'in' a list.
    A list 'has not' element iff not list 'has' element.
    List A 'all in' list B iff the elements of A are a subset of elements of B.
    List A 'not all in' list B iff not list A 'all in' list B.
    List A 'some in' list B iff there exists a non-empty subset C of elements of A which is also a subset of elements of B.
    List A 'none in' list B iff not list A 'some in' list B.

Note that 'some in' is not the same as 'not all in'. In the first case the subset C has to be non-empty; in the second case C can be empty.

CREATING NEW OBJECTS
--------------------

Each new object that is created in a client and that is eventually written to the object server needs to be registered with the server PRIOR TO COMMIT TIME. Objects that are not registered at commit time can cause bad inconsistencies!
In general new objects should be registered before your first access to one of their attributes with references to other persistent objects. Each registered object receives a unique persistent object ID under the attribute name "db_id". Use

    db_id_list = conn.register_objs()

db_id_list is a list of db_id integers in the order corresponding to the argument list. The argument is a list of objects. It can contain registered and unregistered objects. Registering already registered objects is useful in connection with notifications (see below). All unregistered objects in the list acquire write-locks.

STORING OBJECTS
---------------

Objects are stored by using

    ret = conn.commit()

ret is either 'ok' or None if an error occurred at the server (the diagnostics printed out by the server will give more information about the error - I'm aware that this is not a good solution; future versions will hopefully offer better error handling). The argument is a list of objects. It contains all the objects that are supposed to be written to the database. However, only objects that were previously locked will be written to the object server; read-only objects are simply ignored.

LOCKING OBJECTS
---------------

It is possible to write-lock objects once they are loaded. Use

    answer = conn.lock()

answer is a list of the objects locked. The order of the list corresponds to the argument list. However, answer contains the versions of objects as they were found in the object server at locking time. If the lock failed answer is None and all previously acquired locks are released. The argument is a list of persistent objects to be locked. Objects that are not explicitly mentioned in the list (i.e., are only directly or indirectly referenced by objects explicitly mentioned in the list) are ignored.

Note: 'lock' is faster than 'get' in the case of failed locking: 'get' retrieves objects before checking their locks while 'lock' checks locks first.
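The all-or-nothing behaviour of write-locks can be sketched as a small lock table. This is only an illustration of the rule stated above (a failed lock attempt returns None and costs the client every lock it already held); the class LockTable and its methods are hypothetical and say nothing about how Store.py actually keeps track of locks.

```python
# Hypothetical sketch of all-or-nothing write-locking; not Paos code.

class LockTable:
    def __init__(self):
        self.owner = {}   # db_id -> name of the client holding the lock

    def lock(self, client, db_ids):
        """Lock all db_ids for client, or fail and drop all of its locks."""
        # Check the locks first, before touching any object.
        for db_id in db_ids:
            if self.owner.get(db_id, client) != client:
                self.release_all(client)
                return None
        for db_id in db_ids:
            self.owner[db_id] = client
        return list(db_ids)

    def release_all(self, client):
        """Release every lock held by client."""
        held = [db_id for db_id, c in self.owner.items() if c == client]
        for db_id in held:
            del self.owner[db_id]

table = LockTable()
first = table.lock('A', [1, 2])    # A locks objects 1 and 2
second = table.lock('B', [3])      # B locks object 3
third = table.lock('B', [2, 4])    # fails: object 2 belongs to A
print(first, second, third, table.owner)
```

After the failed third call, client B no longer holds its lock on object 3 either, so it has to reacquire everything it needs.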
Note also that there are three occasions where all previously acquired
locks are lost: (1) calling "commit", (2) calling "lock" which fails, and
(3) closing the connection or terminating the client.


ATTRIBUTE ACCESS
----------------

Assume you load objects a and b, and a.attr = b, i.e. a.attr contains a
pointer to b. Now you issue a query that loads b and c. Without further
measures, a.attr and b would now refer to different objects because
a.attr points to an older version of b. With many objects referring to
each other it can become quite difficult to keep track of all the
different versions of objects.

In Paos each connection instance maintains an object cache that is
updated by all connection methods except get_raw_notification() (see
below). Attribute access on registered objects always accesses objects in
the cache. Thus, in the above example a.attr always refers to the newest
version of b. If a user wants to keep the older version of b she needs to
assign it to a variable v before the next query. However, the references
from that older version of b to other persistent objects still always
point to the newest versions.

Another advantage of this policy of attribute access is that the client
will load objects from the object server as needed. For example, if you
load object a and you assign v = a.attr then the client will
automatically load b unless it is already in the cache.

This convenience comes with a price: When you define persistent object
classes you need to enumerate those attribute names whose values can
contain references to other persistent objects. This information is kept
in a special attribute called '__refs'. For example:

    import schema

    class A(schema.DBObject):
        def __init__(self):
            schema.DBObject.__init__(self)
            self.__refs = ['attr']

This assumes that instances of class A have an attribute called 'attr'
that can refer to other persistent objects.


NOTIFICATIONS
-------------

With

    request_id = conn.register(, )

you can register a notification request.
Its two arguments have the same meaning as the corresponding arguments of
"get". A notification request is a query that is stored at the object
server and evaluated in each subsequent "commit" against the set of
objects that is written to the object server. If the result of such a
query is not empty the client which registered the notification request
is notified. The format of the notification is a tuple of three elements:

    (, , )

The request ID corresponds with the value returned by the corresponding
"register" call, i.e. it identifies the corresponding query. The object
list is the list of objects that match the query. The client identifier
identifies the client that triggered the notification.

Note that no client can register a notification request for another
client; each notification request corresponds to exactly one client. Also
note that notification requests do not survive a client's lifetime: If a
client terminates (or crashes) all notification requests owned by that
client are deleted.

There are multiple ways for a client to process notifications. If the
connection to the server was created with a pointer to a callback
function in the fourth argument then the client is interrupted at each
notification (with the signal SIGUSR1) and the callback function is
called. Otherwise the client needs to poll for notifications. In both
cases notifications are retrieved by

    notification = conn.get_notification()

Note that a notification is generated for each registered notification
request. For example, if a client registered two requests and a
subsequent commit contains objects matching both requests then the object
server sends two notifications to the client. Also note that multiple
notifications triggered by one commit are sent in the order in which
their requests were registered.

Each "get_notification" updates the object cache (see paragraph about
attribute access).
One can avoid this cache update by using

    notification = conn.get_raw_notification()

Note, however, that attribute access in objects within the notification
is not resolved correctly since these objects are disconnected from the
attribute resolution mechanism discussed above. To connect these objects
to the resolution mechanism use "register_objs" (this updates the object
cache).

If there are no notifications "get_notification" returns None.

With

    conn.unregister()

you can retract a notification request.

From bizzaro at bc.edu Thu Mar 4 11:49:39 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:22 2006
Subject: [Pipet Devel] [Fwd: An EMBOSS release to play with]
Message-ID: <36DEB9A3.6A8318F@bc.edu>

From the EMBOSS mailing list:
-------------- next part --------------
An embedded message was scrubbed...
From: Peter Rice
Subject: An EMBOSS release to play with
Date: Thu, 4 Mar 1999 15:08:50 GMT
Size: 3654
Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990304/83005b51/attachment.mht

From david.lapointe at umassmed.edu Thu Mar 4 15:02:54 1999
From: david.lapointe at umassmed.edu (david.lapointe@umassmed.edu)
Date: Fri Feb 10 19:18:22 2006
Subject: [Pipet Devel] New Book Release
Message-ID: <93307F07DE63D211B2F30000F808E9E525D6EF@edunivexch02.umassmed.edu>

Developing Linux Applications using GTK+ and GDK (Feb 1999). It doesn't
seem to be so much a reference as a collection of applications using GTK+
(a notepad editor, a molecule viewer (PDB), a graphical apache log
analyzer, etc).

http://www.mcp.com/publishers/new_riders/catalog/new_riders_nr_bud.cfm

David Lapointe
Manager - Research Computing Services
UMass Medical School
Worcester, MA 01655
508/856-5141

From carlosm at moet.cs.colorado.edu Thu Mar 4 15:25:37 1999
From: carlosm at moet.cs.colorado.edu (Carlos Maltzahn)
Date: Fri Feb 10 19:18:22 2006
Subject: [Pipet Devel] Paos article
In-Reply-To: <36DE5043.6B4D7C24@bc.edu>
Message-ID:

Thanks a bunch Jeff!
I put a proof-read version of your translation on the web: www.cs.colorado.edu/~carlosm/paos-english.html This version also fixes some bugs in the original Linux Magazin version. Carlos On Thu, 4 Mar 1999, J.W. Bizzaro wrote: Locians, Attached is the Linux Magazin article on Paos, translated to English. I did this with the help of Babelfish, but it still required/s some cleanup. I made this effort because there is much confusion about just what Paos will do for Loci. I hope it helps. Enjoy! Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Mon Mar 15 11:05:57 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:22 2006 Subject: [Pipet Devel] inter-locus communication References: Message-ID: <36ED2FE5.85724F@bc.edu> Sorry for the delay in replying to this. Rahul Jain wrote: > > What exactly is the role of Paos in Loci? As I understand it, Paos is > simply a way of moving objects back and forth. I think that Justin's > original suggestion was good, with some details that need to be worked > out. Justin's original suggestion, as I understood it, was to embed workflow information into the XML. I agreed that it is a novel idea, but I have two problems with it: (1) It would greatly increase the amount of parsing and writing involved (2) It would greatly diminish the role of the object server... (thus I asked, "why use PAOS?") We do need PAOS for object serving, but what the XML cannot do on its own, is handle active links between multiple loci. What would XML do in these cases? (1) Workflow information has to be reported back along the path to several loci (2) Several loci need to update a single XML The best solution for this is to have a server manage XML usage. But you see, this is where we need PAOS. 
And if PAOS can handle the workflow information as an XML, wouldn't it be more efficient to just keep this information as data structure objects? > > With an XML parser, it should be trivial to convert a LociML file to an > object in any language (Python, Perl, C++, etc.). Yes. I do see the use of XML for archiving and transferring objects, even workflow objects. But I think the advantage to having an XML that is biodata-only, is that it might be used outside of Loci. Maybe it will be more accepted than BSML or BioML. But if it contains workflow structures that are inseparable from the biological, it may never be used. Perhaps we can make BICML so that it does not *need* workflow data to be complete, but that it can *handle* it. I think if BICML strongly labels biodata with ID#'s, workflow data can be appended to the XML, kept in another XML format, or just kept in PAOS as objects but be easier to track. So we have 4 options for the workflow data: (1) Put it in BICML, mixed with the biodata (2) Put it in BICML, separate from the biodata (3) Put it in a separate XML (4) Leave it as pure objects in PAOS In all cases, I would like PAOS to handle the workflow data. Carlos, I'm curious if an XML parser can be integrated with PAOS. I think it would make all of this simpler, even though a parser could be separate. [cut to save space] > > This is only for the communication between the wfs server and the > workspace. The communication between the wfs server and the loci can be > done in a different way. That can and maybe should involve Paos > specifically. We can worry about that later. Thank you for the prototype. It brings us closer to a format definition for our XML. But I think communication should be handled via PAOS rather than inventing a new system that requires each client to access Internet sockets. Can we come up with a system for options 2 and 3 above?
(2) Put it in BICML, separate from the biodata (3) Put it in a separate XML > > Also, I think that the definition for transfer in the glossary should also > include objects. Whatever, it's 2AM and I feel like nitpicking. There, I > feel much better. :)P I guess we just wanted a short term to describe parse/write. Of course someone can "transfer an object". Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Mon Mar 15 12:36:32 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:22 2006 Subject: [Pipet Devel] inter-locus communication References: <36ED2FE5.85724F@bc.edu> Message-ID: <36ED4520.67992C2C@bc.edu> "J.W. Bizzaro" wrote: > > Perhaps we can make BICML so that it does not *need* workflow data to be > complete, but that it can *handle* it. [cut] > So we have 4 options for the workflow data: > > (1) Put it in BICML, mixed with the biodata > (2) Put it in BICML, separate from the biodata > (3) Put it in a separate XML > (4) Leave it as pure objects in PAOS > I want to stress that in all cases we should make an XML that can include workflow data, but is complete without it, having only bio data.
So with option 1, where the XML has workflow and bio data mixed, can the workflow data be left out by someone who wants to use it as a purely biological ML? If we can do this, and include an XML parser with PAOS, I'll be happy. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From rahul at photino.sid.rice.edu Mon Mar 15 17:55:59 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:22 2006 Subject: [Pipet Devel] inter-locus communication In-Reply-To: <36ED4520.67992C2C@bc.edu> Message-ID: In the plan I gave you guys, I only meant for the XML to be a way to communicate between a wfs and the Workspace (GUI). It was designed so that intermittently connected clients (or people who need to log out) can check up on the status of their analysis from time to time, esp. on a really long analysis. PAOS would most likely be used as the mode of communication between the wfs and the loci. Regarding the comment on making workflow information independent from the biological stuff, I think the data section covers that separation quite well. Keep the bio-related stuff in between the and tags and the rest is Loci-specific workflow information. This format also makes it easy to store the results of an analysis for archival purposes. -- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From bizzaro at bc.edu Tue Mar 16 17:11:33 1999 From: bizzaro at bc.edu (J.W. 
Bizzaro) Date: Fri Feb 10 19:18:22 2006 Subject: [Pipet Devel] inter-locus communication References: Message-ID: <36EED715.9DD9C338@bc.edu> Rahul Jain wrote: > > In the plan I gave you guys, I only meant for the XML to be a way to > communicate between a wfs and the Workspace (GUI). I don't see WFS <---> Workspace (Work Flow Diagram and Notebook) communication being much different from WFS <---> Tool communication. > It was designed so that > intermittently connected clients (or people who need to log out) can check > up on the status of their analysis from time to time, esp. on a really > long analysis. I like this idea. Was it Justin who first suggested it? It's a good argument for keeping a "hard copy" of the workflow data on disk via XML. Imagine that the system goes down for some reason, or even that the user wants to exit Loci and log out. Loci could just pick up later where it left off. > Regarding the comment on making workflow information independent from the > biological stuff, I think the data section covers that separation quite > well. Keep the bio-related stuff in between the and tags > and the rest is Loci-specific workflow information. This format also makes > it easy to store the results of an analysis for archival purposes. This is from your message: [data block] [data block] [more steps] Let's see.. is either workflow or bio and are workflow is workflow [data block] is bio If that is correct, bio data is nested directly in workflow sections in 2 cases. I suppose this is acceptable if the definition of BICML will allow for bio data to go directly under : (amino acid 1-dimensional/sequence) Something like that :-) Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Tue Mar 16 18:23:41 1999 From: bizzaro at bc.edu (J.W. 
Bizzaro) Date: Fri Feb 10 19:18:22 2006 Subject: [Pipet Devel] new server Message-ID: <36EEE7FD.D2821B12@bc.edu> Locians, Tomorrow I will try to finish setting up our new server. It's not much, but it'll work for now: Pentium I/100 MHz 16 MB RAM (maybe I should get more) 10 GB HDD (brand new!) RedHat Linux 5.2 I will register the domain name bicgroup.org. The new Loci Web site will likely be at www.bicgroup.org/loci. But we'll just have an IP address for a while. The computer is partly owned by UMass Lowell, but we will work something out as we (Ken Marx and I) do not want to associate the Loci Project with the University. (I'm trying to avoid intellectual property problems here. I'm not paid by the University or a student there any longer, and I don't want the school to claim rights just because the server is there.) As time passes and funds pass my way, I will set up servers in my home. I want to give everyone an account. This way, we can upload and download what each of us has done. I have considered issues like CVS and patches, but the nature of Loci, being all smallish scripts with one author per script, allows us to avoid these things rather nicely. Isn't Python wonderful? Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Wed Mar 17 12:54:45 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:22 2006 Subject: [Pipet Devel] Dave Beck joins! Message-ID: <36EFEC65.50CACC93@bc.edu> Here is Dave's latest e-mail: -------------- next part -------------- An embedded message was scrubbed... From: Dave Beck Subject: Re: Loci / TULIP Date: Wed, 17 Mar 1999 12:17:26 -0500 Size: 5212 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990317/911f0a46/attachment.mht From bizzaro at bc.edu Thu Mar 18 00:28:24 1999 From: bizzaro at bc.edu (J.W. 
Bizzaro) Date: Fri Feb 10 19:18:22 2006 Subject: [Pipet Devel] new server up Message-ID: <36F08EF8.880A6043@bc.edu> Okay. We've got a dedicated server guys! 129.63.144.25 This will be shortly named onsager.uml.edu But use the IP for now. Everyone gets an account. I will send the pwords to you directly. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From carlosm at moet.cs.colorado.edu Thu Mar 18 02:39:02 1999 From: carlosm at moet.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:22 2006 Subject: [Pipet Devel] new server up In-Reply-To: <36F08EF8.880A6043@bc.edu> Message-ID: Thanks Jeff. Be aware that onsager is not behind a firewall and seems to run Red Hat. I wouldn't use onsager for anything that cannot be restored very easily. Jeff, are you planning to give us some tulip-related web space on onsager? Carlos On Thu, 18 Mar 1999, J.W. Bizzaro wrote: Okay. We've got a dedicated server guys! 129.63.144.25 This will be shortly named onsager.uml.edu But use the IP for now. Everyone gets an account. I will send the pwords to you directly. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Thu Mar 18 03:07:25 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:22 2006 Subject: [Pipet Devel] new server up References: Message-ID: <36F0B43D.F5684219@bc.edu> Carlos Maltzahn wrote: > > Thanks Jeff. > > Be aware that onsager is not behind a firewall and seems to run Red Hat. > I wouldn't use onsager for anything that cannot be restored very easily. I know there is no firewall. But what's wrong with Red Hat? > > Jeff, are you planning to give us some tulip-related web space on onsager? Anything you want. What did you have in mind? Jeff -- J.W. 
Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From jabbo at mindless.com Thu Mar 18 05:02:00 1999 From: jabbo at mindless.com (Tim) Date: Fri Feb 10 19:18:22 2006 Subject: [Pipet Devel] new server up References: <36F0B43D.F5684219@bc.edu> Message-ID: <99Mar18.100530est.131770@gateway.macroint.com> >> I know there is no firewall. But what's wrong with Red Hat? Two words: script kiddies Either use ipchains or a packet filtering router (eg. a POS with a PCI bus ;-)). -- "Lisp has all the visual appeal of oatmeal with fingernail clippings mixed in." --Larry Wall From dave at arginine.umdnj.edu Thu Mar 18 08:17:59 1999 From: dave at arginine.umdnj.edu (Dave Beck) Date: Fri Feb 10 19:18:22 2006 Subject: [Pipet Devel] new server up In-Reply-To: <36F0B43D.F5684219@bc.edu>; from J.W. Bizzaro on Thu, Mar 18, 1999 at 08:07:25AM +0000 References: <36F0B43D.F5684219@bc.edu> Message-ID: <19990318081759.C18203@arginine.umdnj.edu> If enough people have access to the clients, Jeff, or even if only a few might, you could install ssh (http://www.cs.hut.fi/ssh/). Will there be a CVS repository on that box? Quoting J.W. Bizzaro (bizzaro@bc.edu): > Carlos Maltzahn wrote: > > > > Thanks Jeff. > > > > Be aware that onsager is not behind a firewall and seems to run Red Hat. > > I wouldn't use onsager for anything that cannot be restored very easily. > > I know there is no firewall. But what's wrong with Red Hat? > > > > > Jeff, are you planning to give us some tulip-related web space on onsager? > > Anything you want. What did you have in mind? > > > Jeff > -- > J.W. 
Bizzaro Phone: 617-552-3905 > Boston College mailto:bizzaro@bc.edu > Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ > -- -- Dave Beck dave@arginine.umdnj.edu Sites of interest (set 1): Computer Science and Biology http://locus.umdnj.edu/nigms/ Drexel University, Philadelphia PA http://www.bio.net/ From jabbo at mindless.com Thu Mar 18 08:18:07 1999 From: jabbo at mindless.com (Tim) Date: Fri Feb 10 19:18:22 2006 Subject: [Pipet Devel] new server up References: Message-ID: <99Mar18.132138est.131763@gateway.macroint.com> That reminds me, you should consider putting up a packet filter and only allowing connections on ports 80 and . Plaintext logins are a Bad Thing... SSH is a good thing. And CVS can run inside of SSH (duh, but worth noting). -- "A goal is a dream with a deadline." -- Harvey Mackay From jabbo at mindless.com Thu Mar 18 11:24:30 1999 From: jabbo at mindless.com (Tim) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up References: Message-ID: <99Mar18.162759est.131743@gateway.macroint.com> That reminds me -- what sort of RAM does the machine take? I can pick some up at auction and beef the onsager.uml.edu server up to a respectable amount if you tell me what type (EDO or FP, ECC or not, how many pins, how many nanoseconds) it takes. 16MB won't cut it for anything exciting. (hell, my workstation has 128MB, but that's so I can cache the OS into memory ;-)) Also, I apologize for being almost dead to the world. I have been under a lot of pressure to pull off a lesser miracle ... as of April 1st that pressure is off. I have been playing with PyGTK and trying to get back into the swing of things, but the codon code I thought was finished isn't around, and I'd like to stick an interface on it anyways. I will have a lot of leverage here after my deadline. 
One thing that (thanks to work) I've been playing with a whole lot is servlets; I know that a web interface isn't really what we're after, but there are some stupendous projects out there that might allow us to run JPython versions of some of the code on a webserver. That, combined with the ability to do cool stuff with corba, equals a lot of freedom for showing prototypes to the people that would actually use this package. Anyways, I'll write more on this after my deadline. Konrad -- I know French crypto laws are sort of fascist but is there any way to use something similar to ssh? Or alternatively could we set up a mirroring type of thing? Or... hell, this could be interesting. We gotta work around it. I guess it would be way better to risk a corrupted codebase than to have barriers to people like Konrad's contributing easily. Jon Stevens at clearink has a bunch of notes on setting up CVS and managing stuff behind-the-scenes for the Java-Apache project: http://www.working-dogs.com Or alternatively I could help out after April 1st. (there's a theme here ;-)) Seriously though I can think of some other solutions now that I'm writing; we have an interactive system here at Macro that runs under SSL, maybe that would work, if so I can help you set it up (that'd be port 443) and we could work on it from that angle. Being fascist is silly, but so is losing work! -- "When it is not necessary to make a decision, it is necessary not to make a decision." --Lord Falkland From carlosm at mroe.cs.colorado.edu Thu Mar 18 12:59:07 1999 From: carlosm at mroe.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up In-Reply-To: <36F0B43D.F5684219@bc.edu> Message-ID: > > Be aware that onsager is not behind a firewall and seems to run Red Hat. > > I wouldn't use onsager for anything that cannot be restored very easily. > > I know there is no firewall. But what's wrong with Red Hat? 
Our passwords are going through the Internet in plain text. It's extremely easy to snoop them and then log in. Red Hat's user friendly admin tools have the tendency to permit users to acquire root access among other things. RH's distributions are so insecure that our department doesn't allow us to connect RH computers to the network inside the firewall. The Debian distribution tends to be more secure. I would recommend putting onsager behind a firewall and allowing us to log in through the firewall using ssh or at least one-time passwords. > > Jeff, are you planning to give us some tulip-related web space on onsager? > > Anything you want. What did you have in mind? I will start working at a company two months from now and eventually lose my CU account. At that point I'd like to have a neutral place for Paos. I was thinking about putting it on onsager -- but it needs to be more secure than it is now. I'd hate to discover one day that the Paos distribution contains a Trojan horse or something else ugly. More generally, I think onsager is not a safe repository for Tulip development right now. Carlos From dave at arginine.umdnj.edu Thu Mar 18 13:55:17 1999 From: dave at arginine.umdnj.edu (Dave Beck) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up In-Reply-To: <99Mar18.132138est.131763@gateway.macroint.com>; from Tim on Thu, Mar 18, 1999 at 08:18:07AM -0500 References: <99Mar18.132138est.131763@gateway.macroint.com> Message-ID: <19990318135517.A21261@arginine.umdnj.edu> Tim has the idea... I don't quite agree with Carlos's assessment of Red Hat's security flaws, but I don't think that matters if /etc/hosts.* files were set up properly and only SSH, port 80, and perhaps anonymous FTP were allowed from "unknown" hosts. As far as Paos being on a server that could be cracked, granted Carlos knows best of the potential dangers of Paos, but it would seem to me that ANY machine is potentially vulnerable, especially with man-in-the-middle attacks possible.
If there is potential for trojan horses being sent via Paos then Paos needs to deal with that (by providing some kind of encryption / tamper proofing on its messages) and not the server or operating system. I don't think it is reasonable to expect every locus server that might want to participate to ensure that its local network and every network between source and destination be secure and "tamper proof." It's more realistic to put a seatbelt in every car than it is to expect everyone to be a perfect driver. Quoting Tim (jabbo@mindless.com): > That reminds me, you should consider putting up a packet filter and only > allowing connections on ports 80 and right now>. > > Plaintext logins are a Bad Thing... SSH is a good thing. And CVS can > run inside of SSH (duh, but worth noting). > > -- > > "A goal is a dream with a deadline." > > -- Harvey Mackay -- Dave Beck dave@arginine.umdnj.edu Sites of interest (set 1): Computer Science and Biology http://locus.umdnj.edu/nigms/ Drexel University, Philadelphia PA http://www.bio.net/ From rahul at photino.sid.rice.edu Thu Mar 18 14:21:50 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up In-Reply-To: Message-ID: On Thu, 18 Mar 1999, Carlos Maltzahn wrote: > Our passwords are going through the Internet in plain text. It's extremely > easy to snoop them and then log in. Red Hat's user-friendly admin tools > have the tendency to permit users to acquire root access among other > things. RH's distributions are so insecure that our department > doesn't allow us to connect RH computers to the network inside the > firewall. The Debian distribution tends to be more secure. I agree that Debian is generally more secure, but what's this about getting root with the admin tools? They're not suid root.
-- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From hinsen at cnrs-orleans.fr Thu Mar 18 14:41:58 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up In-Reply-To: <19990318135517.A21261@arginine.umdnj.edu> (message from Dave Beck on Thu, 18 Mar 1999 13:55:17 -0500) References: <99Mar18.132138est.131763@gateway.macroint.com> <19990318135517.A21261@arginine.umdnj.edu> Message-ID: <199903181941.UAA26378@dirac.cnrs-orleans.fr> > Tim has the idea... I don't quite agree with Carlos's assesment of > Red Hat's security flaws, but I don't think that matters if /etc/hosts.* > files were set up properly and only SSH, port 80, and perhaps anonymous > FTP were allowed from "unknown" hosts. As far as Paos being on a server In principle I like the idea of SSH as much as others, but I have a small problem: French cryptography law does not allow me to use SSH. There are plans to change them, but as far as I know nothing has happened yet. On the other hand, I see no problem with restricting telnet and ftp access to specific hosts; I can always go through my home machine if necessary. Konrad. 
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From carlosm at moet.cs.colorado.edu Thu Mar 18 15:07:38 1999 From: carlosm at moet.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up In-Reply-To: Message-ID: I don't know the details of these security flaws. But if you look at the RH errata you see a lot of updates regarding users being able to get root access. It might all be fixed by now -- or it might not. I know of multiple groups here who switched to Debian because they had problems with people being able to hack into their RH systems. I'm not a firewall expert either. All I know is that breakins at the CU CS department were very frequent until we introduced a firewall, ssh, and one-time passwords. Carlos On Thu, 18 Mar 1999, Rahul Jain wrote: On Thu, 18 Mar 1999, Carlos Maltzahn wrote: > Our passwords are going through the Internet in plain text. It's extremely > easy to snoop them and then login. Red Hat's user friendly admin tools > have the tendency to permit users to acquire root access among other > things. RH's distributions are so unsecure that our department > doesn't allow us to connect RH computers to the network inside the > firewall. The Debian distribution tends to be more secure. I agree that Debian is generally more secure, but what's this about getting root with the admin tools? They're not suid root. -- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." 
- HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From rahul at photino.sid.rice.edu Thu Mar 18 18:56:42 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up In-Reply-To: Message-ID: On Thu, 18 Mar 1999, Carlos Maltzahn wrote: > > I don't know the details of these security flaws. But if you look at the > RH errata you see a lot of updates regarding users being able to get root > access. It might all be fixed by now -- or it might not. I know of > multiple groups here who switched to Debian because they had > problems with people being able to hack into their RH systems. AFAIK, these were bugs in the original packages, and were present in all distros. RH is probably just more vocal about the bugfixes because they have more corporate customers to worry about and they may not watch lists such as BUGTRAQ. -- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From bizzaro at bc.edu Thu Mar 18 19:58:43 1999 From: bizzaro at bc.edu (J.W. 
Bizzaro) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up References: <36F0B43D.F5684219@bc.edu> <19990318081759.C18203@arginine.umdnj.edu> Message-ID: <36F1A143.52EC0104@bc.edu> Dave Beck wrote: > > If enough people have access to the clients, Jeff, or even if only a few > might, you could install ssh (http://www.cs.hut.fi/ssh/). Will there be > a CVS repository on that box? > I have the ssh2 package already, but I have never used it. CVS is another area I will need some help with. I am more of an ambitious junior scientist than an OSS hacker ;-) Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Thu Mar 18 20:10:08 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up References: <99Mar18.162759est.131743@gateway.macroint.com> Message-ID: <36F1A3F0.88F06833@bc.edu> Tim wrote: > > That reminds me -- what sort of RAM does the machine take? I can pick > some up at auction and beef the onsager.uml.edu server up to a > respectable amount if you tell me what type (EDO or FP, ECC or not, how > many pins, how many nanoseconds) it takes. 16MB won't cut it for > anything exciting. (hell, my workstation has 128MB, but that's so I can > cache the OS into memory ;-)) A friend of mine just gave me a fistful of SIMMs, 72-pin EDO. We don't even know their capacity in MB yet. I'll just have to plug them in and try. But the computer will get much more than 16 MB if I can help it. Thanks for the offer! But I'll see how this works out first. You didn't mention the fact that it is a Pentium 100. I know that's pathetic, but it's the best I can do for now. > > Also, I apologize for being almost dead to the world. I have been under > a lot of pressure to pull off a lesser miracle ... as of April 1st that > pressure is off.
I have been playing with PyGTK and trying to get back > into the swing of things, but the codon code I thought was finished > isn't around, and I'd like to stick an interface on it anyways. I will > have a lot of leverage here after my deadline. Are you working on a thesis? > > One thing that (thanks to work) I've been playing with a whole lot is > servlets; I know that a web interface isn't really what we're after, but > there are some stupendous projects out there that might allow us to run > JPython versions of some of the code on a webserver. Ahhh. You may want to get together with Rahul about this. Since Sun made Java somewhat open source, I can accept a limited implementation of it for the Web front end. That project is, after all, not part of the Loci core, so it can be licensed any way we want. Since Loci is LGPL rather than GPL, the guts of the Web interface are irrelevant to the rest of Loci. > That, combined > with the ability to do cool stuff with corba, equals a lot of freedom > for showing prototypes to the people that would actually use this > package. Anyways, I'll write more on this after my deadline. Hmmm. Looking forward to it. > Jon Stevens at clearink has a bunch of notes on setting up CVS and > managing stuff behind-the-scenes for the Java-Apache project: > > http://www.working-dogs.com I'll check it out. > > Or alternatively I could help out after April 1st. (there's a theme > here ;-)) > Seriously though I can think of some other solutions now that I'm > writing; we have an interactive system here at Macro that runs under > SSL, maybe that would work, if so I can help you set it up (that'd be > port 443) and we could work on it from that angle. Being fascist is > silly, but so is losing work! > Do you mean SSL or SSH? Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Thu Mar 18 20:22:43 1999 From: bizzaro at bc.edu (J.W.
Bizzaro) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up References: Message-ID: <36F1A6E3.7DEA0ABE@bc.edu> Carlos Maltzahn wrote: > > Our passwords are going through the Internet in plain text. It's extremely > easy to snoop them and then login. Red Hat's user friendly admin tools > have the tendency to permit users to acquire root access among other > things. RH's distributions are so unsecure that our department > doesn't allow us to connect RH computers to the network inside the > firewall. Even _inside_ of a firewall? I know of one case where password snooping led to a security breach on a Solaris system. They used one-time passwords after that...pain. > I would recommend to put onsager behind a firewall and allow us to login > through the firewall using ssh or at least one-time passwords. UMass Lowell just doesn't seem so concerned about firewalls. Actually, I just set up a Web server at Boston College using Red Hat. But BC has this firewall set up for every system on the network that prevents every attempt to make a connection from the outside, which naturally blocks the Web server. I asked to have the firewall removed, and as nutty as they are about security, BC said all I have to do is disable finger and update sendmail. And the system administrator is a real Linux guru. He seemed to have little concern about using Red Hat. > > > Jeff, are you planning to give us some tulip-related web space on onsager? > > > > Anything you want. What did you have in mind? > > I will start working at a company two months from now and eventually lose > my CU account. At that point I'd like to have a neutral place for Paos. I > was thinking about putting it on onsager -- but it needs to be more secure > than it is now. I hate to discover one day that the Paos distribution > contains a Trojan horse or something else ugly. I would be honored to host PAOS. We'll get this security problem settled. 
> More generally, I think onsager is not a save repository for Tulip > development right now. Where do you think the biggest threat comes from, other developers or the occasional cracker? Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Thu Mar 18 20:33:42 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up References: <99Mar18.132138est.131763@gateway.macroint.com> <19990318135517.A21261@arginine.umdnj.edu> Message-ID: <36F1A976.D014FF34@bc.edu> Dave Beck wrote: > > Tim has the idea... I don't quite agree with Carlos's assesment of > Red Hat's security flaws, but I don't think that matters if /etc/hosts.* > files were set up properly and only SSH, port 80, and perhaps anonymous > FTP were allowed from "unknown" hosts. Okay. We need someone to volunteer to be our anti-cracker. Tim? Carlos? Dave? Rahul? > As far as Paos being on a server > that could be cracked, granted Carlos knows best of the potential dangers > of Paos, but it would seem to me that ANY machine is potentialy vulnerable > especially with man in the middle attacks possible. If there is potential > for trojan horses being sent via Paos then Paos needs to deal with that > (by providing some kind of encryption / tamper proofing on its messages) > and not the server or operating system. I don't think it is reasonable > to expect every locus server that might want to paticipate to ensure that > its local network and every network between source and destination be > secure and "tamper proof." Its more realistic to put a seatbelt in every > car than it is to expect everyone to be a perfect driver. I'm sure Carlos was referring to the PAOS source code tree or whatever being compromised on an insecure server. But the reality of the Loci communication process being "secure" has not escaped me. 
We cannot guarantee that every Loci client (locus) on the Internet is legitimate, but we can take measures to keep loci communication in sort of a "sandbox", to use a Java term. Another concern is that companies using Loci will want to keep communication private, so that no one steals their million-dollar discovery. Maybe someone into encryption would like to take on that project. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Thu Mar 18 20:35:52 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up References: Message-ID: <36F1A9F8.39EC5CC3@bc.edu> Rahul Jain wrote: > > On Thu, 18 Mar 1999, Carlos Maltzahn wrote: > > > Our passwords are going through the Internet in plain text. It's extremely > > easy to snoop them and then login. Red Hat's user friendly admin tools > > have the tendency to permit users to acquire root access among other > > things. RH's distributions are so unsecure that our department > > doesn't allow us to connect RH computers to the network inside the > > firewall. The Debian distribution tends to be more secure. > > I agree that Debian is generally more secure, but what's this about > getting root with the admin tools? They're not suid root. > Red Hat doesn't allow a direct login to root from a remote host. But you can log into a user account and use "su", if that is what you're referring to. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Thu Mar 18 20:37:02 1999 From: bizzaro at bc.edu (J.W. 
Bizzaro) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up References: <99Mar18.132138est.131763@gateway.macroint.com> <19990318135517.A21261@arginine.umdnj.edu> <199903181941.UAA26378@dirac.cnrs-orleans.fr> Message-ID: <36F1AA3E.DFAFE99A@bc.edu> Konrad Hinsen wrote: > On the other hand, I see no problem with restricting telnet and ftp > access to specific hosts; I can always go through my home machine > if necessary. Or to specific domains. That's a good idea. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Thu Mar 18 22:35:12 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] and a new idea! Message-ID: <36F1C5F0.8D4765D8@bc.edu> Boy, you guys now have a mailbox full of my messages ;-) I did originally want to make the focus of Loci the production of publication-quality figures. This is where some of the comparisons to The GIMP came in. Every graphical locus is really supposed to be preparing an illustration/picture/image. Well, how about this: We have a _central_ "canvas", where the user can grab figures from other loci and drop them onto the canvas. The way I see it, someone can take a nucleotide sequence from one locus, drop it onto the canvas, and then take the 3D structure of the DNA or RNA, and drop it right below the sequence for comparison. So, the user is really building a figure for publication. The workflow system comes into play now because I would like to see each figure on the canvas dynamically updated by the originating locus. E.g., say the user wants to go back and edit the sequence. When this is done, the user won't have to drag it back over to the canvas; Loci does it automatically. I think this ties the graphical loci together in a way they weren't before. You know, the user really does have a single task in mind. They want to publish their results.
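The dynamic-update idea above could be sketched roughly as follows. This is only an illustration, assuming a simple observer arrangement in which the canvas subscribes to each locus it receives a figure from; the names `Locus`, `Canvas`, `subscribe`, `set_data` and `drop` are hypothetical and do not come from any Loci code.

```python
# Hypothetical sketch only: none of these names exist in Loci.

class Locus:
    """A data source that notifies subscribers whenever its data changes."""

    def __init__(self, name, data):
        self.name = name
        self.data = data
        self._listeners = []

    def subscribe(self, callback):
        # Remember who wants to hear about future edits.
        self._listeners.append(callback)

    def set_data(self, data):
        # Edit the data, then tell every subscriber about the change.
        self.data = data
        for callback in self._listeners:
            callback(self)


class Canvas:
    """Central canvas holding one figure per locus dropped onto it."""

    def __init__(self):
        self.figures = {}

    def drop(self, locus):
        # Place the current figure and subscribe to later changes.
        self.figures[locus.name] = locus.data
        locus.subscribe(self._refresh)

    def _refresh(self, locus):
        # Called by the originating locus; no second drag-and-drop needed.
        self.figures[locus.name] = locus.data


seq = Locus("sequence", "ATGC")
canvas = Canvas()
canvas.drop(seq)
seq.set_data("ATGCCG")             # the user edits the sequence in its locus
print(canvas.figures["sequence"])  # the canvas copy follows the edit
```

The canvas keeps only a reference-by-name to each figure, so the originating locus stays the single place where the data is edited.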
Any comments? Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From justin at ukans.edu Fri Mar 19 01:37:48 1999 From: justin at ukans.edu (Justin Bradford) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up In-Reply-To: <36F1A143.52EC0104@bc.edu> Message-ID: On Fri, 19 Mar 1999, J.W. Bizzaro wrote: > I have the ssh2 package already, but I have never used it. > CVS is another area I will need some help with. I can give you a hand with both. If you want, this weekend I'll put both on (but it'll require temporary root access). Justin From Thomas.Sicheritz at molbio.uu.se Fri Mar 19 03:13:50 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up In-Reply-To: <36F1A976.D014FF34@bc.edu> References: <99Mar18.132138est.131763@gateway.macroint.com> <19990318135517.A21261@arginine.umdnj.edu> <36F1A976.D014FF34@bc.edu> Message-ID: <14066.631.492397.523413@beagle.bmc.uu.se> > > Tim has the idea... I don't quite agree with Carlos's assessment of > > Red Hat's security flaws, but I don't think that matters if /etc/hosts.* > > files were set up properly and only SSH, port 80, and perhaps anonymous > > FTP were allowed from "unknown" hosts. > > Okay. We need someone to volunteer to be our anti-cracker. > > Tim? Carlos? Dave? Rahul? > I agree that Red Hat is the least secure of all distributions - I switched from Debian & RH to SuSE on all of the department's machines and my personal ones. One of our freshly installed RH machines had been on the net for only 7 minutes before the first successful break-in ...
:-( My policy here is * restricted secure shell * if ssh is not an alternative: tcp_wrapper protected telnet/ftp and I do NOT close all ports - instead I wrap/twist/fake them with tcp_wrapper so that we get a chance to notice any cracking attempts (read: script kiddies) (try to finger me at beagle.bmc.uu.se - I assure you we don't have users named fritz or bertram) * of course ... no rsh, rcp, rhosts etc. My suggestion is to (at least) wrap all open ports directly in inetd. I fear that I have to stop looking at python and the sequence editor for a while ... too many meetings and too many unwritten theses (=1) -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ... From hinsen at cnrs-orleans.fr Fri Mar 19 04:09:38 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up In-Reply-To: <99Mar18.162759est.131743@gateway.macroint.com> (message from Tim on Thu, 18 Mar 1999 11:24:30 -0500) References: <99Mar18.162759est.131743@gateway.macroint.com> Message-ID: <199903190909.KAA14982@dirac.cnrs-orleans.fr> > Konrad -- I know French crypto laws are sort of fascist but is there any > way to use something similar to ssh? Or alternatively could we set up a I don't know for sure, but in principle anything using cryptography is not allowed. I have seen the opinion that using ssh for password protection is OK as long as the following session is not encrypted; I think this was deduced by analogy to e-mail encryption, which is allowed for signatures but not for encrypting content.
To make things worse, I work for a French government institution, so our system administrators won't tolerate anything which looks just the slightest bit illegal. On the other hand, if I am the only one in France (and it seems so at the moment), then don't worry. I can always go through my account on Starship Python, and use ssh from there. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From dave at arginine.umdnj.edu Fri Mar 19 09:37:10 1999 From: dave at arginine.umdnj.edu (Dave Beck) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] sequence editor (WAS: new server up) In-Reply-To: <14066.631.492397.523413@beagle.bmc.uu.se>; from Thomas.Sicheritz@molbio.uu.se on Fri, Mar 19, 1999 at 09:13:50AM +0100 References: <99Mar18.132138est.131763@gateway.macroint.com> <19990318135517.A21261@arginine.umdnj.edu> <36F1A976.D014FF34@bc.edu> <14066.631.492397.523413@beagle.bmc.uu.se> Message-ID: <19990319093710.A26189@arginine.umdnj.edu> Quoting Thomas.Sicheritz@molbio.uu.se (Thomas.Sicheritz@molbio.uu.se): > I fear that I have to stop looking at python and the sequence editor for a > while ... to many meetings and to many unwritten thesises (=1) Thomas, would you mind if I started futzing with it? I'd like to start porting my QT/C++ based sequence editor to Python/C/GTK and what you have is a terrific start... 
-- Dave Beck dave@arginine.umdnj.edu Sites of interest (set 2): Computer Science and Biology http://www.cyc.com/cyc-2-1/toc.html Drexel University, Philadelphia PA http://arginine.umdnj.edu/ From jabbo at mindless.com Fri Mar 19 09:56:40 1999 From: jabbo at mindless.com (Tim) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up References: <99Mar18.162759est.131743@gateway.macroint.com> <36F1A3F0.88F06833@bc.edu> Message-ID: <99Mar19.150002est.131730@gateway.macroint.com> >> Or alternatively I could help out after April 1st. (there's a theme >> here ;-)) >> Seriously though I can think of some other solutions now that I'm >> writing; we have an interactive system here at Macro that runs under >> SSL, maybe that would work, if so I can help you set it up (that'd be >> port 443) and we could work on it from that angle. Being fascist is >> silly, but so is losing work! >Do you mean SSL or SSH? SSL is port 443 (usually, you can change this but I think it's a bad idea). I can't remember what port SSH connects to by default; I looked around a bit but it's been several months since I used ssh through a firewall. We need to set it up here (security at my company is pathetic) so pretty soon it will come back to me. -- "An organization is like a tree full of monkeys, all on different levels, some climbing up. The monkeys on the top look down and see a tree full of smiling faces. The monkeys on the bottom look up and see nothing but assholes." 
--Tom Schuneman From Thomas.Sicheritz at molbio.uu.se Fri Mar 19 10:41:41 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] sequence editor In-Reply-To: <19990319093710.A26189@arginine.umdnj.edu> References: <99Mar18.132138est.131763@gateway.macroint.com> <19990318135517.A21261@arginine.umdnj.edu> <36F1A976.D014FF34@bc.edu> <14066.631.492397.523413@beagle.bmc.uu.se> <19990319093710.A26189@arginine.umdnj.edu> Message-ID: <14066.27762.236938.449442@beagle.bmc.uu.se> Dave Beck writes: > Quoting Thomas.Sicheritz@molbio.uu.se (Thomas.Sicheritz@molbio.uu.se): > > I fear that I have to stop looking at python and the sequence editor for a > > while ... too many meetings and too many unwritten theses (=1) > > Thomas, would you mind if I started futzing with it? I'd like to start > porting my QT/C++ based sequence editor to Python/C/GTK and what you have > is a terrific start... Sure - I feel that I have no time at all to start with the graphical stuff (I haven't yet succeeded in compiling gnomelibs on my Sun). But I'd like to keep working a little on the Python-based - behind the scenes/nongraphic - sequence classes. Could we cooperate on this? Besides my thesis I have another genome which has to be analysed, parsed and annotated. I feel that I have the basic python sequence class ready for building my usual tools on (read: I almost got used to python and don't really want to stop messing around with it) Suggestions? -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ...
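Thomas's sequence class is not shown anywhere in this thread, so the following is only a guess at what a non-graphical, "behind the scenes" Python sequence class of this kind might look like. Everything in it (the class name `Sequence`, its attributes, and the `reverse_complement` and `gc_content` methods) is a hypothetical illustration, not his code.

```python
# Hypothetical sketch only: not the sequence class discussed in the thread.

class Sequence:
    """Minimal non-graphical holder for a DNA sequence."""

    # Complement table for the bases this sketch accepts.
    _COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}

    def __init__(self, name, residues):
        self.name = name
        self.residues = residues.upper()
        unknown = set(self.residues) - set(self._COMPLEMENT)
        if unknown:
            raise ValueError("unexpected residues: %s" % sorted(unknown))

    def __len__(self):
        return len(self.residues)

    def reverse_complement(self):
        # Walk the sequence backwards, complementing each base.
        rc = "".join(self._COMPLEMENT[base] for base in reversed(self.residues))
        return Sequence(self.name + "_rc", rc)

    def gc_content(self):
        # Fraction of G and C bases; an empty sequence counts as 0.0.
        if not self.residues:
            return 0.0
        gc = sum(1 for base in self.residues if base in "GC")
        return gc / len(self.residues)


seq = Sequence("example", "atgc")
print(len(seq), seq.reverse_complement().residues, seq.gc_content())
```

Keeping a class like this free of any GTK dependency is what would let a graphical editor and command-line analysis tools share it.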
From dave at arginine.umdnj.edu Fri Mar 19 11:08:12 1999 From: dave at arginine.umdnj.edu (Dave Beck) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] sequence editor In-Reply-To: <14066.27762.236938.449442@beagle.bmc.uu.se>; from Thomas.Sicheritz@molbio.uu.se on Fri, Mar 19, 1999 at 04:41:41PM +0100 References: <99Mar18.132138est.131763@gateway.macroint.com> <19990318135517.A21261@arginine.umdnj.edu> <36F1A976.D014FF34@bc.edu> <14066.631.492397.523413@beagle.bmc.uu.se> <19990319093710.A26189@arginine.umdnj.edu> <14066.27762.236938.449442@beagle.bmc.uu.se> Message-ID: <19990319110812.A27565@arginine.umdnj.edu> We need a CVS repository and strong documentation skills. ;) I don't have any problem working on shared sources.... I have found that it is pretty easy when everyone uses the changelogs and people set up watches on sources they are actively developing.. Quoting Thomas.Sicheritz@molbio.uu.se (Thomas.Sicheritz@molbio.uu.se): > Dave Beck writes: > > Quoting Thomas.Sicheritz@molbio.uu.se (Thomas.Sicheritz@molbio.uu.se): > > > I fear that I have to stop looking at python and the sequence editor for a > > > while ... to many meetings and to many unwritten thesises (=1) > > > > Thomas, would you mind if I started futzing with it? I'd like to start > > porting my QT/C++ based sequence editor to Python/C/GTK and what you have > > is a terrific start... > > Sure - I feel that I have no time at all to start with the graphical stuff > (I haven't succeeded yet compiling gnomelibs on my Sun). But I'd like to > keep on a little on the python based - behind the scenes/nongraphic - sequence > classes. Could we corporate with this ? Beside my thesis I have another > genome which has to be analysed, parsed and annotated. I feel that I have > the basic python sequence class ready to build my usual tools on it > (read: I almost got used to python and don't really want to stop messing > around with it) > > Suggestions ? > -thomas > > -- > Sicheritz Ponten Thomas E. 
Department of Molecular Biology > blippblopp@linux.nu BMC, Uppsala University > BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden > Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas > Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl > Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux > > De Chelonian Mobile ... The Turtle Moves ... -- Dave Beck dave@arginine.umdnj.edu Sites of interest (set 2): Computer Science and Biology http://www.cyc.com/cyc-2-1/toc.html Drexel University, Philadelphia PA http://arginine.umdnj.edu/ From bizzaro at bc.edu Fri Mar 19 12:12:42 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] new server up References: Message-ID: <36F2858A.937CFC10@bc.edu> Justin, The ssh2 is on my machine at home, not on "biohacker/onsager". Do you want me to get the packages (in RPM) for you first? I appreciate the help. I'll send you an e-mail with the pword in the body. That is, if you really are Justin :-) Everyone, Justin will set these up. I guess we'll start with SSH, and then we can consider the neat tricks mentioned by Tim and Thomas. Jeff Justin Bradford wrote: > > On Fri, 19 Mar 1999, J.W. Bizzaro wrote: > > > I have the ssh2 package already, but I have never used it. > > CVS is another area I will need some help with. > > I can give you a hand with both. > If you want, this weekend I'll put both on (but it'll require temporary > root access). > > Justin -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Fri Mar 19 12:26:22 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:23 2006 Subject: [Pipet Devel] Greg Waltz joins! Message-ID: <36F288BE.D42A7C6B@bc.edu> Locians, I recruited Greg Waltz, who is an OpenGL guru developing his own modeler with GTK. He will be taking charge of the rendering engines for the 3D loci. 
The messages to follow will be from our initial conversations. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Fri Mar 19 12:28:29 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] Greg/Jeff Message-ID: <36F2893D.22DBB01D@bc.edu> Forwarded message I sent to Greg... -------------- next part -------------- An embedded message was scrubbed... From: "J.W. Bizzaro" Subject: Re: 3D modeller for structural biology Date: Thu, 18 Mar 1999 05:13:51 +0000 Size: 4836 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990319/33adb5e7/attachment.mht From bizzaro at bc.edu Fri Mar 19 12:29:35 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] Greg/Jeff Message-ID: <36F2897F.ABDAEB1F@bc.edu> Forwarded message Greg sent back to me... -------------- next part -------------- An embedded message was scrubbed... From: greg waltz Subject: Re: 3D modeller for structural biology Date: Thu, 18 Mar 1999 11:55:01 -0500 (EST) Size: 4497 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990319/5aaea797/attachment.mht From dave at arginine.umdnj.edu Fri Mar 19 16:04:23 1999 From: dave at arginine.umdnj.edu (Dave Beck) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] www pages Message-ID: <19990319160423.A30492@arginine.umdnj.edu> I got tired of trying to find all the references to the tools mentioned in the list archives, so I have created a WWW page (in the J. W./ Loci style) which has the relevant homepages and download pages: http://cimr.umdnj.edu/~dave/loci # goto What You Need If I have left anything off, let me know... 
-- Dave Beck dave@arginine.umdnj.edu Sites of interest (set 2): Computer Science and Biology http://www.cyc.com/cyc-2-1/toc.html Drexel University, Philadelphia PA http://arginine.umdnj.edu/ From rahul at photino.sid.rice.edu Fri Mar 19 18:31:47 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] new server up In-Reply-To: <99Mar19.150002est.131730@gateway.macroint.com> Message-ID: On Fri, 19 Mar 1999, Tim wrote: > I can't remember what port SSH connects to by default; I looked around a > bit but it's been several months since I used ssh through a firewall. It's port 22 by default. -- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From bizzaro at bc.edu Fri Mar 19 22:40:15 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] www pages References: <19990319160423.A30492@arginine.umdnj.edu> Message-ID: <36F3189F.7F2BFA1@bc.edu> Thanks Dave! I am making a new site of course on biohacker/onsager. The way I am organizing the pages, the information on your pages would go under "Developers". BTW, you did see my "PyG Tools" Web site, didn't you? I have many many links there to Python and GTK sites...but not all directly to the download sites. I guess I should have that. What do you mean by "tools we are tentatively going to use"? Do you have something else in mind? :-) Jeff Dave Beck wrote: > > I got tired of trying to find all the references to the tools mentioned > in the list archives, so I have created a WWW page (in the J. 
W./ Loci > style) which has the relevant homepages and download pages: > http://cimr.umdnj.edu/~dave/loci # goto What You Need > If I have left anything off, let me know... > > -- > Dave Beck > dave@arginine.umdnj.edu Sites of interest (set 2): > Computer Science and Biology http://www.cyc.com/cyc-2-1/toc.html > Drexel University, Philadelphia PA http://arginine.umdnj.edu/ -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From jabbo at mindless.com Sat Mar 20 08:43:19 1999 From: jabbo at mindless.com (Tim) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] interesting... Message-ID: <99Mar20.134637est.131718@gateway.macroint.com> http://www.inxight.com/Inxight_Corporate_Web_Site/Edu_Org_Program/Intro_to_Program.html Take a look at this tool... looks like it could be useful for browsing phylogenetic trees. I'm working a bit on a molecule viewer and the frontend for codon. If I'm lucky it could be done tomorrow night. If not, well, it will wait another week. -- "We don't like their sound, and guitar music is on the way out." --Decca Recording Co. rejecting the Beatles, 1962 From dave at arginine.umdnj.edu Sat Mar 20 09:50:46 1999 From: dave at arginine.umdnj.edu (Dave Beck) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] www pages In-Reply-To: <36F3189F.7F2BFA1@bc.edu>; from J.W. Bizzaro on Sat, Mar 20, 1999 at 03:40:15AM +0000 References: <19990319160423.A30492@arginine.umdnj.edu> <36F3189F.7F2BFA1@bc.edu> Message-ID: <19990320095046.C32426@arginine.umdnj.edu> Quoting J.W. Bizzaro (bizzaro@bc.edu): > I am making new site of course on biohacker/onsager. The way I am organizing > the pages, the information on your pages would go under "Developers". OK... > BTW, you did see my "PyG Tools" Web site, didn't you? I have many many links > there to Python and GTK sites...but not all directly to the download sites. I > guess I should have that. 
I'm a very to the point kind of man. ;) I was trying to prep 6 different boxes for this and I didn't want to navigate back to the download pages every time. BTW: Python 1.5.1, PyGTK 0.5.9, GTK+ 1.2, GLib 1.2, PAOS, and egcs 1.1.2 (C only), compiles effortlessly on Linux (duh), Solaris (not terribly surprising), IRIX (wow), AIX (double wow), and LinuxPPC (duh). > What do you mean by "tools we are tentatively going to use"? Do you have > something else in mind? :-) No, I just never like to commit until someone has put a gun to my head. That way you can at least LIE and say, "No, it was just a tentative plan to kill the Godfather." > Jeff > Dave Beck wrote: > > > > I got tired of trying to find all the references to the tools mentioned > > in the list archives, so I have created a WWW page (in the J. W./ Loci > > style) which has the relevant homepages and download pages: > > http://cimr.umdnj.edu/~dave/loci # goto What You Need > > If I have left anything off, let me know... > > > > -- > > Dave Beck > > dave@arginine.umdnj.edu Sites of interest (set 2): > > Computer Science and Biology http://www.cyc.com/cyc-2-1/toc.html > > Drexel University, Philadelphia PA http://arginine.umdnj.edu/ > > -- > J.W. Bizzaro Phone: 617-552-3905 > Boston College mailto:bizzaro@bc.edu > Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ > -- -- Dave Beck dave@arginine.umdnj.edu Sites of interest (set 3): Computer Science and Biology http://selene.biochem.uga.edu/tutorial/ Drexel University, Philadelphia PA http://www.cold.org/ From bizzaro at bc.edu Sat Mar 20 13:26:49 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] www pages References: <19990319160423.A30492@arginine.umdnj.edu> <36F3189F.7F2BFA1@bc.edu> <19990320095046.C32426@arginine.umdnj.edu> Message-ID: <36F3E868.837CF4A8@bc.edu> Dave Beck wrote: > I'm a very to the point kind of man. 
;) I was trying to prep 6 different > boxes for this and I didn't want to navigate back to the download pages > every time. BTW: Python 1.5.1, PyGTK 0.5.9, GTK+ 1.2, GLib 1.2, PAOS, and > egcs 1.1.2 (C only), compiles effortlessly on Linux (duh), Solaris (not > terribly surprising), IRIX (wow), AIX (double wow), and LinuxPPC (duh). Your effort deserves a "wow!" Thanks. > > > What do you mean by "tools we are tentatively going to use"? Do you have > > something else in mind? :-) > No, I just never like to commit until someone has put a gun to my head. That > way you can at least LIE and say, "No, it was just a tentative plan to kill > the Godfather." Okey dokey. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Sat Mar 20 13:47:50 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] 3D modelers (was Greg/Jeff) References: <36F2897F.ABDAEB1F@bc.edu> Message-ID: <36F3ED56.BF57BB6D@bc.edu> Hi Greg. > gui progrqmming isn't too bad, but i know nothing about python. we'll see > how this goes, what jobs come up and when, and how much time i have when > they do. i certainly would not be adverse to learning a new language. For more information on Python bindings to GTK, you can check out my info page: http://www.uml.edu/Dept/Chem/BICGroup/PyGTools/ > right now it isn't very big. but it may be in the future. i've been > thinking about that, and i have come to the conclusion that mg^2 will have > the functions required to be useful to you in a few releases from now. in > fact, even now it has the features i think you would want (multiple views, > solid and wireframe rendering, translate, rotate, scale, zoom, etc.), but > those features need some work and some of the data structures need > redesigning (i'm working on that this afternoon). 
so, my idea is that we > can take a version of mg^2 that has the functionality you need while it's > still small and then buld in the specifics to your application. but this > all depends on what you want. i don't mind starting from scratch. It sounds like a plan to me :-) I'll get more detail to you ASAP. > > to solve your light weight constraint, why not make a main app that runs > the other functions as plug-ins? however, from what i saw on your site it > seems that you want alot of it to be command line driven so maybe plug-ins > wouldn't help so much. The command-line programs are implemented as a special feature of Loci. The other tools communicate via Python object server. If we had a small "engine" that could be modified by plug-in to make several special-purpose tools, that would be along the line I was thinking. > > a friend might be interested in the 3d area of this project also. i'll ask > him. Great. Anyone you know who'd like to help is more than welcome! Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From rahul at photino.sid.rice.edu Sun Mar 21 00:35:45 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] interesting... In-Reply-To: <99Mar20.134637est.131718@gateway.macroint.com> Message-ID: On Sat, 20 Mar 1999, Tim wrote: > http://www.inxight.com/Inxight_Corporate_Web_Site/Edu_Org_Program/Intro_to_Program.html > > Take a look at this tool... looks like it could be useful for browsing > phylogenetic trees. What I thought it was from the article on slashdot. Unfortunately their site was slashdotted so I couldn't get to it before. It's just the same as a Metainformation format that Apple developed about a year ago. Forget what it was called but it was much cooler and free. Not open source, tho. 
-- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From bizzaro at bc.edu Sun Mar 21 00:41:11 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] interesting... References: <99Mar20.134637est.131718@gateway.macroint.com> Message-ID: <36F48677.D7936BB0@bc.edu> Tim wrote: > > http://www.inxight.com/Inxight_Corporate_Web_Site/Edu_Org_Program/Intro_to_Program.html > > Take a look at this tool... looks like it could be useful for browsing > phylogenetic trees. Hmmm. I can't get the Java to run, as usual. > > I'm working a bit on a molecule viewer and the frontend for codon. If > I'm lucky it could be done tomorrow night. If not, well, it will wait > another week. > Could you tell me more about these? How are these written? Are they meant for Loci? Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From Thomas.Sicheritz at molbio.uu.se Mon Mar 22 03:43:51 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] interesting... In-Reply-To: <99Mar20.134637est.131718@gateway.macroint.com> References: <99Mar20.134637est.131718@gateway.macroint.com> Message-ID: <14069.64306.977925.47568@beagle.bmc.uu.se> Tim writes: > http://www.inxight.com/Inxight_Corporate_Web_Site/Edu_Org_Program/Intro_to_Program.html > > Take a look at this tool... looks like it could be useful for browsing > phylogenetic trees. Hmm ... :-) ... it looks quite fun ... 
(look at the spider phylogeny) We could adapt the rotating/zooming idea to the python phylogenetic tool (pyphy or phypy .. or physpampy ?) Actually I really like the idea ... phylogenetic reconstructions tend to include large numbers of taxa, which are not easy to see in one window. I have to write a treeviewer module for my (hopefully) last bigger project (phylogenomics) in my thesis. I thought I would hack it in Tcl/Tk but with some help from other loci'ers I could try it in python. There is no good treeviewing program for all platforms (read: nothing for Linux and Solaris which doesn't need 8bpp color mode) I have always had problems coding tree-parsing scripts and representing the trees in a "good" way on the screen (trifurcation, distances etc.) I could use some help here ... I started on some smaller versions where the branches or taxa labels (in my case SWISSPROT ID's) are linked to yank (sequence retrieval), SWISSPROT database, blast and clustalw - which should be connected/linked from the whole genome map/sequence ... that seems to fit perfectly into the LOCI way of thinking.
My time schedule:
* mar,apr,may: finish my current paper
* apr: bioinformatics meeting in Lyon(France) (RECOMB99)
* apr: bioinformatics meeting in Lund(Sweden) (bioinformatics'99) (am I going to meet some of you in Lyon or Lund ?)
* ???: start with the phylogenomic project
I am very tempted to leave the whole sequence editor part to Dave and only keep on with the basic_nucleotide_sequence and phylogenetic tools. Suggestions ? -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ...
From pmr at sanger.ac.uk Mon Mar 22 04:19:46 1999 From: pmr at sanger.ac.uk (Peter Rice) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] interesting... In-Reply-To: <14069.64306.977925.47568@beagle.bmc.uu.se> (Thomas.Sicheritz@molbio.uu.se) References: <99Mar20.134637est.131718@gateway.macroint.com> <14069.64306.977925.47568@beagle.bmc.uu.se> Message-ID: <199903220919.JAA05972@unst.sanger.ac.uk> Thomas.Sicheritz@molbio.uu.se writes: >There is no good treeviewing program for all platforms (read: nothing for >Linux and Solaris which doesn't need 8bpp color mode) You could look at the European Bioinformatics Institute's hyperbolic viewer for taxonomy. It can generalize to all kinds of tree-based data. http://industry.ebi.ac.uk/~alan/BioWidget/ -- ---------------------------------------------------------------------- Peter Rice | Informatics Division, The Sanger Centre, E-mail: pmr@sanger.ac.uk | Wellcome Trust Genome Campus, Tel: (44) 1223 494967 | Hinxton, Cambridge, CB10 1SA, England Fax: (44) 1223 494919 | URL: http://www.sanger.ac.uk/Users/pmr/ From Thomas.Sicheritz at molbio.uu.se Mon Mar 22 05:05:02 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] interesting... In-Reply-To: <199903220919.JAA05972@unst.sanger.ac.uk> References: <99Mar20.134637est.131718@gateway.macroint.com> <14069.64306.977925.47568@beagle.bmc.uu.se> <199903220919.JAA05972@unst.sanger.ac.uk> Message-ID: <14070.3927.402222.785804@beagle.bmc.uu.se> Peter Rice writes: > Thomas.Sicheritz@molbio.uu.se writes: > >There is no good treeviewing program for all platforms (read: nothing for > >Linux and Solaris which doesn't need 8bpp color mode) > > You could look at the European Bioinformatics Institute's hyperbolic > viewer for taxonomy. It can generalize to all kinds of tree-based data. > > http://industry.ebi.ac.uk/~alan/BioWidget/ Thx - looks nice. But what I had in mind was more a viewer and editor. 
- what about the performance - is this hyperbolic viewer really usable on a normal workstation ? (I really like the fish-eye views ...) -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ... From pmr at sanger.ac.uk Mon Mar 22 05:47:03 1999 From: pmr at sanger.ac.uk (Peter Rice) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] interesting... In-Reply-To: <14070.3927.402222.785804@beagle.bmc.uu.se> (Thomas.Sicheritz@molbio.uu.se) References: <99Mar20.134637est.131718@gateway.macroint.com> <14069.64306.977925.47568@beagle.bmc.uu.se> <199903220919.JAA05972@unst.sanger.ac.uk> <14070.3927.402222.785804@beagle.bmc.uu.se> Message-ID: <199903221047.KAA06102@unst.sanger.ac.uk> Thomas, >Thx - looks nice. But what I had in mind was more a viewer and editor. >- what about the performance - is this hyperbolic viewer really usable on a >normal workstation ? > >(I really like the fish-eye views ...) It should be adaptable to become an editor. Alan Robinson at the EBI is the best contact for it. Peter -- ---------------------------------------------------------------------- Peter Rice | Informatics Division, The Sanger Centre, E-mail: pmr@sanger.ac.uk | Wellcome Trust Genome Campus, Tel: (44) 1223 494967 | Hinxton, Cambridge, CB10 1SA, England Fax: (44) 1223 494919 | URL: http://www.sanger.ac.uk/Users/pmr/ From bizzaro at bc.edu Mon Mar 22 16:10:52 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] interesting... 
References: <99Mar20.134637est.131718@gateway.macroint.com> <14069.64306.977925.47568@beagle.bmc.uu.se> Message-ID: <36F6B1DC.487EB75E@bc.edu> Thomas.Sicheritz@molbio.uu.se wrote: > We could adapt the rotating/zooming idea to the python phylogentic tool > (pyphy or phypy .. or physpampy ?) > Or how about locus_phy :-) > Actually I really like the idea ... phylogenetic reconstructions tend to get > large amounts of taxa - which is not easy to see in one window. > I have to write a treeviewer module for my (hopefully) last bigger project > (phylogenomics) in my thesis. I thought I would hack it in Tcl/Tk but with > some help from other loci'ers I could try it in python. > I'll help all that I can! > > There is no good treeviewing program for all platforms (read: nothing for > Linux and Solaris which doesn't need 8bpp color mode) > I always had some problems to code treeparsing scripts and beeing able to > represent them in a "good" way on the screen (trifurcation, distances etc.) > I could need some help here ... > I do like the representation Peter showed us: http://industry.ebi.ac.uk/~alan/BioWidget/ There are numerous ways people have chosen to represent this type of data. I think a good look at some of the literature will help us. I'll see if I can find anything else. > > I started on some smaller versions where the branches or taxa labels (in my > case SWISSPROT ID's) are linked to yank (sequence retrieval), SWISSPROT > database, blast and clustalw - which should be connected/linked from the > whole genome map/sequence ... that seems to fit perfectly into the LOCI > way of thinking. > Phylogeny is something we be very concerned with in developing LocusML. Your input to Justin and Rahul would be helpful. > > My time schedule: > * mar,apr,may: finish my current paper > * apr: bioinformatics meeting in Lyon(France) (RECOMB99) > * apr: bioinformatics meeting in Lund(Sweden) (bioinformatics'99) > (am I going to meet some of you in Lyon or Lund ?) > I wish. 
I haven't heard of Bioinformatics'99. Do you have a URL? Konrad, will you be at RECOMB99? > > I am very tempted to leave the whole sequence editor part to Dave and > only keep on with the basic_nucleotide_sequence and phylogenetic tools. > > If you wish. Be sure to work with Tim on the basic sequence tools, as he has been developing some (Codon). Jeff bizzaro@bc.edu From bizzaro at bc.edu Mon Mar 22 22:24:17 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] molecule viewer Message-ID: <36F70960.8B6B10BB@bc.edu> Locians, I found a simple C-GTK molecule viewer, but perhaps the only GTK+ molecule viewer. It comes from Eric Harlow's new book on GTK development. I have a link to the source code at the new Web site: http://129.63.144.25/ It compiled on my system, but I can't seem to display any molecules. If anyone gets it to work right, let me know. Greg, this might be something you want to take a close look at, since it deals with molecules and a GTK GUI. I can't find the license for it, but I think it is GNU GPL. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From justin at ukans.edu Tue Mar 23 02:24:39 1999 From: justin at ukans.edu (Justin Bradford) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] phylogeny and and overview question In-Reply-To: <36F6B1DC.487EB75E@bc.edu> Message-ID: > Phylogeny is something we be very concerned with in developing LocusML. > Your input to Justin and Rahul would be helpful. Phylogeny, too?!? Ok. Well, I'm going to need input here. I'd like to make a draft of the LocusML, but I need input on structure and phylogeny. Sequence I'm going to take from bioml and bsml. As for the structure, I want to clear up something I'm a little confused about. Does the Loci system work like this:
Desktop <-> wfs <--|----> analysis locus #1
                   |
                   |----> analysis locus #2
                   |
                   |----> database
                   |
                   |----> etc...
And things from the third column only talk to the wfs, and not directly to each other. Right? Justin From justin at ukans.edu Tue Mar 23 02:26:28 1999 From: justin at ukans.edu (Justin Bradford) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] phylogeny and and overview question In-Reply-To: Message-ID: > As for the structure, I want to clear up something I'm a little confused > about. Does the Loci system work like this: I just realized this might be a bit unclear. When I said "structure" in that sentence, I meant the structure of the Loci framework of tools. Justin From hinsen at cnrs-orleans.fr Tue Mar 23 04:47:08 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] interesting... In-Reply-To: <36F6B1DC.487EB75E@bc.edu> (bizzaro@bc.edu) References: <99Mar20.134637est.131718@gateway.macroint.com> <14069.64306.977925.47568@beagle.bmc.uu.se> <36F6B1DC.487EB75E@bc.edu> Message-ID: <199903230947.KAA22818@dirac.cnrs-orleans.fr> > > My time schedule: > > * mar,apr,may: finish my current paper > > * apr: bioinformatics meeting in Lyon(France) (RECOMB99) > > * apr: bioinformatics meeting in Lund(Sweden) (bioinformatics'99) > > (am I going to meet some of you in Lyon or Lund ?) > > I wish. I haven't heard of Bioinformatics'99. Do you have a URL? > > Konrad, will you be at RECOMB99? I didn't even know about it until now! Which perhaps proves that I am not in bioinformatics... Konrad.
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From Thomas.Sicheritz at molbio.uu.se Tue Mar 23 04:53:13 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] interesting... In-Reply-To: <199903230947.KAA22818@dirac.cnrs-orleans.fr> References: <99Mar20.134637est.131718@gateway.macroint.com> <14069.64306.977925.47568@beagle.bmc.uu.se> <36F6B1DC.487EB75E@bc.edu> <199903230947.KAA22818@dirac.cnrs-orleans.fr> Message-ID: <14071.25652.327532.298103@beagle.bmc.uu.se> Konrad Hinsen writes: > > > My time schedule: > > > * mar,apr,may: finish my current paper > > > * apr: bioinformatics meeting in Lyon(France) (RECOMB99) > > > * apr: bioinformatics meeting in Lund(Sweden) (bioinformatics'99) > > > (am I going to meet some of you in Lyon or Lund ?) > > > > I wish. I haven't heard of Bioinformatics'99. Do you have a URL? > > Konrad, will you be at RECOMB99? > > I didn't even know about it until now! Which perhaps proves that > I am not in bioinformatics... http://www.loria.fr/~kucherov/RECOMB99/ http://www.biokemi.su.se/bioinformatics99/ -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ... 
From rahul at photino.sid.rice.edu Tue Mar 23 05:47:52 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:24 2006 Subject: [Pipet Devel] interesting... In-Reply-To: <14070.3927.402222.785804@beagle.bmc.uu.se> Message-ID: On Mon, 22 Mar 1999 Thomas.Sicheritz@molbio.uu.se wrote: > Thx - looks nice. But what I had in mind was more a viewer and editor. The Apple project (XCF?) had an integrated viewer/editor/generator(from an HTML nested list or by following links). The interface allowed the user to position the node in space and then use the mouse to "fly through" the tree. It was really quite impressive. > - what about the performance - is this hyperbolic viewer really usable on a > normal workstation ? The Java applet gives reasonable performance on my P150 under Linux 2.2 and Netscape 4.08. The Apple version was a bit less than twice as fast as a Windows binary on the same system. -- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From rahul at photino.sid.rice.edu Tue Mar 23 06:06:15 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] interesting... In-Reply-To: Message-ID: On Tue, 23 Mar 1999, Rahul Jain wrote: > On Mon, 22 Mar 1999 Thomas.Sicheritz@molbio.uu.se wrote: > > > Thx - looks nice. But what I had in mind was more a viewer and editor. > > > - what about the performance - is this hyperbolic viewer really usable on a > > normal workstation ? Oops, I didn't realize that you were talking about another tree-viewer.... 
This one seems a bit slow and the labels tend to overlap a bit. The slow responsiveness makes it even harder to get to a place where you can read them well. The zoombar should also have a slightly different scale... Personally, I loved the Apple viewer and that model would be really nice to follow: It's sort of in 3D, with the top node in front and each lower level farther back. Clicking on a part of the figure flies in that direction, pressing Ctrl speeds it up and pressing Shift reverses direction. Double-clicking on a node centers it and brings it to a reasonable distance. You really have to use it to see how cool the view and interface were. Unfortunately, I think Apple canceled the project and scrapped all the code. I can't find a single reference to it now... -- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From rahul at photino.sid.rice.edu Tue Mar 23 07:29:57 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] interesting... In-Reply-To: Message-ID: Ya know, we could support all of these if we code this thing correctly. Just make an abstract (virtual) class that's a "TreeViewer". Then we can have concrete subclasses such as FlyThruTreeViewer and SphereOverlaidTreeViewer or whatever. We can implement whatever's easiest at first and then add more as time goes on and as we get more developers. I better get some sleep...
-- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From Thomas.Sicheritz at molbio.uu.se Tue Mar 23 07:38:01 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] interesting... In-Reply-To: References: Message-ID: <14071.35325.208490.652352@beagle.bmc.uu.se> Rahul Jain writes: > Ya know, we could support all of these if we code this thing correctly. > Just make an abstract(virtual) class that's a "TreeViewer". > Then we can have concrete subclasses such as FlyThruTreeViewer and > SphereOverlaidTreeViewer or whatever. We can implement whatever's easiest > at first and then add more as time goes on and as we get more developers. It's a pity that I cannot check the Apple viewer - it sounds really interesting ... But you are right - we can implement (step by step) everything we want ... I'll think about what's needed. > I better get some sleep... Ok - I better get some lunch ... c ya -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ...
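The abstract "TreeViewer" idea from this exchange could be sketched roughly as follows. This is only an illustration under stated assumptions: it uses present-day Python (the abc module did not exist when this thread was written), it assumes a tree held as nested tuples of label strings, and the names leaves, render and IndentedTextTreeViewer are hypothetical rather than anything from Loci. FlyThruTreeViewer is kept as an unimplemented placeholder.

```python
from abc import ABC, abstractmethod


class TreeViewer(ABC):
    """Abstract viewer: stores the tree, leaves drawing to subclasses."""

    def __init__(self, tree):
        # Assumed representation: a leaf is a str, an internal node is a
        # tuple of child nodes.
        self.tree = tree

    def leaves(self, node=None):
        """Return the leaf labels in left-to-right order."""
        node = self.tree if node is None else node
        if isinstance(node, str):
            return [node]
        labels = []
        for child in node:
            labels.extend(self.leaves(child))
        return labels

    @abstractmethod
    def render(self):
        """Return some displayable form of the tree."""


class IndentedTextTreeViewer(TreeViewer):
    """Simplest concrete viewer: one node per line, indented by depth."""

    def render(self):
        lines = []
        self._walk(self.tree, 0, lines)
        return "\n".join(lines)

    def _walk(self, node, depth, lines):
        if isinstance(node, str):
            lines.append("  " * depth + node)
        else:
            lines.append("  " * depth + "+")  # marks an internal node
            for child in node:
                self._walk(child, depth + 1, lines)


class FlyThruTreeViewer(TreeViewer):
    """Placeholder for a 3D fly-through viewer; not implemented here."""

    def render(self):
        raise NotImplementedError("3D rendering is not part of this sketch")
```

The point of the design is that code which only needs "a viewer" can call render() without knowing which subclass it was handed, so the easiest viewer can be written first and fancier ones added later.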
From Thomas.Sicheritz at molbio.uu.se Tue Mar 23 10:10:25 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] Tree Data Structure Message-ID: <14071.44136.217697.452404@beagle.bmc.uu.se> Hej, What is the best way to implement a tree data structure in python ? The example tree (in newick format) looks like:
(chlamydia:100.000000,((PARDE:100.000000,RECAM:100.000000,RICKY:100.000000,(PORPU:100.000000,CHOCR:100.000000):100.000000):100.000000,(ECOLI:100.000000,COXBU:100.000000):100.000000):100.000000,MYCTUB:100.000000);
drawn as an unrooted tree:
/--------------------------------------------------- chlamydia(1)
|
|                           /----------------------- PARDE(2)
|                           |
|                           +----------------------- RECAM(3)
|                           |
|           /------100------+----------------------- RICKY(6)
|           |               |
|           |               |                /------ PORPU(7)
|           |               \------100-------+
+--100------+                                \------ CHOCR(9)
|           |
|           |                                /------ ECOLI(4)
|           \--------------100---------------+
|                                            \------ COXBU(5)
|
\--------------------------------------------------- MYCTUB(8)
Should we choose nested lists, NL parser or available tree modules ? I am not fluent in python yet - so I could use help with the basic structure and parsing classes. Suggestions ? thx -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ... From bizzaro at bc.edu Tue Mar 23 10:40:13 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] phylogeny and and overview question References: Message-ID: <36F7B5DD.D4AC2173@bc.edu> Justin Bradford wrote: > As for the structure, I want to clear up something I'm a little confused > about.
Does the Loci system work like this: > > Desktop <-> wfs <--|----> analysis locus #1 > | > |----> analysis locus #2 > | > |----> database > | > |----> etc... > > And things from the third column only talk to the wfs, and not directly to > each other. Right? > Maybe more like this: Workspace <-> wfs <--|----> analysis locus #1 | wfs | |----> analysis locus #2 | wfs | |----> gui locus #1 | wfs | |----> gui locus #2 | wfs | |----> database | wfs | |----> etc... where "Workspace" is (1) the workflow diagram/monitor, (2) the notebook/logger, and (3) the central canvas. Communication to these isn't really any different. It's just that these are for user monitoring and control. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From hinsen at cnrs-orleans.fr Tue Mar 23 14:17:56 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] Tree Data Structure In-Reply-To: <14071.44136.217697.452404@beagle.bmc.uu.se> (Thomas.Sicheritz@molbio.uu.se) References: <14071.44136.217697.452404@beagle.bmc.uu.se> Message-ID: <199903231917.UAA17142@dirac.cnrs-orleans.fr> Thomas.Sicheritz@molbio.uu.se writes: > What is the best way to implement a tree data structure in Python? That depends on the operations that are to be performed on the data! Konrad. 
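To make the "depends on the operations" point concrete for the Newick example earlier in this thread: if the only operations needed are parsing and simple traversal, plain nested tuples may be enough. The following is a minimal sketch under that assumption, written for current Python 3 rather than the Python of this thread. The names parse_newick and leaf_names are hypothetical and do not come from any module mentioned here, and the parser only handles the simple "name:length" form used in the example (no quoted labels or comments).

```python
def parse_newick(text):
    # Parse a simple Newick string into nested (name, length, children) tuples.
    # Hypothetical helper for illustration; not part of any module in this thread.
    text = text.strip()
    pos = 0

    def peek():
        # Current character, or "" once the end of the input is reached.
        return text[pos] if pos < len(text) else ""

    def parse_node():
        nonlocal pos
        children = []
        if peek() == "(":
            pos += 1  # consume "("
            while True:
                children.append(parse_node())
                if peek() == ",":
                    pos += 1
                elif peek() == ")":
                    pos += 1
                    break
                else:
                    raise ValueError("malformed Newick near position %d" % pos)
        # Optional node label.
        start = pos
        while peek() and peek() not in ",():;":
            pos += 1
        name = text[start:pos].strip()
        # Optional ":branch_length".
        length = None
        if peek() == ":":
            pos += 1
            start = pos
            while peek() and peek() not in ",();":
                pos += 1
            length = float(text[start:pos])
        return (name, length, children)

    tree = parse_node()
    if peek() != ";":
        raise ValueError("expected ';' at end of Newick string")
    return tree


def leaf_names(node):
    # Collect leaf labels in left-to-right order.
    name, _length, children = node
    if not children:
        return [name]
    names = []
    for child in children:
        names.extend(leaf_names(child))
    return names


if __name__ == "__main__":
    # The example tree from the original question.
    example = ("(chlamydia:100.000000,((PARDE:100.000000,RECAM:100.000000,"
               "RICKY:100.000000,(PORPU:100.000000,CHOCR:100.000000):100.000000)"
               ":100.000000,(ECOLI:100.000000,COXBU:100.000000):100.000000)"
               ":100.000000,MYCTUB:100.000000);")
    print(leaf_names(parse_newick(example)))
```

Nested tuples are cheap to build and walk, but they are immutable and carry no parent links, so operations such as rerooting or pruning would probably call for a small node class instead.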
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From carlosm at moet.cs.colorado.edu Tue Mar 23 15:41:18 1999 From: carlosm at moet.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] Tree Data Structure In-Reply-To: <199903231917.UAA17142@dirac.cnrs-orleans.fr> Message-ID: I agree with Konrad. If you are using the tree data structure for fast access to a large amount of data, use the B-tree portion of bsddb (www.sleepycat.com). Python has bindings to the (old) 1.85 version (somebody might have swigged the newer versions, too). It's faster than any Python program and you get persistence for free. Carlos On Tue, 23 Mar 1999, Konrad Hinsen wrote: Thomas.Sicheritz@molbio.uu.se writes: > What is the best way to implement a tree data structure in Python? That depends on the operations that are to be performed on the data! Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From bizzaro at bc.edu Tue Mar 23 17:43:47 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] Tree Data Structure References: Message-ID: <36F81923.CB21F0EF@bc.edu> Thomas, The B+Tree module for Python (bplustree.py), written by Aaron Watters, is attached. 
I'm not sure if this is the Berkeley DB version that Carlos was referring to, but it is not a binding. It's all Python. Carlos, have you seen this module before? Is it any good? There appears to be no license for this module other than this: This code is provided for arbitrary use, but without warrantee of any kind. At present it seems to work, but I'll call it an beta until it's better tested. Jeff bizzaro@bc.edu Carlos Maltzahn wrote: > > I agree with Konrad. If you are using the tree data structure for fast > access to a large amount of data, use the B-tree portion of bsddb > (www.sleepycat.com). Python has bindings to the (old) 1.85 version > (somebody might have swigged the newer versions, too). It's faster than > any Python program and you get persistence for free. > > Carlos > > On Tue, 23 Mar 1999, Konrad Hinsen wrote: > > Thomas.Sicheritz@molbio.uu.se writes: > > > What is the best way to implement a tree data structure in Python? > > That depends on the operations that are to be performed on the > data! > > Konrad. > -- > ------------------------------------------------------------------------------- > Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr > Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 > Rue Charles Sadron | Fax: +33-2.38.63.15.17 > 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ > France | Nederlands/Francais > ------------------------------------------------------------------------------- > -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- -------------- next part -------------- """ B+tree implementation. ====================== B+ trees are an efficient index structure for mapping a dictionary type object into a disk file. All keys for these dictionary structures are strings with a fixed maximum length. The values can be strings or integers (often representing seek positions in a secondary file) depending on the implementation. 
B+ trees can be useful for storing large mappings on disk in such a way that a small number of keys/values can be retrieved very quickly (with very few disk accesses). B+ trees can also be useful for sorting a very large number (millions) of records by unique string key values. In this implementation all keys must not exceed the maximum length for a given tree. For string values there is no limitation on size of content. Note that in my tests updates are 2-3 times slower than retrieves, except for walking which is much faster than normal retrieves. As an add-on this module also provides a dbm compatible interface that permits arbitrary length keys and values. See below. Provided here are several implementations: BplusTree(): defines a mapping from strings to integers. caching_BPT(): subclass of BplusTree that caches key,value pairs already seen. This one cannot be updated. Construct a compatible index file using BplusTree and for read only access that touches a manageable number of keys, reopen the file using caching_BPT. SBplusTree(): defines a mapping from strings to strings. Updatable, but overwrites or deletions will leave "unreachable garbage" in the "value space" of the index file. Use recopy_sbplus() to recopy the file, eliminating the garbage. caching_SBPT(): analogous to caching_BPT, but mapping to strings. File creation: ============== To create an index file do the following: file = open(filename, "w+b") B = SBplusTree(file, seek_position, nodesize, keymax) B.startup() where seek_position is the seek_position where to "start" the tree (usually the start of file, 0), nodesize is the number of keys to keep at each node of the tree (pick an even number between 2 and 255), and keymax is the maximum size for the string keys in the mapping. When choosing nodesize remember that larger nodesizes make Python do more work and the file system do less work. I think 212 is probably a pretty good number. Of course choose keymax to be as large as you will need. 
A too large key size, however, may waste considerable space in the file. Now that you have a tree you can populate it with values just like a dictionary. B["this"] = "that" B["willy"] = "wonka" x = B["this"] del B["this"] print len(B) ... f.close() The supported dictionary operations are indexed retrieval B[k], indexed assignment B[k] = v, key deletion del B[k] and length len(B). Retrieval and deletion will raise KeyError on absent key. Assignment will raise ValueError if the key is too large. B.keys(), B.values(), B.items() are not directly supported, but see "Walking" below. Note that the "basic" B-plus tree implementations only accept and return integers as values. The SB-plus implementation will accept anything as values, but will use the str(x) function to convert them to a string before storing the value in the file. The value returned will always be the string value stored. IE B["okeydoke"] = 23 print `B["okeydoke"]` prints "'23'", with the quotes. The controlling application must control the serialization/deserialization of values if it needs to store something other than strings. Read only file access: ====================== Once an index file exists it can be re-opened in "read only" mode. f = open(filename, "rb") B = caching_SBPT(f) B.open() print B["willy"] Note that the configuration parameters for the tree are determined from a "file header". Note however that a file written to store integers using BplusTree should not be opened for strings using SBplusTree or undefined and undesirable behaviour will result. Opening an SBplusTree as a BplusTree is not advisable either. If the seek position for the start of the tree is anything other than 0, it must be specified: B = caching_SBPT(f, position) or undefined behaviour will result. In this mode, retrieval and walking are permitted, but attempts to modify the structure will cause an exception. 
In this mode the programmer may prefer to use the "caching" versions if they expect to retrieve the same keys many times and if the number of keys to touch is not huge (say, in the millions). Re-open for modification: ========================= An existing index file can also be reopened for modification. f = open(filename, "r+b") B = SBplusTree(f) B.open() B["this"] = "is fun!" ... f.close() Again, modifications are disallowed for cached trees. Walking: ======== One of the neat features of B-plus trees is that they keep their keys in sorted order. Hence it is easy and efficient to retrieve the keys/values sorted by the keys, and also to do range queries. To support this feature the tree implementations provide a "walker" interface. walker = tree.walker(lowerkey, includelower, upperkey, includeupper) while walker.valid: print (walker.current_key(), walker.current_value()) walker.next() walker.first() Or to traverse all pairs in key-sorted order walker = tree.walker() while walker.valid: print (walker.current_key(), walker.current_value()) walker.next() walker.first() The lowerkey/upperkey parameters indicate where to start/end walking (interpreted as the beginning/end if they are omitted or set to None) and includelower indicates whether to include the lower value if it is present in the tree, if not the next greater key will be the start position. For example to walk from key "m" (or just past it if absent) to the end: w = tree.walker("m", 1) or to walk between "mzzz" and "nzzz" not inclusive: w = tree.walker("mzzz", 0, "nzzz", 0) or walk from the beginning to "m", not inclusive w = tree.walker(None, None, "m", 0) Here w.current_key() and w.current_value() retrieve the current key and value respectively, w.next() moves to the next pair, if there is one and w.valid indicates whether there is a current pair, and w.first() resets the walker to the first pair, if there is one. At initialization the walker is already at the first pair, if it exists. 
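The bound semantics described above can be illustrated on a plain in-memory sorted list. This is only a rough sketch in current Python 3 and is not how this module implements walkers (which step through leaf nodes on disk); the function walk_range and its parameter names are made up for the illustration.

```python
from bisect import bisect_left, bisect_right


def walk_range(sorted_keys, lower=None, include_lower=False,
               upper=None, include_upper=False):
    # Return the keys a range walk would visit, given inclusive/exclusive bounds.
    # Hypothetical helper: mirrors the documented bound semantics only.
    if lower is None:
        start = 0                                # no lower bound: start at the first key
    elif include_lower:
        start = bisect_left(sorted_keys, lower)  # at lower, or just past it if absent
    else:
        start = bisect_right(sorted_keys, lower) # strictly after lower
    if upper is None:
        stop = len(sorted_keys)                  # no upper bound: walk to the end
    elif include_upper:
        stop = bisect_right(sorted_keys, upper)  # up to and including upper
    else:
        stop = bisect_left(sorted_keys, upper)   # strictly before upper
    return sorted_keys[start:stop]


if __name__ == "__main__":
    keys = ["a", "m", "mzzz", "n", "nzzz", "z"]
    print(walk_range(keys, "m", True))                  # from "m" to the end
    print(walk_range(keys, "mzzz", False, "nzzz", False))  # between, not inclusive
    print(walk_range(keys, None, None, "m", False))     # from the beginning to "m"
```

The three calls correspond to the three tree.walker(...) examples given above.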
Multiaccess optimizations: ========================== To make updates and retrievals run faster you can enable/disable a tree-global least-recently-used fifo mechanism which reduces reads and writes, but be *sure* to disable it before closing any BTree file that has been modified, or the tree may well become corrupt try: B.enable_fifo() do_updates(B) finally: B.disable_fifo() The fifo may also improve performance for read only access, but it is not important to disable the mechanism later. The optimizations help most when key accesses are localized. (ie, a bunch of inserts with keys starting "abc..." or 10000 inserts in [almost] key-sorted order). For only one access, it's no help at all! The fifo mechanism will not help for walking, so don't do it if you will only walk a portion of the tree once. You might want to try putting various values as the optional argument to enable_fifo, eg, B.enable_fifo(1000) (but that's probably past the diminishing returns point...). Large fifos will consume lots of "core" memory. Trash compacting ================ The functions recopy_bplus(f1, f2) and recopy_sbplus(f1, f2) recopy open "rb" file f1 to (open "w+b") file f2 for BplusTrees and SBplusTrees respectively. The copy f2 will have no "garbage" and almost all leaf nodes will be full. This can result in reducing file size by about 1/3. Both files must have headers at seek 0 and hold nothing but the tree nodes and tree data. Also look at recopy_tree(t1, t2). DBM compatibility ================= As an application of SBplusTree this module also provides a plug-compatible implementation of the standard python dbm style functionality, except that the "mode" parameter is not supported on initialization. See the Python Lib manual entry on dbm. 
Both keys and values may be of *arbitrary* length in this case, but keys are not kept in key-sorted order and overwrites and key collisions will result in unused garbage in the file (keys and values occur as SBplustree "values" using a PORTABLE bucket hashing scheme). d = dbm(filename, flag) creates a dictionary like structure with d[key]=value, x=d[key], d.has_key(key), del d[key], len(d), and d.keys(). Also after any modification be sure that d gets explicitly closed d.close() or the file *may* become corrupt. Also, d.copy(otherfilename, "c") will create a more compact copy of d in another file with garbage discarded. The dbm implementation uses a very large fifo, so many accesses may consume a lot of "core" memory. DBM comparison ============== An alternative to this module is gdbm or dbm for file indexing -- both supported by available Python extension modules. Expect dbm to be generally faster than this module, but remember: - dbm doesn't do key-sorted walking. - dbm often isn't portable across machines. - dbm isn't written in Python (ie, requires an extension module). - dbm sometimes doesn't allow arbitrary value lengths (but gdbm allows arbitrary length keys and values...) whereas this module does/is. I don't know precisely how much faster dbm is, but for some types of use it may turn out to actually be slower, for all I know. Please let me know! Probably the most compelling advantage is that the index files generated by this module are portable across platforms. Fun === For fun or debugging try tree.dump(). There is also a test suite for the module at the bottom (test() and retest()) which create a test index called "test" in the current directory. Also testdbm(). Caveats: ======== NOTE: only the standard string ordering is supported for walking at present. This could be fixed... WARNING: Never modify a tree while it is being walked. Always recreate all walkers after a tree modification. NEVER open the same tree for modification twice! 
ALWAYS make sure a modified tree has disabled the fifo and the file has been closed before reopening the tree. WARNING: This implementation has no support for concurrent modification. It is designed for "write once by one process", "read many by (possibly) several processes, but not with concurrent modification." WARNING: If during modification any exception other than a KeyError/ValueError is not caught, the indexed file structure *may* become corrupt (because some operations completed and others didn't). Walking all values of an index or B.dump() may detect some corrupt states (***Note I should write a sanity-check routine***) WARNING: As noted above an overwrite or delete for a SBTree (mapping to strings) will leave unreachable junk in the "value space" of the index. See above. This code is provided for arbitrary use, but without warrantee of any kind. At present it seems to work, but I'll call it an beta until it's better tested. Aaron Watters, arw@pythonpros.com http://starship.skyport.net/crew/aaron_watters http://www.pythonpros.com """ import string nilseek = -1 from marshal import dumps sequence_overhead = len(dumps("")) intsize = len(dumps(1)) # bisect algorithm with bounds (in 1.5 this is in /Lib) # Insert item x in list a, and keep it sorted assuming a is sorted def insort(a, x, lo=0, hi=None): if hi is None: hi = len(a) while lo < hi: mid = (lo+hi)/2 if x < a[mid]: hi = mid else: lo = mid+1 a.insert(lo, x) # Find the index where to insert item x in list a, assuming a is sorted def bisect(a, x, lo=0, hi=None): if hi is None: hi = len(a) while lo < hi: mid = (lo+hi)/2 if x < a[mid]: hi = mid else: lo = mid+1 return lo NOROOMERROR = "NOROOMERROR" Rootflag = 1 Interiorflag = 2 Freeflag = 3 Leafflag = 4 LeafandRootflag = 5 Leafflags = (Leafflag, LeafandRootflag) Interiorflags = (Interiorflag, Rootflag) class Node_Fifo: """fifo of nodes for locality access optimization""" def __init__(self, size=30): self.fifo = [] # fifo of active nodes, if used. 
self.fifosize = size self.fifo_dict = {} def flush_fifo(self): for node in self.fifo: if node.dirty: node.store(1) self.fifo = [] self.fifo_dict = {} class Node: """B+ tree node. follows Silberchatz & Korth database intro book closely. Each node has a number self.validkeys> 1 of valid keys (except for a tree with only 0 or 1 entries. For leaves each self.key[i] that is valid is associated with int value self.indices[i] For nonleaves nextnode integer reference is at self.indices[i+1] and self.indices[0] is for entries with keys255: raise ValueError, "size too large: "+`size` if size<0: #or size%2==1: raise ValueError, "size must be positive <= 255" self.position = position # seek position in file self.infile = infile # open file for storage self.keylen = keylen # maximum key length (no nulls!) # seek pointers for descendents (root/interior) # all but last is a value for a leaf, last is successor seek self.indices = [-1] * (size+1) # key storage # for leaves value for key[i] is at indices[i] # for others keys[i] is at indices[i+1], # indices[0] points to keys preceding keys[0]. # for freelist nodes, nodes are stored on # linked list with indices[0] forward self.keys = [""] * size # linearized storage length in file #self.intstorage = intsize * (size+1) #self.keystorage = keylen * size # in debug mode the seek position is prepended #if debug: # self.intstorage = self.intstorage + intsize #self.storage = (2 + # flag, valid # self.intstorage + self.keystorage) if cloner is None: self.storage = (sequence_overhead + # list overhead 2*intsize + # flag, valid (size+1)*intsize + # indices size*(sequence_overhead + keylen) # keys ) else: self.storage = cloner.storage self.fifo = cloner.fifo # note, for interior nodes # validkey of 0 means one valid pointer, -1 means none # for leaves validkeys should be positive if flag in [Interiorflag, Rootflag]: self.validkeys = -1 # number of valid entries else: self.validkeys = 0 def clear(self): # reinitialize keys, indices for self. 
size = self.size self.keys = [""] * size self.validkeys = 0 if self.flag in Interiorflags: # reinit all indices self.indices = [-1] * (size+1) self.validkeys = -1 else: # don't clobber forward pointer self.indices[:size] = [-1] * size # interior node operation. def putnode(self, key, node): """place a node for key into self. Raise NOROOMERROR if no room.""" from types import StringType if type(key)!=StringType: raise TypeError, "bad key "+`key` position = node.position self.putposition(key, position) def putfirstindex(self, index): #print "putfirstindex", index if self.validkeys>=0: raise ValueError, "putfirstindex on full node" self.indices[0] = index self.validkeys = 0 def putposition(self, key, position): #print "putposition", (key, position), self.indices, self.keys if self.flag not in Interiorflags: raise ValueError, "cannot insert into leaf node" validkeys = self.validkeys last = validkeys + 1 if self.validkeys>=self.size: raise NOROOMERROR, "no room" # store the key if validkeys<0: # no nodes currently #print "no keys" self.validkeys = 0 self.indices[0] = position else: # yes nodes keys = self.keys # is the key there already? if key in keys: if keys.index(key)value mapping into leaf node. """ from types import StringType, IntType if type(key)!=StringType and type(value)!=IntType: raise ValueError, "bad key, value"+ `(key,value)` if self.flag not in Leafflags: raise ValueError, "cannot get next for non-leaf." 
validkeys = self.validkeys indices = self.indices keys = self.keys if validkeys<=0: # empty # "first entry", (key, value) indices[0] = value keys[0] = key self.validkeys = 1 else: place=None if key in keys: place = keys.index(key) if place>=validkeys: place=None if place is not None: keys[place] = key indices[place] = value else: if validkeys>=self.size: #print "node out of room" #for x in self.__dict__.items(): print x raise NOROOMERROR, "no room" place = bisect(keys, key, 0, validkeys) #print "next entry at", place #next = place+1 last = validkeys+1 del keys[validkeys] del indices[validkeys] keys.insert(place, key) indices.insert(place, value) self.validkeys = last def put_all_values(self, keys_indices): """optimization for node restructuring.""" self.clear() indices = self.indices keys = self.keys length = self.validkeys = len(keys_indices) if length>self.size: raise IndexError, "bad length "+`length` #if lengthself.size: raise IndexError, "bad length "+`length` #if lengthself.fifo.fifosize: last = ff[-1] del ff[-1] del dict[last.position] #print "storing", last.position if last.dirty: last.store(1) #if len(dict)!=len(fifo): raise "whoops" def enable_fifo(self, size = 33): "you better disable it later!" 
if size<5 or size>1000000: raise ValueError, "size not valid: "+`size` self.fifo = Node_Fifo(size) def disable_fifo(self): #print "disabling fifo", self.fifo_dict.keys() #global fifo_on if self.fifo: self.fifo.flush_fifo() self.fifo = None def store(self, force=0): """write self to file at self.position return end of record seek position.""" #print "store", self.position position = self.position fifo = self.fifo if not force and fifo: fd = fifo.fifo_dict if fd.has_key(self.position) and fd[position] is self: self.dirty = 1 return # defer f = self.infile #save = f.tell() f.seek(position) data = self.linearize() f.write(data) last = f.tell() #f.seek(save) self.dirty = 0 if not force and self.fifo: self.add_to_fifo() return last def linearize(self): """create record format for self.""" from marshal import dumps all = [self.flag, self.validkeys] + self.indices + self.keys s = dumps(all) ls = len(s) storage = self.storage if (ls > storage): raise ValueError, "bad storage: " + `s` s = s + "X" * (storage-ls) return s #indices = self.indices # in debug prepend seek position #if debug: indices = [self.position] + indices #ints = encodeints(indices) #keys = encodestrs(self.keys, self.keylen) #validkeys = self.validkeys #if validkeys<0: v = "*" # dummy purposes only (prewrites) #else: v = chr(self.validkeys ^ CMASK) # try to make v readable #return "%s%s%s%s%s" % (self.flag, v, ints, keys, SEPARATOR) __print__ = linearize def delinearize(self, str): """parse, store from record format from self.""" from marshal import loads all = loads(str) [self.flag, self.validkeys] = all[:2] #self.flag = chr(ordflag) s = self.size next = 2+s+1 indices = self.indices = all[2:next] keys = self.keys = all[next:] if len(keys) != s: raise ValueError, "bad keys: " + `keys` + `len(keys)` def dump(self, indent=""): flag = self.flag if flag==Freeflag: print 'free->', self.position, nextp = self.indices[0] if nextp!=nilseek: next = self.clone(nextp) next = next.materialize() next.dump() else: print 
"!last" return nextindent = indent + " " print indent, if flag == Rootflag: print "root", elif flag == Interiorflag: print "interior", elif flag == Leafflag: print "leaf", elif flag == LeafandRootflag: print "root and leaf", else: print "invalid flag???", flag, print self.position, "valid=", self.validkeys print indent, "keys", self.keys print indent, "seeks", self.indices if flag in [Rootflag, Interiorflag]: # interior for i in self.indices: if i != nilseek: n = self.clone(i) n = n.materialize() n.dump(nextindent) else: # leaf pass print indent, "*****" class BplusTree: """Basic B+tree maps fixed length strings to integers (could be seek positions)""" length = None # fill in later dirty = 0 # default # length keylen, nodesize, root_seek, free header_format = "%10d %10d %10d %10d %10d\n" def __init__(self, infile, position=None, nodesize=None, keylen=None): """infile should be open file in "rb" or "w+b" mode. if optional args are not given they are determined from first line in file. """ #print "BPlusTree(%s, %s, %s)" % (position, nodesize, keylen) if keylen is not None and keylen<=2: raise ValueError, "keylen must be greater than 2" self.root_seek = nilseek # dummy self.free = nilseek self.root = None self.file = infile self.nodesize = nodesize self.keylen = keylen if position is None: position = 0 self.position = position #if nodesize is None: # self.get_parameters() def walker(self, keylower=None, includelower=None, keyupper=None, includeupper=None): return BplusWalker(self, keylower, includelower, keyupper, includeupper) def init_params(self): return (self.file, self.position, self.nodesize, self.keylen) def getfile(self): return self.file def getroot(self): return self.root def update_freelist(self, position): if self.free!= position: self.free = position self.reset_header() def startup(self): """startup the file, write header, set root""" if self.nodesize is None or self.keylen is None: raise ValueError, \ "cannot initialize without nodesize, keylen 
specified" self.length = 0 self.reset_header() file = self.file file.seek(0,2) # goto eof self.root_seek = file.tell() self.reset_header() root = self.root = Node(LeafandRootflag, self.nodesize, self.keylen, self.root_seek, file) root.store() def open(self): """get info on existing file.""" file = self.file self.get_parameters() self.root = Node(LeafandRootflag, self.nodesize, self.keylen, self.root_seek, file) self.root = self.root.materialize() fifo_enabled = 0 def enable_fifo(self,size=33): #print "fifo enabled" self.fifo_enabled = 1 self.root.enable_fifo(size) def disable_fifo(self): #print "fifo disabled" self.fifo_enabled = 0 if self.dirty: self.reset_header() self.dirty = 0 self.root.disable_fifo() def reset_header(self): """reset the header of the file""" if self.fifo_enabled: self.dirty = 1 return # defer file = self.file file.seek(self.position) #file.write( self.header_format % # (self.length, self.keylen, self.nodesize, self.root_seek, self.free) ) from marshal import dump dump( (self.length, self.keylen, self.nodesize, self.root_seek, self.free), file) def get_parameters(self): file = self.file #save = file.tell() file.seek(self.position) from marshal import load temp = load(file) #print temp, self.position (self.length, self.keylen, self.nodesize, self.root_seek, self.free)=\ temp #file.seek(save) def __len__(self): if self.length is None: self.get_parameters() return self.length def __getitem__(self, key): """self[key] -- get item associated with key""" if self.root is None: raise ValueError, "not open!" 
return self.find(key, self.root) def has_key(self, key): try: test = self[key] except KeyError: return 0 else: return 1 def __setitem__(self, key, value): """self[key]=value -- set map for key to value""" from types import StringType, IntType if type(key)!=StringType: raise ValueError, "key must be string" if type(value)!=IntType: raise ValueError, "value must be int" if len(key)>self.keylen: raise ValueError, "key too long" if value<0: raise ValueError, "value must be positive" current_length = self.length #if FORBIDDEN in key: # raise ValueError, "key cannot contain "+`FORBIDDEN` root = self.root if root is None: raise ValueError, "not open!" #global test1 #debug test1 = self.set(key, value, self.root) # do we need to split root? if test1 is not None: #print "splitting root", `test1` (leftmost, node) = test1 #print "leftmost", leftmost, node # make a non-leaf root (newroot, self.free) = root.getfreenode(self.free) newroot.flag = Rootflag if root.flag is LeafandRootflag: root.flag = Leafflag else: root.flag = Interiorflag newroot.clear() newroot.putfirstindex(root.position) newroot.putnode(leftmost, node) self.root = newroot self.root_seek = newroot.position newroot.store() root.store() self.reset_header() else: if self.length!=current_length: self.reset_header() def __delitem__(self, key): """del self[key] -- remove map for key to value""" root = self.root currentlength = self.length self.remove(key, root) if root.flag==Rootflag: validkeys = root.validkeys if validkeys<1: if validkeys<0: raise ValueError, "invalid empty non-leaf root" newroot = self.root = root.getnode(None) self.root_seek = newroot.position self.free = root.free(self.free) self.reset_header() if newroot.flag==Leafflag: newroot.flag = LeafandRootflag else: newroot.flag = Rootflag newroot.store() elif self.length!=currentlength: self.reset_header() elif root.flag!=LeafandRootflag: raise ValueError, "invalid flag for root" elif self.length!=currentlength: self.reset_header() def set(self, key, 
value, node): """insert key-->value starting at node. return None if no split, else return (leftmostkey, newnode) """ keys = node.keys validkeys = node.validkeys if node.flag in Interiorflags: # non leaf # find the descendent to insert in place = bisect(keys, key, 0, validkeys) #print place, key, validkeys, keys if place>=validkeys or keys[place]>=key: # insert at previous node index = place else: # index at node index = place+1 if index==0: nodekey=None else: nodekey=keys[place-1] #print "nodekey", nodekey, node.indices nextnode = node.getnode(nodekey) test = self.set(key, value, nextnode) # split? if test is not None: (leftmost, insertnode) = test try: # insert if room node.putnode(leftmost, insertnode) except NOROOMERROR: # no room, split insertindex = insertnode.position (newnode, self.free) = node.getfreenode( self.free, self.update_freelist) newnode.flag = Interiorflag ki = node.keys_indices("dummy") (dummy, firstindex) = ki[0] # remove dummy ki = ki[1:] # insert new pair insort(ki, (leftmost, insertindex)) newleftmost = self.divide_entries(firstindex, node, newnode, ki) node.store() newnode.store() return (newleftmost, newnode) else: node.store() return None # no split else: # leaf if key not in keys or keys.index(key)>=validkeys: newlength = self.length+1 else: newlength = self.length try: # insert if room node.putvalue(key, value) except NOROOMERROR: # no room: split # get entries (dummy is ignored for leaves) ki = node.keys_indices("dummy") insort(ki, (key, value)) (newnode, self.free) = node.getfreenode( self.free, self.update_freelist) newnode = node.newneighbor(newnode.position) newnode.flag = Leafflag # 0 is dummy firstindex, ignored for leaves newleftmost = self.divide_entries(0, node, newnode, ki) node.store() newnode.store() self.length = newlength return (newleftmost, newnode) else: node.store() self.length = newlength return None def remove(self, key, node): """remove key from tree at node. raise KeyError if absent. 
return (leftmost, size) if leftmost changes. otherwise return (None, size). Caller is responsible for restructuring node, if needed. """ newnodekey = None if node.flag in Interiorflags: # nonleaf keys = node.keys validkeys = node.validkeys place = bisect(keys, key, 0, validkeys) if place>=validkeys or keys[place]>=key: # delete at tree before place index = place else: # delete at tree for place index = place+1 if index==0: nodekey=None else: nodekey=keys[place-1] nextnode = node.getnode(nodekey) # recursively remove from nextnode (lm, size) = self.remove(key, nextnode) # is nextnode now too small? nodesize = self.nodesize half = nodesize/2 if (size=validkeys: # final node, get previous rightnode = nextnode rightkey = nodekey if validkeys<=1: leftkey = None else: leftkey = keys[place-2] leftnode = node.getnode(leftkey) else: # non-final, get next leftnode = nextnode leftkey = nodekey if index==0: rightkey=keys[0] else: rightkey = keys[place] rightnode = node.getnode(rightkey) # get all keys, indices rightki = rightnode.keys_indices(rightkey) leftki = leftnode.keys_indices(leftkey) ki = leftki + rightki # redistribute or merge? 
#print "ki, nodesize", ki, nodesize lki = len(ki) if lki>nodesize or (leftnode.flag!=Leafflag and lki>=nodesize): # redistribute (newleftkey, firstindex) = ki[0] if leftkey==None: newleftkey = lm if leftnode.flag!=Leafflag: # nuke first ki ki = ki[1:] newrightkey = self.divide_entries( firstindex, leftnode, rightnode, ki) # delete, reinsert right node.delnode(rightkey) node.putnode(newrightkey, rightnode) # ditto for left if first changed if (leftkey!=None and leftkey!=newleftkey): node.delnode(leftkey) node.putnode(newleftkey, leftnode) node.store() leftnode.store() rightnode.store() else: # merge into left, free right (newleftkey, firstindex) = ki[0] #leftnode.clear() if leftnode.flag!=Leafflag: #leftnode.putfirstindex(firstindex) #del ki[0] #for (k,i) in ki: # leftnode.putposition(k,i) leftnode.put_all_positions(firstindex, ki[1:]) else: #for (k,i) in ki: # leftnode.putvalue(k,i) leftnode.put_all_values(ki) if rightnode.flag==Leafflag: self.free = leftnode.delnext(rightnode, self.free) else: self.free = rightnode.free(self.free) if leftkey is not None and newleftkey!=leftkey: node.delnode(leftkey) node.putnode(newleftkey, leftnode) node.delnode(rightkey) node.store() leftnode.store() self.reset_header() if leftkey is None: newnodekey = lm else: # no restructure # update leftmost, if needed if nodekey is None: newnodekey = lm elif lm is not None: node.delnode(nodekey) node.putnode(lm, nextnode) # end of restructure if else: # leaf, base case: just delete it if node.validkeys<1: # should only happen for empty root raise KeyError, "no such key" first = node.keys[0] node.delvalue(key) rest = node.keys[0] if first!=rest: newnodekey = rest node.store() self.length = self.length - 1 return (newnodekey, node.validkeys) def divide_entries(self, firstindex, node1, node2, entries): """divide presorted entries evenly among node1, node2 return leftmost of node2. 
firstindex is ignored for leaves """ middle = len(entries)/2 + 1 #node1.clear() #node2.clear() if node1.flag in Interiorflags: #middle = middle+1 left = entries[:middle] right = entries[middle:] #print "left, right", left, right # nonleaf #node1.putfirstindex(firstindex) #for (k,i) in left: # node1.putposition(k,i) (leftmost, midindex) = right[0] #node2.putfirstindex(midindex) #for (k,i) in right[1:]: # node2.putposition(k, i) node1.put_all_positions(firstindex, left) node2.put_all_positions(midindex, right[1:]) return leftmost else: # leaf left = entries[:middle] right = entries[middle:] #for (k,i) in left: # node1.putvalue(k,i) #for (k,i) in right: # node2.putvalue(k,i) node1.put_all_values(left) node2.put_all_values(right) return right[0][0] def find(self, key, node): """find key starting at node.""" while node.flag in Interiorflags: # non-leaf thesekeys = node.keys validkeys = node.validkeys # find place at or just beyond key place = bisect(thesekeys, key, 0, validkeys) if place>=validkeys or thesekeys[place]>key: if place==0: nodekey=None else: nodekey=thesekeys[place-1] else: nodekey = key node = node.getnode(nodekey) return node.getvalue(key) def dump(self): self.root.dump() if self.free!=nilseek: free = self.root.clone(self.free) free = free.materialize() free.dump() def __del__(self): if self.fifo_enabled: self.disable_fifo() class BplusWalker: """iterative walker for bplustree leaf nodes.""" def __init__(self, tree, keylower=None, includelower=None, keyupper=None, includeupper=None): """initialize a walker for tree with key values bounded by upper/lower, if given, included or excluded as specified. 
Tree should never be updated while walker is active, otherwise behaviour of walker is undefined.""" self.tree = tree self.keylower = keylower self.includelower = includelower self.keyupper = keyupper self.includeupper = includeupper if self.tree.getroot() == None: self.tree.open() # get the first pertinent leaf in tree node = self.tree.getroot() while node.flag in Interiorflags: # interior node, seek a leaf if keylower is None: nkey = None else: keys = node.get_keys() place = bisect(keys, keylower) if place==0: nkey = None elif place>len(keys): nkey = keys[-1] else: nkey = keys[place-1] node = node.getnode(nkey) self.node = self.startnode = node # preinit self.node_index = None self.valid = 0 # pessimism self.first() def first(self): """reset walker to first position, or raise IndexError if keyrange is empty.""" node = self.node = self.startnode # is the key in the node? keys = node.keys #print "first at", keys keylower = self.keylower keyupper = self.keyupper validkeys = node.validkeys self.valid = 0 if keylower==None: self.node_index = 0 self.valid = 1 elif keylower in keys and self.includelower: index = self.node_index = keys.index(keylower) if indexkeylower or (self.includelower and testk==keylower)): self.valid = 1 else: self.valid = 0 else: # advance to the next node next = node.nextneighbor() if next is not None: self.startnode = next self.first() return else: self.valid = 0 # test keyupper if self.valid and keyupper is not None: key = self.current_key() if key=node.validkeys: # goto next node next = node.nextneighbor() if next is None: self.valid = 0 return node = self.node = next nextp = 0 #print "next at", node.keys, node.indices, nextp, node.validkeys if node.validkeys<=nextp: self.valid = 0 else: testkey = node.keys[nextp] keyupper = self.keyupper if (keyupper is None or testkeystrings. Key strings are fixed length as in BPlusTree. 
    Value strings are arbitrary length but space for overwritten
    or deleted values will be wasted in the file (they aren't GC'd,
    unlike tree nodes, which are).
    """

    # can be overridden.
    treeclass = BplusTree

    def __init__(self, infile, position=None, nodesize=None, keylen=None):
        self.infile = infile
        self.tree = self.treeclass(infile, position, nodesize, keylen)

    def walker(self, keylower=None, includelower=None,
               keyupper=None, includeupper=None):
        return SBplusWalker(self, keylower, includelower,
                            keyupper, includeupper)

    def __len__(self):
        return len(self.tree)

    def init_params(self):
        return self.tree.init_params()

    def getroot(self):
        return self.tree.getroot()

    def getfile(self):
        return self.infile

    def enable_fifo(self, size=33):
        self.tree.enable_fifo(size)

    def disable_fifo(self):
        self.tree.disable_fifo()

    def dump(self):
        """ignore real values here, should fix."""
        self.tree.dump()

    def startup(self):
        self.tree.startup()

    def open(self):
        self.tree.open()

    def __getitem__(self, key):
        # the tree maps key --> seek position of the marshalled value
        seek = self.tree[key]
        return getstring(self.infile, seek)

    def __setitem__(self, key, value):
        """Warning: overwrite "loses" old value space."""
        #try:
        #    test = self[key]
        #except KeyError:
        #    go = 1
        #else:
        #    go = (test != key)
        #if go:
        # assume overwrite (optimize)
        seek = putstring(self.infile, value)
        self.tree[key] = seek

    def __delitem__(self, key):
        """Warning: loses old value storage."""
        del self.tree[key]

    def has_key(self, key):
        # take the key as an argument and test the underlying tree for it
        return self.tree.has_key(key)

class caching_SBPT(SBplusTree):
    """string-->string caching b-plus tree."""
    treeclass = caching_BPT

class SBplusWalker:
    """iterator for string-->string Bplus tree."""
    # can be overridden
    walkerclass = BplusWalker

    def __init__(self, tree, keylower=None, includelower=None,
                 keyupper=None, includeupper=None):
        self.walker = self.walkerclass(tree, keylower, includelower,
                                       keyupper, includeupper)
        self.file = tree.getfile()
        self.valid = self.walker.valid

    def first(self):
        self.walker.first()
        self.valid = self.walker.valid

    def current_key(self):
        return self.walker.current_key()

    def current_value(self):
        seek = self.walker.current_value()
        return getstring(self.file, seek)

    def next(self):
        self.walker.next()
        self.valid = self.walker.valid

def putstring(infile, s):
    """Add a new string record to eof. return start seek."""
    #save = infile.tell()
    # seek to eof
    infile.seek(0,2)
    last = infile.tell()
    from marshal import dump
    dump(s, infile)
    #infile.seek(save)
    return last

def getstring(infile, i):
    """get an old string record at i"""
    #save = infile.tell()
    infile.seek(i)
    from marshal import load
    s = load(infile)
    #infile.seek(save)
    return s

def recopy_bplus(fromfile, tofile, treeclass=BplusTree):
    """copy BplusTree from fromfile to tofile.
    from file should be open "rb", tofile "w+b"."""
    fromtree = treeclass(fromfile)
    fromtree.open()
    (f, p, n, k) = fromtree.init_params()
    totree = treeclass(tofile, p, n, k)
    totree.startup()
    return recopy_tree(fromtree, totree)

def recopy_tree(fromtree, totree):
    """copy fromtree contents to totree. trees must be compatible.
    copy attempts to "compactize" totree."""
    (f,p,n,k) = totree.init_params()
    try:
        totree.enable_fifo()
        walker = fromtree.walker()
        # fill up first node in totree
        part1 = n/2 +1
        part2 = part1-2
        defer = []
        while walker.valid:
            # pseudooptimization: defer n/2-1 tail elements
            # for n even this makes all leaves full (in tests)
            for i in xrange(part1):
                if not walker.valid: break
                totree[ walker.current_key() ] = walker.current_value()
                walker.next()
            for (k,v) in defer:
                totree[k]=v
            defer = []
            for i in xrange(part2):
                if not walker.valid: break
                defer.append( (walker.current_key(), walker.current_value()) )
                walker.next()
        for (k,v) in defer:
            totree[k] = v
        return (fromtree, totree)
    finally:
        #print "disabling fifo"
        totree.disable_fifo()

def recopy_sbplus(fromfile, tofile, treeclass=SBplusTree):
    """copy SBplusTree from fromfile to tofile.
    from file should be open "rb", tofile "w+b".
this will create a new file without "lost garbage".""" return recopy_bplus(fromfile, tofile, treeclass) ##### simple dbm compatibility bignum = 0x7efe77 # 8 million buckets def myhash(s): """portable string hash function. (because builtin hash isn't portable).""" o = ord B = bignum result = 775 + len(s)*1001 for c in s: #print result result = (result*253 + o(c)*113) % B return result class dbm: """dbm compatible index file with unlimited key/value size. overwrites, dels and hash collisions leave "junk" in index. Alternate implementations left to reader, or to future. Hash indexed into buckets in an SBplusTree. buckets with marshalled dict of {key: value} for elements in this bucket. """ flagmap = {"r": "rb", "w": "r+b", "c": "w+b"} openmodes = ("r", "w") treeclass = SBplusTree nodesize = 202 def __init__(self, filename, flag="r", mode=None): #print "init", filename, flag, mode if mode is not None: raise ValueError, "sorry mode not supported (portability)" self.fileflag = flag rf = self.realflag = self.flagmap[flag] self.filename = filename f = self.file = open(filename, rf) # length record at start of file if flag in self.openmodes: from marshal import load from string import atoi self.length = load(f) # parameters determined from header #print "reopening", self.length, f.tell() t = self.tree = self.treeclass(f, f.tell()) t.open() else: # put length record from marshal import dump dump(0, f) self.length = 0 #print "creating", self.length, f.tell() t = self.tree = self.treeclass(f, f.tell(), self.nodesize, intsize-1) t.startup() self.tree.enable_fifo(self.nodesize+3) closed = 0 def close(self): if self.closed: return self.tree.disable_fifo() # put length record if self.length<0: raise ValueError, "negative len?"+`(self.length, self.filename)` f = self.file if self.fileflag in ("c", "w"): f.seek(0) from marshal import dump dump(self.length, f) f.close() self.closed = 1 def __del__(self): self.close() def __len__(self): return self.length def hash(self, key): from 
marshal import dumps h = myhash(key) hs = dumps(h)[1:] # nuke indicator return hs def pairs(self, hash): try: spairs = self.tree[hash] except KeyError: return {} from marshal import loads return loads(spairs) def setpairs(self, hash, pairs): from marshal import dumps spairs = dumps(pairs) self.tree[hash] = spairs def __getitem__(self, item): h = self.hash(item) pairs = self.pairs(h) return pairs[item] def __setitem__(self, item, value): h = self.hash(item) pairs = self.pairs(h) if not pairs.has_key(item): self.length = self.length+1 pairs[item] = value self.setpairs(h, pairs) #print self.length def __delitem__(self, item): h = self.hash(item) pairs = self.pairs(h) del pairs[item] if pairs: self.setpairs(h, pairs) else: del self.tree[h] self.length = self.length-1 #print self.length def has_key(self, item): try: test = self[item] except KeyError: return 0 else: return 1 def keys(self): """not terribly efficient! (should optimize?)""" result = [] w = self.tree.walker() from marshal import loads while w.valid: spairs = w.current_value() pairs = loads(spairs) for k in pairs.keys(): result.append(k) w.next() if len(result)!=self.length: raise IndexError, "bad tree length:"+`(len(result), self.length)` return result def copy(self, tofilename, flag, mode=None): if flag=="r": raise ValueError, "nonsense! 
can't copy into read only index" #print "copy", tofilename, flag other = dbm(tofilename, flag, mode) if flag=="c": # create: make optimal recopy_tree(self.tree, other.tree) other.length = self.length other.tree.enable_fifo(other.nodesize+3) elif flag=="w": # insert-into: simple traversal (collisions waste space) w = self.tree.walker() from marshal import loads while w.valid: spairs = w.current_value() pairs = loads(spairs) for (k,v) in pairs.items(): other[k] = v w.next() return other def testdbm(): print "creating files test1, 2, 3 for dbm test" d1 = dbm("test1", "c") for x in range(10): key = "hello"*x d1[key] = "01234567890"[:-x] print key, d1[key] print d1.keys() for x in range(300): d1[oct(x)] = hex(x) del d1[''] print len(d1), d1.keys() print "should be 0:", d1.has_key(""), d1.has_key("abd") print "copying" d2 = d1.copy("test2", "c") beforedel = len(d1) del d2["hello"] print len(d2), d2.keys() d2.close() d2 = dbm("test2", "r") print "should be equal", beforedel-1, len(d2) print "keys", d2.keys() print "testing copy-into" d3 = dbm("test3", "c") d3["willy"] = "wally" d3.close() d3 = d2.copy("test3", "w") print "should be equal", beforedel, len(d3) print "keys", d3.keys() ### test def test(): """test program: creates a bplustree file "test". try messing with the node size. """ print "creating file 'test' in current directory for test data." 
f = open("test", "w+b") B = SBplusTree(f, 0, 202, 10) B.startup() B.enable_fifo() #return B B["this"] = 0xdad from string import letters, digits for x in letters+digits: B[x] = ord(x) for x in "13579finalmopq": del B[x] print "final pass" from time import time s = time() for x in range(1000): B[hex(x)] = x; #print x print "one thousand assigns", time()-s #B.dump() B.disable_fifo() return (B, f) def retest(): from time import time f = open("test", "rb") B = caching_SBPT(f) B.open() B.enable_fifo() print "retesting" for x in "abcdefghi012345": try: print x, "-->", B[x] except KeyError: print x, "absent" print "entering torture chamber" s = time() for x in range(1000): l = B[hex(x)] print "1 thousand retrieves: ", time()-s return B print "keys, values between 4 and C (including C)" W = SBplusWalker(B, "4", 0, "C", 1) while W.valid: print (W.current_key(), W.current_value()), W.next() print print "keys, values between 4 (including 4) and C (excluding C)" W = SBplusWalker(B, "4", 1, "C", 0) while W.valid: print (W.current_key(), W.current_value()), W.next() print print "all keys" W = SBplusWalker(B) while W.valid: print W.current_key(), W.next() print print "A to A inclusive (1 elt)" W = SBplusWalker(B, "A", 1, "A", 1) while W.valid: print W.current_key(), W.next() print print "A to A exclusive (0 elt)" W = SBplusWalker(B, "A", 1, "A", 0) while W.valid: print W.current_key(), W.next() print print "AA to AA inclusive (0 elt)" W = SBplusWalker(B, "AA", 1, "AA", 0) while W.valid: print W.current_key(), W.next() print print B.disable_fifo() return (W, B, f) if __name__=="__main__": (B,f) = test() B=None f.close() retest() From bizzaro at bc.edu Tue Mar 23 17:56:51 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] phylogeny and and overview question References: <36F7B5DD.D4AC2173@bc.edu> Message-ID: <36F81C33.39FFCF1D@bc.edu> But if you're saying, Desktop = Workspace + all other GUI loci then you are correct. 
But the Workspace and GUI loci use the wfs to communicate with each other as
well. In short, EVERYTHING uses the wfs :-)

Jeff
bizzaro@bc.edu

"J.W. Bizzaro" wrote:
>
> Justin Bradford wrote:
>
> > As for the structure, I want to clear up something I'm a little confused
> > about. Does the Loci system work like this:
> >
> > Desktop <-> wfs <--|----> analysis locus #1
> >                    |
> >                    |----> analysis locus #2
> >                    |
> >                    |----> database
> >                    |
> >                    |----> etc...
> >
> > And things from the third column only talk to the wfs, and not directly to
> > each other. Right?
> >
>
> Maybe more like this:
>
> Workspace <-> wfs <--|----> analysis locus #1
>                      |
>                     wfs
>                      |
>                      |----> analysis locus #2
>                      |
>                     wfs
>                      |
>                      |----> gui locus #1
>                      |
>                     wfs
>                      |
>                      |----> gui locus #2
>                      |
>                     wfs
>                      |
>                      |----> database
>                      |
>                     wfs
>                      |
>                      |----> etc...
>
> where "Workspace" is (1) the workflow diagram/monitor, (2) the notebook/logger,
> and (3) the central canvas. Communication to these isn't really any different.
> It's just that these are for user monitoring and control.
>

--
J.W. Bizzaro               Phone: 617-552-3905
Boston College             mailto:bizzaro@bc.edu
Department of Chemistry    http://www.uml.edu/Dept/Chem/Bizzaro/
--

From justin at ukans.edu Tue Mar 23 18:28:49 1999
From: justin at ukans.edu (Justin Bradford)
Date: Fri Feb 10 19:18:25 2006
Subject: [Pipet Devel] phylogeny and and overview question
In-Reply-To: <36F81C33.39FFCF1D@bc.edu>
Message-ID:

> But if you're saying,
>
> Desktop = Workspace + all other GUI loci
>
> then you are correct. But the Workspace and GUI loci use the wfs to
> communicate with each other as well. In short, EVERYTHING uses the wfs
> :-)

Yes, I am. I believe I understand it now.

Justin

From bizzaro at bc.edu Thu Mar 25 02:05:37 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:25 2006
Subject: [Pipet Devel] official GNU project status
Message-ID: <36F9E041.900EBCB3@bc.edu>

Locians,

I've been mulling over the prospect of making The Loci Project part of the GNU
Project.
I had a number of questions, which I sent to the FSF, and below is their reply. What are your thoughts regarding this? I can't see how we can lose. Maybe some of you are just RMS haters...? Jeff bizzaro@bc.edu --------- jwb> Quoting Georg from Issue #1: jwb> "The TrustCenter in Hamburg (Germany) [5] released its PKCS#11 implementation jwb> under the GPL and made it an official GNU Project [6]." jwb> What makes a project "an official GNU Project"? What you wrote gives the jwb> impression that it was The TrustCenter's decision in this case. But it must of jwb> course be the FSF's decision. This is a question that might be worth answering in the issue #3... So you don't have to wait: here is your private answer. .-) "Official GNU Projects" are projects that are "officially accredited" by the FSF / GNU Project. The official GNU Projects are considered to be part of the GNU System and they are distributed on the GNU CD-ROMs. All GNU Projects follow the GNU coding guidelines (long commandline options, a help available via "--help" and such) as can be viewed on the GNU Webpage. jwb> Are all "GNU Projects" for the creation of the GNU OS and OS-related? The GIMP jwb> gained this status but is not a program critical for the existence of an OS. All projects under the GPL or Lesser GPL may become "official GNU Projects" as long as they are of interest for a group of people (just one or two isn't enough..). As you already said: Not all GNU Projects are neccessarily system-related. jwb> Also, are there restrictions to using "GNU" in a program name? If someone calls jwb> their program "GNU CD Player or GCP", do they or must they have permission to jwb> use "GNU"? Well. There are not really restrictions on the usage of "GNU" in a name although using GNU suggests a GNU affiliation and it would probably be a good idea to use it only if you have an official GNU Project. How to make a project "an official GNU Project" is easy. 
Contact the FSF / GNU Project and tell us what you are planning to do (or have done already) and (optional) why you think it'd be interesting to have this as a part of the GNU System. In most cases we'll give you an account on the GNU machines, offer you webspace for your project on www.gnu.org/software/your-cool-gnu-project and welcome you in the GNU community. :-) jwb> If a developer's project does become an accredited GNU project, what is the jwb> developer expected to give to the FSF? Particularly, does the FSF gain any jwb> copyright or legal claim to the software? That is up to you. If you ask to make your project a GNU Project you'll be asked whether you want to transfer your copyright to the FSF. This is not neccessary, though. If you take my GNU Project, the Xlogmaster, for instance: I still hold the copyright although it is an official GNU Project. The only thing that is really "expected" from you is to comply to the GNU coding standards which ensure that all applications have a similar "feel" to them (like everything should support the "--version" and "--help" commandline options). Those are never strictly enforced - it's more like something that "we would like you to do". Since those standards are basically common sense I never had a problem to follow them. We also encourage people to write clean code and write as much documentation as possible. Again this is never enforced. You will decide what happens to your project. Even after transferring the copyright you'd become the maintainer of the package and would determine it's course. No matter how you decide yourself: You'll always stay the original author and the maintainer as long as you want. The program will still be "your baby". .-) jwb> And what is the developer expected to jwb> get from the FSF in return. 
If you mean any monetary compensation: The FSF does not pay any money for making things official GNU Projects - sometimes you'll find announcements for special projects that might be funded by FSF money but in general we won't pay an author to make something a GNU Project. The other advantages are more than worth it in my eyes, though. You'll get an account on the GNU machines (together with a nifty "yourchoice@gnu.org" email address). You will be able to access GNU internal mailinglists and your project can have it's own homepage on www.gnu.org. If you want you can get a GNU Mailinglist and/or Newsgroup (gnu.your.project) for your program. Your project will be on the GNU CD-ROMs and ftp.gnu.org plus all it's mirrors. What I consider very pleasing myself is the feeling to have "given back" something to the community that allowed me to use all this cool software before. jwb> Thank you for answering my questions! No problem. Hope my answers helped you a bit. Regards, Georg -- J.W. Bizzaro mailto:bizzaro@bc.edu Boston College Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ Studies show that 93% of all people are below average. -- From bizzaro at bc.edu Fri Mar 26 08:15:53 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] Re: Future of TULIP References: <99032607283800.22282@mrnlinux.hg.med.umich.edu> Message-ID: <36FB8889.73FD4F00@bc.edu> Hi Matthew! TULIP/Loci is very much alive and growing. The lack of any signs of life is due to my failure to keep updating the "old" site. Maybe my excuse is that I have been planning on a "new", dedicated server for the project, which BTW is being set up now. Another excuse may be that the project is being defined and redefined so often that I can't put together a clear presentation. What I don't mention on the old site is that we now have a mailing list. 
To subscribe, send an e-mail to: majordomo@busboy.sped.ukans.edu with this text in the body of the message: subscribe tulip-list And we have an archive for the mailing list: http://toaster.sped.ukans.edu/tulip-list/ Take a look at some of the recent discussions. Your help would be very much appreciated! BTW, could you give me some background on your interests in research and experience in programming? Jeff bizzaro@bc.edu "Matthew R. Nelson" wrote: > > With your most recent news occurring within a one month period of time > almost four months ago, it looks like TULIP is dying or dead. Is this > true? If not, it is important to have your website at least give the > appearance of a living beast, providing any updates in decisions, class > definitions, prototypes, etc. Such activity can help reassure people like > me that it might be worth it to contribute to the effort. > > I'll continue to revisit your pages for signs of life. > > Regards, > > Matt > ---------------------------------------------------------------------------- > Matthew R. Nelson > Dept. of Human Genetics http://www-personal.umich.edu/~ticul/ > University of Michigan email: ticul@umich.edu > 4711 Medical Science II phone: (734) 763-8090 or 647-3151 > Ann Arbor, MI 48109-0618 fax: (734) 763-5477 > ---------------------------------------------------------------------------- -- J.W. Bizzaro mailto:bizzaro@bc.edu Boston College Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ Studies show that 93% of all people are below average. -- From bizzaro at bc.edu Fri Mar 26 16:42:37 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] [Fwd: Future of TULIP] Message-ID: <36FBFF4D.CE320E37@bc.edu> Reply from Matthew... -------------- next part -------------- An embedded message was scrubbed... From: "Matthew R. 
Nelson" Subject: Re: Future of TULIP Date: Fri, 26 Mar 1999 08:15:27 -0500 Size: 3195 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990326/0d0f80c7/attachment.mht From bizzaro at bc.edu Fri Mar 26 17:53:54 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] an interesting thought Message-ID: <36FC1002.63362451@bc.edu> Locians, It's strange that I just came across this article: http://www.news.com/News/Item/0%2C4%2C34314%2C00.html?dd.ne.txt.0326.04 Strange, because a couple days ago I was thinking about the future of GUI development and the role of XML and the Internet. I thought that the Web browser of today may some day become so customizable that it will be a portable GUI toolkit, with an Internet backbone. But this is also the direction Loci is heading. The article talks about using XUL (an XML) to provide the browser with GUI information (buttons, etc.). The problems are, (1) XUL would require an enormous and complex DTD, and (2) the browser would need all of the widgets built-in, ready to be called upon at runtime. I realized these would be insurmountable problems for Loci, if it were to go this route, simply because Loci is not the scale of Mozilla. But what if the GUI information for Loci were included in the XML, as with XUL, but >>>as a Python-GTK script<<< Yes, a functional program/module embedded in the XML. Has anyone heard of this being done before? Try that with compiled binaries! So each locus would be the same: just a shell that can process our XML (workflow + bio + GUI) and make an application on the fly. Any thoughts? What would be the advantages and disadvantages? 
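The embedded-script idea can be sketched in a few lines of modern Python.
This is only an illustration under stated assumptions, not code from Loci:
the element names `locus`, `biodata` and `guiscript`, the function
`load_embedded_script`, and the sample script are all hypothetical, and a
plain function stands in for a real Python-GTK interface.

```python
import types
import xml.etree.ElementTree as ET

# Hypothetical LocusML fragment; the element names are illustrative only.
EXAMPLE_XML = """<locus name="restriction-map">
  <biodata>GAATTC</biodata>
  <guiscript><![CDATA[
def describe(sequence):
    return "sequence of length %d" % len(sequence)
]]></guiscript>
</locus>"""


def load_embedded_script(xml_text, module_name="embedded_locus"):
    """Parse xml_text and execute its <guiscript> body as a new module.

    Returns (module, root) so the caller can hand the document's data to
    the functions the embedded script defines.
    """
    root = ET.fromstring(xml_text)
    source = root.findtext("guiscript")
    if source is None:
        raise ValueError("no <guiscript> element found")
    module = types.ModuleType(module_name)
    # WARNING: exec runs arbitrary code with the caller's privileges.
    # A real system would need sandboxing or a trusted script repository.
    exec(source, module.__dict__)
    return module, root


module, root = load_embedded_script(EXAMPLE_XML)
print(module.describe(root.findtext("biodata")))
```

The script travels inside the document, so the receiving shell needs no
prior knowledge of how to display the data. The cost is that the shell has
to trust whatever code the document carries.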
Jeff bizzaro@bc.edu From justin at ukans.edu Sat Mar 27 04:37:13 1999 From: justin at ukans.edu (Justin Bradford) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] an interesting thought In-Reply-To: <36FC1002.63362451@bc.edu> Message-ID: > So each locus would be the same: just a shell that can process our XML > (workflow + bio + GUI) and make an application on the fly. > > Any thoughts? What would be the advantages and disadvantages? The only problem I see is one of speed. However, if we had widgets along the lines of render_3d_molecule, etc, it could work. You don't mean the analysis tool generates the script, though, right? Justin From David.Lapointe at umassmed.edu Sat Mar 27 09:34:54 1999 From: David.Lapointe at umassmed.edu (Lapointe, David) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] an interesting thought Message-ID: <93307F07DE63D211B2F30000F808E9E525D739@edunivexch02.umassmed.edu> Jeff, This has been puzzling me for a while. What with the emphasis toward using Python as a base language for Loci, what about Grail as a browser/GUI ? I haven't tried this but apparently python scripts can be downloaded and client side executed. Also, at the BAMBCT meeting Thursday, I brought up the issue of using Linux to do Computational Biology. Lance thought a June meeting Show 'n Tell would be interesting. There are *many* scientists who are using Linux scattered around the Boston area. David -----Original Message----- From: J.W. Bizzaro To: tulip-list Sent: 3/26/99 5:53 PM Subject: [Pipet Devel] an interesting thought Locians, It's strange that I just came across this article: http://www.news.com/News/Item/0%2C4%2C34314%2C00.html?dd.ne.txt.0326.04 Strange, because a couple days ago I was thinking about the future of GUI development and the role of XML and the Internet. I thought that the Web browser of today may some day become so customizable that it will be a portable GUI toolkit, with an Internet backbone. But this is also the direction Loci is heading. 
The article talks about using XUL (an XML) to provide the browser with GUI information (buttons, etc.). The problems are, (1) XUL would require an enormous and complex DTD, and (2) the browser would need all of the widgets built-in, ready to be called upon at runtime. I realized these would be insurmountable problems for Loci, if it were to go this route, simply because Loci is not the scale of Mozilla. But what if the GUI information for Loci were included in the XML, as with XUL, but >>>as a Python-GTK script<<< Yes, a functional program/module embedded in the XML. Has anyone heard of this being done before? Try that with compiled binaries! So each locus would be the same: just a shell that can process our XML (workflow + bio + GUI) and make an application on the fly. Any thoughts? What would be the advantages and disadvantages? Jeff bizzaro@bc.edu From bizzaro at bc.edu Sat Mar 27 21:03:36 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] an interesting thought References: Message-ID: <36FD8DF8.C6453A88@bc.edu> Justin Bradford wrote: > > > So each locus would be the same: just a shell that can process our XML > > (workflow + bio + GUI) and make an application on the fly. > > > > Any thoughts? What would be the advantages and disadvantages? > > The only problem I see is one of speed. However, if we had widgets along > the lines of render_3d_molecule, etc, it could work. Yes, I was thinking about making high-level widgets (bindings to C-GTK concoctions). In fact, I have always considered Loci to be a "library" of high-level biowidgets. But I would make the basic widgets available too, probably through the Python-GTK bindings. I don't think it would be all that slow. The tools/loci will be in Python anyway. As Python works, modules are "imported" from other files. 
The script would be in the LocusML like this: #!/usr/bin/env python import sys from Gtkinter import * import GtkExtra class Application: From bizzaro at bc.edu Sat Mar 27 21:08:44 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] an interesting thought References: <36FD8DF8.C6453A88@bc.edu> Message-ID: <36FD8F2C.CEC4671A@bc.edu> Hehehe. The marker I put in at the end of my last message: "/guiscript" cut off the message. Here is the rest... and perhaps the script is read in by the locus and then written to a temporary file. Once in that file, the locus can "import" the module as if it were always available. > > You don't mean the analysis tool generates the script, though, right? > Not from scratch. But the analysis tool can pick out a script from a library/repository that is appropriate for the biodata in the XML. IOW, we are looking at "browsing" data that contains its own instructions for display and manipulation. It may be somewhat akin to Javascript in an HTML doc. I think that's the best way to look at it. But what does the user gain? I think that, just as the original plan for Loci was to allow the user to access analysis tools without having them all on the user's machine, this would allow the user to access graphical tools the same way. What I think may go along with this very well (in fact may be required), is a GUI builder (or Locus builder), sort of like a specialized Delphi. This can help developers create custom loci without having to get into too much of the Python and LocusML. Maybe this is very complicated, but I think it is the direction software development is heading. If we don't go this way now, we may see AOL Navigator do the same stuff in 10 years. Ciao, Jeff -- J.W. Bizzaro mailto:bizzaro@bc.edu Boston College Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ Studies show that 93% of all people are below average. -- From bizzaro at bc.edu Sun Mar 28 04:08:51 1999 From: bizzaro at bc.edu (J.W. 
Bizzaro) Date: Fri Feb 10 19:18:25 2006 Subject: [Pipet Devel] an interesting thought References: <93307F07DE63D211B2F30000F808E9E525D739@edunivexch02.umassmed.edu> Message-ID: <36FDF1A3.53E95E0E@bc.edu> "Lapointe, David" wrote: > > This has been puzzling me for a while. What with the emphasis > toward using Python as a base language for Loci, what about Grail as a > browser/GUI ? I haven't tried this but apparently python scripts can be > downloaded and client side executed. An excellent point! I have seen Grail before, and looking at it again, it does do what I was talking about. It would be a good exercise for everyone to download it: http://grail.cnri.reston.va.us/grail/source/ Grail is able to run Python-Tk scripts in the browser window. Check out these demos: http://grail.cnri.reston.va.us/grail/demo/ Yes, Grail is 100% Python (with Tk) and a good model for Loci, with a couple of exceptions... (1) It uses Tkinter rather than GTK. (2) It is not under the GNU LGPL or GPL (but it is free and open source). > Yes, this brings up the old Trojan Horse issue. Grail seems to use the standard "sandbox", and can be a model for Loci in security too. But they're not altogether certain about security either: http://grail.cnri.reston.va.us/grail/info/security.html > > Also, at the BAMBCT meeting Thursday, I brought up the issue of using Linux > to do Computational Biology. Lance thought a June meeting Show 'n Tell would > be interesting. There are *many* scientists who are using Linux scattered > around the Boston area. > So, maybe I can show Loci if anything works by then ;-) But of course Loci will be made to run on all flavors of UNIX, if we can help it. Jeff -- J.W. Bizzaro mailto:bizzaro@bc.edu Boston College Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ Studies show that 93% of all people are below average. -- From bizzaro at bc.edu Mon Mar 29 08:05:15 1999 From: bizzaro at bc.edu (J.W.
Bizzaro) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] Open Labs Message-ID: <36FF7A8B.B5BC430E@bc.edu> Locians, I uploaded the start of the new Web site for "Open Labs" and Loci: http://129.63.144.25/ Also, we will probably go with the name "Open Labs". I have been communicating with the person using openlab.org. He is creating an organization for open-source development, and I think there may be some confusion if we use "Open Lab". The plural form works well for us, since I was planning on having several "labs", each with a different project. The new server will likely be named: openlabs.uml.edu Jeff -- J.W. Bizzaro mailto:bizzaro@bc.edu Boston College Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ Studies show that 93% of all people are below average. -- From David.Lapointe at umassmed.edu Mon Mar 29 10:23:08 1999 From: David.Lapointe at umassmed.edu (Lapointe, David) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] an interesting thought Message-ID: <93307F07DE63D211B2F30000F808E9E525D73A@edunivexch02.umassmed.edu> What about Grail? That would allow embedded Python scripts. Has anyone tried that? David > -----Original Message----- > From: J.W. Bizzaro [mailto:bizzaro@bc.edu] > Sent: Friday, March 26, 1999 5:54 PM > To: tulip-list > Subject: [Pipet Devel] an interesting thought > > > Locians, > > It's strange that I just came across this article: > > > http://www.news.com/News/Item/0%2C4%2C34314%2C00.html?dd.ne.tx > t.0326.04 > > Strange, because a couple days ago I was thinking about the > future of GUI > development and the role of XML and the Internet. I thought > that the Web > browser of today may some day become so customizable that it > will be a portable > GUI toolkit, with an Internet backbone. > > But this is also the direction Loci is heading. > > The article talks about using XUL (an XML) to provide the > browser with GUI > information (buttons, etc.).
The problems are, (1) XUL would > require an > enormous and complex DTD, and (2) the browser would need all > of the widgets > built-in, ready to be called upon at runtime. > > I realized these would be insurmountable problems for Loci, > if it were to go > this route, simply because Loci is not the scale of Mozilla. > > But what if the GUI information for Loci were included in the > XML, as with XUL, > but > > >>>as a Python-GTK script<<< > > Yes, a functional program/module embedded in the XML. Has > anyone heard of this > being done before? Try that with compiled binaries! > > So each locus would be the same: just a shell that can > process our XML (workflow > + bio + GUI) and make an application on the fly. > > Any thoughts? What would be the advantages and disadvantages? > > > Jeff > bizzaro@bc.edu > From David.Lapointe at umassmed.edu Mon Mar 29 10:34:48 1999 From: David.Lapointe at umassmed.edu (Lapointe, David) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] an interesting thought Message-ID: <93307F07DE63D211B2F30000F808E9E525D73B@edunivexch02.umassmed.edu> Has anyone benchmarked GTK+ ? At least for our purposes? Eric Harlow has some cautionary things to say about real time performance ( in the games chapter ). > -----Original Message----- > From: J.W. Bizzaro [mailto:bizzaro@bc.edu] > Sent: Saturday, March 27, 1999 9:04 PM > To: tulip-list@busboy.sped.ukans.edu > Subject: Re: [Pipet Devel] an interesting thought > > > Justin Bradford wrote: > > > > > So each locus would be the same: just a shell that can > process our XML > > > (workflow + bio + GUI) and make an application on the fly. > > > > > > Any thoughts? What would be the advantages and disadvantages? > > > > The only problem I see is one of speed. However, if we had > widgets along > > the lines of render_3d_molecule, etc, it could work. > > Yes, I was thinking about making high-level widgets (bindings to C-GTK > concoctions). 
In fact, I have always considered Loci to be a > "library" of > high-level biowidgets. > > But I would make the basic widgets available too, probably through the > Python-GTK bindings. > > I don't think it would be all that slow. The tools/loci will > be in Python > anyway. As Python works, modules are "imported" from other > files. The script > would be in the LocusML like this: > > > #!/usr/bin/env python > > import sys > from Gtkinter import * > import GtkExtra > > class Application: > From hortiz at neurobio.upr.clu.edu Mon Mar 29 12:27:24 1999 From: hortiz at neurobio.upr.clu.edu (Humberto Ortiz Zuazaga) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] I'd like to contribute. Message-ID: <199903291727.NAA01917@chimbo.neurobio.upr.clu.edu> Hello, I'm Humberto Ortiz-Zuazaga, and I work with the University of Puerto Rico's Institute of Neurobiology: http://www.neurobio.upr.clu.edu/ I've been given the go-ahead to develop sequence analysis tools for a molecular biology core facility being set up here. I'd like to contribute to the Loci project, as it looks like you've designed the very system I'd like to create (by the way, where is the design document? I read it Friday night, but it's not on the new web site yet). I've got a couple of years' experience developing sequence analysis and genetic mapping tools, some of which are available on the web: http://www-bio.cnnet.clu.edu/analysis/ (when it's up) http://www.neurobio.upr.clu.edu/~hortiz/cmb/tkmap/ http://www.neurobio.upr.clu.edu/~hortiz/cmb/bpe/ I've dabbled a little in Python, and done no GNOME programming, but I agree with both these choices for an analysis GUI. I've gotten the python-gnome package set up on my machine, and looked over the mailing list archives (and subscribed). I look forward to contributing.
-- Humberto Ortiz Zuazaga Bioinformatics Specialist Institute of Neurobiology hortiz@neurobio.upr.clu.edu From rahul at photino.sid.rice.edu Mon Mar 29 15:35:51 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] an interesting thought In-Reply-To: <93307F07DE63D211B2F30000F808E9E525D73B@edunivexch02.umassmed.edu> Message-ID: On Mon, 29 Mar 1999, Lapointe, David wrote: > Has anyone benchmarked GTK+ ? At least for our purposes? Eric Harlow has > some cautionary things to say about real time performance ( in the games > chapter ). > I use GNOME all the time. It's reasonably fast on a P150/32MB RAM as long as you're using a new version. -- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request. From bizzaro at bc.edu Mon Mar 29 21:22:05 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] I'd like to contribute. References: <199903291727.NAA01917@chimbo.neurobio.upr.clu.edu> Message-ID: <3700354D.3D6B7A70@bc.edu> Hello Humberto! Humberto Ortiz Zuazaga wrote: > I've been given the go ahead to develop sequence analysis tools for a > molecular biology core facility being set up here. > > I'd like to contribute to the Loci project, as it looks like you've designed > the very system I'd like to create We'd love to have you help. We want to develop both a system for networking biotools and a set of basic biotools (loci). > (by the way, where is the design document? > I read it friday night, but it's not on the new web site yet). 
I don't have anything called a "design document". It could have been a few things. Did you see it on the mailing list archive or the old Web site? The new site is just starting to come together, so it is missing quite a bit. But as far as an actual design is concerned, it has been changing lately, and the best way to tell what we're going to do (until the Web site is ready) is to read the mailing list archive: http://toaster.sped.ukans.edu/tulip-list/ > > I've got a couple of years experience developing sequence analysis and genetic > mapping tools, some of which are available on the web: > > http://www-bio.cnnet.clu.edu/analysis/ (when it's up) > http://www.neurobio.upr.clu.edu/~hortiz/cmb/tkmap/ > http://www.neurobio.upr.clu.edu/~hortiz/cmb/bpe/ I took a look at the sites. You seem very capable. I like the "TkMap", since we will want a genome map viewer, and the sequence viewers will show a map as well. > > I've dabbled a little in python, and done no gnome programming, but I agree > with both these choices for a analysis GUI. I've gotten the python-gnome > package set up on my machine, and looked over the mailing list archives (and > subscribed). You may want to check out my PyG Tools Web site: http://www.uml.edu/Dept/Chem/BICGroup/PyGTools/ > > I look forward to contributing. > Great! I will make an account for you on our Linux box: 129.63.144.25 And I will soon send out a list of loci that need developers. Jeff -- J.W. Bizzaro mailto:bizzaro@bc.edu Boston College Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ I have always appreciated your ability to ________, whenever there has been a blank to fill. -- From hortiz at neurobio.upr.clu.edu Mon Mar 29 22:14:11 1999 From: hortiz at neurobio.upr.clu.edu (Humberto Ortiz Zuazaga) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] Old design document. In-Reply-To: Your message of "Tue, 30 Mar 1999 02:22:05 GMT." 
<3700354D.3D6B7A70@bc.edu> Message-ID: <199903300314.XAA03604@chimbo.neurobio.upr.clu.edu> > > (by the way, where is the design document? > > I read it friday night, but it's not on the new web site yet). > > I don't have anything called a "design document". It could have been a few > things. Did you see it on the mailing list archive or the old Web site? Old web site. A page describing how loci would use XML to pass results back and forth. I've been reading the list archives, so I see most of the design has changed (mostly for the better). > But as far as an actual design is concerned, it has been changing lately I've noticed; I'll work from the glossary then. You should put that up on the web site. From bizzaro at bc.edu Mon Mar 29 22:45:00 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] Old design document. References: <199903300314.XAA03604@chimbo.neurobio.upr.clu.edu> Message-ID: <370048BC.E07E1259@bc.edu> Humberto Ortiz Zuazaga wrote: > > > > (by the way, where is the design document? > > > I read it friday night, but it's not on the new web site yet). > > > > I don't have anything called a "design document". It could have been a few > > things. Did you see it on the mailing list archive or the old Web site? > > Old web site. A page describing how loci would use XML to pass > results back and forth. I've been reading the list archives, so I see > most of the design has changed (mostly for the better). Okay. That page is gone and pretty out of date. I have a copy of it, if you'd like to see it again. > > > But as far as an actual design is concerned, it has been changing lately > > I've noticed; I'll work from the glossary then. You should put that > up on the web site. Yep. That's going up soon. Jeff -- J.W. Bizzaro mailto:bizzaro@bc.edu Boston College Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ I have always appreciated your ability to ________, whenever there has been a blank to fill.
-- From bizzaro at bc.edu Tue Mar 30 00:43:12 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] new stuff up Message-ID: <37006470.39DFE0B2@bc.edu> Locians, I just posted an updated glossary. Note that the biggest changes from the last version are (1) tools == loci == clients (there is no distinction now) and (2) I added the "FigureBuilder". http://129.63.144.25/loci/docs/gloss.html Also, I posted some developer bios. Let me know if you have anything to change. I don't have any bio for Greg...Hey Greg! http://129.63.144.25/loci/devel.html Jeff -- J.W. Bizzaro mailto:bizzaro@bc.edu Boston College Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ I have always appreciated your ability to ________, whenever there has been a blank to fill. -- From bizzaro at bc.edu Tue Mar 30 01:26:26 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] Open Labs References: <36FF7A8B.B5BC430E@bc.edu> Message-ID: <37006E92.7B61EA4@bc.edu> Nah! The heck with it. I like "The Open Lab". So the server will be... bioinformatics.org And I will register... theopenlab.org theopenlab.net Jeff "J.W. Bizzaro" wrote: > > Locians, > > I uploaded the start of the new Web site for "Open Labs" and Loci: > > http://129.63.144.25/ > > Also, we will probably go with the name "Open Labs". I have been communicating > with the person using openlab.org. He is creating on an organization for > open-source development, and I think there may be some confusion if we used > "Open Lab". The plural form works well for us, since I was planning on having > several "labs", each with a different project. The new server will likely be > named: > > openlabs.uml.edu > > Jeff > -- > J.W. Bizzaro mailto:bizzaro@bc.edu > Boston College Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ > > Studies show that 93% of all people are below average. > -- -- J.W. 
Bizzaro mailto:bizzaro@bc.edu Boston College Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ I have always appreciated your ability to ________, whenever there has been a blank to fill. -- From justin at ukans.edu Tue Mar 30 19:41:52 1999 From: justin at ukans.edu (Justin Bradford) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] SooHaeng joins In-Reply-To: <37017261.ED400ECB@bc.edu> Message-ID: > Although he is using Mesa right now, while you said you are using > OpenGL. Are there any compatibility problems? Mesa is an OpenGL-compatible library, so theoretically, there shouldn't be any compatibility problems ;) Justin Bradford justin@ukans.edu From bizzaro at bc.edu Tue Mar 30 19:54:57 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] SooHaeng joins Message-ID: <37017261.ED400ECB@bc.edu> I recruited SooHaeng (and I guess his "partner"), who I found were in the middle of developing an MD modeler with GTK: http://dnd98.freeservers.com/ Greg, I think that SooHaeng and his program will complement your development of a rendering engine. Perhaps you can help add different representations (cartoon views, etc.). Although he is using Mesa right now, while you said you are using OpenGL. Are there any compatibility problems? SooHaeng's e-mail is attached. Jeff -------------- next part -------------- An embedded message was scrubbed... From: yoo@theoalpha.korea.ac.kr Subject: Thank you. Date: Mon, 29 Mar 1999 13:26:19 +0900 Size: 1400 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990331/930a36ce/attachment.mht From bizzaro at bc.edu Wed Mar 31 04:33:54 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] an interesting thought References: <93307F07DE63D211B2F30000F808E9E525D73B@edunivexch02.umassmed.edu> Message-ID: <3701EC02.8C762EF7@bc.edu> "Lapointe, David" wrote: > > Has anyone benchmarked GTK+ ? At least for our purposes?
Eric Harlow has > some cautionary things to say about real time performance ( in the games > chapter ). > Just qualitatively, it's one of the fastest GUIs for Linux I've seen. And PyGTK beats Tkinter hands down, AFAIC. Jeff -- J.W. Bizzaro mailto:bizzaro@bc.edu Boston College Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ I have always appreciated your ability to ________, whenever there has been a blank to fill. -- From bizzaro at bc.edu Wed Mar 31 06:56:07 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] new glossary Message-ID: <37020D57.4181F4FB@bc.edu> A new glossary is up: http://129.63.144.25/loci/docs/gloss.html I added a number of things we've been talking about lately: Locus PhyE - Phylogenic Editor; Locus PhyV - Phylogenic Viewer; Techie - (as in "lab technician") the name for the daemon that builds a database of what loci are available and what they can do; Locus IAB & CAB - the application brokers (IAB == Gatekeeper). Also, thinking about the relationship between viewers and editors, I think we need to make it clear that editors are attached to or dependent on viewers. For example, if the user wants to edit a nucleotide sequence, the Benchtop doesn't call up a sequence editor; it calls up a sequence viewer, and the viewer calls the editor. Why is this important? The viewer manages the display of all the biological data, so the user should be able to see changes made by the editor. This allows the editor to be simpler. Also, the editor can do without the code to manage workflow, since it will only communicate with the viewer. Make sense? Cheers, Jeff -- J.W. Bizzaro mailto:bizzaro@bc.edu Boston College Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ I have always appreciated your ability to ________, whenever there has been a blank to fill.
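[Editorial sketch, not part of the original message.] The viewer/editor relationship described in the "new glossary" message can be illustrated roughly as follows. This is a minimal sketch in present-day Python, assuming hypothetical class names (Benchtop, SequenceViewer, SequenceEditor) and a plain string as the sequence; none of this is Loci's actual code, and the redraw counter merely stands in for refreshing a display.

```python
class SequenceEditor:
    """Hypothetical editor: it knows only its viewer, not the Benchtop."""

    def __init__(self, viewer):
        self.viewer = viewer

    def replace(self, start, end, residues):
        # Build the edited sequence and hand it straight back to the viewer.
        old = self.viewer.sequence
        self.viewer.update(old[:start] + residues + old[end:])


class SequenceViewer:
    """Hypothetical viewer: it owns the data and its display."""

    def __init__(self, sequence):
        self.sequence = sequence
        self.redraws = 0

    def open_editor(self):
        # The viewer, not the Benchtop, is what calls up the editor.
        return SequenceEditor(self)

    def update(self, sequence):
        self.sequence = sequence
        self.redraws += 1  # stand-in for refreshing the display


class Benchtop:
    """Hypothetical Benchtop: it only ever calls up viewers."""

    def edit(self, sequence):
        viewer = SequenceViewer(sequence)
        return viewer, viewer.open_editor()


viewer, editor = Benchtop().edit("ATGCCC")
editor.replace(3, 6, "TTT")
print(viewer.sequence, viewer.redraws)
```

Because the editor reports every change to the viewer, the editor carries no display or workflow code of its own, which is the simplification the message argues for.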
-- From rahul at photino.sid.rice.edu Wed Mar 31 14:23:57 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:26 2006 Subject: [Pipet Devel] an interesting thought In-Reply-To: <3701EC02.8C762EF7@bc.edu> Message-ID: On Wed, 31 Mar 1999, J.W. Bizzaro wrote: > Just qualitatively, it's one of the fastest GUI's for Linux I've seen. And > PyGTK beats Tkinter hands down, AFAIC. Absolutely. The real problem w/ Tk is that it is layered on top of so many other toolkits. GTK talks to GDK, which talks directly to the X server. Although it's not that fast with the pixmap theme... :) -- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| Version 11.423.999.210000101.23.50110101.042 (c)1996-1999, All rights reserved. Disclaimer available upon request.
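[Editorial sketch, not part of the original thread.] The idea floated in the "an interesting thought" thread, a Python script carried inside the XML, written to a temporary file, and then imported as a module, might look roughly like this. It is a minimal sketch in present-day Python 3 using only the standard library, not the Python and GTK bindings of 1999. The element names <locus> and <guiscript>, the function load_embedded_module, and the embedded describe function are all assumptions made for illustration; no LocusML DTD is given in the thread.

```python
import importlib.util
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical LocusML fragment: <locus> and <guiscript> are assumed names,
# and the embedded function stands in for a real GUI script.
LOCUSML = """<locus name="demo">
  <guiscript><![CDATA[
def describe(sequence):
    return "%d residues" % len(sequence)
]]></guiscript>
</locus>"""


def load_embedded_module(locusml_text, module_name="embedded_gui"):
    """Extract the script, write it to a temporary file, and import it."""
    root = ET.fromstring(locusml_text)
    script = root.findtext("guiscript")
    if script is None:
        raise ValueError("no <guiscript> element found")
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
        handle.write(script)
        path = handle.name
    try:
        spec = importlib.util.spec_from_file_location(module_name, path)
        module = importlib.util.module_from_spec(spec)
        # WARNING: this runs whatever code the document contains. A real
        # system would need sandboxing, the "Trojan Horse" issue in the thread.
        spec.loader.exec_module(module)
    finally:
        os.remove(path)
    return module


gui = load_embedded_module(LOCUSML)
print(gui.describe("ATGCATGC"))
```

The sketch deliberately does nothing about security: importing the script executes it with the full privileges of the locus, which is exactly the concern raised about Grail's sandbox.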