From david.lapointe at umassmed.edu Sun Jan 24 15:30:58 1999
From: david.lapointe at umassmed.edu (david.lapointe@umassmed.edu)
Date: Fri Feb 10 19:18:05 2006
Subject: [Pipet Devel] BlastXML
Message-ID: <93307F07DE63D211B2F30000F808E9E525D662@edunivexch02.umassmed.edu>

Ok, I am one of those weasels too. I am not suggesting moving to Java but
here's a piece that came down the BioWidget pipeline this week.

%%%%%%%%%%
January 19, 1999

PharmTools SDK Suite(TM) (Early Access 1)

Announcing the availability of PharmTools SDK Suite(TM) (Early Access 1)
from WorkingObjects.com for evaluation. PharmTools SDK is a collection
of reusable Java frameworks and toolkits for use in the development of
bioinformatics applications. This early access release includes: the
Blast Parsing Framework and BlastXML-SDK. The Blast Parsing Framework is
a set of design patterns and Java classes for processing native (i.e.,
local) and NCBI website generated Blast2 reports. BlastXML-SDK extends
the Blast Parsing Framework functionality to support the creation and
processing of BlastXML documents.

The PharmTools SDK Suite may be downloaded from the WorkingObjects.com
website at
http://www.workingobjects.com.

%%%%%%%%%%%%

It's pretty large, about 100 kb of *.jar for the application, and 500 kb for
the sdk.jar with a blastxml.dtd. It has functionality that relates to
some discussion we had earlier this week on parsing the output of programs,
which incidentally is made easier by perl (regex). Many authors of *new and
improved* programs, e.g. FASTA, have included parsable output in their
programs. This makes it easier to connect different analyses. IMHO a good
thing. Most MolBio packages that I have seen are just a bag of unrelated
pieces, meaning you can't run the output of BLAST or FASTA into CLUSTAL
without a bit of work.

David

From bizzaro at bc.edu Sun Jan 24 19:14:00 1999
From: bizzaro at bc.edu (J.W.
Bizzaro)
Date: Fri Feb 10 19:18:05 2006
Subject: [Pipet Devel] BlastXML
References: <93307F07DE63D211B2F30000F808E9E525D662@edunivexch02.umassmed.edu>
Message-ID: <36ABB745.218CD79E@bc.edu>

I'm not anti-Java either (well...maybe I am :-).

You know, if any of these non-Python tools are open source, they will be
very useful for us to see how it was done. If we are interested in a pure
Python core, we may be able to translate some things to Python. BTW, what
translators are out there that translate ???-to-Python?

BlastXML should be good to look at because we want Blast searches to be
included in Loci. Not that Loci will come with Blast, but we want it to be
one of the first things added to the core. The same goes for FASTA.

> Many authors of *new and
> improved* programs, e.g. FASTA, have included parsable output in their
> programs. This makes it easier to connect different analyses.

Something we must take a look at. Have you looked into that, Harry?

> IMHO a good
> thing. Most MolBio packages that I have seen are just a bag of unrelated
> pieces, meaning you can't run the output of BLAST or FASTA into CLUSTAL
> without a bit of work.

I couldn't have said it better myself! :-) This is what Loci must _not_
become! Whatever we use, it must pass this test: Will it cause a break in
the Loci continuum? Will Loci become "a bag of unrelated pieces"? All Loci
data (in XML) should be able to be tossed around between the core parts of
Loci like a basketball at a Harlem Globetrotters game!

Jeff
bizzaro@bc.edu

--
J.W. Bizzaro                  Phone: 617-552-3905
Boston College                mailto:bizzaro@bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--

From hjm at cx408397-a.irvn1.occa.home.com Sun Jan 24 20:54:26 1999
From: hjm at cx408397-a.irvn1.occa.home.com (Harry Mangalam)
Date: Fri Feb 10 19:18:05 2006
Subject: [Pipet Devel] BlastXML
In-Reply-To: <36ABB745.218CD79E@bc.edu>
Message-ID:

On Sun, 24 Jan 1999, J.W. Bizzaro wrote:

[most deleted]

> > Many authors of *new and
> > improved* programs, e.g. FASTA, have included parsable output in their
> > programs. This makes it easier to connect different analyses.
>
> Something we must take a look at. Have you looked into that, Harry?
I've been involved in a porting project involving FASTA, but I admit I
missed the bit about making the output more parsable - I'll go back and
check it more carefully - thanks!! Do others have any other pointers to
packages that have made efforts to produce parsable output?

Related to this is an approach that Lincoln Stein discussed in an article
about using perl for the human genome project, which I'll also throw out
for general misinformation: the use of an i/o language called boulderio
which had its beginnings in development of the Whitehead's 'Primer'. He
described it as a way to pass data thru pipes with each added analysis
being able to tag it with additional info. I'm not suggesting using it as
is, but the idea of being able to add analytical value to a
pipes/streams-based dataflow is very attractive, especially to a large
effort such as a genome initiative or even pharma.

The article and links to boulderio are at:
http://bio.perl.org/GetStarted/tpj_ls_bio.html
http://stein.cshl.org/software/boulder/

This is a lightweight approach to marking up data so that it can be passed
from app to app. It is not a very formal approach, but it has been used to
coordinate some very large sequencing efforts.

>
> > IMHO a good
> > thing. Most MolBio packages that I have seen are just a bag of unrelated
> > pieces, meaning you can't run the output of BLAST or FASTA into CLUSTAL
> > without a bit of work.
>
> I couldn't have said it better myself! :-) This is what Loci must _not_
> become! Whatever we use, it must pass this test: Will it cause a break in
> the Loci continuum? Will Loci become "a bag of unrelated pieces"? All Loci
> data (in XML) should be able to be tossed around between the core parts of
> Loci like a basketball at a Harlem Globetrotters game!

I like that analogy.

Cheers
Harry

From bizzaro at bc.edu Sun Jan 24 21:25:40 1999
From: bizzaro at bc.edu (J.W.
Bizzaro)
Date: Fri Feb 10 19:18:05 2006
Subject: [Pipet Devel] embedding queries in XML
References:
Message-ID: <36ABD616.30DAC33E@bc.edu>

Harry Mangalam wrote:
> the use of an i/o language called boulderio which
> had its beginnings in development of the Whitehead's 'Primer'. He described
> it as a way to pass data thru pipes with each added analysis being able to
> tag it with additional info. I'm not suggesting using it as is, but the idea
> of being able to add analytical value to a pipes/streams-based dataflow is
> very attractive, especially to a large effort such as a genome initiative or
> even pharma.
>
> This is a lightweight approach to marking up data so that it can be passed
> from app to app. It is not a very formal approach, but it has been used to
> coordinate some very large sequencing efforts.

I like what was suggested by someone on the team earlier, that the XML file
can contain a list of queries to be performed and already performed. In that
sense, the XML files say "This is where I'm going. And this is where I've
been. Can you help me?" So if the player who catches the basketball doesn't
know where to throw it to, he can read the name of the recipient and sender
right off of the ball.

Why would a locus get an XML file it wasn't intended to get? Maybe this idea
is best suited for a router system. Can each locus be a router? Each one
_should_ be. Even if a locus was intended to get the data and do something
with it, if the next step is to send the data somewhere else, it should know
where to send it.

Maybe the list of queries/commands can be put into the XML from the GCL
Benchtop, at the start of the analysis. This way, the GCL won't have to
control every step. Each locus won't have to ask the GCL where it should go
next. The XML data will know the path.

Once again, I'm not sure how this would work with Paos. Can you enlighten us,
Carlos? Can an XML object be treated as a mobile object with a mission?

Jeff

--
J.W.
Bizzaro                       Phone: 617-552-3905
Boston College                mailto:bizzaro@bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--

From carlosm at mroe.cs.colorado.edu Mon Jan 25 02:44:28 1999
From: carlosm at mroe.cs.colorado.edu (Carlos Maltzahn)
Date: Fri Feb 10 19:18:06 2006
Subject: [Pipet Devel] embedding queries in XML
Message-ID:

On Sun, 24 Jan 1999, J.W. Bizzaro wrote:

> > This is a lightweight approach to marking up data so that it can be passed
> > from app to app. It is not a very formal approach, but it has been used to
> > coordinate some very large sequencing efforts.
>
> I like what was suggested by someone on the team earlier, that the XML file
> can contain a list of queries to be performed and already performed. In that
> sense, the XML files say "This is where I'm going. And this is where I've
> been. Can you help me?" So if the player who catches the basketball doesn't
> know where to throw it to, he can read the name of the recipient and sender
> right off of the ball.

This sounds like a workflow system to me. Except the agents are now tools
instead of office workers.

> Why would a locus get an XML file it wasn't intended to get? Maybe this idea
> is best suited for a router system. Can each locus be a router? Each one
> _should_ be. Even if a locus was intended to get the data and do something
> with it, if the next step is to send the data somewhere else, it should know
> where to send it.

Again, this sounds like nodes in a distributed workflow system. Exceptions
can occur in each node and, depending on the presence of a matching
exception handler, the node knows where to send the data next - or just
raises a flag and says "I don't know what to do with this, help me!"

> Maybe the list of queries/commands can be put into the XML from the GCL
> Benchtop, at the start of the analysis. This way, the GCL won't have to
> control every step. Each locus won't have to ask the GCL where it should go
> next. The XML data will know the path.
>
> Once again, I'm not sure how this would work with Paos. Can you enlighten us
> Carlos? Can an XML object be treated as a mobile object with a mission?

I assume that an XML object is an intermediate result in the execution of
some composition of analysis tools, correct? If you want to add XML objects
as agents who can actively seek their next tool, you need to provide the
environment to do that (there is a Python module that offers a safe
execution environment for such mobile objects). Those execution
environments would be like special shells that receive XML objects and
execute them.

From the viewpoint of Paos these shells would be Paos clients. They submit
status information to one or more Paos servers and receive notifications
that either control the execution of tools or modify the content of XML
objects (such as routing information). Other clients of Paos are GCL
editors and monitors that visualize the whereabouts and status of these
mobile XML objects.

So I see Paos as control and monitoring infrastructure for shells which
receive and send XML objects from/to other shells and start and feed tools
according to these XML objects. I think you call these shells analysis
loci (correct?). But you could use the same shells as visualization loci.
For me a visualization locus is just another glyph in a GCL construct. To
the user the only difference to an analysis locus is the fact that it
usually runs fairly local to the user and calls a tool that shows up as
a Gnome application that visualizes data.

Let me know whether this sounds right.

Carlos

From bizzaro at bc.edu Tue Jan 26 18:45:39 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:06 2006
Subject: [Pipet Devel] embedding queries in XML
References:
Message-ID: <36AE53A2.B6DCEF36@bc.edu>

Carlos Maltzahn wrote:
> This sounds like a workflow system to me. Except the agents are now tools
> instead of office workers.

Yes!
I ignored the concept of the workflow system because it related to office
workers, but I guess we can treat the loci as workers. Great! Now where did
I see a workflow system recently? Zope does this, right?

> I assume that an XML object is an intermediate result in the execution of
> some composition of analysis tools, correct?

The XML is generated from whatever biological data the user starts with.
They will get some piece of info (a sequence or a structure) and will want
to do something with it. As soon as Loci knows what the data is, it is put
into XML, and it stays in XML indefinitely.

> So I see Paos as control and monitoring infrastructure for shells which
> receive and send XML objects from/to other shells and start and feed tools
> according to these XML objects. I think you call these shells analysis
> loci (correct?).

If by shells you mean analysis tools, in whatever language, wrapped in
Python so that they become transparent, then yes.

> But you could use the same shells as visualization loci.

Well...They're shells in the sense that GTK/GNOME GUIs are wrapped in
Python. But they will be pure Python.

> For me a visualization locus is just another glyph in a GCL construct. To
> the user the only difference to an analysis locus is the fact that it
> usually runs fairly local to the user and calls a tool that shows up as
> a Gnome application that visualizes data.
>
> Let me know whether this sounds right.

Yes! I think you see it the way I do. Only visualization loci are always
local/client, while analysis loci can be remote/Internet-server (as I have
been describing them) _or_ local/client. The mechanism for local or remote
analysis loci (gatekeeper and porta) should work nearly the same. My
reasoning for having both is that local analysis will give better control
and faster feedback, while remote analysis will expand the Loci
installation to the extent of what is on the Internet.

BTW, Carlos, don't worry about rushing to get the documentation done.
We understand that your thesis is your personal life and much more
important ;-)

Jeff

--
J.W. Bizzaro                  Phone: 617-552-3905
Boston College                mailto:bizzaro@bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--

From bizzaro at bc.edu Wed Jan 27 19:16:58 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:06 2006
Subject: [Pipet Devel] our own XML
References: <36AE660B.4D9707B7@bc.edu> <199901271742.SAA17578@dirac.cnrs-orleans.fr>
Message-ID: <36AFAC7A.860982B3@bc.edu>

Justin et al,

Then let's try to design our own XML, emphasizing (1) biomacromolecule
structure according to Konrad's specifications, (2) biopolymer sequence,
(3) commands and queries used by Loci, (4) object orientation, and
(5) workflow...as these things work best with Paos. And let's see if we can
make use of the best of existing XML's/DTD's.

Carlos, can Paos be extended to offer _native_ support for XML objects that
embed queries and other information needed to make our workflow system, as
we've been discussing lately? Not to make Paos work only with our
biological XML, but to work with any XML that supports the embedding of
workflow information.

Jeff

Konrad Hinsen wrote:
> > Konrad, you thought we might want to do this back when we had only
> > three people involved. Maybe we can call it "LocusML" or
> > "Bio-Object ML" (BOML) or "Bio-Macromolecule ML" (BMML).
>
> Fine with me, and I'd certainly use it for other applications as well.
>
> On the other hand, it is possible to design DTDs by extending existing
> ones. Perhaps this is a good idea to save effort and keep compatibility
> to some extent.

--
J.W.
Bizzaro                       Phone: 617-552-3905
Boston College                mailto:bizzaro@bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--

From hjm at cx408397-a.irvn1.occa.home.com Sat Jan 23 14:03:48 1999
From: hjm at cx408397-a.irvn1.occa.home.com (Harry Mangalam)
Date: Fri Feb 10 19:18:08 2006
Subject: [Pipet Devel] BioML vs BSML
In-Reply-To: <36A95B6D.BF276B19@bc.edu>
Message-ID:

Hi All,

Having only recently come to this arena, what's the group's evaluation of
the relative merits of BSML:
http://www.visualgenomics.com/sbir/rfc.htm

vs BioML:
http://www.proteometrics.com/BIOML/

I'm still going thru the text of the specs but if any of you have strong
arguments regarding either approach, I'd very much appreciate it. The
bioperl people seem to like the BioML approach.

Also, if you have come to an approach that can rationalize the competing
stds, please let me know.

Cheers
Harry

From bizzaro at bc.edu Sat Jan 23 18:00:05 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:08 2006
Subject: [Pipet Devel] BioML vs BSML
References:
Message-ID: <36AA5475.7BAE2552@bc.edu>

Harry,

We had some rather lengthy exchanges about the use of BSML and CML in Loci.
We were looking for an XML that would work well for both sequence and
structure information. But it seemed that neither was good at both.

In short, we stepped back from the issue, and decided we would try to
support BOTH, until something better came along. We even considered making
our own XML (we would call it BMML, "BioMolecule Markup Language"). What we
really wanted was just one language that would give us an excellent
description of large bio macromolecules. Well, of course that would have
been too much to take on right now.

Konrad actually has a lot to say about descriptive languages for
macromolecule structure. He had corresponded with Peter Murray-Rust, the
author of CML and someone rather influential in the development of XML.
Konrad, as he can tell you, is not satisfied with any current format.
BioML comes as a bit of a surprise to me. It seems to be brand new. Looking
it over a bit, it does seem to do a better job at handling structure than
BSML, and I like the inclusion of all sorts of biological information (it
can describe organisms as well, it seems)...although some may argue this is
"bloat".

I would like Konrad to give us his impression of BioML. It would be nice to
use one XML rather than two. My big question, however, is the licensing.
The Web page says to contact David Fenyo about the "commercial" use of
BioML. I wonder if this is one of those "well, as long as you don't make
money on it" licenses. If so, we have problems: It won't fit with GNU GPL.

I wrote a message to David Fenyo about this, and to see if he can give us a
contrast between BioML and BSML. A cc will be sent to the Tulip list.

Cheers!
Jeff
bizzaro@bc.edu

--
J.W. Bizzaro                  Phone: 617-552-3905
Boston College                mailto:bizzaro@bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--

From bizzaro at bc.edu Sat Jan 23 18:00:16 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:08 2006
Subject: [Pipet Devel] BioML license
Message-ID: <36AA5480.CF43C1A8@bc.edu>

David,

I was just told about the BioML language. I am the coordinator of a rather
large project to develop a free and open source (GNU GPL) bioinformatics
package.
It's called "The Loci Project," and here is the Web site:
http://www.uml.edu/Dept/Chem/BICGroup/Apps/TULIP/

XML will be the backbone of communication between tools. We were looking
closely at BSML and CML for descriptions of both sequence and structure
(neither does both well).

Anyway, my question to you, since you are the person to contact for
"commercial" uses of BioML, is what sort of restrictions do you have on the
use of BioML? I was not able to find this information on the Web site.
Although Loci is not commercial, our licensing (GPL) is not compatible with
other licenses that restrict commercial use.

Also, could you give us a brief contrast between BioML and BSML? What was
the motivation behind making "another" XML?

Thank you!
Jeff

--
J.W. Bizzaro                  Phone: 617-552-3905
Boston College                mailto:bizzaro@bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--

From bizzaro at bc.edu Sat Jan 23 18:11:38 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:08 2006
Subject: [Pipet Devel] BioML vs BSML
References: <36AA5475.7BAE2552@bc.edu>
Message-ID: <36AA572A.45B1237A@bc.edu>

"J.W. Bizzaro" wrote:
> I would like Konrad to give us his impression of BioML.

And of course Justin, who is our resident XML X-pert ;-)

Jeff

--
J.W. Bizzaro                  Phone: 617-552-3905
Boston College                mailto:bizzaro@bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--

From hjm at cx408397-a.irvn1.occa.home.com Sun Jan 24 00:38:21 1999
From: hjm at cx408397-a.irvn1.occa.home.com (Harry Mangalam)
Date: Fri Feb 10 19:18:08 2006
Subject: [Pipet Devel] BioML vs BSML
In-Reply-To: <36AA5475.7BAE2552@bc.edu>
Message-ID:

Hi Jeff et al,

As you probably have found out now, BioML is being used by the bio.perl
group and the perl masters at perl.org already have a pretty large archive
of useful scripts for manipulating XML.
And it looks like, in a VERY fast spin thru the docs, that using their XML
parser tools, it may be possible to use both of these XMLs, using these
perl modules to handle parsing and interconversion.

ie:
http://www.perl.com/pace/pub/perldocs/1998/11/xml.html
http://www.perl.com/pace/pub/perldocs/1998/12/cooper-01.html

perl also has gtk bindings and in general, I've heard about and done more
things using perl in the bio world than with python. Not to say python
doesn't make some or even most things easier - just that perl has a proven
track record in the bio area.

It seems not to be at cross purposes to the objectives of LOCI to implement
chunks of it in perl, no? As long as it remains easily implementable, and
usable, and freely re-distributable, perl is as much of an option as
python, isn't it?

Also, how is the LOCI project planning on handling the display of the
results of this effort? Both the BioML and the BSML browsers that are
available are MS-centric and certainly do not use gtk. The only browser
that I'm aware of that does is gzilla (www.gzilla.com) and mnemonic (
http://www.mnemonic.org/). Are you planning on using either of these for
your display dev platform? Or are you not implementing any display
technology?

VisualGenomics is planning a Java-based BSML browser, but I'm sure that it
won't use gtk, unless heavily funded. It'll probably be written with the
swing classes - what's the redist controls on swing - I'm not a Java
follower - sorry.

OK - enough mind rot from me for the present...

Cheers
harry

Harry J Mangalam, Developmental + Cell Biology
Rm 4201, Biological Sciences II, UC Irvine, Irvine, CA, 92697
(949) 824 4824[vox], (949) 824 8551[fax], mangalam@uci.edu
http://hornet.bio.uci.edu/~hjm/

From bizzaro at bc.edu Sun Jan 24 02:05:04 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:08 2006
Subject: [Pipet Devel] BioML vs BSML
References:
Message-ID: <36AAC620.56ECC21C@bc.edu>

Harry Mangalam wrote:
> It seems not to be at cross purposes to the objectives of LOCI to implement
> chunks of it in perl, no? As long as it remains easily implementable, and
> usable, and freely re-distributable, perl is as much of an option as
> python, isn't it?

Errrr mmmmmmmmmm argggggghh! I'm having convulsions here ;-) It seems at
every turn, there's something supposedly better to use, and I'm left having
to defend what I have chosen. If I don't give in, I'm too stubborn. But if
I do, Loci more closely resembles a smorgasbord of technologies.

I am trying to keep Loci homogeneous in terms of technology. I think Python
is a knock-out language, beating even Perl in most respects. I know Perl is
very prominent in sequence analyses...It's prominent in just about
everything. But I think Python is not far behind in acceptance, and is
gaining momentum.

What can we do? If there is absolutely no other choice, we can go with
something in Perl, ***providing we consider it a temporary option. If we
can find something later in Python or can convert it to Python, then we
will. But don't give in too easily.

>
> Also, how is the LOCI project planning on handling the display of the
> results of this effort?
> Both the BioML and the BSML browsers that are
> available are MS-centric and certainly do not use gtk. The only browser
> that I'm aware of that does is gzilla (www.gzilla.com) and mnemonic (
> http://www.mnemonic.org/). Are you planning on using either of these for
> your display dev platform? Or are you not implementing any display
> technology?

You're talking about gtk-based XML browsers? The Gnome libraries have a
canvas that is rather powerful. I think we can make our own XML-to-display
modules using Python-Gnome bindings.

Of course what we are doing is unique. The only other bioinformatics XML
browsers out there are the two you mentioned for BioML and BSML. So there
are no standard libraries for handling this sort of thing.

Thomas is working on a sequence editor, I think with the Gnome canvas. How
have things been developing, Thomas?

Harry, each Loci GUI tool will be a rather small XML browser, designed to
specifically handle *one* type of display. For example, we will have
separate tools for the display of short DNA sequences, circular genomes,
chromosomes, protein sequences with secondary structure, 3D DNA structures,
3D protein structures, phylogenetic trees, DNA sequence editing, protein
sequence editing, data plots, and others we haven't thought of yet. The
idea is to keep things small and modular. There really won't be any very
large apps in Loci.

>
> VisualGenomics is planning a Java-based BSML browser, but I'm sure that it
> won't use gtk, unless heavily funded. It'll probably be written with the
> swing classes - what's the redist controls on swing - I'm not a Java
> follower - sorry.
>

Yeah, if it uses Java, it will almost certainly use Swing. I think Swing is
now a part of the standard distribution of Java, which is supposed to be
fairly easy to obtain. The license is not nearly GNU GPL.

Jeff

--
J.W.
Bizzaro                       Phone: 617-552-3905
Boston College                mailto:bizzaro@bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--

From hjm at cx408397-a.irvn1.occa.home.com Sun Jan 24 12:46:19 1999
From: hjm at cx408397-a.irvn1.occa.home.com (Harry Mangalam)
Date: Fri Feb 10 19:18:08 2006
Subject: [Pipet Devel] BioML vs BSML
In-Reply-To: <36AAC620.56ECC21C@bc.edu>
Message-ID:

Hi All (new content inline below)

On Sun, 24 Jan 1999, J.W. Bizzaro wrote:

> Harry Mangalam wrote:
> > It seems not to be at cross purposes to the objectives of LOCI to implement
> > chunks of it in perl, no? As long as it remains easily implementable, and
> > usable, and freely re-distributable, perl is as much of an option as
> > python, isn't it?
>
> Errrr mmmmmmmmmm argggggghh! I'm having convulsions here ;-) It seems at every
> turn, there's something supposedly better to use, and I'm left having to defend
> what I have chosen. If I don't give in, I'm too stubborn. But if I do, Loci
> more closely resembles a smorgasbord of technologies.

:) I'm sorry for having caused the early morning convulsions, Jeff. I'm
just trying to get a handle on the big picture. I wasn't actually
suggesting replacing Python with perl for the central technology, but
rather as one of the toolsets that supports your central technology. The
problem I see with being too doctrinaire as to the languages used is that
you run the risk of alienating some that support your model and see
alternative ways of accomplishing a task (that may have already been solved
with great effort) using another approach. As long as using it doesn't
conflict with your central goal, I don't see the problem.

> I am trying to keep Loci homogeneous in terms of technology. I think Python is a
> knock-out language, beating even Perl in most respects. I know Perl is very
> prominent in sequence analyses...It's prominent in just about everything. But I
> think Python is not far behind in acceptance, and is gaining momentum.
>
> What can we do?
> If there is absolutely no other choice, we can go with
> something in Perl, ***providing we consider it a temporary option. If we can
> find something later in Python or can convert it to Python, then we will. But
> don't give in too easily.

Hear, hear. That's all I'm suggesting. However, IMHO excluding biosequence artists on the basis of what language they choose is certain to make for bad blood in the community and that's not what we want. As long as there's a standard way for the components to interact, anything that contributes to the effort should be considered. Speaking of which, one of the things that attracted me to this approach was its close coding and thematic relation to the GNOME project. One way to rationalize the different language issue would be to build the components using GNOME's ORBit, which is a lightweight, low-memory, GPLed, CORBA-compliant ORB (albeit written in C, but I'm not sure that would affect much). Info is at:

http://www.labs.redhat.com/orbit/

Has that approach been evaluated? I'm certainly not trying to throw grit in the gastank - it's something that I'm currently investigating and so far it seems to be quite promising. If the list can give me reasons why it's a bad idea, I'd very much appreciate it.

And finally, I appreciate the difficulties of trying to herd a bunch of wild, perl/C/scheme/lisp/etc-crazed code weasels.

From one of the weasels,
Cheers
Harry

From david.lapointe at umassmed.edu Sun Jan 24 13:01:56 1999
From: david.lapointe at umassmed.edu (david.lapointe@umassmed.edu)
Date: Fri Feb 10 19:18:08 2006
Subject: [Pipet Devel] BioML vs BSML
In-Reply-To:
Message-ID: <93307F07DE63D211B2F30000F808E9E525D661@edunivexch02.umassmed.edu>

There is a Perl-XML FAQ at http://www.perlxml.com/faq/perl-xml-faq.html

Also, Activestate.com has a bunch of perl mail-lists running, one of which is the PERL-XML list (http://www.activestate.com/lyris/lyris.pl). You can browse the archives as a guest.
CPAN has several XML related modules. http://www.perl.com/CPAN/modules/by-category/15_World_Wide_Web_HTML_HTTP_CGI /XML/ David > -----Original Message----- > From: Harry Mangalam [mailto:hjm@cx408397-a.irvn1.occa.home.com] > Sent: Sunday, January 24, 1999 12:38 AM > To: tulip-list@busboy.sped.ukans.edu > Subject: Re: [Pipet Devel] BioML vs BSML > > > Hi Jeff et al, > > As you probably have found out now, BioML is being used by > the bio.perl > group and the perl masters at perl.org already have a pretty > large archive of > useful scripts for manipulating XML. > > And it looks like, in a VERY fast spin thru the docs, that > using their XML > parser tools, it may be possible to use both of these XMLs, > using these perl > modules to handle parsing and interconversion. > > ie: > http://www.perl.com/pace/pub/perldocs/1998/11/xml.html > http://www.perl.com/pace/pub/perldocs/1998/12/cooper-01.html > > perl also has gtk bindings and in general, I've heard about > and done more > things using perl in the bio world than with python. Not to > say python > doesn't make some or even most things easier - just that perl > has a proven > track record in the bio area. > > It seems not to be at cross purposes to the objectives of > LOCI to implement > chunks of it in perl, no? As long as it remains easliy > implementable, and > usable, and freely re-distributable, perl is as much of an option as > python, isn't it? > > Also, how is the LOCI project planning on handling the display of the > results of this effort? Both the BioML and the BSML browsers that are > available are MS-centric and certainly do not use gtk. The > only browser > that I'm aware of that does is gzilla (www.gzilla.com) and mnemonic ( > http://www.mnemonic.org/). Are you planning on using either > of these for > your display dev platform? Or are you not implementing any display > technology? > > VisualGenomics is planning a Java-based BSML browser, but I'm > sure that it > won't use gtk, unless heavily funded. 
It'll probably be > written with the > swing classes - what's the redist controls on swing - I'm not a Java > follower - sorry. > > OK - enough mind rot from me for the present... > > Cheers > harry > > On Sat, 23 Jan 1999, J.W. Bizzaro wrote: > > > Harry, > > > > We had some rather lengthy exchanges about the use of BSML > and CML in Loci. We > > were looking for an XML that would work well for both > sequence and structure > > information. But it seemed that neither was good at both. > > > > In sort, we stepped back from the issue, and decided we > would try to support > > BOTH, until something better came along. We even > considered making our own XML > > (we would call it BMML, "BioMolecule Markup Language"). > What we really wanted > > was just one language that would give us an excellent > description of large bio > > macromolecules. Well, of course that would have been too > much to take on right > > now. > > > > Konrad actually has a lot to say about descriptive > languages for macromolecule > > structure. He had corresponded with Peter Murray Rust, the > author of CML and > > someone rather influential in the development of XML. > Konrad, as he can tell > > you, is not satisfied with any current format. > > > > BioML comes as a bit of a surprise to me. It seems to be > brand new. Looking it > > over a bit, it does seem to do a better job at handling > structure than BSML, and > > I like the inclusion of all sorts of biological information > (it can describe > > organisms as well, it seems)...although some may argue this > is "bloat". > > > > I would like Konrad to give us his impression of BioML. It > would be nice to use > > one XML rather than two. My big question, however, is the > licensing. The Web > > page says to contact David Fenyo about the "commercial" use > of BioML. I wonder > > if this is one of those "well, as long as you don't make > money on it" licenses. > > If so, we have problems: It won't fit with GNU GPL. 
> > > > I wrote a message to David Fenyo about this, and to see if > he can give us a > > contrast between BioML and BSML. A cc will be sent to the > Tulip list. > > > > > > Cheers! > > Jeff > > bizzaro@bc.edu > > > > > > Harry Mangalam wrote: > > > > > > Hi All, > > > > > > Having only recently come to this arena, what's the > group's evaluation of > > > the relative merits of BSML: > > > http://www.visualgenomics.com/sbir/rfc.htm > > > > > > vs BioML: > > > http://www.proteometrics.com/BIOML/ > > > > > > I'm still going thru the text of the specs but if any of > you have strong > > > arguments regarding either approaches, I'd very much > appreciate it. The > > > bioperl people seem to like the BioML approach. > > > > > > Also, if you have come to an approach that can > rationalize the competing > > > stds, please let me know. > > > > > > Cheers > > > Harry > > > > -- > > J.W. Bizzaro Phone: 617-552-3905 > > Boston College mailto:bizzaro@bc.edu > > Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ > > -- > > > > Cheers, > Harry > > Harry J Mangalam, Developmental + Cell Biology > Rm 4201, Biological Sciences II, UC Irvine, Irvine, CA, 92697 > (949) 824 4824[vox], (949) 824 8551[fax], mangalam@uci.edu > http://hornet.bio.uci.edu/~hjm/ > From bizzaro at bc.edu Sun Jan 24 16:40:10 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:08 2006 Subject: [Pipet Devel] BioML vs BSML References: Message-ID: <36AB933A.5A270BC2@bc.edu> Harry Mangalam wrote: > :) I'm sorry for having caused the early morning convulsions, Jeff. I'm > just trying to get a handle on the big picture. I wasn't actually suggesting > replacing Python with perl for the central technology, but rather as one of > the toolsets that supports your central technology. 
The problem I see with > being too doctrinaire as to the languages used is that you run the risk of > alienating some that support your model and see alternative ways of > accomplishing a task (that may have already been solved with great effort) > using another approach and as long as using it doesn;t conflict with your > central goal, I don't see the problem. > I don't want to be misunderstood regarding my position on various languages. One of the main goals of Loci is to provide a framework to unify many bio tools of different languages. ***But can we keep just the core of Loci virgin Python? It will be hard enough allowing all sorts of languages to be attached to the core. > Hear, hear. That's all I'm suggesting. However, IMHO excluding biosequence > artists on the basis of what language they choose is certain to make for bad > blood in the community and that's not what we want. As long as there's a > standard way for the components to interact, anything that contributes to > the effort should be considered. Speaking of which, one of the things that > attracted me to this approach was it's close coding and thematic relation to > the GNOME project. One way to rationalize the different language issue > would be build the components using the GNOME's ORBit definition, which is a > lightweight, lo-memory, GPLed, CORBA-compliant ORB (albeit written in C, but > I'm not sure that would affect much). Info is at: > Yes, we have considered CORBA and ORBit. Right now there is no decent free implementation of CORBA for Python. But there is an effort underway to make Python bindings to ORBit, which we will consider. It is not a bad idea. CORBA may be the best way for tools of various languages to connect to the Loci core without going thru the Gatekeeper. So, we may very well see various different GUI attached to Loci. It's just that we won't do anything like that for the _core_. ***Remember, the Loci core won't contain even one analysis tool! 
We are considering the core to be very small, consisting of Python/C GTK/GNOME modules. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Sun Jan 24 18:58:16 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:08 2006 Subject: [Pipet Devel] BioML vs BSML References: <93307F07DE63D211B2F30000F808E9E525D661@edunivexch02.umassmed.edu> Message-ID: <36ABB397.734DAB46@bc.edu> Thanks for the info, David. In case we start to believe there is nothing for XML under Python, here is a link to the XML-SIG (Special Interest Group): http://www.python.org/sigs/xml-sig/ You will find links there to much of the work being done with Python-XML, and there is a lot. Also, here is a link to the Python-CORBA SIG: http://www.python.org/sigs/do-sig/ Jeff bizzaro@bc.edu david.lapointe@umassmed.edu wrote: > > There is a Perl-XML FAQ at http://www.perlxml.com/faq/perl-xml-faq.html > > Also, Activestate.com has a bunch of perl mail-lists running, one of which > is the PERL-XML list (http://www.activestate.com/lyris/lyris.pl). You can > browse the archives as a guest. CPAN has several XML related modules. > > http://www.perl.com/CPAN/modules/by-category/15_World_Wide_Web_HTML_HTTP_CGI > /XML/ > > David -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From hinsen at cnrs-orleans.fr Mon Jan 25 05:11:54 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:08 2006 Subject: [Pipet Devel] BioML vs BSML In-Reply-To: <36AA5475.7BAE2552@bc.edu> (bizzaro@bc.edu) References: <36AA5475.7BAE2552@bc.edu> Message-ID: <199901251011.LAA21088@dirac.cnrs-orleans.fr> > I would like Konrad to give us his impression of BioML. It would be I don't think my opinion is so relevant; my field of work is rather different from the Loci project. 
I work on structures, and BioML does not seem to have any provision for structures at all. Which is fine, of course, not everything has to be designed for my needs ;-) My complaint with CML is that it claims to handle biomolecular structures and does it badly.

> the licensing. The Web page says to contact David Fenyo about the
> "commercial" use of BioML. I wonder if this is one of those "well,

I am not even sure that a data format is copyrightable. If it is, the current downloadable DTD does not contain any copyright statement or usage restrictions, so I don't see why it shouldn't be used for commercial applications.

That aside, I did notice a couple of strange features in and about BioML that make me wonder whether it is the format of choice. First, and most importantly, I have the impression that the inventors have not quite understood the point of XML - separating content from layout. BioML contains some purely graphical entity definitions, for example &paragraph; defined as &newline;&tab;. In my opinion such things should never appear in XML files. Paragraphs should be marked up with a paragraph tag, whose visual interpretation is left to a stylesheet definition.

Second, the BioML inventors seem to be more Windows-centric than Microsoft itself. Who would have the crazy idea of offering documentation in portable HTML format only as a self-extracting archive for Windows? Of course this doesn't affect the language, but I'd hate to see the next release contain tags for defining COM objects...

Konrad.
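
[Editorial illustration] The point above about keeping layout out of the markup can be shown with a small sketch. It assumes a made-up `<note>` document with `<p>` paragraph tags (not BioML), and the `render` function and its parameters are hypothetical names chosen for this example. The same content is rendered two ways, with the visual choices living entirely outside the XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: paragraphs are marked up, not laid out.
SOURCE = "<note><p>First paragraph.</p><p>Second paragraph.</p></note>"


def render(xml_text, separator, indent):
    """Apply a presentation choice (separator, indent) to marked-up paragraphs."""
    root = ET.fromstring(xml_text)
    return separator.join(indent + (p.text or "") for p in root.iter("p"))


# Two "stylesheets" for the same content.
plain = render(SOURCE, "\n", "\t")    # newline + tab style
spaced = render(SOURCE, "\n\n", "")   # blank line between paragraphs
```

Because the document only says where paragraphs begin and end, changing the look means changing the arguments to `render`, not the data.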
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From justin at ukans.edu Tue Jan 26 02:21:33 1999 From: justin at ukans.edu (Justin Bradford) Date: Fri Feb 10 19:18:08 2006 Subject: [Pipet Devel] BioML vs BSML In-Reply-To: <199901251011.LAA21088@dirac.cnrs-orleans.fr> Message-ID: > I don't think my opinion is so relevant; my field of work is rather > different from the Loci project. I work on structures, and BioML > does not seem to have any provision for structures at all. Which is > fine, of course, not everything has to be designed for my needs ;-) > My complaint with CML is that it claims to handle biomolecular > structures and does it badly. Does BSML not fulfill all of the requirements Loci needs? I'm guessing so, since CML was also planned. If so, what's missing? A visualization program is going to have to know the format of the data it gets back from the analysis program (obviously), so the XML translation wrappers will have to be consistent. Now, we could use two different languages, but a viewer may want data from two different tools, each with a different ML (markup language). Also, we'll be wanting to chain several tools together, which is going to require tools taking input data from a ML, right? But we also want control information tagging along with the object? And that would also be XML data? Furthermore, I'd like it if this thing could query/update databases, too (ie, a glyph for submitting my new protein structure to Brookhaven, or get the sequence for some gene out of the GDB, etc.) Now let me see if I understand the system so far. Paos is the network transport layer. 
But which end does the server run on? Jeff made a comment earlier implying the Paos server runs on the user's machine. One client is the GCL/viewer/monitor and one is on the actual machine running the analysis tool. But how would a connection be made between the server and the analysis client? Doesn't the Paos server have to be on the analysis end?

Also, a workflow/batch control system is in charge of directing the movements of the object (via Paos). In case of failure, the Paos object is updated with some exception, and the workflow system is notified and deals with it appropriately.

Throughout this process, the workflow system is also updating the Paos object with current status and the analysis programs update the object (or create new ones?), which the monitor client is displaying for the user. When complete, the visualization/viewer program is notified, takes the Paos object and renders it for the user.

Am I close?

If so, it makes sense to use the Paos object to store control, exception, and status info. Data for analysis and analyzed data are stored in separate attributes. The gatekeeper takes the data from the appropriate attribute (as told by relevant control information), modifies it as necessary for the analysis tool, and runs that tool. Output is then committed to the Paos object (after conversion to the appropriate XML dialect by the gatekeeper), and the workflow system decides what to do next (depending on control info), until eventually, it is handed back to the user's client.

In this model, the workflow system is a Paos server/client combo. It would get the original object from the user, hand that to an analysis server, but keep a local copy updated, which the user (status monitor) would access for updates. When one analysis step is done (and it has resynced its copy of the remote object) it would delete the object on the analysis server (remote object), and then repeat the whole process (i.e. give the object to the next analysis server, ...)
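
[Editorial illustration] The flow described above could be sketched roughly as follows. This is only a minimal sketch under stated assumptions: `BatchObject` and `run_batch` are hypothetical names, plain Python callables stand in for the analysis servers, and nothing here is the Paos or Loci API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class BatchObject:
    """Hypothetical stand-in for the object the workflow system moves around."""
    control: List[str]                 # names of the analysis steps, in order
    data: str                          # data to be analyzed (e.g. XML text)
    results: List[str] = field(default_factory=list)
    status: str = "queued"
    exception: Optional[str] = None


def run_batch(obj: BatchObject,
              servers: Dict[str, Callable[[str], str]]) -> BatchObject:
    """Hand the data to each step in turn, updating status on the object."""
    current = obj.data
    for step in obj.control:
        obj.status = "running " + step
        try:
            current = servers[step](current)
        except Exception as err:
            # Record the failure on the object and stop the batch.
            obj.exception = f"{step}: {err}"
            obj.status = "failed"
            return obj
        obj.results.append(current)
    obj.status = "done"
    return obj
```

A status monitor would only need to read `status` and `exception` from the workflow system's copy, which is the part of the design that lets the user check on a batch without touching the analysis servers.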
All the user client stuff accesses the workflow system directly, which deals with the individual analysis servers. This runs as a separate process, so you might have a server running this. The client starts up his Loci GCL program on a networked computer anywhere, builds the analysis batch, starts it, gets an ID number, and can close the program and walk away. Then, from any other computer with Loci (or via the web when that interface is done), the user enters the batch ID and can see everything that has happened to it so far along with its current status. When it's done, the user can save the object locally for future reference (or maybe it's moved to a networked Loci archive system [just a Paos server]).

Of course, the workflow process could be run locally as well, along with all of the analysis tools. Also, the workflow system could implement more than just a Paos network connection to the analysis programs, such as CORBA, COM, IRC (biobots!), etc., all of which would be transparent to the client tools.

So is that what everyone was already thinking?

Also, whenever I said "analysis tool/server", that could be replaced with "database query/update".

Now what does the query language look like, and how do we embed info from analysis and db access early in the batch into later queries? Especially if we have multiple XML dialects that the tools speak in. Ugh. Well I have 2.5 hours of day-dreaming/class tomorrow to come up with something.

Sorry for the ramblingness...

Justin Bradford
justin@ukans.edu

From bizzaro at bc.edu Tue Jan 26 18:58:44 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:08 2006
Subject: [Pipet Devel] BioML vs BSML
References: <36AA5475.7BAE2552@bc.edu> <199901251011.LAA21088@dirac.cnrs-orleans.fr>
Message-ID: <36AE56B4.3D1429F0@bc.edu>

Konrad Hinsen wrote:
> I don't think my opinion is so relevant; my field of work is rather
> different from the Loci project. I work on structures

Are you kidding? Half of Loci will be for structural analyses!
People think bioinformatics is just about sequence analyses, and I believe wrongly so. Because many people exclude structural analyses, we were careful to name The BIC Group, "The Biomolecular Informatics and Computation Group". So we are involved in informatics plus anything else that involves computers and biology. I want to make certain that we include structural analysis tools in our list of analysis tools to be used.

> and BioML
> does not seem to have any provision for structures at all. Which is
> fine, of course, not everything has to be designed for my needs ;-)

I recall some BioML examples with structural data. Unless you're talking about BSML. But you'll say that including structural data and making a good provision for it are completely different ;-)

> I am not even sure that a data format is copyrightable. If it is, the
> current downloadable DTD does not contain any copyright statement or
> usage restrictions, so I don't see why it shouldn't be used for
> commercial applications.

Hmmm. And we didn't hear back from them.

> That aside, I did notice a couple of strange features in and about
> BioML that make me wonder whether it is the format of choice. First,
> and most importantly, I have the impression that the inventors have
> not quite understood the point of XML - separating content from
> layout. BioML contains some purely graphical entity definitions, for
> example &paragraph; defined as &newline;&tab;. In my opinion such
> things should never appear in XML files. Paragraphs should be marked
> up with a paragraph tag, whose visual interpretation is left to a
> stylesheet definition.

That is strange and maybe a good reason to not use it.

> Second, the BioML inventors seem to be more Windows-centric than
> Microsoft itself. Who would have the crazy idea of offering
> documentation in portable HTML format only as a self-extracting
> archive for Windows?
> Of course this doesn't affect the language,
> but I'd hate to see the next release contain tags for defining
> COM objects...

Windows is the world...not.

Jeff
--
J.W. Bizzaro Phone: 617-552-3905
Boston College mailto:bizzaro@bc.edu
Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/
--

From bizzaro at bc.edu Tue Jan 26 20:04:11 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:08 2006
Subject: [Pipet Devel] BioML vs BSML
References:
Message-ID: <36AE660B.4D9707B7@bc.edu>

Justin Bradford wrote:
> Does BSML not fulfill all of the requirements Loci needs?

The Bioinformatic "Sequence" ML is pretty much for just that. Although they claim you can embed a PDB (Protein Data Bank) file inside of BSML. But Konrad is not a fan of PDB either.

> I'm guessing so, since CML was also planned.
> If so, what's missing?

BSML is missing any decent description of structure, and CML is missing an acceptable description of structure for molecules larger than what organic chemists deal with. We actually can ignore the small chemical descriptions for Loci. If we just had something that was as good with sequences as BSML and as good with large molecule structure as CML is with small molecule structure.

> A visualization program is going to have to know the format of the data it
> gets back from the analysis program (obviously), so the XML translation
> wrappers will have to be consistent. Now, we could use two different
> languages, but a viewer may want data from two different tools, each with
> a different ML (markup language).

How about making our own XML? I think having four XML's has already diluted the field so that we can't complain about our XML being a proprietary format. I think Justin and Konrad could coordinate this effort, and the others can offer input on sequence representations. Really, we can get much of the sequence part from what we like about BSML and BioML.
This may actually be necessary if we are to embed queries and commands into the documents. Konrad, you thought we might want to do this back when we had only three people involved. Maybe we can call it "LocusML" or "Bio-Object ML" (BOML) or "Bio-Macromolecule ML" (BMML). Give me some feedback. > Also, we'll be wanting to chain several tools together, which is going to > require tools taking input data from a ML, right? Yep. > But we also want control information tagging along with the object? And > that would also be XML data? Yepper. > Furthermore, I'd like it if this thing could query/update databases, too > (ie, a glyph for submitting my new protein structure to Brookhaven, or get > the sequence for some gene out of the GDB, etc.) You mean have a Loci _tool_ for this? You're not talking about XML here. > Now let me see if I understand the system so far. > Paos is the network transport layer. But which end does the server run on? > Jeff made a comment earlier implying the Paos server runs on the user's > machine. I believe we can have multiple Paos servers. Exactly where they go, I'm not sure. BTW, Carlos wrote in some detail about Paos and Loci in his e-mail messages from Monday. > One client is the GCL/viewer/monitor and one is on the actual > machine running the analysis tool. But how would a connection be made to > between the server and the analysis client? Doesn't the Paos server have > to be on the analysis end? (I'm sorry about using the word "client" to describe the user's machine. Of course it also describes a program that communicates with a server. When I say client, I mean local machine.) Yes, I think Paos can reside on both the server and client. Carlos will have some documentation for us that can clear things up, and I think there is a README at the Paos Web site. @@@ > Also, a workflow/batch control system is in charge of directing the > movements of the object (via Paos). 
> In case of failure, the Paos object is
> updated with some exception, and the workflow system is notified and deals
> with it appropriately.

Yes sir!

> Throughout this process, the workflow system is also updating the Paos
> object with current status

The XML object can be changed, yes.

> and the analysis programs update the object
> (or create new ones?), which the monitor client is displaying for the
> user.

Yes, the GCL glyph, which can open a window to show current status.

> When complete, the visualization/viewer program is notified, takes
> the Paos object and renders it for the user.

Right!

> Am I close?

Oh ya!

> If so, it makes sense to use the Paos object to store control, exception,
> and status info. Data for analysis and analyzed data are stored in
> separate attributes.

Yes. These are complications that may require us to write our own XML.

> The gatekeeper takes the data from the appropriate
> attribute (as told by relevant control information), modifies it as
> necessary for the analysis tool, and runs that tool.

Now we are back to analyzing the XML data (Paos object), back up to where I typed @@@. These are not two types of analyses. The gatekeeper will work with the workflow system, etc.

> Output is then committed to the Paos object (after conversion to the
> appropriate XML dialect by the gatekeeper), and the workflow system
> decides what to do next (depending on control info), until eventually, it
> is handed back to the user's client.

Yes! I think you know just what I've been thinking.

> In this model, the workflow system is a Paos server/client combo. It
> would get the original object from the user, hand that to an analysis
> server, but keep a local copy updated, which the user (status monitor)
> would access for updates.

I'm not sure about keeping a local copy of the data. You say that the data would be updated, which would require the whole XML object to be transferred many times.
I was thinking only once at the end, but the analysis locus could just keep reporting what is being done...like writing a log file. > ...and then repeat the whole process (ie. > give the object to the next analysis server, ...) Yes, when GCL is used to automate some analyses. > All the user client stuff access the workflow system directly, which deals > with the individual analysis servers. This runs as a separate process, so > you might have a server running this. The client starts up his Loci > GCL program on a networked computer anywhere, builds the analysis batch, > starts it, gets an ID number, and can close the program and walk away. I never thought of that, but it's a great idea! > Then from any other computer with Loci (or via the web when that interface > is done), enters the batch ID, and can see everything that has happened to > it so far along with it's current status. Hmmm. Turning the client off and getting the data from another client, means the server needs to know the original client is off and that the information should be held until the ID is provided. I think it'll work. The server may keep a copy on file for a time specified by the user. That way, the server doesn't have to probe for the client loci that sent the data. > When it's done, the user can > save the object locally for future reference (or maybe it's moved to a > networked Loci archive system [just a Paos server]). Yes. The object will appear to the user as a Loci object in the file open dialog, and it will appear as a larger glyph on the benchtop. It won't have to go through any translation again. > Of course, the workflow process could be run locally as well, along with > all of the analysis tools. Yes yes yes yes!!! > Also, the workflow system could implement more > than just Paos network connection to the analysis programs, such as CORBA, > COM, IRC (biobots!), etc. all of which would be transparent to the client > tools. Yes! 
Each connection is filtered by a porta locus, just like the Porta Internet & Gatekeeper combination. > So is that what everyone what already thinking? The rain in Spain falls mainly on the plane...Yes! By George I think he's got it! > Also, whenever I said "analyis tool/server", that could be replaced with > "database query/update". It sure can. > > Now what does the query language look like, and how do we embed info from > analysis and db access early in the batch into later queries. Especially > if we have multiple XML dialects that the tools speak in. Ugh. Well I have > 2.5 hours of day-dreaming/class tomorrow to come up with something. Well, again, if we make up our own system it will be less complicated...but we'll have more work. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From justin at ukans.edu Tue Jan 26 22:52:19 1999 From: justin at ukans.edu (Justin Bradford) Date: Fri Feb 10 19:18:09 2006 Subject: [Pipet Devel] BioML vs BSML In-Reply-To: <36AE660B.4D9707B7@bc.edu> Message-ID: > How about making our own XML? I think having four XML's has already diluted the > field so that we can't complain about our XML being a proprietary format. I > think Justin and Konrad could coordinate this effort, and the others can offer > input on sequence representations. Really, we can get much of the sequence part > from what we like about BSML and BioML. I was thinking the same thing too. Nothing seems to do exactly what we want, and it will be simpler for querying purposes if we only deal with one XML. Conversion to other formats could also be done, for exporting the data outside of the Loci system. > Give me some feedback. It would help if the people with lots of experience using existing formats could comment on how they'd like it to work. > > But we also want control information tagging along with the object? And > > that would also be XML data? 
What about storing control information in the Paos object, rather than in the XML? Or could we make the Paos object a mirror of the XML format? The purpose of this should become clearer as I explain other things. > > Furthermore, I'd like it if this thing could query/update databases, too > > (ie, a glyph for submitting my new protein structure to Brookhaven, or get > > the sequence for some gene out of the GDB, etc.) > > You mean have a Loci _tool_ for this? You're not talking about XML here. Well, whatever we use to describe queries should be capable of querying and updating databases, ideally. That way, a database dependent step could be as simple as an analysis step. This would require a gatekeeper interface, of course. I just want to make sure we can fit it in seamlessly. > Yes, I think Paos can reside on both the server and client. Carlos will have > some documentation for us that can clear things up, and I think there is a > README at the Paos Web site. A Paos client has to make a connection to a Paos server. Therefore, there must be a Paos server answering requests wherever an analysis tool is located. > > Also, a workflow/batch control system is in charge of directing the > > movements of the object (via Paos). In case of failure, the Paos object is > > updated with some exception, and the workflow system is notified and deals > > with it appropriately. > > Yes sir! But there has to be something around constantly to monitor these Paos objects throughout their lifetime. This would be the workflow system (wfs). It would be responsible for directing objects, keeping track of their status, and providing an interface for the user to check up on it. > > Throughout this process, the workflow system is also updating the Paos > > object with current status > > The XML object can be changed, yes. Now is the XML object in the Paos object, or are they the same thing? Since Paos can deliver updates on only specific attributes, I wanted to take advantage of that. 
As I mentioned earlier, the Paos object could be a representation of the XML format we create, or it could contain XML data from analysis steps. In the latter case, other attributes of the Paos object would contain status and control information. That way it could be updated "atomically", regardless of the other XML data it contains. > > If so, it makes sense to use the Paos object to store control, exception, > > and status info. Data for analysis and analyzed data are stored in > > separate attributes. > > Yes. These are complications that may require us to write our own XML. Again, would it make sense for this to be in the Paos object, the XML it contains, or is there a difference? > > The gatekeeper takes the data from the appropriate > > attribute (as told by relevant control information), modifies it as > > necessary for the analysis tool, and runs that tool. > > Now we are back to analyzing the XML data (Paos object), back up to where I > typed @@@. These are not two types of analyses. The gatekeeper will work with > the workflow system, etc. Maybe. I was thinking that the Paos object contained the XML data in an attribute, which was extracted and presented to the gatekeeper depending on what it was supposed to do with it. But if the whole Paos object is an XML representation, then the gatekeeper takes what it needs. > > In this model, the workflow system is a Paos server/client combo. It > > would get the original object from the user, hand that to an analysis > > server, but keep a local copy updated, which the user (status monitor) > > would access for updates. > > I'm not sure about keeping a local copy of the data. You say that the data > would be updated, which would require the whole XML object to be transferred many > times. I was thinking only once at the end, but the analysis locus could just > keep reporting what is being done...like writing a log file. Ok.
I think you had envisioned the gatekeeper just dealing with the whole XML file, which contained control and status info, and was stored in the Paos object. I want to take advantage of the object nature of Paos, and use multiple attributes on the object. One for control, one for status, one for data storage (the XML returned by the analyses). That way status could be updated independently of the rest of the XML data. Of course, it would be even better if the Paos object was simply a representation of the XML data. Then analyses could be updated atomically, too. Also, this way, the user client wouldn't have to parse XML. It would be provided with an object-oriented view of it right away. Sort of like DOM, for which we could even provide an interface, too. > > ...and then repeat the whole process (ie. > > give the object to the next analysis server, ...) > > Yes, when GCL is used to automate some analyses. GCL is used to build the control data. The workflow system does the work, according to the control information. > > All the user client stuff access the workflow system directly, which deals > > with the individual analysis servers. This runs as a separate process, so > > you might have a server running this. The client starts up his Loci > > GCL program on a networked computer anywhere, builds the analysis batch, > > starts it, gets an ID number, and can close the program and walk away. > > I never thought of that, but it's a great idea! I had considered the possibility of our objects (where, for clarification, an object means a batch of controls, data to be analyzed, data already analyzed, and various status information) roaming independently of a "central" server. They could be passed from gatekeeper to gatekeeper directly. However, that would make it impossible to monitor them, unless the object "called home" every now and then. But I don't like that. It makes more sense for the user to query the object when the user wants information.
For that to be possible, there has to be some constant, central server which is watching the object. This would be the workflow system. It's in charge of directing the object, and it constantly keeps tabs on its status. This is why I want atomic updates on status info. The workflow system (wfs) is really a Paos server, but it only talks to user clients. However, it pretends to be a Paos client to communicate with the Paos server associated with an analysis tool. When sending an object to be analyzed, the wfs commits the object to the remote (analysis) server. It also requests notification on all updates to its status attributes. The copy of the object local to the wfs is updated with the remote status info. When analysis is complete, the wfs syncs its copy with the remote copy, and then removes the remote copy. Now, at any time, a user client can access the wfs, and get the status information from the copy on the wfs. The user will always know where the wfs is, since it's running locally (either just for that one user, or maybe a department- or university-wide instance). When an object completes, it gets moved to an archive section of the wfs. The user client accesses this object via a unique ID. Since the wfs is networked, the object can be accessed from any Loci user client. The user just has to know the wfs location and the object ID. > Hmmm. Turning the client off and getting the data from another client, means > the server needs to know the original client is off and that the information > should be held until the ID is provided. I think it'll work. The server may > keep a copy on file for a time specified by the user. That way, the server > doesn't have to probe for the client loci that sent the data. See above; the wfs doesn't care if the original client is still around. It just holds onto the object until someone comes along to retrieve it. The wfs shouldn't seek out the user client. The user client comes to it.
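To make the bookkeeping described here concrete, this is a minimal Python sketch of the workflow system as a plain in-memory class. It is not the Paos API: the names BatchObject, WorkflowSystem, start, update_status, and complete are all made up for illustration, and a real wfs would sit behind a network server rather than a dictionary. The sketch only shows the separate control/status/data attributes, the ID handed back when a batch is started, and the move to an archive on completion.

```python
import itertools


class BatchObject:
    """Hypothetical stand-in for the batch object (not a Paos class)."""

    def __init__(self, control, data):
        self.control = control   # which analyses to run, in order
        self.status = "queued"   # small field, updated often
        self.data = data         # input for the next analysis step
        self.results = []        # XML returned by each analysis step


class WorkflowSystem:
    """Toy in-memory wfs: issues IDs, tracks status, archives finished objects."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.active = {}
        self.archive = {}

    def start(self, control, data):
        # The caller gets an ID back and can disconnect right away.
        object_id = next(self._ids)
        self.active[object_id] = BatchObject(control, data)
        return object_id

    def update_status(self, object_id, status):
        # Only the status attribute changes; the data is not re-sent.
        self.active[object_id].status = status

    def complete(self, object_id, result_xml):
        # Sync the result, then move the object to the archive section.
        obj = self.active.pop(object_id)
        obj.results.append(result_xml)
        obj.status = "done"
        self.archive[object_id] = obj

    def status(self, object_id):
        # Any client that knows the ID can ask, whether or not it started the batch.
        obj = self.active.get(object_id) or self.archive[object_id]
        return obj.status


wfs = WorkflowSystem()
job = wfs.start(control=["blast"], data="<seq>ACGT</seq>")
wfs.update_status(job, "running blast")
print(job, wfs.status(job))
wfs.complete(job, "<hits/>")
print(wfs.status(job))
```

Because status lives in its own attribute, update_status touches one small field and never re-sends the analysis data, which is the reason for wanting atomic status updates in the first place.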
Also, the object ID is provided when the object is first started. Click "Start Analysis", and the wfs responds with "OK, here's your ID." The user client should have an option to keep track of those for you, but the user should also be able to access the object from any other Loci user client, just using that info. > Well, again, if we make up our own system it will be less complicated...but > we'll have more work. That's probably the next step. Although, I'm curious about your thoughts on how to use the Paos object. I'm becoming rather fond of the object representation of the XML format. Although, I should make sure Paos is capable of this. Carlos, can it handle complex Python container classes? And can we update elements in it atomically? If not, just using separate attributes on the Paos object for the various components would work (the status, control, query, and data attributes). Justin Bradford justin@ukans.edu From hinsen at cnrs-orleans.fr Wed Jan 27 11:04:58 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:09 2006 Subject: [Pipet Devel] BioML vs BSML In-Reply-To: <36AE56B4.3D1429F0@bc.edu> (bizzaro@bc.edu) References: <36AA5475.7BAE2552@bc.edu> <199901251011.LAA21088@dirac.cnrs-orleans.fr> <36AE56B4.3D1429F0@bc.edu> Message-ID: <199901271604.RAA18094@dirac.cnrs-orleans.fr> Are you kidding? Half of loci will be for structural analyses! People think bioinformatics is just about sequence analyses, and I believe wrongly so. Fine, then I can perhaps contribute more than just Python expertise... I recall some BioML examples with structural data. Unless your talking about Such as? I didn't find any (but after the second example my network connection broke down...), and neither did I find any way to specify structure in the BioML documentation. 
Konrad -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Wed Jan 27 12:42:59 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:09 2006 Subject: [Pipet Devel] BioML vs BSML In-Reply-To: <36AE660B.4D9707B7@bc.edu> (bizzaro@bc.edu) References: <36AE660B.4D9707B7@bc.edu> Message-ID: <199901271742.SAA17578@dirac.cnrs-orleans.fr> Konrad, you thought we might want to do this back when we had only three people involved. Maybe we can call it "LocusML" or "Bio-Object ML" (BOML) or "Bio-Macromolecule ML" (BMML). Fine with me, and I'd certainly use it for other applications as well. On the other hand, it is possible to design DTDs by extending existing ones. Perhaps this is a good idea to save effort and keep compatibility to some extent. Konrad -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From bizzaro at bc.edu Mon Jan 4 12:26:12 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:11 2006 Subject: [Pipet Devel] [Fwd: Development] Message-ID: <3690F9B4.70E55F46@bc.edu> >From Thomas... -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- -------------- next part -------------- An embedded message was scrubbed... 
From: Thomas Sicheritz Subject: Development Date: Mon, 4 Jan 1999 09:49:37 +0100 (MET) Size: 2305 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990104/9ebecdf5/attachment.mht From bizzaro at bc.edu Mon Jan 4 14:59:13 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:11 2006 Subject: [Pipet Devel] to do list Message-ID: <36911D91.37748194@bc.edu> Justin, Here is a copy of the to do list I recently sent to the others: The next step is to build a list, not of analysis tools, but of dynamic client-side interfaces. EMBOSS will take care of the big analysis tools for us. For now, here is a brief list of some dynamic "loci" I've been thinking of. Please feel free to add to this: (1) Benchtop/workspace. GUI representation of all data objects (files, documents, graphs) and possibly various loci. Also may be used for automation of analyses (recall GCL?). (2) File translation interface: to read in various DNA/protein document formats and convert them to XML. Also may be used to query databases and sort/compile documents. (3) Sequence visualization/editing tool: to manipulate DNA/protein sequences (4) Sequence comparison tool: to show multiple sequences aligned or translated.
May also perform some functions of (6) (5) 3D visualization tool: to display molecules as 3D structures, with emphasis on a schematic/cartoon representation. (6) Graphing tool: to display plots against sequences and to make simple graphs. Some may argue this isn't needed, but I need it for my programs, so others may too ;-) (7) HTML browser implementation: separate from the other tools, this would be a way for anyone with a browser to access analysis loci. The best approach may be for each of us to pick a single tool to concentrate on. And remember, most of these tools will be XML browsers of a sort. We should also make a list of "trivial" analysis/conversion tools for the client-side. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Mon Jan 4 15:45:45 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:12 2006 Subject: [Pipet Devel] [Fwd: Development] References: <3690F9B4.70E55F46@bc.edu> Message-ID: <36912879.DBCE52D@bc.edu> > Hmm - right now I feel I have time to get familiar with python ... so maybe I > am going to try to build a sequence editor ... any suggestions ? Great! The sequence vis/edit locus should be closely tied to the sequence comparison locus: I think users may open up various sequences (using the file/document translation locus) into the sequence comparison locus and then double-click on a sequence to see it in the vis/edit tool. Users may also load a sequence directly into the vis/edit locus, bypassing the comparison locus. Once in the vis/edit locus, sequences should be treated like an image in the GIMP or Photoshop: Users should be able to click-drag-select segments of a sequence just like an area in an image. I imagine the background of the selected segment would transition from white to black...or maybe a dashed-line box will surround the segment...I'd prefer the former. 
Once selected, the segments can be cut (^X) copied (^C) pasted (^V) or deleted (^D). The user should also be able to zoom in and out on the sequence...zooming in to the resolution of one residue. The mouse pointer can point out where selections or insertions occur. I'd also like to see a box on the side that shows the start and stop positions of selections, in numerical values. The menu bar should contain a file menu with open, close and exit...and an edit menu with copy, cut, paste and delete...maybe even undo...these are obvious standards. I can't recall exactly how your Tcl/Tk editor works. I may have described much of it already. I think this is a fun tool to be working on. Also. take a look at the graphics on this Web site: http://www.latrobe.edu.au/www/genetics/compmap.96.01.html It is a chromosome map comparison tool (which may be a part of what you're going to do...or another tool?), but I like the graphics. With the gnome-canvas widget (see below) we will be able to make anti-aliased shapes like this. > > I don't know anything about pythons way to handle classes - is there any > reason for me to code the sequence classes in C++ ? - or would it be enough > to let python handle the basic sequence object and code the heavy number > crunching part in C++/C ? >From my experience, Python may be better at handling this sort of thing than even C++, but Konrad is the best person to answer this question right now. By the way, C should be used for number crunching rather than C++. We discussed the "problems" with C++ before you came aboard, and we feel that C is more portable and more directly linked with Python and GTK. So the Tulip core distribution should be all Python and ANSI-C. Third party add-ons can be whatever...we just want the core to be consistent. > > I think I allready know from my Tcl/Tk sequence editor what > solutions/ways I definitely should avoid :-) > - anybody else with tips/hints/critics ? > If not, I am going to bugger my printer with ... 
some ... pages of python > and GTK manuals/references. > I'm sure you got my message about PyG Tools, but again, you may want to start at my page: http://www.uml.edu/Dept/Chem/BICGroup/PyGTools/ The first widget binding I think you'll want to get familiar with is the gnome-canvas (part of PyGNOME). It is supposed to be similar to the Tk canvas. -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Mon Jan 4 16:12:09 1999 From: bizzaro at bc.edu (J.W.
Bizzaro) Date: Fri Feb 10 19:18:12 2006 Subject: [Pipet Devel] [Fwd: Development] References: <3690F9B4.70E55F46@bc.edu> <36912879.DBCE52D@bc.edu> Message-ID: <36912EA9.437D88C4@bc.edu> > Once in the vis/edit locus, sequences should be treated like an image in the > GIMP or Photoshop: Users should be able to click-drag-select segments of a > sequence just like an area in an image. I imagine the background of the > selected segment would transition from white to black...or maybe a dashed-line > box will surround the segment...I'd prefer the former. > Hmmm. Now that I think about it, it should also be like a word processor. The user can position the mouse pointer between two residues and click on the left mouse button to move the "cursor" to that spot. He/she can even start typing out letters from the keyboard, inserting residues as they type. Hold down the shift key and press left or right arrow keys and the residues are selected one at a time. The backspace key will delete residues on the left...you get the idea :-) Also, when a sequence is edited, we should keep a note in the XML what was added or removed and where. So, maybe a red vertical line will appear in a sequence where something was...spliced. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From Thomas.Sicheritz at molbio.uu.se Tue Jan 5 04:22:07 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:12 2006 Subject: [Pipet Devel] [Fwd: Development] In-Reply-To: <36912879.DBCE52D@bc.edu> References: <3690F9B4.70E55F46@bc.edu> <36912879.DBCE52D@bc.edu> Message-ID: <13969.55581.685813.418675@beagle.bmc.uu.se> > I can't recall exactly how your Tcl/Tk editor works. I may have > described much of it already. I think this is a fun tool to be working > on. 
You can look at a VERY stripped tclet ( Tcl/Tk Applet) version at http://evolution.bmc.uu.se/~thomas/loci/xbblet.tcl -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ... From bizzaro at bc.edu Tue Jan 5 15:20:14 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:12 2006 Subject: [Pipet Devel] [Fwd: Development] References: <3690F9B4.70E55F46@bc.edu> <36912879.DBCE52D@bc.edu> <13969.55581.685813.418675@beagle.bmc.uu.se> Message-ID: <369273FE.388311F4@bc.edu> Thomas.Sicheritz@molbio.uu.se wrote: > > > I can't recall exactly how your Tcl/Tk editor works. I may have > > described much of it already. I think this is a fun tool to be working > > on. > > You can look at a VERY stripped tclet ( Tcl/Tk Applet) version at > http://evolution.bmc.uu.se/~thomas/loci/xbblet.tcl > Yes it is very much how I described :-) But the editing locus I was thinking of would make use of XML to display the context of the sequence...if available. That is, a sequence that comes from a GenBank document would _always_ retain information on where the UTR, CDS, exon and intron, etc. regions are. So I see a sequence in the editor displayed as a "genetic map". You know how molecular biologists draw out horizontal bars of different colors representing different regions, binding sites, etc. I think the editor can show a color-coded map with ACGT bases transposed over the map or maybe sitting below it. Looking at molecular bio journals might help to find an informative and attractive solution. I know we don't have the facility right now to go from GenBank to XML to representation. 
This facility might be a good starting point for Justin and others who are well versed in XML/DOM...hint hint ;-) Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Tue Jan 5 16:33:39 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:12 2006 Subject: [Pipet Devel] example GUIs Message-ID: <36928533.BAA16209@bc.edu> DNAstar and MacVector are two "competing" application suites for sequence analysis. They have some very nicely made GUIs.
Take a look: http://www.dnastar.com/products/products.html http://www.oxmol.com/prods/macvector/work/ I wonder if DNA sequence analysis tools should be different programs from protein (or polypeptide) sequence analysis tools, or maybe a single program such as the sequence editor can switch between the two? Of course they present some very different problems...but then again...? What do you guys think? We should also consider different types of genetic maps, according to the system: chromosome vs. bacterial circular genome vs. plasmid vs. viral genome. Even proteins can be represented in several different ways: primary/sequence vs. secondary vs. tertiary vs. quarternary. I'm just thinking about whether we'll need one big tool to show these or many smaller tools. I tend to favor many small loci here. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Tue Jan 5 16:43:06 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:12 2006 Subject: [Fwd: [Pipet Devel] [Fwd: Development]] Message-ID: <3692876A.FB4AD56B@bc.edu> ... -------------- next part -------------- An embedded message was scrubbed... From: "J.W.
Bizzaro" Subject: Re: [Pipet Devel] [Fwd: Development] Date: Tue, 05 Jan 1999 20:20:14 +0000 Size: 1995 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990105/8d927766/attachment-0001.mht From bizzaro at bc.edu Wed Jan 6 12:59:28 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:12 2006 Subject: [Pipet Devel] [Fwd: BioWidgets] Message-ID: <3693A480.91B49F47@bc.edu> Attached is from David... I have these sites bookmarked, and they are good examples. I recall David Searls's BioTk from a Gene-COMBIS article...very nice. Thomas, you're probably familiar with it. BTW, what happened to the "BioWidgets Consortium"? It hasn't been updated in 1.5 years! Many of the links are broken. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- -------------- next part -------------- An embedded message was scrubbed...
From: david.lapointe@umassmed.edu Subject: BioWidgets Date: Wed, 6 Jan 1999 11:18:51 -0500 Size: 1552 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990106/af6d37c2/attachment-0001.mht From bizzaro at bc.edu Fri Jan 8 02:54:53 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:12 2006 Subject: [Pipet Devel] more on gnome-canvas Message-ID: <3695B9CD.FA8B3670@bc.edu> FYI, from the GNOME home page: New GNOME Canvas Information The Canvas is a very exciting feature of GNOME. It allows very high level manipulation of objects. The programmer need not worry about handling the redrawing of the canvas during expose events, the Canvas does all this for you. The end result is a wonderful API that allows extremely rapid application development. The latest version of the Canvas in gnome-libs has the new antialiased rendering engine incorperated in it. See the GNOME Canvas Development page for screenshots and more information.
(Warning: graphics intensive) http://www.gnome.org/devel/canvas/ Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From Thomas.Sicheritz at molbio.uu.se Mon Jan 11 07:24:40 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:12 2006 Subject: [Pipet Devel] [Fwd: BioWidgets] In-Reply-To: <3693A480.91B49F47@bc.edu> References: <3693A480.91B49F47@bc.edu> Message-ID: <13977.58855.91987.738400@beagle.bmc.uu.se> J.W. Bizzaro writes: > I have these sites bookmarked, and they are good examples. I recall > David Searls's BioTk from a Gene-COMBIS article...very nice. Thomas, > you're probably familiar with it. BioTk was a nice start, but unfortunately they stopped the development before it became really useful - thats one of the reasons I hacked BioWish. > BTW, what happened to the "BioWidgets > Consortium"? It hasn't been updated in 1.5 years! Many of the links > are broken. I think the perltk based project moved to java, and some other biowidgets projects just stopped. Survivers: http://www.cbil.upenn.edu/bioWidgets/ http://www.ii.uib.no/~oleart/biowidgets.html -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ... From Thomas.Sicheritz at molbio.uu.se Mon Jan 11 08:19:00 1999 From: Thomas.Sicheritz at molbio.uu.se (Thomas.Sicheritz@molbio.uu.se) Date: Fri Feb 10 19:18:12 2006 Subject: [Pipet Devel] python data structure In-Reply-To: <3695B9CD.FA8B3670@bc.edu> References: <3695B9CD.FA8B3670@bc.edu> Message-ID: <13977.60819.945966.806093@beagle.bmc.uu.se> Hej all python gurus ... 
What python data type would you recommend for the class representation of a nucleotide sequence ? - string, list or array (module) ? I am not (yet) familiar with the performance questions of python types, but I got the impression that lists are very slow - and I have no idea how the array module is implemented. (btw I used strings in Tcl) J.W. Bizzaro writes: > I wonder if DNA sequence analysis tools should be different programs from > protein (or polypeptide) sequence analysis tools, or maybe a single > program such as the sequence editor can switch between the two? Of > course they present some very different problems...but then again...? > What do you guys think? In case of an editor/viewer - I vote for different programs/implementations. - Sequence analysis tools are just connected modules - e.g. the blast module/parser/filter is only slightly different for DNA or protein sequences. my 2 cents ... -thomas -- Sicheritz Ponten Thomas E. Department of Molecular Biology blippblopp@linux.nu BMC, Uppsala University BMC: +46 18 4714214 BOX 590 S-751 24 UPPSALA Sweden Fax +46 18 557723 http://evolution.bmc.uu.se/~thomas Molecular Tcl: http://evolution.bmc.uu.se/~thomas/tcl Molecular Linux: http://evolution.bmc.uu.se/~thomas/mol_linux De Chelonian Mobile ... The Turtle Moves ...
From bizzaro at bc.edu Tue Jan 12 03:01:36 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:13 2006 Subject: [Fwd: [Pipet Devel] [Fwd: BioWidgets]] Message-ID: <369B0160.9EC51307@bc.edu> >From Thomas... -------------- next part -------------- An embedded message was scrubbed... From: Thomas.Sicheritz@molbio.uu.se Subject: Re: [Pipet Devel] [Fwd: BioWidgets] Date: Mon, 11 Jan 1999 13:24:40 +0100 (MET) Size: 2438 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990112/6f23d485/attachment.mht
From bizzaro at bc.edu Tue Jan 12 03:02:49 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:13 2006 Subject: [Pipet Devel] python data structure Message-ID: <369B01A9.AD8F6B36@bc.edu> >From Thomas.... [Konrad, can you take a shot at this question?] -------------- next part -------------- An embedded message was scrubbed... From: Thomas.Sicheritz@molbio.uu.se Subject: [Pipet Devel] python data structure Date: Mon, 11 Jan 1999 14:19:00 +0100 (MET) Size: 2708 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990112/33cf084d/attachment.mht
From hinsen at cnrs-orleans.fr Tue Jan 12 04:04:35 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:13 2006 Subject: [Pipet Devel] python data structure In-Reply-To: <369B01A9.AD8F6B36@bc.edu> (bizzaro@bc.edu) References: <369B01A9.AD8F6B36@bc.edu> Message-ID: <199901120904.KAA14700@dirac.cnrs-orleans.fr> > >From Thomas....
> > [Konrad, can you take a shot at this question?] I'll try... > What python data type would you recommend for the class representation > of a nucleotide sequence ? > - string, list or array (module) ? > I am not (yet) familiar with the performance questions of python types, but > I got the impression that lists are very slow - and I have no idea how the > array module is implemented. (btw I used strings in Tcl) The main question is what operations you want to perform on nucleotide sequences. Here are some considerations: - Strings are compact and benefit from a large range of string operations (in module "string"). However, elements can only be characters, and strings are immutable, i.e. cannot be changed once created. So any modification requires constructing a new string. But being immutable can be an advantage as well, e.g. you can use strings as keys in dictionaries. - Lists can store any data type, and can be modified in a very general way (including insertion of lists etc.), but there are fewer operations available on them. - Tuples are just immutable lists. - Arrays don't seem to be very useful for non-numerical data, with two exceptions: they can most easily be accessed from C modules, and they facilitate certain structural operations. In terms of performance, there is not so much difference for basic operations (creation, indexing, etc.). The main concern should be to use as many built-in operations as possible for typical manipulations; any piece of Python code is much slower than a simple call to a built-in function implemented in C! So the first thing to do is to find out which operations are to be performed on nucleotide sequences, and which of them occur most frequently. Konrad.
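The trade-offs listed above can be tried out directly. What follows is only a rough sketch in present-day Python 3 syntax, not the Python 1.5 of this thread; the sequence and every variable name are made up for illustration, and nothing here is code from the Loci project.

```python
from array import array

seq = "ATGCGTTA"  # hypothetical nucleotide sequence

# Strings are immutable: "changing" one base means building a new string.
mutated = seq[:3] + "A" + seq[4:]

# Built-in string operations run in C, so they are preferable to
# hand-written Python loops for typical manipulations.
gc_count = seq.count("G") + seq.count("C")
complement = seq.translate(str.maketrans("ACGT", "TGCA"))
reverse_complement = complement[::-1]

# Because strings are immutable, they can serve as dictionary keys.
annotations = {seq: "example fragment"}

# Lists are mutable: bases can be replaced or inserted in place,
# but the string operations above are no longer available.
bases = list(seq)
bases[3] = "A"            # replace one base
bases[1:1] = ["N", "N"]   # insert a short list after the first base

# The array module stores compact typed values (here unsigned bytes)
# and also allows in-place changes.
codes = array("B", seq.encode("ascii"))
codes[3] = ord("A")
```

Which of the three fits best still depends on the operations that dominate, as noted above: mostly searching and slicing favours strings, heavy in-place editing favours lists or arrays.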
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais -------------------------------------------------------------------------------
From bizzaro at bc.edu Tue Jan 12 07:02:02 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:13 2006 Subject: [Pipet Devel] python data structure References: <369B01A9.AD8F6B36@bc.edu> <199901120904.KAA14700@dirac.cnrs-orleans.fr> Message-ID: <369B39BA.C71F482D@bc.edu> Konrad Hinsen wrote: > - Strings are compact and benefit from a large range of string operations > (in module "string"). However, elements can only be characters, > and strings are immutable, i.e. cannot be changed once created. > So any modification requires constructing a new string. But being > immutable can be an advantage as well, e.g. you can use strings as > keys in dictionaries. What are the limits on string sizes in Python (too lazy to look it up right now)? If it is 256, as with some languages, I imagine this presents a little problem. String immutability does also make sequence manipulation a bit awkward. > - Arrays don't seem to be very useful for non-numerical data, with two > exceptions: they can most easily be accessed from C modules, and > they facilitate certain structural operations. I have used arrays of characters in the past. Using parallel arrays can be a convenient way to index or "markup" sequences, i.e. the second array can be used to indicate where features start and stop. Another thought: Many analysis programs are limited by having to put everything into RAM, all in one shot. I tend to prefer keeping the sequence file open and reading in chunks at a time.
BTW, some simple database features of Python allow you to keep and work from a data structure stored as a file, correct? On the same note, system resources are growing enough that they can handle large sequences in RAM. But on the other hand, the sequencing projects are turning out larger sequence files. The human genome will be one of the largest sequences (how big? 100 Gb?), and I think the frog genome is several times larger (go figure). Imagine, seriously because this will be hot stuff in a few years, that someone using Loci/Tulip will want to manipulate parts of the human genome like they can now with BioWish and E. coli. > > In terms of performance, there is not so much difference for basic > operations (creation, indexing, etc.). The main concern should be to > use as many built-in operations as possible for typical manipulations; > any piece of Python code is much slower than a simple call to a > built-in function implemented in C! So the first thing to do is to > find out which operations are to be performed on nucleotide sequences, > and which of them occur most frequently. > Right, and just because I keep harping on Python, doesn't mean we can't turn to compiled C when we really need it...and we may with sequences ranging in the millions and billions (I sound like Carl Sagan). Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ --
From bizzaro at bc.edu Tue Jan 12 09:58:26 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:13 2006 Subject: [Pipet Devel] analysis management Message-ID: <369B6312.7E85D461@bc.edu> I thought I'd share some ideas I had about how the Loci user might manage multiple analyses. By management I mean keeping track of what has to be done and what was already done. The first idea I thought of some time ago but haven't mentioned yet. It is an expansion upon the concept of a log file.
Normally log files are generated one per run of a program. But I think we can change that a bit to suit the need of any scientist. You know a "good" scientist will write all experiment data in a physical journal. I think though that it is most inconvenient--a real headache in fact--to take everything that comes off of the screen and write it down. Even cutting out printouts and gluing them in can be a big hassle when you consider how much data a computer normally generates. (I in fact convinced my advisor to let me keep a computer-based journal--in HTML). Well, to get to the point, I think Loci should keep a running log of all actions. That is, record everything to a single file--in HTML--with links to data files, images, whatever. Even keeping track of times down to the second...better than anyone can do with a notebook. Imagine having a Web-browsable cataloged journal of all Loci analyses! The second idea I mentioned already and have on the Loci Web page. It has to do with using icons and arrows to represent documents and analyses being performed on them. I've seen this before, although it is not very common. What I wanted to point out to you guys was the data mining program called "Clementine". Has anyone used it? Ken Marx (Lowell professor that is entertaining the Loci Project) told me the user interface works much like I am describing for Loci. So, here is the Web site for Clementine: http://www.isldsi.com/clementine.htm And here is the screenshot. If nothing else, just glance at it to see what I am talking about. http://www.isldsi.com/_borders/Image53.gif In other news, Prof. Marx says he will purchase a new Linux box to dedicate to the Loci Project and act as a Web server. We can each have accounts, etc. I also convinced him that when the Project takes off we may need more servers to host the first server-side analysis loci. Linux boxes are pretty cheap, so I'm sure we'll get just about whatever we want ;-) Jeff -- J.W.
Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ --
From hinsen at cnrs-orleans.fr Tue Jan 12 10:07:19 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:13 2006 Subject: [Pipet Devel] python data structure In-Reply-To: <369B39BA.C71F482D@bc.edu> (bizzaro@bc.edu) References: <369B01A9.AD8F6B36@bc.edu> <199901120904.KAA14700@dirac.cnrs-orleans.fr> <369B39BA.C71F482D@bc.edu> Message-ID: <199901121507.QAA18454@dirac.cnrs-orleans.fr> > What are the limits on string sizes in Python (too lazy to look it up right > now)? If it is 256, as with some languages, I imagine this presents a little > problem. String immutability does also make sequence manipulation a bit awkward. The length of a string must fit into an int variable. So in practice you shouldn't rely on having strings larger than 2**31 characters if you want your program to be portable. In other words, there are no serious limitations. > I have used arrays of characters in the past. Using parallel arrays can be a > convenient way to index or "markup" sequences, i.e. the second array can be used > to indicate where features start and stop.
But you could also use two lists for that, or lists of lists, depending on requirements. Of course there is nothing wrong with character arrays, except that you give up many useful string operations. > Another thought: Many analysis programs are limited by having to put > everything into RAM, all in one shot. I tend to prefer keeping the > sequence file open and reading in chunks at a time. BTW, some simple > database features of Python allow you to keep and work from a data > structure stored as a file, correct? I don't see what you refer to. Python's file handling works much like C's stdio library; you can read arbitrary parts out of a file. There are also database interfaces (dbm and variants), which make it easy to store data in large files, but these are special-format files that are hardly useable with general programs like editors. Assuming a modern OS, you can also use memory mapping for large files, but I am not sure that we can already afford to ignore OS without memory mapping support. > Right, and just because I keep harping Python, doesn't mean we can't turn to > compiled C when we really need it...and we may with sequences ranging in the > millions and billions (I sound like Carl Sagan). Of course. But even in that case, all that has to be implemented in C is one rather small module. Example: suppose we use strings for nucleotide sequences now, and then find out next year that we must be able to treat sequences that are longer than available memory. Then we'll just write a small C module that implements a special "nucleotide sequence" type. This can look like a drop-in replacement for strings to Python, and all that will have to be changed in the Python code is the place where nucleotide sequence types are created. There are some advantages to a language without static type checking! Konrad. 
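The drop-in idea above can be sketched without writing any C. The example below is only an illustration under stated assumptions: it is modern Python 3, it uses the standard mmap module in place of the small C module described above, the class name FileSequence and the helper gc_count are hypothetical, and the file is assumed to hold raw ASCII bases with no header lines or newlines.

```python
import mmap
import os
import tempfile


class FileSequence:
    """Hypothetical read-only sequence backed by a memory-mapped file.

    Only len(), indexing and slicing are provided, which is enough for
    code written against plain strings that sticks to those operations.
    An empty file cannot be mapped, so the file must not be empty.
    """

    def __init__(self, path):
        self._file = open(path, "rb")
        self._map = mmap.mmap(self._file.fileno(), 0, access=mmap.ACCESS_READ)

    def __len__(self):
        return len(self._map)

    def __getitem__(self, index):
        # Return str for both slices and single positions, as a string would.
        if isinstance(index, slice):
            return self._map[index].decode("ascii")
        return chr(self._map[index])

    def close(self):
        self._map.close()
        self._file.close()


def gc_count(seq, chunk=1 << 16):
    """Count G and C bases using only len() and slicing.

    Works unchanged on a str or a FileSequence, and never holds more
    than one chunk of a file-backed sequence in memory at a time.
    """
    total = 0
    for start in range(0, len(seq), chunk):
        piece = seq[start:start + chunk]
        total += piece.count("G") + piece.count("C")
    return total


# Usage: the same function runs on an in-memory string and on a file.
with tempfile.NamedTemporaryFile("w", suffix=".seq", delete=False) as handle:
    handle.write("ATGCGTTA" * 1000)
    demo_path = handle.name

on_disk = FileSequence(demo_path)
in_memory = "ATGCGTTA" * 1000
same_answer = gc_count(on_disk) == gc_count(in_memory)
on_disk.close()
os.remove(demo_path)
```

Only the place where the sequence object is created differs between the two cases; the analysis function itself is untouched, which is the point made above about the absence of static type checking.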
-- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From bizzaro at bc.edu Tue Jan 12 12:30:27 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:13 2006 Subject: [Pipet Devel] python data structure References: <369B01A9.AD8F6B36@bc.edu> <199901120904.KAA14700@dirac.cnrs-orleans.fr> <369B39BA.C71F482D@bc.edu> <199901121507.QAA18454@dirac.cnrs-orleans.fr> Message-ID: <369B86B3.AE784E74@bc.edu> Konrad Hinsen wrote: > I don't see what you refer to. Python's file handling works much like > C's stdio library; you can read arbitrary parts out of a file. There > are also database interfaces (dbm and variants), which make it easy to > store data in large files, but these are special-format files that are > hardly useable with general programs like editors. > You'll have to pardon my ignorance. I am too used to manipulating text files in Pascal. (Don't laugh.) > Example: suppose we use strings for nucleotide sequences now, and then > find out next year that we must be able to treat sequences that are > longer than available memory. Then we'll just write a small C module > that implements a special "nucleotide sequence" type. This can look > like a drop-in replacement for strings to Python, and all that will > have to be changed in the Python code is the place where nucleotide > sequence types are created. There are some advantages to a language > without static type checking! > ...and this "nucleotide sequence" type will work straight from a file rather than memory. Jeff -- J.W. 
Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ --
From bizzaro at bc.edu Tue Jan 12 12:31:54 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:13 2006 Subject: [Pipet Devel] python data structure Message-ID: <369B870A.13CCFDFF@bc.edu> Konrad Hinsen wrote: > I don't see what you refer to. Python's file handling works much like > C's stdio library; you can read arbitrary parts out of a file. There > are also database interfaces (dbm and variants), which make it easy to > store data in large files, but these are special-format files that are > hardly useable with general programs like editors. > You'll have to pardon my ignorance. I am too used to manipulating text files in Pascal. (Don't laugh.) > Example: suppose we use strings for nucleotide sequences now, and then > find out next year that we must be able to treat sequences that are > longer than available memory. Then we'll just write a small C module > that implements a special "nucleotide sequence" type. This can look > like a drop-in replacement for strings to Python, and all that will > have to be changed in the Python code is the place where nucleotide > sequence types are created. There are some advantages to a language > without static type checking! > ...and this "nucleotide sequence" type will work straight from a file rather than memory. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ --
From bizzaro at bc.edu Tue Jan 12 12:52:32 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:13 2006 Subject: [Pipet Devel] Re: Casbah Project References: <93307F07DE63D211B2F30000F808E9E525D643@edunivexch02.umassmed.edu> Message-ID: <369B8BE0.ED4EBF13@bc.edu> david.lapointe@umassmed.edu wrote: > > Some of the recent email here reminded me of this project (NTLUG -North > Texas Linux Users Group). Basically the notion of content management. > > http://www.ntlug.org/casbah/index.shtml Hmmmm. Indeed. This is something I think we could pick up a few ideas from. Casbah does intend to be the framework for an application such as Loci, but I think we should avoid the Java (we want Python to be the real backbone for Loci), and some of the other components of Casbah may be a bit of bloat for us. But real interesting stuff...thanks. Anyone else want to comment on it? Jeff -- J.W.
Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From hinsen at cnrs-orleans.fr Tue Jan 12 13:35:59 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:14 2006 Subject: [Pipet Devel] python data structure In-Reply-To: <369B870A.13CCFDFF@bc.edu> (bizzaro@bc.edu) References: <369B870A.13CCFDFF@bc.edu> Message-ID: <199901121835.TAA18878@dirac.cnrs-orleans.fr> > You'll have to pardon my ignorance. I am too used to manipulating > text files in Pascal. (Don't laugh.) I am just surprised: Standard Pascal didn't even have any facility to work with text files. OK, nobody used Standard Pascal, but still... > ...and this "nucleotide sequence" type will work straight from a file rather > than memory. Or even from a network connection. It doesn't matter! Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From bizzaro at bc.edu Sun Jan 17 09:49:50 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:14 2006 Subject: [Pipet Devel] [Fwd: Paos project] Message-ID: <36A1F88E.F3B7F660@bc.edu> Fellow Tulipians, I contacted Carlos Maltzahn, author of the Paos Project, which is a "Python Active Object Server". I was considering using such a system rather than CGI. The main benefit of this would be tighter and more active communication between client and server loci. Paos is similar to Bobo in some respects, but is smaller and not primarily for HTML. 
This is the Paos Web site: http://www.cs.colorado.edu/~carlosm/software.html I asked Carlos if he would be interested in joining the Loci Project, being responsible for integrating Paos and establishing the communication framework. His reply is attached. -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- -------------- next part -------------- An embedded message was scrubbed... From: Carlos Maltzahn Subject: Re: Paos project Date: Sat, 16 Jan 1999 16:35:42 -0700 (MST) Size: 3348 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990117/29412d37/attachment.mht From bizzaro at bc.edu Sun Jan 17 10:42:30 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:14 2006 Subject: [Pipet Devel] Re: Paos project References: Message-ID: <36A204E6.69657E02@bc.edu> Carlos Maltzahn wrote: > The LOCI project looks very interesting. I'd love to spend some time on > this. Unfortunately, I'm currently in a very busy phase of graduating so > my contributions might be pretty small until May. But I would like to > join. That's no problem. Several of the Loci developers are very busy with their real lives right now. I for one am trying to prepare for my second year Ph.D. exams. Instead of having a couple people try to do everything, we are inviting many people to contribute, with the philosophy that many hands make light work. > Paos has been dormant for years and I would like to revive it and make it > more usable. Bobo is probably more sophisticated and efficient for > retrieval but I don't know whether it supports a notification service as > Paos does. The last time I looked at Bobo (which is also years ago), it > was entirely web based, i.e. its front end is a web server. Paos provides > a Client module that makes it very easy to write clients that have > persistent connections to the Paos server, using a Paos specific protocol. 
What I do not like about Bobo is that it is very much HTML-centric, as you mentioned, and I think much of what Bobo has to offer, we don't need. We will be using XML and some special protocols, which we haven't nailed down yet. I think Paos will fit better with what we are trying to do. By the way, the Paos Web page says it works with Python 1.4. Has it been tested with Python 1.5? Will it need many modifications? > I really like the idea of the Glyphic Command Language. For the > Chautauqua system I wrote a graphical editor that lets you edit a > bi-partite control graph (similar to Petri Nets). Because the editor is a > Paos client, it also lets you observe its execution. So one idea would be > to build a similar editor (Python/Tkinter) to construct GCL structures and > then watch their execution. > I have a few more links to systems similar to GCL: Clementine data miner http://www.isldsi.com/_borders/Image53.gif Lego Mindstorms RCX language http://www.legomindstorms.com/program/tips_tricks/tips_orgprog.html Lego Mindstorms Robolab http://www.lego.com/dacta/robolab/rcxprograms.htm Cricket Logoblocks http://lcs.www.media.mit.edu/people/fredm/projects/cricket/logoblocks/index.html I plan to have GCL be a part of the Loci workspace, which will consist of a laboratory benchtop (this is where we will have objects represented as glyphs) and a laboratory notebook (this will be a simple HTML browser for viewing persistent analysis logs). Oh, and we are using Python/GTK rather than Tkinter. Here is a page I have describing "PyG Tools": http://www.uml.edu/Dept/Chem/BICGroup/PyGTools/ > Chautauqua was explicitly designed to support exception handling. One > could imagine using similar mechanisms to support exception handling in > GCL executions so that expensive intermediate results don't get lost if > part of the execution fails. That sounds nice. I think you will appreciate how well the object distribution model will work for biological analyses.
And I think you will enjoy working with a type of data that very few computer scientists have worked with...Bioinformatics deals with some very unique and I think exciting problems! > > Hmm, I wish I wouldn't have to write a thesis right now ... I wish I wouldn't have to take the second year exams right now ;-) > > If you are still interested in using Paos, my first contribution to Loci > could be to write some documentation for it. Let me know. Sure. I think the other developers may need to get the gist of Paos first. We are also developing a list of loci tools that need to be developed. I'll put you on the mailing list (nothing automated yet--I just redistribute what I get), and we'll see things developing (still) over the spring. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ --
From bizzaro at bc.edu Sun Jan 17 11:20:20 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:14 2006 Subject: [Pipet Devel] Petri Nets Message-ID: <36A20DC4.7E0B0B0D@bc.edu> Carlos referred to "Petri Nets" in his e-mail. For those not familiar with Petri Nets (I was not), I found a Web site that gives an overview of Coloured Petri Nets (CPN), an extension: http://www.daimi.au.dk/CPnets/intro/ Petri Nets date back to the 60's. They aren't directly applicable (as far as I can tell) to our bioinformatics-specific communication model. But they share many similarities, particularly with the Glyphic Command Language we will use on the benchtop to represent connected loci, documents, analyses, etc. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ --
From bizzaro at bc.edu Tue Jan 19 12:28:52 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:14 2006 Subject: [Pipet Devel] [Fwd: What kind of people are you looking at ?] Message-ID: <36A4C0D1.E51E8510@bc.edu> >From Raynald...
-------------- next part --------------
An embedded message was scrubbed...
From: Raynald de Lahondès
Subject: What kind of people are you looking at ?
Date: Tue, 19 Jan 1999 17:56:56 +0000
Size: 2004
Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990119/f50a8fed/attachment.mht

From bizzaro at bc.edu Tue Jan 19 13:25:13 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:14 2006
Subject: [Pipet Devel] Re: What kind of people are you looking at ?
Message-ID: <36A4CE03.C6529824@bc.edu>

Raynald de Lahondès wrote:
>
> I can program a little (I have coded a few lines in Tcl).
> It seems that python looks quite interesting. I'd like to help
> development provided a little help.

Raynald, we can use people who are not programmers. If you are new to programming, you may want to learn Python and follow the development of Loci with us. What you could do to help, while you are learning Python, is (1) test the programs and (2) write some instructions (documentation) in French for users of Loci. Is this something you would like to do?

Your help would be very valuable, and we would appreciate it!

Jeff
--
J.W. Bizzaro                  Phone: 617-552-3905
Boston College                mailto:bizzaro@bc.edu
Department of Chemistry       http://www.uml.edu/Dept/Chem/Bizzaro/
--

From bizzaro at bc.edu Tue Jan 19 14:21:53 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:14 2006
Subject: [Pipet Devel] Biosoft software
Message-ID: <36A4DB47.4A6A8A4@bc.edu>

In the message from Raynald, he mentioned "GeneJockey". This program is from Biosoft. Below are some links to Biosoft programs that we might look to for ideas/inspiration:

GeneJockey II (some ideas for viewers and editors):
http://www.biosoft.com/genejock.htm

Screenshot (but I think we can make something twice as good looking):
http://www.biosoft.com/genescr.htm

Below are two programs for enzyme analyses, which we haven't really addressed, but enzyme analysis is something done in most biochem labs.
It may be better as an add-on to Loci???: AssayZap: http://www.biosoft.com/assaywin.htm Screenshot: http://www.biosoft.com/asywscr.htm WinZyme: http://www.biosoft.com/winzyme.htm Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From lahondes at pasteur.fr Wed Jan 20 09:19:58 1999 From: lahondes at pasteur.fr (Raynald de =?iso-8859-1?Q?Lahond=E8s?=) Date: Fri Feb 10 19:18:14 2006 Subject: [Pipet Devel] Re: What kind of people are you looking at ? References: <36A4CE03.C6529824@bc.edu> Message-ID: <36A5E60E.D236EA43@pasteur.fr> "J.W. Bizzaro" wrote: > Raynald, we can use people who are not programmers. If you are new to > programming, you may want to learn Python and follow the development of Loci > with us. What you could do to help, while you are learning Python, is (1) > test the programs and (2) write some instructions (documentation) in French > for users of Loci. Is this something you would like to do? Yes, I will be glad to do that. > > Your help would be very valuable, and we would appreciate it! I think this project is very interesting. -- Raynald de Lahondes Unite des Virus Oncogenes - Departement de Biotechnologie Institut Pasteur - 25, rue du Docteur Roux 75724 Paris Cedex 15 - FRANCE tel: 01.45.68.84.54 - fax: 01.40.61.30.33 - cellular: 06.15.65.85.08 email: lahondes@pasteur.fr From rahul at photino.sid.rice.edu Wed Jan 20 22:06:34 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:14 2006 Subject: [Pipet Devel] Web interface In-Reply-To: <36A4DB47.4A6A8A4@bc.edu> Message-ID: I was thinking about the Web interface we are planning to have to TULIP. We'll need to plan out our design of the other tools very carefully so that we don't have to create messy kludges to get this web interface to work. Many of the GTK widgets are difficult to incorporate into pure HTML pages. Using JavaScript may help, but there may still be some difficulties. 
It may even turn out that we'll need to create a Java interface instead.

We really ought to limit the widgets we use and the way we modify them.
- Looking at the glade widget palette, the first 3 rows shouldn't be a
  problem. Lists are fine as long as they are kept simple.
- Trees... not too good.
- Columned lists shouldn't be too bad (either a table or a bunch of lists
  next to each other w/ Javascript to keep their scrolling in sync... Is
  that possible?)
- Columned tree... ugh.
- Rulers, shouldn't really need those.
- The H and V rules aren't a problem.
- The scales and scrollbars, on the other hand present a problem w/out
  JavaScript.
- Menu bars are fine, but we may need/want to use layers to implement
  them.
- Status bar is easy, so is toolbar.
- Progress bars may need some JavaScript.
- Arrows are trivial.
- Image and pixmap should be simple (as long as my assumptions of what
  they do are right.)
- Drawing area, probably kind of tough if my assumption of what it does is
  right.
- Font selection, not needed.
- Most of the Containers can be implemented with tables and/or frames.
- For scrolled window, viewport, and handle box, I think we'll need
  layers.

Maybe we can have wrapper classes around the GTK widgets and then have them also able to create HTML code.

What are everyone's thoughts on this? We really need to make sure our UI possibilities are strictly specified and translatable to HTML/JavaScript. It may be a pain to do that, I think we want to keep away from requiring Java, which will be a real pain.

--
-> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <-
-> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <-
-> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <-
-> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <-
|--|--------|--------------|----|-------------|------|---------|-----|-|
   Version 10.423.999.211011001.23.20110101.042
   (c)1996-1998, All rights reserved.
   Disclaimer available upon request.

From bizzaro at bc.edu Thu Jan 21 12:08:10 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:14 2006
Subject: [Pipet Devel] [Fwd: loci]
Message-ID: <36A75EFA.4CCF3D68@bc.edu>

I just got this e-mail...
-------------- next part --------------
An embedded message was scrubbed...
From: Harry Mangalam
Subject: loci
Date: Thu, 21 Jan 1999 08:46:05 -0800
Size: 3512
Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990121/5eaed44e/attachment.mht

From bizzaro at bc.edu Thu Jan 21 13:11:49 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:14 2006
Subject: [Pipet Devel] Web interface
Message-ID: <36A76DE0.5A346E1E@bc.edu>

From Rahul...
-------------- next part --------------
An embedded message was scrubbed...
From: Rahul Jain
Subject: [Pipet Devel] Web interface
Date: Wed, 20 Jan 1999 21:06:34 -0600 (CST)
Size: 3499
Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990121/b1e7c97d/attachment.mht

From hinsen at cnrs-orleans.fr Thu Jan 21 13:33:24 1999
From: hinsen at cnrs-orleans.fr (Konrad Hinsen)
Date: Fri Feb 10 19:18:14 2006
Subject: [Pipet Devel] Web interface
In-Reply-To: <36A76DE0.5A346E1E@bc.edu> (bizzaro@bc.edu)
References: <36A76DE0.5A346E1E@bc.edu>
Message-ID: <199901211833.TAA18268@dirac.cnrs-orleans.fr>

> What are everyone's thoughts on this? We really need to make sure our UI
> possibilities are strictly specified and translatable to HTML/JavaScript.

Do we? I am not even sure that a Web interface is realistic for everything. The Web was designed for distributing information, not for interactive manipulation. Yes, I know about Java applets etc., but I have disabled Java for good reasons, and I am not the only one. Maybe it will get better over time, but I won't bet on it.

I'd say the most important feature is a really good GTK interface, without restrictions imposed by compatibility with rather simple technology.
--
-------------------------------------------------------------------------------
Konrad Hinsen                            | E-Mail: hinsen@cnrs-orleans.fr
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais
-------------------------------------------------------------------------------

From bizzaro at bc.edu Thu Jan 21 13:45:12 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:14 2006
Subject: [Pipet Devel] Re: Web interface
References:
Message-ID: <36A775B1.5D5379F1@bc.edu>

Rahul Jain wrote:
>
> I was thinking about the Web interface we are planning to have to TULIP.
>
> We'll need to plan out our design of the other tools very carefully so
> that we don't have to create messy kludges to get this web interface to
> work.

My initial thought regarding the Web interface was not that we could or should replicate the GUI loci in a Web browser. My thought was that Loci can provide an HTML interface to _some_ of the _analysis_ loci, considering they are command-line and short-lived, and output ASCII text, which will be formatted into XML anyway.

Konrad and I have communicated quite a bit about the limitations of HTML. In fact, when I first mentioned XML, Konrad thought I was speaking of putting everything into a Web browser. Neither he nor I like the static interface of HTML browsers, so I will make a point that GUI loci will not look or act like one but will be very dynamic.

In short, I think the Web interface can be a quick-and-dirty way for people without the Loci package to access the wealth of analysis loci we may someday amass.

I appreciate that you looked into this. But I think this plan would be an enormous task, and probably best left to some heavy duty Java...which we don't want to use for reasons I've expressed before.

Do you want to take on this Web interface project, as the simpler project I envisioned?
I think the biggest part of this would be some sort of XML to HTML conversion, which is actually what XSL is all about. The problem with XSL is that browsers don't support it right now, and I don't think there is a specification for CML or BSML. And I think another big thing (providing no XSL is used) would be converting diagrams specified by XML into GIF format. You would have to write a program that would generate custom GIF's. Check out NCBI's "graphical view" of nucleotide data: http://www.ncbi.nlm.nih.gov/cgi-bin/Entrez/framik?gi=3647232&db=Nucleotide Also, the Web interface implementation will work from one server that I will have set up. It will act as a limited version of Loci, and will contact certain analysis loci around the Internet. This is no small project either. It will almost parallel Loci itself and use many of the same components. So, we must limit the types of views and analysis tools that the Web interface can handle. Jeff bizzaro@bc.edu > > Many of the GTK widgets are difficult to incorporate into pure HTML pages. > Using JavaScript may help, but there may still be some difficulties. It > may even turn out that we'll need to create a Java interface instead. > > We really ought to limit the widgets we use and the way we modify them. > - Looking at the glade widget palette, the first 3 rows shouldn't be a > problem. Lists are fine as long as they are kept simple. > - Trees... not too good. > - Columned lists shouldn't be too bad (either a table or a bunch of lists > next to each other w/ Javascript to keep their scrolling in sync... Is > that possible?) > - Columned tree... ugh. > - Rulers, shouldn't really need those. > - The H and V rules aren't a problem. > - The scales and scrollbars, on the other hand present a problem w/out > JavaScript. > - Menu bars are fine, but we may need/want to use layers to implement > them. > - Status bar is easy, so is toolbar. > - Progress bars may need some JavaScript. > - Arrows are trivial. 
> - Image and pixmap should be simple (as long as my assumptions of what > they do are right.) > - Drawing area, probably kind of tough if my assumption of what it does is > right. > - Font selection, not needed. > - Most of the Containers can be implemented with tables and/or frames. > - For scrolled window, viewport, and handle box, I think we'll need > layers. > > Maybe we can have wrapper classes around the GTK widgets and then have > them also able to create HTML code. > > What are everyone's thoughts on this? We really need to make sure our UI > possibilities are strictly specified and translatable to HTML/JavaScript. > It may be a pain to do that, I think we want to keep away from requiring > Java, which will be a real pain. > > -- > -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- > -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- > -> -\- "I never could get the hang of Thursdays." - HHGTTG by DNA -/- <- > -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- > |--|--------|--------------|----|-------------|------|---------|-----|-| > Version 10.423.999.211011001.23.20110101.042 > (c)1996-1998, All rights reserved. > Disclaimer available upon request. -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Thu Jan 21 14:30:18 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:14 2006 Subject: [Pipet Devel] Raynald joins Message-ID: <36A78040.136F6770@bc.edu> Fellow Tulipians, Raynald has agreed to join the project in the area of testing and documentation (French and English). He would also like to learn Python and may do some development later on. Raynald, I added you to the mailing list. Also, when you are writing documentation in French, could you write it in English too, so that it won't have to be rewritten or translated later. 
Don't worry about how well the English comes out, because I can edit that. Raynald and Konrad, do you think we should have French and German translations of the Loci Web pages as well? Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From carlosm at moet.cs.colorado.edu Thu Jan 21 14:30:46 1999 From: carlosm at moet.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:14 2006 Subject: [Pipet Devel] Web interface In-Reply-To: <36A76DE0.5A346E1E@bc.edu> Message-ID: I don't know GTK/Gnome very well but how portable is it? Is that an issue? I personally don't mind using GTK/Gnome but a powerful web interface that runs everywhere (using Mozilla) sounds also very interesting. Also, don't underestimate the things you can do with JavaScript and newer versions of HTML. I found http://developer.netscape.com/viewsource/index_frame.html?content=archive/archivelist.html useful to get an impression. See also http://developer.netscape.com/viewsource/goodman_drag/goodman_drag.html for (at least to me) surprising applications. Carlos On Thu, 21 Jan 1999, J.W. Bizzaro wrote: >From Rahul... From rahul at photino.sid.rice.edu Thu Jan 21 14:41:01 1999 From: rahul at photino.sid.rice.edu (Rahul Jain) Date: Fri Feb 10 19:18:14 2006 Subject: [Pipet Devel] Re: Web interface In-Reply-To: <36A775B1.5D5379F1@bc.edu> Message-ID: On Thu, 21 Jan 1999, J.W. Bizzaro wrote: > Rahul Jain wrote: > > > > I was thinking about the Web interface we are planning to have to TULIP. > > > > We'll need to plan out our design of the other tools very carefully so > > that we don't have to create messy kludges to get this web interface to > > work. > > My initial thought regarding the Web interface was not that we could or should > replicate the GUI loci in a Web browser. 
My thought was that Loci can provide > an HTML interface to _some_ of the _analysis_ loci, considering they are > command-line, short-lived, output ASCI text, which will be formatted into XML anyway. > > Konrad and I have communicated quite a bit about the limitations of HTML. In > fact, when I first mentioned XML, Konrad thought I was speaking of putting > everything into a Web browser. Neither he nor I like the static interface of > HTML browsers, so I will make a point that GUI loci will not look or act like > one but will be very dynamic. > > In short, I think the Web interface can be a quick-and-dirty way for people > without the Loci package to access the wealth of analysis loci we may someday amass. > Since we are limiting the interface to many loci to GTK/GNOME, we are limiting the people able to use Loci to those with Linux. GTK/GNOME may compile on other Unices, but I don't think GNOME does, and I'm sure that it'll probably take quite a bit of tweaking to get it to compile on any other system. Our main concern should be getting it to work under Windows, since those who use any other Unix won't have any trouble with Linux if they need to run it. Windows users, on the other hand are often either unable to understand Linux or not allowed by superiors to use Linux. That's where we really need to target the Web interface. > I appreciate that you looked into this. But I think this plan would be an > enormous task, and probably best left to some heavy duty Java...which we don't > want to use for reasons I've expressed before. > > Do you want to take on this Web interface project, as the simpler project I > envisioned? I think the biggest part of this would be some sort of XML to > HTML conversion, which is actually what XSL is all about. The problem with > XSL is that browsers don't support it right now, and I don't think there is a > specification for CML or BSML. 
And I think another big thing (providing no > XSL is used) would be converting diagrams specified by XML into GIF format. > You would have to write a program that would generate custom GIF's. Check out > NCBI's "graphical view" of nucleotide data: > > http://www.ncbi.nlm.nih.gov/cgi-bin/Entrez/framik?gi=3647232&db=Nucleotide I think this is where Perl really can be useful. It's designed to process and manipulate text, and there are modules for XML and HTML. Also, there's GD for creating GIFs. OTOH, Perl and Python running at the same time would probably wear out all but the best servers, so I think we may have to rely on Python alone. Then again, mod_perl would make the system much more responsive. If there's a GD module for Python, then it shouldn't be too tough to do the whole thing in Python and have it integrate much more cleanly with the other parts of Loci. > Also, the Web interface implementation will work from one server that I will > have set up. It will act as a limited version of Loci, and will contact > certain analysis loci around the Internet. Oh, the way I envisioned it, the Web interface would be a package that could be installed on any Loci server as a Loci client and it would use CGI to handle the requests from other computers. > This is no small project either. It will almost parallel Loci itself and use > many of the same components. So, we must limit the types of views and > analysis tools that the Web interface can handle. I think I'll do this project, as I haven't taken any molbio/genetics courses yet. Considering the situation of the people who would use the Web interface, they probably have a JavaScript capable browser, so I can implement most of the widgets. (Does IE have layers support?) -- -> -\-=-=-=-=-=-=-=-=-=-/^\-=-=-=<*><*>=-=-=-/^\-=-=-=-=-=-=-=-=-=-/- <- -> -/-=-=-=-=-=-=-=-=-=/ { Rahul -<>- Jain } \=-=-=-=-=-=-=-=-=-\- <- -> -\- "I never could get the hang of Thursdays." 
- HHGTTG by DNA -/- <- -> -/- http://photino.sid.rice.edu/ -=- mailto:rahul-jain@usa.net -\- <- |--|--------|--------------|----|-------------|------|---------|-----|-| Version 10.423.999.211011001.23.20110101.042 (c)1996-1998, All rights reserved. Disclaimer available upon request. From bizzaro at bc.edu Thu Jan 21 14:53:27 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:14 2006 Subject: [Pipet Devel] Web interface References: Message-ID: <36A785AC.26C24D14@bc.edu> Carlos Maltzahn wrote: > > I don't know GTK/Gnome very well but how portable is it? Is that an issue? It was made for Linux/UNIX. It has been ported to Windows, and it was designed to be portable to most architectures. But the best support right now is on Linux/UNIX. You may disagree, but I think Linux/UNIX is the best platform for developing Loci, and I consider Windows and Mac ports to be important but not our primary consideration. I think many compromises are made using a truly portable GUI widget set, such as Tkinter or Java. And this is underscored by many of the complaints people have about these being bloated and slow. I want to make an excellent Linux/UNIX package first and then consider the other platforms later. We may even switch widget sets for other platforms. That's the nice thing about Python...we can do it. But we'll see. > I personally don't mind using GTK/Gnome but a powerful web interface that > runs everywhere (using Mozilla) sounds also very interesting. Yes, I think so ;-) I recognize the limitations of a static HTML browser, so we can't have just that. Besides, NCBI and others already use HTML and do it pretty well. An HTML interface will be important for non-Linux/UNIX users. I think it helps solve our portability problem for now. Will HTML someday be good enough for Loci? Well, if and when Word and Excel are ported to HTML, we will port Loci. I want Loci to be that dynamic, which HTML just isn't right now. 
> > Also, don't underestimate the things you can do with JavaScript and newer > versions of HTML. I found > http://developer.netscape.com/viewsource/index_frame.html?content=archive/archivelist.html > useful to get an impression. > > See also > http://developer.netscape.com/viewsource/goodman_drag/goodman_drag.html > for (at least to me) surprising applications. These are examples of both JavaScript and DHTML (Dynamic HTML). The first is Netscape's proprietary scripting language, and the other is Microsoft's proprietary extension of HTML. I am aware of how well they both work, and yes they can do some amazing things. The HTML interface can make use of these, but I would like some feedback on using these proprietary languages in a GNU project. I had rejected doing that before, which is one of the reasons why we aren't using Java or Tcl/Tk. Maybe we should stick with open source everywhere...? Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Thu Jan 21 15:30:04 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:14 2006 Subject: [Pipet Devel] Re: Web interface References: Message-ID: <36A78E3E.4806760A@bc.edu> Rahul Jain wrote: > Since we are limiting the interface to many loci to GTK/GNOME, we are > limiting the people able to use Loci to those with Linux. GTK/GNOME may > compile on other Unices, but I don't think GNOME does, Both GTK and the entire GNOME desktop have been ported to just about every platform that can run X-Windows. So, we really have all Unices covered. > and I'm sure that > it'll probably take quite a bit of tweaking to get it to compile on any > other system. Probably. Especially if we use UNIX utilities. Portability is a huge issue that we aren't about to conquer just yet. You can see Sun tearing their hair out over this issue and Java. 
*But* by using Python, we are in a much better position to port Loci than EMBOSS would be. ***It's a tradeoff guys! We are sticking with one platform for development so that we don't limit ourselves to the intersection of all UNIX, Windows, and Mac GUI, which is relatively small. *And* we get to use native UNIX implementations, not everything running through a virtual machine. Besides, I have confidence that GTK/GNOME will find its way to those other platforms without any effort of our own. If not, we'll see about porting to native Windows and Mac API...It's been done. > Our main concern should be getting it to work under Windows, > since those who use any other Unix won't have any trouble with Linux if > they need to run it. Windows users, on the other hand are often either > unable to understand Linux or not allowed by superiors to use Linux. > That's where we really need to target the Web interface. Yes, that is exactly what the Web interface is targeting :-) But can we put all of the bells and whistles in the Web interface that we have available with GTK? There is just no way right now. As I said in my last e-mail to Carlos, we want Loci to be as dynamic as Word and Excel. If Microsoft can't put those in a Web browser, I certainly doubt we could put Loci. The Web interface will simply have to work with the limitations imposed on that type of interface. > I think this is where Perl really can be useful. It's designed to process > and manipulate text, and there are modules for XML and HTML. Also, there's > GD for creating GIFs. OTOH, Perl and Python running at the same time would > probably wear out all but the best servers, so I think we may have to rely > on Python alone. Then again, mod_perl would make the system much more > responsive. If there's a GD module for Python, then it shouldn't be too > tough to do the whole thing in Python and have it integrate much more > cleanly with the other parts of Loci. 
> I am not really anti-Perl, but Python can handle much of what Perl can, and I think we should not try to mix Perl and Python. I don't know if there is a GD module for Python. We'll have to look (Konrad, do you know of one?). There is an XML parser being developed by the Python developers. Of course, Python can handle text as well as Perl can. > Oh, the way I envisioned it, the Web interface would be a package that > could be installed on any Loci server as a Loci client and it would use > CGI to handle the requests from other computers. Yes it would be a client in the way it would substitute for all (nearly) of the client side loci. But it would act as a Web server (really, work with a Web server) that can be tapped into by anyone using a Web browser. Where should it go? I think we may just need one at the main Loci URL (which doesn't exist yet). Why do you think it should be portable? It can be, but I don't think it has to be, as long as one Web server can handle the requests. > I think I'll do this project, as I haven't taken any molbio/genetics > courses yet. Considering the situation of the people who would use the Web > interface, they probably have a JavaScript capable browser, so I can > implement most of the widgets. Great! But let's see what we can do using standard CGI over JavaScript (see my last message). If we have to use it, then we have to. > (Does IE have layers support?) You've got me there. I haven't used IE much at all. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From carlosm at moet.cs.colorado.edu Thu Jan 21 16:07:02 1999 From: carlosm at moet.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:14 2006 Subject: [Pipet Devel] Web interface In-Reply-To: <36A785AC.26C24D14@bc.edu> Message-ID: > Also, don't underestimate the things you can do with JavaScript and newer > versions of HTML. 
I found > http://developer.netscape.com/viewsource/index_frame.html?content=archive/archivelist.html > useful to get an impression. > > See also > http://developer.netscape.com/viewsource/goodman_drag/goodman_drag.html > for (at least to me) surprising applications. These are examples of both JavaScript and DHTML (Dynamic HTML). The first is Netscape's proprietary scripting language, and the other is Microsoft's proprietary extension of HTML. I am aware of how well they both work, and yes they can do some amazing things. The HTML interface can make use of these, but I would like some feedback on using these proprietary languages in a GNU project. I had rejected doing that before, which is one of the reasons why we aren't using Java or Tcl/Tk. Maybe we should stick with open source everywhere...? JavaScript is open source isn't it? It's part of Mozilla. Supposedly the Raptor/Gecko layout engine is going to support "HTML 4.0, CSS 1/2, XML 1.0, and the Document Object Model" (first stable version of Gecko is due sometime during first half of 1999). For example for dragging and dropping stuff around you need layers (HTML4), event handling (JavaScript/DOM), and absolute positioning of elements (CSS2/DOM). I'm not sure how soon Mozilla is going to support all this, but I suspect within this year. Because there is no open source version that supports a sufficiently dynamic interface yet - but it might arrive within this year -, it is probably a good idea to implement the UI in GTK first. Once the design stabilizes and the open source web interface language implementation becomes available, some of us can then see how far one can push a dynamic web interface. Carlos From justin at ukans.edu Thu Jan 21 17:21:54 1999 From: justin at ukans.edu (Justin Bradford) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] toolkit and data access/storage In-Reply-To: <36A78E3E.4806760A@bc.edu> Message-ID: On Thu, 21 Jan 1999, J.W. 
Bizzaro wrote: > > Yes, that is exactly what the Web interface is targeting :-) But can we put > all of the bells and whistles in the Web interface that we have available with > GTK? There is just no way right now. As I said in my last e-mail to Carlos, > we want Loci to be as dynamic as Word and Excel. If Microsoft can't put those > in a Web browser, I certainly doubt we could put Loci. The Web interface will > simply have to work with the limitations imposed on that type of interface. > A Windows port using gtk should not be a significant problem. A cross-compile with cygnus' tools and Win gdk, should be all that's necessary. Theoretically Mac OS X should be even less work (once someone ports gdk). I agree, though, that for development purposes, it's best not to even worry about it. Linux is where we'll find the most contributors in the early phase of development. > Yes it would be a client in the way it would substitute for all (nearly) of > the client side loci. But it would act as a Web server (really, work with a > Web server) that can be tapped into by anyone using a Web browser. Where > should it go? I think we may just need one at the main Loci URL (which > doesn't exist yet). Why do you think it should be portable? It can be, but I > don't think it has to be, as long as one Web server can handle the requests. I was under the impression that the CGI program would return an XML file which the client used for its display. Using a browser connected directly to this would require additional information (javascript, style sheets, etc) for display, right? Or would the dhtml interface be conditional (only send it if it's not our special app client)? Also, I think a typical, url-encoded CGI request syntax is poor for this system. There's no particular reason we couldn't feed the server program our own query syntax. I think using one of the proposed XML query languages would be more useful. 
I like AT&T's; it's a mix of XML and semi-structured query language (halfway between SQL and OQL). It lets you specify the format of the XML returned. This could be tied to a database or analysis program, or a mixture of both. A client can take whatever pieces of information from any source it likes and combine them for its display. And for now, it could still be implemented over a typical HTTP connection, via an intermediate "dispatch" agent.

The basic idea is like the Casbah project, just not so complicated, and no Java. I'm working on the basic idea for a database/application infrastructure for education related ideas.

XML-QL (AT&T's proposal):
http://www.w3.org/TR/NOTE-xml-ql/

XQL (Microsoft's proposal):
http://www.w3.org/TandS/QL/QL98/pp/xql.html

Supposedly, there was a conference on query languages a while back, but I haven't seen any working drafts on w3 yet. It's hard to tell which will become the recommendation (probably a mixture of the two).

I'm not quite sure how the interface layer between the query parsing and database retrieval and/or application will work yet. Ideas are welcome.

Also, do we want a listserv? If so, I can have something set up here.

Justin Bradford
justin@ukans.edu

From bizzaro at bc.edu Thu Jan 21 18:16:57 1999
From: bizzaro at bc.edu (J.W. Bizzaro)
Date: Fri Feb 10 19:18:15 2006
Subject: [Pipet Devel] Re: loci
References:
Message-ID: <36A7B54F.6E59DBC0@bc.edu>

Harry Mangalam wrote:
> I'm considering releasing the next version under std GPL, but I'm old enough
> to want to take a good look at it and try to consider the possibilities that
> GPL requires/allows.

Okay. I hope you do! I started out worrying about what greedy corporate types might do with my programs, but I don't care as much now. I think the GPL gives pretty good protection to the intellectual property of the developers. And I think an important part of an unestablished project is _not_ restricting one's work to anyone.
> > Right - it's a vanilla ANSI-C command-line app - one of the reasons for > dev'ing it was to supply something like a DNA Strider for the command-line. > I develop on Linux and port to IRIX/SunOS/Solaris/DEC Unix/HPUX with no > problems.. yet. That's fine. I think ANSI-C is more portable than C++ and is more suited to a UNIX environment. We use ANSI-C to supplement Python. But Python is preferred here because of some very powerful features. It is also, from my observation, the scripting language most preferred by physical scientists (I sound like an advertisement)...considered by a few to be a likely replacement for FORTRAN. > > I was considering starting with the former (a GUI-wrapped command-line app) > and moving to the latter (fully GUI), but I'm still feeling my way in terms > of how to present it. As I understand LOCI, the underlying apps can be > distributed but communicate at the XML layer. I'm starting to add this to > tacg for reasons related to interoperability but until yesterday, unrelated > to LOCI :). Yes. Tools can/will communicate locally via (1) direct Python implementation or (2) indirect use of XML. The other way to communicate is (3) remotely via XML with a CGI-like interface. > Are you planning to make a pseudo-visual programming language out of this - > is this what the GCL is? If I understand XML correctly, this should be > possible but would probably require a large expenditure of energy... Yes. I was just talking to Carlos Maltzahn in our group about that (Carlos developed the Paos Project for distributing Python objects over a network, and he will help incorporate that into Loci). GCL is about the highest level programming language you may find, and it is specifically meant to manage multi-step analyses in a graphical way. (I don't consider biologists to be very computer-savvy.) The XML is mostly used to format data and not to issue/manage commands. The job of GCL is to issue/manage commands.
But putting commands in with the XML is an interesting thought. > > I like the idea of being able to use whatever underlying language you want - > lots of goodies are written in perl and there's no real reason to exclude > those apps/libs and those authors. Yes! The distributed model with a CGI-like mode of operation will make this connection between Loci and any other command-line language! > I also am working on contract for NCGR (National Center for Genome Resources > (Santa Fe) which is also interested in developing freely available tools for > sequence analysis/bioinformatics and I'll try to get them to pay attention to what you're > trying to do - there may be room for some effort on their part. What sort of "effort" are you speaking of? Loci is unfunded, but that may change. > I tend to agree with your feeling that both can go forward - with the state > of funding these days, there will be times when one or the other will be > moving faster, but both groups should stay in contact with each other. It > would be good if representatives of both could meet and get drunk together > at some birds-of-a-feather meeting soon... :-) Which continent? > OK - I am VERY much interested in hearing what the rest of you think about > how this should be done - I'm interested in getting biosequence analysis > made much easier and cheaper and have tried to walk my talk by taking the > time to put something together towards that end. If I can contribute to > other projects by this, so much the better. IF you go GPL, I would hope that tacg would be a model for taking a command-line bioinformatics program and adding a GUI to it. In other words, a good point of collaboration for us would be to use the energy that would be put into making a GTK interface for tacg, and put it into developing the part of Loci that tacg would need. We would otherwise be duplicating our efforts. 
We are just starting to make these tools, such as sequence visualization and editing tools, so why not make them to work for tacg (as well as some EMBOSS programs)? Thomas Sicheritz (author of BioWish) is working on an editor right now, and I think Justin Bradford may help with XML implementation. > > So by all means, plase keep me in the loop. If I can help out in any way. > let me know... Okay. You are hereby on the mailing list. And I'll consider you an observer who may want to help (please help :-) with incorporating his not-yet-GPL analysis algorithms. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Thu Jan 21 19:11:34 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] toolkit and data access/storage References: Message-ID: <36A7C218.1FD827BE@bc.edu> Justin Bradford wrote: > I was under the impression that the CGI program would return an XML file > which the client used for its display. Well...actually the model is changing some now that Carlos will help with Paos (see the messages we sent). I'm using the term CGI-like, because at some point, on the server side, the "query" or command will be passed from a Python script (let's call it "Gatekeeper") to a stand-alone analysis algorithm (runs from command line and returns XML). The XML will then be returned to Gatekeeper and sent back to the client. This is a lot like a Web browser (client) communicating with a CGI program (on a server). > Using a browser connected directly > to this would require additional information (javascript, style sheets, > etc) for display, right? Or would the dhtml interface be conditional (only > send it if it's not our special app client)? Okay. You're asking how will the XML accomidate the fact that the client can be either a GUI loci or a Web browser? Good question. 
I want the XML to be formatted to best accommodate the GUI loci (you know, the tools made with PyGTK and all that). The job of the person developing the Web/HTML interface (Rahul) is to write a translator that will turn the XML into HTML + GIF. > > Also, I think a typical, url-encoded CGI request syntax is poor for this > system. There's no particular reason we couldn't feed the server program > our own query syntax. I think using one of the proposed XML query > languages would be more useful. I like AT&T's; it's a mix of XML and > semi-structured query language (halfway between SQL and OQL). > It lets you specify the format of the XML returned. > This could be tied to a database or analysis program, or a mixture of > both. A client can take whatever pieces of information from any source it > likes and combine them for its display. Again, we are going with distributed Python objects via Paos. But the "query" or command language used is something I haven't thought much about. We do want the XML that is returned to be something the client can handle. For example, the client asks (queries) the Gatekeeper for a Chou-Fasman prediction of protein secondary structure. It should get something back that can be displayed by that client, or the client may have to pass the info along to another client. In any case, the data has to be of the type that has a client in existence for it. Maybe the client doesn't need to say what it is expecting. Maybe we just need a filter for things not expected. If we do get something unexpected, it may be the fault of the analysis algorithm author who tried to implement something not supported. (BTW, we do need a specification and a template system for converting analysis loci output into XML that the clients can handle--it falls between the Gatekeeper and the analysis loci). > The basic idea is like the Casbah project, just not so complicated, and no > Java.
I'm working on the basic idea for a database/application > infrastructure for education related ideas. > > XML-QL (AT&T's proposal): > http://www.w3.org/TR/NOTE-xml-ql/ > > XQL (Microsoft's proposal): > http://www.w3.org/TandS/QL/QL98/pp/xql.html I would like you to consider which query language would best suit XML transfer. But since this is a GNU-only project, we can't go with a proprietary language. Besides, those are more complex because they need to be general purpose. Can you invent something along those lines that is small and special purpose for Loci? > > I'm not quite sure how the interface layer between the query parsing and > database retrieval and/or application will work yet. Ideas are welcome. You mean the program that takes the query, converts the query to a command-line, issues the command to the analysis loci, accepts the ASCII text, converts it to XML, and sends it back to the client? Once again, the Gatekeeper! That's a BIG component of Loci that needs to be set up before most anything else will work. Of course it will be in Python and be closely connected to Paos. Who wants that project??? It is actually a part of defining a query language and the protocol for handling XML. Do you want to take a shot, Justin? The other XML projects are (1) converting standard docs like PDB and GENBANK to XML, and (2) parsing XML to display images within the client loci. > > Also, do we want a listserv? If so, I can have something set up here. Heh. I know it's getting too much for me to handle. Can you set something up until I get some servers going at UMass Lowell for this project? Thanks! The list of people that got this message (including you and me) is the whole mailing list. BTW, Jay Painter is off the list...he's working too hard on getting GNOME ready for RedHat 6.0. Jeff -- J.W.
Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From carlosm at moet.cs.colorado.edu Thu Jan 21 20:12:32 1999 From: carlosm at moet.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] toolkit and data access/storage In-Reply-To: <36A7C218.1FD827BE@bc.edu> Message-ID: Jeff Bizzaro wrote: Again, we are going with distributed Python objects via Paos. But the "query" or command language used is something I haven't thought much about. We do want the XML that is returned to be something the client can handle. For example, the client asks (queries) the Gatekeeper for a Chou-Fasman prediction of protein secondary structure. It should get something back that can be displayed by that client, or the client may have to pass the info along to another client. In any case, the data has to be of the type that has a client in existence for it. Paos does have a (pretty ad-hoc) query language. The README in the distribution contains a terse description/definition (ftp://ftp.cs.colorado.edu/users/carlosm/README.paos). Results are Python objects. The client module and the base classes for schema definitions are optimized for reducing object serialization overhead. The results of a query are the objects that match the query and all "primitive objects" such as strings and numbers that are values of instance variables. Pointers to other objects are internally represented as object IDs in the form of strings. They are transparently loaded from the server as the user references them. The client maintains a local cache so that re-references of attributes that point to other objects don't cause any client/server traffic. Cache consistency depends on the use of notification services. Objects that are received via notifications are written into the same cache. Thus, one can maintain very tight cache consistency by appropriately defining notification requests.
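The caching behaviour described above can be illustrated with a small sketch. This is not the Paos client API: the `ObjectCache` class, its `get` and `notify` methods and the `fetch` callable are hypothetical names chosen for illustration, and the "server" is assumed to be anything that maps a string object ID to an object.

```python
class ObjectCache:
    """Hypothetical client-side cache keyed by string object IDs."""

    def __init__(self, fetch):
        # fetch: callable taking an object ID and returning the object;
        # each call stands in for one client/server round trip.
        self._fetch = fetch
        self._objects = {}
        self.round_trips = 0

    def get(self, object_id):
        """Return the object, contacting the server only on first reference."""
        if object_id not in self._objects:
            self.round_trips += 1
            self._objects[object_id] = self._fetch(object_id)
        return self._objects[object_id]

    def notify(self, object_id, new_value):
        """Handle a server notification by writing into the same cache."""
        self._objects[object_id] = new_value
```

A reference held as an object ID is resolved with `cache.get(obj["annotation"])`; a second `get` for the same ID is answered locally, and an object pushed through `notify` replaces the cached copy without a further round trip.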
But notification requests also allow you to limit the scope of cache consistency to a few relevant objects. I'm not sure how Paos should communicate with the actual tools (in case people agree to use it as "gatekeeper"). I personally don't like CGI because of its inflexible fork-request-once-response-once-terminate assumption. I suspect these tools are running for a longer time period and we would like to be able to find out about their state. On the other hand it's probably not a good idea to run them natively in Paos because of their size, no? What are people's thoughts about this? Carlos From bizzaro at bc.edu Thu Jan 21 20:44:48 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] toolkit and data access/storage References: Message-ID: <36A7D7EB.4D52C752@bc.edu> Carlos Maltzahn wrote: > > I'm not sure how Paos should communicate with the actual tools (in the > case people agree to use it as "gatekeeper"). Actually, I was not calling Paos "Gatekeeper". Let's see if I have the Paos model right: Some clients are local to the user, and the server is remote and acts as a hub for remote clients to communicate with local clients. Correct? If I am correct, the Gatekeeper would be a remote client to Paos. It converts queries into command-lines (via Ajax by EMBOSS?) for execution by the analysis algorithms, waits for a response, gets the ASCII text response, formats it into XML (according to a template), and sends it back to the client. On the server side, we are catering to fork-request-once-response-once-terminate programs made by who-knows-whom, whenever, in whatever language. In other words, we still need a CGI-like system. But this is only one type of Loci client. Other clients can make better use of Paos. > I personally don't like CGI > because of its inflexible fork-request-once-response-once-terminate > assumption.
I suspect these tools are running for a longer time period and > we would like to be able to find out about their state. ^^^^^^^^^^^^^^^^^^^^^^^^ Yes! This is something I realized would not work with standard CGI. These analysis algorithms (server side only) will be longer lived than standard CGI scripts. Some may take hours or days to complete. I would like to return the state or maybe the time elapsed to the client so that the user knows something is happening and the whole thing didn't just die. > On the other > hand it's probably not a good idea to run them natively in Paos > because of their size, no? What are people's thoughts about this? Because biological analysis algorithms tend to be command-line, run-once-and-terminate, I think we will just have to treat them like rather big CGI programs. That's not to say that other tools, libraries, etc., written for Loci, will not communicate directly using Paos and XML...They can and will. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From carlosm at moet.cs.colorado.edu Thu Jan 21 21:30:05 1999 From: carlosm at moet.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] toolkit and data access/storage In-Reply-To: <36A7D7EB.4D52C752@bc.edu> Message-ID: Correct me if I'm wrong but I think what we really are trying to design here is somekind of batch processing management system. Our department runs a commercial product called Load Sharing Facility (LSF) sold by Platform Computing (www.platform.com). See http://www.cs.colorado.edu/csops/FAQ/lsf.html for our installation and http://www.cs.colorado.edu/csops/FAQ/lsf-webpages/quick-admin.html for some documentation on it. Is there an open source equivalent for this? 
If we want to define and monitor computations with GCL we need to have some way to manage batch processing on different machines, query their state, and have access to intermediate results or check points. That means each execution needs to be submitted to some kind of management system that then schedules and runs the tools in some kind of shell that it can remotely query and control. Once we have established such a management system it should be fairly easy to write a GCL user interfaces for it. I can see Paos to sit on top of such a management system and the GCL editors and monitors to be Paos clients. LSF is designed primarily for workload management. Our focus would be more on composing tools, and scheduling, and controlling them. Carlos On Thu, 21 Jan 1999, J.W. Bizzaro wrote: on the server side, we are catering to fork-request-once-response-once-terminate programs made by who-knows and whenever with whatever language. In other words, we still need a CGI-like system. But this is only one type of Loci client. Other clients can make better use of Paos. > I personally don't like CGI > because of it's unflexible fork-request-once-response-once-terminate > assumption. I suspect these tools are running for a longer time period and > we would like to be able to find out about their state. Yes! This is something I realized would not work with standard CGI. These analysis algorithms (server side only) will be longer lived than standard CGI scripts. Some may take hours or days to complete. I would like to return the state or maybe the time elapsed to the client so that the user knows something is happening and the whole thing didn't just die. > On the other > hand it's probably not a good idea to run them natively in Paos > because of their size, no? What are people's thoughts about this? Because biological analysis algorithms tend to be command-line, run-once-and-terminate, I think we will just have to treat them like rather big CGI programs. 
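As a rough illustration of treating a run-once command-line tool as a long-lived job whose state and elapsed time can be queried, here is a minimal sketch using only the Python standard library. The `Job` class and `demo_job` helper are hypothetical names, not part of Loci, Paos or any batch system mentioned in this thread, and a child Python process stands in for a real analysis program.

```python
import subprocess
import sys
import tempfile
import time


class Job:
    """One run of a command-line analysis program (hypothetical wrapper)."""

    def __init__(self, argv):
        self.argv = list(argv)
        self.started = time.time()
        # Output goes to a temporary file rather than a pipe, so a program
        # that prints a lot cannot block while nobody is reading.
        self._out = tempfile.TemporaryFile(mode="w+")
        self.process = subprocess.Popen(
            self.argv, stdout=self._out, stderr=subprocess.DEVNULL
        )

    def state(self):
        """Return 'running', 'finished' or 'failed' without blocking."""
        code = self.process.poll()
        if code is None:
            return "running"
        return "finished" if code == 0 else "failed"

    def elapsed(self):
        """Seconds since the job was submitted."""
        return time.time() - self.started

    def output(self):
        """Wait for the program to end and return its ASCII output."""
        self.process.wait()
        self._out.seek(0)
        return self._out.read()


def demo_job():
    """Stand-in for a real tool: a child Python that prints one line."""
    return Job([sys.executable, "-c", "print('done')"])
```

A server-side component could keep such objects in a table and answer client queries with `state()` and `elapsed()` while a long analysis is still running, then hand `output()` to whatever converts the text to XML.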
That's not to say that other tools, libraries, etc., written for Loci, will not communicate directly using Paos and XML...They can and will. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Thu Jan 21 21:35:11 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] Web interface References: Message-ID: <36A7E3B6.D6FDB87F@bc.edu> Carlos Maltzahn wrote: > JavaScript is open source isn't it? It's part of Mozilla. Supposedly the > Raptor/Gecko layout engine is going to support "HTML 4.0, CSS 1/2, XML > 1.0, and the Document Object Model" (first stable version of Gecko is due > sometime during first half of 1999). For example for dragging and dropping > stuff around you need layers (HTML4), event handling (JavaScript/DOM), and > absolute positioning of elements (CSS2/DOM). I'm not sure how soon Mozilla > is going to support all this, but I suspect within this year. Okay. I didn't realize JavaScript would go open source with Mozilla. If that's the case, then I have no quarrels about using it. *But* the Mozilla license is more restrictive than GPL...Hmmm. > Because there is no open source version that supports a sufficiently > dynamic interface yet - but it might arrive within this year -, it is > probably a good idea to implement the UI in GTK first. Once the design > stabilizes and the open source web interface language implementation > becomes available, some of us can then see how far one can push a dynamic > web interface. I agree. That's just the way I see it :-) Who knows, if the Web becomes THAT dynamic, the Web interface and the rest of the Loci clients may merge...But are we to expect that Netscape will provide an all-pupose, cross-platform GUI widget set? Hmmm. It does sound hard to believe...We'll wait and take the conservative route here. Jeff -- J.W. 
Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From hinsen at cnrs-orleans.fr Fri Jan 22 04:15:28 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] Raynald joins In-Reply-To: <36A77C94.F730D99C@bc.edu> (bizzaro@bc.edu) References: <36A77C94.F730D99C@bc.edu> Message-ID: <199901220915.KAA19818@dirac.cnrs-orleans.fr> > Raynald and Konrad, do you think we should have French and German translations > of the Loci Web pages as well? I'd say yes, but not now. As soon as there is code that people can actually use, it makes sense. I think we can safely expect everyone interested in *development* to be able to deal with a website in English. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Fri Jan 22 04:25:23 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] Re: Web interface In-Reply-To: <36A78E3E.4806760A@bc.edu> (bizzaro@bc.edu) References: <36A78E3E.4806760A@bc.edu> Message-ID: <199901220925.KAA21138@dirac.cnrs-orleans.fr> > I am not really anti-Perl, but Python can handle much of what Perl can, and I > think we should not try to mix Perl and Python. I don't know if there is a GD > module for Python. We'll have to look (Konrad, do you know of one?). There There is, but it's no longer maintained (http://alumni.dgs.monash.edu.au/~richard/gdmodule/). The module of choice for creating graphics in Python is the Python Imaging Library (http://www.python.org/sigs/image-sig/Imaging.html). 
I have used it for some small tasks and it works as advertised. > Great! But let's see what we can do using standard CGI over JavaScript (see > my last message). If we have to use it, then we have to. Anyone interested in Web interfaces should have a look at Zope (http://www.zope.org). All I have used myself is the module ZPublisher, which is essentially an object-oriented CGI library (but I promise that once you have used it you don't want to deal with plain CGI any more). Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Fri Jan 22 04:29:34 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] Web interface In-Reply-To: (message from Carlos Maltzahn on Thu, 21 Jan 1999 14:07:02 -0700 (MST)) References: Message-ID: <199901220929.KAA21140@dirac.cnrs-orleans.fr> > JavaScript is open source isn't it? It's part of Mozilla. Supposedly the The problem with JavaScript is not licensing, but compatibility. I haven't tried myself, but those who did try to write non-trivial JavaScript code supposed to work in all popular browsers tell me that it's not a pleasant experience. > absolute positioning of elements (CSS2/DOM). I'm not sure how soon Mozilla > is going to support all this, but I suspect within this year. And how long until it works reliably? It seems that Web browsers are the only software category whose quality standards are even below scientific code. 
> Because there is no open source version that supports a sufficiently > dynamic interface yet - but it might arrive within this year -, it is > probably a good idea to implement the UI in GTK first. Once the design > stabilizes and the open source web interface language implementation > becomes available, some of us can then see how far one can push a dynamic > web interface. That sounds like a good approach to me. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From hinsen at cnrs-orleans.fr Fri Jan 22 04:42:35 1999 From: hinsen at cnrs-orleans.fr (Konrad Hinsen) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] toolkit and data access/storage In-Reply-To: <36A7D7EB.4D52C752@bc.edu> (bizzaro@bc.edu) References: <36A7D7EB.4D52C752@bc.edu> Message-ID: <199901220942.KAA23302@dirac.cnrs-orleans.fr> > > I personally don't like CGI > > because of it's unflexible fork-request-once-response-once-terminate > > assumption. I suspect these tools are running for a longer time period and > > we would like to be able to find out about their state. > ^^^^^^^^^^^^^^^^^^^^^^^^ > Yes! This is something I realized would not work with standard CGI. These > analysis algorithms (server side only) will be longer lived than standard CGI > scripts. Some may take hours or days to complete. I would like to return There are solutions to this. Something I have considered for monitoring long-running MD simulations is a two-threaded program (remember that Python has very nice threading support) with one thread running the simulation and the other one running the Zope HTTP server (which is a specialized Web server for ZPublisher). 
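The two-threaded arrangement can be sketched with the Python standard library alone. Note the substitution: this uses `http.server` rather than the Zope HTTP server or ZPublisher, and the `simulate`, `status_json`, `StatusHandler` and `run_monitored` names are illustrative assumptions, with a sleep standing in for real simulation work.

```python
import json
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

# Global state shared by the simulation thread and the web server thread.
state = {"step": 0, "total": 0, "done": False}
lock = threading.Lock()


def simulate(total_steps, delay=0.0):
    """Stand-in for a long-running simulation that records its progress."""
    with lock:
        state.update(step=0, total=total_steps, done=False)
    for step in range(1, total_steps + 1):
        time.sleep(delay)  # placeholder for one step of real work
        with lock:
            state["step"] = step
    with lock:
        state["done"] = True


def status_json():
    """Snapshot of the shared state, as the web server would report it."""
    with lock:
        return json.dumps(state, sort_keys=True)


class StatusHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = status_json().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def run_monitored(total_steps=1000, port=8000):
    """Run the simulation in one thread and serve its state from another."""
    worker = threading.Thread(
        target=simulate, args=(total_steps, 0.01), daemon=True
    )
    worker.start()
    HTTPServer(("127.0.0.1", port), StatusHandler).serve_forever()
```

Calling `run_monitored()` and pointing a browser at the chosen port would show the current step while the worker thread is still running.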
Since threads share global data, the Web server could always access the state of the simulation and provide any information the user wants. Also have a look at the PCGI (persistent CGI) http://starship.skyport.net/crew/jbauer/persistcgi/ system, which is more generic. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hinsen@cnrs-orleans.fr Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.55.69 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- From lahondes at pasteur.fr Fri Jan 22 06:42:24 1999 From: lahondes at pasteur.fr (Raynald de =?iso-8859-1?Q?Lahond=E8s?=) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] Raynald joins References: <36A78040.136F6770@bc.edu> Message-ID: <36A86420.1157AFF9@pasteur.fr> "J.W. Bizzaro" wrote: > Raynald and Konrad, do you think we should have French and German translations > of the Loci Web pages as well? I think this is the kind of thing you hope to find on internet, don't you ? -- Raynald de Lahondes Unite des Virus Oncogenes - Departement de Biotechnologie Institut Pasteur - 25, rue du Docteur Roux 75724 Paris Cedex 15 - FRANCE tel: 01.45.68.84.54 - fax: 01.40.61.30.33 - cellular: 06.15.65.85.08 email: lahondes@pasteur.fr From justin at ukans.edu Fri Jan 22 15:31:08 1999 From: justin at ukans.edu (Justin Bradford) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] tulip mailing list Message-ID: I've set up a tulip mailing list at: tulip-list@busboy.sped.ukans.edu It's running majordomo, so for new people to subscribe, they send mail to majordomo@busboy.sped.ukans.edu with "subscribe tulip-list" in the message body. To unsubscribe, just do the same as above, substituting "subscribe" with "unsubscribe" in the message body. 
Also, I have it automatically insert the [Pipet Devel] in the subject, if it's not already there, and the reply-to header is set to the list. For those of you using procmail, I recommend keying it off the sender header. Here's a sample recipe:

:0:
* ^Sender: owner-tulip-list@busboy.sped.ukans.edu
mail/tulip

The current recipients are: justin@ukans.edu bizzaro@bc.edu hinsen@cnrs-orleans.fr jabbo@mindless.com Thomas.Sicheritz@molbio.uu.se david.lapointe@umassmed.edu rahul@photino.sid.rice.edu carlosm@moet.cs.colorado.edu lahondes@pasteur.fr hjm@cx408397-a.irvn1.occa.home.com Mail me if you have any questions or problems. Justin Bradford justin@ukans.edu From bizzaro at bc.edu Fri Jan 22 16:11:14 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] tulip mailing list References: Message-ID: <36A8E971.E7514D87@bc.edu> Great! Thanks Justin! I'll post this info on the Loci Web page ASAP. BTW, do you know how to create an HTML archive of the mailing list, for people to access later? And can I get a list from time to time of everyone on the list? Jeff bizzaro@bc.edu Justin Bradford wrote: > > I've set up a tulip mailing list at: > tulip-list@busboy.sped.ukans.edu > > It's running majordomo, so for new people to subscribe, they send mail to > majordomo@busboy.sped.ukans.edu with "subscribe tulip-list" in the message > body. > > To unsubscribe, just do the same as above, substituting "subscribe" with > "unsubscribe" in the message body. > > Also, I have it automatically insert the [Pipet Devel] in the subject, if it's > not already there, and the reply-to header is set to the list. > > For those of you using procmail, I recommend keying it off the sender > header.
Here's a sample recipe: > :0: > * ^Sender: owner-tulip-list@busboy.sped.ukans.edu > mail/tulip > > The current recipients are: > justin@ukans.edu > bizzaro@bc.edu > hinsen@cnrs-orleans.fr > jabbo@mindless.com > Thomas.Sicheritz@molbio.uu.se > david.lapointe@umassmed.edu > rahul@photino.sid.rice.edu > carlosm@moet.cs.colorado.edu > lahondes@pasteur.fr > hjm@cx408397-a.irvn1.occa.home.com > > Mail me if you have any questions or problems. > > Justin Bradford > justin@ukans.edu -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Fri Jan 22 21:15:00 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] batch processing Message-ID: <36A930A4.3CE49B7A@bc.edu> Carlos, I did a search on freshmeat.net for batch processing systems, and I found the two below. They are both GNU GPL, but Queue looks like it will do more than we need. Funny things is, Queue was the top item on the page when I first connected to Freshmeat :-) GNU Queue: http://bioinfo.mbb.yale.edu/~wkrebs/queue.html Generic NQS: http://www.gnqs.org/home.htm Let me know what you think of these and what we might be able to do. Thanks. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Fri Jan 22 21:35:41 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] more batch processing Message-ID: <36A9357D.C758271A@bc.edu> One more, CERN NQS. It is "freely available". I don't know the license or the Web site, but here's the FTP site: ftp://shift.cern.ch/pub/NQS/ Looking at Generic NQS (last e-mail), I think it may be the way to go. GNU Queue is for "homogeneous clusters of workstations". Jeff -- J.W. 
Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From hjm at cx408397-a.irvn1.occa.home.com Fri Jan 22 22:41:42 1999 From: hjm at cx408397-a.irvn1.occa.home.com (Harry Mangalam) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] more batch processing In-Reply-To: <36A9357D.C758271A@bc.edu> Message-ID: Hi All, Re: the question of queueing, load-sharing/leveling, and being able to track a pid persistently, isn't this something that should be addressed at the system level? i.e., isn't this something that should be taken up in concert with the kernel folks or maybe the gnome folks so that the tulip plan doesn't go off in a direction that is ideal for us but turns out to be the one NOT chosen by others? I HATE committees but I hate rewriting large chunks of code more. In the interim, if we need something to get this off the ground, a little hack could be written to take the pid of the process and track it thru a cgi call to look at the appropriate /proc entry. This approach would, of course, require a different shim for every OS (Irix is different than linux is different than Solaris, etc), but it would allow progress without committing to a possibly nonsensical path. Or we just ignore it for the present and write a dummy call HereBePersistantIds() that allows us to sidestep it. If it's gonna be done, it should be done right, but waiting for it to be done right doesn't have to block other efforts. Or...I'm completely offbase and forgive me... Cheers harry On Sat, 23 Jan 1999, J.W. Bizzaro wrote: > One more, CERN NQS. It is "freely available". I don't know the license or the > Web site, but here's the FTP site: > > ftp://shift.cern.ch/pub/NQS/ > > Looking at Generic NQS (last e-mail), I think it may be the way to go. GNU > Queue is for "homogeneous clusters of workstations". > > > Jeff > -- > J.W.
Bizzaro Phone: 617-552-3905 > Boston College mailto:bizzaro@bc.edu > Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ > -- > Cheers, Harry Harry J Mangalam, Developmental + Cell Biology Rm 4201, Biological Sciences II, UC Irvine, Irvine, CA, 92697 (949) 824 4824[vox], (949) 824 8551[fax], mangalam@uci.edu http://hornet.bio.uci.edu/~hjm/ From bizzaro at bc.edu Fri Jan 22 23:48:08 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:15 2006 Subject: [Pipet Devel] more batch processing References: Message-ID: <36A95488.68192128@bc.edu> Harry Mangalam wrote: > Re: the question of queueing, load-sharing/leveling, and being able to > track a pid persistently, isn't this something that should be addressed at > the system level? ie - isn't this something that should be taken up in > concert with the kernel folks or maybe the gnome folks so that the tulip > plan doesn't go off in a direction ideal for us but turns out to be the one > NOT chosen by others? > As you mentioned below, if we expect the kernel (Linux?) or GNOME developers to solve the problem, we (1) have to wait for these guys to do it, if they even want to, and (2) we end up with something that is platform (in this case Linux) dependent. > > In the interim, if we need something to get this off the ground, a little > hack could be writ to take the pid of the process and track it thru a cgi > call to look at the appropriate /proc entry. This approach would, of > course, require a different shim for every OS (Irix is different than linux > is different than Solaris, etc), but it would allow progress without > committing to a possibly nonsensical path. If there is some other way to do it, that won't require a different version of Loci for each flavor of UNIX (some on the team think it is bad enough we are ignoring Windows), that's fine with me. I think "all we need" is a binding from Python to GNQS (Generic NQS).
We can get the source code, but I don't think writing a binding will require that we recompile it. It shouldn't be all that bad. Regarding compatibility, GNQS has been ported to nearly all flavors of UNIX...just like Python and GTK and GNOME. I don't know if we can call it nonsensical. From what I read, it was one of the first of all UNIX batch systems, derived from the very first one used by NASA. Is it out of date? I don't know. > > Or we just ignore it for the present and write a dummy call > HereBePersistantIds() that allows us to sidestep it. If it's gonna be done, > it should be done right, but waiting for it to be done right doesn't have to > lock other efforts. We absolutely need to have the Paos and XML framework set up before we can even test something like a batch system. So I agree with you. Let's just pretend for now that we will have some system set up for doing this. Exactly what it is I think Justin and Carlos need to think about very carefully...I'll put the burden on someone else ;-) Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Sat Jan 23 00:17:33 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:16 2006 Subject: [Pipet Devel] template system Message-ID: <36A95B6D.BF276B19@bc.edu> Harry, Thinking about the whole Gatekeeper/batch system thing, we do need some sort of a template system to convert formatted ASCII output to XML. You know that command-line programs like yours will need to put the output into XML so that we can have the GUI clients show pretty pictures (very pretty--I'm looking for publication quality). But I don't want to require the authors of the command-line programs to change anything in their programs. What I envisioned is someone who wants to plug a new command-line program into the server-side of Loci, will write a text file that is a template for the Gatekeeper to convert the text into XML. 
I'm not sure just how it will work, but I think the template essentially needs to say "this much of the output is such and such, and that much is such and such". You know that XML is linear, and the ASCII output from the command-line program can be read one character at a time. Somehow we need to give the Gatekeeper instructions on making a conversion between two linear formats. Do you have any thoughts on this? Can you think of how you might make templates for tacg? Is this something you'd like to work on as a project? Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From hjm at cx408397-a.irvn1.occa.home.com Sat Jan 23 00:40:21 1999 From: hjm at cx408397-a.irvn1.occa.home.com (Harry Mangalam) Date: Fri Feb 10 19:18:16 2006 Subject: [Pipet Devel] template system In-Reply-To: <36A95B6D.BF276B19@bc.edu> Message-ID: I was planning to re-write the output specifically to generate XML (as a commandline flag request), but this is an interesting approach. In one sense it would be an XML preprocessor - some sort of tag or format that would be easy for the author of a cli app to insert in his output to hint to the XML preprocessor to 'treat this grid of x,y numbers as a flibber' or 'treat this column of numbers as a trippet'. The tag hint is breaking with your idea about keeping the output untouched, but sometimes a little hint is a big break - If a little fudging saves a lot of work, I'll go for the fudge. So yes, I'll think (and do) something about this. It will probably be necessary for me to actually re-write my output to get a handle on the issues that need to be addressed for other generic cli apps, but yes, I'll give it a shot. Cheers Harry On Sat, 23 Jan 1999, J.W. Bizzaro wrote: > Harry, > > Thinking about the whole Gatekeeper/batch system thing, we do need some sort of > a template system to convert formatted ASCII output to XML. 
> > You know that command-line programs like yours will need to put the output into > XML so that we can have the GUI clients show pretty pictures (very pretty--I'm > looking for publication quality). But I don't want to require the authors of > the command-line programs to change anything in their programs. > > What I envisioned is someone who wants to plug a new command-line program into > the server-side of Loci, will write a text file that is a template for the > Gatekeeper to convert the text into XML. I'm not sure just how it will work, > but I think the template essentially needs to say "this much of the output is > such and such, and that much is such and such". You know that XML is linear, > and the ASCII output from the command-line program can be read one character at > a time. Somehow we need to give the Gatekeeper instructions on making a > conversion between two linear formats. > > Do you have any thoughts on this? Can you think of how you might make templates > for tacg? Is this something you'd like to work on as a project? > > > Jeff > -- > J.W. Bizzaro Phone: 617-552-3905 > Boston College mailto:bizzaro@bc.edu > Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ > -- > Cheers, Harry Harry J Mangalam, Developmental + Cell Biology Rm 4201, Biological Sciences II, UC Irvine, Irvine, CA, 92697 (949) 824 4824[vox], (949) 824 8551[fax], mangalam@uci.edu http://hornet.bio.uci.edu/~hjm/ From bizzaro at bc.edu Sun Jan 24 19:39:07 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:16 2006 Subject: [Pipet Devel] and another thing... Message-ID: <36ABBD27.A146A6CE@bc.edu> To expand upon the Globetrotter analogy, each locus must be aware of the other loci in the installation, without having anything added to the code. In other words, the bball players must be able to handle the addition and subtraction of other players. 
If someone has Loci installed with say 10 tools, and they download an 11th tool from some third-party developer, the original 10 must know the 11th is there and what it can do. And the 11th must know that there are 10 others and what each of them can do. This is similar to what I had planned for the Gatekeeper. The Gatekeeper will know what tools are installed locally (on the server) and what each can do. This information is reported to the connecting client, so that the client loci will then know what analysis loci are there. I actually have that expanded a bit to include a hub server at Lowell (or wherever) that will have all the Gatekeepers on the Internet registered, so that someone with Loci can see all of the analysis loci available in the world. For both the client and server sides, we need databases to keep track of what loci are present and what they can do. In the case of the client (GUI) loci, all of the public objects need to be recorded. Carlos, does this make sense regarding Paos? Paos is an active object server. Does it have a way to catalog objects that may change as the configuration changes? This is VERY important! Can you guys see now, with this model in mind, just how difficult it would be to get Loci to operate harmoniously with more than one language at the core? Sweet Georgia Brown... Jeff bizzaro@bc.edu -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From bizzaro at bc.edu Sun Jan 24 22:54:35 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:16 2006 Subject: [Pipet Devel] porta Message-ID: <36ABEAE4.3146619B@bc.edu> I wrote: > This is similar to what I had planned for the Gatekeeper. The Gatekeeper will > know what tools are installed locally (on the server) and what each can do. > This information is reported to the connecting client, so that the client loci > will then know what analysis loci are there.
In case this isn't clear to anyone, the function of the Gatekeeper (and possibly a client side locus that handles all calls to the Gatekeeper--I'm calling it the "Porta Internet", Latin for Internet portal) is to make the analysis algorithms transparent to the client loci. In other words, the information will come from a Python module (Gatekeeper via Porta Internet) and be nicely packaged as an XML object. The clients must have no idea they are communicating with non-Python programs. They act as though Porta Internet is just another client. Oh, and that goes for CORBA as well. We should have a Porta CORBA that turns the CORBA objects into Python/Paos/Loci objects, making Perl, etc. transparent to the clients. (Maybe we should use CORBA to connect Perl to Loci, since we've been talking about using Perl, unless anyone knows a better way.) Think Globetrotters, not Washington Generals ;-) Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- From carlosm at mroe.cs.colorado.edu Mon Jan 25 02:43:34 1999 From: carlosm at mroe.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:16 2006 Subject: [Pipet Devel] and another thing... Message-ID: [Jeff] > For both the client and server sides, we need databases to keep track of > what loci are present and what they can do. In the case of the client > (GUI) loci, all of the public objects need to be recorded. Carlos, does > this make sense regarding Paos? Paos is an active object server. Does > it have a way to catalog objects that may change as the configuration > changes? This is VERY important! I believe so. Depending on how you design the object schema you can register notification requests with a Paos server to notify you of any changes as well as any additions/removals. Notification requests have the same power as regular queries. 
So all you need to do is to define the catalog of objects that may change as a query and register it with Paos. Additions/removals are handled by formulating queries for changes in sets of objects. * * * I'm currently in a paper deadline crunch - sorry - but I'm working on the Paos documentation/tutorial "real soon now". :) Carlos From hjm at cx408397-a.irvn1.occa.home.com Tue Jan 26 21:18:48 1999 From: hjm at cx408397-a.irvn1.occa.home.com (Harry Mangalam) Date: Fri Feb 10 19:18:16 2006 Subject: [Pipet Devel] What is Paos? In-Reply-To: <36AE660B.4D9707B7@bc.edu> Message-ID: Are there some docs or description on Paos? I'm not familiar with it, although it sounds like it might be some kind of object database with a little brokerage mixed in...? Cheers, Harry Harry J Mangalam, Developmental + Cell Biology Rm 4201, Biological Sciences II, UC Irvine, Irvine, CA, 92697 (949) 824 4824[vox], (949) 824 8551[fax], mangalam@uci.edu http://hornet.bio.uci.edu/~hjm/ From carlosm at mroe.cs.colorado.edu Tue Jan 26 21:54:30 1999 From: carlosm at mroe.cs.colorado.edu (Carlos Maltzahn) Date: Fri Feb 10 19:18:16 2006 Subject: [Pipet Devel] What is Paos? In-Reply-To: Message-ID: Unfortunately, the documentation of Paos is currently very poor. I'm working on improving that (but don't expect anything before next week). If you look at http://www.cs.colorado.edu/~carlosm/software.html you will find a paper on Paos in German - use babelfish for (poor) translation. Another bit of documentation is in ftp://www.cs.colorado.edu/users/carlosm/README.paos. Sorry, Carlos On Tue, 26 Jan 1999, Harry Mangalam wrote: > Are there some docs or description on Paos? I'm not familiar with it, > although it sounds like it might be some kind of object database with a > little brokerage mixed in...?
> > > Cheers, > Harry > > Harry J Mangalam, Developmental + Cell Biology > Rm 4201, Biological Sciences II, UC Irvine, Irvine, CA, 92697 > (949) 824 4824[vox], (949) 824 8551[fax], mangalam@uci.edu > http://hornet.bio.uci.edu/~hjm/ > > From bizzaro at bc.edu Tue Jan 26 23:47:12 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:16 2006 Subject: [Pipet Devel] [Fwd: Express] Message-ID: <36AE9A50.107E6D32@bc.edu> Fellow Locians, This is a reply to a message I sent to Conrad Parker a few months ago. I wrote to him about his GTK/GNOME Web browser, "Express". I thought it might serve as the core for XML display in Loci. Look at his comment near the end, about Aube and GCL. Interesting. I'll send him a message back right now. Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ -- -------------- next part -------------- An embedded message was scrubbed... From: Conrad Parker Subject: Re: Express Date: Wed, 27 Jan 1999 14:10:03 +1100 Size: 5716 Url: http://bioinformatics.org/pipermail/pipet-devel/attachments/19990127/984fef1d/attachment.mht From bizzaro at bc.edu Wed Jan 27 00:30:51 1999 From: bizzaro at bc.edu (J.W. Bizzaro) Date: Fri Feb 10 19:18:16 2006 Subject: [Pipet Devel] Re: Express References: <364B8DC1.9B17FEEB@bc.edu> <19990127141003.H4759@cse.unsw.edu.au> Message-ID: <36AEA48B.CC29FAA5@bc.edu> Hi Conrad! Wow. I thought maybe you died or something ;-) Conrad Parker wrote: > I'm planning on making Express handle XML applications nicely, though I haven't > looked into it much yet. 
Insofar as my contribution involves writing a > browser which can support various XML applications, yes I'd like to be involved > with TULIP :) Beyond that I don't think I can help much - my knowledge of > biochemistry doesn't extend too much beyond high school and brief encounters in > studying information theory and genetic algorithms :) We have 9 bioscientists on our team, so you need not worry about that. Time has passed, and at this point we are looking for code to a generic XML browser, but something we can build upon. Each GUI tool in Loci (the name Tulip is being phased out) will be a special-purpose XML browser and will support one (probably one) XML definition. Also, we are working with Python/C with bindings to GTK/GNOME. So, we need something we can wrap some Python code around. We do have bindings to all of the GTK/GNOME widgets, so we may be able to make the whole thing in Python. But of course native C will be faster. Which would you recommend? I haven't looked into the speed requirements for a browser, but what we need will be graphics intensive. > cool :) looking at your developer's page, if Jay Painter is working on the > BSML implementation then it should probably be ok for me to just do the web > browser support (which will of course give networking etc). Unfortunately, Jay is tied up with GNOME development for RedHat and may not come back to Loci. He was our only GTK/GNOME expert. So, I guess we'll have to start with page one of the tutorial :-P > The reason I mention it is because its architecture is similar to your ideas > for TULIP. In particular, looking at your ideas for GCL (do you have an > implementation yet?) it looks like the way you want to be able to connect up > components (tools) is similar to the way aube works - however aube's system is > currently entirely graphical (ie. you can connect up various components, but > not load/save the state of connections). 
I am looking at using XML to handle > this information, as it can save parameters of each component more cleanly than > a scripting language could. We are just now trying to implement an active object server for Python (Paos, by Carlos Maltzahn). And we're talking about making a workflow system so that XML objects can be juggled and tracked. I agree that XML is a nice way to handle this sort of data. > So, if you'd like to save yourself some coding you can use the system I've got > going with aube, including some widgets for selecting inputs and connecting up > components. I'll soon be adding an overview widget for editing the whole graph > of connections Aube looks very nice! You know, we _really_ need a GTK guru to put the graphics together for GCL (all the glyphs and arrows). This may be even more helpful to us than an XML browser, and it wouldn't require any knowledge of biology. I imagine you might be able to make use of GCL in Aube or other programs (in fact, I think GCL is the way to go for user-friendly UNIX). Is this something you'd like to take on? Have you tried to make dnd icons and user-manipulated graphics with gnome-canvas? Jeff -- J.W. Bizzaro Phone: 617-552-3905 Boston College mailto:bizzaro@bc.edu Department of Chemistry http://www.uml.edu/Dept/Chem/Bizzaro/ --
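As a rough illustration of the point above about keeping component connections and their parameters in XML rather than in a scripting language, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the element names (workflow, component, param, connection) and the functions save_graph and load_graph are hypothetical and come from neither aube, GCL, nor Loci, and it uses the xml.etree.ElementTree module from the present-day Python standard library.

```python
import xml.etree.ElementTree as ET


def save_graph(components, connections):
    """Serialize a component graph to an XML string.

    components:  dict mapping a component name to a dict of its parameters
    connections: list of (source_name, target_name) pairs
    The element and attribute names are hypothetical, chosen for this sketch.
    """
    root = ET.Element("workflow")
    for name, params in components.items():
        comp = ET.SubElement(root, "component", name=name)
        for key, value in params.items():
            # Parameters are stored as strings; typing is left to the reader.
            ET.SubElement(comp, "param", name=key, value=str(value))
    for source, target in connections:
        ET.SubElement(root, "connection", source=source, target=target)
    return ET.tostring(root, encoding="unicode")


def load_graph(xml_text):
    """Rebuild the (components, connections) pair from save_graph() output."""
    root = ET.fromstring(xml_text)
    components = {
        comp.get("name"): {
            p.get("name"): p.get("value") for p in comp.findall("param")
        }
        for comp in root.findall("component")
    }
    connections = [
        (c.get("source"), c.get("target")) for c in root.findall("connection")
    ]
    return components, connections
```

Calling save_graph({"blast": {"database": "nr"}, "viewer": {}}, [("blast", "viewer")]) and passing the result to load_graph should give back the same two structures, with the caveat that every parameter value comes back as a string.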