From landman at scientificappliance.com Fri Apr 12 15:52:34 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Welcome, and starting things off ...
Message-ID: <1018641154.4216.77.camel@protein.dtw.macsch.com>

Thanks for joining this discussion list. As the introductory message
implies, this group is meant to be an open forum for discussing all
topics relating to development issues for those engaged in
bioinformatics and related efforts.

I personally use a very broad definition of bioinformatics, as an
all-encompassing study of the information content of living systems,
and how the life processes make use of the information. This means in
my mind that we are looking at classical genomics (primary structure),
computational chemistry (secondary structure, SAR, etc), computational
biophysics, and many other topics. Even if this definition of
bioinformatics doesn't necessarily match yours, please feel free to
contribute your thoughts and comments.

In order to try to jumpstart a discussion, I would like to talk briefly
about some of the work I am currently doing. My work as of late has
been in designing and building scalable computing systems (software and
hardware). I have been using SOAP and XML as well as object oriented
Perl. Some of the issues I am running into on the development side
involve creating persistent objects to temporarily store complex data
structures on my server.

Basically the type of thing that I would like to do is to take either a
hash or an XML document such as the following:

    localhost 6931 10 http bio0 jobdb rundb
and have this persistent across calls. There are several "easy" ways
to do this. One method is to "serialize" this data structure, and to
store the serialized version. Many "XML databases" do exactly that.
What would be nice is to have a simple set of (Perl or Perl-callable)
methods to trivially store and retrieve this structure or elements of
this structure. I have written a wrapper above the Perl DBI to make it
very OO looking/acting, and maybe this is the route I need to take. If
anyone has done anything related to this, I would like to hear about
it.

As indicated, I am using SOAP, and the SOAP::Lite module in Perl. This
makes building "web visible" services quite simple. Of course, nothing
is without cost. What I have found is that after calling a SOAP new
method to instantiate the object, I cannot use the traditional Perl OO
methods to decorate the object with attributes. That is, mutators and
accessors such as

    #
    # mutator
    #
    sub set_thingy {
        my ($self, $thingy) = @_;
        return $self->{'thingy'} = $thingy;
    }

    #
    # accessor
    #
    sub get_thingy {
        my ($self) = @_;
        if (exists($self->{'thingy'})) {
            return $self->{'thingy'};
        } else {
            return undef;
        }
    }

do not work anymore (they work great outside of SOAP::Lite). This is a
little bothersome, as I use these types of accessors/mutators all the
time. Hence my need for persistent object storage on the remote host.

On an unrelated note, I am curious as to what analysis techniques people
are using for proteomic data. At this stage, I have seen image
processing work, various bits of data mining and regression tests, etc.
It of course is highly dependent upon the processes being studied. My
interests are specifically in those analyses which are computationally
demanding.

Thanks, and once again, welcome to the group!

From landman at scientificappliance.com Fri Apr 12 17:01:28 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Another question...
Message-ID: <1018645288.28777.49.camel@squash.canton01.mi.comcast.net>

What issues are people running into for their development work?

Most of my work is done on Linux machines and clusters, so my issues are
related to the compilers (gcc et al) and related tools.

A question I have is, for those doing algorithm development, would a
very high level environment (e.g. Matlab-ish) make sense for
bioinformatics? That is, it would not be so difficult to imagine
setting up a language where you could perform calculations like this:

    [high_scoring_sequences, low_scoring_sequences] =
        NCBI_BLAST("sequence_file", [database1, database2]);
    alignment = ClustalW(high_scoring_sequences[subset]);
    .
    .
    .

I have been thinking about doing something like this for a long time.
You can get near to this concept using BioPerl.

Joe

From titus at caltech.edu Fri Apr 12 18:56:22 2002
From: titus at caltech.edu (Titus Brown)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Another question...
In-Reply-To: <1018645288.28777.49.camel@squash.canton01.mi.comcast.net>
References: <1018645288.28777.49.camel@squash.canton01.mi.comcast.net>
Message-ID: <20020412225622.GG30383@caltech.edu>

-> What issues are people running into for their development work?

None! Well, not enough time...

-> Most of my work is done on Linux machines and clusters, so my issues are
-> related to the compilers (gcc et al) and related tools.

Everything pretty much works; I'm not using any oddball software for
anything, so *shrug* it all works. Mainly I'm irritated that a new
version of Python comes out every week...

-> A question I have is, for those doing algorithm development, would a
-> very high level environment (e.g. Matlab-ish) make sense for
-> bioinformatics?
-> That is, it would not be so difficult to imagine
-> setting up a language where you could perform calculations like this:
->
-> [high_scoring_sequences, low_scoring_sequences] = NCBI_BLAST("sequence_file", [database1, database2]);
-> alignment=ClustalW(high_scoring_sequences[subset]);
-> .
-> .
-> .
->
-> I have been thinking about doing something like this for a long time.
-> You can get near to this concept using BioPerl.

Who would be your target audience? Other developers, or biology
researchers?

In neuroscience Matlab seems to be enshrined as the language of choice,
so it might be a good way to give bioinformatics capabilities to
biologists...

--t

From biorst at hotmail.com Tue Apr 16 03:44:00 2002
From: biorst at hotmail.com (Rannveig Storaa)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Fwd: [BiO BB] New to bioinformatics
Message-ID:

>From: "Rannveig Storaa"
>Reply-To: bio_bulletin_board@bioinformatics.org
>To: bio_bulletin_board@bioinformatics.org
>Subject: [BiO BB] New to bioinformatics
>Date: Sat, 13 Apr 2002 16:49:08 +0200
>
>Hello everybody,
>I'd like to know if there is any way to use EMBOSS in a Windows
>environment.
>Are there any available tools which are able to make a multiple
>alignment and translate the aligned sequences altogether?
>Thanks
>
>
>Rannveig
>
>_________________________________________________________________
>Download MSN Explorer for free at http://explorer.msn.se/intl.asp
>
>_______________________________________________
>BiO_Bulletin_Board maillist - BiO_Bulletin_Board@bioinformatics.org
>http://bioinformatics.org/mailman/listinfo/bio_bulletin_board

From martingoodson at hotmail.com Tue Apr 16 08:26:27 2002
From: martingoodson at hotmail.com (martin goodson)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Fwd: [BiO BB] New to bioinformatics
Message-ID:

See Jemboss, a Java version of EMBOSS
(http://www.hgmp.mrc.ac.uk/Registered/Option/emboss.html).
Note that registration is required.

martin

>From: "Rannveig Storaa"
>Reply-To: biodevelopers@bioinformatics.org
>To: biodevelopers@bioinformatics.org, bio_bulletin_board@bioinformatics.org
>Subject: [Biodevelopers] Fwd: [BiO BB] New to bioinformatics
>Date: Tue, 16 Apr 2002 09:44:00 +0200
>
>>From: "Rannveig Storaa"
>>Reply-To: bio_bulletin_board@bioinformatics.org
>>To: bio_bulletin_board@bioinformatics.org
>>Subject: [BiO BB] New to bioinformatics
>>Date: Sat, 13 Apr 2002 16:49:08 +0200
>>
>>Hello everybody,
>>I'd like to know if there is any way to use EMBOSS in a Windows
>>environment.
>>Are there any available tools which are able to make a multiple
>>alignment and translate the aligned sequences altogether?
>>Thanks
>>
>>
>>Rannveig

From martingoodson at hotmail.com Tue Apr 16 08:59:08 2002
From: martingoodson at hotmail.com (martin goodson)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Welcome, and starting things off ...
Message-ID:

>On an unrelated note, I am curious as to what analysis techniques people
>are using for proteomic data. At this stage, I have seen image
>processing work, various bits of data mining and regression tests, etc.
>It of course is highly dependent upon the processes being studied. My
>interests are specifically in those analyses which are computationally
>demanding.
>

Are you thinking of things like peptide mass fingerprinting and peptide
ID via tandem mass spec/database searching?

Martin

From landman at scientificappliance.com Thu Apr 18 05:03:54 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Quick question on distributed development
Message-ID: <1019120635.24362.40.camel@protein.dtw.macsch.com>

Folks:

I wanted to know if anyone is looking at targeting grids or distributed
computing via Platform, Entropia/UD/DataSynapse, or Globus/Legion
toolkits. I am curious about any experiences or thoughts on these
systems, and their corresponding SDKs.

Thanks!

Joe

From landman at scientificappliance.com Wed Apr 24 08:58:56 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Object Oriented Perl issue
Message-ID: <1019653141.2548.137.camel@protein.dtw.macsch.com>

Folks:

My day job (ok, one of the hats I wear) has me coding in Perl. I am
using the OO methodologies with modules. And of course, I had a
question on the "rightness" of a particular way of doing things. It
works, and works nicely (except under SOAP::Lite, which I do not
understand).

Here is the problem.
When you instantiate an object, usually you do something like this:

    use Foo::Bar;
    $object = Foo::Bar->new();

If you want to pass specific parameters to the new object, you typically
put an argument list in the parentheses to new. This can be a scalar,
an array, or a hash. You just have to parse it correctly in your
Foo::Bar::new method. This is fine, no issues here.

Now suppose you want to attach attributes to the object (think of the
object as having methods and attributes, attributes being "localized"
variables to the object). So you would use attribute accessor/mutator
methods (aka get/set). If you read Damian Conway's book on the subject,
he indicates that what you should do is to build one accessor/mutator
per attribute. So if you have many attributes, you have many
accessor/mutator pairs. This means that your code is sprinkled with:

    sub get_fragment {
        my $self = shift;
        return $self->{'fragment'};
    }

    sub set_fragment {
        my ($self, $value) = @_;
        return $self->{'fragment'} = $value;
    }

This is fine, until you hit the second attribute, and you immediately
start thinking that this is a waste of time, typing, will slow down the
Perl system, etc. To use this in your code, you do something like
this:

    $object->set_fragment(10);
    .
    .
    printf "Fragment is %i\n", $object->get_fragment();

So thinking about this, I realized that you could use a general
accessor/mutator pair and the auto-vivification capability of Perl, and
deal with any number of attributes with a single pair, like this:

    sub get_attribute {
        my ($self, $attribute) = @_;
        if (exists($self->{$attribute})) {
            return $self->{$attribute};
        } else {
            return undef;
        }
    }

    sub set_attribute {
        my ($self, $attribute, $value) = @_;
        return $self->{$attribute} = $value;
    }

To use this in your code, you would do something like:

    $object->set_attribute('fragment', 10);
    .
    .
    printf "Fragment is %i\n", $object->get_attribute('fragment');

So this works. It works quite well.
Except, for some reason, in a remotely instantiated object under
SOAP::Lite (then again, the other method also doesn't work there).
Basically in your new method, you include a few lines to copy the
argument hash to the attributes like this:

    sub new {
        my ($class, %args) = @_;
        my $self = {};
        bless $self, ref $class || $class;
        while (my ($key, $value) = each %args) {
            $self->set_attribute($key, $value);
        }
        return $self;
    }

and your call to new now looks like this:

    $object = Foo::Bar->new(
        'attribute-1' => 'value1',
        'attribute-2' => 'value2',
        ...
    );

You can use Data::Dumper and print Dumper(\$object) to see that this
really works.

My question is whether or not this is considered good OO style. I am
effectively relying on the auto-vivification (the automatic creation of
a new variable) to build the attributes. This means that in theory, I
can have an unlimited number of attributes. I presume that the original
framers of OO would not be happy with this.

Another question revolves around what could go wrong with this. I am
not trying to enforce hiding in OO here, just localization of an
object's variables with its methods.

Thoughts and comments welcome. Is this an issue in Python (everything
is an object) or similar?

Joe

From bub at io.com Wed Apr 24 17:25:14 2002
From: bub at io.com (Steve O)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Object Oriented Perl issue
Message-ID: <20020424162514.A12845@hagbard.io.com>

Joe,

You might want to check out Damian Conway's book Object Oriented Perl.
He shows how to use Perl's AUTOLOAD feature to create get/set
methods on the fly if they don't exist. This allows you to write
get/set methods that have interesting behavior, and leave the simple
ones for Perl to make up.
Here's the example:

    sub AUTOLOAD {
        no strict 'refs';    # the glob assignments below use symbolic references
        my ( $self, $newval ) = @_;

        # get_... call: install a real accessor, then return the value
        if ($AUTOLOAD =~ /.*::get(_\w+)/ && $self->_accessable( $1, 'read' )) {
            my $attr_name = $1;
            *{$AUTOLOAD} = sub { return $_[0]->{$attr_name} };
            return $self->{$attr_name};
        }
        # set_... call: install a real mutator, then store the value
        elsif ($AUTOLOAD =~ /.*::set(_\w+)/ && $self->_accessable( $1, 'write' )) {
            my $attr_name = $1;
            *{$AUTOLOAD} = sub { $_[0]->{$attr_name} = $_[1]; return };
            return $self->{$attr_name} = $newval;
        }

        croak "No such method: $AUTOLOAD";    # croak comes from the Carp module
    }

The argument I think OO people would have with your get_attribute
and set_attribute functions is that they allow access to all
of the object's state and allow no extra processing. The methods
are providing no extra benefit over the simpler:

    $object->{fragment} = "foo";
    print "fragment: ", $object->{fragment}, "\n";

-steve

From landman at scientificappliance.com Thu Apr 25 00:07:16 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] testing
Message-ID: <1019707636.8031.29.camel@protein.dtw.macsch.com>

This is just a test (of the bioinformatics.org mail system) after an
upgrade.

From landman at scientificappliance.com Thu Apr 25 10:29:16 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Object Oriented Perl issue
In-Reply-To: <20020424162514.A12845@hagbard.io.com>
References: <20020424162514.A12845@hagbard.io.com>
Message-ID: <1019744957.8031.139.camel@protein.dtw.macsch.com>

On Wed, 2002-04-24 at 17:25, Steve O wrote:
> Joe,
>
> You might want to check out Damian Conway's book Object Oriented Perl.
> He shows how to use perl's AUTOLOAD feature to create get/set
> methods on the fly if they don't exist. This allows you to write
> get/set methods that have interesting behavior, and leave the simple
> ones for perl to make up.

I saw this in the book and wondered if it introduced too much complexity
in what I thought should be a simple problem.

[...]
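
For readers who want to try the AUTOLOAD example quoted above, here is
a minimal self-contained sketch. It is not the book's exact code: the
Fragment class, the %access table, and the attribute names are
hypothetical, and _accessable keeps the spelling used in the quoted
snippet.

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Fragment;    # hypothetical class, for illustration only
use Carp;

our $AUTOLOAD;

# Hypothetical access table: which attributes may be read or written.
my %access = (
    _fragment => { read => 1, write => 1 },
    _id       => { read => 1, write => 0 },
);

sub new {
    my ($class, %args) = @_;
    my $self = { _fragment => $args{fragment}, _id => $args{id} };
    return bless $self, ref $class || $class;
}

# True only if the attribute is in the table and the mode is allowed.
sub _accessable {
    my ($self, $attr, $mode) = @_;
    return exists($access{$attr}) && $access{$attr}{$mode};
}

sub AUTOLOAD {
    no strict 'refs';    # the glob assignments use symbolic references
    my ($self, $newval) = @_;

    if ($AUTOLOAD =~ /.*::get(_\w+)/ && $self->_accessable($1, 'read')) {
        my $attr_name = $1;
        # Install a real method so later calls bypass AUTOLOAD.
        *{$AUTOLOAD} = sub { return $_[0]->{$attr_name} };
        return $self->{$attr_name};
    }
    elsif ($AUTOLOAD =~ /.*::set(_\w+)/ && $self->_accessable($1, 'write')) {
        my $attr_name = $1;
        *{$AUTOLOAD} = sub { $_[0]->{$attr_name} = $_[1]; return };
        $self->{$attr_name} = $newval;
        return;
    }

    croak "No such method: $AUTOLOAD";
}

sub DESTROY { }    # defined explicitly so AUTOLOAD is not invoked at destruction

package main;

my $obj = Fragment->new(fragment => 10, id => 7);
$obj->set_fragment(12);
printf "Fragment is %i\n", $obj->get_fragment();

# _id is read-only in the table, so no set_id method is ever created.
eval { $obj->set_id(99) };
print "set_id refused: $@" if $@;
```

The access table is where this differs from a generic
get_attribute/set_attribute pair: an attribute that is not listed, or
not listed for that mode, never gets a method at all.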

> The argument I think OO people would have with your get_attribute
> and set_attribute functions is that it allows access to all
> of the object's state and allows no extra processing. The methods
> are providing no extra benefit over the simpler:

So what you are saying is that my get/set pair lets you run roughshod
over the object (which is a potential security/stability risk) without
bound. The autogenerated accessors/mutators do not. That is a good
argument to use the autogenerated ones.

> $object->{fragment}="foo";
> print "fragment: ",$object->{fragment},"\n";

From landman at scientificappliance.com Thu Apr 25 10:42:04 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:03 2006
Subject: [Biodevelopers] On security models for networked applications
Message-ID: <1019745724.8031.153.camel@protein.dtw.macsch.com>

Folks:

(I hope others will start posting here as well :))

I am thinking about security issues for my networked application.
Specifically how to authenticate a user properly, so a server can trust
the client talking to it is doing so on behalf of the correct user, and
the client can trust that the server it is talking to in fact represents
a valid server for the application, and can authenticate this.

I haven't read up on things like public key infrastructures and the
like. If someone else has run into this problem before, and is willing
to share some of what they learned, I think that would be valuable to
the list.

Basically I see the security issue broken up into sections.

1) transport security: being able to send data/information without
   compromise of the information (generally handled by TLS, SSL, etc)

2) user authentication: being able to verify the identity of the user
   of the service

3) server authentication: being able to verify the identity of the
   server and service (generally handled by certification authorities
   and server certificates).
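
As one hedged illustration of item 2, each request can carry a keyed
hash (HMAC) computed from a secret that the client and server both
hold, so the server can check every transaction without the secret
itself travelling with it. This is only a sketch: the user name, the
secret, the message layout, and the choice of SHA-256 are assumptions
made for the example, and it does nothing for items 1 and 3, which
still need TLS/SSL and server certificates.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Hypothetical per-user secrets, provisioned out of band.
my %secret_for = ( joe => 'example-shared-secret' );

# Client side: sign the user name, a timestamp, and the request body.
sub sign_request {
    my ($user, $timestamp, $body, $secret) = @_;
    return hmac_sha256_hex(join("\n", $user, $timestamp, $body), $secret);
}

# Server side: recompute the signature from its own copy of the secret.
sub verify_request {
    my ($user, $timestamp, $body, $signature, $now, $max_age) = @_;
    my $secret = $secret_for{$user};
    return 0 unless defined $secret;                  # unknown user
    return 0 if abs($now - $timestamp) > $max_age;    # stale; limits the replay window
    my $expected = sign_request($user, $timestamp, $body, $secret);
    return $expected eq $signature ? 1 : 0;           # note: not a constant-time compare
}

my $now  = time;
my $body = '<job><name>blast</name></job>';    # made-up request payload
my $sig  = sign_request('joe', $now, $body, $secret_for{joe});

print "genuine request:  ", verify_request('joe', $now, $body, $sig, $now, 300), "\n";
print "tampered request: ", verify_request('joe', $now, "$body ", $sig, $now, 300), "\n";
```

Because every request is checked on its own, the server never has to
move a session into a long-lived "logged in" state.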

I look at each transaction between server and client as needing to be
secure in the sense of the above list (and possibly others I have not
considered).

Are there any good discussions of this type of security in book or URL
formats? I am looking for practical examples I can use/learn from.

If you have any experience with these issues, please feel free to talk
about them here.

Thanks again!

Joe

From titus at caltech.edu Thu Apr 25 11:02:03 2002
From: titus at caltech.edu (Titus Brown)
Date: Tue Aug 8 04:23:03 2006
Subject: [Biodevelopers] On security models for networked applications
In-Reply-To: <1019745724.8031.153.camel@protein.dtw.macsch.com>
References: <1019745724.8031.153.camel@protein.dtw.macsch.com>
Message-ID: <20020425150203.GA28816@caltech.edu>

-> I am thinking about security issues for my networked application.
-> Specifically how to authenticate a user properly, so a server can trust
-> the client talking to it is doing so on behalf of the correct user, and
-> the client can trust that the server it is talking to in fact represents
-> a valid server for the application, and can authenticate this.

Do you want to know about generic network communication, or RPC
mechanisms, or something over straight HTTP? (I'm guessing RPC...)

As you say, the transport can handle the data security, and server
authentication can be handled by hardcoding the server name,
unless you want things to be a bit more flexible, in which case you'll
have to buy into some sort of distributed authentication framework.

As for user authentication, I don't think there's a good generic way to
do it for generic network communication (this is one of the things that
RPC mechanisms like SOAP are supposed to help with!). I can recommend a
simple reference for how to do it in SOAP, but I haven't used that.

Of course, if you have a secure transport layer, you can just send a
user/pass along with every request ;).

--titus

From landman at scientificappliance.com Thu Apr 25 10:53:29 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:03 2006
Subject: [Biodevelopers] Another point of discussion: XML
Message-ID: <1019746409.8030.166.camel@protein.dtw.macsch.com>

In his talk at the O'Reilly conference in Jan 2002, Ewan Birney
indicated he had mixed feelings about using XML for informatics
applications. His point, as I remember it, was that it "encouraged
bloat".

XML is a verbose language. There is no doubt about that. I do believe
it is a useful technology for data exchange, with the caveat that you
need to think carefully about how you are going to use it.
Specifically, if you are going to exchange many records of a database,
it might make sense not to force everything into XML for transfer, but
to use XML to describe the structure of the BLOB you transfer over, so
the remote system can understand it. Basically use XML not to
encapsulate the data for transfer (which is what I think Ewan was
talking about), but as a descriptive block which need only be sent once,
and the data can be piped over raw.

I was hoping that others who are looking at or using XML (or similar
technologies) might chime in on how they are using it, or thinking
about it.

I am using XML for my program config files. I am using XML for passing
state information between services on my system. The nice thing about
this is that I can add/change the way the thing works by adding to or
changing the document, without rewriting the interface that generates
or parses the document. Of course, as I am using SOAP, I am using XML
implicitly for the structure of my object transport layer.

How are you (or aren't you) using this or related technologies?
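
The "describe once, then pipe the data raw" idea in the message above
can be sketched in a few lines. Everything specific here is an
assumption made for illustration: the record_layout and field element
names, the tab-separated payload, and the sample records. The regex is
only adequate for this fixed toy descriptor; a real receiver would use
a proper XML parser.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical descriptor, sent once ahead of the data.
my $descriptor = <<'XML';
<record_layout name="blast_hit" separator="tab">
  <field name="query"/>
  <field name="subject"/>
  <field name="score"/>
</record_layout>
XML

# Raw payload: one record per line, no per-record markup.
my @raw_lines = (
    "seq1\tgi|123\t87.5",
    "seq2\tgi|456\t42.0",
);

# Field names in document order, taken from the descriptor.
my @fields = $descriptor =~ /<field\s+name="([^"]+)"/g;

# Rebuild each record as a hash keyed by the described field names.
my @records;
for my $line (@raw_lines) {
    my %record;
    @record{@fields} = split /\t/, $line;
    push @records, \%record;
}

for my $r (@records) {
    my @pairs;
    push @pairs, "$_=$r->{$_}" for @fields;
    print join(", ", @pairs), "\n";
}
```

The markup cost is paid once in the descriptor, however many records
follow it.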

From landman at scientificappliance.com Thu Apr 25 11:10:10 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:03 2006
Subject: [Biodevelopers] On security models for networked applications
In-Reply-To: <20020425150203.GA28816@caltech.edu>
References: <1019745724.8031.153.camel@protein.dtw.macsch.com> <20020425150203.GA28816@caltech.edu>
Message-ID: <1019747410.8031.177.camel@protein.dtw.macsch.com>

On Thu, 2002-04-25 at 11:02, Titus Brown wrote:
> -> I am thinking about security issues for my networked application.
> -> Specifically how to authenticate a user properly, so a server can trust
> -> the client talking to it is doing so on behalf of the correct user, and
> -> the client can trust that the server it is talking to in fact represents
> -> a valid server for the application, and can authenticate this.
>
> Do you want to know about generic network communication, or RPC mechanisms,
> or something over straight HTTP? (I'm guessing RPC...)

Actually, HTTP would be best, given that this is the transport layer I
am using.

> As you say, the transport can handle the data security, and server
> authentication can be handled by hardcoding the server name,
> unless you want things to be a bit more flexible, in which case you'll
> have to buy into some sort of distributed authentication framework.

I need to be flexible. Hardcoding == bad for my application.
Distributed authentication is what I am looking for.

> As for user authentication, I don't think there's a good generic way to do
> it for generic network communication (this is one of the things that RPC
> mechanisms like SOAP are supposed to help with!). I can recommend a
> simple reference for how to do it in SOAP, but I haven't used that.
>
> Of course, if you have a secure transport layer, you can just send a user/pass
> along with every request ;).

What I am trying to avoid is the notion of trust.
From what I have seen of systems that use trust, there are two states,
untrusted and trusted. The transition between these two states is
mediated by a process of authentication. This process is usually
something related to a login. Once you are in the trusted state, you
can do as you wish. So a dedicated cracker/hacker type could figure out
some bug somewhere which forces this transition to occur, enter the
trusted state, and then perform their nefarious acts.

I don't know if it makes sense, but I want to avoid this trusted state.
If I communicate over a secure link (SSL) to my server, and I send my
userid/password at every transaction, how can I be sure (from the
server's perspective) that I am who I say I am? Don't I need either a
shared secret (aside from userid/password), or some sort of other
authentication method?

Maybe I am being too paranoid about this.

From rsimon at acclivitydev.com Thu Apr 25 11:26:21 2002
From: rsimon at acclivitydev.com (Richard Simon)
Date: Tue Aug 8 04:23:03 2006
Subject: [Biodevelopers] XML in BioInformatics magazine article
Message-ID:

Hello all,

I am writing an article on the current and projected future use of XML
in BioInformatics for the BioITWorld magazine. I am interested in how
XML is used for integrating data sources, exchanging
biological/sequence information, web services applications, etc. If
anyone would like to chat with me on this subject please email me back.

Regards,

____________________________________________
Richard L. Simon
President
Acclivity Development Corporation
978-835-4432
rsimon@acclivitydev.com
www.acclivitydev.com
____________________________________________