From landman at scientificappliance.com Fri Apr 12 15:52:34 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Welcome, and starting things off ...
Message-ID: <1018641154.4216.77.camel@protein.dtw.macsch.com>
Thanks for joining this discussion list. As the introductory message
implies, this group is meant to be an open forum for discussing all
topics relating to development issues for those engaged in
bioinformatics and related efforts.
I personally use a very broad definition of bioinformatics, as an
all-encompassing study of the information content of living systems, and
how the life processes make use of the information. This means in my
mind that we are looking at classical genomics (primary structure),
computational chemistry (secondary structure, SAR, etc), computational
biophysics, and many other topics. Even if this definition of
bioinformatics doesn't necessarily match yours, please feel free to
contribute your thoughts and comments.
In order to try to jumpstart a discussion, I would like to talk briefly
about some of the work I am currently doing.
My work as of late has been in designing and building scalable computing
systems (software and hardware). I have been using SOAP and XML as well
as object oriented Perl. Some of the issues I am running into on the
development side involve creating persistent objects to temporarily
store complex data structures on my server. Basically the type of thing
that I would like to do is to take either a hash or an XML document such
as follows:
localhost
6931
10
http
bio0
jobdb
and have this persistent across calls. There are several "easy" ways to
do this. One method is to "serialize" this data structure, and to store
the serialized version. Many "XML databases" do exactly that. What
would be nice is to have a simple set of (Perl or Perl callable) methods
to trivially store and retrieve this structure or elements of this
structure. I have written a wrapper above the Perl DBI to make it very
OO looking/acting, and maybe this is the route I need to take. If
anyone has done anything related to this, I would like to hear about it.
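One way to sketch the "serialize and store" route is with the Storable
module that ships with Perl. The key names below are hypothetical (only
the values appear in the example document above), and the temporary file
stands in for wherever the server would actually keep its state:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Storable qw(store retrieve);
use File::Temp qw(tempfile);

# Hypothetical key names; only the values come from the message above.
my %config = (
    host     => 'localhost',
    port     => 6931,
    timeout  => 10,
    protocol => 'http',
    node     => 'bio0',
    database => 'jobdb',
);

# Serialize the hash to disk so it survives between calls.
my (undef, $file) = tempfile(UNLINK => 1);
store(\%config, $file);

# A later call (possibly in a different process) reads it back.
my $restored = retrieve($file);
printf "%s://%s:%d\n", @{$restored}{qw(protocol host port)};
```

Storable's nfreeze/thaw pair does the same thing in memory, which may fit
better if the serialized form is headed for a DBI column rather than a
file.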
As indicated, I am using SOAP, and the SOAP::Lite module in Perl. This
makes building "web visible" services quite simple. Of course, nothing
is without cost. What I have found is that after calling a SOAP new
method to instantiate the object, I cannot use the traditional Perl OO
methods to decorate the object with attributes, that is mutators and
accessors:
#
# mutator
#
sub set_thingy
{
    my ($self,$thingy) = @_;
    return $self->{'thingy'} = $thingy ;
}
#
# accessor
#
sub get_thingy
{
    my ($self) = @_;
    if (exists($self->{'thingy'}))
    {
        return $self->{'thingy'};
    }
    else
    {
        return undef;
    }
}
does not work anymore (works great outside of SOAP::Lite). This is a
little bothersome, as I use these types of accessors/mutators all the
time. Hence my need for a persistent object storage on the remote host.
On an unrelated note, I am curious as to what analysis techniques people
are using for proteomic data. At this stage, I have seen image
processing work, various bits of data mining and regression tests, etc.
It of course is highly dependent upon the processes being studied. My
interests are specifically in those analyses which are computationally
demanding.
Thanks, and once again, welcome to the group!
From landman at scientificappliance.com Fri Apr 12 17:01:28 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Another question...
Message-ID: <1018645288.28777.49.camel@squash.canton01.mi.comcast.net>
What issues are people running into for their development work?
Most of my work is done on Linux machines and clusters, so my issues are
related to the compilers (gcc et al), and related.
A question I have is, for those doing algorithm development, would a
very high level environment (e.g. Matlab-ish) make sense for
bioinformatics? That is, it would be not so difficult to imagine
setting up a language where you could perform calculations like this:
[high_scoring_sequences, low_scoring_sequences] = NCBI_BLAST("sequence_file", [database1, database2]);
alignment=ClustalW(high_scoring_sequences[subset]);
.
.
.
I have been thinking about doing something like this for a long time.
You can get near to this concept using BioPerl.
Joe
From titus at caltech.edu Fri Apr 12 18:56:22 2002
From: titus at caltech.edu (Titus Brown)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Another question...
In-Reply-To: <1018645288.28777.49.camel@squash.canton01.mi.comcast.net>
References: <1018645288.28777.49.camel@squash.canton01.mi.comcast.net>
Message-ID: <20020412225622.GG30383@caltech.edu>
-> What issues are people running into for their development work?
None! Well, not enough time...
-> Most of my work is done on Linux machines and clusters, so my issues are
-> related to the compilers (gcc et al), and related.
Everything pretty much works; I'm not using any oddball software for anything,
so *shrug* it all works. Mainly I'm irritated that a new version of Python
comes out every week...
-> A question I have is, for those doing algorithm development, would a
-> very high level environment (e.g. Matlab-ish) make sense for
-> bioinformatics? That is, it would be not so difficult to imagine
-> setting up a language where you could perform calculations like this:
->
-> [high_scoring_sequences, low_scoring_sequences] = NCBI_BLAST("sequence_file", [database1, database2]);
-> alignment=ClustalW(high_scoring_sequences[subset]);
-> .
-> .
-> .
->
->
-> I have been thinking about doing something like this for a long time.
-> You can get near to this concept using BioPerl.
Who would be your target audience? Other developers, or biology researchers?
In neuroscience Matlab seems to be enshrined as the language of choice, so
it might be a good way to give bioinformatics capabilities to biologists...
--t
From biorst at hotmail.com Tue Apr 16 03:44:00 2002
From: biorst at hotmail.com (Rannveig Storaa)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Fwd: [BiO BB] New to bioinformatics
Message-ID:
>From: "Rannveig Storaa"
>Reply-To: bio_bulletin_board@bioinformatics.org
>To: bio_bulletin_board@bioinformatics.org
>Subject: [BiO BB] New to bioinformatics
>Date: Sat, 13 Apr 2002 16:49:08 +0200
>
>Hello everybody,
>I'd like to know if there is any way to use EMBOSS in a Windows
>environment.
>Are there any available tools that can make a multiple alignment and
>translate the aligned sequences together?
>Thanks
>
>
>Rannveig
>
From martingoodson at hotmail.com Tue Apr 16 08:26:27 2002
From: martingoodson at hotmail.com (martin goodson)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Fwd: [BiO BB] New to bioinformatics
Message-ID:
See Jemboss, a Java interface to EMBOSS
(http://www.hgmp.mrc.ac.uk/Registered/Option/emboss.html). Note that
registration is required.
martin
From martingoodson at hotmail.com Tue Apr 16 08:59:08 2002
From: martingoodson at hotmail.com (martin goodson)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Welcome, and starting things off ...
Message-ID:
>On an unrelated note, I am curious as to what analysis techniques people
>are using for proteomic data. At this stage, I have seen image
>processing work, various bits of data mining and regression tests, etc.
>It of course is highly dependent upon the processes being studied. My
>interests are specifically in those analyses which are computationally
>demanding.
>
Are you thinking of things like peptide mass fingerprinting and peptide ID
via tandem mass spec/database searching?
Martin
From landman at scientificappliance.com Thu Apr 18 05:03:54 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Quick question on distributed development
Message-ID: <1019120635.24362.40.camel@protein.dtw.macsch.com>
Folks:
I wanted to know if anyone is looking at targeting grids or
distributed computing via Platform, Entropia/UD/DataSynapse, or
Globus/Legion toolkits. I am curious about any experiences or thoughts
on these systems, and their corresponding SDKs.
Thanks!
Joe
From landman at scientificappliance.com Wed Apr 24 08:58:56 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Object Oriented Perl issue
Message-ID: <1019653141.2548.137.camel@protein.dtw.macsch.com>
Folks:
My day job (ok, one of the hats I wear) has me coding in Perl. I am
using the OO methodologies with modules. And of course, I had a
question on the "rightness" of a particular way of doing things.
It works, and works nicely (except under SOAP::Lite, which I do not
understand).
Here is the problem. When you instantiate an object, usually you do
something like this:
use Foo::Bar;
$object = Foo::Bar->new();
If you want to pass specific parameters to the new object, you
typically put an argument list in the parentheses after new. This can be
a scalar, an array, or a hash. You just have to parse it correctly in
your Foo::Bar::new method.
This is fine, no issues here. Now suppose you want to attach
attributes to the object (think of the object as having methods and
attributes, attributes being "localized" variables to the object). So
you would use attribute accessor/mutator methods (aka get/set). If you
read Damian Conway's book on the subject, he indicates that what you
should do is to build one accessor/mutator per attribute. So if you
have many attributes, you have many accessor/mutator pairs. This means
that your code is sprinkled with:
sub get_fragment
{
    my $self = shift;
    return $self->{'fragment'};
}
sub set_fragment
{
    my ($self,$value) = @_;
    return $self->{'fragment'}=$value;
}
This is fine, until you hit the second attribute, and you immediately
start thinking that this is a waste of time, typing, will slow down the
Perl system, etc. To use this in your code, you do something like this:
$object->set_fragment(10);
.
.
printf "Fragment is %i\n",$object->get_fragment();
So thinking about this, I realized that you could use a general
accessor/mutator pair and the auto-vivification capability of Perl, and
deal with any number of attributes with a single pair, like this:
sub get_attribute
{
    my ($self,$attribute) = @_;
    if (exists($self->{$attribute}))
    {
        return $self->{$attribute};
    }
    else
    {
        return undef;
    }
}
sub set_attribute
{
    my ($self,$attribute,$value) = @_;
    return $self->{$attribute}=$value;
}
To use this in your code, you would do something like:
$object->set_attribute('fragment',10);
.
.
printf "Fragment is %i\n",$object->get_attribute('fragment');
So this works. It works quite well. Except for some reason, in a
remotely instantiated object under SOAP::Lite (then again, the other
method also doesn't work there). Basically in your new method, you
include a line to copy the argument hash to the attributes like this:
sub new
{
    my ($class,%args) = @_;
    my $self={};
    bless $self,ref $class || $class ;
    while (my ($key,$value) = each %args)
    {
        $self->set_attribute($key,$value);
    }
    return $self;
}
and your call to new now looks like this:
$object=Foo::Bar->new(
    'attribute-1'=>'value1',
    'attribute-2'=>'value2',...
);
You can use Data::Dumper and print Dumper(\$object) to see that this
really works.
My question is whether or not this is considered good OO style. I am
effectively relying on the auto-vivification (the automatic creation of
a new variable) to build the attributes. This means that in theory, I
can have an unlimited number of attributes. I presume that the original
framers of OO would not be happy with this.
Another question revolves around what could go wrong with this. I am
not trying to enforce hiding in OO here, just localization of an object's
variables with its methods.
Thoughts and comments welcome. Is this an issue in Python (everything
is an object) or similar?
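To make the "what could go wrong" question concrete: with an unrestricted
set_attribute, a misspelled attribute name quietly creates a new hash
entry instead of raising an error. A minimal sketch of one guard,
assuming a hypothetical whitelist (%ALLOWED and its contents are not
part of the code above):

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Foo::Bar;
use Carp;

# Hypothetical whitelist: the only attribute names this class accepts.
my %ALLOWED = map { $_ => 1 } qw(fragment sequence);

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, ref($class) || $class;
    $self->set_attribute($_, $args{$_}) for keys %args;
    return $self;
}

sub set_attribute {
    my ($self, $attribute, $value) = @_;
    croak "Unknown attribute '$attribute'" unless $ALLOWED{$attribute};
    return $self->{$attribute} = $value;
}

sub get_attribute {
    my ($self, $attribute) = @_;
    croak "Unknown attribute '$attribute'" unless $ALLOWED{$attribute};
    return $self->{$attribute};
}

package main;

my $object = Foo::Bar->new(fragment => 10);
printf "Fragment is %d\n", $object->get_attribute('fragment');

# A misspelled name is now reported instead of silently creating a new slot.
my $ok = eval { $object->set_attribute('fragmnet', 20); 1 };
print "Caught: $@" unless $ok;
```

This keeps the single get/set pair but gives up the "unlimited number of
attributes" property, which is the trade-off in question.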
Joe
From bub at io.com Wed Apr 24 17:25:14 2002
From: bub at io.com (Steve O)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Object Oriented Perl issue
Message-ID: <20020424162514.A12845@hagbard.io.com>
Joe,
You might want to check out Damian Conway's book Object Oriented Perl.
He shows how to use perl's AUTOLOAD feature to create get/set
methods on the fly if they don't exist. This allows you to write
get/set methods that have interesting behavior, and leave the simple
ones for perl to make up.
Here's the example:
sub AUTOLOAD {
    my ( $self, $newval ) = @_;
    if ($AUTOLOAD =~ /.*::get(_\w+)/ && $self->_accessable( $1, 'read' ))
    {
        my $attr_name=$1;
        *{$AUTOLOAD} = sub { return $_[0]->{$attr_name} };
        return $self->{$attr_name};
    }
    elsif ($AUTOLOAD =~ /.*::set(_\w+)/ && $self->_accessable( $1, 'write'))
    {
        my $attr_name=$1;
        *{$AUTOLOAD} = sub { $_[0]->{$attr_name} = $_[1]; return };
        return $self->{$attr_name} = $newval;
    }
    croak "No such method: $AUTOLOAD";
}
The argument I think OO people would have with your get_attribute
and set_attribute functions is that it allows access to all
of the object's state and allows no extra processing. The methods
are providing no extra benefit over the simpler:
$object->{fragment}="foo";
print "fragment: ",$object->{fragment},"\n";
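For anyone who wants to try the AUTOLOAD approach, here is a
self-contained sketch of the same idea. The class name, the %ACCESS
table and this _accessible helper are hypothetical stand-ins for
illustration, not the book's code; attributes are stored under keys
with a leading underscore so that get_fragment maps to {_fragment}:

```perl
#!/usr/bin/perl
use strict;
use warnings;

package Foo::Bar;
use Carp;

our $AUTOLOAD;

# Hypothetical access table: which attributes may be read or written.
my %ACCESS = (
    _fragment => { read => 1, write => 1 },
    _id       => { read => 1 },              # read-only
);

sub new {
    my ($class, %args) = @_;
    return bless { _id => $args{id}, _fragment => $args{fragment} },
                 ref($class) || $class;
}

sub _accessible {
    my ($self, $attr, $mode) = @_;
    return $ACCESS{$attr} && $ACCESS{$attr}{$mode};
}

# Without this, object destruction would fall through to AUTOLOAD and croak.
sub DESTROY { }

sub AUTOLOAD {
    my ($self, $newval) = @_;
    no strict 'refs';    # needed to install a sub under a computed name
    if ($AUTOLOAD =~ /.*::get(_\w+)/ && $self->_accessible($1, 'read')) {
        my $attr_name = $1;
        # Install a real method so later calls bypass AUTOLOAD.
        *{$AUTOLOAD} = sub { return $_[0]->{$attr_name} };
        return $self->{$attr_name};
    }
    elsif ($AUTOLOAD =~ /.*::set(_\w+)/ && $self->_accessible($1, 'write')) {
        my $attr_name = $1;
        *{$AUTOLOAD} = sub { $_[0]->{$attr_name} = $_[1]; return };
        return $self->{$attr_name} = $newval;
    }
    croak "No such method: $AUTOLOAD";
}

package main;

my $object = Foo::Bar->new(id => 7, fragment => 10);
$object->set_fragment(42);
printf "Fragment is %d\n", $object->get_fragment();

# _id is read-only in the access table, so the setter is refused.
my $ok = eval { $object->set_id(99); 1 };
print "Caught: $@" unless $ok;
```

The access table is what separates this from a bare hash lookup: the
class decides which attributes exist and which of them can be changed.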
-steve
From landman at scientificappliance.com Thu Apr 25 00:07:16 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] testing
Message-ID: <1019707636.8031.29.camel@protein.dtw.macsch.com>
This is just a test (of the bioinformatics.org mail system) after an
upgrade.
From landman at scientificappliance.com Thu Apr 25 10:29:16 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:02 2006
Subject: [Biodevelopers] Object Oriented Perl issue
In-Reply-To: <20020424162514.A12845@hagbard.io.com>
References: <20020424162514.A12845@hagbard.io.com>
Message-ID: <1019744957.8031.139.camel@protein.dtw.macsch.com>
On Wed, 2002-04-24 at 17:25, Steve O wrote:
> Joe,
>
> You might want to check out Damian Conway's book Object Oriented Perl.
> He shows how to use perl's AUTOLOAD feature to create get/set
> methods on the fly if they don't exist. This allows you to write
> get/set methods that have interesting behavior, and leave the simple
> ones for perl to make up.
I saw this in the book and wondered if it introduced too much complexity
in what I thought should be a simple problem.
[...]
> The argument I think OO people would have with your get_attribute
> and set_attribute functions is that it allows access to all
> of the object's state and allows no extra processing. The methods
> are providing no extra benefit over the simpler:
So what you are saying is that my get/set pair let you run roughshod
over the object (which is a potential security/stability risk) without
bound. The autogenerated accessor/mutator do not.
That is a good argument to use the autogenerated ones.
> $object->{fragment}="foo";
> print "fragment: ",$object->{fragment},"\n";
From landman at scientificappliance.com Thu Apr 25 10:42:04 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:03 2006
Subject: [Biodevelopers] On security models for networked applications
Message-ID: <1019745724.8031.153.camel@protein.dtw.macsch.com>
Folks:
(I hope others will start posting here as well :))
I am thinking about security issues for my networked application.
Specifically how to authenticate a user properly, so a server can trust
the client talking to it is doing so on behalf of the correct user, and
the client can trust that the server it is talking to in fact represents
a valid server for the application, and can authenticate this.
I haven't read up on things like public key infrastructures or whatnot
else. If someone else has run into this problem before, and is willing
to share some of what they learned, I think that would be valuable to
the list.
Basically I see the security issue broken up into sections.
1) transport security: being able to send data/information without
compromise of the information (generally handled by TLS, SSL, etc)
2) user authentication: being able to verify the identity of the user of
the service
3) server authentication: being able to verify the identity of the
server and service (generally handled by certification authorities and
server certificates).
I look at each transaction between server and client as needing to be
secure in the sense of the above list (and possibly others I have not
considered).
Are there any good discussions of this type of security in book or URL
formats? I am looking for practical examples I can use/learn from. If
you have any experience with these issues, please feel free to talk
about them here.
Thanks again!
Joe
From titus at caltech.edu Thu Apr 25 11:02:03 2002
From: titus at caltech.edu (Titus Brown)
Date: Tue Aug 8 04:23:03 2006
Subject: [Biodevelopers] On security models for networked applications
In-Reply-To: <1019745724.8031.153.camel@protein.dtw.macsch.com>
References: <1019745724.8031.153.camel@protein.dtw.macsch.com>
Message-ID: <20020425150203.GA28816@caltech.edu>
-> I am thinking about security issues for my networked application.
-> Specifically how to authenticate a user properly, so a server can trust
-> the client talking to it is doing so on behalf of the correct user, and
-> the client can trust that the server it is talking to in fact represents
-> a valid server for the application, and can authenticate this.
Do you want to know about generic network communication, or RPC mechanisms,
or something over straight HTTP? (I'm guessing RPC...)
As you say, the transport can handle the data security, and server
authentication can be handled by hardcoding the server name,
unless you want things to be a bit more flexible, in which case you'll
have to buy into some sort of distributed authentication framework.
As for user authentication, I don't think there's a good generic way to do
it for generic network communication (this is one of the things that RPC
mechanisms like SOAP are supposed to help with!). I can recommend a
simple reference for how to do it in SOAP, but I haven't used that.
Of course, if you have a secure transport layer, you can just send a user/pass
along with every request ;).
--titus
From landman at scientificappliance.com Thu Apr 25 10:53:29 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:03 2006
Subject: [Biodevelopers] Another point of discussion: XML
Message-ID: <1019746409.8030.166.camel@protein.dtw.macsch.com>
In his talk at the O'Reilly conference in Jan 2002, Ewan Birney
indicated he had mixed feelings about using XML for informatics
applications. His point as I remember it, was that it "encouraged
bloat".
XML is a verbose language. There is no doubt about that. I do believe
it is a useful technology for data exchange, with the caveat that you
need to think carefully about how you are going to use it.
Specifically, if you are going to exchange many records of a database,
it might make sense not to force everything into XML for transfer, but
to use XML to describe the structure of the BLOB you transfer over, so
the remote system can understand it. Basically use XML not to
encapsulate the data for transfer (which is what I think Ewan was
talking about), but as a descriptive block which need only be sent once,
and the data can be piped over raw.
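A rough sketch of that split, under made-up assumptions: the descriptor
format, its field names and the sample rows below are all hypothetical,
and a regular expression stands in for a real XML parser only to keep
the example free of non-core modules. The descriptor carries a pack()
template per field; the rows travel as fixed-width binary with no
per-row markup.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical descriptor, sent once. Each field names a pack() template.
my $descriptor = <<'XML';
<record>
  <field name="id"     pack="N"/>
  <field name="length" pack="n"/>
  <field name="name"   pack="A8"/>
</record>
XML

# Sender: pack rows into fixed-width binary records, no per-row markup.
my @rows = ([1, 350, 'seqA'], [2, 1210, 'seqB']);
my $blob = join '', map { pack 'N n A8', @$_ } @rows;

# Receiver: recover field names and the pack template from the descriptor.
my (@names, @types);
while ($descriptor =~ /<field\s+name="(\w+)"\s+pack="(\w+)"/g) {
    push @names, $1;
    push @types, $2;
}
my $template = join ' ', @types;
my $size     = length(pack($template, (0) x @types));   # bytes per record

# Walk the blob one record at a time and rebuild each row as a hash.
my @records;
for (my $off = 0; $off < length($blob); $off += $size) {
    my %rec;
    @rec{@names} = unpack $template, substr($blob, $off, $size);
    push @records, \%rec;
}
print join(', ', map { "$_=$_[0]{$_}" } @names), "\n" for ();
for my $rec (@records) {
    print join(', ', map { "$_=$rec->{$_}" } @names), "\n";
}
```

Each record here costs 14 bytes on the wire, whatever the tag names in
the descriptor are, which is the point of sending the description only
once.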
I was hoping that others who are looking at or using XML (or similar
technologies) might chime in on how they are using it, or thinking about
it.
I am using XML for my program config files. I am using XML for passing
state information between services on my system. The nice thing about
this is that I can add/change the way the thing works by adding to or
changing the document, without rewriting the interface that generates or
parses the document. Of course, as I am using SOAP, I am using XML
implicitly for the structure of my object transport layer.
How are you (or aren't you) using this or related technologies?
From landman at scientificappliance.com Thu Apr 25 11:10:10 2002
From: landman at scientificappliance.com (Joe Landman)
Date: Tue Aug 8 04:23:03 2006
Subject: [Biodevelopers] On security models for networked applications
In-Reply-To: <20020425150203.GA28816@caltech.edu>
References: <1019745724.8031.153.camel@protein.dtw.macsch.com>
<20020425150203.GA28816@caltech.edu>
Message-ID: <1019747410.8031.177.camel@protein.dtw.macsch.com>
On Thu, 2002-04-25 at 11:02, Titus Brown wrote:
> -> I am thinking about security issues for my networked application.
> -> Specifically how to authenticate a user properly, so a server can trust
> -> the client talking to it is doing so on behalf of the correct user, and
> -> the client can trust that the server it is talking to in fact represents
> -> a valid server for the application, and can authenticate this.
>
> Do you want to know about generic network communication, or RPC mechanisms,
> or something over straight HTTP? (I'm guessing RPC...)
Actually, HTTP would be best, given that this is the transport layer I
am using.
> As you say, the transport can handle the data security, and server
> authentication can be handled by hardcoding the server name,
> unless you want things to be a bit more flexible, in which case you'll
> have to buy into some sort of distributed authentication framework.
I need to be flexible. Hardcoding == bad for my application.
Distributed authentication is what I am looking for.
> As for user authentication, I don't think there's a good generic way to do
> it for generic network communication (this is one of the things that RPC
> mechanisms like SOAP are supposed to help with!). I can recommend a
> simple reference for how to do it in SOAP, but I haven't used that.
>
> Of course, if you have a secure transport layer, you can just send a user/pass
> along with every request ;).
What I am trying to avoid is the notion of trust. From what I have seen
of systems that use trust, there are two states, untrusted and trusted.
The transition between these two states is mediated by a process of
authentication. This process is usually something related to a login.
Once you are in the trusted state, you can do as you wish. So a
dedicated cracker/hacker type could figure out some bug somewhere which
forces this transition to occur, enter the trusted state, and then
perform their nefarious acts. I don't know if it makes sense, but I want
to avoid this trusted state.
If I communicate over a secure link (SSL) to my server, and I send my
userid/password at every transaction, how can the server be sure that I
am who I say I am? Don't I need either a shared secret (aside from
userid/password), or some other authentication method?
Maybe I am being too paranoid about this.
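One common shape for the "shared secret" idea is to sign every request
with an HMAC, so each transaction is checked on its own rather than
relying on a logged-in state. The sketch below is only an illustration
under assumptions: the user table, the message layout and the 300-second
window are made up, and it uses Digest::SHA from the Perl core. It is
meant to sit on top of SSL/TLS, not to replace it.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Hypothetical per-user secrets held by the server; never sent on the wire.
my %SECRET_FOR = (joe => 'correct horse battery staple');

# Client side: sign the request body plus a timestamp with the shared secret.
sub sign_request {
    my ($user, $secret, $body, $time) = @_;
    my $mac = hmac_sha256_hex("$user\n$time\n$body", $secret);
    return { user => $user, time => $time, body => $body, mac => $mac };
}

# Server side: recompute the MAC and reject unknown, stale or altered requests.
sub verify_request {
    my ($req, $now, $max_age) = @_;
    my $secret = $SECRET_FOR{ $req->{user} } or return 0;
    return 0 if abs($now - $req->{time}) > $max_age;
    my $expected =
        hmac_sha256_hex("$req->{user}\n$req->{time}\n$req->{body}", $secret);
    return $expected eq $req->{mac} ? 1 : 0;
}

my $now = time;
my $req = sign_request('joe', $SECRET_FOR{joe}, 'submit job 17', $now);
print "valid request accepted\n" if verify_request($req, $now, 300);

$req->{body} = 'delete all jobs';    # tampering invalidates the MAC
print "tampered request rejected\n" unless verify_request($req, $now, 300);
```

The timestamp only narrows the replay window; a production design would
likely add a per-request nonce and a constant-time comparison instead of
eq.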
From rsimon at acclivitydev.com Thu Apr 25 11:26:21 2002
From: rsimon at acclivitydev.com (Richard Simon)
Date: Tue Aug 8 04:23:03 2006
Subject: [Biodevelopers] XML in BioInformatics magazine article
Message-ID:
Hello all,
I am writing an article on the current and projected future use of XML in
BioInformatics for the BioITWorld magazine. I am interested in how XML is
used for integrating data sources, exchanging biological/sequence
information, web services applications, etc.
If anyone would like to chat with me on this subject please email me back.
Regards,
____________________________________________
Richard L. Simon
President
Acclivity Development Corporation
978-835-4432
rsimon@acclivitydev.com
www.acclivitydev.com
____________________________________________