[Pipet Devel] New guy speaks up

Gary Van Domselaar gvd at redpoll.pharmacy.ualberta.ca
Tue Nov 30 13:09:39 EST 1999

Brad Chapman wrote:
> Hello brave Locians!

Welcome to the club, Brad.  I'll add your info to the lab rats page (if
you have a favorite email and webpage you can provide me with that as

> 2) Converters. This is based on Oct 15 discussions about how to convert
> data between formats. My random thought was--why have a specific internal
> format (ie. XML)? Instead, how about when a document comes in (in some
> particular format) it is parsed and pushed into a relational database (see
> point 3 for more on the storage component). This eliminates a need for an
> internal data format (because, man, we do not need anymore formats to put
> sequence data in!) and also allows direct querying of specific parts of a
> format (ie. you can search only sequence data, or search only bib data, or
> search only genbank id, or whatever). In this way, to read any particular
> file type into Loci, you would need to write a plug-in that would insert
> data into the database. 

To put a sequence into a relational database, wouldn't you have to
design a data model to place the sequence information into? At least
tables like:

Then you would have a sequence data format based on a relational data
model, no?  I'm no database expert either, so I'm asking out of sheer
naivety here :-)
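
To make the question concrete, here is a minimal sketch of what such a
data model and a format "plug-in" might look like. It assumes Python's
built-in sqlite3 module as a stand-in for MySQL, and every table,
column, and function name in it is hypothetical (nothing here comes
from Loci itself). It only illustrates that loading a format into a
database implies choosing a schema, and that specific parts can then be
queried directly.

```python
import sqlite3

# Hypothetical schema: these table and column names are illustrative only.
SCHEMA = """
CREATE TABLE sequence (
    id        INTEGER PRIMARY KEY,
    accession TEXT UNIQUE NOT NULL,
    residues  TEXT NOT NULL
);
CREATE TABLE annotation (
    id          INTEGER PRIMARY KEY,
    sequence_id INTEGER NOT NULL REFERENCES sequence(id),
    name        TEXT NOT NULL,
    value       TEXT NOT NULL
);
"""


def _record(header, chunks):
    """Split a FASTA header into accession and description."""
    accession, _, description = header.partition(" ")
    return accession, description, "".join(chunks)


def parse_fasta(text):
    """Yield (accession, description, residues) tuples from FASTA text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield _record(header, chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield _record(header, chunks)


def load_fasta(conn, text):
    """Toy 'plug-in': insert every FASTA record into the database."""
    for accession, description, residues in parse_fasta(text):
        cur = conn.execute(
            "INSERT INTO sequence (accession, residues) VALUES (?, ?)",
            (accession, residues),
        )
        conn.execute(
            "INSERT INTO annotation (sequence_id, name, value) VALUES (?, ?, ?)",
            (cur.lastrowid, "description", description),
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
load_fasta(conn, ">X001 toy record\nACGT\nTTGA\n>X002 another toy\nGGCC\n")

# Query only the sequence data, or only the annotations.
sequences = conn.execute(
    "SELECT accession, residues FROM sequence ORDER BY accession"
).fetchall()
descriptions = conn.execute(
    "SELECT value FROM annotation WHERE name = 'description' ORDER BY id"
).fetchall()
```

A plug-in for another file type would only need its own parser; the
insert and query side could stay the same as long as it targets the
same tables.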

Jeff and I talked at length about this issue more than once.  We believe
it would be better to have Loci as a graphical shell / graphical
scripting language with a database 'locus', where the actual database and
data model used to store the sequence data (and annotations) would be an
'option' depending on what the developers have provided for Loci.  So
there may be a relational database, an object database, etc., or a
combination of databases for sequence data storage.  Because of the
GPLed MySQL, the Loci developers will likely create a relational database
to store and retrieve sequence data, but this would not be a 'requirement'
so much as a 'plug-in'.  Loci's own database requirements may be less
about sequence storage than about things like the container
locus, which is a queryable locus that contains other loci.

But I'm on your side here, and I like your idea.

> To me, this seems analogous to the DBD/DBI
> mechanisms that Perl uses for database connectivity.

Python has crazy-wicked bindings to MySQL (and mSQL):  basic
connectivity, dynamic connectivity, queries, statement handlers, even
database meta-data (information about a database connection).  No need
to worry about Perl here.
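
As a rough illustration of those features, the sketch below walks
through connecting, getting a cursor, running queries, and reading
result metadata through Python's standard database API. It assumes the
built-in sqlite3 module as a stand-in so that it runs without a MySQL
server; the 'locus' table is hypothetical. A MySQL driver would follow
the same pattern, though its connect() arguments and parameter
placeholder style differ.

```python
import sqlite3  # stand-in driver; a MySQL driver follows the same DB-API pattern

# Basic connectivity: open a connection and get a cursor.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table, for illustration only.
cur.execute("CREATE TABLE locus (id INTEGER PRIMARY KEY, name TEXT)")

# Parameterized statements; sqlite3 uses '?' placeholders,
# while MySQL drivers typically use '%s'.
cur.executemany(
    "INSERT INTO locus (name) VALUES (?)",
    [("container",), ("database",)],
)
conn.commit()

# Query, then read metadata about the result set from cursor.description.
cur.execute("SELECT id, name FROM locus ORDER BY id")
column_names = [col[0] for col in cur.description]
loci = cur.fetchall()

conn.close()
```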

>         I think for a LGPLed project, there are two choices of databases:
> 1) MySQL (http://www.mysql.org) GPLs old versions (as Jeff mentioned
> earlier). The new version costs for Microsoft users. 2) PostgreSQL
> (http://www.postgresql.org) has a "do whatever you want with it" copyright.
> Both are very good from what I hear and have decent documentation to make
> learning easier. I currently mess around with MySQL (I converted ArrayDB,
> an NIH microarray storage/query program, to run with MySQL instead of
> Sybase), but am by no means an expert. From what I hear, the basic
> differences between MySQL and PostgreSQL are that MySQL is lighter and
> faster, while PostgreSQL supports more functions and has better data
> integrity. But like I said, I'm no expert!

Check out O'Reilly's "MySQL and mSQL" for a good comparison between MySQL,
mSQL, and PostgreSQL.

> 4) Other languages besides python. This isn't from the discussion, but from
> my own personal interest--how easy is it to integrate other languages with
> python and the planned Loci implementation? Does everything have to run
> through CORBA before it can interoperate or is there a clean way to, say,
> use python and perl together?

That would mean two interpreters running together. I'm not sure if Jeff
would be smiling at that idea.  Python can be extended with C libraries,
and C can 'embed' Python.  Python can do just about anything Perl can
do, and it can access precompiled binaries, so, is there a need for

> 5) Me. (Sorry, I'm definitely not egocentric!) I would really like to see
> Loci succeed and would like to try to get involved in some coding. I will
> admit I'm not ready to code in Python yet since I know next to nothing (I
> am working through the documentation, though!) but I'm really
> interested/excited about the project and so I would like to try to jump in
> (and hopefully swim!) and try to do some helping. In regards to this, I
> would therefore like to put the question forward to everyone about what I
> could/should work on to start. Like I've mentioned, I'm definitely no
> expert on anything, but I can try to help as much as possible and do my
> best. So what do you think? What should I hack?

I'll let the Ubermeister handle that one.  Again, nice to have you
aboard, Brad!

Gary Van Domselaar		gvd at redpoll.pharmacy.ualberta.ca
Faculty of Pharmacy 		Phone: (780) 492-4493
University of Alberta		FAX:   (780) 492-5305
Edmonton, Alberta, Canada       http://redpoll.pharmacy.ualberta.ca/~gvd
