[Biophp-dev] Biophp, where is it and where does it go?

Nico Stuurman biophp-dev@bioinformatics.org
04 Mar 2004 22:27:06 -0800


Prompted by some needs at work (designing primers to make fusion
proteins with fluorescent proteins at a medium scale, setting up a
database for coiled-coil segments) I was looking again at biophp and
felt disappointed that there is not really anything directly usable.

So, instead of spending time learning the gory details of bioperl (php
simply works so much easier for me, perl is always a pain to get used to
again and I forget the routine in half an hour), I'd rather put some
effort into writing the code I need and contributing it to biophp.

The question is then, where is biophp now and how can I contribute while
not wasting too much time?  As far as I can see (please correct me if I
am wrong), there is still no unified biophp cvs repository.  It looks
like Serge restarted the cvs of genephp at sourceforge with just his own
code (deleting contributions by me and Sean in the process
http://cvs.sourceforge.net/viewcvs.py/genephp/genephp), and there is the
repository at biophp
(http://bioinformatics.org/cgi-bin/cvsweb.cgi/biophp/genephp/) which is
basically Serge's code base with some changes/additions by Sean and me
(b.t.w., I added an interface to primer3 - to be found in directory
interfaces - and added the method prettyseq to object seq in the last
couple of days).  So, it doesn't yet look like there are multiple people
working on the same code base.  It would really be nice if we could get
that going...

Next, it will be important to decide on the basic structure of biophp. 
Doing so will make it so much easier for everyone to contribute, because
it will be clear what is missing, what needs to be done, or where
whatever you feel like doing will fit in.  Serge has made quite an
impressive start with his code, and that should be the basis of such a
discussion.  In my experience, there was too much code there that was
too specific, and could not easily be re-used in other contexts, so the
simplest approach would be to go through the code and re-organize it in
a more fine-grained manner, making more independent units that can be
used in other contexts without modification (I guess that the file
parsers that Sean and I worked on are a nice example, I lifted lots of
Serge's code and made it more widely usable - or so I think).

Simultaneously, we will have to work on code readability and
documentation; using the PEAR guidelines is something that has been
discussed here before.  To make this code useful to outsiders, decent
documentation is a must.

Apart from this code re-organization I think one of the most important
aspects is to have a (universal) SQL database interface.  I looked at
BioSQL at one point, and - although we will have to connect to it
eventually - BioSQL is terribly inflexible for lots of things.  I
think it would simply be useful to be able to write (and restore) a
Biophp seq object to an SQL database in a meaningful way.
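
As a rough sketch of that write-and-restore round trip: the Seq class
below (with only id, moltype and sequence fields), the one-table schema
and the function names are all assumptions made for illustration, not
the existing Biophp seq object or any agreed layout.  It uses PHP's PDO
extension with an in-memory SQLite database so that it runs without a
database server.

```php
<?php
// Hypothetical minimal sequence object; the real Biophp seq class has more fields.
class Seq
{
    public $id;
    public $moltype;
    public $sequence;

    public function __construct($id, $moltype, $sequence)
    {
        $this->id = $id;
        $this->moltype = $moltype;
        $this->sequence = $sequence;
    }
}

// Create the (assumed) one-table schema if it does not exist yet.
function create_schema(PDO $db)
{
    $db->exec('CREATE TABLE IF NOT EXISTS seq (
        id TEXT PRIMARY KEY,
        moltype TEXT NOT NULL,
        sequence TEXT NOT NULL
    )');
}

// Write a Seq to the database, replacing any row with the same id.
// REPLACE INTO is understood by SQLite and MySQL; other databases need
// their own upsert syntax.
function save_seq(PDO $db, Seq $seq)
{
    $stmt = $db->prepare('REPLACE INTO seq (id, moltype, sequence) VALUES (?, ?, ?)');
    $stmt->execute(array($seq->id, $seq->moltype, $seq->sequence));
}

// Restore a Seq by id; returns null when no such row exists.
function load_seq(PDO $db, $id)
{
    $stmt = $db->prepare('SELECT id, moltype, sequence FROM seq WHERE id = ?');
    $stmt->execute(array($id));
    $row = $stmt->fetch(PDO::FETCH_ASSOC);
    if ($row === false) {
        return null;
    }
    return new Seq($row['id'], $row['moltype'], $row['sequence']);
}

// Round trip: store one sequence, then read it back.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
create_schema($db);
save_seq($db, new Seq('demo1', 'DNA', 'ATGGTGAGCAAGGGCGAG'));
$restored = load_seq($db, 'demo1');
echo $restored->id, ' ', $restored->moltype, ' ', $restored->sequence, "\n";
```

Keeping the SQL in two small functions rather than inside the object
would let the same seq object be stored in a different schema (BioSQL,
for instance) later without touching the object itself.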

How will we go about doing this?  First and foremost, we should all
decide to work on the same code-base and try to agree on what is good
and bad, what needs to be added and what needs to be deleted.  I would
simply start polishing the code that is now in cvs at
bioinformatics.org, but I'll be happy to join in any other place (as
long as we have some kind of source code version control system in
place).

Hope this gets something rolling again.

Nico