[BioEdu] Languages and environments

Jose R. Valverde jrvalverde at cnb.uam.es
Thu Mar 8 12:03:46 EST 2007


On Wed, 14 Feb 2007 22:27:13 -0500
Paulo Nuin <nuin at genedrift.org> wrote:
> Piyush Mundra wrote:
> >  I guess choice of language  also depends on the kind of work one is 
> > targeting. I mean, if one is looking for  biological data analysis, 
> > machine learning algorithm applications and other related stuffs, 
> > MATLAB or R should be sufficient.
> >

I have been silent on this as I hate to get into flame wars and have grown 
tired over time of repeating the same answer. But here it goes:

As Piyush says, the choice depends on the problem. We wouldn't have so many
languages if people could standardize on a small set. So the only reasonable
answer is
	- look at what other knowledgeable people in the field are using
	- look at various languages and choose the best (no need to learn 
them; it may suffice to talk to people using them)

In Bioinformatics, as everywhere else, you might as well despair as hope to
get one good answer. There is none, as there is NO single bioinformatics either.
So,
	- if you do sequence analysis, most people prefer (or have preferred)
C. Perl is gaining acceptance, and so is Python
	- if you do structural biology, Fortran has been the standard for
decades, with C++ becoming the new choice, and Python following closely
	- if you work with databases, SQL of course, and if you want to
have nice web interfaces, go for LAMP (Linux + Apache + MySQL/PostgreSQL +
PHP, followed by Perl and Python)
	- if you work with genomic data, R is a must, plus some managing
language (PHP is excellent for web interfaces, although most people still
use Perl)
	- if you work on high-throughput problems, your best start is the
shell (bash, and less often tcsh) for quick scripting, followed by Perl and
Python
	- if you work on GUI-driven problems, Java is your choice, followed
by C++, Perl, Python and C; the choice, if not Java, is greatly influenced
by the desktop environment (KDE, Gnome, etc.)
	- if you like Systems Biology, then R, a scripting language and 
a GOOD modeller (e.g. Ptolemy) are a better start
	- if you are into heavy math simulation, MATLAB, Mathematica, etc.
are the best start, followed by Fortran, C++...
	- if you need to work with legacy code, almost anything is possible:
Phylip used to be written in Pascal, other tools in Visual Something, etc...
	- if you want to be on the cutting edge, you need SOAP, WSDL, CORBA,
and their description languages, etc.
	- if you are into AI, LISP, Prolog, C, C++ and a number of others 
may be your choice

and so on, and on and on and on. Do I need to go on?

As for the OS... Linux? Nah! There used to be a time (and it lasted well over
a decade) when nobody in Bioinformatics would think of touching anything other
than VMS. Well, except for structural biologists, who would use UNIX workstations.
And I'm sure we'll have plenty of time to switch OSes in the future. BTW there
was Bioinformatics before the VAX/VMS, mostly on IBM System/360... so there.

Personally I started with COBOL, then LOGO, C, Pascal, Fortran, APL, BASIC, 
assembler, Ada, Modula, Sather, Shell, Awk, Perl, Python, SQL, PHP, PL/I, Tcl,
Prolog, LISP and so on, with a list too long to remember. Nor do I care: I 
don't believe I know any of them well anymore (save the one I am using at the
time), but I can use anything that falls into my hands, often without needing
to learn it, as long as I have some sample code. And I have used as many (or more)
operating systems and machines as well (MasPar, mainframes, minis, workstations,
PCs, Macs, micros like the Apple II, Amstrad or Commodore, etc.).

So to summarize: forget about languages. Learn to program using one and
learn it well. Then CHOOSE your target field and learn whatever is needed.

-- 
	These opinions are mine and only mine. Hey man, I saw them first!

			    José R. Valverde

	De nada sirve la Inteligencia Artificial cuando falta la Natural

