Getting involved
From Bioinformatics.Org Wiki
Starting from scratch
Please read the education section for information about some of the places you can currently study bioinformatics.
If you are a high school student / sixth former, think about taking an interdisciplinary computational biology or bioinformatics bachelor's degree of the sort offered at, for example, Manchester University in the UK or UPenn in the States. Don't worry if you can't find a place on such a course or there isn't one nearby; perhaps the best way to approach this subject is from two sides. Do a bachelor's degree in one area while taking a healthy interest in the other -- or (if you can afford to) complement a first degree in one part of the discipline with a second degree in the second.
If you already have a degree in a biological discipline there are similar Master's courses -- both interdisciplinary (e.g. Birkbeck's in London) and conversion type courses -- for biologists or others to learn computer science, for example.
If you are currently doing a computer science or biology PhD, try to take advantage of the opportunity to take courses in the "other" discipline.
From a background in biology
If you are a biologist, take as many real computing courses as you can. It's important not just to learn a programming language, but also to learn the discipline of computing: to structure and document your work in a rigorous way. Which courses you take might be directed by the kind of work you are interested in doing when you graduate -- whether you see yourself supporting bioinformatics applications or building them. For the former you need all-round familiarity with the programs themselves and the hardware and software needed to run them -- plus your existing understanding of biology. For the latter you need to learn a structured programming language and the principles of good program design -- plus the ability to talk to and understand biologists.
Courses biologists might consider taking
- UNIX
- Of all the computing courses available it is most important that you have a proper introduction to the UNIX operating system(s). Most current bioinformatics software (especially the free stuff) runs on "open" platforms like Linux and the Web. The UNIX philosophy is elegant, powerful, and frustrating. Master it and you will save a lot of time.
- Mathematics
- Learn some maths. Basic statistics, logic/set theory and a little calculus would be my recommendation. Logic will come in handy at the very least if you want to query databases in an intelligent way.
- Programming
- If you're interested in development, learn a real programming language: C/C++, Java, Perl, Python, etc.
- Perl and HTML are the stuff that holds the Web together. A grasp of these is essential for a lot of the Web/database work being done by many bioinformaticians at the moment.
- Good old BASIC can be very useful as an introduction to programming or as a tool in its own right, but none of these latter languages are built to crunch numbers and tackle real world biological problems -- which isn't to say people don't try...
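As a small illustration of the point above about logic and set theory, the following Python sketch combines keyword searches using Boolean set operations. It is only a sketch: the records, the identifiers and the helper `ids_matching` are invented for illustration and do not come from any real database or library.

```python
# Hypothetical toy records: identifier -> annotation keywords (invented for illustration).
RECORDS = {
    "P1": "human kinase membrane",
    "P2": "mouse kinase cytoplasm",
    "P3": "human protease membrane",
    "P4": "human kinase nucleus",
}


def ids_matching(keyword, records=RECORDS):
    """Return the set of identifiers whose annotation contains the keyword."""
    return {rid for rid, text in records.items() if keyword in text.split()}


human = ids_matching("human")
kinase = ids_matching("kinase")
membrane = ids_matching("membrane")

# AND is set intersection, OR is set union, NOT is set difference.
human_kinases = human & kinase
kinase_or_membrane = kinase | membrane
human_not_membrane = human - membrane

print(sorted(human_kinases))
print(sorted(kinase_or_membrane))
print(sorted(human_not_membrane))
```

The same three operations sit behind the AND, OR and NOT operators that most database search forms offer, so it pays to be clear about which set you are actually asking for.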
From a background in computational/quantitative science
There is the simple value of doing some "proper" biological laboratory science. There have been many talks during which a bioinformatics "scientist" describes in great detail how his -- it's usually "his" -- application of a trendy mathematical tool offers a supposed insight into a (sometimes supposed) biological problem. But, nine times out of ten this method will never be so much as sneezed on by a practising biologist.
Quantitative scientists sometimes talk about their interest in studying some aspect of "God's mind". Biologists, in contrast, are interested in "Mother Nature". You might meditate on God in the hope of some revelation, but to understand Nature you have to meet her in the flesh. Working in isolation at the keyboard, you are as likely to be useful to biologists as you are to conceive with your clothes on. Desk-bound bioinformaticians have written code that has turned out to be popular with biologists, but almost always because they have collaborated with biologists.
Courses quantitative scientists might consider taking
- Molecular biology
- "MoBi" was the bioinformatics of its day: desperately fashionable, the province of new, higher-paid practitioners and considered with slight suspicion by more traditional biologists. It was once a great achievement to sequence a modest stretch of DNA; now it's a job for robots. Today the technology of molecular biology is very well established. Scientists can buy kits to perform the sort of genetic manipulations that would make your parents' jaws drop. Some of the kits are so simple your small children could use them (with a modest amount of training and supervision).
- Despite the profusion of commercial kits, there is still a requirement for real skill in molecular biology, and the general level of scientific understanding required to be a good biological scientist -- rather than just completing a practical class -- doesn't come easily. Living matter, the stuff you have to work with, is unpredictable and responds slowly -- except when it's dying. Even supposedly fast-growing bacteria can take a long time to yield up their secrets.
- Fashions in biomedical research are now shifting from molecular biology back to cell biology and protein biochemistry, but it's well worth offering yourself up as a volunteer for some vacation work in a molecular biology lab. The term "molecular biology" is now more often used to refer to the technological tools the field provides to biology in general, rather than to fundamental research in the field itself. Those tools are common to a vast array of different kinds of research, from archaeology to zoology.
- Protein (bio)chemistry
- Protein (bio)chemistry is experiencing a revival. Proteins are still more delicate and fussy than nucleic acids. The same advice that applies to molecular biology applies to protein biochemistry. That stuff bioinformatics people refer to as "wet lab science" is much harder than it looks.
- You might find it harder to get into a good protein lab and do protein science with real wizards than to get into a good molecular biology lab, but the very least you can do is read about the theoretical aspects of the subject.
- For insights into the principles of protein structure, try, for example, Carl Branden and John Tooze's "Introduction to Protein Structure" [Garland ISBN 0-8153-2305-0]. Physicists in particular might find the lack of general unifying principles in this area overwhelming. Unfortunately there's no substitute for acquiring a "feel" for the subject by examining a lot of examples. The most critical stages in the successful prediction of protein structure from sequence are still those requiring human intervention.
- Thomas E. Creighton has been responsible for a range of standard texts on protein chemistry. If you are working in a protein lab you are likely to come across his "Protein Function: A Practical Approach" [ISBN 019963615X] and the rather more expensive and theoretical "Proteins: Structures and Molecular Properties" [ISBN 071677030X].
- Evolutionary biology
- It's a worn quote, but worth repeating:
"The mechanisms that bring evolution about certainly need study and clarification. There are no alternatives to evolution as history that can withstand critical examination. Yet we are constantly learning new and important facts about evolutionary mechanisms. Nothing in biology makes sense except in the light of evolution." Theodosius Dobzhansky in "American Biology Teacher" vol.35
- Darwin's theory is one of the simplest and most misunderstood in science. Start with a good layperson's introduction: Richard Dawkins's "The Selfish Gene" (and remember: it's a metaphor, stupid) or Steve Jones's "Almost Like a Whale", a paraphrasing of Darwin's original "On the Origin of Species". All biologists agree on the underlying principles, but they are nearly ready to kill one another over the details. After reading a decent book on evolutionary biology you should have at least a handful of good questions. Now you are ready to take a class in the subject. Take your questions with you. You'll probably start an argument -- or a fight.
- You might also like to peruse Cynthia Gibas's answers to similar questions from computational scientists on the O'Reilly Web site.
More general advice
Volunteering
Volunteering for an open-source project is a great way for people who'd like to enter the field to gain some experience. Plus, most open-source projects are desperate for volunteer help.
There are hundreds of open-source bioinformatics projects hosted at Bioinformatics.Org. In our full list, we have information/database projects and not just software development projects.
SourceForge also has hundreds of bioinformatics projects, listed under the category "bio-informatics".
And you might also want to consider helping with some of the larger projects hosted at the Open Bioinformatics Foundation.
Another place to look is in journals like BMC Bioinformatics and the Bioinformatics journal. Every issue is loaded with descriptions of new tools, many of which could be licensed as Open Source (just ask the authors).
There are many choices, so you can afford to be very picky about the subject matter of a project and the people you'd be volunteering to help.
Use the software
Get access to an installation of Galaxy and/or EMBOSS, and get someone to lead you through the tools available. RasMol is a simple but powerful and elegant molecular imaging program which can teach you a great deal about biological macromolecules; try a tutorial. If you have access to an HPC platform, do not miss trying EasyBuild, which delivers most of the HPCBIOS packages in an automated way, as seen at [1]. There's so much stuff out there -- and most of it is free to academics.