Bioinformatics
From Bioinformatics.Org Wiki
Bioinformatics has been defined in many different ways, since practitioners do not always agree on its scope within the biological and computer sciences. It is, however, always considered a combination of both sciences, along with other contributing disciplines.
Bioinformatics as a biological science
It is debatable whether bioinformatics and the discipline of computational biology, literally "biology that involves computation," are the same or distinct. To some, both bioinformatics and computational biology are defined as any use of computers for processing any biologically derived information, whether DNA sequences or breast X-rays. Under that view, other fields, e.g. medical imaging / image analysis, might be considered part of bioinformatics. This would be the broadest definition of the term. In practice, however, the definition used by most people is narrower: bioinformatics to them is a synonym for computational molecular biology, that is, any use of computers to characterize the molecular components of living things.
Bioinformatics as a computer science
To others, bioinformatics is a grammatical contraction of "biological informatics" and is therefore related to the computer science disciplines of information science and/or information technology. This definition would thus emphasize the information contained within the biological data, also implying that large amounts of data would be managed and/or analyzed.
Pre-genomic bioinformatics
Most biologists talk about "doing bioinformatics" when they use computers to store, retrieve, analyze or predict the composition or the structure of biomolecules. As computers become more powerful you could probably add "simulate" to this list of bioinformatics verbs. "Biomolecules" include your genetic material (nucleic acids) and the products of your genes (proteins). These are the concerns of pre-genomic or "classical" bioinformatics, which deals primarily with sequence analysis.
Fredj Tekaia at the Institut Pasteur offers this definition of bioinformatics:
"The mathematical, statistical and computing methods that aim to solve biological problems using DNA and amino acid sequences and related information."
It is a mathematically interesting property of most large biological molecules that they are polymers: ordered chains of simpler molecular modules called monomers. Think of the monomers as beads or building blocks which, despite having different colors and shapes, all have the same thickness and the same way of connecting to one another.
Monomers that can combine in a chain are of the same general class, but each kind of monomer in that class has its own well-defined set of characteristics. And many monomer molecules can be joined together to form a single, far larger, macromolecule. Macromolecules can have exquisitely specific informational content and/or chemical properties.
According to this scheme, the monomers in a given macromolecule of DNA or protein can be treated computationally as letters of an alphabet, put together in pre-programmed arrangements to carry messages or do work in a cell.
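As a minimal sketch of what "letters of an alphabet" means in practice, the Python below treats a DNA molecule as a plain string over the four-letter alphabet A, C, G, T. The function names and the six-base example fragment are illustrative assumptions, not part of any particular bioinformatics package.

```python
from collections import Counter

# The DNA "alphabet": one letter per kind of monomer (nucleotide).
DNA_ALPHABET = set("ACGT")
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def validate(sequence, alphabet=DNA_ALPHABET):
    """Return True if every letter of the sequence belongs to the alphabet."""
    return all(letter in alphabet for letter in sequence)


def composition(sequence):
    """Count how many times each monomer (letter) occurs."""
    return Counter(sequence)


def reverse_complement(dna):
    """Return the sequence of the opposite DNA strand, read in its own direction."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


example = "ATGGCC"  # hypothetical six-base fragment, for illustration only
print(validate(example))            # True
print(composition(example))         # letter counts: A=1, T=1, G=2, C=2
print(reverse_complement(example))  # GGCCAT
```

Once a macromolecule is a string, ordinary text-processing operations (counting, searching, comparing) become sequence analysis operations; a protein could be handled the same way with a twenty-letter amino acid alphabet.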
Post-genomic bioinformatics
The greatest achievement of bioinformatics methods, the Human Genome Project, is practically complete. Because of this, the nature and priorities of bioinformatics research and applications have changed. People often talk portentously of our living in the "post-genomic" era. This affects bioinformatics in several ways:
- Now that we possess multiple whole genomes, we can look for differences and similarities between all the genes of multiple species. From such studies we can draw particular conclusions about species and general ones about evolution. This kind of science is often referred to as comparative genomics.
- There are now technologies designed to measure the relative number of copies of a genetic message (levels of gene expression) at different stages in development or disease, or in different tissues. These technologies, such as DNA microarrays, will grow in importance.
- Other, more direct, large-scale ways of identifying gene functions and associations (for example yeast two-hybrid methods) will grow in significance and with them the accompanying bioinformatics of functional genomics.
- There will be a general shift in emphasis (of sequence analysis especially) from genes themselves to gene products. This will lead to:
  - attempts to catalog the activities of, and characterize the interactions between, all gene products (in humans): proteomics.
  - attempts to crystallize and/or predict the structures of all proteins (in humans): structural genomics.
  - fewer DNA double-helices in bad sci-fi movies.
- What some people refer to as research or medical informatics is the management of all biomedical experimental data associated with particular molecules or patients, from mass spectrometry to in vitro assays to clinical side-effects. It will move from being the concern of those working in drug company and hospital I.T. (information technology) into the mainstream of cell and molecular biology, and migrate from the commercial and clinical sectors to the academic sector.
It is worth noting that all of the above post-genomic areas of research depend upon established, pre-genomic sequence analysis techniques.
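To make the idea of "relative levels of gene expression" from the list above a little more concrete, here is a rough sketch of one common way to compare two measurements: the log2 ratio (fold change). The gene names, the expression values, and the pseudocount of 1.0 are all made-up assumptions for illustration; real microarray analysis involves normalization and statistics that this sketch omits.

```python
import math


def log2_fold_change(condition, reference, pseudocount=1.0):
    """Log2 ratio of two expression measurements.

    The pseudocount (an assumption of this sketch) avoids dividing by zero
    when a gene is not detected at all in one sample.
    """
    return math.log2((condition + pseudocount) / (reference + pseudocount))


# Hypothetical expression values for three made-up genes.
healthy = {"geneA": 99.0, "geneB": 49.0, "geneC": 7.0}
diseased = {"geneA": 399.0, "geneB": 49.0, "geneC": 0.0}

changes = {gene: log2_fold_change(diseased[gene], healthy[gene]) for gene in healthy}

for gene, change in changes.items():
    print(gene, change)
# geneA  2.0  (four times higher in the diseased sample)
# geneB  0.0  (unchanged)
# geneC -3.0  (eight times lower)
```

A log scale is used so that a doubling and a halving of expression are equally far from zero, in opposite directions.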
Computer science disciplines inspired by the life sciences
There are also entire disciplines of biologically inspired computation, e.g. genetic algorithms, AI, and neural networks. Often these areas interact in strange ways. Neural networks, inspired by crude models of the functioning of nerve cells in the brain, are used in a program called PHD to predict, surprisingly accurately, the secondary structures of proteins from their primary sequences.
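For readers unfamiliar with genetic algorithms, the following is a minimal sketch of the general idea: a population of candidate solutions is improved by selection, crossover and mutation, loosely mimicking evolution. The toy objective (maximizing the number of 1 bits), the parameter values, and all function names are assumptions chosen for brevity; this is not how PHD or any specific bioinformatics tool works.

```python
import random


def fitness(individual):
    """Toy objective: count the 1 bits in the individual."""
    return sum(individual)


def evolve(length=20, population_size=30, generations=40,
           mutation_rate=0.02, seed=0):
    """Evolve bit strings toward higher fitness and return the best one."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(population_size)]
    for _ in range(generations):
        # Selection: the fitter half survives unchanged and acts as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, length)  # single-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


best = evolve()
print(fitness(best), "of", len(best))
```

Because the fitter half of each generation is carried over unchanged, the best fitness in the population can never decrease from one generation to the next.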