From Bioinformatics.Org Wiki
Machine learning is the ability of computers (machines) to revise their expectations of a model according to how that model performs, allowing for more accurate predictions. Learning can be supervised, unsupervised, or reinforced (reinforcement learning). Supervised learning involves input from humans, typically in the form of labeled examples, and reinforcement learning involves interaction with a model, while unsupervised learning generally involves neither.
Bayes' rule is the basis for Bayesian learning and for many other machine learning approaches, such as maximum likelihood and hidden Markov models. For Bayesian learning, the rule is written in terms of conditional probabilities as
P(M|D) = P(D|M)P(M)/P(D)
If M represents a model (or its parameters) and D the data, then Bayes' rule states that the "posterior probability" of the model, P(M|D) (what is currently considered probable), equals the "likelihood" of the data, P(D|M) (the probability of the observed data if the model were true), times the "prior probability" of the model, P(M) (what was previously considered probable), divided by the "marginal likelihood" of the data, P(D) (the evidence). During training, the posterior probability is recalculated after each observation, with the posterior from one observation serving as the prior for the next. When model parameters are learned with this approach, predictions of new observations can be made by using the parameters as weights on the probabilities.
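As a rough illustration of how the posterior is recalculated with each observation, the sketch below applies Bayes' rule to two candidate models of a coin. The models ("fair" and "biased"), their heads probabilities, the uniform prior, and the observation sequence are all assumptions invented for this example and do not come from the sources cited here.

```python
def bayes_update(prior, likelihoods):
    """Return the posterior P(M|D) for each model M.

    prior:       dict mapping model name -> P(M)
    likelihoods: dict mapping model name -> P(D|M) for one observation D
    """
    # Marginal likelihood P(D): sum over all models of P(D|M) * P(M)
    evidence = sum(prior[m] * likelihoods[m] for m in prior)
    return {m: likelihoods[m] * prior[m] / evidence for m in prior}


# Hypothetical models: the probability of heads assumed by each one
heads_prob = {"fair": 0.5, "biased": 0.8}

# Uniform prior: neither model is favored before any data are seen
posterior = {"fair": 0.5, "biased": 0.5}

observations = ["H", "H", "T", "H", "H"]
for obs in observations:
    # Likelihood of this single observation under each model
    likelihoods = {
        m: (p if obs == "H" else 1.0 - p) for m, p in heads_prob.items()
    }
    # The posterior after this observation becomes the prior for the next
    posterior = bayes_update(posterior, likelihoods)

# Prediction for the next toss: each model's prediction weighted by its posterior
p_next_heads = sum(posterior[m] * heads_prob[m] for m in posterior)

print(posterior)
print(p_next_heads)
```

With four heads in five tosses, the posterior shifts toward the "biased" model, and the prediction for the next toss lies between the two models' heads probabilities.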
Maximum likelihood learning
Maximum likelihood (ML) learning is also based on Bayes' rule, but where Bayesian learning uses the prior probability of a model's parameters, ML learning does not. ML learning finds the parameter values that maximize the likelihood of the data, that is, the probability of the data conditioned on the parameters (rather than on the model as a whole).
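A minimal sketch of the difference, assuming a coin-toss (Bernoulli) model with a single parameter p, the probability of heads. The counts, the search grid, and the Beta(5, 5) prior used for the comparison are hypothetical choices made for this example. The prior-weighted estimate shown is the posterior mode, a simplification of full Bayesian learning.

```python
import math


def log_likelihood(p, heads, tails):
    """Log of P(D|p) for independent tosses with heads probability p."""
    return heads * math.log(p) + tails * math.log(1.0 - p)


heads, tails = 7, 3  # hypothetical data: 7 heads in 10 tosses

# ML estimate by brute force: the grid value of p that maximizes P(D|p)
grid = [i / 100 for i in range(1, 100)]
p_ml = max(grid, key=lambda p: log_likelihood(p, heads, tails))

# For this model the maximum is also known in closed form
p_closed_form = heads / (heads + tails)

# Estimate that does use a prior: posterior mode under an assumed Beta(a, b) prior
a, b = 5, 5
p_with_prior = (heads + a - 1) / (heads + tails + a + b - 2)

print(p_ml, p_closed_form, p_with_prior)
```

The ML estimate follows the observed frequency, while the estimate that includes the prior is pulled toward 0.5, the value the assumed prior favors.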
Both Bayesian and ML learning approaches are good for analyzing correlations in data sets of independent and identically distributed points. However, the approaches are limited by high-order statistical structure, outliers, and a large number of parameters (high dimensionality).
Another machine learning approach involves neural networks. Neural networks contain layers of computational units, or nodes, the first layer being the input and the last the output. Each unit is connected to other units and computes a function of its inputs, the result of which is passed through its activation function. Layers are connected by adaptive weights, which determine the network's activity.
Activity in neural networks is propagated in a "feed-forward" manner, starting with the input layer and ending with the output layer. When training a neural network, errors found on output can be "back-propagated" into the network, changing the adaptive weights.
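The feed-forward pass and the back-propagation of output errors can be sketched with a very small network. Everything specific here is an assumption made for illustration: two inputs, two hidden units, one output unit, sigmoid activations, squared error, a learning rate of 0.5, and the logical OR function as training data.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Feed-forward pass: inputs -> hidden layer -> single output unit.

    Each weight list ends with a bias term.
    """
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(ws[:-1], x)) + ws[-1])
        for ws in w_hidden
    ]
    output = sigmoid(sum(w * h for w, h in zip(w_out[:-1], hidden)) + w_out[-1])
    return hidden, output


def train_step(x, target, w_hidden, w_out, rate=0.5):
    """One back-propagation update for the error 0.5 * (output - target)**2."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output unit (derivative w.r.t. its summed input)
    delta_out = (output - target) * output * (1.0 - output)
    # Error signals propagated back to the hidden units (uses pre-update weights)
    delta_hidden = [
        delta_out * w_out[j] * hidden[j] * (1.0 - hidden[j])
        for j in range(len(hidden))
    ]
    # Adjust the adaptive weights between hidden and output layers
    for j in range(len(hidden)):
        w_out[j] -= rate * delta_out * hidden[j]
    w_out[-1] -= rate * delta_out
    # Adjust the adaptive weights between input and hidden layers
    for j, ws in enumerate(w_hidden):
        for i in range(len(x)):
            ws[i] -= rate * delta_hidden[j] * x[i]
        ws[-1] -= rate * delta_hidden[j]


def mean_squared_error(data, w_hidden, w_out):
    return sum((forward(x, w_hidden, w_out)[1] - t) ** 2 for x, t in data) / len(data)


random.seed(0)
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]  # logical OR
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_out = [random.uniform(-1, 1) for _ in range(3)]

error_before = mean_squared_error(data, w_hidden, w_out)
for _ in range(2000):
    for x, t in data:
        train_step(x, t, w_hidden, w_out)
error_after = mean_squared_error(data, w_hidden, w_out)

print(error_before, error_after)
```

The error on the training examples falls as the weights adapt. Because the same four examples are used for both training and evaluation, this sketch says nothing about how the network would behave on new data.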
Neural networks are particularly useful for analyzing data in matrices or arrays, since the units in the networks can be arranged as such. They are thus often used to analyze microarray data, on which they can also be trained. One disadvantage of the approach, however, is that the networks are trained on finite data sets, which sometimes biases the analysis of new data. Such a bias is considered the result of "overtraining" (also called overfitting).
- Ghahramani, Z. and Rasmussen, C.E. 2001. Unsupervised learning 2001, lecture 1: introduction and statistical foundations. http://www.gatsby.ucl.ac.uk/~zoubin/course01/lect1.ps.gz
- Hoffman, T. 2001. Machine learning and pattern recognition, lecture #14. http://www.cs.brown.edu/people/th/Course/Lecture14.pdf
- Mount, D.W. 2001. Bioinformatics: sequence and genome analysis. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.
- Akaho, S. 1998. Nonmonotonic generalization bias of Gaussian mixture models. http://www.etl.go.jp/~akaho/RBBM/etl-tr-98-22.ps.gz