Module: MaxEntropy

Bio/Tools/Classification/MaxEntropy.py

MaxEnt.py

Maximum Entropy code.

Uses Improved Iterative Scaling: XXX ref

# XXX need to define terminology

Imported modules

from Bio.Tools import listfns
from Numeric import *
import math

Functions

_calc_empirical_expects
_calc_f_sharp
_calc_model_expects
_calc_p_class_given_x
_eval_feature_fn

_iis_solve_delta
_train_iis
calculate
classify
train

_calc_empirical_expects

_calc_empirical_expects (
        xs,
        ys,
        classes,
        features,
        )

_calc_empirical_expects(xs, ys, classes, features) -> list of expectations

Calculate the expectation of each feature function from the training data. These expectations are the constraints on the maximum entropy distribution. Return a list of expectations, parallel to the list of features.
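A minimal sketch of this computation, assuming each feature has already been evaluated into a sparse dictionary keyed by (training set index, class index), as described for _eval_feature_fn. The name empirical_expects and its reduced argument list are illustrative and are not taken from the module's source.

```python
def empirical_expects(ys, classes, features):
    """Average each feature over the observed (instance, class) pairs."""
    class_index = {c: j for j, c in enumerate(classes)}
    n = len(ys)
    expects = []
    for feature in features:
        # Only the class actually observed for instance i contributes.
        total = sum(feature.get((i, class_index[y]), 0) for i, y in enumerate(ys))
        expects.append(total / n)
    return expects


ys = ["a", "b", "a", "a"]
classes = ["a", "b"]
# Hypothetical feature that fires for class "a" on instances 0, 1 and 2.
f0 = {(0, 0): 1, (1, 0): 1, (2, 0): 1}
print(empirical_expects(ys, classes, [f0]))
```

Instance 1 is labelled "b", so the feature does not count there, and instance 3 has no stored value, so only two of the four instances contribute.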

_calc_f_sharp

_calc_f_sharp (
        N,
        nclasses,
        features,
        )

_calc_f_sharp(N, nclasses, features) -> matrix of f sharp values.
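The entry above gives only the signature. In descriptions of Improved Iterative Scaling, f sharp for a pair (x, y) usually denotes the sum of all feature values for that pair. Assuming that meaning applies here, and assuming features are sparse dictionaries keyed by (training set index, class index), the matrix could be built roughly as follows. The name calc_f_sharp is illustrative.

```python
def calc_f_sharp(n, nclasses, features):
    """Return an n x nclasses matrix holding the sum of all feature values."""
    f_sharp = [[0] * nclasses for _ in range(n)]
    for feature in features:
        for (i, j), value in feature.items():
            f_sharp[i][j] += value
    return f_sharp


# Two hypothetical features over 2 instances and 2 classes.
features = [{(0, 0): 1, (1, 1): 1}, {(0, 0): 1}]
f_sharp = calc_f_sharp(2, 2, features)
print(f_sharp)
```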

_calc_model_expects

_calc_model_expects (
        xs,
        classes,
        features,
        alphas,
        )

_calc_model_expects(xs, classes, features, alphas) -> list of expectations.

Calculate the expectation of each feature from the model. This is not used in maximum entropy training, but provides a good function for debugging.

_calc_p_class_given_x

_calc_p_class_given_x (
        xs,
        classes,
        features,
        alphas,
        )

_calc_p_class_given_x(xs, classes, features, alphas) -> matrix

Calculate P(y|x), where y is the class and x is an instance from the training set. Return a matrix of probabilities with one row per instance in xs and one column per class (len(xs) x len(classes)).
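As a rough illustration, the conditional probability in a maximum entropy model is the exponential of the weighted feature sum, normalized over the classes. The sketch below assumes sparse feature dictionaries keyed by (training set index, class index) and takes the matrix dimensions directly instead of xs and classes. The name p_class_given_x is illustrative.

```python
import math


def p_class_given_x(n, nclasses, features, alphas):
    """Return an n x nclasses matrix of P(class | instance)."""
    # Weighted feature sum for every (instance, class) pair.
    scores = [[0.0] * nclasses for _ in range(n)]
    for feature, alpha in zip(features, alphas):
        for (i, j), value in feature.items():
            scores[i][j] += alpha * value
    probs = []
    for row in scores:
        exps = [math.exp(s) for s in row]
        z = sum(exps)  # normalizer over the classes for this instance
        probs.append([e / z for e in exps])
    return probs


# One hypothetical feature that fires for instance 0 and class 0.
probs = p_class_given_x(2, 2, [{(0, 0): 1}], [math.log(3)])
print(probs)
```

Each row sums to 1. An instance on which no feature fires gets a uniform distribution over the classes.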

_eval_feature_fn

_eval_feature_fn (
        fn,
        xs,
        classes,
        )

_eval_feature_fn(fn, xs, classes) -> dict of values

Evaluate a feature function on every (training instance, class) pair. fn is a callback function that takes two parameters: a training instance and a class. Return a dictionary mapping (training set index, class index) to the value of fn for that pair. Only non-zero values are stored in the dictionary.
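The sparse representation described above can be sketched as follows. The name eval_feature_fn and the example feature are illustrative.

```python
def eval_feature_fn(fn, xs, classes):
    """Map (instance index, class index) to fn's value, skipping zeros."""
    values = {}
    for i, x in enumerate(xs):
        for j, c in enumerate(classes):
            v = fn(x, c)
            if v != 0:
                values[(i, j)] = v
    return values


# Hypothetical feature: fires when the instance contains "G" and the class is "gc".
def has_g(x, c):
    return 1 if c == "gc" and "G" in x else 0


xs = ["ACGT", "TTTT"]
classes = ["gc", "at"]
print(eval_feature_fn(has_g, xs, classes))
```

Lookups on the result should use .get(key, 0) so that missing pairs read as zero.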

_iis_solve_delta

_iis_solve_delta (
        N,
        feature,
        f_sharp,
        empirical,
        prob_yx,
        )

Exceptions

"Newton's method did not converge"
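The entry above has no description. In Improved Iterative Scaling, the update delta for a feature f is typically the root of empirical - (1/N) * SUM P(y|x) f(x, y) exp(delta * f_sharp(x, y)) = 0, which can be found with Newton's method. The sketch below assumes that formulation and the sparse feature dictionaries described for _eval_feature_fn. The iteration limit, the tolerance, and the use of RuntimeError are assumptions, not documented behavior.

```python
import math


def iis_solve_delta(n, feature, f_sharp, empirical, prob_yx,
                    max_iterations=100, tolerance=1e-10):
    """Solve for one feature's IIS update with Newton's method."""
    delta = 0.0
    for _ in range(max_iterations):
        g = dg = 0.0
        for (i, j), f in feature.items():
            term = prob_yx[i][j] * f * math.exp(delta * f_sharp[i][j])
            g += term
            dg += term * f_sharp[i][j]
        residual = empirical - g / n   # function whose root we want
        slope = -dg / n                # its derivative with respect to delta
        step = residual / slope
        delta -= step
        if abs(step) < tolerance:
            return delta
    raise RuntimeError("Newton's method did not converge")


# One instance, two classes, one feature firing for class 0.
delta = iis_solve_delta(1, {(0, 0): 1}, [[1, 0]], 0.75, [[0.5, 0.5]])
print(delta)
```

In this toy case the equation reduces to 0.75 = 0.5 * exp(delta), so delta approaches log(1.5). A feature that never fires makes the slope zero, which this sketch does not guard against.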

_train_iis

_train_iis (
        xs,
        classes,
        features,
        f_sharp,
        alphas,
        e_empirical,
        )

Do one iteration of hill climbing to find better alphas. This is a good function to parallelize.

calculate

calculate ( me,  observation )

calculate(me, observation) -> list of log probs

Calculate the log of the probability for each class. me is a MaxEntropy object that has been trained. observation is a vector representing the observed data. The return value is a list of unnormalized log probabilities for each class.
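A minimal sketch of such a score, assuming the trained object holds the classes, the feature functions, and one weight (alpha) per feature. The documentation above does not list the object's attributes, so the sketch takes them as plain arguments, and all names are illustrative.

```python
def calculate_scores(classes, feature_fns, alphas, observation):
    """Return one unnormalized log probability per class."""
    scores = []
    for klass in classes:
        # Weighted sum of the features that fire for this (observation, class).
        score = sum(alpha * fn(observation, klass)
                    for fn, alpha in zip(feature_fns, alphas))
        scores.append(score)
    return scores


# Hypothetical binary features over a two-element observation vector.
def first_on_pos(obs, klass):
    return 1 if klass == "pos" and obs[0] == 1 else 0


def second_off_neg(obs, klass):
    return 1 if klass == "neg" and obs[1] == 0 else 0


scores = calculate_scores(["pos", "neg"], [first_on_pos, second_off_neg],
                          [1.5, 0.5], (1, 0))
print(scores)
```

Because the scores are unnormalized, they can be compared with each other but do not exponentiate to probabilities that sum to 1.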

classify

classify ( me,  observation )

classify(me, observation) -> class

Classify an observation into a class.
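The documentation does not say how the class is chosen. A natural reading, assumed here, is that the class with the highest log probability wins. Normalization does not change which class is largest, so unnormalized scores are enough. The name pick_class is illustrative.

```python
def pick_class(classes, scores):
    """Return the class whose score is largest (first one on ties)."""
    best = max(range(len(classes)), key=lambda j: scores[j])
    return classes[best]


print(pick_class(["pos", "neg"], [1.5, 0.5]))
```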

train

train (
        training_set,
        results,
        feature_fns,
        update_fn=None,
        )

train(training_set, results, feature_fns[, update_fn]) -> MaxEntropy object

Train a maximum entropy classifier on a training set.

training_set is a list of observations. results is a list of the class assignments, one per observation and in the same order. feature_fns is a list of feature functions: callbacks that take an observation and a class and return 1 or 0. update_fn is an optional callback that is called at each training iteration. It is passed a MaxEntropy object that encapsulates the current state of the training.

Exceptions

"IIS did not converge"
ValueError, "No data in the training set."
ValueError, "training_set and results should be parallel lists."
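To make the expected inputs concrete, the sketch below builds a toy training set, two binary feature functions, and an update callback, and mirrors the two ValueError checks listed above. It does not call train itself. The helper check_training_inputs and all data are illustrative, not part of the module.

```python
def check_training_inputs(training_set, results):
    """Mirror the input checks suggested by the documented ValueErrors."""
    if not training_set:
        raise ValueError("No data in the training set.")
    if len(training_set) != len(results):
        raise ValueError("training_set and results should be parallel lists.")


# Each feature takes (observation, class) and returns 1 or 0.
def starts_with_g(observation, klass):
    return 1 if klass == "gc" and observation.startswith("G") else 0


def ends_with_t(observation, klass):
    return 1 if klass == "at" and observation.endswith("T") else 0


# Called once per training iteration with the current classifier state.
def report_progress(me):
    print("finished one training iteration")


training_set = ["GCGC", "ATAT", "GGCA"]
results = ["gc", "at", "gc"]          # parallel to training_set
feature_fns = [starts_with_g, ends_with_t]

check_training_inputs(training_set, results)
# These would then be passed as:
# train(training_set, results, feature_fns, report_progress)
```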

Classes

MaxEntropy

Holds information for a Maximum Entropy classifier.

This document was automatically generated on Mon Jul 1 12:03:03 2002 by HappyDoc version 2.0.1