Table of Contents

Module: MaxEntropy Bio/Tools/Classification/MaxEntropy.py

MaxEnt.py

Maximum Entropy code.

Uses Improved Iterative Scaling: XXX ref

# XXX need to define terminology

Imported modules   
from Bio.Tools import listfns
from Numeric import *
import math
Functions   
_calc_empirical_expects
_calc_f_sharp
_calc_model_expects
_calc_p_class_given_x
_eval_feature_fn
_iis_solve_delta
_train_iis
calculate
classify
train
  _calc_empirical_expects 
_calc_empirical_expects (
        xs,
        ys,
        classes,
        features,
        )

_calc_empirical_expects(xs, ys, classes, features) -> list of expectations

Calculate the empirical expectation of each feature function from the training data. These expectations are the constraints on the maximum entropy distribution. Return a list of expectations, parallel to the list of features.
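
A minimal sketch of the idea, not the module's code: the empirical expectation of a feature is its average value over the observed (instance, class) pairs. The helper name empirical_expectations, the toy data, and the use of plain callables as features are assumptions made for illustration.

```python
def empirical_expectations(xs, ys, feature_fns):
    """Average each feature over the observed (instance, class) pairs.

    feature_fns holds plain callables here; that is an assumption of this
    sketch, not the module's internal representation.
    """
    n = float(len(xs))
    return [sum(fn(x, y) for x, y in zip(xs, ys)) / n for fn in feature_fns]


# Hypothetical toy data: weather observations and the activity chosen.
xs = ["sunny", "rainy", "sunny", "rainy"]
ys = ["out", "in", "out", "out"]


def sunny_and_out(x, y):
    return 1 if x == "sunny" and y == "out" else 0


def rainy_and_in(x, y):
    return 1 if x == "rainy" and y == "in" else 0


# One expectation per feature, in the same order as the feature list.
expects = empirical_expectations(xs, ys, [sunny_and_out, rainy_and_in])
```

The result is parallel to the feature list, matching the return value described above.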

  _calc_f_sharp 
_calc_f_sharp (
        N,
        nclasses,
        features,
        )

_calc_f_sharp(N, nclasses, features) -> matrix of f sharp values.

  _calc_model_expects 
_calc_model_expects (
        xs,
        classes,
        features,
        alphas,
        )

_calc_model_expects(xs, classes, features, alphas) -> list of expectations.

Calculate the expectation of each feature under the model. This is not used in maximum entropy training, but is useful for debugging.

  _calc_p_class_given_x 
_calc_p_class_given_x (
        xs,
        classes,
        features,
        alphas,
        )

_calc_p_class_given_x(xs, classes, features, alphas) -> matrix

Calculate P(y|x), where y is the class and x is an instance from the training set. Return a len(xs) x len(classes) matrix of probabilities.
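
As a rough illustration, assuming the usual log-linear form in which P(y|x) is proportional to exp(sum of alpha_i * f_i(x, y)), the matrix could be built as follows. The helper name p_class_given_x, the toy data, and the callable features are assumptions of this sketch; the module itself works on Numeric arrays.

```python
import math


def p_class_given_x(xs, classes, feature_fns, alphas):
    """Return one row per instance and one column per class."""
    rows = []
    for x in xs:
        # Weighted feature sum for every candidate class.
        scores = [
            sum(a * fn(x, c) for a, fn in zip(alphas, feature_fns))
            for c in classes
        ]
        # Subtract the maximum before exponentiating to avoid overflow.
        top = max(scores)
        weights = [math.exp(s - top) for s in scores]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows


# Hypothetical toy model with two binary features.
def sunny_and_out(x, y):
    return 1 if x == "sunny" and y == "out" else 0


def rainy_and_in(x, y):
    return 1 if x == "rainy" and y == "in" else 0


probs = p_class_given_x(
    ["sunny", "rainy"], ["out", "in"], [sunny_and_out, rainy_and_in], [1.0, 2.0]
)
```

Each row sums to 1 because the scores are normalized over the classes for that instance.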

  _eval_feature_fn 
_eval_feature_fn (
        fn,
        xs,
        classes,
        )

_eval_feature_fn(fn, xs, classes) -> dict of values

Evaluate a feature function on every (training instance, class) pair. fn is a callback function that takes two parameters: a training instance and a class. Return a dictionary of (training set index, class index) -> non-zero value. Values of 0 are not stored in the dictionary.
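
The sparse dictionary described above can be sketched like this. The helper name eval_feature and the toy data are hypothetical; only the dictionary layout comes from the description.

```python
def eval_feature(fn, xs, classes):
    """Map (instance index, class index) to the non-zero values of fn."""
    values = {}
    for i, x in enumerate(xs):
        for j, cls in enumerate(classes):
            value = fn(x, cls)
            if value != 0:  # zeros are left out to keep the table sparse
                values[(i, j)] = value
    return values


# Hypothetical binary feature: fires only for ("sunny", "out").
def sunny_and_out(x, y):
    return 1 if x == "sunny" and y == "out" else 0


table = eval_feature(sunny_and_out, ["sunny", "rainy"], ["out", "in"])
```

Storing only the non-zero entries keeps later loops proportional to the number of active (instance, class) pairs rather than to the full grid.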

  _iis_solve_delta 
_iis_solve_delta (
        N,
        feature,
        f_sharp,
        empirical,
        prob_yx,
        )
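
No description is given for this function; the exception it documents suggests that the update for one feature weight is found with Newton's method. The sketch below assumes the usual Improved Iterative Scaling update equation, empirical - (1/N) * sum of P(y|x) * f(x, y) * exp(delta * f_sharp(x, y)) = 0, solved for delta. The name solve_delta, the max_iterations and tolerance parameters, and the use of RuntimeError are assumptions of this sketch, not the module's interface.

```python
import math


def solve_delta(n, feature, f_sharp, empirical, prob_yx,
                max_iterations=100, tolerance=1e-10):
    """Solve the assumed IIS update equation for one feature by Newton's method.

    feature is a sparse dict {(i, j): value} with at least one entry;
    f_sharp and prob_yx are indexed as [instance][class].
    """
    delta = 0.0
    for _ in range(max_iterations):
        total = 0.0
        d_total = 0.0
        for (i, j), f in feature.items():
            term = prob_yx[i][j] * f * math.exp(delta * f_sharp[i][j])
            total += term
            d_total += term * f_sharp[i][j]
        g = empirical - total / n   # function whose root is sought
        dg = -d_total / n           # its derivative with respect to delta
        step = g / dg
        delta -= step
        if abs(step) < tolerance:
            return delta
    raise RuntimeError("Newton's method did not converge")


# Hypothetical one-instance problem: 0.75 - 0.5 * exp(delta) = 0.
delta = solve_delta(1, {(0, 0): 1}, [[1.0]], 0.75, [[0.5]])
```

For this toy input the root is ln(1.5), which gives a quick check that the iteration behaves as intended.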

Exceptions   
"Newton's method did not converge"
  _train_iis 
_train_iis (
        xs,
        classes,
        features,
        f_sharp,
        alphas,
        e_empirical,
        )

Do one iteration of hill climbing to find better alphas. This is a good function to parallelize.

  calculate 
calculate ( me,  observation )

calculate(me, observation) -> list of log probs

Calculate the log of the probability for each class. me is a MaxEntropy object that has been trained. observation is a vector representing the observed data. The return value is a list of unnormalized log probabilities for each class.
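
A minimal sketch of what an unnormalized log probability could look like, assuming a log-linear model in which each class is scored by the weighted sum of its feature values. The helper log_scores and its explicit classes, feature_fns, and alphas arguments are hypothetical; the real function reads its state from the MaxEntropy object.

```python
def log_scores(classes, feature_fns, alphas, observation):
    """Return one unnormalized log probability per class."""
    return [
        sum(a * fn(observation, c) for a, fn in zip(alphas, feature_fns))
        for c in classes
    ]


# Hypothetical binary features and weights.
def sunny_and_out(x, y):
    return 1 if x == "sunny" and y == "out" else 0


def rainy_and_in(x, y):
    return 1 if x == "rainy" and y == "in" else 0


scores = log_scores(["out", "in"], [sunny_and_out, rainy_and_in], [1.0, 2.0], "rainy")
```

Because the values are unnormalized, they are meaningful for comparing classes but do not exponentiate to probabilities that sum to 1.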

  classify 
classify ( me,  observation )

classify(me, observation) -> class

Classify an observation into a class.
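
One plausible way to turn per-class log probabilities into a decision is to pick the class with the highest value. The sketch below assumes that rule; pick_class is a hypothetical helper, not part of the module.

```python
def pick_class(classes, log_probs):
    """Return the class whose log probability is largest."""
    # On ties, max() keeps the first index it encounters.
    best = max(range(len(classes)), key=lambda j: log_probs[j])
    return classes[best]


# Hypothetical scores for two classes.
label = pick_class(["out", "in"], [0.0, 2.0])
```

Normalizing the scores first would not change the answer, since normalization preserves their order.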

  train 
train (
        training_set,
        results,
        feature_fns,
        update_fn=None,
        )

train(training_set, results, feature_fns[, update_fn]) -> MaxEntropy object

Train a maximum entropy classifier on a training set. training_set is a list of observations. results is a list of the class assignments, one per observation and in the same order. feature_fns is a list of feature functions: callbacks that take an observation and a class and return 1 or 0. update_fn is an optional callback that is called at each training iteration. It is passed a MaxEntropy object that encapsulates the current state of the training.

Exceptions   
"IIS did not converge"
ValueError, "No data in the training set."
ValueError, "training_set and results should be parallel lists."
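
The sketch below shows inputs in the shape train expects and mirrors the two documented ValueError checks. It does not call the module; check_training_inputs, report_progress, and the toy data are hypothetical names introduced for illustration.

```python
def check_training_inputs(training_set, results):
    """Reproduce the two input checks listed under Exceptions."""
    if not training_set:
        raise ValueError("No data in the training set.")
    if len(training_set) != len(results):
        raise ValueError("training_set and results should be parallel lists.")


# Parallel lists: results[i] is the class assigned to training_set[i].
training_set = ["sunny", "rainy", "sunny", "rainy"]
results = ["out", "in", "out", "out"]


# Feature functions take (observation, class) and return 1 or 0.
def sunny_and_out(observation, cls):
    return 1 if observation == "sunny" and cls == "out" else 0


def rainy_and_in(observation, cls):
    return 1 if observation == "rainy" and cls == "in" else 0


feature_fns = [sunny_and_out, rainy_and_in]

iterations_seen = []


def report_progress(me):
    """Example update_fn: record that one training iteration finished."""
    iterations_seen.append(me)


check_training_inputs(training_set, results)
# With the module available, the documented call would then be:
# me = train(training_set, results, feature_fns, update_fn=report_progress)
```

Checking the inputs up front gives a clear error before any training iterations are spent.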
Classes   
MaxEntropy

Holds information for a Maximum Entropy classifier.



This document was automatically generated on Mon Jul 1 12:03:03 2002 by HappyDoc version 2.0.1