Table of Contents

Module: kNN Bio/Tools/Classification/kNN.py

kNN.py

This module provides code for doing k-nearest-neighbors classification.

k Nearest Neighbors is a supervised learning algorithm that classifies a new observation based on the classes of the training examples in its surrounding neighborhood.

Glossary:
    distance      The distance between two points in the feature space.
    weight        The importance given to each point for classification.

Classes:
    kNN           Holds information for a nearest neighbors classifier.

Functions:
    train         Train a new kNN classifier.
    calculate     Calculate the probabilities of each class, given an observation.
    classify      Classify an observation into a class.

Weighting Functions:
    equal_weight  Every example is given a weight of 1.

Imported modules   
from Bio.Tools import listfns, distance
Functions   
calculate
classify
equal_weight
train
  calculate 
calculate (
        knn,
        x,
        weight_fn=equal_weight,
        distance_fn=distance.euclidean,
        )

calculate(knn, x[, weight_fn][, distance_fn]) -> weight dict

Calculate the probability for each class. knn is a kNN object. x is the observed data. weight_fn is an optional function that takes x and a training example and returns a weight. distance_fn is an optional function that takes two points and returns the distance between them. Returns a dictionary that maps each class to the weight given to that class.
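The description above can be sketched in plain Python. This is an illustration only, not the module's implementation: calculate_sketch, euclidean, and the choice to pass xs, ys, and k directly (rather than a kNN object) are assumptions made to keep the example self-contained, and the weights are left unnormalized.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def equal_weight(x, y):
    """Every training example counts the same."""
    return 1


def calculate_sketch(xs, ys, k, x, weight_fn=equal_weight, distance_fn=euclidean):
    """Return {class: summed weight} over the k training points nearest to x.

    Hypothetical stand-in for calculate(); it takes the training data
    directly instead of a kNN object.
    """
    # Rank training examples by their distance to the observation.
    ranked = sorted(zip(xs, ys), key=lambda pair: distance_fn(x, pair[0]))
    weights = defaultdict(float)
    # Only the k closest examples contribute to the class weights.
    for example, label in ranked[:k]:
        weights[label] += weight_fn(x, example)
    return dict(weights)


xs = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
ys = ["a", "a", "b", "b"]
result = calculate_sketch(xs, ys, 3, (0.5, 0.5))
```

With k = 3, the two "a" points and the nearer "b" point are the neighbors, so class "a" receives twice the weight of class "b".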

  classify 
classify (
        knn,
        x,
        weight_fn=equal_weight,
        distance_fn=distance.euclidean,
        )

classify(knn, x[, weight_fn][, distance_fn]) -> class

Classify an observation into a class. knn and x are as described for calculate. If not specified, weight_fn gives all neighbors equal weight and distance_fn is the Euclidean distance.
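One plausible reading is that classification picks the class with the largest weight in the dictionary that calculate returns. The sketch below assumes exactly that; pick_class is a hypothetical helper, and the way ties are broken (first class encountered wins) is an assumption rather than documented behavior.

```python
def pick_class(weights):
    """Return the class with the largest total weight.

    Hypothetical helper: `weights` is a {class: weight} dictionary like
    the one calculate is documented to return.
    """
    if not weights:
        raise ValueError("no class weights to choose from")
    # max() keeps the first key seen when several share the top weight.
    return max(weights, key=weights.get)


weights = {"a": 2.0, "b": 1.0}
predicted = pick_class(weights)
```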

  equal_weight 
equal_weight ( x,  y )

equal_weight(x, y) -> 1

Weighting function that gives every example a weight of 1.
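Any function with the same (x, y) signature could in principle be passed as weight_fn. As a sketch, the hypothetical inverse_distance_weight below gives closer training examples a larger weight; it is not part of this module, and the eps parameter is an assumption added to avoid division by zero.

```python
import math


def inverse_distance_weight(x, y, eps=1e-9):
    """Hypothetical weight_fn: closer training examples get larger weights."""
    # Euclidean distance between the observation and the training example.
    d = math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))
    # eps keeps the weight finite when x and y coincide.
    return 1.0 / (d + eps)


near = inverse_distance_weight((0.0, 0.0), (0.0, 1.0))
far = inverse_distance_weight((0.0, 0.0), (3.0, 4.0))
```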

  train 
train (
        xs,
        ys,
        k,
        typecode=None,
        )

train(xs, ys, k[, typecode]) -> kNN

Train a k nearest neighbors classifier on a training set. xs is a list of observations and ys is a list of the class assignments. Thus, xs and ys should contain the same number of elements. k is the number of neighbors that should be examined when doing the classification.
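Because kNN has no fitting step beyond remembering the training set, training can be sketched as validation plus storage. KNNSketch and train_sketch are hypothetical names, the attribute names are assumptions, and the typecode argument is left out because its meaning is not documented here.

```python
class KNNSketch:
    """Hypothetical container for the training data and k."""

    def __init__(self, xs, ys, k):
        self.xs = xs
        self.ys = ys
        self.k = k


def train_sketch(xs, ys, k):
    """Store the training set after checking the documented constraint."""
    # xs and ys must pair up one-to-one: one class label per observation.
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of elements")
    if k < 1:
        raise ValueError("k must be at least 1")
    # Copy the inputs so later changes by the caller do not alter the model.
    return KNNSketch(list(xs), list(ys), k)


model = train_sketch([(0.0, 0.0), (1.0, 1.0)], ["a", "b"], 1)
```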

Classes   
kNN

Holds information necessary to do nearest neighbors classification.


This document was automatically generated on Mon Jul 1 12:03:03 2002 by HappyDoc version 2.0.1