|
||||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | |||||||||
java.lang.Objectorg.metaqtl.algo.EMAlgorithm
public final class EMAlgorithm
This class defines methods to perform the EM-algorithm for univariate mixture of gaussians with known variances.
| Field Summary | |
|---|---|
static boolean |
DO_SEM
If true the best starting point is used to do Supplemental EM (SEM) in order to compute the DM Matrix.. |
static int |
EM_CONTINUE
The EM convergence status : continue |
static double |
EM_ERR
The EM convergence tolerance. |
static int |
EM_FAILURE
The EM convergence status : failure |
static int |
EM_ITER_MAX
The EM maximum number of iterations. |
static double |
EM_MIN_DISTANCE
The minimal mahalanobis distance between components. |
static int |
EM_OK
The EM convergence status : success |
static int |
EM_START
The number of replicates for a run of the EM-algorithm. |
static double |
TINY_Z_PROBA
The minimal value for Z proba in EM-clustering |
| Constructor Summary | |
|---|---|
EMAlgorithm()
|
|
| Method Summary | |
|---|---|
static void |
computeCOV(double[] sd,
EMResult theta)
Computes the variance-covariance matrix of the mixture component estimates. |
static void |
computeDM(double[] x,
double[] sd,
double[] mu,
EMResult theta)
Computes the matrix of derivatives of the update function. |
static EMResult |
doEM(double[] x,
double[] sd,
int k,
EMResult spoint)
Apply the EM-Algorithm on the given data set where x
is an array of observed value and sd an array of same
size than x which stores the standard deviations of
the observed values. |
static void |
emRate(EMResult theta,
double[] mu,
double[] pi)
Compute the euclidean distance between the new parameters and the old ones to obtain a simple approximation of the EM convergence rate. |
static void |
eStep(double[] x,
double[] sd,
EMResult theta)
The Expectation step : it just consists in updating the Z matrix, i.e the cluster membership probabilities. |
static void |
initEM(double[] x,
double[] sd,
EMResult theta,
java.util.Random rng)
Initializes the EM algorithm by randomly assigning observations to the clusters and do one M-Step to compute the first values of the parameters. |
static int |
iterate(double[] x,
double[] sd,
EMResult theta)
This method performs one iteration of the EM-algorithm. |
static int |
mStep(double[] x,
double[] sd,
EMResult theta,
double err)
The Maximization step : here this step is straighforward and simple analytical formula are applied to obtain the new parameter estimates. |
static void |
updateLoglikelihood(double[] x,
double[] sd,
EMResult theta)
Updates the loglikelihood value. |
static double[] |
updateMuVector(double[] x,
double[] sd,
EMResult theta)
Updates the vector of mixture components and returns the new values. |
static double[] |
updatePiVector(EMResult theta)
Updates the vector of mixture mixings and returns the new values. |
static void |
updateZMatrix(double[] x,
double[] sd,
EMResult theta)
Updates the Z matrix. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static int EM_START
public static int EM_ITER_MAX
public static double EM_ERR
public static double EM_MIN_DISTANCE
public static boolean DO_SEM
public static final double TINY_Z_PROBA
public static final int EM_OK
public static final int EM_CONTINUE
public static final int EM_FAILURE
| Constructor Detail |
|---|
public EMAlgorithm()
| Method Detail |
|---|
public static EMResult doEM(double[] x,
double[] sd,
int k,
EMResult spoint)
x
is an array of observed value and sd an array of same
size than x which stores the standard deviations of
the observed values. The underlying mixture model will have
k distinct components and its parameter estimates
will be returned as an EMResult. If soint
is not null then the EM will start at the given point specified by
the mixture parameters of soint. To control some
behaviours of the EM-algorithm use the global variables of the class.
x - the observed positions.sd - the standard deviation of the observed positions.k - the number of mixture components.spoint - the starting point for the EM-algortihm
EMResult from which the parameter estimates can be obtained.EMResult
public static void initEM(double[] x,
double[] sd,
EMResult theta,
java.util.Random rng)
x - the observed values.sd - the standard deviations of the observed values.theta - the current parameter estimates.
public static int iterate(double[] x,
double[] sd,
EMResult theta)
EM_OK, otherwise it will return EM_CONTINUE.
In some cases it can happen - mainly due to numerical roundoff errors -
that the likelihood value after the iteration is lower than the likelihood
before the iteration which in theory cannot happen. Then in this case the
method will return EM_FAILURE.
- Parameters:
x - the observed values.sd - the standard deviations of the observed values.theta - the current parameter estimates.
- Returns:
- the status of the iteration.
public static void eStep(double[] x,
double[] sd,
EMResult theta)
x - the observed values.sd - the standard deviations of the observed values.theta - the current parameter estimates.updateZMatrix(double[], double[], EMResult)
public static int mStep(double[] x,
double[] sd,
EMResult theta,
double err)
EM_OK, otherwise it will return EM_CONTINUE.
In some cases it can happen - mainly due to numerical roundoff errors -
that the likelihood value after the maximization is lower than the likelihood
before the iteration which in theory cannot happen. Then in this case the
method will return EM_FAILURE
- Parameters:
x - the observed values.sd - the standard deviations of the observed values.theta - the current parameter estimates.- See Also:
updateMuVector(double[], double[], EMResult),
EMAlgorithm#updatePiVector(double[], double[], EMResult),
updateLoglikelihood(double[], double[], EMResult)
public static void emRate(EMResult theta,
double[] mu,
double[] pi)
theta - the current parameter estimates.mu - the last estimates of the means.pi - the last estimates of the mixing.
public static double[] updateMuVector(double[] x,
double[] sd,
EMResult theta)
x - the observed values.sd - the standard deviations of the observed values.theta - the current parameter estimates.
public static double[] updatePiVector(EMResult theta)
theta - the current parameter estimates.
public static void updateZMatrix(double[] x,
double[] sd,
EMResult theta)
x - the observed values.sd - the standard deviations of the observed values.theta - the current parameter estimates.
public static void updateLoglikelihood(double[] x,
double[] sd,
EMResult theta)
x - the observed values.sd - the standard deviations of the observed values.theta - the current parameter estimates.
public static void computeDM(double[] x,
double[] sd,
double[] mu,
EMResult theta)
x - the data points.sd - the standard deviation of the data points.mu - the previous estimate of the meanstheta - the current parameter estimates.
public static void computeCOV(double[] sd,
EMResult theta)
DO_SEM has been
set to false then the observed information won't be
computed and the variance-covariance matrix will be
obtained by using only the complete information.
sd - the standard deviations.theta - the maximum likelihood estimate.
|
||||||||||
| PREV CLASS NEXT CLASS | FRAMES NO FRAMES | |||||||||
| SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD | |||||||||