Expectation maximization

Algorithm

Expectation maximization (EM) is an algorithm used when data within a multidimensional space is clustered around two or more local centres. Its outputs are:

  • the centres of each cluster; and
  • the probabilities of each data item belonging to each cluster.
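As a sketch of how those two outputs arise, the loop below implements EM for a mixture of spherical Gaussians in plain NumPy. This is an illustrative sketch under simplifying assumptions (spherical clusters, one variance per cluster); the function and variable names are my own, not from the original text.

```python
import numpy as np

def em_gmm(X, k, n_iter=100, seed=0):
    """Minimal EM for a mixture of spherical Gaussians (an illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    centres = X[rng.choice(n, size=k, replace=False)]  # random initial centres
    variances = np.ones(k)
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: probability (responsibility) of each cluster for each data item.
        sq_dist = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        log_p = (np.log(weights)
                 - 0.5 * d * np.log(2 * np.pi * variances)
                 - 0.5 * sq_dist / variances)
        log_p -= log_p.max(axis=1, keepdims=True)  # for numerical stability
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate centres, variances and weights from responsibilities.
        nk = resp.sum(axis=0)
        centres = (resp.T @ X) / nk[:, None]
        sq_dist = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        variances = (resp * sq_dist).sum(axis=0) / (d * nk)
        weights = nk / n
    return centres, resp  # the cluster centres and the membership probabilities

# Two well-separated 2-D clusters of 100 points each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(8, 1, (100, 2))])
centres, probs = em_gmm(X, k=2)
```

The returned `centres` are the two cluster centres, and each row of `probs` gives one data item's probability of belonging to each cluster, matching the two outputs listed above.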

Expectation maximization is very similar to k-means:

  • The standard versions of both algorithms take the number of clusters as a hyperparameter: you need to know how many clusters there are before you start;
     
  • Both are initialized with random cluster centres that then move to optimal positions over a series of iterations;
     
  • Neither is mathematically guaranteed to converge to the global (theoretically best) optimum;
     
  • In both cases, the output can be influenced by the initially chosen random positions.
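The shared skeleton described above (a cluster-count hyperparameter, random initial centres, iterative refinement towards a possibly local optimum) can be seen by sketching k-means in the same style; this is my own illustrative code, not part of the original text:

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Minimal k-means, sharing EM's overall shape: random init, then iterate."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]  # random initial centres
    for _ in range(n_iter):
        # Assignment step: hard-assign each item to its nearest centre
        # (where EM would instead compute soft membership probabilities).
        labels = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2).argmin(axis=1)
        # Update step: move each centre to the mean of its assigned items.
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return centres, labels

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(8, 1, (100, 2))])
centres, labels = kmeans(X, k=2)
```

Both sketches start from randomly chosen centres and iterate, so rerunning either with a different seed can land on a different (possibly worse) solution, which is the sensitivity to initialization noted above.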

The main differences are:

  • Expectation maximization (in its standard, Gaussian mixture form) assumes that data items follow a multivariate Gaussian distribution around their respective cluster centres, while k-means makes no such distributional assumption;
     
  • Expectation maximization can cope with variances that differ from dimension to dimension, while k-means favours fairly spherical clusters;
     
  • Both during its iterations and in its output, expectation maximization gives a probability of each data item belonging to each cluster (a soft assignment), while k-means makes a hard best guess as to which cluster each data item belongs to.
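The soft-versus-hard distinction can be made concrete with a small worked example. Assuming (hypothetically) two one-dimensional Gaussian clusters with centres 0 and 4, unit variance and equal weight, the membership probabilities follow from Bayes' rule:

```python
import numpy as np

centres = np.array([0.0, 4.0])  # two 1-D cluster centres; unit variance, equal weight assumed

def membership_probs(x):
    """Posterior probability of each cluster given x (Bayes' rule, equal priors)."""
    log_p = -0.5 * (x - centres) ** 2
    p = np.exp(log_p - log_p.max())  # subtract the max for numerical stability
    return p / p.sum()

print(membership_probs(0.1))           # near centre 0: roughly [0.9995, 0.0005]
print(membership_probs(2.0))           # exact midpoint: [0.5, 0.5]
print(membership_probs(2.0).argmax())  # a hard, k-means-style guess must pick one cluster
```

At the midpoint between the centres, EM honestly reports a 50/50 split, whereas a k-means-style hard assignment is forced to choose one cluster and discards that ambiguity.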
alias
  EM; Non-parametric expectation maximization (npEM)
subtype
  Gaussian mixture (GMM)
has functional building block
  FBB_Classification
has input data type
  IDT_Vector of quantitative variables
has internal model
has output data type
  ODT_Vector of quantitative variables; ODT_Classification; ODT_Probability
has learning style
  LST_Unsupervised
has parametricity
  PRM_Nonparametric with hyperparameter(s)
has relevance
  REL_Relevant
uses
sometimes supports
  ALG_Naive Bayesian Classifier; ALG_Bayesian network
mathematically similar to
  ALG_k-means