Expectation maximization


Expectation maximization (EM) is an algorithm used when data within a multidimensional space is clustered around two or more local centres. Its outputs, illustrated in the sketch after this list, are:

  • the centres of each cluster; and
  • the probabilities of each data item belonging to each cluster.

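To make those two outputs concrete, here is a minimal sketch using scikit-learn's GaussianMixture class, which fits a Gaussian mixture by expectation maximization. The toy data, the choice of two components, and the fixed random seed are illustrative assumptions, not part of the algorithm itself.

```python
# Minimal sketch: fit a two-cluster Gaussian mixture with EM and read off
# its two outputs -- the cluster centres and the per-item membership
# probabilities. Assumes NumPy and scikit-learn are installed.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Toy data: two clouds of points around (0, 0) and (5, 5).
X = np.vstack([
    rng.normal(loc=[0, 0], scale=0.8, size=(100, 2)),
    rng.normal(loc=[5, 5], scale=0.8, size=(100, 2)),
])

# The number of clusters is the hyperparameter that must be chosen up front.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)

print(gmm.means_)            # output 1: the centre of each cluster
print(gmm.predict_proba(X))  # output 2: probability of each item belonging to each cluster
```
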
Expectation maximization is very similar to k-means:

  • The standard versions of both algorithms require a hyperparameter: you need to specify how many clusters there are before you start (see the sketch after this list);
  • Both are initialized with random cluster centres that then move to optimal positions over a series of iterations;
  • Neither is mathematically guaranteed to converge to the global (theoretically best) optimum;
  • In both cases, the output can be influenced by the initially chosen random positions.
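
A small sketch of these shared traits, again assuming scikit-learn is available: both estimators take the number of clusters as an up-front hyperparameter and start from random positions, so the fitted centres can depend on the seed.

```python
# Sketch: both KMeans and GaussianMixture need the cluster count up front
# and start from random positions, so the result can depend on the seed.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
X = np.vstack([
    rng.normal(loc=[0, 0], scale=1.0, size=(100, 2)),
    rng.normal(loc=[4, 4], scale=1.0, size=(100, 2)),
])

for seed in (0, 1, 2):
    km = KMeans(n_clusters=2, n_init=1, random_state=seed).fit(X)
    gm = GaussianMixture(n_components=2, n_init=1, random_state=seed).fit(X)
    # With well-separated data the centres usually agree across seeds,
    # but neither algorithm guarantees the globally best solution.
    print(seed, km.cluster_centers_.round(2), gm.means_.round(2))
```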

The main differences are:

  • Expectation maximization assumes that data items follow a multivariate Gaussian distribution around their respective cluster centres, while k-means makes no such assumption;
  • Expectation maximization can cope with different variances from dimension to dimension, while k-means requires fairly spherical clusters;
  • Both during its iterations and in its output, expectation maximization gives a probability of each data item belonging to each cluster, while k-means makes a best guess as to the single cluster each data item belongs to (see the sketch after this list).
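
The sketch below (same assumed libraries and toy data) contrasts the two kinds of output on deliberately elongated clusters: GaussianMixture with a full covariance matrix reports soft membership probabilities, while KMeans commits each item to a single cluster.

```python
# Sketch: elongated (non-spherical) clusters. EM with full covariances can model
# different variances per dimension; k-means only returns hard assignments.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(2)

# Two clusters stretched along the x-axis: large variance in x, small in y.
X = np.vstack([
    rng.normal(loc=[0, 0], scale=[3.0, 0.3], size=(150, 2)),
    rng.normal(loc=[0, 4], scale=[3.0, 0.3], size=(150, 2)),
])

gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0).fit(X)
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(gmm.predict_proba(X[:5]).round(3))  # soft output: a probability per cluster
print(km.labels_[:5])                     # hard output: one cluster index per item
```
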
Properties:

  • Related terms: EM; Non-parametric expectation maximization (npEM); Gaussian mixture (GMM)
  • Has input data type: IDT_Vector of quantitative variables
  • Has output data type: ODT_Vector of quantitative variables; ODT_Classification; ODT_Probability
  • Has parametricity: PRM_Nonparametric with hyperparameter(s)
  • Sometimes supports: ALG_Naive Bayesian Classifier; ALG_Bayesian network