The aim of hierarchical clustering is to represent the similarity relationships within a set of vector data as a tree structure. In a very simple case, imagine four villages on a map, arranged in two groups of two. Successful hierarchical clustering would identify the houses that make up the villages as forming four clusters, with the clusters themselves arranged in two groups of two.
There are two different approaches to hierarchical clustering:
- divisive or top-down, where the algorithm starts with a single cluster and successively splits it up into its constituent structures;
- agglomerative or bottom-up, where each input data item starts off life in its own cluster and the clusters are gradually merged to form the tree structure.
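The bottom-up approach can be sketched in a few lines. The following is a minimal single-linkage illustration, not a production implementation; the function name `agglomerative` and the stopping rule (halt at a target number of clusters rather than building the full tree) are choices made here for brevity.

```python
from math import dist

def agglomerative(points, n_clusters):
    """Single-linkage agglomerative clustering, stopped at n_clusters."""
    # Bottom-up: every point starts off in a cluster of its own.
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        # Find the pair of clusters whose closest members are nearest
        # (single linkage), then merge that pair.
        _, i, j = min(
            (min(dist(a, b) for a in ci for b in cj), i, j)
            for i, ci in enumerate(clusters)
            for j, cj in enumerate(clusters)
            if i < j
        )
        clusters[i] += clusters.pop(j)
    return clusters

# Four "villages" on a line, in two groups of two: each close pair is
# merged first, and the run stops once two groups remain.
villages = [(0, 0), (1, 0), (10, 0), (11, 0)]
print(agglomerative(villages, 2))
# [[(0, 0), (1, 0)], [(10, 0), (11, 0)]]
```

Running the merge loop to completion, instead of stopping early, would yield the full tree structure described above.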
Any measure of distance can be plugged into a hierarchical clustering algorithm. Alongside standard geometric measures such as Euclidean distance, this extends to domain-specific measures like Levenshtein distance in natural-language processing.
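To illustrate that the distance measure really is a pluggable component, here is a sketch that passes Levenshtein distance into a generic single-linkage merge loop and clusters a handful of words. The helper names `levenshtein` and `cluster` are illustrative, not from any particular library.

```python
def levenshtein(a, b):
    """Edit distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # delete ca
                           cur[j - 1] + 1,               # insert cb
                           prev[j - 1] + (ca != cb)))    # substitute
        prev = cur
    return prev[-1]

def cluster(items, n_clusters, distance):
    """Agglomerative single-linkage clustering with a pluggable metric."""
    clusters = [[x] for x in items]
    while len(clusters) > n_clusters:
        _, i, j = min(
            (min(distance(a, b) for a in ci for b in cj), i, j)
            for i, ci in enumerate(clusters)
            for j, cj in enumerate(clusters)
            if i < j
        )
        clusters[i] += clusters.pop(j)
    return clusters

# Words one edit apart end up grouped together.
print(cluster(["cat", "cart", "card", "dog", "dogs", "dig"], 2, levenshtein))
```

Swapping `levenshtein` for any other function of two items changes the notion of similarity without touching the clustering logic itself.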
[Property table: divisive vs. agglomerative hierarchical clustering, comparing functional building block, input data type (vector of quantitative variables), internal model, output data type, learning style, parametricity, relevance, sometimes supports, and mathematically similar to.]