Projection pursuit


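Projection pursuit is an iterative, essentially brute-force search method for dimensionality reduction that aims to preserve the most interesting features of the data, where a projection counts as interesting if its distribution departs strongly from a Gaussian.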

Imagine a two-dimensional space in which objects are distributed randomly and independently along each axis (for example, uniformly). If you shine a light diagonally from one corner to the opposite corner, the shadows cast on a screen perpendicular to the light will take on a bell-like shape that resembles a Gaussian (normal) distribution, with many shadows in the middle and fewer towards the two ends. If you try shining the light from all sides, the two projections in which the distribution of the shadows is most unlike a Gaussian distribution will be those directly along the two axes. These are the most interesting projections because they recover the two dimensions along which the objects were originally distributed.
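The light-and-shadow picture above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the objects are drawn uniformly on a square, and "unlike a Gaussian" is measured by the absolute excess kurtosis of the shadows, which is one possible index among several (the text does not prescribe one). The helper names `excess_kurtosis` and `project` are made up for this sketch.

```python
import math
import random


def excess_kurtosis(values):
    """Excess kurtosis of a sample; it is 0 for a Gaussian distribution."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0


def project(points, angle_deg):
    """Shadow of each 2D point on a line at the given angle to the x axis."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [x * c + y * s for x, y in points]


# Assumption: objects are uniformly and independently distributed on each axis.
rng = random.Random(0)
points = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(20000)]

# Try the light from many sides: score each direction by how far its
# shadows are from Gaussian (larger score = less Gaussian).
scores = {a: abs(excess_kurtosis(project(points, a))) for a in range(0, 180, 15)}
best_angle = max(scores, key=scores.get)

for angle, score in scores.items():
    print(f"{angle:3d} degrees: non-Gaussianity score {score:.2f}")
print("least Gaussian direction:", best_angle, "degrees")
```

In theory the score is about 1.2 along either axis (a uniform distribution) and falls to about 0.6 on the diagonal (a triangular distribution), so the search should settle on 0 or 90 degrees.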

The hyperparameter for projection pursuit is the number of target dimensions. As with principal component analysis, the algorithm builds up one dimension at a time: it first finds the least Gaussian projection, then removes that projection from the input data, then finds the next least Gaussian projection, and so on.
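The one-dimension-at-a-time procedure can be sketched as follows. This is a minimal illustration rather than a reference implementation, and it rests on several assumptions not stated in the text: the data are two-dimensional, candidate directions are searched on a coarse grid of angles, non-Gaussianity is scored by absolute excess kurtosis, and "removing" a projection is interpreted as replacing the data's coordinate along the found direction with Gaussian-distributed values of the same rank order. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def excess_kurtosis(values):
    """Excess kurtosis of a sample; it is 0 for a Gaussian distribution."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0


def project(points, angle_deg):
    """Coordinate of each 2D point along the direction at the given angle."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [x * c + y * s for x, y in points]


def gaussianize_along(points, angle_deg):
    """Assumed 'removal' step: make the data Gaussian along one direction.

    Each point is moved along the direction so that its projected value
    becomes the normal quantile matching its rank; the coordinate
    perpendicular to the direction is left unchanged.
    """
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    proj = project(points, angle_deg)
    n = len(proj)
    order = sorted(range(n), key=proj.__getitem__)
    normal = NormalDist()
    target = [0.0] * n
    for rank, idx in enumerate(order):
        target[idx] = normal.inv_cdf((rank + 0.5) / n)
    return [
        (x + (g - p) * c, y + (g - p) * s)
        for (x, y), p, g in zip(points, proj, target)
    ]


def pursue(points, n_directions, step_deg=5):
    """Greedy search: find the least Gaussian direction, remove it, repeat."""
    found = []
    for _ in range(n_directions):
        scores = {
            a: abs(excess_kurtosis(project(points, a)))
            for a in range(0, 180, step_deg)
        }
        best = max(scores, key=scores.get)
        found.append(best)
        points = gaussianize_along(points, best)
    return found


# Assumed test data: a uniform square rotated by 30 degrees, so the
# interesting directions lie near 30 and 120 degrees.
rng = random.Random(1)
rot = math.radians(30)
data = []
for _ in range(5000):
    u, v = rng.uniform(-1, 1), rng.uniform(-1, 1)
    data.append((u * math.cos(rot) - v * math.sin(rot),
                 u * math.sin(rot) + v * math.cos(rot)))

found_angles = pursue(data, n_directions=2)
print("directions found (degrees):", found_angles)
```

Here `n_directions` plays the role of the hyperparameter. Because the removal step only reshapes the data along the direction already found, nothing in this sketch forces the next direction to be orthogonal to the previous one; the two directions happen to be roughly perpendicular only because the test data were built that way.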

The above example is trivial, but imagine the same idea applied to a three-dimensional space containing points around the outline of an object, projected onto a two-dimensional space, and you will see that the least Gaussian projections are likely to correspond closely to the results of a principal component analysis of the same data. One important difference, however, is that with principal component analysis the target vectors are necessarily orthogonal, which need not be the case with general projection pursuit.

has functional building block
FBB_Dimensionality reduction
has input data type
IDT_Vector of quantitative variables
has internal model
has output data type
ODT_Vector of quantitative variables
has learning style
has parametricity
PRM_Nonparametric with hyperparameter(s)
has relevance
sometimes supports
mathematically similar to