Projection Pursuit

Supporting Technique

Projection pursuit is an iterative dimensionality reduction method that focuses on preserving the interesting features of the data.

Projection pursuit is used in scenarios where identifying and preserving the most informative projections of high-dimensional data is crucial, such as in exploratory data analysis and visualization. It helps in reducing the dimensionality of the data while retaining the most significant structures and patterns.

Projection pursuit works by iteratively finding the projections of the data that are most “interesting,” typically defined as the projections that are least Gaussian. The algorithm builds up one dimension at a time, initially finding the least Gaussian projection, then removing that projection from the input data, and then finding the next least Gaussian projection, and so on.
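The find-and-remove loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: non-Gaussianity is scored by absolute excess kurtosis (one of several possible indices), the search is a crude random sampling of unit vectors, and "removing that projection" is implemented as plain subtraction of the found component. The names excess_kurtosis and projection_pursuit, the parameter n_candidates and the synthetic data are all hypothetical.

```python
import numpy as np

def excess_kurtosis(P):
    """Excess kurtosis of each column of P (close to 0 for Gaussian columns)."""
    Z = (P - P.mean(axis=0)) / P.std(axis=0)
    return (Z ** 4).mean(axis=0) - 3.0

def projection_pursuit(X, n_components, n_candidates=2000, seed=0):
    """Return (directions, components) found one dimension at a time."""
    rng = np.random.default_rng(seed)
    residual = X - X.mean(axis=0)
    directions, components = [], []
    for _ in range(n_components):
        # Assumed search strategy: random unit vectors as candidates.
        W = rng.normal(size=(n_candidates, X.shape[1]))
        W /= np.linalg.norm(W, axis=1, keepdims=True)
        # Score each candidate by how far its projection is from Gaussian.
        scores = np.abs(excess_kurtosis(residual @ W.T))
        best = W[np.argmax(scores)]
        directions.append(best)
        components.append(residual @ best)
        # Remove the found projection from the data before the next pass.
        residual = residual - np.outer(residual @ best, best)
    return np.array(directions), np.column_stack(components)

# Synthetic data: one clearly bimodal axis and two Gaussian axes.
rng = np.random.default_rng(1)
n = 1000
bimodal = rng.choice([-2.0, 2.0], size=n) + 0.3 * rng.normal(size=n)
X = np.column_stack([bimodal, rng.normal(size=n), rng.normal(size=n)])

directions, components = projection_pursuit(X, n_components=2)
print(directions[0])  # should lie close to the first axis, up to sign
```

Subtracting the found component is the simplest way to remove it, and it makes later directions behave as if they were orthogonal to earlier ones. Other removal schemes exist that do not have this effect.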

For example, imagine a two-dimensional space in which objects are distributed randomly and independently along each axis. If you shine a light diagonally from one corner to the other, the shadows cast on the opposite side mix both coordinates, so their distribution is closer to Gaussian than that of the shadows cast along either axis alone. By finding the projections where the distribution of the shadows is most unlike a Gaussian distribution, projection pursuit identifies the most informative dimensions.
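The shadow example can be checked numerically. The sketch below assumes that "randomly distributed" means independent and uniform along each axis, and it uses excess kurtosis as the measure of distance from Gaussian (0 for a Gaussian); both are assumptions made for illustration. With only two axes the diagonal shadow is not exactly Gaussian, only closer to it.

```python
import numpy as np

def excess_kurtosis(z):
    """Excess kurtosis of a 1-D sample; 0 for a Gaussian distribution."""
    z = (z - z.mean()) / z.std()
    return np.mean(z ** 4) - 3.0

rng = np.random.default_rng(0)
# Assumed setup: points spread uniformly and independently along both axes.
points = rng.uniform(-1.0, 1.0, size=(100_000, 2))

# Shadow cast along one axis versus shadow cast along the diagonal.
axis_shadow = points @ np.array([1.0, 0.0])
diagonal_shadow = points @ (np.array([1.0, 1.0]) / np.sqrt(2.0))

kurt_axis = excess_kurtosis(axis_shadow)          # theory: -1.2 (uniform)
kurt_diagonal = excess_kurtosis(diagonal_shadow)  # theory: -0.6 (triangular)
print(kurt_axis, kurt_diagonal)
```

The diagonal shadow has the smaller absolute excess kurtosis, so a search for the least Gaussian projection would prefer the axis-aligned view.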

One hyperparameter for projection pursuit is the number of target dimensions, which determines how many times the find-and-remove step is repeated. As with principal component analysis, the target dimensions are built up one at a time, from the least Gaussian projection to the next least Gaussian projection and so on.

The example above is trivial, but imagine the same idea applied to points around the outline of an object, projected from a three-dimensional space onto a two-dimensional one. In that case, the least Gaussian projections are likely to correspond closely to the results of a principal component analysis of the same data. One important difference, however, is that in principal component analysis the target vectors are necessarily orthogonal, which need not be the case in general projection pursuit.
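The orthogonality constraint on the principal component analysis side can be seen directly. The sketch below assumes principal component analysis is computed through a singular value decomposition of the centred data, and the data itself is synthetic and hypothetical. A general projection pursuit search places no such constraint on its directions.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 500 correlated points in three dimensions.
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(500, 3)) @ mixing
centered = X - X.mean(axis=0)

# Rows of Vt are the principal directions, ordered by decreasing variance.
_, singular_values, Vt = np.linalg.svd(centered, full_matrices=False)
pca_directions = Vt[:2]

# The Gram matrix of the directions is the identity: they are orthonormal.
gram = pca_directions @ pca_directions.T
print(np.round(gram, 6))
```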

Alias
PP
Related terms
Dimensionality Reduction, Principal Component Analysis, Non-Gaussian Projections