Binary decomposition is a technique used in machine learning to solve multiclass or ordinal multiclass prediction problems.
It reduces a multiclass problem to a series of binary classification tasks, which are easier to learn and can be solved independently. The final classification is determined by voting, potentially taking the confidence of the binary classifiers into account.
One popular binary decomposition method is one-vs-rest, also known as one-vs-all.
In this method, a binary classifier is trained for each class to distinguish that class from all the other classes.
For example, in a problem with three classes A, B, and C, three binary classifiers would be trained: one to distinguish A from B or C, one to distinguish B from A or C, and one to distinguish C from A or B.
The final prediction is the class whose binary classifier reports the highest confidence score.
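The one-vs-rest procedure can be sketched as follows. This is a minimal illustration, not a reference implementation: the nearest-centroid scorer in `fit_binary`, the helper names, and the toy data are all assumptions made for the example, and any binary classifier that returns a confidence score could take its place.

```python
from typing import Callable, Dict, List, Sequence

Point = Sequence[float]
Scorer = Callable[[Point], float]


def centroid(points: List[Point]) -> List[float]:
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def fit_binary(pos: List[Point], neg: List[Point]) -> Scorer:
    """Toy binary classifier (an assumption for this sketch).

    The returned score is positive when x lies closer to the positive
    centroid than to the negative one, and grows with the margin.
    """
    c_pos, c_neg = centroid(pos), centroid(neg)
    return lambda x: sq_dist(x, c_neg) - sq_dist(x, c_pos)


def fit_one_vs_rest(X: List[Point], y: List[str]) -> Dict[str, Scorer]:
    """Train one 'this class vs. all other classes' scorer per class."""
    models = {}
    for label in sorted(set(y)):
        pos = [x for x, t in zip(X, y) if t == label]
        neg = [x for x, t in zip(X, y) if t != label]
        models[label] = fit_binary(pos, neg)
    return models


def predict_one_vs_rest(models: Dict[str, Scorer], x: Point) -> str:
    """Return the class whose binary scorer is most confident."""
    scores = {label: score(x) for label, score in models.items()}
    return max(scores, key=scores.get)


# Toy data: three well-separated classes in two dimensions.
X = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.0, 5.0), (0.1, 5.2)]
y = ["A", "A", "B", "B", "C", "C"]
models = fit_one_vs_rest(X, y)
prediction = predict_one_vs_rest(models, (0.1, 0.0))
```

With K classes this trains K binary models, and each model sees the full training set.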
Another binary decomposition method is one-vs-one, also known as all-pairs.
In this method, a binary classifier is trained for every pair of classes to distinguish one from the other.
For example, in a problem with three classes A, B, and C, three binary classifiers would be trained: one to distinguish A from B, one to distinguish A from C, and one to distinguish B from C.
The final prediction is determined by a majority vote among all the binary classifiers.
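A comparable sketch for one-vs-one is shown below. The nearest-centroid pairwise classifier, the helper names, and the toy data are again assumptions chosen to keep the example self-contained. Ties in the vote are not handled specially here; a real system might break them using confidence scores.

```python
from collections import Counter
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

Point = Sequence[float]
PairModel = Callable[[Point], str]


def centroid(points: List[Point]) -> List[float]:
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def fit_pair(pts_a: List[Point], pts_b: List[Point], a: str, b: str) -> PairModel:
    """Toy pairwise classifier (an assumption): vote for the nearer centroid."""
    c_a, c_b = centroid(pts_a), centroid(pts_b)
    return lambda x: a if sq_dist(x, c_a) <= sq_dist(x, c_b) else b


def fit_one_vs_one(X: List[Point], y: List[str]) -> Dict[Tuple[str, str], PairModel]:
    """Train one classifier for every unordered pair of classes."""
    by_class: Dict[str, List[Point]] = {}
    for x, t in zip(X, y):
        by_class.setdefault(t, []).append(x)
    return {
        (a, b): fit_pair(by_class[a], by_class[b], a, b)
        for a, b in combinations(sorted(by_class), 2)
    }


def predict_one_vs_one(models: Dict[Tuple[str, str], PairModel], x: Point) -> str:
    """Each pairwise model casts one vote; the most-voted class wins."""
    votes = Counter(model(x) for model in models.values())
    return votes.most_common(1)[0][0]


# Toy data: three well-separated classes in two dimensions.
X = [(0.0, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (0.0, 5.0), (0.1, 5.2)]
y = ["A", "A", "B", "B", "C", "C"]
models = fit_one_vs_one(X, y)
prediction = predict_one_vs_one(models, (5.0, 5.1))
```

With K classes this trains K(K-1)/2 models, but each model only sees the training examples of its two classes.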
Ordinal multiclass prediction, a special case where a reasonable order of the classes can be defined, can be decomposed by stacked binary decisions.
First, one binary classifier decides whether the class is greater than (>) or less than or equal to (<=) the median of the available classes.
As in a binary search, the procedure then continues with the median of the remaining classes until a final decision is reached. (cf. this paper)
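The stacked decisions can be sketched as a binary search over the ordered class list. The function `is_greater` stands in for the trained binary classifiers and is hypothetical: here it is faked with fixed thresholds on a single numeric feature, purely for illustration.

```python
from typing import Callable, Sequence


def predict_ordinal_tree(
    classes: Sequence[str],
    is_greater: Callable[[float, str], bool],
    x: float,
) -> str:
    """Narrow down the ordered class list with one binary decision per step.

    classes must be sorted from lowest to highest. is_greater(x, pivot)
    answers the question "is the class of x greater than pivot?".
    """
    lo, hi = 0, len(classes) - 1
    while lo < hi:
        mid = (lo + hi) // 2  # lower median of the remaining classes
        if is_greater(x, classes[mid]):
            lo = mid + 1  # keep only the classes above the pivot
        else:
            hi = mid  # keep the pivot and the classes below it
    return classes[lo]


# Hypothetical stand-in for trained classifiers: fixed cut-offs on one feature.
# The highest class is never used as a pivot, so it needs no threshold.
classes = ["Small", "Medium", "Large", "XLarge"]
thresholds = {"Small": 10.0, "Medium": 20.0, "Large": 30.0}


def is_greater(x: float, pivot: str) -> bool:
    return x > thresholds[pivot]


prediction = predict_ordinal_tree(classes, is_greater, 15.0)
```

Each prediction needs only about log2(K) binary decisions for K classes, but an error in an early decision cannot be corrected by later ones.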
Alternatively, all binary classifications of the form “is it greater than class X” can be performed simultaneously.
In this case, the differences between the predicted probabilities are used to determine the final prediction.
Given three classes Small, Medium, and Large, the prediction probability P(Medium) can be calculated as the difference of P(greater than Small) and P(greater than Medium). (cf. this paper)
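The difference rule generalizes to any number of ordered classes: the lowest class gets 1 - P(greater than lowest), the highest class gets P(greater than second highest), and every class in between gets the difference of two neighboring "greater than" probabilities. The sketch below assumes the binary probabilities are already available; the function name and the example numbers are illustrative only.

```python
from typing import Dict, Sequence


def class_probabilities(
    classes: Sequence[str], p_greater: Sequence[float]
) -> Dict[str, float]:
    """Turn 'greater than' probabilities into per-class probabilities.

    classes is sorted from lowest to highest, and p_greater[k] is the
    estimated P(y > classes[k]) for every class except the highest one.
    """
    if len(p_greater) != len(classes) - 1:
        raise ValueError("need one probability per class except the highest")
    # Pad with P(y > nothing) = 1 and P(y > highest class) = 0.
    bounds = [1.0] + list(p_greater) + [0.0]
    return {c: bounds[k] - bounds[k + 1] for k, c in enumerate(classes)}


classes = ["Small", "Medium", "Large"]
# Illustrative values for P(greater than Small) and P(greater than Medium).
probs = class_probabilities(classes, [0.75, 0.25])
prediction = max(probs, key=probs.get)
# probs == {"Small": 0.25, "Medium": 0.5, "Large": 0.25}; prediction == "Medium"
```

Because the binary classifiers are trained independently, their estimates are not guaranteed to decrease from one class to the next, so a difference can come out negative in practice and may need clipping or renormalization.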
The choice of method depends on the specific problem and available resources. One-vs-all is generally simpler to implement and can handle imbalanced class distributions, while all-pairs can be more accurate for some problems but requires more computational resources, since the number of classifiers grows quadratically with the number of classes.
- Alias: One vs Rest, One vs All, All Pairs
- Related terms: Multiclass Classification, Ordinal Multiclass Classification, Ensemble