A Bayesian network is a graph whose nodes represent variables. Each variable can take two or more values, each with an associated probability. The links between nodes model dependency relationships between pairs of variables.
A Bayesian network can be used both for classification and to investigate the interrelationships between the variables within a system. For classification, the input nodes are set to match a specific data item and the classification probabilities are read off the output nodes. A probability table capturing all possible combinations of all variables would be a special-case Bayesian network where all node pairs are joined by a link. Because such a table would grow exponentially with respect to the number of variables, the practical aim is generally to minimize the number of links by only linking nodes where a genuine dependency relationship exists.
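As a rough illustration of both points, the sketch below classifies one data item with a tiny network and compares its size with that of a full probability table. The network (Flu linked to Fever and to Cough), the variable names, and all the probabilities are invented for the example and do not come from any real data set.

```python
from itertools import product

# Hypothetical network: Flu -> Fever, Flu -> Cough (all variables binary).
# All numbers below are made up for illustration.
p_flu = {True: 0.1, False: 0.9}
p_fever_given_flu = {True: 0.8, False: 0.1}   # P(Fever=True | Flu)
p_cough_given_flu = {True: 0.7, False: 0.2}   # P(Cough=True | Flu)


def joint(flu, fever, cough):
    """P(Flu, Fever, Cough) as the product of each node's local probability."""
    pf = p_fever_given_flu[flu] if fever else 1 - p_fever_given_flu[flu]
    pc = p_cough_given_flu[flu] if cough else 1 - p_cough_given_flu[flu]
    return p_flu[flu] * pf * pc


def posterior_flu(fever, cough):
    """P(Flu=True | Fever, Cough): set the inputs, normalize over the output."""
    num = joint(True, fever, cough)
    den = num + joint(False, fever, cough)
    return num / den


# Sanity check: the factored joint sums to 1 over all 8 combinations.
total = sum(joint(*values) for values in product([True, False], repeat=3))

# Free parameters for n = 3 binary variables.
n = 3
full_table_params = 2 ** n - 1   # one entry per combination, minus one
factored_params = 1 + 2 + 2      # P(Flu), P(Fever | Flu), P(Cough | Flu)

print(posterior_flu(fever=True, cough=True))
print(full_table_params, factored_params)
```

With only three variables the saving is small, but the full table doubles with every added binary variable, whereas the sparse network grows only with the number of nodes and the number of parents per node.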
A standard Bayesian network is a directed acyclic graph. The link directions need not express causality: they are chosen according to how the nodes are to be used, as inputs or outputs, and according to which topology allows the most information to be modelled with the fewest links. A causal network is a subtype where link directions express causality.
A Bayesian network can be created entirely by a subject-matter expert and used to classify input data without any machine learning. This approach is used, for example, in medicine, where Bayesian networks capture the likelihood of a patient with certain symptoms having a certain illness. However, there are also several ways in which Bayesian networks appear in a machine-learning context:
- The structure of the network is given and training data is used to derive the probabilities associated with each node.
- The network contains intermediate nodes for which data is missing. A hidden Markov model, commonly used to model time-sequence data, is a well-known type of Bayesian network in which some variables cannot be observed. Various algorithms, including the expectation-maximization algorithm, can be used to learn the probabilities despite the missing values.
- Various techniques can also be used to discover the network structure itself, determining which variables are mutually dependent, from training data.
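The first case in the list, where the structure is given and only the probabilities are learned, can be sketched as simple counting over fully observed records. The two-node structure (rain linked to wet grass), the records, and the helper name `estimate_cpt` are all hypothetical. The optional `alpha` parameter adds Laplace smoothing, which is one common choice and not the only one.

```python
from collections import Counter

# Hypothetical, fully observed training records for the fixed
# structure Rain -> WetGrass.
records = [
    {"rain": True, "wet": True},
    {"rain": True, "wet": True},
    {"rain": True, "wet": False},
    {"rain": False, "wet": False},
    {"rain": False, "wet": False},
    {"rain": False, "wet": True},
    {"rain": False, "wet": False},
]


def estimate_cpt(records, child, parent, alpha=0.0):
    """Estimate P(child=True | parent) by counting.

    alpha > 0 adds Laplace smoothing; the factor 2 is the number of
    values a binary child can take.
    """
    parent_counts = Counter(r[parent] for r in records)
    child_true_counts = Counter(r[parent] for r in records if r[child])
    return {
        value: (child_true_counts[value] + alpha) / (parent_counts[value] + 2 * alpha)
        for value in parent_counts
    }


cpt = estimate_cpt(records, child="wet", parent="rain")
smoothed = estimate_cpt(records, child="wet", parent="rain", alpha=1.0)
print(cpt)
print(smoothed)
```

Counting of this kind no longer works directly once some nodes are unobserved, which is where iterative methods such as expectation-maximization come in.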
The defining difference between a Bayesian network and a Markov random field is that a Bayesian network is directed while a Markov random field is undirected. There are some graphs that can be converted from one type to the other without losing information, apart from the directedness, but each type of model is also able to capture information that the other type cannot. Contrary to what the name might suggest, a hidden Markov model is directed and thus a sub-type of Bayesian network rather than a sub-type of Markov random field.
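One standard example of information that a directed graph captures and an undirected graph cannot is the collider, where two otherwise independent causes share a common effect (A and B both link into C). The sketch below uses invented probabilities and hypothetical variable names to show that A and B are independent on their own but become dependent once C is observed.

```python
from itertools import product

# Hypothetical collider A -> C <- B with binary variables.
# All numbers are made up for illustration.
p_a = 0.3   # P(A=True)
p_b = 0.4   # P(B=True)
# P(C=True | A, B): C is likely when either cause is present.
p_c = {
    (True, True): 0.95,
    (True, False): 0.8,
    (False, True): 0.8,
    (False, False): 0.05,
}


def joint(a, b, c):
    """P(A, B, C) = P(A) * P(B) * P(C | A, B)."""
    pa = p_a if a else 1 - p_a
    pb = p_b if b else 1 - p_b
    pc = p_c[(a, b)] if c else 1 - p_c[(a, b)]
    return pa * pb * pc


def prob(query, given=lambda a, b, c: True):
    """P(query | given) by summing the joint over all 8 assignments."""
    states = list(product([True, False], repeat=3))
    den = sum(joint(*s) for s in states if given(*s))
    num = sum(joint(*s) for s in states if given(*s) and query(*s))
    return num / den


p_a_marginal = prob(lambda a, b, c: a)
p_a_given_b = prob(lambda a, b, c: a, given=lambda a, b, c: b)
p_a_given_c = prob(lambda a, b, c: a, given=lambda a, b, c: c)
p_a_given_bc = prob(lambda a, b, c: a, given=lambda a, b, c: b and c)

print(p_a_marginal, p_a_given_b)    # equal: B alone says nothing about A
print(p_a_given_c, p_a_given_bc)    # differ: given C, B changes belief in A
```

An undirected graph that links A to C and B to C would imply that A and B are independent given C, which is the opposite of what the collider encodes.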
Bayesian networks are used in various fields such as medicine, finance, and engineering to model complex systems and make probabilistic inferences. They are valuable because they provide a clear framework for understanding dependencies among variables and can handle uncertainty effectively.
- Alias
- Bayesian belief network, Belief network
- Related terms