Long Short-Term Memory Network

Algorithm

A long short-term memory (LSTM) network is a neural network used to process sequential data.

LSTM networks are commonly used in tasks where the input data is sequential, such as video frames or audio recordings. They belong to the class of recurrent neural networks (RNNs) and are designed to overcome a key limitation of traditional RNNs: gradients that vanish or explode over long sequences, which makes long-range dependencies hard to learn. LSTMs address this with a more complex unit architecture built around an explicit, gated memory cell.

An LSTM network consists of a chain of units, each containing a set of gates and a memory cell. The main data flows through the network much as in other neural networks, while the memory is managed by gates that control the flow of information. These gates include the input gate, output gate, and forget gate, which regulate the addition, retention, and removal of information in the memory cell.
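To make the gating concrete, here is a minimal sketch of a single LSTM time step in Python with NumPy. The weight layout (dicts W, U, b keyed by gate name) is a hypothetical convention chosen for readability, not a standard API.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b are hypothetical dicts holding the
    learned input weights, recurrent weights, and biases for each gate
    ('i', 'f', 'o') and the candidate update 'g'."""
    i = sigmoid(W['i'] @ x + U['i'] @ h_prev + b['i'])  # input gate: how much new info to write
    f = sigmoid(W['f'] @ x + U['f'] @ h_prev + b['f'])  # forget gate: how much old memory to keep
    o = sigmoid(W['o'] @ x + U['o'] @ h_prev + b['o'])  # output gate: how much memory to expose
    g = np.tanh(W['g'] @ x + U['g'] @ h_prev + b['g'])  # candidate memory content
    c = f * c_prev + i * g        # updated memory cell
    h = o * np.tanh(c)            # new hidden state (the unit's output)
    return h, c
```

Processing a sequence then amounts to calling lstm_step once per element, threading h and c forward from step to step.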

Unlike general random-access memory, the memory in an LSTM network is not addressed by index; it is accessed through learned weights. Gate activations scale each memory element continuously between 0 and 1, so the network learns to selectively retain, overwrite, or expose parts of its memory by adjusting those weights during training.
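A toy numerical example (values made up for illustration) shows what weight-mediated access means in practice: no address is selected; every entry of the memory cell is scaled element-wise by a gate activation.

```python
import numpy as np

c_prev = np.array([0.5, -1.2, 0.3])   # previous memory cell (toy values)
f      = np.array([0.9,  0.1, 1.0])   # forget-gate activations, one per entry
write  = np.array([0.0,  0.7, -0.2])  # gated new content (input gate * candidate)

c = f * c_prev + write                # soft, element-wise read-modify-write
print(c)                              # approximately [0.45, 0.58, 0.1]
```

Because every operation is a smooth function of the weights, this read/write behavior can be trained end to end by gradient descent.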

Variants of LSTM networks include Gated Recurrent Unit (GRU) networks and peephole LSTMs, which differ in how their gates and vector operations are arranged; a minimal GRU sketch appears below. Hidden LSTM (H-LSTM) networks feature gates composed of entire neural networks, unlike the single-layer gates in classic LSTMs.
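For comparison, here is a GRU step under the same hypothetical weight-dictionary convention as the LSTM sketch above. The GRU merges the memory cell and hidden state into a single vector and uses two gates (update and reset) instead of three.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h_prev, W, U, b):
    """One GRU time step; there is no separate memory cell, only h."""
    z = sigmoid(W['z'] @ x + U['z'] @ h_prev + b['z'])        # update gate
    r = sigmoid(W['r'] @ x + U['r'] @ h_prev + b['r'])        # reset gate
    g = np.tanh(W['g'] @ x + U['g'] @ (r * h_prev) + b['g'])  # candidate state
    return (1 - z) * h_prev + z * g                           # blend old and new
```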

An excellent general introduction to LSTMs, complete with diagrams, is available online.

Alias
LSTM
Related terms
Neural Network
Attention
Recurrent Neural Networks