The standard neural network implementation of actor-critic links two sub-networks, an actor and a critic, and trains them in two steps in reverse order: the critic's value estimate is updated first and then used to improve the actor's policy. However, many recent approaches share the input and most of the hidden layers between the actor and the critic, blurring this distinction; what remains separate are the two output heads, one for the policy and one for the value function.
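As a concrete illustration of this shared-layer design, the sketch below (assuming a PyTorch-style API; the class name, layer sizes, and activation choices are illustrative) builds a single network whose actor and critic share a common trunk and differ only in their output heads.

```python
# A minimal sketch of a shared-trunk actor-critic network (assumed PyTorch API).
# The actor and critic share the input and hidden layers; only the output
# heads (policy logits vs. scalar state value) are separate.
import torch
import torch.nn as nn


class SharedActorCritic(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        # Shared trunk: both the actor and the critic read these features.
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Actor head: logits over discrete actions (the policy).
        self.policy_head = nn.Linear(hidden, n_actions)
        # Critic head: a scalar state-value estimate V(s).
        self.value_head = nn.Linear(hidden, 1)

    def forward(self, obs: torch.Tensor):
        features = self.trunk(obs)
        return self.policy_head(features), self.value_head(features).squeeze(-1)


# Example: one forward pass for a batch of 4 observations of dimension 8.
logits, values = SharedActorCritic(obs_dim=8, n_actions=3)(torch.randn(4, 8))
```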
A3C (Asynchronous Advantage Actor-Critic) is one such approach. It is trained with several parallel actors whose gradient updates are pooled into a shared set of parameters, which decorrelates the training data and stabilizes learning, reducing the risk of overfitting.
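The following is a minimal sketch of the update a single A3C-style worker might perform on a short rollout (assuming PyTorch; the function name, hyperparameters, and bootstrapping detail are illustrative, not a reference implementation). In the full algorithm, many such workers interact with their own environment copies in parallel and apply gradients asynchronously to shared parameters.

```python
# A minimal sketch of a per-worker advantage actor-critic update (assumed
# PyTorch API). The model is a shared-trunk actor-critic like the one above.
import torch
from torch.distributions import Categorical


def a3c_worker_update(model, optimizer, observations, actions, rewards,
                      gamma: float = 0.99, value_coef: float = 0.5,
                      entropy_coef: float = 0.01):
    """One n-step update from a short rollout.

    observations: (T, obs_dim) tensor, actions: (T,) long tensor,
    rewards: list of T floats collected by this worker.
    """
    logits, values = model(observations)          # shared trunk, two heads
    dist = Categorical(logits=logits)

    # n-step returns, bootstrapping from the critic's estimate for the last
    # state in the rollout (illustrative; variants bootstrap from the state
    # reached after the rollout).
    returns = []
    ret = values[-1].detach()
    for r in reversed(rewards):
        ret = r + gamma * ret
        returns.append(ret)
    returns = torch.stack(list(reversed(returns)))

    advantages = returns - values                  # advantage estimate A(s, a)
    policy_loss = -(dist.log_prob(actions) * advantages.detach()).mean()
    value_loss = advantages.pow(2).mean()          # critic regression loss
    entropy_bonus = dist.entropy().mean()          # encourages exploration

    loss = policy_loss + value_coef * value_loss - entropy_coef * entropy_bonus
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```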
- alias:
- subtype: Asynchronous Advantage Actor-Critic (A3C)
- has functional building block: FBB_Behavioural modelling
- has input data type: IDT_Vector of quantitative variables, IDT_Vector of categorical variables, IDT_Binary vector
- has internal model: INM_Markov decision process, INM_Neural network
- has output data type: ODT_Classification
- has learning style: LST_Reinforcement
- has parametricity: PRM_Nonparametric with hyperparameter(s)
- has relevance: REL_Relevant
- uses:
- sometimes supports:
- mathematically similar to: