Neural actor-critic


The standard neural network implementation of actor-critic links two sub-networks, an actor and a critic, and trains them in two alternating steps: the critic is updated first to evaluate the current policy, and the actor is then updated using the critic's evaluation. Many recent approaches, however, share the inputs and most of the network layers between the actor and the critic. The only remaining distinction is that such networks keep separate outputs for the policy and the value function.
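
The shared-layer arrangement described above can be illustrated with a minimal NumPy sketch of a forward pass. The layer sizes, weight names and the `forward` function are assumptions made for illustration, not part of any particular published architecture, and no training step is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes chosen for illustration only.
N_INPUTS, N_HIDDEN, N_ACTIONS = 4, 16, 2

# Shared layer used by both the actor and the critic.
W_shared = rng.normal(scale=0.1, size=(N_INPUTS, N_HIDDEN))
b_shared = np.zeros(N_HIDDEN)

# Separate output heads: policy (actor) and state value (critic).
W_policy = rng.normal(scale=0.1, size=(N_HIDDEN, N_ACTIONS))
b_policy = np.zeros(N_ACTIONS)
W_value = rng.normal(scale=0.1, size=(N_HIDDEN, 1))
b_value = np.zeros(1)


def forward(state):
    """Return (action probabilities, state-value estimate) for one state."""
    hidden = np.tanh(state @ W_shared + b_shared)   # shared features
    logits = hidden @ W_policy + b_policy
    logits = logits - logits.max()                  # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()   # softmax policy output
    value = (hidden @ W_value + b_value)[0]         # scalar value output
    return probs, value


state = np.array([0.1, -0.2, 0.05, 0.3])
probs, value = forward(state)
```

Both outputs are computed from the same `hidden` features, so a gradient from either the policy loss or the value loss would adjust the shared layer.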

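A3C (Asynchronous Advantage Actor-Critic) is one such approach. It is trained using several parallel actors that explore independently and asynchronously pool their updates into a shared set of parameters, which serves to reduce overfitting to the experience of any single actor.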
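
The idea of parallel actors pooling their updates can be sketched on a toy problem. The example below is a simplified illustration and not the full A3C algorithm: it assumes a hypothetical two-armed bandit with a single state, uses plain arrays instead of a neural network, and runs the actors as Python threads that update the shared parameters without locking. All names (`REWARDS`, `theta`, `value`, `worker`) are invented for the sketch.

```python
import threading
import numpy as np

REWARDS = np.array([0.0, 1.0])   # toy two-armed bandit: arm 1 is better
theta = np.zeros(2)              # shared actor parameters (action logits)
value = np.zeros(1)              # shared critic parameter (state value)
LR = 0.1                         # hypothetical learning rate


def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()


def worker(seed, n_steps):
    rng = np.random.default_rng(seed)       # each actor explores independently
    for _ in range(n_steps):
        local_theta = theta.copy()          # sync local copy from shared params
        local_value = value[0]
        probs = softmax(local_theta)
        action = rng.choice(2, p=probs)
        reward = REWARDS[action]
        advantage = reward - local_value    # reward relative to critic's estimate
        grad_log_pi = -probs
        grad_log_pi[action] += 1.0          # gradient of log pi(action) w.r.t. logits
        # Apply both updates to the shared parameters without locking.
        theta[:] += LR * advantage * grad_log_pi
        value[:] += LR * advantage          # move value toward observed reward


threads = [threading.Thread(target=worker, args=(seed, 500)) for seed in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_probs = softmax(theta)     # policy after all actors have finished
```

Because thread scheduling is not deterministic, the exact final values differ between runs, but the pooled updates push the shared policy toward the better arm.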

Asynchronous Advantage Actor-Critic A3C
has functional building block
FBB_Behavioural modelling
has input data type
IDT_Vector of quantitative variables IDT_Vector of categorical variables IDT_Binary vector
has internal model
INM_Markov decision process INM_Neural network
has output data type
has learning style
has parametricity
PRM_Nonparametric with hyperparameter(s)
has relevance
sometimes supports
mathematically similar to