Key research themes
1. How can policy gradient and natural gradient actor-critic methods be adapted for improved sample efficiency and stability in reinforcement learning with function approximation?
This research area investigates actor-critic algorithms built on gradient-based policy optimization, focusing on sample efficiency, convergence stability, and compatibility with function approximation, especially neural networks. It addresses challenges such as high-variance gradient estimates and off-policy evaluation, and it employs natural gradients that respect the geometry of the policy parameterization, enabling robust learning in large or continuous state and action spaces (a minimal actor-critic update of this kind is sketched after the list of themes).
2. How can temporally extended actions (options) and hierarchical policies be autonomously learned and optimized using policy gradient and natural gradient methods?
This research theme focuses on learning hierarchical policies through options, i.e. temporally extended actions, to enable scalable and efficient reinforcement learning. It advances methods that autonomously discover, optimize, and terminate options within a unifying framework, applying policy gradient theory and natural gradient approaches to learn intra-option policies, termination conditions, and policies over options without predefined subgoals or additional reward signals (an option-level update of this kind is sketched after the list of themes).
3. What are effective algorithmic adaptations and architectures for actor-critic reinforcement learning tailored to control and autonomous systems under safety, sample-efficiency, and application-specific constraints?
Research under this theme develops specialized actor-critic methods for real-world control applications, addressing challenges such as constrained computation and memory (e.g., in IoT devices), safety-critical environments, sample inefficiency, and model bias. Techniques include adaptive learning rates, biologically plausible training methods, multi-step evaluation, integration of robust control techniques, and human-inspired experience inference, all aimed at improving the reactivity, convergence speed, stability, and robustness of reinforcement learning controllers (an n-step critic target, one such ingredient, is sketched below).
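To make the first theme concrete, here is a minimal sketch of a one-step advantage actor-critic update: a tabular critic supplies a value baseline, and the TD error weights the log-likelihood gradient of a softmax policy. The toy chain environment, tabular representation, step sizes, and episode count are illustrative assumptions rather than details from the surveyed work; a natural-gradient variant would additionally precondition the actor step with the inverse Fisher information of the policy.

```python
# Minimal one-step advantage actor-critic on a toy chain MDP (illustrative sketch).
import numpy as np

N_STATES, N_ACTIONS = 5, 2                      # chain states; actions: 0 = left, 1 = right
GAMMA, ALPHA_ACTOR, ALPHA_CRITIC = 0.99, 0.05, 0.1

rng = np.random.default_rng(0)
theta = np.zeros((N_STATES, N_ACTIONS))         # softmax policy parameters
w = np.zeros(N_STATES)                          # state-value estimates (tabular critic)

def policy(s):
    prefs = theta[s] - theta[s].max()           # stabilised softmax
    p = np.exp(prefs)
    return p / p.sum()

def step(s, a):
    """Move along the chain; reaching the right end yields +1 and terminates."""
    s_next = min(max(s + (1 if a == 1 else -1), 0), N_STATES - 1)
    done = s_next == N_STATES - 1
    return s_next, (1.0 if done else 0.0), done

for episode in range(500):
    s, done = 0, False
    while not done:
        probs = policy(s)
        a = rng.choice(N_ACTIONS, p=probs)
        s_next, r, done = step(s, a)

        # Critic: TD(0) error, which also serves as the advantage estimate
        # because the state value acts as the baseline.
        delta = r + (0.0 if done else GAMMA * w[s_next]) - w[s]
        w[s] += ALPHA_CRITIC * delta

        # Actor: policy-gradient step with grad log pi(a|s) of a softmax policy.
        grad_log_pi = -probs
        grad_log_pi[a] += 1.0
        theta[s] += ALPHA_ACTOR * delta * grad_log_pi

        s = s_next

print("P(right | s):", np.round([policy(s)[1] for s in range(N_STATES)], 2))
```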
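For the second theme, the sketch below follows the general shape of option-critic-style updates: a tabular critic over state-option-action values, softmax intra-option policies, and sigmoid termination functions, each adjusted from a single transition. The sizes, learning rates, and the greedy bootstrap over options are illustrative assumptions, not a faithful reproduction of any specific algorithm covered by this theme.

```python
# Sketch of a single option-level update (tabular critic, softmax intra-option
# policies, sigmoid terminations). Sizes and step sizes are illustrative assumptions.
import numpy as np

N_STATES, N_OPTIONS, N_ACTIONS = 6, 2, 3
GAMMA, LR_CRITIC, LR_INTRA, LR_TERM = 0.99, 0.5, 0.25, 0.25

q_u = np.zeros((N_STATES, N_OPTIONS, N_ACTIONS))     # Q_U(s, option, a)
theta = np.zeros((N_STATES, N_OPTIONS, N_ACTIONS))   # intra-option policy params
vartheta = np.zeros((N_STATES, N_OPTIONS))           # termination params

def intra_option_policy(s, o):
    prefs = theta[s, o] - theta[s, o].max()
    p = np.exp(prefs)
    return p / p.sum()

def termination_prob(s, o):
    return 1.0 / (1.0 + np.exp(-vartheta[s, o]))     # beta(s, option)

def q_option(s, o):
    """Q(s, option) as the intra-option policy's expectation of Q_U."""
    return intra_option_policy(s, o) @ q_u[s, o]

def update(s, o, a, r, s_next, done):
    beta_next = termination_prob(s_next, o)
    q_next = np.array([q_option(s_next, o2) for o2 in range(N_OPTIONS)])

    # Critic: continue the current option with prob. (1 - beta), otherwise
    # bootstrap from the greedy option at the next state.
    bootstrap = (1.0 - beta_next) * q_next[o] + beta_next * q_next.max()
    target = r + (0.0 if done else GAMMA * bootstrap)
    q_u[s, o, a] += LR_CRITIC * (target - q_u[s, o, a])

    # Intra-option policy gradient: grad log pi(a | s, option) weighted by Q_U.
    probs = intra_option_policy(s, o)
    grad_log_pi = -probs
    grad_log_pi[a] += 1.0
    theta[s, o] += LR_INTRA * grad_log_pi * q_u[s, o, a]

    # Termination gradient: increase beta where the option's advantage over
    # the best alternative at s_next is negative (switching looks better).
    advantage = q_next[o] - q_next.max()
    grad_beta = beta_next * (1.0 - beta_next)        # derivative of the sigmoid
    vartheta[s_next, o] -= LR_TERM * grad_beta * advantage

# Example: one update for a dummy transition taken under option 1.
update(s=0, o=1, a=2, r=1.0, s_next=3, done=False)
```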
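The third theme is largely about adapting these pieces to concrete control settings, but one listed ingredient, multi-step evaluation, is simple to illustrate: the helper below computes n-step bootstrapped critic targets for a short rollout segment. The function name, trajectory layout, and n = 3 horizon are assumptions made for the example.

```python
# Sketch of n-step bootstrapped critic targets for an actor-critic controller.
import numpy as np

def n_step_targets(rewards, values, dones, gamma=0.99, n=3):
    """Compute targets G_t = r_t + ... + gamma^(n-1) r_{t+n-1} + gamma^n V(s_{t+n}).

    rewards, dones: length-T arrays for one rollout segment.
    values: length T+1 array of critic estimates V(s_0) ... V(s_T).
    """
    T = len(rewards)
    targets = np.zeros(T)
    for t in range(T):
        g, discount = 0.0, 1.0
        for k in range(t, min(t + n, T)):
            g += discount * rewards[k]
            discount *= gamma
            if dones[k]:            # stop bootstrapping at episode boundaries
                break
        else:
            g += discount * values[min(t + n, T)]
        targets[t] = g
    return targets

# Example: a 5-step rollout segment that ends with a terminal transition.
rewards = np.array([0.0, 0.0, 1.0, 0.0, 2.0])
dones   = np.array([False, False, False, False, True])
values  = np.array([0.5, 0.6, 0.7, 0.4, 0.3, 0.0])
print(n_step_targets(rewards, values, dones))
```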