Academia.edu

Deep Q-Networks

20 papers
4 followers
About this topic
Deep Q-Networks (DQN) are a class of reinforcement learning algorithms that combine Q-learning with deep neural networks to approximate the optimal action-value function. They enable agents to learn effective policies in high-dimensional state spaces by using experience replay and target networks to stabilize training.
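The two stabilizing mechanisms named in the topic description, experience replay and a target network, can be illustrated with a minimal sketch. It is not taken from any of the listed papers. The names `ReplayBuffer`, `td_targets`, and `target_q` are hypothetical, and `target_q` stands in for a frozen copy of the Q-network that returns one value per action.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-capacity store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        # A bounded deque evicts the oldest transition once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation between updates.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def td_targets(batch, target_q, gamma=0.99):
    """Compute y = r + gamma * max_a' Q_target(s', a') for each transition.

    `target_q` is an assumed callable mapping a state to a list of action
    values; terminal transitions (done=True) receive no bootstrap term.
    """
    targets = []
    for _state, _action, reward, next_state, done in batch:
        bootstrap = 0.0 if done else gamma * max(target_q(next_state))
        targets.append(reward + bootstrap)
    return targets
```

In a full agent, the online network would be regressed toward these targets on each sampled batch, and the weights behind `target_q` would be refreshed from the online network only periodically.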

Key research themes

1. How can Deep Q-Networks improve learning efficiency and performance robustness in autonomous navigation and path planning for mobile agents?

This theme explores the application of Deep Q-Networks (DQN) in guiding autonomous agents, such as mobile robots and vehicles, to efficiently navigate complex and dynamic environments. The research focuses on enhancing sample efficiency, overcoming high-dimensional state spaces, and ensuring generalizability and safety in navigation tasks. It studies the integration of DQN with techniques like experience replay, heuristic knowledge, and simulation environments to enable real-time decision-making in unknown or partially known spaces, addressing challenges in path planning, obstacle avoidance, and autonomous driving.

Key finding: This paper presents an approach combining DQN, experience replay, and heuristic knowledge to efficiently enable smart robot path planning and obstacle avoidance in unknown environments. The use of experience replay addresses... Read more
Key finding: The study applies DQN agents to the urban autonomous driving scenario simulated in CARLA, addressing policy learning for lane following and collision avoidance using sensor inputs and path planners. It highlights both... Read more
Key finding: This work demonstrates the effective use of DQN for mobile robot navigation and exploration in a biomedical operating room environment, leveraging rewards reflecting successful task completion and collision avoidance. It... Read more
Key finding: This research develops a DQN-based reinforcement learning control system for autonomous mobile robot navigation in dynamic environments simulated in Gazebo. The model integrates state-feedback from sensors and trains policies... Read more
Key finding: The paper proposes a convolutional neural network-driven DQN model for controlling a four-wheel mobile robot’s line tracking based on camera inputs in a Gazebo simulated environment. The model achieves superior tracking... Read more

2. In what ways can Deep Q-Networks be enhanced or adapted to address challenges in training stability, hyperparameter sensitivity, and time discretization robustness?

This research area investigates methodological innovations and theoretical analyses to improve DQN training efficiency, robustness to environmental and algorithmic parameters, and stability under different time discretizations. It encompasses algorithmic contributions such as dynamic reward mechanisms, capacity reduction strategies for experience replay, and theoretical formalizations about Q-function behavior in continuous or near-continuous time settings. The goal is to enhance the reliability and applicability of DQN in diverse real-world scenarios by addressing known limitations in training procedures and hyperparameter tuning.

Key finding: The authors theoretically prove the collapse of traditional Q-learning in the continuous-time limit and propose architectural and algorithmic adjustments called Deep Advantage Updating to maintain learning performance across... Read more
Key finding: This work presents a reinforcement learning framework utilizing continuous Deep Q-Learning techniques (notably NAF) for dynamic hyperparameter optimization in object tracking. By treating hyperparameter selection as a... Read more
Key finding: By experimentally reducing the capacity of Experience Replay in Deep Q-Learning across Atari games, this paper finds that moderate buffer size reduction (from 10,000 to 5,000) does not significantly impair performance,... Read more

3. How can Deep Q-Learning be applied to complex decision-making problems involving high-dimensional, combinatorial, or multi-agent action spaces such as financial portfolio trading, cloud load balancing, or multi-agent target search?

This theme focuses on the extension of Deep Q-Learning methodologies to domains with sophisticated action and state representations, including multi-asset financial markets, cloud computing infrastructures, and cooperative multi-agent systems. Research contributions include devising specialized discrete or combinatorial action spaces, mapping infeasible actions to feasible alternatives, hybrid learning architectures, and the use of distributed Q-learning to optimize collective decision-making. Such work addresses scaling challenges and practical applicability considerations for DQN-based solutions beyond simple control tasks.

Key finding: This study formulates portfolio trading as a Markov decision process with a discrete combinatorial action space representing buy/hold/sell decisions per asset. It introduces a novel mapping function to convert infeasible... Read more
Key finding: This paper proposes Rough Q-learning, integrating rough set theory with classical Q-learning to address overestimation bias in approximated Q-values. The approach improves algorithm stability and performance by minimizing the... Read more
Key finding: The authors propose a hybrid Deep Q Recurrent Neural Network (DQRNN) combining deep Q-networks and recurrent architectures to manage cloud load balancing, incorporating factors such as supply, demand, capacity, resource... Read more
Key finding: This paper develops a distributed multi-agent search strategy leveraging deep Q-learning with error-prone sensors to maximize expected information gain under statistical detection uncertainties (type I and II errors). It... Read more

All papers in Deep Q-Networks

The mobile robot is an intelligent device that can perform many everyday tasks. For autonomous operation, navigation based on a line on the ground is often used because it helps the robot move along a predefined path, simplifies the path... more
Deep Q-Learning has been successfully applied to a wide variety of tasks in the past several years. However, the architecture of the vanilla Deep Q-Network is not suited to deal with partially observable environments such as 3D video... more
This study proposes an enhanced method for preventing data breaches in mobile storage media by improving access control mechanisms through the integration of Deep Q-Network (DQN) algorithms. Building on attribute-based encryption (ABE)... more
In response to the complex demands of autonomous vehicle (AV) navigation in urban environments, this study explores a data-driven, reinforcement learning (RL)-based approach to optimize navigation for AVs, enhancing both efficiency and... more
This paper conducts a sociological analysis of artificial intelligence (AI), examining its benefits, concerns, and future implications for society. Through a multidimensional lens, it explores how AI technologies shape social structures,... more
Cloud services are among the technologies that are developing the fastest. Additionally, it is acknowledged that load balancing poses a major obstacle to reaching energy efficiency. Distributing the load among several resources in order... more
This systematic review, carried out under the PRISMA methodology, aims to identify how reinforcement learning has been used in demand forecasting, distinguishing the problems they are trying to overcome, recognizing the algorithms used,... more
The rapid advancement of Artificial Intelligence (AI) technologies has significantly transformed various sectors, including healthcare, finance, and transportation. However, these developments raise critical ethical concerns that require... more
Reinforcement Learning (RL) has emerged as a pivotal area in artificial intelligence, revolutionizing the way agents learn optimal behaviors through interaction with their environment. This paper explores the evolution of RL techniques,... more
This paper proposes new algorithms to improve Reinforcement Learning (RL) and Deep Q-Network (DQN) methods for path planning considering uncertainty in the perception of environment. The study aimed to formulate and solve the path... more
This paper presents a modification of the deep Q-network (DQN) in deep reinforcement learning to control the angle of the inverted pendulum (IP). The original DQN method often uses two actions related to two force states like constant... more
Reinforcement learning (RL) algorithms still suffer from high sample complexity despite outstanding recent successes. The need for intensive interactions with the environment is especially observed in many widely popular policy gradient... more
Reinforcement learning constantly deals with hard integrals, for example when computing expectations in policy evaluation and policy iteration. These integrals are rarely analytically solvable and typically estimated with the Monte Carlo... more
In reinforcement learning (RL), an agent learns an environment through trial and error. This behavior allows the agent to learn in complex and difficult environments. In RL, the agent normally learns the given environment by exploring or... more
The desire to make applications and machines more intelligent and the aspiration to enable their operation without human interaction have been driving innovations in neural networks, deep learning, and other machine learning techniques.... more
Artificial Intelligence (AI) is becoming a critical component in the defense industry, as recently demonstrated by DARPA's AlphaDogfight Trials (ADT). ADT sought to vet the feasibility of AI algorithms capable of piloting an F-16 in... more
One of the most interesting topics for research and also for making a profit is stock trading. Artificial intelligence has had a great impact on this path. A lot of research has been done to investigate the application of machine... more
Researchers and practitioners in the field of reinforcement learning (RL) frequently leverage parallel computation, which has led to a plethora of new algorithms and systems in the last few years. In this paper, we re-examine the... more
by Jamie Pote and 1 more
We investigate and analyze the principles of typical motion planning algorithms. These include traditional planning algorithms, supervised learning, optimal value reinforcement learning, and policy gradient reinforcement learning. Traditional... more
This work considers the problem of learning cooperative policies in complex, partially observable domains without explicit communication. We extend three classes of single-agent deep reinforcement learning algorithms based on policy... more
In this paper, a deep reinforcement learning (DRL) method is proposed to address the problem of UAV navigation in an unknown environment. However, DRL algorithms are limited by the data efficiency problem as they typically require a huge... more
In this paper we present a Bayesian reinforcement learning framework that allows robotic manipulators to adaptively recover from random mechanical failures autonomously, hence being survivable. To this end, we formulate the framework of... more
Serverless Function-as-a-Service (FaaS) is an emerging cloud computing paradigm that frees application developers from infrastructure management tasks such as resource provisioning and scaling. To reduce the tail latency of functions and... more
The performance of off-policy learning, including deep Q-learning and deep deterministic policy gradient (DDPG), critically depends on the choice of the exploration policy. Existing exploration methods are mostly based on adding noise to... more
Unmanned aerial vehicles (UAVs) have been extensively used in civil and industrial applications due to the rapid development of the guidance, navigation and control (GNC) technologies. Especially, using deep reinforcement learning methods... more
We present an efficient deep reinforcement learning (DRL) approach to automatically construct time-dependent optimal control fields that enable desired transitions in reduced-dimensional chemical systems. Our DRL approach gives impressive... more
We have developed an autonomous virtual character guided by emotions. The agent is a virtual character who lives in a three-dimensional maze world. We found that emotion drivers can induce the behavior of a trained agent. Our approach is... more
Value-based deep Reinforcement Learning (RL) algorithms suffer from the estimation bias primarily caused by function approximation and temporal difference (TD) learning. This problem induces faulty state-action value estimates and... more
In this article, we sketch an algorithm that extends the Q-learning algorithms to the continuous action space domain. Our method is based on the discretization of the action space. Despite the commonly used discretization methods, our... more
This paper presents a benchmarking study of some of the state-of-the-art reinforcement learning algorithms used for solving two simulated vision-based robotics problems. The algorithms considered in this study include soft actor-critic... more
Unmanned Aerial Vehicles (UAVs) are increasingly being used in many challenging and diversified applications in both the civilian and military fields. To name a few: infrastructure inspection, traffic patrolling,... more
Recent advances in Reinforcement Learning (RL) have surpassed human-level performance in many simulated environments. However, existing reinforcement learning techniques are incapable of explicitly incorporating already known... more
Hindsight Experience Replay (HER) is one of the efficient algorithms for solving Reinforcement Learning tasks in sparse-reward environments. But due to its reduced sample efficiency and slower convergence, HER fails to perform... more
Trust region methods and maximum entropy methods are two state-of-the-art branches used in reinforcement learning (RL) for the benefits of stability and exploration in continuous environments, respectively. This paper proposes to... more
Smart systems are often battery-constrained and compete for resources from remote clouds, which results in high delay. Collaboratively sharing resources among neighbors in proximity is promising to control such delay for time-sensitive... more
Since the establishment of robotics in industrial applications, industrial robot programming involves the repetitive and time-consuming process of manually specifying a fixed trajectory, which results in machine idle time in terms of... more
Deep reinforcement learning (RL) has made it possible to solve complex robotics problems using neural networks as function approximators. However, the policies trained on stationary environments suffer in terms of generalization when... more
Reinforcement learning (RL) is attracting increasing interest in autonomous driving due to its potential to solve complex classification and control problems. However, existing RL algorithms are rarely applied to real vehicles for two... more
Utilizing the collected experience tuples in the replay buffer (RB) is the primary way of exploiting the experiences in the off-policy reinforcement learning (RL) algorithms, and, therefore, the sampling scheme for the experience tuples... more
User identity linkage is a task of recognizing the identities of the same user across different social networks (SN). Previous works tackle this problem via estimating the pairwise similarity between identities from different SN,... more
This paper addresses the question of how a previously available control policy πs can be used as a supervisor to more quickly and safely train a new learned control policy πL for a robot. A weighted average of the supervisor and learned... more
Utilizing the idea of long-term cumulative return, reinforcement learning (RL) has shown remarkable performance in various fields. We propose a formulation of the landmark localization in 3D medical images as a reinforcement learning... more
We introduce a new count-based optimistic exploration algorithm for Reinforcement Learning (RL) that is feasible in environments with high-dimensional state-action spaces. The success of RL algorithms in these domains depends crucially on... more
Being able to navigate to a target with minimal supervision and prior knowledge is critical to creating human-like assistive agents. Prior work on map-based and map-less approaches has limited generalizability. In this paper, we present... more
Discrete-action algorithms have been central to numerous recent successes of deep reinforcement learning. However, applying these algorithms to high-dimensional action tasks requires tackling the combinatorial increase of the number of... more