
Reinforcement Learning: Navigating mazes using SARSA

Abstract

This paper explores the application of the SARSA reinforcement learning algorithm to navigate mazes and identify the shortest path. The agent learns through a reward system that encourages exploration while penalizing sub-optimal actions. Experiments demonstrate that after sufficient episodes, the agent successfully reduces both the number of steps taken and the error rate in reaching the destination, highlighting the efficiency of the SARSA algorithm combined with an epsilon-greedy strategy.
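The approach summarized above can be sketched in a few dozen lines. The maze layout, reward values, and hyperparameters below are illustrative assumptions, not the paper's exact settings:

```python
import random
from collections import defaultdict

random.seed(0)

# Hypothetical 4x4 maze: 0 = free cell, 1 = wall; start top-left, goal bottom-right.
MAZE = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
]
START, GOAL = (0, 0), (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1         # assumed hyperparameters

def step(state, action):
    """Apply an action; bumping a wall or the border leaves the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < 4 and 0 <= c < 4) or MAZE[r][c] == 1:
        return state, -1.0   # penalize sub-optimal actions
    if (r, c) == GOAL:
        return (r, c), 10.0  # reward for reaching the destination
    return (r, c), -0.1      # small step cost encourages short paths

def choose(Q, state):
    """Epsilon-greedy: explore with probability EPSILON, otherwise act greedily."""
    if random.random() < EPSILON:
        return random.randrange(len(ACTIONS))
    qs = [Q[(state, a)] for a in range(len(ACTIONS))]
    return qs.index(max(qs))

Q = defaultdict(float)
for episode in range(400):
    state = START
    action = choose(Q, state)
    while state != GOAL:
        nxt, reward = step(state, ACTIONS[action])
        nxt_action = choose(Q, nxt)
        # On-policy SARSA update: the target uses the action actually taken next.
        Q[(state, action)] += ALPHA * (
            reward + GAMMA * Q[(nxt, nxt_action)] - Q[(state, action)]
        )
        state, action = nxt, nxt_action
```

After training, following the greedy policy from the start cell traces the learned shortest path to the goal.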

FAQs


What are the key results of using SARSA in maze navigation?

The agent learned to find the shortest path, achieving a 0% error rate after approximately 80 episodes in smaller mazes.

How does the ϵ-greedy strategy enhance the SARSA algorithm?

The ϵ-greedy strategy allows the agent to explore alternative paths, preventing it from getting trapped in local minima.
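A minimal sketch of ϵ-greedy action selection (the function name and ϵ value are illustrative, not from the paper): with probability ϵ the agent picks a random action, otherwise the action with the highest Q-value.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """Explore with probability epsilon; otherwise exploit the best-known action."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Keeping ϵ above zero means even a converged agent occasionally tries alternative paths, which is what lets it escape local minima.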

What impact do maze sizes have on the learning process?

Larger mazes, such as 10x10, required significantly more episodes (around 400) for performance to stabilize.

How are Q-values updated in the SARSA algorithm?

Q-values are updated based on the actual actions taken by the agent and the accompanying rewards received.
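That on-policy update can be sketched as a one-line rule (default α and γ here are assumptions): Q(s, a) moves toward the target r + γ·Q(s′, a′), where a′ is the action the agent actually takes next.

```python
def sarsa_update(q, reward, q_next, alpha=0.1, gamma=0.9):
    """One SARSA step: nudge Q(s, a) toward reward + gamma * Q(s', a'),
    where a' is the action actually taken in the next state (on-policy)."""
    return q + alpha * (reward + gamma * q_next - q)
```

This differs from off-policy methods like Q-learning, which would use the maximum Q-value in the next state rather than the value of the action actually taken.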

What metrics were used to measure the agent's performance in the mazes?

The study measured rewards, steps required to reach the destination, and error rates over defined episode intervals.