
Reinforcement Learning: Navigating mazes using SARSA

Abstract

This paper explores the application of the SARSA reinforcement learning algorithm to navigate mazes and identify the shortest path. The agent learns through a reward system that encourages exploration while penalizing sub-optimal actions. Experiments demonstrate that after sufficient episodes, the agent successfully reduces both the number of steps taken and the error rate in reaching the destination, highlighting the efficiency of the SARSA algorithm combined with an epsilon-greedy strategy.
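The approach summarized above can be sketched in a few dozen lines. The maze layout, reward values, and hyperparameters below are illustrative assumptions, not the paper's exact settings:

```python
import random
from collections import defaultdict

random.seed(0)

# Hypothetical 4x4 maze: 0 = free cell, 1 = wall; start top-left, goal bottom-right.
MAZE = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
]
START, GOAL = (0, 0), (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1         # assumed hyperparameters

def step(state, action):
    """Apply an action; bumping a wall or the border leaves the agent in place."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < 4 and 0 <= c < 4) or MAZE[r][c] == 1:
        return state, -1.0   # penalize sub-optimal actions
    if (r, c) == GOAL:
        return (r, c), 10.0  # reward for reaching the destination
    return (r, c), -0.1      # small step cost encourages short paths

def choose(Q, state):
    """Epsilon-greedy: explore with probability EPSILON, otherwise act greedily."""
    if random.random() < EPSILON:
        return random.randrange(len(ACTIONS))
    qs = [Q[(state, a)] for a in range(len(ACTIONS))]
    return qs.index(max(qs))

Q = defaultdict(float)
for episode in range(400):
    state = START
    action = choose(Q, state)
    while state != GOAL:
        nxt, reward = step(state, ACTIONS[action])
        nxt_action = choose(Q, nxt)
        # On-policy SARSA update: the target uses the action actually taken next.
        Q[(state, action)] += ALPHA * (
            reward + GAMMA * Q[(nxt, nxt_action)] - Q[(state, action)]
        )
        state, action = nxt, nxt_action
```

After training, following the greedy policy from the start cell traces the learned shortest path to the goal.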

FAQs


What are the key results of using SARSA in maze navigation?

The agent learned to find the shortest path, achieving a 0% error rate after approximately 80 episodes in smaller mazes.

How does the ϵ-greedy strategy enhance the SARSA algorithm?

The ϵ-greedy strategy allows the agent to explore alternative paths, preventing it from getting trapped in local minima.
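A minimal sketch of ϵ-greedy action selection (the function name and ϵ value are illustrative, not from the paper): with probability ϵ the agent picks a random action, otherwise the action with the highest Q-value.

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """Explore with probability epsilon; otherwise exploit the best-known action."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Keeping ϵ above zero means even a converged agent occasionally tries alternative paths, which is what lets it escape local minima.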

What impact do maze sizes have on the learning process?

Larger mazes, such as 10x10, required significantly more episodes (around 400) for performance to stabilize.

How are Q-values updated in the SARSA algorithm?

Q-values are updated based on the actual actions taken by the agent and the accompanying rewards received.
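That on-policy update can be sketched as a one-line rule (default α and γ here are assumptions): Q(s, a) moves toward the target r + γ·Q(s′, a′), where a′ is the action the agent actually takes next.

```python
def sarsa_update(q, reward, q_next, alpha=0.1, gamma=0.9):
    """One SARSA step: nudge Q(s, a) toward reward + gamma * Q(s', a'),
    where a' is the action actually taken in the next state (on-policy)."""
    return q + alpha * (reward + gamma * q_next - q)
```

This differs from off-policy methods like Q-learning, which would use the maximum Q-value in the next state rather than the value of the action actually taken.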

What metrics were used to measure the agent's performance in the mazes?

The study measured rewards, steps required to reach the destination, and error rates over defined episode intervals.