Goal-Directed Reinforcement Learning System

Chang-Hoon Lee

Source information

Goal-Directed Reinforcement Learning System

Chang-Hoon Lee

Publisher: 한국인터넷방송통신학회. Journal: 한국인터넷방송통신학회 논문지, Vol. 10, No. 5, October 2010, pp. 265-270. KCI-indexed.

Abstract

Reinforcement learning learns through trial-and-error interaction with a dynamic environment. In dynamic environments, reinforcement learning methods such as -learning and -learning can therefore learn faster than conventional stochastic learning methods. However, most proposed reinforcement learning algorithms give the learning agent a reinforcement value only when it reaches the goal state, so they converge to the optimal solution very slowly. This paper presents GDRLS (Goal-Directed Reinforcement Learning System), a reinforcement learning method for finding the shortest path quickly in a maze environment. GDRLS first selects the candidate states that can lie on the shortest path in the maze, and then learns only those candidate states to search for the shortest path. Experiments show that GDRLS finds the shortest path in a maze environment faster than -learning and -learning.
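The two ideas in the abstract, a reward given only at the goal state and learning restricted to candidate states, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the 4x4 maze, the hyperparameters, and the function names `q_learning` and `greedy_path` are hypothetical, tabular Q-learning stands in as a generic baseline, and the `allowed` set is simply supplied by the caller, because the abstract does not say how GDRLS selects its candidate states.

```python
import random

# Hypothetical 4x4 maze: 'S' start, 'G' goal, '#' wall, '.' free cell.
MAZE = [
    "S..#",
    ".#..",
    "...#",
    "#..G",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def find(ch):
    """Return the (row, col) of the first cell containing ch."""
    for r, row in enumerate(MAZE):
        c = row.find(ch)
        if c != -1:
            return (r, c)
    raise ValueError(ch)


def step(state, action, allowed=None):
    """Move one cell; stay put on borders, walls, or cells outside `allowed`."""
    r, c = state[0] + action[0], state[1] + action[1]
    if not (0 <= r < len(MAZE) and 0 <= c < len(MAZE[0])) or MAZE[r][c] == "#":
        return state
    if allowed is not None and (r, c) not in allowed:
        return state
    return (r, c)


def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2,
               allowed=None, seed=0):
    """Tabular Q-learning with a reward of 1 only on reaching the goal."""
    rng = random.Random(seed)
    start, goal = find("S"), find("G")
    n = len(ACTIONS)
    q = {}  # (state, action_index) -> value; missing entries count as 0.0
    for _ in range(episodes):
        state = start
        for _ in range(200):  # step cap per episode
            if rng.random() < epsilon:
                a = rng.randrange(n)  # explore
            else:
                values = [q.get((state, i), 0.0) for i in range(n)]
                best = max(values)
                a = rng.choice([i for i, v in enumerate(values) if v == best])
            nxt = step(state, ACTIONS[a], allowed)
            reward = 1.0 if nxt == goal else 0.0  # sparse, goal-only reward
            future = 0.0 if nxt == goal else max(
                q.get((nxt, i), 0.0) for i in range(n))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (reward + gamma * future - old)
            state = nxt
            if state == goal:
                break
    return q


def greedy_path(q, allowed=None, limit=50):
    """Follow the highest-valued action from start until the goal or limit."""
    state, goal = find("S"), find("G")
    path = [state]
    while state != goal and len(path) <= limit:
        a = max(range(len(ACTIONS)), key=lambda i: q.get((state, i), 0.0))
        state = step(state, ACTIONS[a], allowed)
        path.append(state)
    return path
```

Calling `greedy_path(q_learning())` learns over every free cell, while passing the same `allowed` set to both functions confines learning and path extraction to that set. The sketch only shows where a candidate-state restriction would plug into a goal-only-reward learner; it makes no claim about the speedups reported in the paper.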

Author information

Chang-Hoon Lee (이창훈). Regular member, Department of Computer Engineering, 한경대학교 (Hankyong National University)
