Autonomous Control of a Bicycle Using the Deep Deterministic Policy Gradient Algorithm

Choi Seung Yoon; Le Pham Tuyen; Chung Tae Choong

Article information

Autonomous control of bicycle using Deep Deterministic Policy Gradient Algorithm

Korea Convergence Security Association (한국융합보안학회), Convergence Security Journal (융합보안논문지), Vol. 18, No. 3, September 2018, pp. 3-9. KCI-indexed.

Abstract

English

The Deep Deterministic Policy Gradient (DDPG) algorithm learns by combining artificial neural networks with reinforcement learning. Among the reinforcement learning methods studied in recent years, DDPG has the advantage that, because it learns off-policy, it prevents wrong actions from accumulating and distorting learning. In this study, we applied the DDPG algorithm to control a bicycle autonomously. We ran simulations in a variety of environments and showed that the method used in the experiments works stably in simulation.

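Korean (translated)

The DDPG (Deep Deterministic Policy Gradient) algorithm learns using artificial neural networks and reinforcement learning. Within reinforcement learning, an area of much recent research, the DDPG algorithm has the advantage that, because it learns off-policy, it prevents wrong actions from accumulating and affecting learning. In this study, we applied the DDPG algorithm in experiments on controlling a bicycle so that it drives autonomously. We ran simulations in a variety of environments, and the experiments showed that the method operates stably in simulation.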

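Abstract (Korean)
ABSTRACT
1. Introduction
2. Related Work
2.1 Reinforcement Learning
2.2 MDP (Markov Decision Problem)
2.3 Actor-Critic
3. Autonomous Control of a Bicycle Using DDPG
3.1 DDPG (Deep Deterministic Policy Gradient)
4. Experiments and Results
4.1 Results with a Fixed Goal Position
4.2 Results with Randomly Set Speed and Goal Position
5. Conclusion
References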

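Author information

Choi Seung Yoon (최승윤), Department of Computer Engineering, Kyung Hee University
Le Pham Tuyen, Department of Computer Engineering, Kyung Hee University
Chung Tae Choong (정태충), Department of Computer Engineering, Kyung Hee University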
