
Convergence of Internet, Broadcasting and Communication

Multi-Agent Deep Reinforcement Learning for Fighting Game: A Comparative Study of PPO and A2C

Abstract

This paper investigates the application of multi-agent deep reinforcement learning to the fighting game Samurai Shodown using the Proximal Policy Optimization (PPO) and Advantage Actor-Critic (A2C) algorithms. Initially, agents are trained separately for 200,000 timesteps using Convolutional Neural Network (CNN) and Multi-Layer Perceptron (MLP) with LSTM networks. PPO demonstrates superior performance early on with stable policy updates, while A2C shows better adaptation and higher rewards over extended training, with A2C ultimately outperforming PPO after 1,000,000 timesteps. These findings highlight PPO's effectiveness for short-term training and A2C's advantages in long-term learning scenarios, underscoring the importance of selecting an algorithm based on training duration and task complexity. The code is available at https://github.com/Lexer04/Samurai-Shodown-with-Reinforcement-Learning-PPO.
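The contrast the abstract draws (PPO's stable early updates vs. A2C's larger, unclipped updates) comes from the two algorithms' policy objectives. The sketch below is a minimal NumPy illustration of those standard objectives, not the paper's implementation; the function names and example numbers are ours:

```python
import numpy as np

def ppo_clipped_objective(ratio, advantage, eps=0.2):
    """PPO's clipped surrogate objective.

    ratio = pi_new(a|s) / pi_old(a|s). Clipping the ratio to
    [1 - eps, 1 + eps] bounds how far a single update can move the
    policy, which is why PPO tends to be stable early in training.
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Taking the minimum gives a pessimistic (lower-bound) estimate.
    return np.minimum(unclipped, clipped)

def a2c_objective(log_prob, advantage):
    """A2C's objective: advantage-weighted log-probability.

    No clipping, so gradient steps can be larger -- riskier early on,
    but able to keep adapting over long training runs.
    """
    return log_prob * advantage

# Example: the new policy made a positive-advantage action 50% more
# likely (ratio = 1.5). PPO caps the credited improvement at 1 + eps.
print(ppo_clipped_objective(1.5, 2.0))   # capped at (1 + 0.2) * 2.0 = 2.4
print(a2c_objective(np.log(0.6), 2.0))   # uncapped, scales with log-prob
```

With a negative advantage the minimum instead keeps the more pessimistic (more negative) term, so PPO never over-credits a policy change in either direction.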

Table of Contents

Abstract
1. Introduction
2. Related Work
2.1 Multi-Agent Reinforcement Learning
2.2 Proximal Policy Optimization
2.3 Advantage Actor-Critic
3. Experiment Setup and Methodology
3.1 Setup Environment
3.2 Independent Learning
4. Result and Discussion
5. Conclusion
Acknowledgement
References

Authors

  • Yoshua Kaleb Purwanto, Master Student, Department of Computer Engineering, Dongseo University, Busan, Korea
  • Dae-Ki Kang, Professor, Department of Computer Engineering, Dongseo University, Busan, Korea
