Data Augmentation Techniques for Improving YOLO Performance

이준기; 장민호; 황영배

Article Information

Data Augmentation for the enhancement of YOLO Performance

이준기, 장민호, 황영배

Korean Institute of Next Generation Computing, The Journal of Korean Institute of Next Generation Computing, Vol. 20, No. 3, June 2024, pp. 22-35 (KCI-listed)


Abstract

English

Computer vision has achieved excellent performance in many fields thanks to advances in models such as CNNs and Transformers. However, training these models requires large and diverse datasets, and collecting them takes considerable time and effort. This high acquisition cost often leads to data scarcity and data imbalance. Data augmentation is an effective way to address these problems. In this paper, we study data augmentation for object recognition models, focusing on the Copy-Paste augmentation technique. Previous work pastes objects obtained by instance segmentation or places them according to visual context. We find, however, that a simpler approach significantly improves model performance: pasting bounding-box crops either at the locations of existing objects, at the same size, or at random locations. Furthermore, we propose using SAM (Segment Anything Model) to extract object instances from images and paste them. In additional experiments, we apply data augmentation to the pasted objects themselves. To prevent existing objects in the image from being occluded by pasted objects, we also present a method that overlays the existing objects back onto the image after pasting. We train YOLO (You Only Look Once) v5 on the Pascal VOC12 dataset and show that the proposed data augmentation techniques yield better performance.

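Korean (translated into English)

Computer vision has achieved good results in many fields thanks to advances in models such as CNNs and Transformers. However, training these models requires large amounts of diverse data, and obtaining such training data takes a great deal of time and effort. This high cost leads to data scarcity and data imbalance. Data augmentation is a good way to address these problems. In this paper, we study Copy-Paste data augmentation for object recognition models. Previous studies paste instance-segmented objects or paste objects based on visual context. We found, however, that model performance also improves with a simpler approach that does not use instance-segmented objects: pasting the bounding box as-is, either at the location of an existing object at the same size or at a random location. We also propose extracting object instances with SAM (Segment Anything Model) and pasting them. As additional experiments, we apply data augmentation to the objects being pasted. In addition, to keep existing objects from being occluded by pasted objects, we apply a method that pastes the object and then overwrites it with the objects from the original image. We confirmed that YOLO v5 trained with the proposed data augmentation achieves higher performance than YOLO v5 trained on the Pascal VOC12 dataset alone.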
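
The random bounding-box variant of Copy-Paste described in the abstract can be illustrated roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `paste_random_box`, the `keep_existing` flag, and the `(x1, y1, x2, y2, class_id)` box layout are all hypothetical, and the way existing objects are restored after pasting is only one plausible reading of the overlay step.

```python
import numpy as np


def paste_random_box(dst_img, dst_boxes, src_img, src_box, rng, keep_existing=True):
    """Paste the src_box crop of src_img at a random location in dst_img.

    Boxes are (x1, y1, x2, y2, class_id) in integer pixel coordinates
    (a hypothetical layout chosen for this sketch).
    Returns a new image and a new box list; the inputs are not modified.
    """
    x1, y1, x2, y2, cls = src_box
    patch = src_img[y1:y2, x1:x2]
    ph, pw = patch.shape[:2]
    H, W = dst_img.shape[:2]

    # Skip degenerate crops and crops that do not fit in the target image.
    if ph == 0 or pw == 0 or ph > H or pw > W:
        return dst_img.copy(), list(dst_boxes)

    # Choose the top-left corner uniformly so the patch stays inside the image.
    nx = int(rng.integers(0, W - pw + 1))
    ny = int(rng.integers(0, H - ph + 1))

    out = dst_img.copy()
    out[ny:ny + ph, nx:nx + pw] = patch

    if keep_existing:
        # Assumption: "overlaying" means copying the original object regions
        # back on top, so pasted pixels never hide an existing object.
        for bx1, by1, bx2, by2, _ in dst_boxes:
            out[by1:by2, bx1:bx2] = dst_img[by1:by2, bx1:bx2]

    new_box = (nx, ny, nx + pw, ny + ph, cls)
    return out, list(dst_boxes) + [new_box]
```

For example, calling `paste_random_box(img, boxes, other_img, other_box, np.random.default_rng(0))` returns an augmented copy of `img` together with the original labels plus one new box for the pasted object. The new box would still need converting to the normalized center-format labels that YOLO training expects.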

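Summary
Abstract
1. Introduction
2. Related Work
3. Data Augmentation Techniques
3.1 Same-Class Object Data Augmentation
3.2 Random Object Data Augmentation
3.3 Data Augmentation Using Mobile SAM
4. Experiments
4.1 Experimental Setup
4.2 Procedure
4.3 Same-Class Object Data Augmentation
4.4 Random Object Data Augmentation
4.5 Data Augmentation Using Mobile SAM
5. Additional Experiments
5.1 Additional Experiments on Same-Class Object Data Augmentation
5.2 Additional Experiments on Random Object Data Augmentation
5.3 Additional Experiments on Data Augmentation Using Mobile SAM
6. Conclusion
Acknowledgment
References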
