Abstract
Object detection (OD) is a fundamental task in computer vision. However, progress is often hindered by limitations in existing datasets, including human annotation errors, reliance on manual annotation, missing annotations due to occlusion, and domain specificity. To address these challenges, this work proposes an automatically generated synthetic single-view dataset for OD. The dataset was generated in Unity by constructing a 3D virtual city with a single-camera surveillance system, providing diverse perspectives and calibrated viewpoints. Object metadata, including position and dimensions, was automatically extracted and projected into the 2D image plane to generate accurate bounding boxes. Annotations were normalized into YOLO format, with invalid boxes removed, resulting in a single-view dataset that is consistent, precise, and free from manual labeling errors, while still reflecting real-world challenges such as occlusion and object variation. Two versions of the dataset, original and refined, were created to evaluate the effect of bounding box quality on detection performance. An experimental evaluation using the YOLOv11 model demonstrated that the proposed dataset substantially improved detection performance, yielding notable gains in precision, recall, and mean average precision (mAP). These results underscore the importance of accurate dataset curation and highlight the potential of synthetic datasets to advance single-view OD in applications such as surveillance, autonomous systems, and robotics.
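The abstract's annotation pipeline, projecting 3D object metadata into the 2D image plane, clamping the result to the frame, discarding invalid boxes, and normalizing to YOLO format, can be sketched as follows. This is a minimal illustration assuming a simple pinhole camera model with known intrinsics (`fx`, `fy`, `cx`, `cy`); the function name and exact parameters are hypothetical, not taken from the paper's implementation.

```python
def project_box_to_yolo(corners, fx, fy, cx, cy, img_w, img_h):
    """Project 3D corner points (in camera coordinates) onto the image
    plane and return a normalized YOLO box (cx, cy, w, h), or None if
    the box is invalid (behind the camera or fully outside the frame)."""
    us, vs = [], []
    for X, Y, Z in corners:
        if Z <= 0:  # corner behind the camera plane: skip it
            continue
        us.append(fx * X / Z + cx)  # pinhole projection, x-axis
        vs.append(fy * Y / Z + cy)  # pinhole projection, y-axis
    if not us:
        return None
    # Axis-aligned 2D box around the projected corners, clamped to the image
    u0, u1 = max(0.0, min(us)), min(float(img_w), max(us))
    v0, v1 = max(0.0, min(vs)), min(float(img_h), max(vs))
    w, h = u1 - u0, v1 - v0
    if w <= 0 or h <= 0:  # degenerate after clamping: drop the annotation
        return None
    # YOLO format: center and size, each normalized to [0, 1]
    return ((u0 + u1) / 2 / img_w, (v0 + v1) / 2 / img_h,
            w / img_w, h / img_h)
```

For example, the eight corners of a box centered on the optical axis project to a box centered at (0.5, 0.5) in normalized coordinates; an object entirely behind the camera yields `None` and is removed, mirroring the invalid-box filtering described above.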
Table of Contents
I. INTRODUCTION
II. LITERATURE
III. METHODOLOGY
A. Dataset Construction
B. Preprocessing
C. Object Detection Module
IV. EXPERIMENTAL RESULTS
A. Experimental Setup
B. Performance Analysis
V. CONCLUSION & FUTURE WORK
ACKNOWLEDGMENT
REFERENCES
