Article Information
Abstract
English
High-quality annotations are crucial for accurate object detection, but widely used datasets such as MS-COCO suffer from missing objects, duplicate labels, and inaccurate bounding boxes. To overcome these problems, MJ-COCO was created through model-driven refinement, increasing the number of annotations from 860,001 to 1,221,970 instances. This paper presents a comparative analysis of MS-COCO and MJ-COCO, focusing on the accuracy of bounding box annotations. We designed a human-in-the-loop evaluation framework with custom software that enables side-by-side visualization of annotations, allowing evaluators to classify each outcome as improved, worse, or ambiguous. In total, fifteen human evaluators assessed 41,754 annotations through this verification process. The results show that 25,754 annotations were improved, 2,398 were worsened, and 13,623 were ambiguous, yielding an overall quality score of 89.49%. These findings indicate that MJ-COCO considerably enhances annotation quality and precision over MS-COCO, making it a more consistent and accurate benchmark for advancing object detection research. The dataset and software code are publicly available on Kaggle: https://www.kaggle.com/datasets/mjcoco2025/mj-coco-2025.
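The authors' custom review software is released on Kaggle and is not reproduced here. As a minimal Python sketch of the side-by-side review idea described in the abstract, assuming standard COCO-format JSON annotation files and hypothetical file names (load_boxes, review, and all paths below are illustrative, not the paper's actual code), the core loop could look like this:

```python
import json
import matplotlib.pyplot as plt
import matplotlib.patches as patches

def load_boxes(annotation_file):
    """Load a COCO-format JSON file and group bounding boxes by image id."""
    with open(annotation_file) as f:
        coco = json.load(f)
    boxes = {}
    for ann in coco["annotations"]:
        # COCO bbox format is [x, y, width, height]
        boxes.setdefault(ann["image_id"], []).append(ann["bbox"])
    return boxes

def draw(ax, image, bboxes, title, color):
    """Render one annotation set (one panel of the side-by-side view)."""
    ax.imshow(image)
    ax.set_title(title)
    ax.axis("off")
    for x, y, w, h in bboxes:
        ax.add_patch(patches.Rectangle((x, y), w, h,
                                       linewidth=2, edgecolor=color, fill=False))

def review(image, ms_boxes, mj_boxes):
    """Show both annotation sets side by side and record the evaluator's verdict."""
    fig, (ax_ms, ax_mj) = plt.subplots(1, 2, figsize=(12, 6))
    draw(ax_ms, image, ms_boxes, "MS-COCO", "red")
    draw(ax_mj, image, mj_boxes, "MJ-COCO", "lime")
    plt.show()
    verdict = input("Verdict [i]mproved / [w]orse / [a]mbiguous: ").strip().lower()
    return {"i": "improved", "w": "worse", "a": "ambiguous"}.get(verdict, "ambiguous")

if __name__ == "__main__":
    # Hypothetical file names; substitute real MS-COCO / MJ-COCO annotation files.
    ms = load_boxes("ms_coco_annotations.json")
    mj = load_boxes("mj_coco_annotations.json")
    image_id = 139  # hypothetical image id and path
    image = plt.imread("000000000139.jpg")
    print(review(image, ms.get(image_id, []), mj.get(image_id, [])))
```

Tallying the three verdict categories over all reviewed annotations would then give the improved/worsened/ambiguous counts reported in the abstract; the exact formula behind the 89.49% quality score is defined in the paper itself.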
Table of Contents
I. INTRODUCTION
II. DATASET REFINEMENT AND EVALUATION METHOD
A. Data Preparation
B. Annotation Quality Assessment
III. DISCUSSION AND EVALUATION
A. Implementation Details
B. Datasets
C. Discussion
D. Evaluation Criteria
IV. CONCLUSION
ACKNOWLEDGMENT
REFERENCES
