Article Information
Abstract (English)
There has been active research on predicting future events from moving (first-person view) cameras using deep-learning approaches. In many of those approaches, the camera's motion was treated as an important cue for future object localization and for anticipating traffic accidents. This paper argues that the moving camera's motion (i.e., ego-motion) is not necessary for these tasks, because the relative motion of the surrounding objects with respect to the moving camera is sufficient. We present empirical evidence from recently published papers with publicly available code and datasets. Comparison results with and without camera motion show that the performance differences are rather minor. This suggests omitting camera motion to reduce memory usage as well as inference time on edge devices, since the number of parameters in the architecture can be significantly reduced.
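To illustrate the abstract's claim, the sketch below is a minimal, hypothetical constant-velocity baseline (not any of the cited architectures): it predicts future bounding boxes purely from the object's past image-plane trajectory, in which the relative motion with respect to the camera is already encoded, so no ego-motion (odometry/IMU) input is needed.

```python
import numpy as np

def predict_future_boxes(past_boxes, n_future):
    """Extrapolate future bounding boxes from past observations alone.

    past_boxes: (T, 4) array of [cx, cy, w, h] in image coordinates.
    The object's image-plane trajectory already reflects its motion
    relative to the moving camera, so no explicit ego-motion is used.
    """
    past_boxes = np.asarray(past_boxes, dtype=float)
    # Mean frame-to-frame displacement = estimate of relative velocity.
    velocity = np.diff(past_boxes, axis=0).mean(axis=0)
    steps = np.arange(1, n_future + 1).reshape(-1, 1)
    # Constant-velocity rollout from the last observed box.
    return past_boxes[-1] + steps * velocity
```

For example, a box drifting right by 10 px/frame (`[[100,50,20,40],[110,50,20,40],[120,50,20,40]]`) is extrapolated to `[130,50,20,40]` and `[140,50,20,40]` for the next two frames. The learned models in the cited papers replace this linear rollout with recurrent or attention-based predictors, but the input signal is the same relative trajectory.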
Table of Contents
I. INTRODUCTION
II. COMPARISON WITH AND WITHOUT EGO-MOTION
A. FOL tasks
B. Object-level anticipation task
III. RESULTS AND DISCUSSION
A. FOL task for Unsupervised traffic accident detection in first-person videos [9]
B. FOL task for Future person localization in first-person videos [3]
C. Object-level anticipation task for Attention-guided Multistream Feature Fusion Network for early Localization of Risky Traffic Agent in Driving Videos [8]
D. Discussion on the Role of Ego-Motion
IV. CONCLUSION AND FUTURE WORKS
ACKNOWLEDGMENT
REFERENCES