Abstract
Human action recognition has been a widely studied topic in the field of computer vision. However, challenging problems remain for both local and global methods of classifying human actions: local methods usually ignore the structural information among local descriptors, while global methods generally have difficulty with occlusion and background clutter. To address these problems, a novel combined representation of a global Gist feature and local patch coding is proposed. First, the Gist feature captures the spectral information of an action from a global view, preserving the spatial relationships among body parts. Second, the Gist features located in the different grid cells of the action-centric region are divided into four patches according to the frequency of action variation. Then, on the basis of the traditional bag-of-words (BoW) model, a novel form of local patch coding is adopted: each patch is encoded independently, and all the visual words are finally concatenated to represent the high variability of human actions. By incorporating local patch coding, the proposed method not only addresses the problem that global descriptors cannot reliably identify actions in complex backgrounds, but also reduces redundant features in a video. Experimental results on the KTH and UCF Sports datasets demonstrate that the proposed representation is effective for human action recognition.
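To make the patch-wise coding step more concrete, the following is a minimal sketch, not the authors' implementation. It assumes one Gist descriptor per grid cell, a 4x4 grid over the action-centric region split into four quadrant patches, and a separate k-means codebook (here via scikit-learn) of 50 visual words per patch; the grid size, patch layout, codebook size, and all function names are illustrative assumptions.

```python
# Sketch of patch-wise bag-of-words coding over per-cell Gist descriptors.
# Assumptions (not from the paper): 4x4 grid, four quadrant patches,
# 50-word k-means codebook per patch, L1-normalised histograms.
import numpy as np
from sklearn.cluster import KMeans

GRID = 4      # assumed grid size over the action-centric region
N_WORDS = 50  # assumed codebook size per patch

def split_into_patches(gist_grid):
    """Split a (GRID, GRID, dim) array of per-cell Gist descriptors
    into four quadrant patches, each flattened to (cells, dim)."""
    h = GRID // 2
    return [gist_grid[r:r + h, c:c + h].reshape(-1, gist_grid.shape[-1])
            for r in (0, h) for c in (0, h)]

def learn_codebooks(train_grids):
    """Learn one k-means codebook per patch from training Gist grids."""
    per_patch = [[] for _ in range(4)]
    for grid in train_grids:
        for i, patch in enumerate(split_into_patches(grid)):
            per_patch[i].append(patch)
    return [KMeans(n_clusters=N_WORDS, n_init=10).fit(np.vstack(p))
            for p in per_patch]

def encode(gist_grid, codebooks):
    """Encode each patch independently as a visual-word histogram,
    then concatenate the histograms into one video-level descriptor."""
    hists = []
    for patch, cb in zip(split_into_patches(gist_grid), codebooks):
        words = cb.predict(patch)
        hist = np.bincount(words, minlength=N_WORDS).astype(float)
        hists.append(hist / max(hist.sum(), 1.0))  # L1-normalise each patch
    return np.concatenate(hists)                   # 4 * N_WORDS dimensions
```

Keeping a separate codebook per patch, rather than one global vocabulary, lets regions with different frequencies of action variation contribute independently to the concatenated descriptor, which is the intuition behind encoding each patch on its own before concatenation.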
Table of Contents
1. Introduction
2. The Framework of Proposed Method
3. Action Recognition with Global Gist Feature and Local Patch Coding
3.1. Action-centric Region Extraction and Normalization
3.2. Gist Feature Computation
3.3. Local Patch Coding
3.4. Action Recognition
4. Experiments and Results
4.1. Experimental Settings
4.2. Results and Analysis on UCF Sports Dataset
4.3. Results and Analysis on KTH Dataset
5. Conclusion
Acknowledgements
References
