Abstract
Synonyms
- Cross-view action recognition
- View-invariant action classification
- View-invariant activity recognition

Related Concepts
- View-invariance
- Action recognition
- Activity classification

Definition
Recognizing human actions from previously seen viewpoints is relatively easy compared with recognizing them from unseen viewpoints. View-invariant action recognition aims to recognize human actions from unseen viewpoints.

Background
Human action recognition is an important problem in computer vision. It has a wide range of applications, including surveillance, human-computer interaction, augmented reality, and video indexing and retrieval. The varying pattern of spatiotemporal appearance generated by a human action is key to identifying the performed action. A large body of research has explored these spatiotemporal dynamics to learn visual representations of human actions. However, most research in action recognition focuses on a few common viewpoints [1], and these approaches do not perform well when the viewpoint changes.

Human actions are performed in a 3-dimensional environment and are projected to a 2-dimensional space when captured as a video from a given viewpoint. Therefore, the same action has a different spatiotemporal appearance from different viewpoints. As shown in Figure 1, observation o1 is different from observation o2, and so on. Research in view-invariant action recognition addresses this problem and focuses on recognizing human actions from unseen viewpoints.

Several data modalities can be used to learn view-invariant representations and perform action recognition. These include RGB videos,
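The Background's point that a single 3D action yields different 2D observations from different viewpoints can be illustrated with a small numerical sketch. Everything here is an assumption made for illustration, not part of the entry: the toy "wave" trajectory, the pinhole camera with its `focal` and `depth` parameters, and the helper names `rotate_y`, `project`, and `observe`. Only the observation names o1 and o2 follow the text.

```python
import math

def rotate_y(point, angle_deg):
    """Rotate a 3D point about the vertical (y) axis.

    Rotating the scene is used here as a stand-in for moving the
    camera around the actor (an illustrative simplification).
    """
    x, y, z = point
    a = math.radians(angle_deg)
    return (x * math.cos(a) + z * math.sin(a),
            y,
            -x * math.sin(a) + z * math.cos(a))

def project(point, focal=1.0, depth=5.0):
    """Pinhole projection onto the image plane of a camera `depth` units away."""
    x, y, z = point
    return (focal * x / (z + depth), focal * y / (z + depth))

def observe(trajectory, view_deg):
    """2D observation of a 3D trajectory from a viewpoint at `view_deg`."""
    return [project(rotate_y(p, view_deg)) for p in trajectory]

# Hypothetical "wave" action: a hand moving along an arc in the x-y plane.
wave = [(math.cos(t), math.sin(t), 0.0) for t in (0.0, 0.5, 1.0, 1.5)]

o1 = observe(wave, 0)    # frontal view
o2 = observe(wave, 90)   # side view

def horizontal_spread(obs):
    """Width of the observed trajectory along the image x axis."""
    xs = [p[0] for p in obs]
    return max(xs) - min(xs)

print(horizontal_spread(o1), horizontal_spread(o2))
```

In this toy setup the arc is clearly visible in the frontal view, while its horizontal extent collapses almost to zero in the side view. A model that relies on appearance learned from one viewpoint would therefore see a very different pattern from the other.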
References (20)
- Kevin Duarte, Yogesh Rawat, and Mubarak Shah. VideoCapsuleNet: A simplified network for action detection. In Advances in Neural Information Processing Systems, pages 7610-7619, 2018.
- Alexei Gritai, Yaser Sheikh, and Mubarak Shah. On the use of anthropometry in the invariant analysis of human actions. In Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004), volume 2, pages 923-926. IEEE, 2004.
- Junnan Li, Yongkang Wong, Qi Zhao, and Mohan Kankanhalli. Unsupervised learning of view-invariant action representations. In Advances in Neural Information Processing Systems, pages 1254-1264, 2018.
- Jingen Liu, M Shah, B Kuipers, and S Savarese. Cross-view action recognition via view knowledge transfer. In Proceedings of the 2011 IEEE Conference on Computer Vision and Pattern Recognition, pages 3209-3216. IEEE Computer Society, 2011.
- Mengyuan Liu, Hong Liu, and Chen Chen. Enhanced skeleton visualization for view invariant human action recognition. Pattern Recognition, 68:346-362, 2017.
- Vasu Parameswaran and Rama Chellappa. View invariance for human action recognition. International Journal of Computer Vision, 66(1):83-101, 2006.
- Hossein Rahmani, Arif Mahmood, Du Huynh, and Ajmal Mian. Histogram of oriented principal components for cross-view action recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38(12):2430-2443, 2016.
- Hossein Rahmani, Ajmal Mian, and Mubarak Shah. Learning a deep model for human action recognition from novel viewpoints. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(3):667-681, 2018.
- C Rao, A Gritai, M Shah, and T Syeda-Mahmood. View-invariant alignment and matching of video sequences. In Proceedings Ninth IEEE International Conference on Computer Vision, 2003.
- Cen Rao, Mubarak Shah, and Tanveer Syeda-Mahmood. Invariance in motion analysis of videos. In Proceedings of the Eleventh ACM International Conference on Multimedia, pages 518-527. ACM, 2003.
- Cen Rao, Alper Yilmaz, and Mubarak Shah. View-invariant representation and recognition of actions. International Journal of Computer Vision, 50(2):203-226, 2002.
- Amir Shahroudy, Jun Liu, Tian-Tsong Ng, and Gang Wang. NTU RGB+D: A large scale dataset for 3D human activity analysis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1010-1019, 2016.
- Amir Shahroudy, Tian-Tsong Ng, Yihong Gong, and Gang Wang. Deep multimodal feature analysis for action recognition in RGB+D videos. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(5):1045-1058, 2018.
- Yaser Sheikh, Mumtaz Sheikh, and Mubarak Shah. Exploring the space of a human action. In Tenth IEEE International Conference on Computer Vision (ICCV'05), volume 1, pages 144-149. IEEE, 2005.
- Shruti Vyas, Yogesh S Rawat, and Mubarak Shah. Time-aware and view-aware video rendering for unsupervised representation learning. arXiv preprint arXiv:1811.10699, 2018.
- Jiang Wang, Xiaohan Nie, Yin Xia, Ying Wu, and Song-Chun Zhu. Cross-view action modeling, learning and recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2649-2656, 2014.
- Daniel Weinland, Remi Ronfard, and Edmond Boyer. Free viewpoint action recognition using motion history volumes. Computer Vision and Image Understanding, 104(2-3):249-257, 2006.
- Pingkun Yan, Saad M Khan, and Mubarak Shah. Learning 4D action feature models for arbitrary view action recognition. In 2008 IEEE Conference on Computer Vision and Pattern Recognition, pages 1-7. IEEE, 2008.
- Alper Yilmaz and Mubarak Shah. Actions sketch: A novel action representation. In Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05), volume 1, pages 984-989. IEEE Computer Society, 2005.
- Jingjing Zheng, Zhuolin Jiang, P Jonathon Phillips, and Rama Chellappa. Cross-view action recognition via a transferable dictionary pair. In BMVC, volume 1, page 7, 2012.