Two-person Activity Recognition using Skeleton Data
IET Computer Vision
https://doi.org/10.1049/IET-CVI.2017.0118

Abstract
Human activity recognition is an important and active field of research with a wide range of applications, including ambient-assisted living (AAL). Although most research focuses on a single user, the ability to recognise two-person interactions is perhaps more important because of its social implications. This study presents a two-person activity recognition system that uses skeleton data extracted from a depth camera. Human actions are encoded using a small set of basic postures obtained with an unsupervised clustering approach. Multiclass support vector machines are used to build models on the training set, whereas the X-means algorithm is employed to dynamically find the optimal number of clusters for each sample during the classification phase. The system is evaluated on the Institute of Systems and Robotics (ISR)-University of Lincoln (UoL) and Stony Brook University (SBU) datasets, reaching overall accuracies of 0.87 and 0.88, respectively. While the overall performance of the system is comparable with the state of the art, recognition improvements are obtained for the activities most relevant to health-care environments, showing promise for applications in the AAL realm.
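To make the processing pipeline summarised above concrete, the sketch below clusters the per-frame posture vectors of a sample, selects the number of clusters dynamically, and classifies the resulting centroid-based descriptor with a multiclass SVM. It is a minimal illustration under assumed conventions: the feature layout, the silhouette-based selection of the cluster count (a stand-in for the X-means criterion), and all function names are hypothetical and do not reproduce the authors' implementation.

```python
# Minimal sketch of a posture-based two-person activity pipeline.
# Assumptions: each sample is an (n_frames, n_features) array of joint
# coordinates for the two skeletons; silhouette scoring replaces X-means.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.svm import SVC

MAX_CLUSTERS = 7  # illustrative upper bound on the number of basic postures


def encode_sample(frames, k_range=range(2, MAX_CLUSTERS + 1)):
    """Cluster the posture vectors of one sample and return a fixed-length
    descriptor built from the cluster centroids, ordered by first appearance."""
    best_k, best_score, best_model = None, -np.inf, None
    for k in k_range:
        if k >= len(frames):
            break
        model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(frames)
        score = silhouette_score(frames, model.labels_)
        if score > best_score:
            best_k, best_score, best_model = k, score, model
    # Order centroids by the time at which their cluster first occurs, then
    # zero-pad so every sample yields a descriptor of the same length.
    order = np.argsort([int(np.argmax(best_model.labels_ == c)) for c in range(best_k)])
    descriptor = np.zeros((MAX_CLUSTERS, frames.shape[1]))
    descriptor[:best_k] = best_model.cluster_centers_[order]
    return descriptor.ravel()


def train_classifier(samples, labels):
    """Fit a multiclass SVM (one-vs-one internally, as in LIBSVM) on the
    encoded training samples."""
    X = np.vstack([encode_sample(s) for s in samples])
    return SVC(kernel="rbf", C=10.0, gamma="scale").fit(X, labels)
```

At test time, each new sample would be passed through encode_sample, where the dynamic selection of the cluster count plays the role the abstract attributes to X-means, before calling the classifier's predict method.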