Evidential event inference in transport video surveillance
2016, Computer Vision and Image Understanding
https://doi.org/10.1016/J.CVIU.2015.10.017
Abstract
This paper presents a new framework for multi-subject event inference in surveillance video, where the measurements produced by low-level vision analytics are usually noisy, incomplete or incorrect. Our goal is to infer the composite events undertaken by each subject from noisy observations. To achieve this, we consider the temporal characteristics of event relations and propose a method to correctly associate the detected events with individual subjects. The Dempster-Shafer (DS) theory of belief functions is used to infer events of interest from the results of our vision analytics and to measure conflicts occurring during event association. Our system is evaluated on a number of videos, at different levels of complexity, that show passenger behaviours on a public transport platform, namely buses. The experimental results demonstrate that, by reasoning with spatio-temporal correlations, the proposed method achieves satisfactory performance when associating atomic events and recognising composite events involving multiple subjects in dynamic environments.
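For readers unfamiliar with DS theory, the following is a minimal sketch of Dempster's rule of combination, which fuses two mass functions and reports the conflict between them. It is not the implementation used in this paper: the function name `combine`, the two-hypothesis frame ("boarding", "alighting"), the detector names and all mass values are assumptions chosen purely for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1].
    Returns the normalised combined mass function and the conflict K,
    i.e. the total mass assigned to pairs of disjoint focal sets.
    """
    joint = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            joint[inter] = joint.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule undefined")
    norm = 1.0 - conflict
    return {s: w / norm for s, w in joint.items()}, conflict


# Hypothetical frame of discernment with two atomic events.
BOARD = frozenset({"boarding"})
ALIGHT = frozenset({"alighting"})
EITHER = BOARD | ALIGHT  # mass here expresses ignorance

# Hypothetical outputs of two noisy detectors (illustrative values only).
detector_a = {BOARD: 0.7, EITHER: 0.3}
detector_b = {ALIGHT: 0.4, EITHER: 0.6}

fused, k = combine(detector_a, detector_b)
for focal_set, mass in fused.items():
    print(sorted(focal_set), round(mass, 3))
print("conflict K =", round(k, 3))
```

A high value of K indicates that the two sources largely disagree, which is the kind of signal the abstract refers to when it mentions measuring conflicts during event association; how the paper actually uses that measure is not specified here.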