Intelligent Autonomous Systems
2015
Abstract
Pictorial structure models are the de facto standard for 2D human pose estimation. Numerous refinements and improvements have been proposed, such as discriminatively trained body part detectors, flexible body models, and local and global mixtures. While these techniques achieve state-of-the-art performance for 2D pose estimation, they have not yet been extended to enable pose estimation in 3D. This paper thus proposes a multi-view pictorial structures model that builds on recent advances in 2D pose estimation and incorporates evidence across multiple viewpoints to allow for robust 3D pose estimation. We evaluate our multi-view pictorial structures approach on the HumanEva-I and MPII Cooking datasets. In comparison to related work on 3D pose estimation, our approach achieves similar or better results while operating on single frames only and not relying on activity-specific motion models or tracking. Notably, our approach outperforms the state of the art for activities with more...