The Stixel World: A medium-level representation of traffic scenes
Image and Vision Computing
https://doi.org/10.1016/J.IMAVIS.2017.01.009
Abstract
Recent progress in advanced driver assistance systems and the race towards autonomous vehicles are mainly driven by two factors: (1) increasingly sophisticated algorithms that interpret the environment around the vehicle and react accordingly, and (2) continuous improvements in sensor technology itself. For cameras, these improvements typically include higher spatial resolution, which in turn requires more data to be processed. The trend of adding multiple cameras to cover the vehicle's entire surroundings does not help in that respect. At the same time, an increasing number of special-purpose algorithms need access to the sensor input data to correctly interpret the various complex situations that can occur, particularly in urban traffic. These trends make clear that a key challenge for vision architectures in intelligent vehicles is to share computational resources. We believe this challenge should be met by introducing a representation of the sensory data that provides compressed and structured access to all relevant visual content of the scene. The Stixel World discussed in this paper is such a representation. It is a medium-level model of the environment that is specifically designed to compress information about obstacles by leveraging the typical layout of outdoor traffic scenes. It has proven useful for a multi