Papers by Lukas Schneider
Exotic genetic libraries in different crops are valuable genetic resources for genetic dissection of complex quantitative traits.

Proceedings of the British Machine Vision Conference 2017, 2017
In this work we present a novel compact scene representation based on Stixels that infers geometric and semantic information. Our approach overcomes the previous rather restrictive geometric assumptions for Stixels by introducing a novel depth model to account for non-flat roads and slanted objects. Both semantic and depth cues are used jointly to infer the scene representation in a sound global energy minimization formulation. Furthermore, a novel approximation scheme is introduced that uses an extremely efficient over-segmentation. In doing so, the computational complexity of the Stixel inference algorithm is reduced significantly, achieving real-time computation capabilities with only a slight drop in accuracy. We evaluate the proposed approach in terms of semantic and geometric accuracy as well as run-time on four publicly available benchmark datasets. Our approach maintains accuracy on flat road scene datasets while improving substantially on a novel non-flat road dataset. * Both authors contributed equally. † Work performed during an internship at Daimler AG.
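The global energy minimization over Stixel cuts described above can be illustrated with a minimal sketch: a 1D dynamic program over a single image column that segments a disparity profile into Stixels, where each segment is scored by how well a linear (slanted) depth model fits it plus a constant complexity prior. This is a simplified toy version, not the paper's actual formulation; the function names and the least-squares segment cost are illustrative assumptions.

```python
import numpy as np

def segment_cost(disp, lo, hi):
    """Cost of one Stixel spanning rows [lo, hi): residual of a linear
    (slanted) disparity model fitted by least squares -- a stand-in for
    the paper's non-flat-road depth model."""
    if hi - lo == 1:
        return 0.0
    y = np.arange(lo, hi, dtype=float)
    A = np.stack([y, np.ones_like(y)], axis=1)
    _, res, *_ = np.linalg.lstsq(A, disp[lo:hi], rcond=None)
    return float(res[0]) if res.size else 0.0

def stixel_dp(disp, cut_penalty=1.0):
    """O(n^2) dynamic program: optimal segmentation of one disparity
    column into Stixels, minimizing fit residuals + a cut prior."""
    n = len(disp)
    best = np.full(n + 1, np.inf)
    best[0] = 0.0
    back = np.zeros(n + 1, dtype=int)
    for hi in range(1, n + 1):
        for lo in range(hi):
            c = best[lo] + segment_cost(disp, lo, hi) + cut_penalty
            if c < best[hi]:
                best[hi], back[hi] = c, lo
    cuts, i = [], n
    while i > 0:
        cuts.append((int(back[i]), i))
        i = int(back[i])
    return cuts[::-1]

# A slanted ramp followed by a fronto-parallel object: the DP should
# recover exactly two Stixels, with the cut at the depth discontinuity.
disp = np.concatenate([np.arange(10, dtype=float), np.full(10, 50.0)])
segments = stixel_dp(disp, cut_penalty=1.0)
```

The over-segmentation idea from the abstract corresponds to restricting the inner `lo` loop to a small set of candidate cuts, which is what turns this quadratic program into a real-time one.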

2021 IEEE Intelligent Vehicles Symposium (IV), 2021
Stixels have been successfully applied to a wide range of vision tasks in autonomous driving, recently including instance segmentation. However, due to their sparse occurrence in the image, until now Stixels have seldom served as input for Deep Learning algorithms, restricting their utility for such approaches. In this work we present StixelPointNet, a novel method to perform fast instance segmentation directly on Stixels. By regarding the Stixel representation as unstructured data similar to point clouds, architectures like PointNet are able to learn features from Stixels. We use a bounding box detector to propose candidate instances, for which the relevant Stixels are extracted from the input image. On these Stixels, a PointNet model learns binary segmentations, which we then unify throughout the whole image in a final selection step. StixelPointNet achieves state-of-the-art performance on the Stixel level, is considerably faster than pixel-based segmentation methods, and shows that with our approach the Stixel domain can be introduced to many new 3D Deep Learning tasks.
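The key property that lets PointNet-style networks consume Stixels as unstructured data is permutation invariance: a shared per-element MLP followed by a symmetric pooling gives the same output regardless of the order of the input set. A minimal numpy sketch (random weights, a made-up 4-dimensional Stixel feature; not the paper's architecture):

```python
import numpy as np

def pointnet_forward(stixels, W1, W2):
    """Toy PointNet-style encoder: a shared two-layer MLP applied to
    every Stixel independently, then a symmetric max-pool over the set,
    making the global feature order-invariant."""
    h = np.maximum(stixels @ W1, 0.0)  # shared MLP layer 1 (ReLU)
    h = np.maximum(h @ W2, 0.0)        # shared MLP layer 2 (ReLU)
    return h.max(axis=0)               # symmetric aggregation

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 16))
W2 = rng.normal(size=(16, 8))
# Each Stixel encoded as (u, v_top, v_bottom, disparity) -- an
# illustrative simplification of the actual Stixel attributes.
stixels = rng.normal(size=(30, 4))
g1 = pointnet_forward(stixels, W1, W2)
g2 = pointnet_forward(stixels[::-1], W1, W2)  # permuted input set
```

Because `g1` and `g2` are identical, the sparse, unordered Stixel set can be fed to the network without any rasterization step.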

2016 IEEE Intelligent Vehicles Symposium (IV), 2016
In this paper we present Semantic Stixels, a novel vision-based scene model geared towards automated driving. Our model jointly infers the geometric and semantic layout of a scene and provides a compact yet rich abstraction of both cues using Stixels as primitive elements. Geometric information is incorporated into our model in terms of pixel-level disparity maps derived from stereo vision. For semantics, we leverage a modern deep learning-based scene labeling approach that provides an object class label for each pixel. Our experiments involve an in-depth analysis and a comprehensive assessment of the constituent parts of our approach using three public benchmark datasets. We evaluate the geometric and semantic accuracy of our model and analyze the underlying run-times and the complexity of the obtained representation. Our results indicate that the joint treatment of both cues on the Semantic Stixel level yields a highly compact environment representation while maintaining an accuracy comparable to the two individual pixel-level input data sources. Moreover, our framework compares favorably to related approaches in terms of computational costs and operates in real-time.
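The joint treatment of both cues can be sketched as a combined data term for a single Stixel hypothesis: a geometric part measuring deviation from the depth model, plus a semantic part scoring the class label against the pixel-level labeling. This is a heavily simplified illustration (constant disparity model, softmax scores, made-up weighting), not the paper's actual energy.

```python
import numpy as np

def joint_stixel_cost(disp, sem_scores, model_disp, weight=1.0):
    """Joint data term of one Stixel hypothesis.

    disp        -- measured disparities over the Stixel's rows
    sem_scores  -- per-row class probabilities, shape (rows, classes)
    model_disp  -- hypothesized constant disparity of the Stixel
    Returns the best class label and the combined geometric+semantic cost.
    """
    geo = np.abs(disp - model_disp).sum()              # geometric deviation
    sem = -np.log(sem_scores + 1e-9).sum(axis=0)       # per-class NLL
    best = int(sem.argmin())                           # jointly best label
    return best, geo + weight * sem[best]

# Five rows at disparity 12 whose pixel labels favor class 2:
disp = np.full(5, 12.0)
sem_scores = np.tile([0.1, 0.2, 0.7], (5, 1))
label, cost = joint_stixel_cost(disp, sem_scores, model_disp=12.0)
```

Minimizing such a cost over all column segmentations is what produces Stixels whose extent agrees with both the disparity map and the semantic labeling.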

Image Analysis
This paper presents a novel multi-modal CNN architecture that exploits complementary input cues in addition to sole color information. The joint model implements a mid-level fusion that allows the network to exploit cross-modal interdependencies already on a medium feature level. The benefit of the presented architecture is shown for the RGB-D image understanding task. So far, state-of-the-art RGB-D CNNs have used network weights trained on color data. In contrast, a superior initialization scheme is proposed to pre-train the depth branch of the multi-modal CNN independently. In an end-to-end training, the network parameters are optimized jointly using the challenging Cityscapes dataset. In thorough experiments, the effectiveness of the proposed model is shown. Both the RGB GoogLeNet and further RGB-D baselines are outperformed by a significant margin on two different tasks: semantic segmentation and object detection. For the latter, this paper shows how to extract object-level ground truth from the instance-level annotations in Cityscapes in order to train a powerful object detector.
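Mid-level fusion, as opposed to early (input) or late (score) fusion, concatenates the branch-specific feature maps at an intermediate depth and lets subsequent layers mix them. A minimal sketch with numpy, where the 1×1 mixing convolution is expressed as a matrix product over the channel axis (shapes and names are illustrative, not the paper's network):

```python
import numpy as np

def mid_level_fusion(rgb_feat, depth_feat, W_fuse):
    """Fuse two modality branches at a medium feature level:
    channel-wise concatenation followed by a 1x1 convolution
    (a matmul over the last axis) that models cross-modal
    interdependencies."""
    fused = np.concatenate([rgb_feat, depth_feat], axis=-1)  # H x W x (C1+C2)
    return fused @ W_fuse                                     # H x W x C_out

# Toy feature maps from the two branches at the fusion point:
rgb_feat = np.zeros((8, 8, 32))
depth_feat = np.ones((8, 8, 16))
W_fuse = np.zeros((48, 64))
out = mid_level_fusion(rgb_feat, depth_feat, W_fuse)
```

The design choice is the fusion depth: fusing too early discards modality-specific features, fusing too late prevents the network from learning joint representations; the mid-level compromise is what the abstract credits for the gains.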
Sparsity Invariant CNNs
2017 International Conference on 3D Vision (3DV)

The Stixel world – A comprehensive representation of traffic scenes for autonomous driving
at - Automatisierungstechnik
Autonomous vehicles as well as sophisticated driver assistance systems use stereo vision to perceive their environment in 3D. At least two million 3D points will be delivered by next-generation automotive stereo vision systems. In order to cope with this huge amount of data in real time, we developed a medium-level representation, named the Stixel world. This representation condenses the relevant scene information by three orders of magnitude. Since traffic scenes are dominated by planar horizontal and vertical surfaces, our representation approximates the three-dimensional scene by means of thin planar rectangles called Stixels. This survey paper summarizes the progress of the Stixel world. The evolution started with a rather simple representation based on a flat-world assumption. A major breakthrough was achieved by introducing deep learning, which allows rich semantic information to be incorporated. In its most recent form, the Stixel world encodes geometric, semantic and motion cues and is...

International Journal of Computer Vision
This work presents and evaluates a novel compact scene representation based on Stixels that infers geometric and semantic information. Our approach overcomes the previous rather restrictive geometric assumptions for Stixels by introducing a novel depth model to account for non-flat roads and slanted objects. Both semantic and depth cues are used jointly to infer the scene representation in a sound global energy minimization formulation. Furthermore, a novel approximation scheme is introduced in order to significantly reduce the computational complexity of the Stixel algorithm and thus achieve real-time computation capabilities. The idea is to first perform an over-segmentation of the image, discard the unlikely Stixel cuts, and apply the algorithm only to the remaining ones. This work presents a novel over-segmentation strategy based on a fully convolutional network, which outperforms an approach based on using local extrema of the disparity map. We evaluate the proposed ...

Image and Vision Computing
Recent progress in advanced driver assistance systems and the race towards autonomous vehicles is mainly driven by two factors: (1) increasingly sophisticated algorithms that interpret the environment around the vehicle and react accordingly, and (2) continuous improvements in sensor technology itself. In terms of cameras, these improvements typically include higher spatial resolution, which as a consequence requires more data to be processed. The trend to add multiple cameras to cover the entire surroundings of the vehicle does not help in that respect. At the same time, an increasing number of special-purpose algorithms need access to the sensor input data to correctly interpret the various complex situations that can occur, particularly in urban traffic. Observing these trends, it becomes clear that a key challenge for vision architectures in intelligent vehicles is to share computational resources. We believe this challenge should be faced by introducing a representation of the sensory data that provides compressed and structured access to all relevant visual content of the scene. The Stixel World discussed in this paper is such a representation. It is a medium-level model of the environment that is specifically designed to compress information about obstacles by leveraging the typical layout of outdoor traffic scenes. It has proven useful for a multi...

Lecture Notes in Computer Science, 2016
We present a novel method for accurate and efficient upsampling of sparse depth data, guided by high-resolution imagery. Our approach goes beyond the use of intensity cues only and additionally exploits object boundary cues through structured edge detection and semantic scene labeling for guidance. Both cues are combined within a geodesic distance measure that allows for boundary-preserving depth interpolation while utilizing local context. We model the observed scene structure by locally planar elements and formulate the upsampling task as a global energy minimization problem. Our method determines globally consistent solutions and preserves fine details and sharp depth boundaries. In our experiments on several public datasets at different levels of application, we demonstrate superior performance of our approach over the state-of-the-art, even for very sparse measurements.
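The boundary-preserving behavior of a geodesic distance can be shown in one dimension: the distance between neighboring pixels accumulates the guidance-image gradient, so sparse depth seeds propagate freely inside homogeneous regions but are blocked at intensity edges. A minimal 1D sketch (Bellman-Ford relaxation, intensity cue only; the paper additionally uses edge and semantic cues and fits locally planar elements):

```python
import numpy as np

def geodesic_upsample_1d(intensity, sparse_depth, alpha=1.0):
    """Assign each pixel the depth of its geodesically nearest seed.
    Neighbor edge weight = alpha * |intensity difference| + epsilon,
    so depth does not leak across image edges."""
    n = len(intensity)
    dist = np.full(n, np.inf)
    depth = np.full(n, np.nan)
    for i, d in sparse_depth.items():   # seeds: pixel index -> measured depth
        dist[i], depth[i] = 0.0, d
    for _ in range(n):                  # Bellman-Ford style relaxation
        for i in range(1, n):
            w = alpha * abs(intensity[i] - intensity[i - 1]) + 1e-6
            if dist[i - 1] + w < dist[i]:
                dist[i], depth[i] = dist[i - 1] + w, depth[i - 1]
            if dist[i] + w < dist[i - 1]:
                dist[i - 1], depth[i - 1] = dist[i] + w, depth[i]
    return depth

# A step edge at index 5 with one depth seed on each side: the
# interpolated depth snaps to the edge instead of blending across it.
intensity = np.array([0.0] * 5 + [1.0] * 5)
upsampled = geodesic_upsample_1d(intensity, {0: 2.0, 9: 8.0})
```

A naive linear interpolation between the two seeds would smear depth across the edge; the geodesic measure keeps the depth boundary exactly at the intensity discontinuity.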
Object-Level Priors for Stixel Generation
Lecture Notes in Computer Science, 2014