Papers by André Stuhlsatz
Classification of speech recognition hypotheses
A Dual Formulation to the Lipschitz Classifier

Transactions of the Institute of Measurement and Control, 2011
The concentration of organic acids in anaerobic digesters is one of the most critical parameters for monitoring and advanced control of anaerobic digestion processes. Thus, a reliable online-measurement system is absolutely necessary. A novel approach to obtaining these measurements indirectly and online using UV/vis spectroscopic probes, in conjunction with powerful pattern recognition methods, is presented in this paper. A UV/vis spectroscopic probe from S::CAN is used in combination with a custom-built dilution system to monitor the absorption of fully fermented sludge over a spectrum from 200 to 750 nm. Advanced pattern recognition methods are then used to map the non-linear relationship between measured absorption spectra and laboratory measurements of organic acid concentrations. Linear discriminant analysis, generalized discriminant analysis (GerDA), support vector machines (SVM), relevance vector machines, random forest and neural networks are investigated for this purpose and...

Sixth International Conference on Machine Learning and Applications (ICMLA 2007), 2007
This paper presents an autonomous symbolic indoor tracking system for ubiquitous computing applications. The proposed approach is based upon the assumption that topologically discriminable information can be assigned explicitly to different spaces of a given indoor environment. On that assumption, continuous Time-of-Flight (ToF) measurements of echo-bursts obtained from four orthogonally and coplanarly mounted ultrasonic transducers are used to learn a stochastic room model. While the individual acoustic representation of space is captured using Gaussian mixture densities, the stochastic variabilities in the moving direction of a person are modeled by Hidden Markov Models (HMMs). Experiments within a six-room environment resulted in a room recognition rate of 92.21% and a room sequence recognition rate of 66.00% without any prefixed devices.
Organic Acid Prediction in Biogas Plants Using UV/vis Spectroscopic Online-Measurements
Communications in Computer and Information Science, 2010
The concentration of organic acids in anaerobic digesters is one of the most critical parameters for monitoring and advanced control of anaerobic digestion processes, making a reliable online-measurement system absolutely necessary. This paper introduces a novel ...

Advances in Soft Computing, 2007
In this paper, we present a new implementable learning algorithm for the general nonlinear binary classification problem. The suggested algorithm abides by the maximum margin philosophy, and learns a decision function from the set of all finite linear combinations of continuously differentiable basis functions. This enables the use of a much more flexible function class than the one usually employed by Mercer-restricted kernel machines. Experiments on 2-dimensional randomly generated data are given to compare the algorithm to a Support Vector Machine. While the performances are comparable in the case of Gaussian basis functions and static feature vectors, the algorithm opens a novel way to hitherto intractable problems. This includes especially the classification of feature vector streams, or features with dynamically varying dimensions, such as in DNA analysis, natural speech or motion image recognition.
HSVM - A SVM Toolkit for Segmented Speech Data
TUDpress, Dresden, 2007

The classification of complex patterns is one of the most impressive cognitive achievements of the human brain. Humans have the ability to recognize a complex image, for example that of a known person, and to distinguish it from other objects within half a second. While the brain can draw on massive parallelism and a vast, hierarchically organized, auto-associative memory to solve this task, common computer architectures are only capable of sequential processing of information stored in a non-auto-associative memory. Even modern parallel multi-processor systems are far from the performance of our brain. However, it is nowadays possible to solve complex and memory-intensive pattern recognition problems, like the recognition of handwritten digits or the transcription of speech, satisfactorily with a common computer by the use of modern statistical and algorithmic learning approaches. One of the most successful pattern recognition methods is the...

Proceedings of the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications, 2017
We present a novel approach to dimensionality reduction for data visualization that is a combination of two deep neural networks (DNNs) with different objectives. One is a nonlinear generalization of Fisher's linear discriminant analysis (LDA). It seeks to improve the class separability in the desired feature space, which is a natural strategy to obtain well-clustered visualizations. The other DNN is a deep autoencoder. Here, an encoding and a decoding DNN are optimized simultaneously with respect to the decodability of the features obtained by encoding the data. The idea behind the combined DNN is to use the generalized discriminant analysis as an encoding DNN and to equip it with a regularizing decoding DNN. Regarding data visualization, a well-regularized DNN guarantees to learn sufficiently similar data visualizations for different sets of samples that represent the data approximately equally well. Clearly, such robustness against fluctuations in the data is essential for real-world applications. We therefore designed two extensive experiments that involve simulated fluctuations in the data. Our results show that the combined DNN is considerably more robust than the generalized discriminant analysis alone. Moreover, we present reconstructions that reveal what the visualizable features look like back in the original data space.
Graphical Models, 2020
Fusion of Visual and Inertial Measurements for Pose Estimation
Making the Lipschitz Classifier Practical via Semi-infinite Programming
2008 Seventh International Conference on Machine Learning and Applications, 2008

2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011
Deep Neural Networks (DNNs) denote multilayer artificial neural networks with more than one hidden layer and millions of free parameters. We propose a Generalized Discriminant Analysis (GerDA) based on DNNs to learn discriminative features of low dimension optimized with respect to a fast classification from a large set of acoustic features for emotion recognition. On nine frequently used emotional speech corpora, we compare the performance of GerDA features and their subsequent linear classification with previously reported benchmarks obtained using the same set of acoustic features classified by Support Vector Machines (SVMs). Our results impressively show that low-dimensional GerDA features capture hidden information from the acoustic features leading to a significantly raised unweighted average recall and considerably raised weighted average recall.
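The two evaluation measures named in this abstract can be illustrated with a small sketch. It assumes the usual definitions: unweighted average recall (UAR) is the plain mean of the per-class recalls, and weighted average recall (WAR) weights each class recall by its number of test samples, which equals overall accuracy. The function name `average_recalls` and the toy labels are hypothetical and not taken from the paper.

```python
from collections import Counter


def average_recalls(y_true, y_pred):
    """Return (UAR, WAR) for two equally long label sequences.

    UAR: mean of per-class recalls, every class counts the same.
    WAR: per-class recalls weighted by class frequency (= accuracy).
    """
    totals = Counter(y_true)  # samples per true class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct per class
    per_class = {c: hits[c] / n for c, n in totals.items()}
    uar = sum(per_class.values()) / len(per_class)
    war = sum(per_class[c] * n for c, n in totals.items()) / len(y_true)
    return uar, war


# Hypothetical, imbalanced toy data: 8 "neutral" and 2 "anger" samples.
y_true = ["neutral"] * 8 + ["anger"] * 2
y_pred = ["neutral"] * 8 + ["neutral", "anger"]
uar, war = average_recalls(y_true, y_pred)
print(f"UAR={uar:.2f}  WAR={war:.2f}")
```

On imbalanced data such as this toy set, the rare class pulls UAR down much more than WAR, which is a common reason for reporting both measures on emotional speech corpora.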

Feature Extraction With Deep Neural Networks by a Generalized Discriminant Analysis
IEEE Transactions on Neural Networks and Learning Systems, 2012
We present an approach to feature extraction that is a generalization of the classical linear discriminant analysis (LDA) on the basis of deep neural networks (DNNs). As for LDA, discriminative features generated from independent Gaussian class conditionals are assumed. This modeling has the advantages that the intrinsic dimensionality of the feature space is bounded by the number of classes and that the optimal discriminant function is linear. Unfortunately, linear transformations are insufficient to extract optimal discriminative features from arbitrarily distributed raw measurements. The generalized discriminant analysis (GerDA) proposed in this paper uses nonlinear transformations that are learnt by DNNs in a semisupervised fashion. We show that the feature extraction based on our approach displays excellent performance on real-world recognition and detection tasks, such as handwritten digit recognition and face detection. In a series of experiments, we evaluate GerDA features with respect to dimensionality reduction, visualization, classification, and detection. Moreover, we show that GerDA DNNs can preprocess truly high-dimensional input data to low-dimensional representations that facilitate accurate predictions even if simple linear predictors or measures of similarity are used.
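For orientation, the classical LDA criterion that this abstract builds on can be written as below. This is the textbook Fisher formulation, not necessarily the exact objective optimized in the paper; the symbols $S_W$, $S_B$, $W$, $\mu_c$, $\mu$, $N_c$ and $C$ are notation chosen here for illustration.

```latex
\begin{align}
S_W &= \sum_{c=1}^{C} \sum_{x \in \mathcal{X}_c} (x - \mu_c)(x - \mu_c)^\top
  && \text{(within-class scatter)} \\
S_B &= \sum_{c=1}^{C} N_c \, (\mu_c - \mu)(\mu_c - \mu)^\top
  && \text{(between-class scatter)} \\
W^\ast &= \arg\max_{W} \; \operatorname{tr}\!\left( \left( W^\top S_W W \right)^{-1} W^\top S_B W \right) \\
\operatorname{rank}(S_B) &\le C - 1
\end{align}
```

Here $\mathcal{X}_c$ is the set of $N_c$ samples of class $c$, $\mu_c$ its mean, and $\mu$ the overall mean. The rank bound in the last line is why at most $C - 1$ discriminative directions exist, which matches the abstract's remark that the feature dimensionality is bounded by the number of classes. One reading of the abstract is that GerDA replaces the linear map $W^\top x$ with a nonlinear map learnt by a DNN.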
Feature Extraction for Simple Classification
2010 20th International Conference on Pattern Recognition, 2010

IEEE Transactions on Affective Computing, 2010
As the recognition of emotion from speech has matured to a degree where it becomes applicable in real-life settings, it is time for a realistic view on obtainable performances. Most studies tend toward overestimation in this respect: Acted data is often used rather than spontaneous data, results are reported on preselected prototypical data, and true speaker disjunctive partitioning is still less common than simple cross-validation. Even speaker disjunctive evaluation can give only a little insight into the generalization ability of today's emotion recognition engines since training and test data used for system development usually tend to be similar as far as recording conditions, noise overlay, language, and types of emotions are concerned. A considerably more realistic impression can be gathered by interset evaluation: We therefore show results employing six standard databases in a cross-corpora evaluation experiment which could also be helpful for learning about chances to add resources for training and overcoming the typical sparseness in the field. To better cope with the observed high variances, different types of normalization are investigated. 1.8 k individual evaluations in total indicate the crucial performance inferiority of inter to intracorpus testing.

Water Science and Technology, 2012
The optimization of full-scale biogas plant operation is of great importance to make biomass a competitive source of renewable energy. The implementation of innovative control and optimization algorithms, such as Nonlinear Model Predictive Control, requires an online estimation of operating states of biogas plants. This state estimation allows for optimal control and operating decisions according to the actual state of a plant. In this paper such a state estimator is developed using a calibrated simulation model of a full-scale biogas plant, which is based on the Anaerobic Digestion Model No.1. The use of advanced pattern recognition methods shows that model states can be predicted from basic online measurements such as biogas production, CH4 and CO2 content in the biogas, pH value and substrate feed volume of known substrates. The machine learning methods used are trained and evaluated using synthetic data created with the biogas plant model simulating over a wide range of possible...
In this paper, we introduce an approach to improve the recognition performance of a Hidden Markov Model (HMM) based monophone recognizer using Support Vector Machines (SVMs). We developed and examined a method for re-scoring the HMM recognizer hypotheses by SVMs in a phoneme recognition framework. Compared to a stand-alone HMM system, an improvement of 9.2% was reached on the TIMIT database and 12.8% on the Wall Street Journal Cambridge database using the hybrid framework.
Discriminative feature extraction with Deep Neural Networks
The 2010 International Joint Conference on Neural Networks (IJCNN), 2010
We propose a framework for optimizing Deep Neural Networks (DNN) with the objective of learning low-dimensional discriminative features from high-dimensional complex patterns. In a two-stage process that effectively implements a Nonlinear Discriminant Analysis (NDA), we first pretrain a DNN using stochastic optimization, partly supervised and unsupervised. This stage involves layer-wise training and stacking of single Restricted Boltzmann Machines (RBM). The