Academia.eduAcademia.edu

Multiple Classifier Systems

description527 papers
group62 followers
lightbulbAbout this topic
Multiple Classifier Systems (MCS) refer to a framework in machine learning that combines the predictions of multiple classifiers to improve overall classification performance. By leveraging the diversity of individual classifiers, MCS aims to enhance accuracy, robustness, and generalization in various classification tasks.
lightbulbAbout this topic
Multiple Classifier Systems (MCS) refer to a framework in machine learning that combines the predictions of multiple classifiers to improve overall classification performance. By leveraging the diversity of individual classifiers, MCS aims to enhance accuracy, robustness, and generalization in various classification tasks.

Key research themes

1. How can optimal classifier subset selection improve majority voting ensembles given the combinatorial complexity of classifier combinations?

This research theme focuses on methods for selecting subsets of classifiers to enhance majority voting ensembles while managing the exponential complexity of evaluating all possible subsets. It entails exploring criteria for classifier selection (e.g., diversity measures, direct fusion error estimates), heuristic search algorithms, and novel ensemble designs that iteratively refine both selection and fusion to improve generalization. Understanding and addressing the limitations of diversity as a proxy for ensemble performance and leveraging combiner-specific measures are central to this area.

Key finding: This work experimentally establishes that diversity measures traditionally used as selection criteria are less effective than direct combiner error-based search in selecting classifier subsets for majority voting; the authors... Read more
Key finding: This study introduces a novel approach in which classifier importance within an ensemble is assessed using centrality measures derived from complex network theory, leading to two methods: Centrality Based Selection (CBS),... Read more
Key finding: The authors propose an unsupervised, instance-specific weight learning method for classifier fusion that optimizes a clarity index based on ranks of weighted scores without requiring labeled training data; this method allows... Read more

2. Which ensemble learning strategies and classifier types yield superior performance in diverse real-world and medical domain classification tasks?

This theme investigates comparative evaluations and hybrid ensemble designs that combine classifiers from different families to improve predictive accuracy on real-world classification problems, including complex domains like medical diagnosis and remote sensing. It includes studies benchmarking hundreds of classifiers against extensive datasets, strategies for stacking strong ensembles, and the practical application of ensemble methods combining support vector machines with bagging or boosting approaches. The goal is identifying classifier families or ensemble configurations that consistently achieve high classification accuracy and robustness.

Key finding: This empirical study evaluates 179 classifiers from 17 families on 121 diverse datasets and finds that random forest classifiers tend to yield the best overall accuracy, followed closely by SVMs with Gaussian kernels; neural... Read more
Key finding: Proposes ensemble frameworks using bagging and gradient boosting with SVMs (equipped with RBF kernels) tailored for medical data classification, demonstrating statistically significant improvements over mono-modal models... Read more
Key finding: Presents a hybrid stacking method combining strong ensemble algorithms like Random Forest, ExtraTrees, and Gradient Boosting to create a meta-classifier, achieving superior accuracy than individual ensembles across multiple... Read more
Key finding: Demonstrates that multiple classifier combinations using voting weights based on producer’s accuracy outperform those using weights based on overall accuracy when classifying Landsat-8 remote sensing images, confirming... Read more
Key finding: Compares combination schemes of a multilayer perceptron ensemble against support vector machine classifiers on remote sensing data, finding nonlinear MLP-based combination yields improved performance among combination... Read more

3. How can feature selection and structured ensemble architectures enhance classifier performance in high-dimensional and dispersed data contexts?

This research theme addresses integrating feature selection with ensemble methods and handling distributed or dispersed data for improved classification performance. It encompasses approaches such as learning classifier systems ensembles based on feature subspaces selected via rough sets or random sampling, hierarchical classifier ensembles constructed using clustering and classifier selection, and coalition formation of distributed local classifiers based on conflict analysis. These methods focus on scalability, dimensionality reduction, and leveraging local data coalitions or classifier diversity to boost robustness and accuracy.

by Essam Debie and 
1 more
Key finding: Proposes a conceptual framework categorizing Learning Classifier System (LCS) ensembles by stages including data preparation (pre-gate), base learner types (member), and output combination (post-gate); experimentally compares... Read more
Key finding: Introduces the Classifier Selection Based on Clustering (CSBC) method that generates base classifiers using bagging decision trees and partitions them via clustering to enhance diversity; selects classifiers from clusters to... Read more
Key finding: Presents a novel approach to combine dispersed independent data sets by creating coalitions of local decision tables analyzed for conflicts using Pawlak's conflict analysis model; aggregation of similar tables into coalitions... Read more

All papers in Multiple Classifier Systems

Cluster analysis is an un-supervised learning technique that is widely used in the process of topic discovery from text. The research presented here proposes a novel un-supervised learning approach based on aggregation of clusterings... more
We recently introduced the idea of solving cluster ensembles using a Weighted Shared nearest neighbors Graph (WSnnG). Preliminary experiments have shown promising results in terms of integrating different clusterings into a combined one,... more
In this paper, we propose a cluster-based cumulative representation for cluster ensembles. Cluster labels are mapped to incrementally accumulated clusters, and a matching criterion based on maximum similarity is used. The ensemble method... more
This paper presents a probabilistic model for combining cluster ensembles utilizing information theoretic measures. Starting from a co-association matrix which summarizes the ensemble, we extract a set of association distributions, which... more
In this paper we propose a Multiple Classifier System (MCS) for classifying breast lesions in Dynamic Contrast Enhanced-Magnetic Resonance Imaging (DCE-MRI). The proposed MCS combines the results of two classifiers trained with dynamic... more
Machine learning algorithms are spreading into many essential decision-making procedures of intelligent systems. These algorithms rely on big data which contain several sensitive features regarding the training instances such as gender,... more
In this paper, we present biometric person recognition experiments in a real-world car environment using speech, face, and driving signals. We have performed experiments on a subset of the in-car CIAIR corpus collected at the Nagoya... more
Sensitivity indices are used to rank the importance of input design variables or components by estimating the degree of uncertainty of output variable influenced by the uncertainty generated from input variables or components. With the... more
Combining classifiers by majority voting (MV) has recently emerged as an effective way of improving performance of individual classifiers. However, the usefulness of applying MV is not always observed and is subject to distribution of... more
The social media has made the world a global world and we, in addition to, as part of physical society, are now part of the virtual society as well. There has been the generation of a large amount of information over the social web. By... more
The goal of this work is to introduce an architecture to automatically detect the amount of stress in the speech signal close to real time. For this an experimental setup to record speech rich in vocabulary and containing different stress... more
ABSTRACT: The specific objective of this paper was to provide a comparative analysis of three automatic classification algorithms: Quinlan's C4. 5 and two robust probabilistic classifiers like Support Vector Machine (SVM) and... more
We detail an exploratory experiment aimed at determining the performance of stochastic vector quantisation as a purely fusion methodology, in contrast to its performance as a composite classification/fusion mechanism. To achieve this we... more
A novel technique is being proposed to prune the weights of artificial neural networks (ANNs) while training with backpropagation algorithm. Iterative update of weights through gradient descent mechanism does not guarantee convergence in... more
The "one against one" and the "one against all" are the two most popular strategies for multi-class SVM; however, according to the literature review, it seems impossible to conclude which one is better for handwriting recognition. Thus,... more
The "one against one" and the "one against all" are the two most popular strategies for multi-class SVM; however, according to the literature review, it seems impossible to conclude which one is better for handwriting recognition. Thus,... more
Recently, ensemble techniques have also attracted the attention of Genetic Programing (GP) researchers. The goal is to further improve GP classification performances. Among the ensemble techniques, also bagging and boosting have been... more
In many real world applications, the cost of a wrong decision may be much higher than the benefit of having a high classification rate. This kind of classification problems represent very challenging tasks, because they require highly... more
Most of the methods for combining classifiers rely on the assumption that the experts to be combined make uncorrelated errors. Unfortunately, this theoretical assumption is not easy to satisfy in practical cases, thus effecting the... more
Vehicle classification has been a highly focused topic amongst the scientific community due to its application in automatic tolling systems, road surveillance, traffic monitoring, etc. Image-based classification using computer vision... more
One of the main factors affecting the effectiveness of ECOC methods for classification is the dependence among the errors of the computed codeword bits. We present an extensive experimental work for evaluating the dependence among output... more
Statistical classification of hyperspectral data is challenging because the input space is high in dimension and correlated, but labeled information to characterize the class distributions is typically sparse. The resulting classifiers... more
Training a parametric classi er involves the use of a training set of data with known la-beling to estimate or \learn" the parameters of the chosen model. A test set, consisting of patterns not previously seen by the classi er, is... more
Recently, ensemble techniques have also attracted the attention of Genetic Programing (GP) researchers. The goal is to further improve GP classification performances. Among the ensemble techniques, also bagging and boosting have been... more
This paper proposes a new ensemble method that constructs an ensemble of tree-structured classifiers using multi-view learning. We are motivated by the fact that an ensemble can outperform its members providing that these classifiers are... more
For the last few decades, learning based on multiple kernels, such as the ensemble kernel regressor and the multiple kernel regressor, has attracted much attention in the field of machine learning. Although its efficacy was revealed... more
For the last few decades, learning based on multiple kernels, such as the ensemble kernel regressor and the multiple kernel regressor, has attracted much attention in the field of machine learning. Although its efficacy was revealed... more
AdaBoost and other ensemble methods have successfully been applied to a number of classi cation tasks, seemingly defying problems of over tting. AdaBoost performs gradient descent i n a n e r r o r function with respect to the margin,... more
machine learning, multiple classifier systems, compound pattern recognition, medical informatics, liver fibrosis, hepatitis C
We present an approach for the automatic classification of Nuclear Magnetic Resonance Spectroscopy data of biofluids with respect to drug induced organ toxicities. Classification is realized by an Ensemble of Support Vector Machines,... more
A communication model for the Hypothesis Boosting (HB) problem is proposed. Under this model, AdaBoost algorithm can be viewed as a threshold decoding approach for a repetition code. Generalization of such decoding view under theory of... more
In this paper, a novel hybrid kernel machine ensemble is proposed for abnormal ECG beat detection to facilitate long-term monitoring of heart patients. A binary SV M is trained using ECG beats from different patients to adapt to the... more
In this paper, a novel hybrid kernel machine ensemble is proposed for abnormal ECG beat detection to facilitate long-term monitoring of heart patients. A binary SV M is trained using ECG beats from different patients to adapt to the... more
Recently, the problem of imbalanced data is the focus of intense research of machine learning community. Following work tries to utilize an approach of transforming the data space into another where classification task may become easier.... more
This paper provides a three-step framework to predict user assessment of the suitability of movies for an inflight viewing context. For this, we employed classifier stacking strategies. First of all, using the different modalities of... more
This paper provides a three-step framework to predict user assessment of the suitability of movies for an inflight viewing context. For this, we employed classifier stacking strategies. First of all, using the different modalities of... more
Multiple modalities present potential difficulties for kernel-based pattern recognition in consequence of the lack of intermodal kernel measures. This is particularly apparent when training sets for the differing modalities are disjoint.... more
In the field of machine learning, the ensemble has been employed as a common methodology to improve the performance upon multiple base classifiers. However, the true predictions are often canceled out by the false ones during consensus... more
There are two different stages to consider when constructing multiple classifier systems: The Meta-Classifier Stage that is responsible for the combination logic and basically treats the ensemble members as black boxes, and the Classifier... more
One of the challenges in machine learning is the classification of datasets with ambiguous instances. In this paper we study specifically datasets with examples that have overlapping feature values for different classes. In these... more
Combining the predictions of a set of classifiers has been shown to be an effective way to create composite classifiers that are more accurate than any of the component classifiers. Increased accuracy has been shown in a variety of... more
Documentation of software architecture is a good approach to understand the architecture of a software system and to match it with the changes needed during the software maintenance phase. In several systems like legacy or older systems... more
In recent times, the manufacturing processes are faced with many external or internal (the increase of customized product rescheduling , process reliability,..) changes. Therefore, monitoring and quality management activities for these... more
Hyperspectral images are one of the most important data source for land cover analysis. These images encode information about the earth surface expressed in terms of spectral bands, allowing us to precisely classify and identify materials... more
In this paper, a novel hybrid kernel machine ensemble is proposed for abnormal ECG beat detection to facilitate long-term monitoring of heart patients. A binary SV M is trained using ECG beats from different patients to adapt to the... more
We present a methodology to analyze Multiple Classifiers Systems (MCS) performance, using the disagreement concept. The goal is to define an alternative approach to the conventional recognition rate criterion, which usually requires an... more
The accuracy of predictions is better if the combinations of the different approaches are used. Currently in collaborative filtering research, the linear blending of various methods is used. More accurate classifiers can be obtained by... more
Membrane protein prediction is a significant classification problem, requiring the integration of data derived from different sources such as protein sequences, gene expression, protein interactions etc. A generalized probabilistic... more
We have previously introduced, in purely theoretical terms, the notion of neutral point substitution for missing kernel data in multimodal problems. In particular, it was demonstrated that when modalities are maximally disjoint, the... more
Download research papers for free!