Academia.edu

C Means Clustering

About this topic
C Means Clustering, also known as C-Means or Fuzzy C-Means, is an unsupervised machine learning algorithm that partitions a dataset into C clusters. Each data point can belong to multiple clusters with varying degrees of membership, and those degrees are computed from the point's distance to the cluster centroids.
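
The description above can be sketched in a few lines of Python. This is a minimal illustration, not the method of any paper listed on this page: it assumes squared Euclidean distance, a single fuzzifier `m` greater than 1, and random initial memberships. The function name `fuzzy_c_means` and all of its parameters are hypothetical.

```python
import random

def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Return (centroids, memberships) for a list of equal-length tuples."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, normalised so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])
    centroids = []
    for _ in range(iters):
        # Centroid update: mean of the points weighted by u_ik ** m.
        centroids = []
        for k in range(c):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centroids.append(tuple(
                sum(w[i] * points[i][d] for i in range(n)) / total
                for d in range(dim)))
        # Membership update: u_ik is proportional to d_ik^2 ** (-1 / (m - 1)).
        shift = 0.0
        for i in range(n):
            d2 = [sum((points[i][d] - ck[d]) ** 2 for d in range(dim))
                  for ck in centroids]
            if min(d2) == 0.0:
                # The point sits exactly on a centroid: share full
                # membership among the coincident centroids.
                zeros = [k for k in range(c) if d2[k] == 0.0]
                new = [1.0 / len(zeros) if k in zeros else 0.0
                       for k in range(c)]
            else:
                inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
                total = sum(inv)
                new = [v / total for v in inv]
            shift = max(shift, max(abs(a - b) for a, b in zip(new, u[i])))
            u[i] = new
        if shift < tol:  # memberships have stopped changing
            break
    return centroids, u

# Illustrative data (made up for this sketch): two small, well-separated groups.
demo_points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
               (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
demo_centroids, demo_u = fuzzy_c_means(demo_points, c=2)
```

Each row of `demo_u` holds one point's membership degrees across the C clusters, and the degrees in a row sum to 1. Taking the largest entry per row recovers a crisp assignment when one is needed.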

Key research themes

1. How can fuzzy and possibilistic extensions to C-Means improve clustering robustness and membership representation?

This research theme focuses on enhancing classical C-means clustering with fuzzy and possibilistic membership frameworks that handle overlapping clusters, noisy data, and uncertainty in membership assignment. These extensions reduce sensitivity to noise and give a richer membership representation in which a data point can belong partially to several clusters. This matters most in real-world applications where crisp boundaries do not suffice and noise is prevalent.
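
To make the contrast between fuzzy and possibilistic memberships concrete, here is a small sketch. It assumes the standard fuzzy c-means membership formula and the typicality formula usually attributed to possibilistic c-means, t = 1 / (1 + (d² / η)^(1/(m−1))). Neither formula is stated on this page, and the function names and the per-cluster scale `eta` are illustrative.

```python
def fuzzy_memberships(d2_all, m=2.0):
    """Fuzzy memberships of one point, given its squared distances to every
    centroid (all assumed positive). The values always sum to 1."""
    inv = [d2 ** (-1.0 / (m - 1.0)) for d2 in d2_all]
    total = sum(inv)
    return [v / total for v in inv]

def possibilistic_typicality(d2, eta, m=2.0):
    """Typicality of one point to one cluster, given the squared distance d2
    and an assumed per-cluster scale eta > 0. It depends on that cluster
    only, so the values need not sum to 1 across clusters."""
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))

# An outlier equally far from two centroids (squared distance 100 to each):
outlier_memberships = fuzzy_memberships([100.0, 100.0])
outlier_typicality = possibilistic_typicality(100.0, eta=1.0)
```

Under these assumptions the outlier still receives a fuzzy membership of 0.5 in each cluster because the memberships are forced to sum to 1, while its typicality to either cluster is close to zero. That difference is one reason possibilistic variants are described as less sensitive to noise.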

Key finding: Comprehensively surveys Fuzzy C-Means (FCM) algorithm and its variants developed over more than a decade, establishing FCM's core methodological foundation where membership degrees represent each data point's fuzzy belonging...
Key finding: Proposes a novel enhancement to standard FCM by utilizing multiple fuzzification coefficients tailored per data sample based on local density metrics, instead of a single universal parameter. This methodological innovation...
Key finding: Introduces a suppression-based modification to Possibilistic Fuzzy c-means (PFCM) that integrates winner and non-winner suppression mechanisms in membership and typicality computations. This improves classification accuracy...
Key finding: Extends classical c-means clustering by introducing two new regulation principles (R1-OKM and R2-OKM) to explicitly control overlap size and shape between clusters, moving beyond fuzzy membership to crisp overlapping...

2. What methodological advances have been proposed to automatically estimate the optimal number of clusters in C-means-based clustering?

Determining the correct number of clusters remains a fundamental challenge in C-means clustering, given that erroneous cluster count selection leads to poor partitioning. This theme encompasses approaches that integrate statistical modeling, hierarchical partitioning, and evidence accumulation to estimate cluster number reliably. Advances include model-based hierarchical methods, Gamma Mixture Models, and evidence accumulation techniques that aggregate multiple clustering runs to improve cluster count estimation, critical for unsupervised settings and high-dimensional data.

Key finding: Develops a hierarchical Gamma Mixture Model (GMM) tree algorithm for recursively partitioning data via distance distributions from observation points, enabling accurate estimation of the number of clusters even in...
Key finding: Proposes an evidence accumulation clustering method leveraging multiple K-means runs with random initializations to create a co-association matrix reflecting pairwise object clustering frequencies. Subsequent clustering of...
Key finding: Presents Hierarchical Means Clustering (HMC) which optimizes a least-squares loss over a system of nested centroid-based statistical models to produce a hierarchy of cluster partitions. HMC integrates iterative assignment,...
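
The co-association matrix mentioned in the evidence accumulation finding can be sketched as follows. This is a minimal illustration under stated assumptions: a plain Lloyd-style K-means with centroids initialised from randomly sampled data points, and hypothetical function names. It stops at the matrix, since the later steps of the cited method are truncated in the summary above.

```python
import random

def kmeans_labels(points, k, rng, iters=50):
    """One K-means run started from k randomly sampled points."""
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels

def co_association(points, k, runs=20, seed=0):
    """Fraction of runs in which each pair of points shares a cluster."""
    rng = random.Random(seed)
    n = len(points)
    counts = [[0] * n for _ in range(n)]
    for _ in range(runs):
        labels = kmeans_labels(points, k, rng)
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / runs for c in row] for row in counts]
```

Entry `[i][j]` of the returned matrix is the fraction of runs in which points `i` and `j` landed in the same cluster, so pairs that are grouped together consistently score close to 1.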

3. How have C-means clustering adaptations been applied and improved for specialized data types such as sets, high-dimensional data, and image segmentation?

This theme investigates adaptations and methodological innovations of C-means and its variants tailored to complex data types prevalent in practical applications. Key advances include medoid-based algorithms for clustering sets and categorical data, improved distance metrics for clustering in high-dimensional spaces, and kernelized fuzzy c-means approaches incorporating neighborhood information for robust image segmentation under noise. Such adaptations critically enhance the applicability of clustering algorithms to domain-specific data structures and noisy environments.

Key finding: Introduces medoid-centric clustering algorithms adapting K-medoids and random swap strategies to cluster variable-sized sets using customized set distance measures like Jaccard and Otsuka-Ochiai cosine distances. The approach...
Key finding: Proposes a novel distance metric based on the logarithm of ratios of vector components that remains bounded in high-dimensional spaces, alleviating issues of Euclidean distance blowup. When embedded in the k-means algorithm,...
Key finding: Presents a novel Kernel-based Modified Fuzzy C-means (NMKFCM) algorithm that integrates neighborhood pixel information into the kernelized fuzzy objective function to improve image segmentation performance, especially in...
Key finding: Employs an improved Sine Cosine Algorithm (ISCA), a metaheuristic optimizer, to enhance clustering-based image segmentation by optimizing cluster centers effectively and preventing local optima entrapments. Experimental...
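
The set distances named in the first finding of this theme can be written down directly. The sketch below assumes the usual textbook definitions (Jaccard distance as 1 − |A∩B| / |A∪B|, Otsuka-Ochiai distance as 1 − |A∩B| / √(|A|·|B|)) and adds a hypothetical `medoid` helper showing why a medoid, rather than a mean, serves as the cluster representative for sets. The cited paper's customized measures may differ.

```python
import math

def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

def otsuka_ochiai_distance(a, b):
    """1 - |A ∩ B| / sqrt(|A| * |B|), the cosine distance between sets."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0 if a == b else 1.0
    return 1.0 - len(a & b) / math.sqrt(len(a) * len(b))

def medoid(sets, dist):
    """The member with the smallest total distance to all members. Sets have
    no arithmetic mean, so an actual member represents the cluster."""
    return min(sets, key=lambda s: sum(dist(s, t) for t in sets))

# Illustrative cluster of variable-sized sets (made up for this sketch).
demo_sets = [{1, 2}, {1, 2, 3}, {1, 2, 3, 4}]
demo_medoid = medoid(demo_sets, jaccard_distance)
```

Both distances work on sets of different sizes, which is what lets a K-medoids style loop cluster variable-sized sets without embedding them in a vector space first.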
