Data Clustering

2,735 papers

207 followers

About this topic

Data clustering is a machine learning technique that involves grouping a set of objects in such a way that objects in the same group (or cluster) are more similar to each other than to those in other groups. It is used for exploratory data analysis and pattern recognition.
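As a concrete illustration of this definition, the following is a minimal sketch of one common partitioning method, k-means, written in plain Python. The sample points, the choice of two clusters, and the use of the first and fourth points as starting centers are all assumptions made for the example; the function name `kmeans` is likewise illustrative and not taken from any paper listed here.

```python
import math

def kmeans(points, centers, iterations=10):
    """Plain Lloyd-style k-means on tuples of floats; returns (labels, centers)."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster went empty
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers

# Hypothetical toy data: two visibly separated groups of 2-D points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centers = kmeans(points, centers=[points[0], points[3]])
print(labels)  # → [0, 0, 0, 1, 1, 1]
```

Points in the same group end up with the same label because each is closer to its own group's mean than to the other group's mean, which is the "more similar to each other" criterion in its simplest Euclidean form.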

Key research themes

1. How do different fundamental clustering algorithm paradigms address data complexity and application needs in data mining?

This theme explores the comparative roles, methodological foundations, and practical implementations of the main clustering paradigms—partitioning, hierarchical, density-based, grid-based, and model-based clustering—in data mining. It highlights how these paradigms adapt to handle large, high-dimensional, or complex datasets and accommodate different data types and clustering objectives across various applications.

A Survey of Clustering Data Mining Techniques

by Tasos Neikos

2022, Grouping Multidimensional Data

Key finding: Provides a comprehensive taxonomy of clustering algorithms used in big data mining, dividing them into hierarchical and partitioning methods, further delineating subtypes such as agglomerative/divisive, k-means, k-medoids,...

A Survey on Clustering Techniques in Data Mining

by IJCSMC Journal

2018, IJCSMC

Key finding: Offers an extensive survey highlighting strengths and weaknesses of partitioning (e.g., k-means, k-medoids), hierarchical (agglomerative and divisive), density-based (DBSCAN, OPTICS), and grid-based algorithms. It emphasizes...

Study on Various Clustering Techniques

by Saroj Chaudhary

2022

Key finding: Analyzes the performance of classical clustering methods on high-dimensional datasets, emphasizing challenges of scalability and meaningful pattern extraction. It discusses issues like the curse of dimensionality causing...

Clustering Methods

by Oded Maimon

2024, Data Mining and Knowledge Discovery Handbook

Key finding: Presents a tutorial overview and mathematical underpinnings of diverse clustering approaches including hierarchical, partitioning, density-based, model-based, grid-based, and soft computing. It rigorously discusses the...

Data Clustering

by Hana Rezanková

2023, Emerging Techniques and Technologies

Key finding: Focuses on clustering methods specifically tailored for Web data, emphasizing adaptation of classical clustering, graph clustering, and neural network approaches to domain-specific data representations (text, hyperlinks,...

2. What optimization and ensemble strategies improve clustering robustness, accuracy, and shape flexibility beyond traditional centroid-based methods?

This theme investigates advanced methodologies that enhance clustering performance via mathematical programming, ensemble evidence accumulation, non-centroid discrete optimization, and hybrid parallel approaches. It elucidates how these methods address issues such as local minima trapping, robustness over multiple runs, identification of arbitrary-shaped clusters, and computational scalability.

Optimal clustering: a model and method

by Janine Aronson

2021, Naval Research Logistics (NRL)

Key finding: Introduces a mixed-integer linear programming (MILP) model to obtain provably optimal cluster assignments minimizing total within-cluster dissimilarities, incorporating constraints such as group precedence and size limits....

Data clustering using evidence accumulation

by Ana L N Fred

2024, Object recognition supported by user interaction for service robots

Key finding: Proposes an ensemble clustering approach that aggregates multiple clusterings generated by repeated random initializations of k-means, forming a co-association matrix representing pairwise pattern similarity. By clustering...
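The co-association matrix mentioned in this finding can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the four hand-written label lists stand in for the outputs of repeated k-means runs, and the function name `co_association` is hypothetical. The final consensus step is not shown, since the summary above is truncated before describing it.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    # Normalize the vote counts by the number of partitions.
    return [[c / len(partitions) for c in row] for row in counts]

# Assumed stand-ins for four clustering runs over the same four items.
partitions = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],  # same grouping as the first run, different label names
    [0, 0, 0, 1],
    [0, 1, 1, 1],
]
C = co_association(partitions)
for row in C:
    print(row)
```

Because the matrix only records whether two items share a label within a run, it is unaffected by label names differing between runs, which is what makes it usable as a pairwise similarity across independent clusterings.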

Robust data clustering

by Ana L N Fred

2025, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings.

Key finding: Develops an information-theoretical framework using normalized mutual information and bootstrap variance to quantify clustering ensemble consistency and robustness. It formulates evidence accumulation as an optimization of...

Two Medoid-Based Algorithms for Clustering Sets

by Libero Nigro

2024, Algorithms

Key finding: Presents novel K-medoids based clustering algorithms tailored for set-valued data, bypassing centroid computation limitations in categorical and set data by leveraging classical set-distance measures (Jaccard, Otsuka-Ochiai)...
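To show why a medoid sidesteps centroid computation for set-valued data, here is a minimal sketch using the Jaccard distance only (the Otsuka-Ochiai measure named above is not shown). The function names and the example sets are assumptions for illustration and do not reproduce the paper's algorithms.

```python
def jaccard_distance(a, b):
    """1 - |a ∩ b| / |a ∪ b|; two empty sets are treated as identical."""
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)

def medoid(sets):
    """Index of the member with the smallest total distance to all members."""
    return min(range(len(sets)),
               key=lambda i: sum(jaccard_distance(sets[i], s) for s in sets))

# Hypothetical cluster of set-valued items.
cluster = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "c", "d"}, {"x", "y"}]
print(medoid(cluster))  # → 0
```

The representative is always an actual member of the cluster, so no "average set" needs to be defined; only pairwise distances are required.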

A Multi-Agent K-Means Algorithm for Improved Parallel Data Clustering

by aldo erianda

2023, JOIV : International Journal on Informatics Visualization

Key finding: Combines multi-agent system (MAS) concepts with K-means algorithm to introduce Multi-K-means (MK-means), a parallel clustering technique improving global optimization convergence and accuracy. Agents collaborate by monitoring...

3. How can dimensionality reduction and integration of clustering algorithms enhance clustering effectiveness in high-dimensional and domain-specific datasets?

This theme surveys approaches combining dimensionality reduction techniques like Principal Component Analysis (PCA) and integrated or hybrid clustering frameworks to address the curse of dimensionality, improve clustering interpretability, and optimize domain-specific applications such as telecom customer segmentation.

Principal Component Analysis for Database Scan Using Hierarchical KNN Model for Telecom Customer Segmentation

by oluwasegun william ijibadejo

2024, International Conference on Engineering, Natural Sciences, and Technological Development

Key finding: Demonstrates a hybrid approach where PCA reduces high-dimensional telecom customer data to critical principal components, enabling effective hierarchical K-nearest neighbors clustering for client segmentation. This method...

Non-centroid-based discrete differential evolution for data clustering

by beei iaes

2025, Bulletin of Electrical Engineering and Informatics

Key finding: Introduces a discrete differential evolution algorithm that eschews traditional centroid reliance, searching instead for label assignments directly in discrete space. This enables discovery of non-spherical clusters and...

All papers in Data Clustering

Evaluation of texture analysis techniques to characterize vegetation

by Jorge Recio

2025

The extraction of numeric features to characterize textures on images takes special relevance in certain satellite and aerial images classification processes. The wide range of the methodological approaches used and their applications in...

Efficient data clustering over peer-to-peer networks

by Mohamed Ismail

2025, 2011 11th International Conference on Intelligent Systems Design and Applications

Due to the dramatic increase of data volumes in different applications, it is becoming infeasible to keep these data in one centralized machine. It is becoming more and more natural to deal with distributed databases and networks. That is...

Rb-Sr and K-Ar ages on minerals at temperatures of 300°–400° C from deep wells in the Larderello geothermal field (Italy)

by Filippo Radicati di Brozolo

2025, Contributions to Mineralogy and Petrology

Rb-Sr and K-Ar ages have been obtained on six biotites, two muscovites and one hornblende from samples of micaschist, gneiss and amphibolite of Lower Paleozoic to Precambrian age at a depth exceeding 2,000 m in basement rocks of the...

Topic Discovery from Text Using Aggregation of Different Clustering Methods

by hanan ayad

2025, Lecture Notes in Computer Science

Cluster analysis is an un-supervised learning technique that is widely used in the process of topic discovery from text. The research presented here proposes a novel un-supervised learning approach based on aggregation of clusterings...

Refined Shared Nearest Neighbors Graph for Combining Multiple Data Clusterings

by hanan ayad

2025, Lecture Notes in Computer Science

We recently introduced the idea of solving cluster ensembles using a Weighted Shared nearest neighbors Graph (WSnnG). Preliminary experiments have shown promising results in terms of integrating different clusterings into a combined one,...

NEWcv

by Tony Scott

2025

Transformation of an Uncertain Video Search Pipeline to a Sketch-Based Visual Analytics Loop

by Iwan Griffiths

2025, IEEE Transactions on Visualization and Computer Graphics

Traditional sketch-based image or video search systems rely on machine learning concepts as their core technology. However, in many applications, machine learning alone is impractical since videos may not be semantically annotated...

Evaluation of a New Approach for Edit and Imputation of Social and Demographical Data with Hierarchical Structure

by Anna Pezone

2025

Summary: This work proposes a new approach, suggested by graph theory, to reduce the loss of information caused by improper deletion of data during the editing process, and to improve the...

WP. 30 ENGLISH ONLY UNITED NATIONS STATISTICAL COMMISSION and ECONOMIC COMMISSION FOR EUROPE CONFERENCE OF EUROPEAN STATISTICIANS Work Session on Statistical Data Editing (Ottawa, Canada, 16-18 May 2005) Topic (iv): New and emerging methods, including automation through machine learning, imputation

by Anna Pezone

2025

In the paper the most recent methodological and technological advancements at ISTAT in the area of editing and imputation are described. A recently developed model-based method for localizing systematic unity measure errors and some...

Fuzzy Models Synthesis with Kernel-Density-Based Clustering Algorithm

by Małgorzata Charytanowicz

2025, 2008 Fifth International Conference on Fuzzy Systems and Knowledge Discovery

Data clustering constitutes at present a commonly used technique for extracting fuzzy system rules from experimental data. Detailed studies in the field have shown that using the above-mentioned method results in significantly reduced...

Enhanced Features extraction method based on Fuzzy C-means algorithm

by Nidhal Hasan Hasaan

2025, Master's Thesis

Feature extraction is an essential process in machine learning (ML). It derives new features from the original ones to improve the quality or representation of the data, for example to increase classifier efficiency and classification accuracy. Clustering is one of the methods used for feature extraction, and it comes in two forms: hard clustering, where each data point must be assigned to exactly one cluster, and soft clustering, where a data point may be assigned to more than one cluster, each with a "membership" value. The type of clustering method may therefore affect the quality of the features it generates. For soft clustering algorithms such as Fuzzy c-means (FCM), several questions arise: whether using FCM for feature extraction enhances the quality of the features, and how the characteristics of the data supplied to FCM affect the results. This is the problem addressed in this thesis. The aim is to develop a new feature extraction method based on FCM clustering that enhances classification performance and represents the nature of the data more accurately by using "partial membership" features. The proposed work consists of several steps. First, an FCM-based algorithm is proposed to produce the new features, because allowing points to belong partially to more than one cluster can generate useful and rich feature representations. Then, to test the quality of these features, a classification algorithm such as K-Nearest Neighbors (KNN) is used. Finally, for comparison, the K-means algorithm, a hard clustering method, was used in parallel with the proposed algorithm to extract features independently, and these features were also evaluated with KNN.
The practical work was applied to six datasets: CICIDS2017, Breast Cancer, Sonar, Spam base, Nomao, and Diabetes. The results show the superiority of the proposed method on all datasets, with some variation between them, in terms of the classification performance metrics accuracy, precision, recall, and F1 score, demonstrating its effectiveness in generating high-quality and robust features for ML tasks. The accuracy achieved was 99.10% on CICIDS2017, 96.71% on Breast Cancer, 89.66% on Sonar, 88.23% on Spam base, 95.03% on Nomao, and 65.51% on Diabetes. The FCM algorithm is considered better than K-means for reasons related to the nature of the data and the characteristics of each algorithm: it allows a single point to belong to more than one cluster with different degrees and produces more expressive features. It is also suitable for cases where the class boundaries are unclear or overlap.
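The idea of using partial memberships as features can be sketched with the standard fuzzy c-means membership formula. This is an illustration under assumptions, not the thesis's algorithm: the cluster centers are given by hand rather than learned by FCM iterations, the fuzzifier `m = 2` is an assumed default, and the function name `fcm_memberships` is hypothetical.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership of one point in each cluster (sums to 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point sitting exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1))
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

# Assumed centers; in practice they would come from running FCM on the data.
centers = [(0.0, 0.0), (4.0, 0.0)]
features = fcm_memberships((1.0, 0.0), centers)
print(features)
```

For this point the distances to the two centers are 1 and 3, so the memberships come out at about 0.9 and 0.1. A hard method such as k-means would reduce the same point to a single cluster label, whereas the membership vector keeps the information about how close the point is to the other cluster, which is the property the abstract credits for the more expressive features.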

Data Mining of Virtual Campus Data

by Francisco Mugica

2025, Studies in Computational Intelligence

Main Methods of Data Mining and Their Applications

by Daoud Jerab and

2025

This study explores the principal methods of data mining and their diverse applications across industries. Its purpose is to provide a comprehensive overview of key techniques: classification, clustering, association rule learning,...

A Survey on Unsupervised Clustering Algorithm based on K-Means Clustering

by ASHISH MOHAN

2025, International Journal of Computer Applications

Data mining supported by unsupervised clustering algorithms is one of the fastest-growing research areas because of the availability of huge quantities of data to analyze and the need to extract useful information based on new improve...

CUCKOO-ANN Based Novel Energy-Efficient Optimization Technique for IoT Sensor Node Modelling

by deepshikha bhargava

2025, Wireless Communications and Mobile Computing

Wireless sensor networks (WSNs) based on the Internet of Things (IoT) are now one of the most prominent wireless sensor communication technologies. WSNs are often developed for particular applications such as monitoring or tracking in...

SHILL BIDDING FRAUD DETECTION IN ONLINE AUCTIONS USING MACHINE LEARNING ALGORITHMS

by IJETRM Journal

2025, IJETRM

As network traffic gets more complex, conventional manual techniques of identifying network traffic are becoming less successful. Fraudulent activities are no longer allowed by internet auction sites like those that allow shill bidding;...

A Survey on Unsupervised Clustering Algorithm based on K-Means Clustering

by yogiraj singh

2025, International Journal of Computer Applications

A clustering fuzzy approach for image segmentation

by Giovanni Foresti

2025, Pattern Recognition

Segmentation is a fundamental step in image description or classification. In recent years, several computational models have been used to implement segmentation methods but without establishing a single analytic solution. However, the...

Identifying Conversational Message Threads by Integrating Classification and Data Clustering

by Giacomo Domeniconi

2025, Communications in computer and information science

Conversational message thread identification regards a wide spectrum of applications, ranging from social network marketing to virus propagation, digital forensics, etc. Many different approaches have been proposed in literature for the...

Centralized Scheduling Approach to Manage Smart Charging of Electric Vehicles in Smart Cities

by Vito CALDERARO

2025

Electric vehicles (EVs) are emerging as the future of individual mobility systems in smart cities since they reduce greenhouse gas emissions and fossil fuel dependence. However, the deepening penetration of battery EVs forecasted for the...

Using a Two-Step Clustering Approach to Examine Judiciary Efficiency in European Countries

by Jan Hunady and

2025, The Economic Research Guardian

Panel data, also known as longitudinal data, is collected and analysed across various research areas. This type of data consists of statistical objects that are periodically observed over time. In comparison to cross-sectional data, there...

Automatic Clustering of Macroseismic Intensity Data Points from Internet Questionnaires: Efficiency of the Partitioning around Medoids (PAM)

by Daniel AMORESE

2025, Seismological Research Letters

Tables of earthquakes and clustering results, maps of questionnaire results, an animation showing the evolution of the Barcelonnette event questionnaire clustering with time, and figures showing clustering comparisons (zipped archive).

Performance evaluation of compromise conditional Gaussian networks for data clustering

by Pedro Larrañaga

2025, International Journal of Approximate Reasoning

This paper is devoted to the proposal of two classes of compromise conditional Gaussian networks for data clustering as well as to their experimental evaluation and comparison on synthetic and real-world databases. According to the...

Dimensionality reduction in unsupervised learning of conditional Gaussian networks

by Pedro Larrañaga

2025, IEEE Transactions on Pattern Analysis and Machine Intelligence

Abstract: This paper introduces a novel enhancement for unsupervised learning of conditional Gaussian networks that benefits from feature selection. Our proposal is based on the assumption that, in the absence of labels reflecting the...

A comparison of various approaches for using probabilistic dependencies in language modeling

by Peter Bruza

2025

This version may not include final proof corrections and does not include published layout or pagination. Citation for the version of the work held in 'OpenAIR@RGU': BRUZA, P. D. and SONG, D., 2003. A comparison of various approaches for...

Unsupervised feature extraction based on uncorrelated approach

by Jayashree Jayashree

2025

Uncorrelated Neighborhood Preserving Embedding (UNPE)

An Analysis of Particle Swarm Optimization with Data Clustering Technique for Optimization in Data Mining

by amreen khan

2025

Data clustering is an approach for automatically finding classes, concepts, or groups of patterns. It also aims at representing large datasets by a small number of prototypes or clusters. It brings simplicity in modelling data and plays an...

Spectral feature selection for supervised and unsupervised learning

by Huan Liu

2025, Proceedings of the 24th international conference on Machine learning

Feature selection aims to reduce dimensionality for building comprehensible learning models with good generalization performance. Feature selection algorithms are largely studied separately according to the type of learning: supervised or...

Quantifying Features Using False Nearest Neighbors: An Unsupervised Approach

by Huan Liu

2025, 2011 IEEE 23rd International Conference on Tools with Artificial Intelligence

Real-world datasets commonly present high dimensional data, which means an increased amount of information. However, this does not always imply an improvement in learning technique performance. Furthermore, some features may be correlated...

Grid Clustering With Genetic Algorithm and Tabu Search Process

by gautam garai

2025, Journal of pattern recognition research

In this paper we have presented an effective hybrid genetic algorithm for solving clustering problems with multi-dimensional grid structure. The algorithm is basically a combination of Genetic Algorithm (GA) and Tabu Search (TS) so that...

Parallel implementation of vision algorithms on workstation clusters

by Philip McKinley

2025, Proceedings of the 12th IAPR International Conference on Pattern Recognition (Cat. No.94CH3440-5)

Parallel implementations of two computer vision algorithms on distributed cluster platforms are described. The first algorithm is a square-error data clustering method whose parallel implementation is based on the well-known sequential...

Large-scale parallel data clustering

by Philip McKinley

2025, IEEE Transactions on Pattern Analysis and Machine Intelligence

Algorithmic enhancements are described that enable large computational reduction in mean square-error data clustering. These improvements are incorporated into a parallel data-clustering tool, P-CLUSTER, designed to execute on a network...

On the learnability of discrete distributions

by Linda Sellie

2025, Proceedings of the twenty-sixth annual ACM symposium on Theory of computing - STOC '94

A Novel Benchmark K-Means Clustering on Continuous Data

by Sankara Prasanna Kumar Mahankali

2025, International Journal on Computer Science and Engineering

Cluster analysis is one of the prominent techniques in the field of data mining, and k-means is one of the best-known and most popular partition-based clustering algorithms. The k-means clustering algorithm is widely used in clustering. The...

A Framework for Clustering Categorical Time-Evolving Data

by fuyuan cao

2025, IEEE Transactions on Fuzzy Systems

A fundamental assumption often made in unsupervised learning is that the problem is static, i.e., the description of the classes does not change with time. However, many practical clustering tasks involve changing environments. It is...

Mesoscopic analysis of networks: Applications to exploratory analysis and data clustering

by Sergio Gomez

2025, Chaos: An Interdisciplinary Journal of Nonlinear Science

We investigate the adaptation and performance of modularity-based algorithms, designed in the scope of complex networks, to analyze the mesoscopic structure of correlation matrices. Using a multiresolution analysis, we are able to...

A modified fuzzy possibilistic C-means for context data clustering toward efficient context prediction

by Mohamed Salah

2025

Context prediction is useful for energy saving and hence eco-efficient context-aware service by increasing the interval of context sensing. One way of predicting context is to recognize context patterns in an accurate manner....

Logo identification algorithm for TV Internet

by V. Mosorov

2025

Content inappropriate for children on Internet television is a serious problem in today's multimedia world. There are numerous methods which are used to control the content of the transmitted television programmes. However, these...

Visualising high-dimensional state spaces with "Tuple Plots"

by Susan Stepney

2025, arXiv (Cornell University)

Complex systems are described with high-dimensional data that is hard to visualise. Inselberg's parallel coordinates are one representation technique for visualising high-dimensional data. Here we generalise Inselberg's approach, and use...

Flying Ad Hoc Networks (FANETs): A Review of Mobility Models

by Sandeep Raj

2025, Journal of emerging technologies and innovative research

Flying ad hoc networks (FANETs) are becoming a promising solution for various application scenarios involving unmanned aerial vehicles, such as urban surveillance or search and rescue missions. However,...

Scalable clustering of categorical data and applications

by Periklis Andritsos

2025

Clustering is widely used to explore and understand large collections of data. In this thesis, we introduce LIMBO, a scalable hierarchical categorical clustering algorithm based on the Information Bottleneck (IB) framework for quantifying the relevant information preserved when clustering. As a hierarchical algorithm, LIMBO can produce clusterings of different sizes in a single execution. We also define a distance measure for categorical tuples and values of a specific attribute. Within this framework, we define a heuristic for discovering candidate values for the number of meaningful clusters. Next, we consider the problem of database design, which has been characterized as a process of arriving at a design that minimizes redundancy. Redundancy is measured with respect to a prescribed model for the data (a set of constraints). We consider the problem of doing database redesign when the prescribed model is unknown or incomplete. Specifically, we consider the problem of finding structural clues in a data instance, which may contain errors, missing values, and duplicate records. We propose a set of tools based on LIMBO for finding structural summaries that are useful in characterizing the information content of the data. We study the use of these summaries in ranking functional dependencies based on their data redundancy. We also consider a different application of LIMBO, that of clustering software artifacts. The majority of previous algorithms for this problem utilize structural information in order to decompose large software systems. Other approaches using non-structural information, such as file names or ownership information, have also demonstrated merit. We present an approach that combines structural and non-structural information in an integrated fashion. We apply LIMBO to two large software systems, and the results indicate that this approach produces valid and useful clusterings.
Finally, we present a set of weighting schemes that specify objective assignments of importance to the values of a data set. We use well established weighting schemes from information retrieval, web search and data clustering to assess the importance of whole attributes and individual values.

A new method of fuzzy clustering by using the combination of the firefly algorithm and the particle swarm optimization algorithm

by Seyed javad

2025

Fuzzy clustering algorithm is one of the data mining methods that is applied in different fields. According to the fuzzy clustering algorithm, each object is allocated to the clusters regarding its percentage of belonging to each of the...

Rough Set Theory Approach for Classifying Multimedia Data

by SUHAILAN SAFEI

2025, Communications in Computer and Information Science

The huge size of multimedia data requires efficient data classification and organization to provide effective multimedia data manipulation. Those valuable data must be captured and stored for potential purposes. One of the main...

Level of Community Satisfaction on Operational Performance of the 1st Provincial Mobile Force Company on Anti-Criminality Campaign in Basilan

by IJMRAP Editor

2025, IJMRAP

This study examined the level of community satisfaction with the operational performance of the 1st Provincial Mobile Force Company (PMFC) in Basilan concerning its anti-criminality campaign. Using a descriptive research design, data were...

Improved Crisp and Fuzzy Clustering Techniques for Categorical Data

by Anirban Mukhopadhyay

2025

Clustering is a widely used technique in data mining applications for discovering patterns in underlying data. Most traditional clustering algorithms are limited in handling datasets that contain categorical attributes. However, datasets...

A concurrent architecture for serializable production systems

by José Amaral

2025, IEEE Transactions on Parallel and Distributed Systems

Supporting knowledge exploration and discovery in multi-dimensional data with interactive multiscale visualisation

by Tim Simpson

2025, Journal of Engineering Design

Knowledge discovery in multi-dimensional data is a challenging problem in engineering design. For example, in trade space exploration of large design data sets, designers need to select a subset of data of interest and examine data from...

Can we constrain the neutron-star equation of state from QPO observations?

by Valeria Ferrari

2025, arXiv (Cornell University)

We develop a new method to measure neutron star parameters and derive constraints on the equation of state of dense matter by fitting the frequencies of simultaneous Quasi Periodic Oscillation modes observed in the X-ray flux of accreting...

Natural-Inspired Data Clustering: A Hybridization Between Ant Clustering and Particle Swarm Optimization

by Mohammed ElSaid Ibrahim El-Telbany

2025

The clustering algorithms have evolved over the last decade. With the continuous success of natural inspired algorithms in solving many engineering problems, it is imperative to scrutinize the success of these methods applied to data...
