Information granulation: A search for data structures
2001
Abstract
Revealing a structure in data is of paramount importance in a broad range of information-processing problems. Regardless of the specific problem in which such analysis is carried out, there is an evident commonality across all these pursuits worth emphasizing. One can distinguish between a core and a residual of the data structure. In this study, we propose a formal environment supporting these concepts and develop its algorithmic fabric.
FAQs
What distinguishes core data structures from residual ones in classification tasks?
The study identifies core data as dense, geometrically well-defined clusters, while residual data is more dispersed and complex. This distinction aligns with classification difficulty: core patterns are easier to classify than residual mixtures.
How does Tchebyschev distance improve clustering transparency over Euclidean distance?
The use of the Tchebyschev (l∞) distance yields equidistant contours that are box-like, supporting clearly decomposable, coordinate-wise relations in the data. This helps define sharp boundaries for the resulting structures, unlike the spherical contours produced by the Euclidean distance.
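The contrast between the two distances can be made concrete with a small sketch (not from the paper, just an illustration): points on the corner and on the edge of a unit box are equally far from the prototype under the Tchebyschev distance, but not under the Euclidean one.

```python
import numpy as np

def chebyshev(x, v):
    """l_inf (Tchebyschev) distance: the largest coordinate-wise deviation."""
    return float(np.max(np.abs(np.asarray(x) - np.asarray(v))))

def euclidean(x, v):
    """Ordinary l_2 distance."""
    return float(np.sqrt(np.sum((np.asarray(x) - np.asarray(v)) ** 2)))

v = [0.0, 0.0]          # prototype
corner = [1.0, 1.0]     # corner of the unit box around v
edge = [1.0, 0.0]       # midpoint of one of its faces

# Both points lie on the same l_inf contour (a box) around v...
print(chebyshev(corner, v), chebyshev(edge, v))   # 1.0 1.0
# ...but on different l_2 contours (circles):
print(euclidean(corner, v), euclidean(edge, v))   # ~1.414 1.0
```

The box-shaped contours are exactly why l∞-based clusters translate directly into hyperbox descriptions.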
What process enhances prototype optimization in FCM using Tchebyschev distance?
The paper details a gradient-based optimization approach augmented by a multi-valued predicate to determine prototypes under Tchebyschev distance constraints. This method aims to avoid gradient zeroing issues that hinder traditional optimization techniques.
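For orientation, the standard FCM update equations (Bezdek) that the paper's gradient-based l∞ variant departs from can be sketched as follows. This is the classical Euclidean-distance scheme, not the paper's Tchebyschev-distance procedure, which replaces these closed-form updates with a gradient method and a multi-valued predicate.

```python
import numpy as np

def fcm_memberships(X, V, m=2.0):
    """Standard FCM membership update:
    u_ik = 1 / sum_j (d_ik / d_jk)^(2/(m-1)), with Euclidean distances d."""
    d = np.linalg.norm(X[:, None, :] - V[None, :, :], axis=2)  # (N, c)
    d = np.fmax(d, 1e-12)  # guard against division by zero at a prototype
    ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=2)

def fcm_prototypes(X, U, m=2.0):
    """Standard FCM prototype update: mean of X weighted by u^m."""
    W = U ** m
    return (W.T @ X) / W.sum(axis=0)[:, None]
```

Under the l∞ distance the max operation makes the objective non-smooth, which is why a closed-form prototype update of this kind is no longer available and the gradient-based construction described above is needed.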
What classification rules emerge from granular data modeling using l∞ clustering?
The paper establishes a classification rule based on membership grades derived from l∞ clustering, resulting in more complex boundaries outside the core clusters. The required classifier complexity increases significantly when addressing residual patterns.
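A minimal sketch of such a membership-grade rule, assuming FCM-style grades computed with the l∞ distance; the `threshold` parameter separating core from residual patterns is a hypothetical choice for illustration, not a value from the paper.

```python
import numpy as np

def linf_memberships(X, V, m=2.0):
    """FCM-style membership grades using the l_inf (Tchebyschev) distance."""
    d = np.max(np.abs(X[:, None, :] - V[None, :, :]), axis=2)  # (N, c)
    d = np.fmax(d, 1e-12)
    ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))
    return 1.0 / ratio.sum(axis=2)

def classify(X, V, threshold=0.8, m=2.0):
    """Assign each pattern to its highest-membership cluster; patterns whose
    top grade falls below `threshold` are flagged as residual (label -1)."""
    U = linf_memberships(X, V, m)
    labels = U.argmax(axis=1)
    labels[U.max(axis=1) < threshold] = -1
    return labels
```

A pattern lying midway between two prototypes receives grades near 0.5 for both and is flagged as residual, illustrating why residual patterns demand a more complex classifier.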
What is the practical implication of hyperboxes in data classification?
The analysis demonstrates that hyperboxes serve as effective geometric representations for core data, leading to simpler classifier requirements. Conversely, residual data presents challenges that necessitate more sophisticated, higher complexity classifiers.
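The hyperbox view admits a very simple core-membership test, since a hyperbox is just the l∞ ball around a prototype. A sketch under that assumption (the radius `r` is an illustrative parameter, not a value from the paper):

```python
import numpy as np

def hyperbox_from_prototype(v, r):
    """A hyperbox is the l_inf ball of radius r around prototype v:
    the axis-aligned box [v - r, v + r] in every coordinate."""
    v = np.asarray(v, dtype=float)
    return v - r, v + r

def in_hyperbox(x, lower, upper):
    """Core test: a pattern belongs to the hyperbox iff every coordinate
    lies within the [lower, upper] bounds."""
    x = np.asarray(x)
    return bool(np.all(lower <= x) and np.all(x <= upper))
```

Membership reduces to a few coordinate comparisons, which is why core data described by hyperboxes needs only a simple classifier, while anything falling outside every box is left to a higher-complexity residual model.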
References (8)
- J.C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, 1981.
- L. Bobrowski, J.C. Bezdek, C-Means clustering with the l1 and l∞ norms, IEEE Trans. on Systems, Man and Cybernetics, 21, 1991, 545-554.
- D. Dubois, H. Prade, Fuzzy relation equations and causal reasoning, Fuzzy Sets and Systems, 75, 1995, 119-134.
- P.J.F. Groenen, K. Jajuga, Fuzzy clustering with squared Minkowski distances, Fuzzy Sets and Systems, 120, 2001, 227-237.
- W. Pedrycz, F. Gomide, An Introduction to Fuzzy Sets, MIT Press, Cambridge, MA, 1998.
- W. Pedrycz, Fuzzy Control and Fuzzy Systems, RSP/Wiley, 1989.
- L.A. Zadeh, Toward a theory of fuzzy information granulation and its centrality in human reasoning and fuzzy logic, Fuzzy Sets and Systems, 90, 1997, 111-117.
- L.A. Zadeh, From computing with numbers to computing with words: from manipulation of measurements to manipulation of perceptions, IEEE Trans. on Circuits and Systems, 45, 1999, 105-119.