Books by Sergei Kuznetsov
Artificial Intelligence: Methodology, Systems, and Applications - 16th International Conference, AIMSA 2014, Varna, Bulgaria, September 11-13, 2014. Proceedings
Papers by Sergei Kuznetsov

arXiv (Cornell University), Nov 14, 2011
Biclustering numerical data became a popular data-mining task in the early 2000s, especially for analysing gene expression data. A bicluster reflects a strong association between a subset of objects and a subset of attributes in a numerical object/attribute data table. So-called biclusters of similar values can be thought of as maximal sub-tables with close values. Only a few methods address a complete, correct and non-redundant enumeration of such patterns, which is a well-known intractable problem, and no formal framework exists. In this paper, we introduce important links between biclustering and formal concept analysis. More specifically, we show that Triadic Concept Analysis (TCA) provides a nice mathematical framework for biclustering. Interestingly, existing TCA algorithms, which usually apply to binary data, can be used (directly or with slight modifications) after a preprocessing step to extract maximal biclusters of similar values.
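To make "maximal sub-tables with close values" concrete, here is a minimal sketch. It assumes that "close" means all selected cells differ pairwise by at most a threshold theta; the threshold, the function names and the toy table are illustrative assumptions, not taken from the paper, and the sketch does not implement the TCA-based enumeration itself.

```python
from itertools import product


def is_bicluster_of_similar_values(table, rows, cols, theta):
    """True if all cells in rows x cols differ pairwise by at most theta.

    Assumes rows and cols are non-empty sets of indices.
    """
    values = [table[r][c] for r, c in product(rows, cols)]
    return max(values) - min(values) <= theta


def is_maximal(table, rows, cols, theta):
    """True if the sub-table is a bicluster and no row or column can be added."""
    if not is_bicluster_of_similar_values(table, rows, cols, theta):
        return False
    for r in range(len(table)):
        if r not in rows and is_bicluster_of_similar_values(
            table, rows | {r}, cols, theta
        ):
            return False
    for c in range(len(table[0])):
        if c not in cols and is_bicluster_of_similar_values(
            table, rows, cols | {c}, theta
        ):
            return False
    return True


# Hypothetical toy table: 4 objects (rows) x 5 attributes (columns).
table = [
    [1, 2, 2, 1, 6],
    [2, 1, 1, 0, 6],
    [2, 2, 1, 7, 6],
    [8, 9, 2, 6, 7],
]

print(is_maximal(table, {0, 1, 2}, {0, 1, 2}, theta=1))
print(is_maximal(table, {0, 1}, {0, 1, 2}, theta=1))
```

Checking one candidate is cheap; the hard part the abstract refers to is enumerating all maximal candidates without redundancy.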
The five preceding editions of the FCA4AI Workshop showed that many researchers working in Artificial Intelligence are deeply interested in a well-founded method for classification and mining such as Formal Concept Analysis (see http://www.fca4ai.hse.ru/). The first edition of FCA4AI was co-located with ECAI 2012 in Montpellier, the second with IJCAI 2013 in Beijing, the third with ECAI 2014 in Prague, the fourth with IJCAI 2015 in Buenos Aires, and the fifth with ECAI 2016 in The Hague. All the proceedings of the preceding editions are published as CEUR Proceedings (
HAL (Le Centre pour la Communication Scientifique Directe), Nov 6, 2020

(G, M, I) is called a formal context, where G (Gegenstände) and M (Merkmale) are sets and I ⊆ G × M is a binary relation between G and M. The elements of G are the objects, the elements of M are the attributes, and I is the incidence relation of the context (G, M, I).
The derivation operators and the Galois connection: The derivation operators establish a Galois connection between the power sets ℘(G) and ℘(M) (and thereby a dual isomorphism between two closure systems). A Galois connection is defined as follows. Let P and Q be ordered sets. A pair of maps φ : P → Q and ψ : Q → P is called a Galois connection between P and Q if:
(i) p₁ ≤ p₂ ⇒ φ(p₁) ≥ φ(p₂)
(ii) q₁ ≤ q₂ ⇒ ψ(q₁) ≥ ψ(q₂)
(iii) p ≤ ψ ∘ φ(p) and q ≤ φ ∘ ψ(q)
Reduced labeling: each attribute is displayed "at the highest" concept that has it, and each object "at the lowest" concept that contains it.
Types of attributes:
Introducing an attribute: an attribute α is introduced in a concept C when it is not present in any ascendant (super-concept) of C, i.e. the concept C corresponds to the attribute concept of α (sometimes called the introducer of α).
Inheriting an attribute: an attribute α is inherited by a concept C when it is already present in an ascendant of C, i.e. C is lower in the lattice order than the attribute concept (introducer) of α.
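The definitions above can be checked on a small example. The following sketch assumes a hypothetical three-object context (the object and attribute names are invented for illustration) and uses brute force over all subsets, so it is only suitable for toy data.

```python
from itertools import chain, combinations

# Hypothetical toy context (G, M, I); names are illustrative only.
G = {"duck", "frog", "dog"}
M = {"flies", "swims", "has_legs"}
I = {
    ("duck", "flies"), ("duck", "swims"), ("duck", "has_legs"),
    ("frog", "swims"), ("frog", "has_legs"),
    ("dog", "has_legs"),
}


def intent(objects):
    """Derivation on subsets of G: attributes shared by all given objects."""
    return {m for m in M if all((g, m) in I for g in objects)}


def extent(attributes):
    """Derivation on subsets of M: objects having all given attributes."""
    return {g for g in G if all((g, m) in I for m in attributes)}


def powerset(items):
    """All subsets of items, as sets."""
    items = sorted(items)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return [set(s) for s in subsets]


def is_galois_connection():
    """Check conditions (i)-(iii) for intent/extent by brute force."""
    for a1 in powerset(G):
        if not a1 <= extent(intent(a1)):          # (iii), object side
            return False
        for a2 in powerset(G):
            if a1 <= a2 and not intent(a2) <= intent(a1):  # (i)
                return False
    for b1 in powerset(M):
        if not b1 <= intent(extent(b1)):          # (iii), attribute side
            return False
        for b2 in powerset(M):
            if b1 <= b2 and not extent(b2) <= extent(b1):  # (ii)
                return False
    return True


def concepts():
    """All formal concepts (A, B) with A' = B and B' = A."""
    return {
        (frozenset(extent(b)), frozenset(intent(extent(b))))
        for b in powerset(M)
    }


def attribute_concept(m):
    """The introducer of attribute m: the highest concept whose intent has m."""
    ext = extent({m})
    return ext, intent(ext)


print(is_galois_connection())
print(len(concepts()))
print(attribute_concept("swims"))
```

In this toy context the attribute concept of "swims" also carries "has_legs" in its intent, but "has_legs" is inherited there: it is introduced higher up, in the top concept.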
These are the proceedings of the seventh edition of the FCA4AI workshop (http://www.fca4ai.hse.ru/) co-located with the IJCAI 2019 Conference in Macao (China). Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at classification and knowledge discovery that can be used for many purposes in Artificial Intelligence (AI). The objective of the FCA4AI workshop is to investigate two main issues: how FCA can support various AI activities (knowledge discovery, knowledge engineering, machine learning, data mining, information retrieval, recommendation, ...), and how FCA can be extended in order to help AI researchers solve new and complex problems in their domains.
Lecture Notes in Computer Science, 2011
Concept lattices are mathematical structures useful for many tasks in knowledge discovery and management. A concept lattice is basically obtained from binary data encoding which attributes belong to which objects. Dealing with complex data raises the important problem of discretization and the associated loss of information. To avoid discretization, (i) pattern structures and (ii) symbolic data analysis provide means to analyze such complex data directly. We compare these two approaches and show how they are mutually beneficial.
Lecture Notes in Computer Science, 2015
This article presents recent advances in Formal Concept Analysis (2010-2015), especially for dealing with complex data (numbers, graphs, sequences, etc.) in domains such as databases (functional dependencies), data mining (local pattern discovery), information retrieval and information fusion. As these advances are mainly published in artificial intelligence and FCA-dedicated venues, a dissemination towards data mining and machine learning is worthwhile.
In this paper, we investigate the problem of mining numerical data in the framework of Formal Concept Analysis. The usual way is to use a scaling procedure (transforming numerical attributes into binary ones), leading either to a loss of information or of efficiency, in particular w.r.t. the volume of extracted patterns. By contrast, we propose to work directly on numerical data in a more precise and efficient way, and we prove it. For that, the notions of closed patterns, generators and equivalence classes are revisited in the numerical context. Moreover, two original algorithms are proposed and used in an evaluation involving real-world data, showing the advantage of the present approach.
Lecture Notes in Computer Science, 2015
Formal concept analysis (FCA) is a well-founded method for data analysis and has many applications in data mining. Pattern structures are an extension of FCA for dealing with complex data such as sequences or graphs. However, the computational complexity of computing with pattern structures is high, and projections of pattern structures were introduced to simplify computation. In this paper we introduce o-projections of pattern structures, a generalization of projections which defines a wider class of projections preserving the properties of the original approach. Moreover, we show that o-projections form a semilattice and we discuss the correspondence between o-projections and the representation contexts of o-projected pattern structures.
Lecture Notes in Computer Science, 2014
Data mining aims at finding interesting patterns from datasets, where "interesting" means reflecting intrinsic dependencies in the domain of interest rather than just in the dataset. Concept stability is a popular relevancy measure in FCA. Experimental results of this paper show that high stability of a concept for a context derived from the general population suggests that concepts with the same intent in other samples drawn from the population also have high stability. A new estimate of stability is introduced and studied. It is experimentally shown that the introduced estimate gives a better approximation than the Monte Carlo approach introduced earlier.
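For readers unfamiliar with the measure, the sketch below assumes the usual definition of (intensional) stability: the fraction of subsets of a concept's extent whose common attributes are exactly the concept's intent. It shows the exact computation and a plain Monte Carlo estimate by uniform subset sampling. It does not reproduce the new estimate introduced in the paper, and the toy context and function names are illustrative assumptions.

```python
import random
from itertools import combinations

# Hypothetical toy context: object -> set of attributes.
context = {
    "duck": {"flies", "swims", "has_legs"},
    "frog": {"swims", "has_legs"},
    "dog": {"has_legs"},
}
all_attrs = {"flies", "swims", "has_legs"}


def common_attributes(objects):
    """Attributes shared by all given objects (all attributes if none given)."""
    shared = set(all_attrs)
    for g in objects:
        shared &= context[g]
    return shared


def exact_stability(extent, intent):
    """Fraction of subsets of the extent whose common attributes equal the intent."""
    members = sorted(extent)
    hits = 0
    for k in range(len(members) + 1):
        for subset in combinations(members, k):
            if common_attributes(subset) == set(intent):
                hits += 1
    return hits / 2 ** len(members)


def monte_carlo_stability(extent, intent, n_samples=2000, seed=0):
    """Estimate stability by sampling subsets of the extent uniformly."""
    rng = random.Random(seed)
    members = sorted(extent)
    hits = 0
    for _ in range(n_samples):
        # Including each object with probability 1/2 yields a uniform subset.
        subset = [g for g in members if rng.random() < 0.5]
        if common_attributes(subset) == set(intent):
            hits += 1
    return hits / n_samples


extent = {"duck", "frog"}
intent = {"swims", "has_legs"}
print(exact_stability(extent, intent))
print(monte_carlo_stability(extent, intent))
```

The exact computation enumerates 2^|extent| subsets, which is why estimates are needed for concepts with large extents.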
Procedia Computer Science, 2014
There are many measures of the usefulness of patterns in data mining. This paper focuses on the measures used in Formal Concept Analysis (FCA). In particular, concept stability is a popular relevancy measure in FCA. Experimental results of this paper show that high stability of a pattern in a given dataset derived from the general population suggests that the stability of that pattern is high in another dataset derived from the same population. In the second part of the paper, a new estimate of stability is introduced and studied. Its performance is evaluated experimentally, and it is shown to be more efficient.

The first and second editions of the FCA4AI Workshop showed that many researchers working in Artificial Intelligence are indeed interested in a well-founded method for classification and mining such as Formal Concept Analysis (see http://www.fca4ai.hse.ru/). The first edition of FCA4AI was co-located with ECAI 2012 in Montpellier and published as http://ceur-ws.org/Vol-939/, while the second edition was co-located with IJCAI 2013 in Beijing and published as http://ceur-ws.org/Vol-1058/. Based on that, we decided to continue the series and took the chance to organize a new edition of the workshop in Prague at the ECAI 2014 Conference. This year, the workshop has again attracted many researchers working on current and important topics, e.g. recommendation, linked data, classification, biclustering, parallelization, and various applications. This shows the diversity and richness of the relations between FCA and AI. Moreover, this is a good sign for the future, especially for young researchers who are working in this area now or will do so later.
Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classification. FCA allows one to build a concept lattice and a system of dependencies (implications) which can be used for many AI needs, e.g. knowledge discovery, learning, knowledge representation, reasoning, ontology engineering, as well as information retrieval and text processing. As we can see, there are many "natural links" between FCA and AI. Recent years have witnessed increased scientific activity around FCA; in particular, a strand of work has emerged that aims at extending the possibilities of FCA w.r.t. knowledge processing, such as work on pattern structures and relational context analysis. These extensions aim at allowing FCA to deal with data more complex than just binary data, from the data analysis and knowledge discovery points of view as well as from the knowledge representation point of view, including, e.g., ontology engineering. All these investigations provide new possibilities for AI activities in the framework of FCA. Accordingly, in this workshop, we are interested in two main issues:
• How can FCA support AI activities such as knowledge processing (knowledge discovery, knowledge representation and reasoning), learning (clustering, pattern and data mining), natural language processing, and information retrieval?
• How can FCA be extended in order to help AI researchers solve new and complex problems in their domains?
The workshop is dedicated to discussing such issues. This year, the papers submitted to the workshop were carefully peer-reviewed by three members of the program committee and the 11 papers with the highest scores were selected. We thank all the PC members for their reviews and all the authors for their contributions.

Nowadays data sets are available in very complex and heterogeneous ways. The mining of such data collections is essential to support many real-world applications ranging from healthcare to marketing. In this work, we focus on the analysis of "complex" sequential data by means of interesting sequential patterns. We approach the problem using an elegant mathematical framework: Formal Concept Analysis (FCA) and its extension based on "pattern structures". Pattern structures are used for mining complex data (such as sequences or graphs) and are based on a subsumption operation, which in our case is defined with respect to the partial order on sequences. We show how pattern structures along with projections (i.e., a data reduction of sequential structures), are able to enumerate more meaningful patterns and increase the computing efficiency of the approach. Finally, we show the applicability of the presented method for discovering and analyzing interesting patients' patterns from a French healthcare data set of cancer patients. The quantitative and qualitative results are reported in this use case which is the main motivation for this work.

Information Sciences, 2011
This paper addresses the important problem of efficiently mining numerical data with formal concept analysis (FCA). Classically, the only way to apply FCA is to binarize the data through a so-called scaling procedure. This may either involve loss of information, or produce large and dense binary data known to be hard to process. In the context of gene expression data analysis, we propose and compare two FCA-based methods for mining numerical data and we show that they are equivalent. The first one relies on a particular scaling, encoding all possible intervals of attribute values, and uses standard FCA techniques. The second one relies on pattern structures without a priori transformation, and is shown to be more computationally efficient and to provide more readable results. Experiments with real-world gene expression data are discussed and give a practical basis for the comparison and evaluation of the methods.
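The idea of working on numerical data without binarization can be sketched as follows. The sketch assumes interval patterns as the description space: each object is described by a tuple of intervals, and the similarity of two descriptions is the componentwise smallest enclosing interval. The abstract does not spell out these details, so the data, function names and conventions below are illustrative assumptions rather than the paper's algorithms.

```python
from functools import reduce

# Hypothetical numerical table: object -> tuple of attribute values.
data = {
    "g1": (5, 7, 6),
    "g2": (6, 8, 4),
    "g3": (4, 8, 5),
    "g4": (4, 9, 8),
    "g5": (5, 8, 5),
}


def describe(row):
    """Describe an object by one degenerate interval [v, v] per attribute."""
    return tuple((v, v) for v in row)


def meet(d1, d2):
    """Similarity of two descriptions: componentwise enclosing interval."""
    return tuple(
        (min(a1, a2), max(b1, b2)) for (a1, b1), (a2, b2) in zip(d1, d2)
    )


def common_pattern(objects):
    """Most specific interval pattern covering all given objects (non-empty set)."""
    return reduce(meet, (describe(data[g]) for g in sorted(objects)))


def matching_objects(pattern):
    """Objects whose values all fall inside the pattern's intervals."""
    return {
        g
        for g, row in data.items()
        if all(lo <= v <= hi for v, (lo, hi) in zip(row, pattern))
    }


def closure(objects):
    """Close a set of objects: all objects matching their common pattern."""
    return matching_objects(common_pattern(objects))


pattern = common_pattern({"g1", "g2"})
print(pattern)
print(sorted(closure({"g1", "g2"})))
```

A pair such as (closure({"g1", "g2"}), pattern) plays the role of a formal concept here: the pattern is the most specific one shared by the objects, and the objects are all those matching the pattern.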

Formal Concept Analysis (FCA) is a mathematically well-founded theory aimed at data analysis and classification. FCA allows one to build a concept lattice and a system of dependencies (implications) which can be used for many AI needs, e.g. knowledge processing involving learning, knowledge discovery, knowledge representation and reasoning, ontology engineering, as well as information retrieval and text processing. Thus, there exist many "natural links" between FCA and AI. Recent years have witnessed increased scientific activity around FCA; in particular, a strand of work has emerged that aims at extending the possibilities of FCA w.r.t. knowledge processing, such as work on pattern structures and relational context analysis. These extensions aim at allowing FCA to deal with data more complex than just binary data, both from the data analysis and knowledge discovery points of view and from the knowledge representation point of view, including, e.g., ontology engineering. All these works extend the capabilities of FCA and offer new possibilities for AI activities in the framework of FCA. Accordingly, in this workshop, we are interested in two main issues:
• How can FCA support AI activities such as knowledge processing (knowledge discovery, knowledge representation and reasoning), learning (clustering, pattern and data mining), natural language processing, and information retrieval?
• How can FCA be extended in order to help AI researchers solve new and complex problems in their domains?
The workshop is dedicated to discussing such issues. The papers submitted to the workshop were carefully peer-reviewed by two members of the program committee and the 11 papers with the highest scores were selected. We thank all the PC members for their reviews and all the authors for their contributions. We also thank the organizing committee of ECAI-2012 and especially workshop chairs Jérôme Lang and Michèle Sebag for the support of the workshop.
Abstract: Formal Concept Analysis (FCA) is a well-founded mathematical framework used for conceptual classification and knowledge management. Given a binary table describing a relation between objects and attributes, FCA consists in building a set of concepts organized by a subsumption relation within a concept lattice. Accordingly, FCA requires transforming complex data, e.g. numbers, intervals, graphs, into binary data, leading to loss of information and poor interpretability of object classes. In this paper, we propose a pre-processing ...
Information Sciences, 2018
Formal concepts and closed itemsets have proved to be of great importance for knowledge discovery, both as a tool for concise representation of association rules and as a tool for clustering and constructing domain taxonomies and ontologies. Exponential explosion makes it difficult to consider the whole concept lattice arising from data; one needs to select the most useful and interesting concepts. In this paper, interestingness measures of concepts are considered and compared with respect to various aspects, such as efficiency of computation, applicability to noisy data, and performing ranking correlation.

Asian Journal of Economics and Banking, 2020
Purpose: The purpose of this study is to show that closure-based classification and regression models provide both high accuracy and interpretability.
Design/methodology/approach: Pattern structures allow one to approach the knowledge extraction problem in the case of partially ordered descriptions. They provide a way to apply techniques based on closed descriptions to non-binary data. To provide scalability of the approach, the author introduced a lazy (query-based) classification algorithm.
Findings: The experiments support the hypothesis that closure-based classification and regression allow one both to achieve higher accuracy in scoring models than classical banking models and to retain interpretability of model results, whereas black-box methods grant better accuracy at the cost of losing interpretability.
Originality/value: This is original research showing the advantage of closure-based classification and regression models in the banking sphere.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017
Pattern mining is an important task in AI for eliciting hypotheses from the data. When it comes to spatial data, the geo-coordinates are often considered independently as two different attributes. Consequently, rectangular patterns are searched for. Such an arbitrary form is not able to capture interesting regions in general. We thus introduce convex polygons, a good trade-off for capturing high density areas in any pattern mining task. Our contribution is threefold: (i) We formally introduce such patterns in Formal Concept Analysis (FCA), (ii) we give all the basic bricks for mining polygons with exhaustive search and pattern sampling, and (iii) we design several algorithms that we compare experimentally.