Academia.edu

Interestingness Measures

52 papers
93 followers
About this topic
Interestingness measures are quantitative metrics used in data mining and knowledge discovery to evaluate the novelty, relevance, or significance of patterns, relationships, or information within datasets. These measures help identify findings that are not only statistically valid but also meaningful and engaging to users or stakeholders.

Key research themes

1. How can a-posteriori novelty and uncommonness metrics be refined to better assess design creativity across heterogeneous idea sets?

This theme investigates the measurement of novelty or interestingness in design ideas generated during creative processes, focusing on a-posteriori methods where novelty is assessed relative to the current idea set. Key issues include limitations of existing metrics like the Shah Novelty Metric (SNM) in handling heterogeneous attribute sets, and the integration of multiple uncommonness perspectives to yield comprehensive novelty assessments. Refinements address empirical applicability to real, heterogeneous design outcomes and aim to provide more reliable, nuanced metrics to quantify creativity in engineering and design studies.

Key finding: This paper identifies critical limitations of the widely used Shah Novelty Metric (SNM) when applied to idea sets with heterogeneous numbers of attributes, a realistic condition in creative design scenarios. It proposes a...
Key finding: This study synthesizes three primary families of a-posteriori novelty metrics (weighted uncommonness, overall uncommonness, and uncommonness across groups) and proposes an integrated assessment framework that concurrently...

2. What objective interestingness measures improve pattern selection and summarization in dynamic and streaming data mining scenarios?

This theme centers on developing and applying objective interestingness metrics for pattern mining, especially in contexts where massive, dynamic, or streaming data generate a continuously growing set of patterns or association rules. The focus is on post-processing and online summarization techniques that can efficiently identify patterns of high value or surprisingness for users without heavy domain-dependent input. Approaches integrate information theory, data compression principles, and classification models to manage scalability while enhancing pattern relevance, including in knowledge graphs and evolving networks.

Key finding: This work formulates the interestingness selection of streaming association rules as a benefit-maximizing classification problem, allowing for personalized, user-specific models of what constitutes an interesting association...
Key finding: This paper presents a novel framework for subjective interestingness in online summarization of dynamic graphs, where the summary is incrementally provided to an analyst relative to their current knowledge. Using a maximum...
Key finding: The paper introduces 'Surprisingness', an objective multivariate interestingness measure designed for pattern mining in complex data such as knowledge graphs represented as directed labeled hypergraphs. It identifies...

3. How do alternative interestingness, correlation, and similarity measures enhance association rule mining and recommender system performance?

This theme explores the development and evaluation of novel interestingness and similarity measures to improve the quality and relevance of association rules in data mining and enhance recommendation accuracy in recommender systems. It emphasizes quantifying meaningful positive or negative item correlations beyond traditional support-confidence frameworks, addressing cold-start problems, and incorporating user preferences for classification and filtering. The studies rigorously test new metrics against established baselines using real-world data sets, highlighting improved interpretability, statistical significance, and predictive power.

Key finding: This paper critiques the confidence measure's inability to capture correlation directionality in association rules and proposes integrating lift directly into the mining algorithm as a criterion (Lift-Based Algorithm). The...
Key finding: This comprehensive study analyzes 20 interestingness measures for association rules through formal properties and extensive experiments on ten data sets. It classifies metrics according to user preferences and data...
Key finding: The paper proposes a novel method to incorporate user subjectivity into Content-Based Image Retrieval (CBIR) systems by integrating relevance feedback mechanisms that reconcile low-level visual features with high-level...
Key finding: This study develops five novel item-based similarity measures designed to improve recommendation accuracy and address cold-start problems in collaborative filtering systems. The proposed metrics outperform established...
Key finding: This paper introduces a novel, preference-driven quality measure for classification that incorporates user-defined relative importance weights for multiple classes and is applicable to multi-class problems. Unlike traditional...
Key finding: The authors identify significant limitations in existing association rule objectives that inadequately capture positive and negative correlations, especially in multi-variable contexts and scenarios affected by Simpson's...
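
The point about confidence and lift in this theme can be illustrated with a small sketch. It uses only the standard textbook definitions of support, confidence, and lift, not the Lift-Based Algorithm or any other method from the papers listed here. The toy transactions and the function names `support`, `confidence`, and `lift` are assumptions made up for illustration.

```python
from fractions import Fraction

# Hypothetical toy data: 10 market-basket transactions (illustrative only).
transactions = (
    [{"tea", "coffee"}] * 3
    + [{"tea"}] * 1
    + [{"coffee"}] * 5
    + [{"milk"}] * 1
)

def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    hits = sum(1 for t in transactions if items <= t)
    return Fraction(hits, len(transactions))

def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

def lift(transactions, antecedent, consequent):
    """Confidence divided by the consequent's baseline support.

    A value above 1 suggests positive correlation, below 1 negative
    correlation, and exactly 1 independence.
    """
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )

conf_tc = confidence(transactions, {"tea"}, {"coffee"})  # 3/4
lift_tc = lift(transactions, {"tea"}, {"coffee"})        # 15/16
print(f"confidence(tea -> coffee) = {float(conf_tc):.4f}")
print(f"lift(tea -> coffee)       = {float(lift_tc):.4f}")
```

In this toy data the rule tea -> coffee has a confidence of 0.75, which looks strong, yet its lift is below 1 because coffee already appears in 80% of all transactions. Confidence alone therefore hides a slightly negative correlation, which is the kind of limitation these studies discuss.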

All papers in Interestingness Measures

I first thank Mr. André TOTOHASINA, Full Professor at the University, for proposing this topic to me and for supporting me through every stage of this thesis. His advice, his many recommendations, and his constant support...
Social media platforms have become an important part of our daily lives due to the widespread use of the Internet. They contain a great wealth of valuable information, which provides opportunities for us to explore hidden patterns or...
In the literature, the properties of several interestingness measures have been analyzed and a framework has been proposed for selecting the right interestingness measure for extracting association rules. As rare association rules contain...
Stability of the implication intensity in statistical implicative analysis (A.S.I.) and comparisons with other association rule quality indices
Knowledge discovery from data aims to discover knowledge from large quantities of data, using automatic or semi-automatic learning methods, and the industrial or...
Measuring the quality of knowledge is a key step in an association rule discovery process. In this article, we present IPEE, a rule quality index that has the unique feature of combining the two...
Scientific datasets often consist of complex data types such as images. Mining such data presents interesting issues related to semantics. In this paper, we explore the research issues in mining data from the field of nanotechnology. More...
Pattern mining usually results in huge numbers of patterns, of which only a small percentage is interesting. In this paper, Surprisingness (including Surprisingness_I and Surprisingness_II) is proposed as an innovative objective...
Due to the increase in data mining research and applications, selecting interesting rules from a huge number of learned rules is an important task in data mining applications. In this paper, the metrics for the interestingness of a...
In most real-world domains, the benefits and costs of classifications can depend on the characteristics of individual examples. In such cases, there is no static benefit matrix available in the domain and each classification...
In a typical application of association rule learning from market basket data, a set of transactions for a fixed period of time is used as input to rule learning algorithms. For example, the well-known Apriori algorithm can be applied to...
Background: Bronchial asthma generally occurs when the bronchi become inflamed and over-respond to a stimulus, causing narrowing of the airways. Bronchial asthma is characterized by...
Generalized rule discovery is a rule discovery framework that subsumes association rule discovery and the type of search employed to find individual rules in classification rule discovery. This new rule discovery framework escapes the...
In data mining it is usually desirable that discovered knowledge have some characteristics such as being as accurate as possible, comprehensible and surprising to the user. The vast majority of data mining algorithms produce, as part of...
In the last few years, the data mining community has proposed a number of objective rule interestingness measures to select the most interesting rules, out of a large set of discovered rules. However, it should be recalled that objective...
The immense volume of web usage data that exists on web servers can be mined to generate association rules that contain the information about website visitor interests, which can then be utilized for enhancing the website effectiveness,...
The work aims to discover frequent patterns by generating candidates, to frame association rules, and then to retain only the efficient rules based on various rule interestingness measures. As all of these require heavy...
Association rule mining is a popular and well-researched method for discovering interesting relations between variables in large databases, and it is one of the most important tasks in data mining. The generated strong association...
Measures of interestingness play a crucial role in association rule mining. An important methodological problem, on which several papers have appeared in the literature, is to provide a reasonable classification of the measures. In this paper,...
Association rule mining of web usage log files can be used to extract patterns of website visitors' behavior. This knowledge can then be utilized to enhance web marketing strategies or improve the web browsing experience. In this...
The immense volume of web usage data that exists on web servers contains potentially valuable information about the behavior of website visitors. This information can be exploited in various ways, such as enhancing the effectiveness of...
Web usage log files generated on web servers contain a huge amount of information that can be used for discovering web usage association rules, which can potentially give useful knowledge to web usage data analysts. Association rule...
Automatic discovery of web usage association rules is commonly used to extract knowledge about website visitors' interests. Its drawback is that it generates too many rules that are not truly interesting and that have high statistical...
Recently, a number of learning algorithms have been adapted for label ranking, including instance-based and tree-based methods. In this paper, we continue this line of work by proposing an adaptation of association rules for label ranking...
User preferences are very important in guiding the data miner, which is why they are integrated into the mining process, where they are coupled with association rule mining (ARM) algorithms to select only...
Association rule mining is one of the most relevant data mining techniques in modern society; it aims to extract interesting correlations and relations among sets of items or products in large transactional databases. The huge...
The search for interesting association rules is an important and active area of data mining. Since the algorithms used in knowledge discovery from data (ECD in the French original) tend to generate a number...
With the rapid development of computer networks and information technology, attackers have taken advantage of the situation to launch complicated cyberattacks. Such cyberattacks cause many problems among the...
Fig. 1. Overview of a SOMFlow clustering graph that was created during our expert study to analyze speech intonation: First, a gender effect is identified (A) and removed using a domain-specific semitone normalization (B). The analyst...
The assessment of the interestingness of sequential rules (generally temporal rules) is a crucial problem in sequence analysis. Due to their unsupervised nature, frequent pattern mining algorithms commonly generate a huge number of rules....
Web usage mining makes use of association rule mining to discover interesting patterns, identify web user behavior, predict web user expectations, and improve business strategy. Association rule mining is a data mining technique...
In this digital age, organizations have to deal with huge amounts of data, sometimes called Big Data. In recent years, the volume of data has increased substantially. Consequently, finding efficient and automated techniques for...
Algorithmic probability is traditionally defined by considering the output of a universal machine fed with random programs. This definition proves inappropriate for many practical applications where probabilistic assessments are...
Mobile phones are pervasive, moderately specialized devices with capable processing power, packed into small components that can perform efficient and powerful calculations. One of the components built into the...
Assessing rules with interestingness measures is the cornerstone of successful applications of association rule discovery. However, there exists no information-theoretic measure which is adapted to the semantics of association rules. In...
A study of sculptures and puzzles resulting from splitting lengthwise tori, Moebius bands, various knots, and graphs, illustrated with many models made on rapid prototyping machines.
Association rule discovery is a technique widely used for various objectives. One is Classification Based on Associations (CBA) with Class Association Rules (CARs). The number of rules discovered from data is extremely high with...
This paper presents a preprocessing step in mining association rules which uses tables to summarize synthetically the way variables interact by highlighting any zones which are attractive. Attractive zones are those which guarantee that...
The development of good measures of interestingness of the discovered rules is one of the important problems in data mining. Such measures of interestingness are divided into objective measures: those that depend only on the structure of...
Data mining algorithms, especially those used for unsupervised learning, generate a large quantity of rules. In particular this applies to the Apriori family of algorithms for the determination of association rules. It is hence impossible...
Data mining is the efficient discovery of patterns in large databases, and classification rules are perhaps the most important type of patterns in data mining applications. However, the number of such classification rules is generally too...