Academia.edu

Association Mining

354 papers
6 followers
About this topic
Association mining is a data mining technique used to discover interesting relationships, patterns, or correlations among a set of items in large datasets. It identifies frequent itemsets and generates association rules, which help in understanding the co-occurrence of items and can inform decision-making in various domains.
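The two steps named above, finding frequent itemsets and then deriving association rules from them, can be illustrated with a minimal brute-force sketch. The transactions, thresholds, and function names below are hypothetical and chosen only for illustration; real miners avoid enumerating every candidate itemset.

```python
from itertools import combinations

# Hypothetical toy market-basket data (not taken from any paper listed here).
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force enumeration of all itemsets up to `max_size` items."""
    items = sorted(set().union(*transactions))
    found = {}
    for k in range(1, max_size + 1):
        for combo in combinations(items, k):
            s = support(combo, transactions)
            if s >= min_support:
                found[frozenset(combo)] = s
    return found

def association_rules(freq, min_confidence):
    """Split each frequent itemset into antecedent -> consequent rules."""
    rules = []
    for itemset, s in freq.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                a = frozenset(antecedent)
                # Every subset of a frequent itemset is itself frequent,
                # so freq[a] is guaranteed to exist.
                confidence = s / freq[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, s, confidence))
    return rules

freq = frequent_itemsets(transactions, min_support=0.6)
rules = association_rules(freq, min_confidence=0.8)
for antecedent, consequent, s, c in rules:
    print(sorted(antecedent), "->", sorted(consequent), f"support={s:.2f} confidence={c:.2f}")
```

With these assumed thresholds the only surviving rule is beer -> diapers; lowering `min_confidence` admits more rules.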

Key research themes

1. How can algorithmic scalability and efficiency be improved in frequent itemset discovery for association mining?

This research theme addresses computational challenges in discovering frequent itemsets efficiently from large-scale transactional databases. It explores algorithmic strategies that reduce I/O overhead, manage complex search spaces using structural decompositions, and adapt processing routines dynamically to dataset characteristics. Efficient frequent itemset mining is critical because the exponential search space and repeated data scans severely impact scalability in practical applications.

Key finding: Introduced algorithms (e.g., Eclat, MaxEclat) using a vertical tid-list database format and lattice-theoretic decomposition to partition the search space into manageable sublattices processed in-memory. This method minimizes... Read more
by lan vu
Key finding: Proposed DFEM, an algorithm combining FP-growth and Eclat techniques with a dynamic runtime threshold that adapts its mining strategy to database sparsity and density. DFEM automatically chooses the most efficient mining... Read more
Key finding: Presented DCI, an algorithm that adaptively switches from horizontal counting-based mining to vertical tidlist intersection-based mining as the pruned database shrinks to fit in memory. It includes heuristics adjusting to... Read more
Key finding: Proposed the reverse Apriori algorithm, which enhances the classical Apriori's efficiency by scanning transactions in reverse order and leveraging existing frequent patterns more effectively. This approach demonstrated... Read more
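The vertical tid-list idea mentioned in this theme can be sketched as follows. This is a simplified depth-first miner that computes support by intersecting transaction-id sets; it is an illustrative assumption of how such a search might look, not the lattice decomposition, MaxEclat, DFEM, or DCI procedures summarized above. The data and the names `vertical_format` and `eclat` are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy transactions; list position serves as the transaction id.
transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

def vertical_format(transactions):
    """Map each item to the set of transaction ids (tid-list) containing it."""
    tidlists = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            tidlists[item].add(tid)
    return dict(tidlists)

def eclat(prefix, candidates, min_count, out):
    """Depth-first search; the support of an itemset is the size of its tid-set."""
    for i, (item, tids) in enumerate(candidates):
        if len(tids) < min_count:
            continue  # infrequent itemsets are never extended
        itemset = prefix + (item,)
        out[itemset] = len(tids)
        # Extend with the remaining items by intersecting tid-sets,
        # so the original transactions are not rescanned.
        suffix = [(other, tids & other_tids) for other, other_tids in candidates[i + 1:]]
        eclat(itemset, suffix, min_count, out)
    return out

tidlists = vertical_format(transactions)
frequent = eclat((), sorted(tidlists.items()), min_count=3, out={})
for itemset, count in sorted(frequent.items()):
    print(itemset, count)
```

Once the tid-lists are built, all counting happens through set intersections, which is why this layout needs only one pass over the horizontal data.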

2. How can association rule mining be integrated effectively with classification to improve predictive accuracy and reduce rule redundancy?

This theme investigates combining association rule mining with classification tasks to develop classifiers based on association rules that maintain high predictive accuracy while generating fewer, less redundant rules. The research focuses on integrating itemset generation with rule generation, applying measures like information gain, and filtering rule conflicts within the mining process. These techniques aim to yield compact and interpretable classifiers improving over traditional classification or separate mining-classification pipelines.

Key finding: Presented GARC, a classification algorithm that integrates information gain measures into candidate itemset generation, merges frequent itemset mining with rule generation, and embeds redundancy and conflict avoidance... Read more
Key finding: Reviewed Apriori and hybrid Apriori-TID algorithms, highlighting that hybrid methods combining transaction and itemset information classification can better handle large itemsets and improve classification accuracy. The study... Read more

3. What are the methodological advancements and limitations in interpretability and evaluation of association rule interestingness and common-sense knowledge integration?

This research area focuses on evaluating and improving the measures used to identify meaningful and actionable association rules, including confidence, support, lift, and novel probabilistic or statistical models. It also explores approaches for semantic interpretation of association rules via frameworks like semantic frames and their application in building common-sense knowledge bases, thus enhancing the semantic richness and usability of mined association rules.

Key finding: Developed a simple probabilistic framework modeling transaction data as independent Bernoulli trials to simulate random, no-association data. Using real and simulated datasets, the study showed that confidence is influenced... Read more
Key finding: Proposed a Bayesian statistical framework that replaces the traditional support measure in association rule mining with probabilistic criteria based on posterior probability estimations. This approach addresses limitations... Read more
Key finding: Refactored the RelEx2Frame component of the OpenCog AGI framework by integrating the Drools rule engine and supervised/statistical methods aided by WordNet to expand concept variables. Association mining on semantic frames... Read more
Key finding: Proposed a visualization and clustering method based on conditional probabilities of association rules to help non-technical users interpret large sets of categorical association rules. This approach addresses the rare item... Read more
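To make the measures named in this theme concrete, the sketch below computes support, confidence, and lift for a rule A -> B on data simulated as independent Bernoulli trials, that is, data with no real association. The item probabilities, sample size, seed, and function names are assumptions for illustration and do not reproduce any listed study. Under independence, the confidence of A -> B is expected to sit near the frequency of B, and lift is expected to sit near 1.

```python
import random

def simulate(n_transactions, item_probs, seed=0):
    """Each item enters each transaction independently with its own probability."""
    rng = random.Random(seed)
    return [
        {item for item, p in item_probs.items() if rng.random() < p}
        for _ in range(n_transactions)
    ]

def measures(transactions, a, b):
    """Support, confidence, and lift of the rule a -> b."""
    n = len(transactions)
    n_a = sum(a in t for t in transactions)
    n_b = sum(b in t for t in transactions)
    n_ab = sum(a in t and b in t for t in transactions)
    support = n_ab / n              # estimate of P(A and B)
    confidence = n_ab / n_a         # estimate of P(B | A)
    lift = confidence / (n_b / n)   # estimate of P(B | A) / P(B)
    return support, confidence, lift

# Hypothetical setup: A appears in about 30% of transactions, B in about 60%.
data = simulate(20000, {"A": 0.3, "B": 0.6})
support, confidence, lift = measures(data, "A", "B")
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

A rule can therefore show high confidence simply because its consequent is common, which is one reason lift and model-based measures are used alongside support and confidence.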

All papers in Association Mining

This paper presents a model of a multi-agent system that helps marketers, on the one hand, to direct their products to the best targets and, on the other hand, to generate relevant product recommendations for customers... more
In the last years, the problem of Frequent Itemset Mining (FIM) from imperfect databases has been sufficiently tackled to handle many kinds of data imperfection. However, frequent itemsets discovered from databases describe only the... more
leverage, two traditional frequency-based measures. Among the top 50 signal pairs (i.e., enalapril versus symptoms) ranked by the potential causal-leverage measure, the physicians on the project determined that eight of them probably... more
Different data mining algorithms applied to the same data can result in similar findings, typically in the form of rules. These similarities can be exploited to identify especially powerful rules, in particular those that are common to... more
We propose DepMiner, a method implementing a simple but effective model for the evaluation of itemsets, and in general for the evaluation of the dependencies between the values assumed by a set of variables on a domain of finite values.... more
Document management has to be rethought and clarified in organizations, especially for the coordinated adoption of organization-wide electronic document management systems (EDMSs). This paper reports the identification and evaluation of... more
Advances in the media and entertainment industries, including streaming audio and digital TV, present new challenges for managing and accessing large audio-visual collections. Current content management systems support retrieval using... more
The paper presents a survey of the field of recommender systems and describes current recommendation methods that are usually classified into the following three main categories: content-based, collaborative, and hybrid recommendation... more
With ever-growing database sizes, we have enormous quantities of data, but unfortunately we cannot use raw data in our day-to-day reasoning and decisions. We desperately need knowledge. This knowledge is in most cases in the gathered data,... more
This paper is concerned with current applications of and research on GUHA, a method for hypothesis generation. The GUHA method is very promising in the field of association rule mining. Some of the current software implementations of... more
Association rule mining is a crucial data mining technique used to uncover relationships between variables in large datasets. This paper provides a comprehensive review of various association rule algorithms, including Apriori, FP-Growth,... more
Generating captions or annotations automatically for still images is a challenging task. Traditionally, techniques involving higher-level (semantic) object detection and complex feature extraction have been employed for scene... more
In this paper, we propose an associative watermarking scheme based on the concept of Association Mining Rules (AMRs) together with Vector Quantization (VQ) and the Sobel operator. Performing associative watermarking rules to... more
This paper proposes a systemic framework that attempts to define the domain and major areas of Data Mining and Knowledge Discovery (DMKD). Grounded theory approach, a qualitative method that inductively develops an understanding of... more
In the recent past, wide ranges of video retrieval processes were presented by different researchers. In order to boost the ease of access of video clip, keen applications, which have item removal, video purchasing, video clip healing and... more
In recent years interest has grown in "mining" large databases to extract novel and interesting information. Knowledge Discovery in Databases (KDD) has been recognised as an emerging research area. Association rules discovery is an... more
Association rule mining discovers associations between items in transaction data and is one of the most important data mining tasks. It has been integrated into many commercial data mining software packages and has a wide variety... more
Pattern mining has emerged as a compelling field of data mining over the years. Literature has bestowed ample endeavors in this field of research ranging from frequent pattern mining to rare pattern mining. A precise and impartial... more
Data mining has been defined as the non-trivial extraction of implicit, previously unknown and potentially useful information from data. Association mining and sequential mining analysis are considered as crucial components of strategic... more
Compared with traditional association rule mining in the structured world (e.g. Relational Databases), mining from XML data is confronted with more challenges due to the inherent flexibilities of XML in both structure and semantics. The... more
The frequent pattern tree (FP-Tree) is a popular and compact trie data structure for representing frequent patterns. The original database must be scanned twice before the FP-Tree can be constructed. One of... more
Data mining is the extraction of meaningful patterns from large sources of data. Association Rule Mining (ARM) is an important data mining technique, and mining frequent patterns is one of its central problems. The... more
One of the main issues in the process of Knowledge Discovery in Databases is the Mining of Association Rules. Although a great variety of pattern mining algorithms have been designed for this purpose, their main problems lie in the... more
A 2D-3D visualization support for human-centered rule-mining
One of the main challenges in data-intensive sectors like scientific research, data mining, and machine learning is efficiently analyzing enormous datasets. A popular data structure in similarity search algorithms to speed up the... more
Mining generalized association rules among items in the presence of taxonomy has been recognized as an important model in data mining. Earlier work on generalized association rules confined the minimum supports to be uniformly specified... more
The development of novel platforms and techniques for emerging "Big Data" applications requires the availability of real-life datasets for data-driven experiments, which are however not accessible in most cases for various reasons, e.g.,... more
Data mining is a collection of algorithms for finding interesting and unknown patterns or rules in data. However, different algorithms can result in different rules from the same data. The process presented here exploits these differences... more
Abstract: Stream analysis is considered as a crucial component of strategic control over a broad variety of disciplines in business, science and engineering. Stream data is a sequence of observations collected over intervals of time. Each... more
Mining frequent subtree patterns has many useful applications in XML mining, bioinformatics, network routing, etc. Most of the frequent subtree mining algorithms (such as FREQT, TreeMiner and CMTreeMiner) use anti-monotone property in the... more
Indirect association is a new kind of infrequent pattern, which provides a new way for interpreting the value of infrequent patterns and can effectively reduce the number of uninteresting infrequent patterns. The concept of indirect... more
We present a framework to analyse text streams with minute details of a game and generate summaries for multiple paradigms of desired lengths (as determined by the user). Multiple paradigms refer to a summary that is player-specific,... more
A core issue in association rule extraction is finding the frequent patterns in a database of operational transactions. If these patterns are discovered, the decision making process and determining... more
Association rule mining is perhaps the most widely described technique among the mining paradigms. Within it, temporal association rule mining tries to find relations among items in datasets. The temporal association... more
Organizations are more interested in the interesting data rather than the bulk of data. So they need a systematic and scientific approach to extract meaningful data out of heaps of the data and to find out the relations among these... more
Eclat is an algorithm that finds frequent itemsets. It uses a vertical database and calculates an item's support by intersecting transactions. However, Eclat suffers from the exponential time complexity of calculating the intersection of... more
Frequent and infrequent itemset mining are trending data mining techniques. The Association Rule (AR) patterns generated help decision makers or business policy makers project the next intended items across a wide variety... more
In this paper, a robust optimization approach is used to solve the redundancy allocation problem (RAP) in series-parallel systems with component mixing where uncertainty exists in components' reliabilities. In the real world, the... more
Data mining has been defined as the non-trivial extraction of implicit, previously unknown and potentially useful information from data. Association mining is one of the important sub-fields in data mining, where rules that imply certain... more
Association mining, a class of data mining techniques, is one of the most researched fields in data mining, where algorithms are designed to discover rules that reflect dependencies among values of an attribute. Because of the vast amounts... more
Data Mining is the process of discovering potentially valuable patterns, associations, trends, sequences and dependencies in data. Data mining techniques can discover information that many traditional business analysis and statistical... more
Version control systems are among the type of repositories that are frequently explored as sources of software change history. They can be mined to identify associations between software module modifications. This information is useful to... more
In this work we study several other works where association and sequence mining techniques were applied to the field of web usage mining. This report is to be submitted for grading in the Data Mining course at the PhD... more