
Rule Induction

954 papers
41 followers
About this topic
Rule induction is a machine learning technique that involves the extraction of useful if-then rules from data. It aims to create a model that can predict outcomes based on input features by identifying patterns and relationships within the dataset.

Key research themes

1. How do search strategies and heuristics influence performance and over-searching in inductive rule learning?

This research theme investigates the impact of different search strategies—hill-climbing, beam search, and exhaustive search—and rule evaluation heuristics on the performance and characteristics of rule induction algorithms. It addresses the over-searching phenomenon, where increasing search effort may deteriorate learning performance, by examining the interplay between search mechanisms and heuristics. Understanding this interaction is critical for optimizing rule learning algorithms to balance theory size, predictive accuracy, and rule generality.
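To make the interplay between search strategy and the rest of a rule learner concrete, here is a minimal sketch in Python of top-down rule refinement with a configurable beam width: a width of 1 corresponds to hill-climbing, a larger width to beam search, and an effectively unbounded width approximates exhaustive search. The helper names (covers, refinements, find_best_rule) and the use of precision as the evaluation heuristic are illustrative assumptions, not the setup of any specific paper below.

```python
# Sketch: greedy top-down rule search where the search strategy is a parameter.
# A rule is a tuple of (attribute, value) equality tests over dict-encoded examples;
# `examples` is a list of (feature_dict, class_label) pairs.

def covers(rule, example):
    """A rule covers an example if every one of its conditions matches."""
    return all(example[attr] == value for attr, value in rule)

def precision(rule, examples, target):
    """Fraction of covered examples that belong to the target class."""
    covered = [cls for ex, cls in examples if covers(rule, ex)]
    return sum(cls == target for cls in covered) / len(covered) if covered else 0.0

def refinements(rule, examples):
    """All specializations of `rule` obtained by adding one more condition."""
    used = {attr for attr, _ in rule}
    return {rule + ((attr, value),)
            for ex, _ in examples
            for attr, value in ex.items()
            if attr not in used}

def find_best_rule(examples, target, beam_width=3, max_conditions=4):
    """beam_width=1 is hill-climbing; a very large beam_width approximates exhaustive search."""
    beam, best = [()], ()
    for _ in range(max_conditions):
        candidates = set().union(*(refinements(r, examples) for r in beam))
        if not candidates:
            break
        beam = sorted(candidates, key=lambda r: precision(r, examples, target),
                      reverse=True)[:beam_width]
        if precision(beam[0], examples, target) > precision(best, examples, target):
            best = beam[0]
    return best

# Example: learn one rule predicting "play" from toy weather records.
data = [({"outlook": "sunny", "windy": False}, "play"),
        ({"outlook": "sunny", "windy": True}, "stay"),
        ({"outlook": "rain",  "windy": False}, "stay")]
print(find_best_rule(data, target="play", beam_width=2))
```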

Key finding: This study demonstrated that the traditionally observed over-searching phenomenon in inductive rule learning depends significantly on the choice of heuristic evaluation function. Exhaustive search tends to find longer but...
Key finding: This paper analyzed key rule learning heuristics—m-estimate, F-measure, and Klösgen measures—characterizing how each parametrically manages the trade-off between rule consistency (accuracy on covered examples) and coverage... (a sketch of these measures follows this list of findings)
Key finding: RuleKit exemplifies a flexible, scalable sequential covering rule induction system that supports extensive customization of rule quality measures (over 40), including user-guided induction and multi-threaded execution...
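To make the consistency/coverage trade-off described in these findings more tangible, the following is an illustrative sketch of three commonly cited rule evaluation heuristics, written as plain functions of a rule's contingency counts (p, n are the positive and negative examples covered by the rule; P, N are the totals). The parametrisations and default parameter values are generic placeholders, not the exact settings used in the papers above.

```python
def m_estimate(p, n, P, N, m=2.0):
    """m-estimate: precision smoothed towards the class prior P/(P+N);
    larger m rewards coverage, smaller m rewards consistency."""
    prior = P / (P + N)
    return (p + m * prior) / (p + n + m)

def f_measure(p, n, P, N, beta=1.0):
    """F-measure over rule precision (p/(p+n)) and recall (p/P);
    beta shifts the balance between the two."""
    if p == 0:
        return 0.0
    prec, rec = p / (p + n), p / P
    return (1 + beta ** 2) * prec * rec / (beta ** 2 * prec + rec)

def kloesgen(p, n, P, N, omega=0.5):
    """Kloesgen measure: coverage raised to omega, times the precision gain over the prior."""
    coverage = (p + n) / (P + N)
    prec = p / (p + n) if (p + n) else 0.0
    return coverage ** omega * (prec - P / (P + N))

# A rule covering 40 of 50 positives and 5 of 100 negatives:
print(m_estimate(40, 5, 50, 100), f_measure(40, 5, 50, 100), kloesgen(40, 5, 50, 100))
```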

2. What methodologies enable effective rule extraction from complex black-box models, particularly support vector machines, enhancing interpretability without compromising performance?

A key challenge in machine learning is extracting comprehensible symbolic rules from high-performance but opaque models like support vector machines (SVMs). This theme explores learning-based and decompositional approaches for rule extraction that convert SVM decision boundaries into human-readable rules, facilitating trust, explanation, and validation especially in high-stakes domains such as medicine. The theme includes evaluation of techniques that treat SVMs as black boxes and generate rule sets approximating SVM predictions while maintaining accuracy.
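As a generic illustration of this black-box (pedagogical) style of extraction, the sketch below trains an SVM, relabels the inputs with the SVM's own predictions, and fits a shallow decision tree as a stand-in for the rule learners mentioned in the findings that follow. It uses scikit-learn purely for convenience and is not the method of any specific paper listed here.

```python
# A minimal sketch of pedagogical (black-box) rule extraction: the SVM's own
# predictions, not the original labels, become the targets for an interpretable
# surrogate whose branches read as if-then rules.
from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
X, y = data.data, data.target

# 1. Train the opaque model on the real labels.
svm = SVC(kernel="rbf", gamma="scale").fit(X, y)

# 2. Query it as a black box: relabel the inputs with the SVM's predictions.
#    (A real extraction method would also query synthetic points near the
#    decision boundary to sample it more densely.)
y_svm = svm.predict(X)

# 3. Fit a shallow, readable surrogate on the SVM's labels and print its rules.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y_svm)
print(export_text(surrogate, feature_names=list(data.feature_names)))

# Fidelity: how closely the extracted rules mimic the SVM on this sample.
print("fidelity:", (surrogate.predict(X) == y_svm).mean())
```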

Key finding: This work presented a novel learning-based method for extracting symbolic classification rules from SVMs by treating the SVM as a black box to generate labeled examples, which are then used to train rule-based learners like...
Key finding: This study applied rule induction algorithms (e.g., AQ, CN2, RIPPER) to detect faults from test results on uniform random samples of software configurations. Evaluations on large-scale datasets demonstrate that rule learning...
Key finding: By integrating prior knowledge as existing rule sets and user constraints into the rule induction process, this work proposed a two-step approach of generating rule seeds and specializing them to obtain more accurate rules...

3. How can constructive induction and complex condition formulation extend the expressivity and predictive capacity of rule induction algorithms?

Traditional rule induction algorithms typically generate rules with simple logical conditions, which may limit their ability to capture complex relationships in data. This theme investigates methodologies for constructive induction—creating new features or complex rule conditions such as M-of-N combinations—and how these enhance the descriptive and predictive capabilities of rule learning. The research also addresses practical aspects such as heuristic control and knowledge-driven user guidance to manage combinatorial explosion and improve model interpretability.
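To show what an M-of-N condition adds over a plain conjunctive premise, here is a small illustrative sketch (the attribute names and helper functions are hypothetical): a rule premise that fires when at least M of its N elementary conditions hold, with M = N recovering a conjunction and M = 1 a disjunction.

```python
# An elementary condition is a predicate over one attribute of a dict-encoded example.
def cond(attr, op, value):
    ops = {"==": lambda a, b: a == b,
           ">=": lambda a, b: a >= b,
           "<=": lambda a, b: a <= b}
    return lambda example: ops[op](example[attr], value)

def m_of_n(m, conditions):
    """M-of-N premise: true if at least m of the n elementary conditions hold."""
    return lambda example: sum(c(example) for c in conditions) >= m

# Example: flag a record if at least 2 of these 3 symptoms are present.
rule_premise = m_of_n(2, [cond("fever", ">=", 38.0),
                          cond("cough", "==", True),
                          cond("fatigue", "==", True)])

patient = {"fever": 38.5, "cough": False, "fatigue": True}
print(rule_premise(patient))   # True: 2 of the 3 conditions are satisfied
```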

Key finding: This paper proposed a multistrategy constructive induction framework combining data-driven and hypothesis-driven inference alongside expanding and contracting operations in representation space. The approach simultaneously...
Key finding: The proposed methodology incorporates expert knowledge to guide constructive induction by suggesting new composite features that augment original datasets. By iteratively augmenting data with user-defined features and...
Key finding: This study introduced an extension to sequential covering rule induction algorithms allowing complex and M-of-N conditions in rule premises by analyzing frequent sets of elementary conditions. The approach effectively induced...

All papers in Rule Induction

Intrusion detection systems rely on a wide variety of observable data to distinguish between legitimate and illegitimate activities. In this paper we study one such observable: sequences of system calls into the kernel of an operating... more
The CN2 algorithm induces an ordered list of classification rules from examples using entropy as its search heuristic. In this short paper, we describe two improvements to this algorithm. Firstly, we present the use of the Laplacian error... more
This paper describes an approach being explored to improve the usefulness of machine learning techniques for generating classification rules for complex, real world data. The approach involves the use of genetic algorithms as a "front... more
Vulnerabilities in common security components such as firewalls are inevitable. Intrusion Detection Systems (IDS) are used as another wall to protect computer systems and to identify corresponding vulnerabilities. In this paper, a novel... more
Classification is one of the fundamental tasks of data mining. Most rule induction and decision tree algorithms perform local, greedy search to generate classification rules that are often more complex than necessary. Evolutionary... more
One of the main obstacles facing current intelligent pattern recognition applications is that of dataset dimensionality. To enable these systems to be effective, a redundancy-removing step is usually carried out beforehand. Rough set... more
We present a general rule induction algorithm based on sequential covering, suitable for variable consistency rough set approaches. This algorithm, called VC-DomLEM, can be used for both ordered and non-ordered data. In the case of... more
The integration of Landsat TM and environmental GIS data sets using artificial intelligence rule-induction and decision-tree analysis is shown to facilitate the production of vegetation maps with both floristic and structural information... more
This is a review paper, whose goal is to significantly improve our understanding of the crucial role of attribute interaction in data mining. The main contributions of this paper are as follows. Firstly, we show that the concept of... more
Learning models to classify rarely occurring target classes is an important problem with applications in network intrusion detection, fraud detection, or deviation detection in general. In this paper, we analyze our previously proposed... more
In this chapter we provide a broad overview of selected knowledge management, data mining, and text mining techniques and their use in various emerging biomedical applications. It aims to set the context for subsequent chapters. We first... more
An approach is introduced to combine survey data with multi-agent simulation models of consumer behaviour to study the diffusion process of organic food consumption. This methodology is based on rough set theory, which is able to... more
We propose a new fuzzy rough set approach which, differently from most known fuzzy set extensions of rough set theory, does not use any fuzzy logical connectives (t-norm, t-conorm, fuzzy implication). As there is no rationale for a... more
Inductive characterizations of the sets of terms, the subset of strongly normalizing terms and normal forms are studied in order to reprove weak and strong normalization for the simply-typed λ-calculus and for an extension by sum types... more
Prediction in financial domains is notoriously difficult for a number of reasons. First, theories tend to be weak or non-existent, which makes problem formulation open ended by forcing us to consider a large number of independent... more
Model transformation (MT) has become an important concern in software engineering. In addition to its role in model-driven development, it is useful in many other situations such as measurement, refactoring, and test-case generation.... more
A dynamic model of group performance is suggested that combines the group learning approach and the combination of contributions approach. Three hypotheses are tested in two experiments, comparing individual training conditions with mixed... more
The inconsistency of information about objects may be the greatest obstacle to performing inductive learning from examples. Rough sets theory provides a new mathematical tool to deal with uncertainty and vagueness. Based on rough sets... more
Business intelligence and bioinformatics applications increasingly require the mining of datasets consisting of millions of data points, or crafting real-time enterprise-level decision support systems for large corporations and drug... more
A rule quality measure is important to a rule induction system for determining when to stop generalization or specialization. Such measures are also important to a rule-based classification procedure for resolving conflicts among rules.... more
In this paper we report on the results of a European survey on business/ICT alignment practices. The goal of this study is to come up with some practical guidelines for managers on how to strive for better alignment of ICT investments... more
Due to the increase in the amount of relational data that is being collected and the limitations of propositional problem definition in relational domains, multi-relational data mining has arisen to be able to extract patterns from... more
Background: Human African trypanosomiasis (HAT), also known as sleeping sickness, is a parasitic tropical disease. It progresses from the first, haemolymphatic stage to a neurological second stage due to invasion of parasites into the... more
Traditionally researchers have used statistical methods to predict medical outcomes. However, statistical techniques do not provide sufficient information for solving problems of high complexity. Recently more attention has turned to a... more
Inductive definitions and rule inductions are two fundamental reasoning tools in logic and computer science. When inductive definitions involve binders, then Barendregt's variable convention is nearly always employed (explicitly or... more
Researchers have embraced a variety of machine learning (ML) techniques in their efforts to improve the quality of learning programs. The recent evolution of hybrid architectures for machine learning systems has resulted in several... more
Multi-relational data mining has become popular due to the limitations of propositional problem definition in structured domains and the tendency of storing data in relational databases. Several relational knowledge discovery systems have... more
Soil microbial ecology plays a significant role in global ecosystems. Nevertheless, methods of model prediction and mapping have yet to be established for soil microbial ecology. The present study was undertaken to develop an... more
This paper describes experiments with a challenging data set describing preterm births. The data set, collected at the Duke University Medical Center, was large and, at the same time, many attribute values were missing. However, the... more
Tissue microarrays (TMAs) are a new high-throughput tool for the study of protein expression patterns in tissues and are increasingly used to evaluate the diagnostic and prognostic importance of biomarkers. TMA data are rather challenging... more
Background: Pathway discovery from gene expression data can provide important insight into the relationship between signaling networks and cancer biology. Oncogenic signaling pathways are commonly inferred by comparison with signatures... more
To manage information such as ontologies, categorization based on concept hierarchies is commonly used. Such concept hierarchies are maintained individually for each system because they differ considerably between systems. Consequently, it is difficult... more
We use Backward Chaining Rule Induction (BCRI), a novel data mining method for hypothesizing causative mechanisms, to mine lung cancer gene expression array data for mechanisms that could impact survival. Initially, a supervised learning... more
A fundamental requirement of any task-oriented dialogue system is the ability to generate object descriptions that refer to objects in the task domain. The subproblem of content selection for object descriptions in task-oriented dialogue... more
We study rule induction from two decision tables as a basis of rough set analysis of more than one decision table. We regard the rule induction process as enumerating minimal conditions satisfied by positive examples but unsatisfied... more
Traditional classification techniques such as decision trees and RIPPER use heuristic search methods to find a small subset of patterns. In recent years, a promising new approach that mainly uses association rule mining in classification... more
In this paper we introduce a method for computing fitness in evolutionary learning systems based on NVIDIA's massive parallel technology using the CUDA library. Both the match process of a population of classifiers against a training set... more
A paper document processing system is an information system component which transforms information on printed or handwritten documents into a computer-revisable form. In intelligent systems for paper document processing this information... more
This paper focuses on automated procedures to reduce the dimensionality of protein structure prediction datasets by simplifying the way in which the primary sequence of a protein is represented. The potential benefits of this procedure to... more
In this paper we describe an open source tool for automatic induction of transfer rules. Transfer rule induction is carried out on pairs of dependency structures and their node alignment to produce all rules consistent with the node... more
by F. Biscarri and 1 more
This paper proposes a comprehensive framework to detect non-technical losses (NTLs) and recover electrical energy (lost by abnormalities or fraud) by means of a data mining analysis, in the Spanish Power Electric Industry. It is divided... more
Disturbances in supply chains may be either exogenous or endogenous. The ability automatically to detect, diagnose, and distinguish between the causes of disturbances is of prime importance to decision makers in order to avoid... more