An investigation of rule induction based prediction systems

Carolyn Mair

1999

Abstract

This paper investigates the efficacy of rule induction (RI) methods in constructing prediction systems for software development effort, an area where traditional estimation techniques have shown significant inaccuracies. The study highlights the benefits of RI, particularly its transparency and interpretability compared to neural networks, and notes preliminary findings indicating that while RI can outperform other methods under specific conditions, it currently appears to yield less accurate predictions overall. The authors recommend further exploration into the contexts in which RI may be more effective.
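
The transparency attributed to rule induction above can be illustrated with a minimal sketch. It is not the method used in the paper: it assumes a single hypothetical predictor (project size) and induces one IF-THEN rule by choosing the size threshold that minimises squared error. The data values and the names induce_rule and predict are invented for illustration.

```python
from statistics import mean


def induce_rule(sizes, efforts):
    """Induce one rule: IF size <= threshold THEN low_mean ELSE high_mean.

    The threshold is the midpoint between two adjacent sizes that gives
    the lowest sum of squared errors over the training projects.
    """
    pairs = sorted(zip(sizes, efforts))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate identical sizes
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        low = [e for _, e in pairs[:i]]
        high = [e for _, e in pairs[i:]]
        low_mean, high_mean = mean(low), mean(high)
        sse = sum((e - low_mean) ** 2 for e in low)
        sse += sum((e - high_mean) ** 2 for e in high)
        if best is None or sse < best[0]:
            best = (sse, threshold, low_mean, high_mean)
    if best is None:
        raise ValueError("need at least two distinct sizes")
    return best[1:]


def predict(rule, size):
    """Apply the induced rule to a new project size."""
    threshold, low_mean, high_mean = rule
    return low_mean if size <= threshold else high_mean


# Hypothetical training data: (size, effort) for six past projects.
sizes = [10, 12, 15, 40, 45, 50]
efforts = [100, 110, 120, 400, 420, 440]
rule = induce_rule(sizes, efforts)
print("IF size <= %s THEN effort = %s ELSE effort = %s" % rule)
```

The induced rule can be read directly by a project manager, which is the kind of interpretability the abstract contrasts with neural networks.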

Related papers

Ankita Malhotra

International Journal of Advanced Computer Science and Applications, 2011

Accurate software effort estimation is an important part of the software process. Effort is measured in terms of person-months and duration. Both overestimation and underestimation of software effort may lead to risky consequences. Software project managers also have to estimate how much a software development is going to cost. The dominant cost for any software is the cost of calculating effort. Thus, effort estimation is crucial, and there is always a need to improve its accuracy as much as possible. There are various effort estimation models, but it is difficult to determine which model gives the most accurate estimate on which dataset. This paper empirically evaluates and compares the potential of Linear Regression, Artificial Neural Network, Decision Tree, Support Vector Machine and Bagging on a software project dataset. The dataset is obtained from 499 projects. The results show that the Mean Magnitude of Relative Error (MMRE) of the decision tree method is only 17.06%. Thus, the decision tree method performs better than all the other compared methods.
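
The accuracy measure quoted above can be sketched as follows, assuming the usual definition in which the magnitude of relative error for one project is |actual - predicted| / actual and the mean is taken over all projects. The function name mmre and the effort values are hypothetical and are not taken from the 499-project dataset.

```python
from typing import Sequence


def mmre(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean magnitude of relative error over a set of projects."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Relative error of each project, measured against its actual effort.
    errors = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return sum(errors) / len(errors)


# Hypothetical efforts in person-months for three projects.
actual_effort = [100.0, 200.0, 400.0]
predicted_effort = [90.0, 250.0, 400.0]
score = mmre(actual_effort, predicted_effort)
print(f"MMRE = {score:.2%}")
```

A lower value indicates more accurate estimates; a value of 17.06% would mean that predictions deviate from actual effort by about 17% on average.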

An Artificial Neural Network Approach to Predict Software Project Effort: One-Class vs. Binary Classification

Elif Kartal, Prof. Sevinç Gülseçen (PhD)

Smart Technology & Smart Management, 2016

Application of machine learning methods for software effort prediction

Dr. Arvinder Kaur

ACM SIGSOFT Software Engineering Notes, 2010

Software effort estimation is an important area in the field of software engineering. If the software development effort is overestimated, it may lead to tight time schedules, and thus the quality and testing of the software may be compromised. In contrast, if the software development effort is underestimated, it may lead to over-allocation of manpower and resources. Many models have been proposed in the literature for estimating software effort. In this paper, we analyze machine learning methods in order to develop models to predict software development effort. We used the Maxwell data set, consisting of 63 projects. The results show that linear regression, M5P and M5Rules are effective methods for predicting software development effort.

Software Effort Estimation using Machine Learning Technique

Hasan Sarwar

International Journal of Advanced Computer Science and Applications, 2023

Software engineering effort estimation plays a significant role in managing project cost, quality, and time and in creating software. Researchers have paid close attention to software estimation during the past few decades, and a great amount of work has been done using a variety of machine learning techniques and algorithms. To evaluate predictions more effectively, this study recommends various machine learning algorithms for estimation, including k-nearest neighbor regression, support vector regression, and decision trees. These methods are now used by the software development industry for software estimation, with the goal of overcoming the limitations of parametric and conventional estimation techniques and advancing projects. Our dataset, which was created by a software company called Edusoft Consulted LTD, was used to assess the effectiveness of the established method. The assessment is based on three commonly used performance evaluation measures: mean absolute error (MAE), mean squared error (MSE), and R-squared. Comparative experimental results demonstrate that decision trees perform better at predicting effort than the other techniques.
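
The three evaluation measures named above can be sketched with their standard definitions. The function names and the sample values are hypothetical, and the sketch makes no claim about how the study itself computed them.

```python
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted efforts for three projects.
actual_effort = [10.0, 20.0, 30.0]
predicted_effort = [12.0, 18.0, 33.0]
mae_value = mae(actual_effort, predicted_effort)
mse_value = mse(actual_effort, predicted_effort)
r2_value = r_squared(actual_effort, predicted_effort)
print(mae_value, mse_value, r2_value)
```

MAE and MSE are error measures, so lower is better, whereas R-squared is better when closer to 1.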

Software effort estimation using machine learning methods

Ayse Bener

22nd International Symposium on Computer and Information Sciences, ISCIS 2007 - Proceedings, 2007

In software engineering, the main aim is to develop projects that produce the desired results within a limited schedule and budget. The most important factor affecting the budget of a project is the effort. Estimating effort is therefore crucial: hiring more people than needed leads to a loss of income, and hiring fewer people than needed leads to an extension of the schedule. The main objective of this research is to analyze software effort estimation in order to overcome the problems related to it, namely budget and schedule extension. To accomplish this, we propose a model that uses machine learning methods. We evaluate these models on public datasets and on data gathered from software organizations in Turkey. The experiments show that the best method may change from one dataset to another, which indicates that no single model can always produce the best results.

Survey on Different Machine Learning Techniques for Software Effort Estimation

Binu Rajan

International Journal of Computer Applications, 2014

Software development effort estimation is the process of predicting the effort required to develop or maintain software based on vague, incomplete or uncertain inputs. An accurate estimate of software development effort is required in the early stages of the development life cycle for planning the development activities. Determination of software cost, allocation of resources, and scheduling and monitoring of development activities all depend on the effort. Hence effort estimation is crucial for the control, quality and success of all software development projects. This paper provides an overview of the three general categories of estimation models, namely expert judgment based models, algorithmic models and non-algorithmic models. Moreover, different machine learning techniques, namely Fuzzy Logic, Artificial Neural Network, Case Based Reasoning and Fuzzy Neural Network, are compared in order to study which machine learning method is more suitable in which situation. The advantages and disadvantages of these four machine learning techniques are identified, and it was found that, when applied to the COCOMO dataset, Fuzzy Logic and Fuzzy Neural Network showed better performance than the other techniques.

Size and Effort-Based Computational Models for Software Cost Prediction

Andreas Andreou

International Conference on Enterprise Information Systems, 2008

Reliable and accurate software cost estimation has always been a challenge, especially for people involved in project resource management. The challenge is amplified by the high level of complexity and uniqueness of the software process. The majority of the estimation methods proposed fail to produce successful cost forecasts, and they do not resolve to an explicit, measurable and concise set of factors affecting productivity. Throughout the software cost estimation literature, software size is usually proposed as one of the most important attributes affecting effort and is used to build cost models. This paper aspires to provide size and effort-based estimates of the software effort required for new projects, based on data obtained from past completed projects. The modelling approach uses Artificial Neural Networks (ANN) with a random sliding window input and output method using holdout samples; in addition, a Genetic Algorithm (GA) evolves the inputs and internal hidden architectures to reduce the Mean Relative Error (MRE). The optimal ANN topologies and input and output methods obtained for each dataset are presented, discussed and compared with a classic MLR model.

Software Effort Prediction - A Datamining Approach

Publishing India Group

Journal of Network and Information Security, 2017

Effective software project estimation is one of the most challenging and important activities in software development. Proper project planning and control is not possible without a sound and reliable estimate. As a whole, the software industry doesn't estimate projects well and doesn't use estimates appropriately. We suffer far more than we should as a result, and we need to focus some effort on improving the situation. Effort estimation is important to minimize the cost of a software project. The existing situation may have serious consequences for a company: because of poor effort estimation, a major percentage of projects turn out to be more expensive than expected, late on delivery, or affected by other issues. Not giving proper importance to the effort estimation task, by under-staffing it, running the risk of low-quality deliverables, and setting too short a schedule, which results in loss of credibility as deadlines are missed, always leads to problems. The current systems available for effort estimation produce non-comprehensible results. Hence the purpose of this project is to produce a software system that gives more accurate and comprehensible results using modern tools and makes it easier for the project manager to identify the effort needed to complete a software project in terms of size of project, cost, etc. The algorithms used are the Support Vector Machine (SVM), which is suited to both classification and regression, and an Active Learning Based Approach (ALBA) for rule extraction from the output of the SVM to produce comprehensible rules.

Adoption of Machine Learning Techniques in Software Effort Estimation: An Overview

NOR AZIZAH ALI FC

IOP Conference Series: Materials Science and Engineering, 2019

Effort estimation is currently in significant demand. More data needs to be collected, and stakeholders require effective and efficient software for processing, which makes hardware and software development costs increase steeply. This is especially true in large industry: as software projects become bigger and more complex, the complexity of estimation continuously increases. Effort estimation is part of the software engineering economic study of how to manage limited resources so that a project can meet its target goal within a specified schedule, budget and scope. It is necessary to develop or adopt a useful software development process in executing a software development project by acting as a key constraint to the project. The accuracy of estimation is the main critical evaluation for every study. Recently, machine learning techniques are becoming widely used in many effort estimation problems bu...

A Preliminary Performance Evaluation of Machine Learning Algorithms for Software Effort Estimation

Dr. Poonam Rijwani

2017

Accurate software effort estimation is vital to software project management. It is the process of predicting the effort, in terms of cost and time, required to develop a software product. Traditionally, researchers have used off-the-shelf empirical models like COCOMO or developed various methods using statistical approaches like regression and analogy-based methods, but these methods exhibit a number of shortfalls. Predicting the effort at early stages is really difficult, as very little information is available. To improve effort estimation accuracy, an alternative is to use machine learning (ML) techniques, and many researchers have proposed a plethora of such machine learning based models. This paper aims to systematically analyze various machine learning models, considering traits like the type of machine learning method used, the estimation accuracy gained with that method, the dataset used, and its comparison with the empirical model. Although researchers have started exploring Mach...

Ioannis Stamelos

IFIP International Federation for Information Processing, 2006

This paper suggests several guidelines for choosing a suitable machine learning technique for software development effort estimation. Initially, the paper presents a review of relevant published studies, pointing out the pros and cons of specific machine learning methods. The techniques considered are Association Rules, Classification and Regression Trees, Bayesian Belief Networks, Neural Networks and Clustering, and they are compared in terms of accuracy, comprehensibility, applicability, causality and sensitivity. Finally, the study proposes guidelines for choosing the appropriate technique, based on the size of the training data and the desirable features of the extracted estimation model.

The applicability of case-based reasoning to software cost estimation

Don Petkov

2002

The nature and competitiveness of the modern software development industry demand that software engineers be able to make accurate and consistent software cost estimates. Traditionally, software cost estimates have been derived with algorithmic cost estimation models such as COCOMO and Function Point Analysis. However, researchers have shown that existing software cost estimation techniques fail to produce accurate and consistent software cost estimates. Improving the reliability of software cost estimates would facilitate cost savings, improved delivery time and better quality software developments. To this end, considerable research has been conducted into finding alternative software cost estimation models that are able to produce better quality software cost estimates. Researchers have suggested a number of alternative models for this problem area. One of the most promising alternatives is Case-Based Reasoning (CBR), which is a machine learning paradigm that makes use of past experiences to solve new problems. CBR has been proposed as a solution since it is highly suited to weak theory domains, where the relationships between cause and effect are not well understood. The aim of this research was to determine the applicability of CBR to software cost estimation. This was accomplished in part through a thorough investigation of the theoretical and practical background to CBR, software cost estimation and current research on CBR applied to software cost estimation. This provided a foundation for the development of experimental CBR software cost estimation models, with which an empirical evaluation of this technology applied to software cost estimation was performed. In addition, several regression models were developed, against which the effectiveness of the CBR system could be evaluated. The architecture of the CBR models developed facilitated the investigation of the effects of case granularity on the quality of the results obtained from them.
Traditionally, researchers in this field have made use of poorly populated datasets, which did not accurately reflect the true nature of the software development industry. However, for the purposes of this research, an extensive database of 300 software development projects was obtained, on which these experiments were performed.

Predicting Software Effort Estimation Using Machine Learning Techniques

Ahmed BaniMustafa

Predicting Software Effort Estimation Using Machine Learning Techniques, 2018

In software engineering, estimation plays a vital role in software development, affecting its cost and required effort and consequently influencing the overall success of software development. The error margin in expert-based, analogy-based and algorithmic methods, including COCOMO, Function Point Analysis and Use-Case-Points, is quite significant, which exposes software projects to the danger of delays and running over budget. To obtain better estimates, we propose an alternative method that performs data mining on historical data. This paper suggests performing this prediction using three machine learning techniques, Naïve Bayes, Logistic Regression and Random Forests, applied to preprocessed COCOMO NASA benchmark data covering 93 projects. The generated models were tested using five-fold cross-validation and were evaluated using Classification Accuracy, Precision, Recall, and AUC. The estimation results were then compared to the COCOMO estimates. All the applied techniques achieved better results than the COCOMO model. However, the best performance was obtained using Naïve Bayes and Random Forests. Although Naïve Bayes outperformed the other two techniques in its ROC curve and Recall score, Random Forests has a better Confusion Matrix and scored better in both the Classification Accuracy and Precision measures. The results of this work confirm the validity of data mining in general, and of the applied techniques in particular, for software estimation.

An investigation of machine learning based prediction systems

Carolyn Mair, Keith Phalp

Journal of Systems and …, 2000

Software Effort Prediction Using Regression Rule Extraction from Neural Networks

Bart Baesens

2010 22nd IEEE International Conference on Tools with Artificial Intelligence, 2010

Neural networks are often selected as a tool for software effort prediction because of their capability to approximate any continuous function with arbitrary accuracy. A major drawback of neural networks is the complex mapping between inputs and output, which is not easily understood by a user. This paper describes a rule extraction technique that derives a set of comprehensible IF-THEN rules from a trained neural network applied to the domain of software effort prediction. The suitability of this technique is tested on the ISBSG R11 data set by a comparison with linear regression, radial basis function networks, and CART. It is found that the most accurate results are obtained by CART, though the large number of rules limits comprehensibility. Considering comprehensible models only, the concise set of extracted rules outperforms the pruned CART tree, making neural network rule extraction the most suitable technique for software effort prediction when comprehensibility is important.

AI-Based Models for Software Effort Estimation

Ayse Bener

2010 36th EUROMICRO Conference on Software Engineering and Advanced Applications, 2010

Decision making under uncertainty is a critical problem in the field of software engineering. Predicting software quality or cost/effort requires a high level of expertise. AI-based predictor models, on the other hand, are useful decision-making tools that learn from past projects' data. In this study, we have built an effort estimation model for a multinational bank to predict the effort prior to the projects' development lifecycle. We have collected process, product and resource metrics from past projects, together with the effort values distributed among software life cycle phases, i.e. analysis & test, design & development. We have used a clustering approach to form consistent project groups and Support Vector Regression (SVR) to predict the effort. Our results validate the benefits of using AI methods in real-life problems. We attain Pred(25) values as high as 78% in predicting future projects.
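
Pred(25), the measure reported above, is conventionally the fraction of projects whose magnitude of relative error is at most 25%. A minimal sketch under that assumption follows; the function name pred and the effort values are hypothetical and do not come from the bank's data.

```python
from typing import Sequence


def pred(actual: Sequence[float], predicted: Sequence[float],
         threshold: float = 0.25) -> float:
    """Fraction of projects with relative error at or below the threshold."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum(
        1 for a, p in zip(actual, predicted) if abs(a - p) / a <= threshold
    )
    return hits / len(actual)


# Hypothetical efforts for four projects; relative errors are
# 0.10, 0.30, 0.00 and 0.60, so two of the four fall within 25%.
actual_effort = [100.0, 200.0, 400.0, 50.0]
predicted_effort = [90.0, 260.0, 400.0, 80.0]
pred_25 = pred(actual_effort, predicted_effort)
print(f"Pred(25) = {pred_25:.0%}")
```

Unlike error measures such as MMRE, a higher Pred(25) is better, so a value of 78% means that roughly four in five predictions fell within 25% of the actual effort.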

Comparison and evaluation of data mining techniques with algorithmic models in software cost estimation

Farhad Soleimanian Gharehchopogh

Procedia Technology, 2012

Software Cost Estimation (SCE) has been one of the important topics in software production in recent decades. Realistic estimation requires cost and effort factors, obtained using algorithmic or Artificial Intelligence (AI) techniques. Boehm developed the Constructive Cost Model (COCOMO), which is one of the algorithmic SCE models. The model comes in three increasingly detailed forms: basic, intermediate and detailed. Basic COCOMO is suitable for quick, early, rough order-of-magnitude estimates of the effort required to produce software, but its accuracy is limited because it lacks factors to account for differences in cost drivers. Intermediate COCOMO takes these project attributes into account. In addition, detailed COCOMO accounts for the individual project phases. The COCOMO family of algorithmic techniques has been in use since 1981. In recent years, techniques have emerged that use intelligent methods to estimate the effort required to produce software. In this paper, different data mining techniques for estimating software costs are presented, and the results of each technique are evaluated and compared. NASA project data are used to train and test each of these techniques. We compare the estimation accuracy of the COCOMO model with that of the data mining techniques. The results indicate that data mining techniques improve the estimation accuracy of the models in many cases.

A REVIEW OF MACHINE LEARNING MODELS FOR SOFTWARE COST ESTIMATION

Farrukh Arslan, PhD

A REVIEW OF MACHINE LEARNING MODELS FOR SOFTWARE COST ESTIMATION, 2019

Software cost estimation is a critical task in software project development. It assists project managers and software engineers in planning and managing their resources. However, developing an accurate cost estimation model for a software project is a challenging process. The aim of such a process is to have better foresight of the project's progress and its phases. Another main objective is to have clear project details and specifications to assist stakeholders in managing the project in terms of human resources, assets, software, data and even the feasibility study. Accurate estimation results definitely help the project manager to better estimate the project cost, the time required for the various project phases, and the resources or assets. This paper builds a software cost estimation model using a machine learning approach. Different machine learning algorithms are applied to two public datasets to predict the software cost in the early stages. Results show that machine learning methods can be used to predict software cost with a high accuracy rate. Contribution/Originality: This study contributes to the existing literature by enhancing the results of thirteen Machine Learning algorithms on two datasets. The evaluation criteria used in this work are R², MAE, RMAE, RAE, and RRSE. The aim of the proposed model is to predict the effort using dataset attributes and compare the predictions with the actual effort in order to measure the error using different criteria.

Predicting Software Projects Cost Estimation Based on Mining Historical Data

Izzat Alsmadi

ISRN Software Engineering, 2012

In this research, a hybrid cost estimation model is proposed to produce a realistic prediction model that takes into consideration software project, product, process, and environmental elements. A cost estimation dataset is built from a large number of open source projects. Those projects are divided into three domains: communication, finance, and game projects. Several data mining techniques are used to classify software projects in terms of their development complexity. Data mining techniques are also used to study the association between different software attributes and their relation to cost estimation. Results showed that finance metrics are usually the most complex in terms of code size and some other complexity metrics. Results also showed that game applications have higher values of the SLOCmath, coupling, cyclomatic complexity, and MCDC metrics. Information gain is used to evaluate the ability of object-oriented metrics to predict software complexity. The MCDC metric is shown to be the first metric in deciding a software project's complexity. A software project effort equation is created based on clustering and on all software projects' attributes. According to the software metric weight values developed in this project, MCDC, LOC, and cyclomatic complexity are still the dominant traditional metrics affecting our classification process, while number of children and depth of inheritance are the dominant object-oriented metrics at a second level.

Comparative Analysis of Software Effort Estimation Using Data Mining Technique and Feature Selection

Rifqi Firdaus

JITK (Jurnal Ilmu Pengetahuan dan Teknologi Komputer), 2021

Software development involves several interrelated factors that influence development effort and productivity. Improving the estimation techniques available to project managers will facilitate more effective time and budget control in software development. Software effort estimation, or software cost/effort estimation, can help a software development company overcome the difficulties experienced in estimating software development effort. This study aims to compare the machine learning methods of Linear Regression (LR), Multilayer Perceptron (MLP), Radial Basis Function (RBF), and Decision Tree Random Forest (DTRF) for estimating software cost/effort. These five approaches will then be tested on 10 software development project datasets, in order to produce new knowledge about which machine learning and non-machine learning methods are the most accurate for estimating software effort. As well as knowing between the selection between using Par...
