Reliable Confidence Intervals for Software Effort Estimation
2009
Abstract
This paper addresses the problem of software effort estimation using a new machine learning technique that attaches reliable confidence measures to its predictions. Specifically, we propose Conformal Predictors (CPs), a novel type of prediction algorithm, as a means of providing effort estimates for software projects in the form of predictive intervals at a specified confidence level. Our approach is based on the well-known Ridge Regression technique, but instead of the point estimates produced by the original method, it produces predictive intervals that satisfy a given confidence level. Results obtained with the proposed algorithm on the COCOMO, Desharnais and ISBSG datasets suggest that it performs well, producing reliable predictive intervals that are narrow enough to be useful in practice.
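To make the idea of a predictive interval at a given confidence level concrete, here is a minimal sketch. It is not necessarily the algorithm used in the paper: it uses the simpler split (inductive) form of conformal prediction on top of a closed-form ridge regression, with absolute residuals as nonconformity scores. The function names, the `alpha` regularisation parameter, and the synthetic data are all assumptions made for illustration; none of them come from the paper or its datasets.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression. A constant column is appended for the
    intercept, which is also regularised here for simplicity."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    A = Xb.T @ Xb + alpha * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ y)


def predict_ridge(w, X):
    """Point predictions from ridge weights returned by fit_ridge."""
    Xb = np.hstack([X, np.ones((X.shape[0], 1))])
    return Xb @ w


def split_conformal_intervals(X_train, y_train, X_cal, y_cal, X_new,
                              confidence=0.9, alpha=1.0):
    """Split conformal prediction (an assumed, simplified variant).

    Fits ridge regression on the training part, scores the calibration
    part by absolute residual, and widens each new point prediction by
    the appropriate empirical quantile of those scores.
    """
    w = fit_ridge(X_train, y_train, alpha)
    scores = np.sort(np.abs(y_cal - predict_ridge(w, X_cal)))
    n = len(scores)
    # Rank of the calibration score that yields the requested coverage.
    k = int(np.ceil((n + 1) * confidence))
    # With too few calibration examples, only an unbounded interval is valid.
    half_width = scores[k - 1] if k <= n else np.inf
    center = predict_ridge(w, X_new)
    return center - half_width, center + half_width


# Usage on synthetic data (hypothetical features, not COCOMO/Desharnais/ISBSG).
rng = np.random.default_rng(0)
X = rng.normal(size=(900, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=900)

X_train, y_train = X[:200], y[:200]
X_cal, y_cal = X[200:400], y[200:400]
X_test, y_test = X[400:], y[400:]

lower, upper = split_conformal_intervals(
    X_train, y_train, X_cal, y_cal, X_test, confidence=0.9
)
coverage = np.mean((y_test >= lower) & (y_test <= upper))
print(f"empirical coverage: {coverage:.3f}")
```

Under the usual exchangeability assumption, the fraction of test targets falling inside such intervals should be close to or above the requested confidence level. The interval width is constant across projects in this sketch; a normalised nonconformity score would let it vary with the difficulty of each example.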