Fault-Proneness of Open Source Systems: An Empirical Analysis
2014
Abstract
Developing quality software is difficult given the size and complexity of the software built today. Early prediction of software quality helps optimize the use of testing resources. Many fault prediction models have been developed using a variety of internal attributes and machine learning techniques. However, the open-source community still lacks concise knowledge of which types of internal attributes affect software quality the most. In this work, an empirical investigation is conducted to explore the relationships between internal attributes of open-source systems and their fault-proneness. The results of the empirical analysis showed that when only nine internal attributes were selected, the accuracy of the fault prediction models did not decrease significantly. This indicates that only a subset of these internal attributes is worth collecting and investigating. By focusing on a small set of internal attributes, the quality assurance team can save time and resour...
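The comparison described in the abstract (a model built on all internal attributes versus one built on a selected subset) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not name the selection technique or the learners, so the sketch substitutes a simple correlation ranking and a nearest-centroid classifier, and the eight "modules" with three attributes are invented toy data, not the study's datasets or results.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def rank_attributes(rows, labels):
    """Attribute indices ordered by |correlation| with the fault label."""
    n_attrs = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_attrs)]
    return sorted(range(n_attrs), key=lambda j: scores[j], reverse=True)


def loo_accuracy(rows, labels, attrs):
    """Leave-one-out accuracy of a nearest-centroid classifier on `attrs`."""
    correct = 0
    for i, (row, label) in enumerate(zip(rows, labels)):
        best_class, best_dist = None, float("inf")
        for cls in sorted(set(labels)):
            # Centroid of this class, excluding the held-out module i.
            members = [r for k, r in enumerate(rows) if k != i and labels[k] == cls]
            if not members:
                continue
            centroid = [mean(m[j] for m in members) for j in attrs]
            dist = sum((row[j] - c) ** 2 for j, c in zip(attrs, centroid))
            if dist < best_dist:
                best_class, best_dist = cls, dist
        correct += best_class == label
    return correct / len(rows)


# Invented toy data: one row per module, columns are hypothetical attributes
# (a size-like metric, a constant metric, an alternating metric).
rows = [(10, 5, 1), (12, 5, 2), (11, 5, 1), (13, 5, 2),
        (40, 5, 1), (42, 5, 2), (41, 5, 1), (43, 5, 2)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = faulty, 0 = not faulty

ranking = rank_attributes(rows, labels)
selected = ranking[:1]  # keep only the top-ranked attribute
acc_all = loo_accuracy(rows, labels, list(range(len(rows[0]))))
acc_selected = loo_accuracy(rows, labels, selected)
print("selected attributes:", selected)
print("accuracy, all attributes:", acc_all)
print("accuracy, selected subset:", acc_selected)
```

On real data the two accuracies would normally be compared with a statistical test rather than by inspection; the sketch only shows the shape of the comparison.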