Shaker: a Tool for Detecting More Flaky Tests Faster
2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2021
https://doi.org/10.1109/ASE51524.2021.9678918

Abstract
A test case that intermittently passes or fails when executed against the same version of source code and test code is said to be flaky. The presence of flaky tests wastes testing time and effort. The most popular approach in industry to detect flakiness is ReRun. The idea behind ReRun is very simple: failing test cases are re-executed many times, looking for inconsistencies in the outcome. Despite its simplicity, the ReRun strategy is very expensive both in time and in computational resources. This is particularly true in contexts where thousands of test cases are executed on a daily basis. Reducing the rerunning overhead is, thus, of utmost importance. This paper presents SHAKER, an open-source tool for detecting flakiness in time-constrained tests by adding noise to the execution environment. The main idea behind SHAKER is to add stressing tasks that compete with the test execution for resources (CPU or memory). SHAKER is available as a GitHub Actions workflow that can be seamlessly integrated into any GitHub project; alternatively, it can be used via its Command Line Interface. In our evaluation, SHAKER discovered more flaky tests than ReRun and did so faster (fewer re-executions); moreover, our approach revealed tens of new flaky tests that went undetected by ReRun even after 50 re-executions. Thanks to its flexibility and ease of use, we believe that SHAKER can be useful for both practitioners and researchers.
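The core idea described above, re-running a test while stressor tasks compete for CPU, can be sketched in a few lines of Python. The helper below is an illustrative stand-in, not SHAKER's actual implementation (which relies on stress-ng and GitHub Actions): it spawns busy-loop processes as environmental noise, re-runs a shell test command a few times under that contention, and flags the test as flaky if its pass/fail outcomes disagree. The function name and parameters are invented for this sketch.

```python
import multiprocessing
import subprocess


def _burn_cpu(stop):
    # Busy loop that competes for CPU time -- a crude stand-in
    # for the stress-ng workers SHAKER uses as noise.
    while not stop.is_set():
        pass


def run_with_noise(test_cmd, reruns=5, workers=2):
    """Run `test_cmd` several times under CPU contention.

    Returns (is_flaky, outcomes), where outcomes is the list of
    pass/fail results and is_flaky is True if they disagree.
    """
    stop = multiprocessing.Event()
    stressors = [multiprocessing.Process(target=_burn_cpu, args=(stop,))
                 for _ in range(workers)]
    for p in stressors:
        p.start()
    try:
        # Re-execute the test command while the stressors are running;
        # exit code 0 counts as a pass.
        outcomes = [subprocess.run(test_cmd, shell=True).returncode == 0
                    for _ in range(reruns)]
    finally:
        stop.set()
        for p in stressors:
            p.join()
    return len(set(outcomes)) > 1, outcomes
```

A deterministic command such as `run_with_noise("exit 0", reruns=2, workers=1)` yields consistent passes and is not flagged, whereas a test whose outcome depends on timing may flip under the added contention.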
References (15)
- S. Thorve, C. Sreshtha, and N. Meng, "An empirical study of flaky tests in android apps," in 2018 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE, 2018, pp. 534-538.
- J. Bell, O. Legunsen, M. Hilton, L. Eloussi, T. Yung, and D. Marinov, "Deflaker: automatically detecting flaky tests," in Proceedings of the 40th International Conference on Software Engineering. ACM, 2018, pp. 433-444.
- G. Pinto, B. Miranda, S. Dissanayake, M. d'Amorim, C. Treude, and A. Bertolino, "What is the vocabulary of flaky tests?" in Proceedings of the 17th International Conference on Mining Software Repositories, 2020, pp. 492-502.
- A. Shi, W. Lam, R. Oei, T. Xie, and D. Marinov, "ifixflakies: A framework for automatically fixing order-dependent flaky tests," in Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2019, pp. 545-555.
- W. Lam, R. Oei, A. Shi, D. Marinov, and T. Xie, "idflakies: A framework for detecting and partially classifying flaky tests," in 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST). IEEE, 2019, pp. 312-322.
- T. M. King, D. Santiago, J. Phillips, and P. J. Clarke, "Towards a bayesian network model for predicting flaky automated tests," in 2018 IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C). IEEE, 2018, pp. 100-107.
- R. Verdecchia, E. Cruciani, B. Miranda, and A. Bertolino, "Know you neighbor: Fast static prediction of test flakiness," IEEE Access, vol. 9, pp. 76119-76134, 2021.
- J. Micco, "Flaky tests at google and how we mitigate them," 2016, https://testing.googleblog.com/2017/04/where-do-our-flaky-tests-come-from.html.
- J. Palmer, "Test flakiness - methods for identifying and dealing with flaky tests," 2019, https://labs.spotify.com/2019/11/18/test-flakiness-methods-for-identifying-and-dealing-with-flaky-tests/.
- W. Lam, P. Godefroid, S. Nath, A. Santhiar, and S. Thummalapenta, "Root causing flaky tests in a large-scale industrial setting," in Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, ser. ISSTA 2019. New York, NY, USA: ACM, 2019, pp. 101-111. [Online]. Available: http://doi.acm.org/10.1145/3293882.3330570
- D. Silva, L. Teixeira, and M. d'Amorim, "Shake it! detecting flaky tests caused by concurrency with shaker," in 2020 IEEE International Conference on Software Maintenance and Evolution (ICSME). IEEE, 2020, pp. 301-311.
- Q. Luo, F. Hariri, L. Eloussi, and D. Marinov, "An empirical analysis of flaky tests," in Proc. FSE'14, 2014.
- Z. Dong, A. Tiwari, X. L. Yu, and A. Roychoudhury, "Concurrency-related flaky test detection in android apps," ArXiv, vol. abs/2005.10762, 2020.
- GitHub, "GitHub Actions," https://github.com/features/actions.
- C. King and A. Waterland, "stress-ng," https://manpages.ubuntu.com/manpages/artful/man1/stress-ng.1.html#description, 2020, [Online; accessed April-2020].