Scented since the beginning: On the diffuseness of test smells in automatically generated test code

Bibliographic Details
Published in	The Journal of systems and software Vol. 156; pp. 312 - 327
Main Authors	Grano, Giovanni, Palomba, Fabio, Di Nucci, Dario, De Lucia, Andrea, Gall, Harald C.
Format	Journal Article
Language	English
Published	Elsevier Inc 01.10.2019
Subjects	Empirical studies; Software quality; Test case generation; Test smells

More Information
Summary:	•A large empirical study aimed at understanding the relationship between test smells and automatically generated test cases.
•Test smells are widely diffused in automatically generated test cases.
•Assertion Roulette and Eager Test are the most widespread test smells, and they often co-occur.
•The presence of test smells is partially influenced by production code quality.
•Test suite size often correlates with the generation of smelly test cases.
Software testing is a key software engineering practice for ensuring source code quality and reliability. To support developers in this activity and reduce testing effort, several automated unit test generation tools have been proposed. Most of these approaches aim primarily at covering as many branches as possible. While these approaches perform well, little is known about the maintainability of the test code they produce, i.e., whether the generated tests have good code quality and whether they introduce issues that threaten their effectiveness. To bridge this gap, in this paper we study to what extent existing automated test case generation tools produce potentially problematic test code. We consider seven test smells, i.e., suboptimal design choices applied by programmers during the development of test cases, as a measure of the code quality of the generated tests, and evaluate their diffuseness in the unit test classes automatically generated by three state-of-the-art tools: Randoop, JTExpert, and Evosuite. Moreover, we investigate whether characteristics of test and production code influence the generation of smelly tests. Our study shows that all the considered tools tend to generate a high quantity of two specific test smell types, i.e., Assertion Roulette and Eager Test, which previous studies showed to negatively impact the reliability of production code. We also discover that test size is correlated with the generation of smelly tests. Based on our findings, we argue that more effective automated generation algorithms that explicitly take test code quality into account should be further investigated and devised.
ISSN:	0164-1212 1873-1228
DOI:	10.1016/j.jss.2019.07.016