A manual categorization of new quality issues on automatically-generated tests

Abstract
Several studies have analyzed the quality of automatically generated test cases using test smells as the main quality attribute. However, recent work reported that generated tests may suffer from a number of quality issues not considered previously, suggesting that not all test smells have been identified yet. Little is known about these issues and their frequency within generated tests. In this paper, we report on a manual analysis of an external dataset consisting of 2,340 automatically generated tests. This analysis aimed at detecting new quality issues not covered by previously recognized test smells. We used thematic analysis to group and categorize the new quality issues found. As a result, we propose a taxonomy of 13 new quality issues grouped into four categories. We also report on the frequency of these new quality issues within the dataset and present eight recommendations that test generators may consider to improve the quality and usefulness of the automatically generated tests. As an additional contribution, our results suggest that (i) test quality should be evaluated not only on the tests themselves but also on the tested code; and (ii) automatically generated tests present flaws that are unlikely to be found in manually created tests and thus require specific quality checking tools.