An Empirical Evaluation of Assertions as Oracles

Bibliographic Details
Published in: 2011 Fourth IEEE International Conference on Software Testing, Verification and Validation, pp. 110-119
Main Authors: Shrestha, Kavir; Rutherford, Matthew J
Format: Conference Proceeding
Language: English
Published: IEEE, 01.03.2011
ISBN: 9781612841748, 1612841740
ISSN: 2159-4848
DOI: 10.1109/ICST.2011.50

Summary: In software testing, an oracle determines whether a test case passes or fails by comparing output from the program under test with the expected output. Since the identification of faults through testing requires both that the bug is exercised and that the resulting failure is recognized, it follows that oracles are critical to the efficacy of the testing process. Despite this, there are few rigorous empirical studies of the impact of oracles on effectiveness. In this paper, we report the results of one such experiment in which we exercise seven core Java classes and two sample programs with branch-adequate, input-only (i.e., no oracle) test suites and collect the failures observed by different oracles. For faults, we use synthetic bugs created by the muJava mutation testing tool. In this study, we evaluate two oracles: (1) the implicit oracle (or "null oracle") provided by the runtime system, and (2) runtime assertions embedded in the implementation (by others) using the Java Modeling Language. The null oracle establishes a baseline measurement of the potential benefit of rigorous oracles, while the assertions represent a more rigorous approach that is sometimes used in practice. The results of our experiments are interesting. First, on a per-method basis, we observe that the null oracle catches less than 11% of the faults, leaving more than 89% uncaught. Second, we observe that the runtime assertions in our subjects are effective at catching about 53% of the faults not caught by the null oracle. Finally, by analyzing the data using data mining techniques, we observe that simple, code-based metrics can be used to predict which methods are amenable to the use of assertion-based oracles with a high degree of accuracy.
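The difference between the two oracles in the summary can be illustrated with a minimal sketch. It is written in Python rather than Java/JML and is not taken from the paper: the function `max_of`, the `compare` and `check` parameters, the hand-written mutant, and the helper `observes_failure` are all hypothetical names chosen for illustration. The mutant mimics a relational-operator replacement of the kind a mutation tool might generate. Under the null oracle a test fails only when the runtime raises an exception; under the assertion oracle an embedded postcondition can also signal failure.

```python
def max_of(values, compare=lambda a, b: a > b, check=False):
    """Return the largest element of a non-empty list.

    `compare` is a hypothetical hook for injecting a mutant;
    `check` switches the embedded postcondition on or off.
    """
    best = values[0]
    for v in values[1:]:
        if compare(v, best):
            best = v
    if check:
        # Embedded postcondition, standing in for a JML-style `ensures` clause.
        if not all(best >= v for v in values):
            raise AssertionError("postcondition violated: result is not the maximum")
    return best


def mutant_compare(a, b):
    # Hypothetical mutant: the relational operator `>` is replaced by `<`.
    return a < b


def observes_failure(run):
    """Return True if executing `run` raises any exception."""
    try:
        run()
    except Exception:
        return True
    return False


inputs = [3, 7, 2]

# Null oracle: input-only test, no postcondition; only a runtime exception counts.
null_oracle_kills = observes_failure(lambda: max_of(inputs, mutant_compare, check=False))

# Assertion oracle: same input, but the embedded postcondition is enabled.
assertion_oracle_kills = observes_failure(lambda: max_of(inputs, mutant_compare, check=True))

print(null_oracle_kills, assertion_oracle_kills)  # prints: False True
```

In this toy case the mutant returns a wrong value without crashing, so the null oracle misses it while the postcondition catches it. Whether a real fault behaves this way depends on the method and on how strong its assertions are.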