Write a Blog >>
ESEC/FSE 2022
Mon 14 - Fri 18 November 2022 Singapore

Natural language (NL) documentation is the bridge between software managers and testers, and NL test cases are prevalent in system-level testing and other quality assurance activities.
Due to reasons such as requirements redundancy, parallel testing, tester turn-over within long evolving history, there are inevitably lots of redundant test cases, which significantly increase the cost.
Previous redundancy detection approaches typically treat the textual descriptions as a whole to compare their similarity and suffer from low precision.
Our observation reveals that a test case can have explicit test-oriented entities, such as tested function Components, Constraints, etc; and there are also specific relations between these entities. This inspires us with a potential opportunity for accurate redundancy detection.
In this paper, we first define five test-oriented entity categories and four associated relation categories, and re-formulate the NL test case redundancy detection problem as the comparison of detailed testing content guided by the test-oriented entities and relations.
Following that, we propose Tscope, a fine-grained approach for redundant NL test case detection by dissecting test cases into atomic test tuple(s) with the entities restricted by associated relations.
To serve as the test case dissection, Tscope designs a context-aware model for the automatic entity and relation extraction.
Evaluation on 3,467 test cases from ten projects shows Tscope could achieve 91.8% precision, 74.8% recall and 82.4% F1, significantly outperforming state-of-the-art approaches and commonly-used classifiers.
This new formulation of the NL test case redundant detection problem can motivate the follow-up studies in further improving this task and other related tasks involving NL descriptions.