Write a Blog >>
ESEC/FSE 2022
Mon 14 - Fri 18 November 2022 Singapore

Due to the openness of crowdsourced testing, mobile app crowdsourced testing has been subject to duplicate reports. The previous research methods extract the textual features of the crowdsourced test reports, combine with shallow image analysis, and perform unsupervised clustering on the crowdsourced test reports to clarify the duplication of crowdsourced test reports and solve the problem. However, these methods ignore the semantic connection between textual descriptions and screenshots, making the clustering results unsatisfactory and the deduplication effect less accurate.

This paper proposes a semi-supervised clustering tool for crowdsourced test reports with deep image understanding, namely SemCluster, which makes the most of the semantic connection between textual descriptions and screenshots by constructing semantic binding rules and performing semi-supervised clustering. SemCluster improves six metrics of clustering results in the experiment compared to the state-of-the-art method, which verifies that SemCluster has achieved a good deduplication effect. The demo can be found at: https://sites.google.com/view/semcluster-demo.