UTANGO: Untangling Commits with Context-Aware, Graph-Based, Code Change Clustering Learning Model (ESEC/FSE 2022 - Research Papers)

Mon 14 - Fri 18 November 2022 Singapore

Who

Yi Li, Shaohua Wang, Tien N. Nguyen

Track

ESEC/FSE 2022 Research Papers

Time Zone

The program is currently displayed in (GMT+08:00) Beijing, Chongqing, Hong Kong, Urumqi.

The GMT offsets shown reflect the offsets at the moment of the conference.

When

Mon 14 Nov 2022 16:45 - 17:00 at SRC Auditorium 2 - Software Evolution Chair(s): Miryung Kim

Abstract

During software evolution, developers make many changes and commit them to repositories. Unfortunately, many commits tangle changes that serve different purposes, which hampers program comprehension and reduces separation of concerns. Automated approaches with deterministic solutions have been proposed to untangle commits. However, specifying effective clustering criteria on the changes in a commit is challenging for those approaches. In this work, we present UTango, a machine learning (ML)-based approach that learns to untangle the changes in a commit. We develop a novel code change clustering learning model that learns to cluster the code changes, represented by embeddings, into groups with different concerns. We adapt the agglomerative clustering algorithm into a supervised-learning clustering model that operates on the learned code change embeddings via trainable parameters and a loss function comparing the predicted clusters with the correct ones during training. To facilitate our clustering learning model, we develop a context-aware, graph-based, code change representation learning model that leverages a Label, Graph-based Convolution Network to produce contextualized embeddings for code changes, integrating program dependencies and the surrounding contexts of the changes. The contexts and cloned code are also explicitly represented, helping UTango distinguish the concerns. Our empirical evaluation on C# and Java datasets with 1,612 and 14k tangled commits shows that UTango achieves accuracy that is 28.6%–462.5% and 13.3%–100.0% relatively higher than the state-of-the-art commit-untangling approaches for C# and Java, respectively.
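
To make the clustering step in the abstract more concrete, the following is a minimal sketch of plain, unsupervised agglomerative clustering over code change embeddings. It is not UTango's model: the abstract describes a supervised variant with trainable parameters and a training loss, whereas this sketch uses a fixed, hand-picked similarity threshold to decide when to stop merging. The function names (`cosine`, `average_linkage`, `untangle`), the `threshold` parameter, the choice of average linkage with cosine similarity, and the toy two-dimensional embeddings are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average_linkage(c1, c2, emb):
    """Mean pairwise similarity between the members of two clusters."""
    sims = [cosine(emb[i], emb[j]) for i in c1 for j in c2]
    return sum(sims) / len(sims)


def untangle(emb, threshold=0.8):
    """Group change indices bottom-up into clusters (one per concern).

    `threshold` is a hypothetical, hand-picked stopping criterion; a learned
    model would replace it with trained parameters.
    """
    # Start with every code change in its own cluster.
    clusters = [[i] for i in range(len(emb))]
    while len(clusters) > 1:
        best_sim, best_pair = -2.0, None
        # Find the most similar pair of clusters.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = average_linkage(clusters[a], clusters[b], emb)
                if sim > best_sim:
                    best_sim, best_pair = sim, (a, b)
        # Stop once even the closest pair is too dissimilar to merge.
        if best_sim < threshold:
            break
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy embeddings for four changes in one commit (made-up values):
# changes 0 and 1 point one way, changes 2 and 3 another.
toy_embeddings = [
    [1.0, 0.1],
    [0.9, 0.2],
    [0.1, 1.0],
    [0.2, 0.9],
]

clusters = untangle(toy_embeddings)
print(clusters)  # [[0, 1], [2, 3]]
```

In this toy commit, the four changes are split into two groups, which corresponds to untangling the commit into two concerns. Lowering `threshold` merges more aggressively, and raising it leaves more changes in their own clusters.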

DOI

https://doi.org/10.1145/3540250.3549171

Yi Li

New Jersey Institute of Technology

United States

Shaohua Wang

New Jersey Institute of Technology

United States

Tien N. Nguyen

University of Texas at Dallas

United States

Session Program

Mon 14 Nov
Displayed time zone: Beijing, Chongqing, Hong Kong, Urumqi

16:00 - 17:30	Software Evolution (Demonstrations / Research Papers / Industry Paper) at SRC Auditorium 2. Chair(s): Miryung Kim (University of California at Los Angeles, USA)

16:00 15m Research paper		Accurate Method and Variable Tracking in Commit History. Research Papers. Mehran Jodavi (Concordia University), Nikolaos Tsantalis (Concordia University)
16:15 15m Research paper		Classifying Edits to Variability in Source Code. Research Papers. Paul Maximilian Bittner (University of Ulm), Christof Tinnes (Siemens), Alexander Schultheiß (Humboldt University of Berlin), Sören Viegener (University of Ulm), Timo Kehrer (University of Bern), Thomas Thüm (University of Ulm)
16:30 15m Talk		The Evolution of Type Annotations in Python: An Empirical Study (Distinguished Paper Award). Research Papers. Luca Di Grazia (University of Stuttgart), Michael Pradel (University of Stuttgart)
16:45 15m Talk		UTANGO: Untangling Commits with Context-Aware, Graph-Based, Code Change Clustering Learning Model. Research Papers. Yi Li (New Jersey Institute of Technology), Shaohua Wang (New Jersey Institute of Technology), Tien N. Nguyen (University of Texas at Dallas)
17:00 15m Talk		Sometimes You Have to Treat the Symptoms: Tackling Model Drift in an Industrial Clone-and-Own Software Product Line. Industry Paper. Christof Tinnes (Siemens), Wolfgang Rössler (Siemens Mobility), Uwe Hohenstein (Siemens), Torsten Kühn (Siemens Mobility), Andreas Biesdorf (Siemens), Sven Apel (Saarland University)
17:15 7m Talk		Context Aware Code Recommendation in Intellij IDEA. Demonstrations. Shamsa Abid (Lahore University of Management Sciences), Hamid Abdul Basit (Prince Sultan University), Shafay Shamail (LUMS, DHA, Lahore)
17:23 7m Talk		Python-by-Contract Dataset. Demonstrations. Jiyang Zhang (University of Texas at Austin), Marko Ristin (ZHAW School of Engineering), Phillip Schanely, Hans Wernher van de Venn (Zurich University of Applied Sciences (ZHAW)), Milos Gligoric (University of Texas at Austin)