MultIPAs: Applying Program Transformations to Introductory Programming Assignments for Data Augmentation
Over the last few years, there has been growing interest in automated program repair applied to fixing introductory programming assignments (IPAs). However, publicly available IPA datasets tend to be small and lack useful annotations about the defects of each program. Small datasets are of limited use to program repair tools that rely on machine learning models. Furthermore, when a large and diverse set of correct implementations is available, a repair tool can pick one that is close to a given incorrect program and compute a smaller set of repairs, rather than always using the same set of correct implementations for a given IPA. For these reasons, there has been an increasing demand for augmenting IPA benchmarks.
This paper presents MultIPAs, a program transformation tool that can augment IPA benchmarks by: (1) applying six syntactic mutations that preserve the program’s semantics and (2) applying three semantic mutilations that introduce faults in the IPAs. Moreover, we demonstrate the usefulness of MultIPAs by augmenting two publicly available benchmarks of C programs with millions of programs, and also by generating an extensive benchmark of semantically incorrect programs.
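To make the two kinds of transformation concrete, here is a minimal sketch in Python that rewrites a single comparison inside a C snippet. It is not taken from MultIPAs: the helper names `mirror_comparison` and `break_comparison` are hypothetical, the regex only handles comparisons between two plain identifiers or literals, and the abstract does not say which six mutations or three mutilations the tool implements. The first helper shows a semantics-preserving rewrite; the second shows a fault-introducing one.

```python
import re

# Toy pattern: a comparison between two simple operands, e.g. "i < n".
# Assumption: operands are plain identifiers or integer literals.
_CMP = re.compile(r"(\w+)\s*(<=|>=|==|!=|<|>)\s*(\w+)")

# Operator to use after the two operands are swapped.
_MIRROR = {"<": ">", ">": "<", "<=": ">=", ">=": "<=", "==": "==", "!=": "!="}

# A different operator that changes the meaning of the comparison.
_WRONG = {"<": "<=", "<=": "<", ">": ">=", ">=": ">", "==": "!=", "!=": "=="}


def mirror_comparison(c_code: str) -> str:
    """Semantics-preserving: swap the operands and flip the operator
    of the first comparison found in the snippet."""
    return _CMP.sub(
        lambda m: f"{m.group(3)} {_MIRROR[m.group(2)]} {m.group(1)}",
        c_code,
        count=1,
    )


def break_comparison(c_code: str) -> str:
    """Fault-introducing: keep the operands but replace the operator
    of the first comparison found with a different one."""
    return _CMP.sub(
        lambda m: f"{m.group(1)} {_WRONG[m.group(2)]} {m.group(3)}",
        c_code,
        count=1,
    )


if __name__ == "__main__":
    loop = "for (i = 0; i < n; i++)"
    print(mirror_comparison(loop))  # for (i = 0; n > i; i++)
    print(break_comparison(loop))   # for (i = 0; i <= n; i++)
```

A real tool would operate on the program's abstract syntax tree rather than on text, so that operators inside strings, comments, or more complex expressions are handled correctly; the regex here only keeps the sketch short.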
Mon 14 Nov (displayed time zone: Beijing, Chongqing, Hong Kong, Urumqi)
16:00 - 17:30 | Human/Computer Interaction | Research Papers / Demonstrations at SRC LT 51 | Chair(s): Saikat Chakraborty (Microsoft Research)
16:00 | 15m | Talk | How to Formulate Specific How-To Questions in Software Development? | Research Papers | Mingwei Liu (Fudan University), Xin Peng (Fudan University), Andrian Marcus (University of Texas at Dallas), Christoph Treude (University of Melbourne), Jiazhan Xie (Fudan University), Huanjun Xu (Fudan University), Yanjun Yang (Fudan University)
16:15 | 15m | Talk | Asynchronous Technical Interviews: Reducing the Effect of Supervised Think-Aloud on Communication Ability (Distinguished Paper Award) | Research Papers
16:30 | 15m | Talk | Pair Programming Conversations with Agents vs. Developers: Challenges and Opportunities for SE Community | Research Papers | Peter Robe (University of Tulsa), Sandeep Kuttal (University of Tulsa), Jake AuBuchon (University of Tulsa), Jacob Hart (University of Tulsa)
16:45 | 15m | Talk | Toward Interactive Bug Reporting for (Android App) End-Users | Research Papers | Yang Song (College of William and Mary), Junayed Mahmud (George Mason University), Ying Zhou (University of Texas at Dallas), Oscar Chaparro (College of William and Mary), Kevin Moran (George Mason University), Andrian Marcus (University of Texas at Dallas), Denys Poshyvanyk (College of William and Mary)
17:00 | 7m | Talk | MultIPAs: Applying Program Transformations to Introductory Programming Assignments for Data Augmentation | Demonstrations | Pedro Orvalho (INESC-ID, Instituto Superior Técnico, University of Lisbon), Mikoláš Janota (Czech Technical University in Prague), Vasco Manquinho (INESC-ID; Universidade de Lisboa)
17:08 | 7m | Talk | PolyFax: A Toolkit for Characterizing Multi-Language Software | Demonstrations | Wen Li (Washington State University), Li Li (Monash University), Haipeng Cai (Washington State University)