Automated Patch Extraction via Syntax- and Semantics-Aware Delta Debugging on Source Code Changes (ESEC/FSE 2018)
automated debugging, software regression, tree differencing
Delta debugging (DD) is an approach to automating debugging activities based on systematic testing. DD algorithms find the cause of a regression in a program by minimizing the set of changes applied between a working version and a faulty version of the program. However, it remains an open problem to minimize a huge set of changes while avoiding invalid subsets that do not result in testable programs, especially when no software configuration management system is available. In this paper, we propose a rule-based approach to the syntactic and semantic decomposition of changes into independent components, which facilitates DD on source code changes and hence allows patches to be extracted automatically. To analyze changes, we use tree differencing on abstract syntax trees instead of conventional differencing on plain text. We have developed an experimental implementation for Java programs and applied it to 194 bug fixes from Defects4J and 8 real-life regression bugs from 6 open source Java projects. Compared to a DD tool based on plain text differencing, our implementation extracted patches that were 50% smaller at the cost of 5% more test executions on the former dataset, and 73% smaller at the cost of 40% more test executions on the latter, both on average.
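To make the idea of minimizing a change set concrete, here is a minimal Python sketch of the classic ddmin procedure on which DD tools are generally built. It is not the rule-based decomposition proposed in the paper, and all names in it (`ddmin`, `split`, `toy_test`, the change labels `c1` to `c8`) are hypothetical. The toy test treats a subset containing `c5` without `c4` as unresolved, standing in for an invalid subset that does not yield a testable program.

```python
from typing import Callable, List, Sequence

PASS, FAIL, UNRESOLVED = "pass", "fail", "unresolved"


def split(items: List[str], n: int) -> List[List[str]]:
    """Partition items into n contiguous chunks of near-equal size."""
    size, rem = divmod(len(items), n)
    chunks, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < rem else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def ddmin(changes: Sequence[str], test: Callable[[List[str]], str]) -> List[str]:
    """Return a 1-minimal subset of changes for which test still reports FAIL."""
    current = list(changes)
    if test(current) != FAIL:
        raise ValueError("the full change set must reproduce the failure")
    n = 2  # current granularity (number of chunks)
    while len(current) >= 2:
        subsets = split(current, n)
        reduced = False
        # Step 1: try to reduce to a single chunk.
        for s in subsets:
            if test(s) == FAIL:
                current, n, reduced = s, 2, True
                break
        # Step 2: otherwise try to reduce to the complement of a chunk.
        if not reduced:
            for i in range(n):
                rest = [c for j, s in enumerate(subsets) if j != i for c in s]
                if test(rest) == FAIL:
                    current, n, reduced = rest, max(n - 1, 2), True
                    break
        # Step 3: otherwise refine the granularity, or stop when it is maximal.
        if not reduced:
            if n >= len(current):
                break
            n = min(len(current), n * 2)
    return current


CHANGES = [f"c{i}" for i in range(1, 9)]


def toy_test(subset: List[str]) -> str:
    """Hypothetical oracle: c3 and c6 together cause the regression."""
    applied = set(subset)
    if "c5" in applied and "c4" not in applied:
        return UNRESOLVED  # stands in for a subset that does not build
    return FAIL if {"c3", "c6"} <= applied else PASS


print(ddmin(CHANGES, toy_test))  # → ['c3', 'c6']
```

In this toy run, every subset that applies `c5` without `c4` costs a test execution but yields no information, which illustrates why avoiding invalid subsets matters when the change set is large.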
Masatomo Hashimoto, Akira Mori, Tomonori Izumida. Automated Patch Extraction via Syntax- and Semantics-Aware Delta Debugging on Source Code Changes. In Proceedings of the 26th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2018), pp. 598-609, 2018. [ACM DL] [Appendix] [Slides] [Docker Image]