Investigating the Understandability of Review Comments on Code Change Requests
Code review is a widely adopted quality assurance practice in software engineering, where expert reviewers assess developers' code changes before merging. While prior studies have explored the quality and usefulness of review comments, they often overlook the clarity and understandability of Code Change Request (CCR) comments. Unclear CCR comments can be difficult for developers to interpret and address. This study therefore investigates the prevalence and impact of confusing or unclear CCR comments and proposes two approaches to enhance CCR communication during code review. Using a dataset of 182 open-source GitHub projects with over 55K pull requests and 466K CCR comments, we analyzed how often unclear comments occur and their effects on the review process. Our classifier, built from manually annotated developer replies to CCR comments, revealed that 24% of comments led to author confusion. Statistical analysis shows that unclear CCR comments significantly increase resolution time and discussion length, and that pull requests with clear CCR comments are more likely to be addressed and merged. A manual analysis of 400 confusing CCR comments identified six key characteristics, with lack of clarity and unclear rationale being the most common. Our first approach, the confusion classifier, flags authors' confusion so that reviewers can clarify ambiguities promptly (recall of 0.96), while the second enables reviewers to evaluate the clarity and understandability of their CCR comments (recall of 0.93). This pioneering study further provides recommendations for enhancing CCR comments and offers a foundation for future research to streamline the review process.