MSR 2025
Mon 28 - Tue 29 April 2025 Ottawa, Ontario, Canada
co-located with ICSE 2025
Mon 28 Apr 2025 15:05 - 15:10 at 214 - AI for SE (1) Chair(s): Diego Elias Costa

Software repositories contain a wealth of data about the software development process, such as source code, documentation, issue tracking, and commit histories. However, accessing and extracting meaningful insights from these data is time-consuming and requires technical expertise, posing challenges for software practitioners, especially non-technical stakeholders like project managers. Existing solutions, such as software engineering chatbots leveraging LLMs, have demonstrated significant limitations in retrieving relevant data to answer user questions. In this paper, we introduce RepoChat, a web-based tool designed to answer repository-related questions by synergizing LLMs with knowledge graphs. RepoChat operates in two steps: (1) the Data Ingestion step, where it collects and constructs a knowledge graph from repository metadata, such as commits, issues, files, and users; and (2) the Interaction step, where it takes the user's natural language question, translates it into graph queries using an LLM, executes these queries against the knowledge graph, and generates a user-friendly response to the question using the query results as context. We evaluate RepoChat by conducting a user study in which participants asked a series of repository-related questions representing common developer intents. RepoChat achieved an accuracy of 90%, correctly answering 36 out of 40 questions, demonstrating its effectiveness in accurately retrieving relevant information to answer users' questions. RepoChat is available at https://repochattool.streamlit.app, and its source code is accessible on GitHub at https://github.com/sabedu/repositoryChat.
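The two steps described in the abstract can be illustrated with a toy sketch. This is not RepoChat's implementation: the in-memory `KnowledgeGraph` class, the `AUTHORED` and `MODIFIED` relation names, the structured query format, and the keyword-based `translate_question` function (a stand-in for the LLM translation step) are all assumptions made for illustration only.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Tiny in-memory graph; a hypothetical stand-in for a real graph store."""

    def __init__(self):
        self.nodes = {}                 # node_id -> properties (including "kind")
        self.edges = defaultdict(list)  # (source_id, relation) -> [target_id, ...]

    def add_node(self, node_id, kind, **props):
        self.nodes[node_id] = {"kind": kind, **props}

    def add_edge(self, source, relation, target):
        self.edges[(source, relation)].append(target)

    def neighbors(self, source, relation):
        return list(self.edges.get((source, relation), []))


def ingest(commits):
    """Step 1 (Data Ingestion): build a graph from commit metadata."""
    graph = KnowledgeGraph()
    for commit in commits:
        graph.add_node(commit["sha"], "Commit", message=commit["message"])
        graph.add_node(commit["author"], "User")
        graph.add_edge(commit["author"], "AUTHORED", commit["sha"])
        for path in commit["files"]:
            graph.add_node(path, "File")
            graph.add_edge(commit["sha"], "MODIFIED", path)
    return graph


def translate_question(question, known_users):
    """Stand-in for the LLM: map a question to a structured graph query.

    Assumption: only one question shape is supported in this sketch.
    """
    lowered = question.lower()
    if "file" in lowered:
        for user in known_users:
            if user.lower() in lowered:
                return {"op": "files_modified_by", "user": user}
    raise ValueError(f"Cannot translate question: {question!r}")


def execute(graph, query):
    """Run the structured query against the graph."""
    if query["op"] == "files_modified_by":
        files = set()
        for sha in graph.neighbors(query["user"], "AUTHORED"):
            files.update(graph.neighbors(sha, "MODIFIED"))
        return sorted(files)
    raise ValueError(f"Unsupported query: {query!r}")


def answer(graph, question):
    """Step 2 (Interaction): translate, execute, and phrase a response."""
    users = [n for n, p in graph.nodes.items() if p["kind"] == "User"]
    query = translate_question(question, users)
    results = execute(graph, query)
    return f"{query['user']} modified: {', '.join(results)}"
```

Calling `answer(ingest(commits), "Which files did alice modify?")` on a list of commit dictionaries with `sha`, `author`, `message`, and `files` keys walks the graph from the user through their commits to the touched files. In a real system, the translation and response-phrasing steps would be handled by an LLM rather than by keyword matching and string formatting.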

Mon 28 Apr

Displayed time zone: Eastern Time (US & Canada)

14:00 - 15:30
AI for SE (1) Technical Papers / Data and Tool Showcase Track / Registered Reports at 214
Chair(s): Diego Elias Costa Concordia University, Canada
14:00
10m
Talk
Combining Large Language Models with Static Analyzers for Code Review Generation
Technical Papers
Imen Jaoua DIRO, Université de Montréal, Oussama Ben Sghaier DIRO, Université de Montréal, Houari Sahraoui DIRO, Université de Montréal
Pre-print
14:10
10m
Talk
Harnessing Large Language Models for Curated Code Reviews
Technical Papers
Oussama Ben Sghaier DIRO, Université de Montréal, Martin Weyssow Singapore Management University, Houari Sahraoui DIRO, Université de Montréal
Pre-print
14:20
10m
Talk
SMATCH-M-LLM: Semantic Similarity in Metamodel Matching With Large Language Models
Technical Papers
Nafisa Ahmed Polytechnique Montreal, Hin Chi Kwok Hong Kong Polytechnic University, Mohammad Hamdaqa Polytechnique Montréal, Wesley Assunção North Carolina State University
14:30
10m
Talk
How Effective are LLMs for Data Science Coding? A Controlled Experiment
Technical Papers
Nathalia Nascimento Pennsylvania State University, Everton Guimaraes Pennsylvania State University, USA, Sai Sanjna Chintakunta Pennsylvania State University, Santhosh AB Pennsylvania State University
Pre-print
14:40
10m
Talk
Do LLMs Provide Links to Code Similar to what they Generate? A Study with Gemini and Bing CoPilot
Technical Papers
Daniele Bifolco University of Sannio, Pietro Cassieri University of Salerno, Giuseppe Scanniello University of Salerno, Massimiliano Di Penta University of Sannio, Italy, Fiorella Zampetti University of Sannio, Italy
Pre-print
14:50
10m
Talk
Too Noisy To Learn: Enhancing Data Quality for Code Review Comment Generation
Technical Papers
Chunhua Liu The University of Melbourne, Hong Yi Lin The University of Melbourne, Patanamon Thongtanunam University of Melbourne
15:00
5m
Talk
Should Code Models Learn Pedagogically? A Preliminary Evaluation of Curriculum Learning for Real-World Software Engineering Tasks
Technical Papers
Kyi Shin Khant The University of Melbourne, Hong Yi Lin The University of Melbourne, Patanamon Thongtanunam University of Melbourne
15:05
5m
Talk
RepoChat: An LLM-Powered Chatbot for GitHub Repository Question-Answering
Data and Tool Showcase Track
Samuel Abedu Concordia University, Laurine Menneron CESI Graduate School of Engineering, SayedHassan Khatoonabadi Concordia University, Emad Shihab Concordia University
15:10
5m
Talk
How do Copilot Suggestions Impact Developers' Frustration and Productivity?
Registered Reports
Emanuela Guglielmi University of Molise, Venera Arnaoudova Washington State University, Gabriele Bavota Software Institute @ Università della Svizzera Italiana, Rocco Oliveto University of Molise, Simone Scalabrino University of Molise
15:15
5m
Talk
Exploring the Lifecycle and Maintenance Practices of Pre-Trained Models in Open-Source Software Repositories
Registered Reports
Matin Koohjani Concordia University, Diego Elias Costa Concordia University, Canada