MSR 2025
Mon 28 - Tue 29 April 2025 Ottawa, Ontario, Canada
co-located with ICSE 2025

This program is tentative and subject to change.

Mon 28 Apr

Displayed time zone: Eastern Time (US & Canada)

09:00 - 10:30
Plenary: Opening + Joint MSR + ICPC Keynote
Program / Keynotes at 214
09:00
30m
Day opening
Official Opening
Program

09:30
60m
Keynote
Mining BOMs for Improving Supply Chain Efficiency & Resilience
Keynotes
Kate Stewart Linux Foundation
11:00 - 12:30
Defects, bugs, and issues
Data and Tool Showcase Track / Technical Papers at 214
Chair(s): Mohammad Hamdaqa Polytechnique Montréal, Minhaz Zibran Idaho State University
11:00
10m
Talk
Learning from Mistakes: Understanding Ad-hoc Logs through Analyzing Accidental Commits
Technical Papers
Yi-Hung Chou University of California, Irvine, Yiyang Min Amazon, April Wang ETH Zürich, James Jones University of California at Irvine
Pre-print
11:10
10m
Talk
On the calibration of Just-in-time Defect Prediction
Technical Papers
Xhulja Shahini paluno - University of Duisburg-Essen, Jone Bartel University of Duisburg-Essen, paluno, Klaus Pohl University of Duisburg-Essen, paluno
11:20
10m
Talk
An Empirical Study on Leveraging Images in Automated Bug Report Reproduction
Technical Papers
Dingbang Wang University of Connecticut, Zhaoxu Zhang University of Southern California, Sidong Feng Monash University, William G.J. Halfond University of Southern California, Tingting Yu University of Connecticut
11:30
10m
Talk
It’s About Time: An Empirical Study of Date and Time Bugs in Open-Source Python Software
Technical Papers
Shrey Tiwari Carnegie Mellon University, Serena Chen University of California, San Diego, Alexander Joukov Stony Brook University, Peter Vandervelde University of California, Santa Barbara, Ao Li Carnegie Mellon University, Rohan Padhye Carnegie Mellon University
11:40
10m
Talk
Enhancing Just-In-Time Defect Prediction Models with Developer-Centric Features
Technical Papers
Emanuela Guglielmi University of Molise, Andrea D'Aguanno University of Molise, Rocco Oliveto University of Molise, Simone Scalabrino University of Molise
11:50
10m
Talk
Revisiting Defects4J for Fault Localization in Diverse Development Scenarios
Technical Papers
Md Nakhla Rafi Concordia University, An Ran Chen University of Alberta, Tse-Hsun (Peter) Chen Concordia University, Shaohua Wang Central University of Finance and Economics
12:00
5m
Talk
Mining Bug Repositories for Multi-Fault Programs
Data and Tool Showcase Track
Dylan Callaghan Stellenbosch University, Bernd Fischer Stellenbosch University
12:05
5m
Talk
HaPy-Bug - Human Annotated Python Bug Resolution Dataset
Data and Tool Showcase Track
Piotr Przymus Nicolaus Copernicus University in Toruń, Poland, Mikołaj Fejzer Nicolaus Copernicus University in Toruń, Jakub Narębski Nicolaus Copernicus University in Toruń, Radosław Woźniak Nicolaus Copernicus University in Toruń, Łukasz Halada University of Wrocław, Poland, Aleksander Kazecki Nicolaus Copernicus University in Toruń, Mykhailo Molchanov Igor Sikorsky Kyiv Polytechnic Institute, Ukraine, Krzysztof Stencel University of Warsaw
12:10
5m
Talk
SPRINT: An Assistant for Issue Report Management
Data and Tool Showcase Track
Ahmed Adnan, Antu Saha William & Mary, Oscar Chaparro William & Mary
Pre-print
11:00 - 12:30
Security and legal aspects
Industry Track / Data and Tool Showcase Track / Technical Papers at 215
Chair(s): Mohammad Ghafari TU Clausthal, Mohammad Hamdaqa Polytechnique Montréal
11:00
10m
Talk
Wolves in the Repository: A Software Engineering Analysis of the XZ Utils Supply Chain Attack
Technical Papers
Piotr Przymus Nicolaus Copernicus University in Toruń, Poland, Thomas Durieux TU Delft
11:10
10m
Talk
Software Composition Analysis and Supply Chain Security in Apache Projects: an Empirical Study
Technical Papers
Sabato Nocera University of Salerno, Sira Vegas Universidad Politecnica de Madrid, Giuseppe Scanniello University of Salerno, Natalia Juristo Universidad Politecnica de Madrid
Pre-print
11:20
10m
Talk
Good practice versus reality: a landscape analysis of Research Software metadata adoption in European Open Science Clusters
Technical Papers
Anas El Hounsri Universidad Politécnica de Madrid, Daniel Garijo Universidad Politécnica de Madrid
11:30
10m
Talk
Towards Security Commit Message Standardization
Technical Papers
Sofia Reis Instituto Superior Técnico, U. Lisboa & INESC-ID, Rui Abreu INESC-ID; University of Porto, Corina Pasareanu CMU, NASA, KBR
11:40
10m
Talk
From Industrial Practices to Academia: Uncovering the Gap in Vulnerability Research and Practice
Technical Papers
Zhuang Liu, Xing Hu Zhejiang University, Jiayuan Zhou Queen's University, Xin Xia Huawei
11:50
5m
Talk
Patch Me If You Can—Securing the Linux Kernel
Industry Track
Gunnar Kudrjavets Amazon Web Services, USA
Pre-print
11:55
5m
Talk
OSS License Identification at Scale: A Comprehensive Dataset Using World of Code
Data and Tool Showcase Track
Mahmoud Jahanshahi Research Assistant, University of Tennessee Knoxville, David Reid University of Tennessee, Adam McDaniel University of Tennessee Knoxville, Audris Mockus The University of Tennessee
12:00
5m
Talk
SCRUBD: Smart Contracts Reentrancy and Unhandled Exceptions Vulnerability Dataset
Data and Tool Showcase Track
Chavhan Sujeet Yashavant Indian Institute of Technology, Kanpur, Mitrajsinh Chavda Indian Institute of Technology Kanpur, India, Saurabh Kumar Indian Institute of Technology Hyderabad, India, Amey Karkare IIT Kanpur, Angshuman Karmakar Indian Institute of Technology Kanpur, India
Pre-print
12:05
5m
Talk
ICVul: A Well-labeled C/C++ Vulnerability Dataset with Comprehensive Metadata and VCCs
Data and Tool Showcase Track
Chaomeng Lu DistriNet Group-T, KU Leuven, Tianyu Li DistriNet Group-T, KU Leuven, Toon Dehaene KU Leuven, Bert Lagaisse DistriNet Group-T, KU Leuven
12:10
5m
Talk
A Dataset of Software Bill of Materials for Evaluating SBOM Consumption Tools
Data and Tool Showcase Track
Rio Kishimoto Osaka University, Tetsuya Kanda Notre Dame Seishin University, Yuki Manabe The University of Fukuchiyama, Katsuro Inoue Nanzan University, Shi Qiu Toshiba, Yoshiki Higo Osaka University
12:15
5m
Talk
Wild SBOMs: a Large-scale Dataset of Software Bills of Materials from Public Code
Data and Tool Showcase Track
Luis Soeiro LTCI, Télécom Paris, Institut Polytechnique de Paris, Thomas Robert LTCI, Télécom Paris, Institut Polytechnique de Paris, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris
Pre-print
12:20
5m
Talk
MaLAware: Automating the Comprehension of Malicious Software Behaviours using Large Language Models (LLMs)
Data and Tool Showcase Track
BIKASH SAHA Indian Institute of Technology Kanpur, Nanda Rani Indian Institute of Technology Kanpur, Sandeep K. Shukla Indian Institute of Technology Kanpur
13:00 - 13:30
14:00 - 15:30
AI for SE (1)
Technical Papers / Data and Tool Showcase Track at 214
Chair(s): Diego Costa Concordia University, Canada, Maliheh Izadi Delft University of Technology
14:00
10m
Talk
Combining Large Language Models with Static Analyzers for Code Review Generation
Technical Papers
Imen Jaoua DIRO, Université de Montréal, Oussama Ben Sghaier DIRO, Université de Montréal, Houari Sahraoui DIRO, Université de Montréal
Pre-print
14:10
10m
Talk
Harnessing Large Language Models for Curated Code Reviews
Technical Papers
Oussama Ben Sghaier DIRO, Université de Montréal, Martin Weyssow Singapore Management University, Houari Sahraoui DIRO, Université de Montréal
Pre-print
14:20
10m
Talk
SMATCH-M-LLM: Semantic Similarity in Metamodel Matching With Large Language Models
Technical Papers
Nafisa Ahmed Polytechnique Montreal, Hin Chi Kwok Hong Kong Polytechnic University, Mohammad Hamdaqa Polytechnique Montréal, Wesley Assunção North Carolina State University
14:30
10m
Talk
How Effective are LLMs for Data Science Coding? A Controlled Experiment
Technical Papers
Nathalia Nascimento Pennsylvania State University, Everton Guimaraes Pennsylvania State University, USA, Sai Sanjna Chintakunta Pennsylvania State University, Santhosh AB Pennsylvania State University
Pre-print
14:40
10m
Talk
Do LLMs Provide Links to Code Similar to what they Generate? A Study with Gemini and Bing CoPilot
Technical Papers
Daniele Bifolco University of Sannio, Pietro Cassieri University of Salerno, Giuseppe Scanniello University of Salerno, Massimiliano Di Penta University of Sannio, Italy, Fiorella Zampetti University of Sannio, Italy
Pre-print
14:50
10m
Talk
Too Noisy To Learn: Enhancing Data Quality for Code Review Comment Generation
Technical Papers
Chunhua Liu The University of Melbourne, Hong Yi Lin The University of Melbourne, Patanamon Thongtanunam University of Melbourne
15:00
5m
Talk
Should Code Models Learn Pedagogically? A Preliminary Evaluation of Curriculum Learning for Real-World Software Engineering Tasks
Technical Papers
Skylar Kyi Shin Khant The University of Melbourne, Hong Yi Lin The University of Melbourne, Patanamon Thongtanunam University of Melbourne
15:05
5m
Talk
RepoChat: An LLM-Powered Chatbot for GitHub Repository Question-Answering
Data and Tool Showcase Track
Samuel Abedu Concordia University, Laurine Menneron CESI Graduate School of Engineering, SayedHassan Khatoonabadi Concordia University, Emad Shihab Concordia University
14:00 - 15:30
MSR 2025 Mining Challenge
Mining Challenge at 215
Chair(s): Joyce El Haddad Université Paris Dauphine - PSL, Damien Jaime Université Paris Nanterre & LIP6, Pascal Poizat Université Paris Nanterre & LIP6
14:00
4m
Talk
Analyzing Dependency Clusters and Security Risks in the Maven Central Repository
Mining Challenge
George Lake Idaho State University, Minhaz F. Zibran Idaho State University
14:04
4m
Talk
Chasing the Clock: How Fast Are Vulnerabilities Fixed in the Maven Ecosystem?
Mining Challenge
Md Fazle Rabbi Idaho State University, Arifa Islam Champa Idaho State University, Rajshakhar Paul Wayne State University, Minhaz F. Zibran Idaho State University
14:08
4m
Talk
Decoding Dependency Risks: A Quantitative Study of Vulnerabilities in the Maven Ecosystem
Mining Challenge
Costain Nachuma Idaho State University, Md Mosharaf Hossan Idaho State University, Asif Kamal Turzo Wayne State University, Minhaz F. Zibran Idaho State University
Pre-print
14:12
4m
Talk
Faster Releases, Fewer Risks: A Study on Maven Artifact Vulnerabilities and Lifecycle Management
Mining Challenge
Md Shafiullah Shafin Rajshahi University of Engineering & Technology (RUET), Md Fazle Rabbi Idaho State University, S. M. Mahedy Hasan Rajshahi University of Engineering & Technology, Minhaz F. Zibran Idaho State University
14:16
4m
Talk
Insights into Dependency Maintenance Trends in the Maven Ecosystem
Mining Challenge
Barisha Chowdhury Rajshahi University of Engineering & Technology, Md Fazle Rabbi Idaho State University, S. M. Mahedy Hasan Rajshahi University of Engineering & Technology, Minhaz F. Zibran Idaho State University
14:20
4m
Talk
Insights into Vulnerability Trends in Maven Artifacts: Recurrence, Popularity, and User Behavior
Mining Challenge
Courtney Bodily Idaho State University, Eric Hill Idaho State University, Andreas Kramer Idaho State University, Leslie Kerby Idaho State University, Minhaz F. Zibran Idaho State University
14:24
4m
Talk
Understanding Software Vulnerabilities in the Maven Ecosystem: Patterns, Timelines, and Risks
Mining Challenge
Md Fazle Rabbi Idaho State University, Rajshakhar Paul Wayne State University, Arifa Islam Champa Idaho State University, Minhaz F. Zibran Idaho State University
Pre-print
14:28
4m
Talk
Dependency Update Adoption Patterns in the Maven Software Ecosystem
Mining Challenge
Baltasar Berretta College of Wooster, Augustus Thomas College of Wooster, Heather Guarnera The College of Wooster
14:32
4m
Talk
Analyzing Vulnerability Overestimation in Software Projects
Mining Challenge
Taha Draoui University of Michigan-Flint, Faten Jebari University of Michigan-Flint, Chawki Ben Slimen University of Michigan-Flint, Munjaap Uppal University of Michigan-Flint, Mohamed Wiem Mkaouer University of Michigan - Flint
14:36
4m
Talk
Dependency Dilemmas: A Comparative Study of Independent and Dependent Artifacts in Maven Ecosystem
Mining Challenge
Mehedi Hasan Shanto Khulna University, Muhammad Asaduzzaman University of Windsor, Manishankar Mondal Khulna University, Shaiful Chowdhury University of Manitoba
14:40
4m
Talk
Cascading Effects: Analyzing Project Failure Impact in the Maven Central Ecosystem
Mining Challenge
Mina Shehata Belmont University, Saidmakhmud Makhkamjonoov Belmont University, Mahad Syed Belmont University, Esteban Parra Belmont University
14:45
4m
Talk
Do Developers Depend on Deprecated Library Versions? A Mining Study of Log4j
Mining Challenge
Haruhiko Yoshioka Nara Institute of Science and Technology, Sila Lertbanjongngam Nara Institute of Science and Technology, Masayuki Inaba Nara Institute of Science and Technology, Youmei Fan Nara Institute of Science and Technology, Takashi Nakano Nara Institute of Science and Technology, Kazumasa Shimari Nara Institute of Science and Technology, Raula Gaikovina Kula Osaka University, Kenichi Matsumoto Nara Institute of Science and Technology
14:49
4m
Talk
Mining for Lags in Updating Critical Security Threats: A Case Study of Log4j Library
Mining Challenge
Hidetake Tanaka Nara Institute of Science and Technology, Kazuma Yamasaki Nara Institute of Science and Technology, Momoka Hirose Nara Institute of Science and Technology, Takashi Nakano Nara Institute of Science and Technology, Youmei Fan Nara Institute of Science and Technology, Kazumasa Shimari Nara Institute of Science and Technology, Raula Gaikovina Kula Osaka University, Kenichi Matsumoto Nara Institute of Science and Technology
14:53
4m
Talk
On the Evolution of Unused Dependencies in Java Project Releases: An Empirical Study
Mining Challenge
Nabhan Suwanachote Nara Institute of Science and Technology, Yagut Shakizada Nara Institute of Science and Technology, Yutaro Kashiwa Nara Institute of Science and Technology, Bin Lin Hangzhou Dianzi University, Hajimu Iida Nara Institute of Science and Technology
14:57
4m
Talk
Out of Sight, Still at Risk: The Lifecycle of Transitive Vulnerabilities in Maven
Mining Challenge
Piotr Przymus Nicolaus Copernicus University in Toruń, Poland, Mikołaj Fejzer Nicolaus Copernicus University in Toruń, Jakub Narębski Nicolaus Copernicus University in Toruń, Krzysztof Rykaczewski Nicolaus Copernicus University in Toruń, Poland, Krzysztof Stencel University of Warsaw
15:01
4m
Talk
Popularity and Innovation in Maven Central
Mining Challenge
Nkiru Ede Victoria University of Wellington, Jens Dietrich Victoria University of Wellington, Ulrich Zülicke Victoria University of Wellington
Pre-print
15:05
4m
Talk
Software Bills of Materials in Maven Central
Mining Challenge
Yogya Gamage Université de Montréal, Nadia Gonzalez Fernandez Université de Montréal, Martin Monperrus KTH Royal Institute of Technology, Benoit Baudry Université de Montréal
15:09
4m
Talk
The Ripple Effect of Vulnerabilities in Maven Central: Prevalence, Propagation, and Mitigation Challenges
Mining Challenge
Ehtisham Ul Haq York University, Song Wang York University, Robert S Allison York University
15:13
4m
Talk
Tracing Vulnerabilities in Maven: A Study of CVE lifecycles and Dependency Networks
Mining Challenge
Corey Yang-Smith University of Calgary, Ahmad Abdellatif University of Calgary
Pre-print
15:17
4m
Talk
Understanding Abandonment and Slowdown Dynamics in the Maven Ecosystem
Mining Challenge
Kazi Amit Hasan Queen's University, Canada, Jerin Yasmin Queen's University, Canada, Huizi Hao Queen's University, Canada, Yuan Tian Queen's University, Kingston, Ontario, Safwat Hassan University of Toronto, Steven Ding
Pre-print
15:21
4m
Talk
Characterizing Packages for Vulnerability Prediction
Mining Challenge
Saviour Owolabi University of Calgary, Francesco Rosati University of Calgary, Ahmad Abdellatif University of Calgary, Lorenzo De Carli University of Calgary, Canada
15:25
4m
Talk
Understanding the Popularity of Packages in Maven Ecosystem
Mining Challenge
Sadman Jashim Sakib University of Windsor, Muhammad Asaduzzaman University of Windsor, Curtis Bright University of Windsor, Cole Morgan University of Windsor
16:00 - 17:30
Software evolution and analysis
Data and Tool Showcase Track / Technical Papers / Industry Track at 214
Chair(s): Mauricio Verano Merino Vrije Universiteit Amsterdam, Minhaz Zibran Idaho State University
16:00
10m
Talk
50 Years of Programming Language Evolution through the Software Heritage looking glass
Technical Papers
Adèle Desmazières Sorbonne Université, Roberto Di Cosmo Inria, France / University of Paris Diderot, France, Valentin Lorentz Inria Foundation
16:10
10m
Talk
It Works (only) on My Machine: A Study on Reproducibility Smells in Ansible Scripts
Technical Papers
Ghazal Sobhani Dalhousie University, Israat Haque Dalhousie University, Tushar Sharma Dalhousie University
Pre-print
16:20
10m
Talk
Are the Majority of Public Computational Notebooks Pathologically Non-Executable?
Technical Papers
Waris Gill Virginia Tech, Muhammad Ali Gulzar Virginia Tech, Tien Nguyen Virginia Tech
Pre-print
16:30
10m
Talk
Understanding Test Deletion in Java Applications
Technical Papers
Suraj Bhatta North Dakota State University, Frank Kendemah North Dakota State University, Ajay Jha North Dakota State University
Pre-print
16:40
10m
Talk
A Public Benchmark of REST APIs
Technical Papers
Alix Decrop University of Namur, Sara Eraso University of Valle, Xavier Devroey University of Namur, Gilles Perrouin Fonds de la Recherche Scientifique - FNRS & University of Namur
Pre-print
16:50
5m
Talk
What Do Contribution Guidelines Say About Software Testing?
Technical Papers
Pre-print
16:55
5m
Talk
Measuring InnerSource Value
Industry Track
17:00
5m
Talk
CoUpJava: A Dataset of Code Upgrade Histories in Open-Source Java Repositories
Data and Tool Showcase Track
Kaihang Jiang University of Waterloo, Bihui Jin University of Waterloo, Pengyu Nie University of Waterloo
17:05
5m
Talk
EvoChain: A Framework for Tracking and Visualizing Smart Contract Evolution
Data and Tool Showcase Track
Ilham Qasse Reykjavik University, Mohammad Hamdaqa Polytechnique Montréal, Björn Þór Jónsson Reykjavik University
17:10
5m
Talk
CoDocBench: A Dataset for Code-Documentation Alignment in Software Maintenance
Data and Tool Showcase Track
Kunal Suresh Pai UC Davis, Prem Devanbu University of California at Davis, Toufique Ahmed IBM Research
17:15
5m
Talk
RefExpo: Unveiling Software Project Structures through Advanced Dependency Graph Extraction
Data and Tool Showcase Track
Vahid Haratian Bilkent University, Pouria Derakhshanfar JetBrains Research, Vladimir Kovalenko JetBrains Research, Eray Tüzün Bilkent University
17:20
5m
Talk
HyperAST: Incrementally Mining Large Source Code Repositories
Data and Tool Showcase Track
Quentin Le Dilavrec TU Delft, Netherlands, Andy Zaidman Delft University of Technology
Pre-print
16:00 - 17:30
LLMs for Code
Tutorials / Technical Papers / Data and Tool Showcase Track at 215
Chair(s): Ali Ouni ETS Montreal, University of Quebec, Houari Sahraoui DIRO, Université de Montréal
16:00
10m
Talk
How Much Do Code Language Models Remember? An Investigation on Data Extraction Attacks before and after Fine-tuning
Technical Papers
Fabio Salerno Delft University of Technology, Ali Al-Kaswan Delft University of Technology, Netherlands, Maliheh Izadi Delft University of Technology
16:10
10m
Talk
Can LLMs Generate Higher Quality Code Than Humans? An Empirical Study
Technical Papers
Mohammad Talal Jamil Lahore University of Management Sciences, Shamsa Abid National University of Computer and Emerging Sciences, Shafay Shamail LUMS, DHA, Lahore
Pre-print
16:20
10m
Talk
Prompt Engineering or Fine-Tuning: An Empirical Assessment of LLMs for Code
Technical Papers
Jiho Shin York University, Clark Tang, Tahmineh Mohati University of Calgary, Maleknaz Nayebi York University, Song Wang York University, Hadi Hemmati York University
16:30
5m
Talk
Drawing Pandas: A Benchmark for LLMs in Generating Plotting Code
Data and Tool Showcase Track
Timur Galimzyanov JetBrains Research, Sergey Titov JetBrains Research, Yaroslav Golubev JetBrains Research, Egor Bogomolov JetBrains Research
Pre-print
16:35
5m
Talk
SnipGen: A Mining Repository Framework for Evaluating LLMs for Code
Data and Tool Showcase Track
16:50
40m
Tutorial
Harmonized Coding with AI: LLMs for Qualitative Analysis in Software Engineering Research
Tutorials
Christoph Treude Singapore Management University, Youmei Fan Nara Institute of Science and Technology, Tao Xiao Kyushu University, Hideaki Hata Shinshu University

Tue 29 Apr

Displayed time zone: Eastern Time (US & Canada)

09:00 - 10:30
Plenary: MIP + FCA
MIP Award / FCA Award / Vision and Reflection at 214
Chair(s): Gabriele Bavota Software Institute @ Università della Svizzera Italiana, Jin L.C. Guo McGill University, Audris Mockus The University of Tennessee, Knoxville / Vilnius University, Martin Pinzger Universität Klagenfurt, Romain Robbes CNRS, LaBRI, University of Bordeaux, Patanamon Thongtanunam The University of Melbourne
09:00
30m
Awards
MSR 2025 Most Influential Paper Award
MIP Award

09:30
30m
Awards
MSR 2025 Foundational Contribution Award
FCA Award

10:00
30m
Talk
The Standard of Rigor for MSR Research: A 20-Year Evolution
Vision and Reflection
Bogdan Vasilescu Raj Reddy Associate Professor of Software and Societal Systems, Carnegie Mellon University, USA
11:00 - 12:30
Software ecosystems and humans
Data and Tool Showcase Track / Technical Papers at 214
Chair(s): Ahmad Abdellatif University of Calgary, Mohammad Hamdaqa Polytechnique Montréal
11:00
10m
Talk
The Ecosystem of Open-Source Music Production Software – A Mining Study on the Development Practices of VST Plugins on GitHub
Technical Papers
Andrei Bogdan University of Amsterdam, Mauricio Verano Merino Vrije Universiteit Amsterdam, Ivano Malavolta Vrije Universiteit Amsterdam
Pre-print
11:10
10m
Talk
Can LLMs Replace Manual Annotation of Software Engineering Artifacts?
Technical Papers
Toufique Ahmed IBM Research, Prem Devanbu University of California at Davis, Christoph Treude Singapore Management University, Michael Pradel University of Stuttgart
Pre-print
11:20
10m
Talk
Investigating the Understandability of Review Comments on Code Change Requests
Technical Papers
Md Shamimur Rahman University of Saskatchewan, Canada, Zadia Codabux University of Saskatchewan, Chanchal K. Roy University of Saskatchewan, Canada
11:30
10m
Talk
Mining a Decade of Contributor Dynamics in Ethereum: A Longitudinal Study
Technical Papers
Matteo Vaccargiu University of Cagliari, Sabrina Aufiero University College London (UCL), Cheick Ba Queen Mary University of London, Silvia Bartolucci University College London, Richard Clegg Queen Mary University London, Daniel Graziotin University of Hohenheim, Rumyana Neykova Brunel University London, Roberto Tonelli University of Cagliari, Giuseppe Destefanis Brunel University London
Pre-print
11:40
10m
Talk
Is it Really Fun? Detecting Low Engagement Events in Video Games
Technical Papers
Emanuela Guglielmi University of Molise, Gabriele Bavota Software Institute @ Università della Svizzera Italiana, Nicole Novielli University of Bari, Rocco Oliveto University of Molise, Simone Scalabrino University of Molise
11:50
5m
Talk
A Dataset of Contributor Activities in the NumFocus Open-Source Community
Data and Tool Showcase Track
Youness Hourri University of Mons, Alexandre Decan University of Mons; F.R.S.-FNRS, Tom Mens University of Mons
Pre-print
11:55
5m
Talk
Jupyter Notebook Activity Dataset
Data and Tool Showcase Track
Tomoki Nakamaru The University of Tokyo, Tomomasa Matsunaga The University of Tokyo, Tetsuro Yamazaki University of Tokyo
12:00
5m
Talk
CoPhi - Mining C/C++ Packages for Conan Ecosystem Analysis
Data and Tool Showcase Track
Vivek Sarkar University of Washington, Anemone Kampkötter TU Dortmund, Ben Hermann TU Dortmund
Pre-print
12:05
5m
Talk
MARIN: A Research-Centric Interface for Querying Software Artifacts on Maven Repositories
Data and Tool Showcase Track
Johannes Düsing TU Dortmund, Jared Chiaramonte Arizona State University, Ben Hermann TU Dortmund
Pre-print
12:10
5m
Talk
GitProjectHealth: an Extensible Framework for Git Social Platform Mining
Data and Tool Showcase Track
Nicolas Hlad Berger-Levrault, Benoit Verhaeghe Berger-Levrault, Kilian Bauvent Berger-Levrault
12:15
5m
Talk
Myriad People. Open Source Software for New Media Arts
Data and Tool Showcase Track
Benoit Baudry Université de Montréal, Erik Natanael Gustafsson Independent artist, Roni Kaufman Independent artist, Maria Kling Independent artist
Pre-print
12:20
5m
Talk
OpenMent: A Dataset of Mentor-Mentee Interactions in Google Summer of Code
Data and Tool Showcase Track
Erfan Raoofian University of British Columbia, Fatemeh Hendijani Fard University of British Columbia, Ifeoma Adaji University of British Columbia, Gema Rodríguez-Pérez University of British Columbia (UBC)
12:25
5m
Talk
Under the Blueprints: Parsing Unreal Engine’s Visual Scripting at Scale
Data and Tool Showcase Track
Kalvin Eng University of Alberta, Abram Hindle University of Alberta
11:00 - 12:30
Build systems and DevOps
Data and Tool Showcase Track / Technical Papers / Tutorials at 215
Chair(s): Massimiliano Di Penta University of Sannio, Italy, Sarah Nadi New York University Abu Dhabi
11:00
7m
Talk
Build Scripts Need Maintenance Too: A Study on Refactoring and Technical Debt in Build Systems
Technical Papers
Anwar Ghammam Oakland University, Dhia Elhaq Rzig University of Michigan - Dearborn, Mohamed Almukhtar Oakland University, Rania Khalsi University of Michigan - Flint, Foyzul Hassan University of Michigan at Dearborn, Marouane Kessentini Grand Valley State University
11:07
7m
Talk
LLMSecConfig: An LLM-Based Approach for Fixing Software Container Misconfigurations
Technical Papers
Ziyang Ye The University of Adelaide, Triet Le The University of Adelaide, Muhammad Ali Babar School of Computer Science, The University of Adelaide
Pre-print
11:14
7m
Talk
How Do Infrastructure-as-Code Practitioners Update Their Dependencies? An Empirical Study on Terraform Module Updates
Technical Papers
Mahi Begoug, Ali Ouni ETS Montreal, University of Quebec, Moataz Chouchen Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada
11:21
7m
Talk
TerraDS: A Dataset for Terraform HCL Programs
Data and Tool Showcase Track
Christoph Buehler University of St. Gallen, David Spielmann University of St. Gallen, Roland Meier armasuisse, Guido Salvaneschi University of St. Gallen
Pre-print
11:28
7m
Talk
OSPtrack: A Labeled Dataset Targeting Simulated Execution of Open-Source Software
Data and Tool Showcase Track
Zhuoran Tan University of Glasgow, Christos Anagnostopoulos University of Glasgow, Jeremy Singer University of Glasgow
11:35
7m
Talk
CARDS: A collection of package, revision, and miscelleneous dependency graphs
Data and Tool Showcase Track
Euxane TRAN-GIRARD LIGM, CNRS, Université Gustave Eiffel, Laurent BULTEAU LIGM, CNRS, Université Gustave Eiffel, Pierre-Yves DAVID Octobus S.c.o.p.
Pre-print
11:42
7m
Talk
GHALogs: Large-scale dataset of GitHub Actions runs
Data and Tool Showcase Track
Florent Moriconi EURECOM, AMADEUS, Thomas Durieux TU Delft, Jean-Rémy Falleri Bordeaux INP, Raphaël Troncy EURECOM, Aurélien Francillon EURECOM
11:50
40m
Tutorial
Agents for Software Development
Tutorials
Graham Neubig Carnegie Mellon University
14:00 - 15:30
AI for SE (2)
Technical Papers / Industry Track / Data and Tool Showcase Track at 214
Chair(s): Giuseppe Destefanis Brunel University London, Mohammad Hamdaqa Polytechnique Montréal
14:00
10m
Talk
Automatic High-Level Test Case Generation using Large Language Models
Technical Papers
Navid Bin Hasan Bangladesh University of Engineering and Technology, Md. Ashraful Islam Bangladesh University of Engineering and Technology, Junaed Younus Khan Bangladesh University of Engineering and Technology, Sanjida Senjik Bangladesh University of Engineering and Technology, Anindya Iqbal Bangladesh University of Engineering and Technology Dhaka, Bangladesh
14:10
10m
Talk
Prompting in the Wild: An Empirical Study of Prompt Evolution in Software Repositories
Technical Papers
Mahan Tafreshipour University of California at Irvine, Aaron Imani University of California, Irvine, Eric Huang University of California, Irvine, Eduardo Santana de Almeida Federal University of Bahia, Thomas Zimmermann University of California, Irvine, Iftekhar Ahmed University of California at Irvine
Pre-print
14:20
10m
Talk
Towards Detecting Prompt Knowledge Gaps for Improved LLM-guided Issue Resolution
Technical Papers
Ramtin Ehsani Drexel University, Sakshi Pathak Drexel University, Preetha Chatterjee Drexel University, USA
Pre-print
14:30
10m
Talk
Intelligent Semantic Matching (ISM) for Video Tutorial Search using Transformer Models
Technical Papers
Ahmad Tayeb, Sonia Haiduc Florida State University
14:40
10m
Talk
Language Models in Software Development Tasks: An Experimental Analysis of Energy and Accuracy
Technical Papers
Negar Alizadeh Universiteit Utrecht, Boris Belchev University of Twente, Nishant Saurabh Utrecht University, Patricia Kelbert Fraunhofer IESE, Fernando Castor University of Twente
14:50
10m
Talk
TriGraph: A Probabilistic Subgraph-Based Model for Visual Code Completion in Pure Data
Technical Papers
Anisha Islam Department of Computing Science, University of Alberta, Abram Hindle University of Alberta
15:00
5m
Talk
Inferring Questions from Programming Screenshots
Technical Papers
Faiz Ahmed York University, Xuchen Tan York University, Folajinmi Adewole York University, Suprakash Datta York University, Maleknaz Nayebi York University
15:05
5m
Talk
Human-In-The-Loop Software Development Agents: Challenges and Future Directions
Industry Track
Jirat Pasuksmit Atlassian, Wannita Takerngsaksiri Monash University, Patanamon Thongtanunam University of Melbourne, Kla Tantithamthavorn Monash University, Ruixiong Zhang Atlassian, Shiyan Wang Atlassian, Fan Jiang Atlassian, Jing Li Atlassian, Evan Cook Atlassian, Kun Chen Atlassian, Ming Wu Atlassian
15:10
5m
Talk
FormalSpecCpp: A Dataset of C++ Formal Specifications Created Using LLMs
Data and Tool Showcase Track
Madhurima Chakraborty University of California, Riverside, Peter Pirkelbauer Lawrence Livermore National Laboratory, Qing Yi Lawrence Livermore National Laboratory
14:00 - 15:30
Software quality
Technical Papers / Data and Tool Showcase Track at 215
Chair(s): Mohammad Hamdaqa Polytechnique Montréal, Ying Zou Queen's University, Kingston, Ontario
14:00
10m
Talk
PyExamine: A Comprehensive, Un-Opinionated Smell Detection Tool for Python
Technical Papers
Karthik Shivashankar University of Oslo, Antonio Martini University of Oslo, Norway
14:10
10m
Talk
Does Functional Package Management Enable Reproducible Builds at Scale? Yes.
Technical Papers
Julien Malka LTCI, Télécom Paris, Institut Polytechnique de Paris, France, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris, Théo Zimmermann Télécom Paris, Polytechnic Institute of Paris
Pre-print
14:20
10m
Talk
Refactoring for Dockerfile Quality: A Dive into Developer Practices and Automation Potential
Technical Papers
Emna Ksontini University of Michigan - Dearborn, Meriem Mastouri University of Michigan, Rania Khalsi University of Michigan - Flint, Wael Kessentini DePaul University
14:30
10m
Talk
Smells-sus: Sustainability Smells in IaC
Technical Papers
Seif Kosbar Polytechnique Montréal, Mohammad Hamdaqa Polytechnique Montréal
14:40
10m
Talk
Evidence is All We Need: Do Self-Admitted Technical Debts Impact Method-Level Maintenance?
Technical Papers
Shaiful Chowdhury University of Manitoba, Hisham Kidwai University of Manitoba, Muhammad Asaduzzaman University of Windsor
14:50
5m
Talk
DPy: Code Smells Detection Tool for Python
Data and Tool Showcase Track
Aryan Boloori Dalhousie University, Tushar Sharma Dalhousie University
Pre-print
14:55
5m
Talk
CoMRAT: Commit Message Rationale Analysis Tool
Data and Tool Showcase Track
Mouna Dhaouadi University of Montreal, Bentley Oakes Polytechnique Montréal, Michalis Famelis Université de Montréal
Media Attached
15:00
5m
Talk
E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects
Data and Tool Showcase Track
Sergio Di Meglio Università degli Studi di Napoli Federico II, Luigi Libero Lucio Starace Università degli Studi di Napoli Federico II, Valeria Pontillo Vrije Universiteit Brussel, Ruben Opdebeeck Vrije Universiteit Brussel, Coen De Roover Vrije Universiteit Brussel, Sergio Di Martino Università degli Studi di Napoli Federico II
15:05
5m
Talk
TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest
Data and Tool Showcase Track
Pre-print
15:10
5m
Talk
pyMethods2Test: A Dataset of Python Tests Mapped to Focal Methods
Data and Tool Showcase Track
Idriss Abdelmadjid University of Nebraska-Lincoln, Robert Dyer University of Nebraska-Lincoln
Pre-print
15:15
5m
Talk
DataTD: A Dataset of Java Projects Including Test Doubles
Data and Tool Showcase Track
Mengzhen Li University of Minnesota, Mattia Fazzini University of Minnesota
15:20
5m
Talk
JPerfEvo: A Tool for Tracking Method-Level Performance Changes in Java Projects
Data and Tool Showcase Track
Kaveh Shahedi Polytechnique Montréal, Maxime Lamothe Polytechnique Montreal, Foutse Khomh Polytechnique Montréal, Heng Li Polytechnique Montréal
16:00 - 17:30
Plenary: Closing
Program / Vision and Reflection at 214
Chair(s): Gabriele Bavota Software Institute @ Università della Svizzera Italiana, Jin L.C. Guo McGill University
16:00
30m
Talk
Future of AI4SE: From Code Generation to Software Engineering?
Vision and Reflection
Baishakhi Ray Columbia University, New York
16:30
30m
Talk
Reshaping MSR (and SE) empirical evaluations in 2030
Vision and Reflection
Massimiliano Di Penta University of Sannio, Italy
17:00
15m
Day closing
Closing Session
Program
Bram Adams Queen's University, Olga Baysal Carleton University, Ayushi Rastogi University of Groningen, The Netherlands
17:15
15m
Day closing
MSR 2026 Presentation
Program

Call for Registered Reports

The Springer journal Empirical Software Engineering (EMSE), in conjunction with the conference on Mining Software Repositories (MSR), is continuing the RR track. The RR track of MSR 2025 has two goals:

  1. To prevent HARKing (hypothesizing after the results are known) in empirical studies.
  2. To provide early feedback to authors on their initial study design.

For papers submitted to the RR track, methods and proposed analyses are reviewed prior to execution.

Pre-registered studies follow a two-step process:

  • Stage 1: Authors submit a report that describes a study they plan to undertake. The report is evaluated by the reviewers of the RR track of MSR 2025. Authors of accepted pre-registered studies will be given the opportunity to present their report at MSR.
  • Stage 2: Once a report has passed Stage 1, the authors conduct the study (i.e., the actual data collection, experiments, and analysis take place) and prepare a full paper, based on the original plan and the obtained results (which may also be negative), to be submitted for review to EMSE.

Type of Study

The RR track of MSR 2025 supports two types of papers:

Confirmatory Study: The researcher has a fixed hypothesis (or several fixed hypotheses) and the objective of the study is to find out whether the hypothesis is supported by the facts/data. An example of a completed confirmatory study:

  • Inozemtseva, L., & Holmes, R. (2014, May). Coverage is not strongly correlated with test suite effectiveness. In Proceedings of the 36th international conference on software engineering (pp. 435-445).

Exploratory Study: The researcher does not have a hypothesis (or has one that may change during the study). Often, the objective of such a study is to understand what is observed and answer questions such as WHY, HOW, WHAT, WHO, or WHEN. We include in this category registrations for which the researcher has an initial proposed solution for an automated approach (e.g., a new deep-learning-based defect prediction approach) that serves as a starting point for their exploration to reach an effective solution. Examples of completed exploratory studies:

  • Gousios, G., Pinzger, M., & Deursen, A. V. (2014, May). An exploratory study of the pull-based software development model. In Proceedings of the 36th International Conference on Software Engineering (pp. 345-355).
  • Rodrigues, I. M., Aloise, D., Fernandes, E. R., & Dagenais, M. (2020, June). A Soft Alignment Model for Bug Deduplication. In Proceedings of the 17th International Conference on Mining Software Repositories (pp. 43-53).

Evaluation Criteria and Possible Outcomes

The RR PC members will review papers in both Stage 1 and Stage 2. Four PC members will review the Stage 1 submission, and three will review the Stage 2 submission. The reviewers will evaluate RR track submissions based on the following criteria:

  • The importance of the research question(s).
  • The logic, rationale, and plausibility of the proposed hypotheses.
  • The soundness and feasibility of the methodology and analysis pipeline (including statistical power analysis where appropriate).
  • (For confirmatory study) Whether the clarity and degree of methodological detail is sufficient to exactly replicate the proposed experimental procedures and analysis pipeline.
  • (For confirmatory study) Whether the authors have pre-specified sufficient outcome-neutral tests for ensuring that the results obtained can test the stated hypotheses, including positive controls and quality checks.
  • (For exploratory study, if applicable) The description of the data set that is the base for exploration.

The outcome of the RR report review is one of the following:

  • In-Principle Acceptance (IPA): The reviewers agree that the study is relevant, the outcome of the study (whether confirmation / rejection of hypothesis) is of interest to the community, the protocol for data collection is sound, and that the analysis methods are adequate. The authors can engage in the actual study for Stage 2. If the protocol is adhered to (or deviations are thoroughly justified), the study is likely to be published. Of course, this being a journal submission, a revision of the submitted manuscript may be necessary. Reviewers will especially evaluate how precisely the protocol of the accepted pre-registered report is followed, or whether deviations are well-justified.
  • Continuity Acceptance (CA): The reviewers agree that the study is relevant and the (initial) methods appear to be appropriate. However, for exploratory studies, implementation details and post-experiment analyses or discussion (e.g., why the proposed automated approach does not work) may require follow-up revisions. We will do our best to assign the same reviewers in Stage 1 and 2.
  • Rejection: The reviewers do not agree on the relevance of the study or are not convinced that the study design is sufficiently mature. Comments are provided to the authors to improve the study design before starting it.

Note: For MSR 2025, only confirmatory studies are granted an IPA. Exploratory studies in software engineering often cannot be adequately assessed until after the study has been completed and the findings are elaborated and discussed in a full paper. For example, consider a study in an RR proposing defect prediction using a new deep learning architecture. This work falls under the exploratory category. It is difficult to offer IPA, as we do not know whether it is any better than a traditional approach based on, e.g., decision trees. Negative results are welcome; however, it is important that a negative results paper goes beyond presenting “we tried and failed” and instead provides interesting insights to readers, e.g., why the results are negative or what that means for further studies on this topic (following criteria of REplication and Negative Results (RENE) tracks, e.g., https://saner2023.must.edu.mo/negativerestrack). Furthermore, authors are required to document all deviations (if any) in a section of the paper.

Key Dates

The timeline for MSR 2025 RR track will be as follows:

Dec 05, 2024: Authors submit an abstract of their initial report.

Dec 12, 2024: Authors submit their initial report.

Feb 27, 2025: Authors receive PC members’ reviews.

Mar 16, 2025: Authors submit a response letter + revised report in a single PDF.

  • The response letter should address reviewer comments and questions.
  • The response letter + revised report must not exceed 12 pages (plus 1 additional page of references).
  • The response letter does not need to follow ACM formatting instructions.

April 03, 2025: Notification of Stage 1

  • (Outcome: in-principle acceptance, continuity acceptance, or rejection).

Before April 27, 2025: Authors submit their accepted RR report to arXiv or SSRN.

  • To be checked by PC members for Stage 2
  • Note: Due to the timeline, RR reports will not be published in the MSR 2025 proceedings, but authors of accepted papers will be invited to present their plan to MSR 2025.

Before Dec 11, 2025: Authors submit a full paper to EMSE. Instructions will be provided later. However, the following constraints will be enforced:

  • Justification must be given for any change of authors. If authors are added or removed, or the author order is changed, between the original Stage 1 submission and the EMSE submission, all authors will need to complete and sign a “Change of authorship request form”. The Editors in Chief of EMSE and chairs of the RR track reserve the right to deny author changes. If you anticipate any authorship changes, please reach out to the chairs of the RR track as early as possible.
  • PC members who reviewed an RR report in Stage 1 and their directly supervised students cannot be added as authors of the corresponding submission in Stage 2.

Submission Process

Registered report submissions must not exceed 6 pages (plus 1 additional page of references). All submissions must be in PDF. The page limit is strict. Submissions must conform to the IEEE conference proceedings template, specified in the IEEE Conference Proceedings Formatting Guidelines (title in 24pt font and full text in 10pt type, LaTeX users must use \documentclass[10pt,conference]{IEEEtran} without including the compsoc or compsocconf options). Submissions must strictly conform to the IEEE conference proceedings formatting instructions specified above. Alterations of spacing, font size, and other changes that deviate from the instructions may result in desk rejection without further review.
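As a rough illustration of the formatting requirement above, a minimal LaTeX skeleton might look like the following. Only the \documentclass line comes from the instructions; the title, author block, abstract, and section name are placeholders, and the sketch assumes the standard IEEEtran class is installed. It is not an official template.

```latex
% Minimal sketch of a submission skeleton (placeholders throughout).
% Only the \documentclass line below is taken from the call itself.
\documentclass[10pt,conference]{IEEEtran} % do not add compsoc or compsocconf

\begin{document}

\title{Placeholder Title of the Registered Report}

% Placeholder author block using the IEEEtran author macros.
\author{\IEEEauthorblockN{Author Name}
\IEEEauthorblockA{Affiliation\\
author@example.org}}

\maketitle

\begin{abstract}
Placeholder abstract.
\end{abstract}

\section{Introduction}
Placeholder text.

\end{document}
```

Leaving the class defaults untouched (no manual changes to spacing or font size) is the simplest way to stay within the formatting rules described above.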

Submissions can be made via the submission site (https://msr2025-registered-report.hotcrp.com) by the submission deadline. Any submission that does not comply with the aforementioned instructions and the mandatory information specified in the Author Guide is likely to be desk rejected.

In addition, by submitting, the authors acknowledge that they are aware of and agree to be bound by the following policies: The ACM Policy and Procedures on Plagiarism and the IEEE Plagiarism FAQ. In particular, papers submitted to MSR 2025 must not have been published elsewhere and must not be under review or submitted for review elsewhere whilst under consideration for MSR 2025. Contravention of this concurrent submission policy will be deemed a serious breach of scientific ethics, and appropriate action will be taken in all such cases (including immediate rejection and reporting of the incident to ACM/IEEE). To check for double submission and plagiarism issues, the chairs reserve the right to (1) share the list of submissions with the PC Chairs of other conferences with overlapping review periods and (2) use external plagiarism detection software, under contract to the ACM or IEEE, to detect violations of these policies.

By submitting to MSR 2025, authors acknowledge that they conform to the authorship policy of the ACM, and the authorship policy of the IEEE. This includes following these points related to the use of Generative AI:

  • “Generative AI tools and technologies, such as ChatGPT, may not be listed as authors of an ACM-published Work. The use of generative AI tools and technologies to create content is permitted but must be fully disclosed in the Work. For example, the authors could include the following statement in the Acknowledgements section of the Work: ChatGPT was utilized to generate sections of this Work, including text, tables, graphs, code, data, citations, etc.). If you are uncertain about the need to disclose the use of a particular tool, err on the side of caution, and include a disclosure in the acknowledgments section of the Work.” - ACM
  • “The use of artificial intelligence (AI)–generated text in an article shall be disclosed in the acknowledgments section of any paper submitted to an IEEE Conference or Periodical. The sections of the paper that use AI-generated text shall have a citation to the AI system used to generate the text.” - IEEE
  • “If you are using generative AI software tools to edit and improve the quality of your existing text in much the same way you would use a typing assistant like Grammarly to improve spelling, grammar, punctuation, clarity, engagement or to use a basic word processing system to correct spelling or grammar, it is not necessary to disclose such usage of these tools in your Work.” - ACM