This program is tentative and subject to change.
Mon 28 Apr
Displayed time zone: Eastern Time (US & Canada)
09:00 - 10:30 | |||
09:00 30mDay opening | Opening + Award Announcement Program | ||
09:30 60mTalk | Mining BOMs for Improving Supply Chain Efficiency & Resilience Keynotes Kate Stewart Linux Foundation |
11:00 - 12:30 | |||
11:00 10mTalk | Learning from Mistakes: Understanding Ad-hoc Logs through Analyzing Accidental Commits Technical Papers Yi-Hung Chou University of California, Irvine, Yiyang Min Amazon, April Wang ETH Zürich, James Jones University of California at Irvine Pre-print | ||
11:10 10mTalk | On the calibration of Just-in-time Defect Prediction Technical Papers Xhulja Shahini paluno - University of Duisburg-Essen, Jone Bartel University of Duisburg-Essen, paluno, Klaus Pohl University of Duisburg-Essen, paluno | ||
11:20 10mTalk | An Empirical Study on Leveraging Images in Automated Bug Report Reproduction Technical Papers Dingbang Wang University of Connecticut, Zhaoxu Zhang University of Southern California, Sidong Feng Monash University, William G.J. Halfond University of Southern California, Tingting Yu University of Connecticut | ||
11:30 10mTalk | It’s About Time: An Empirical Study of Date and Time Bugs in Open-Source Python Software Technical Papers Shrey Tiwari Carnegie Mellon University, Serena Chen University of California, San Diego, Alexander Joukov Stony Brook University, Peter Vandervelde University of California, Santa Barbara, Ao Li Carnegie Mellon University, Rohan Padhye Carnegie Mellon University | ||
11:40 10mTalk | Enhancing Just-In-Time Defect Prediction Models with Developer-Centric Features Technical Papers Emanuela Guglielmi University of Molise, Andrea D'Aguanno University of Molise, Rocco Oliveto University of Molise, Simone Scalabrino University of Molise | ||
11:50 10mTalk | Revisiting Defects4J for Fault Localization in Diverse Development Scenarios Technical Papers Md Nakhla Rafi Concordia University, An Ran Chen University of Alberta, Tse-Hsun (Peter) Chen Concordia University, Shaohua Wang Central University of Finance and Economics | ||
12:00 5mTalk | Mining Bug Repositories for Multi-Fault Programs Data and Tool Showcase Track | ||
12:05 5mTalk | HaPy-Bug - Human Annotated Python Bug Resolution Dataset Data and Tool Showcase Track Piotr Przymus Nicolaus Copernicus University in Toruń, Poland, Mikołaj Fejzer Nicolaus Copernicus University in Toruń, Jakub Narębski Nicolaus Copernicus University in Toruń, Radosław Woźniak Nicolaus Copernicus University in Toruń, Łukasz Halada University of Wrocław, Poland, Aleksander Kazecki Nicolaus Copernicus University in Toruń, Mykhailo Molchanov Igor Sikorsky Kyiv Polytechnic Institute, Ukraine, Krzysztof Stencel University of Warsaw | ||
12:10 5mTalk | SPRINT: An Assistant for Issue Report Management Data and Tool Showcase Track Pre-print |
11:00 - 12:30 | |||
11:00 10mTalk | Wolves in the Repository: A Software Engineering Analysis of the XZ Utils Supply Chain Attack Technical Papers | ||
11:10 10mTalk | Software Composition Analysis and Supply Chain Security in Apache Projects: an Empirical Study Technical Papers Sabato Nocera University of Salerno, Sira Vegas Universidad Politecnica de Madrid, Giuseppe Scanniello University of Salerno, Natalia Juristo Universidad Politecnica de Madrid Pre-print | ||
11:20 10mTalk | Good practice versus reality: a landscape analysis of Research Software metadata adoption in European Open Science Clusters Technical Papers | ||
11:30 10mTalk | Towards Security Commit Message Standardization Technical Papers Sofia Reis Instituto Superior Técnico, U. Lisboa & INESC-ID, Rui Abreu INESC-ID; University of Porto, Corina Pasareanu CMU, NASA, KBR | ||
11:40 10mTalk | From Industrial Practices to Academia: Uncovering the Gap in Vulnerability Research and Practice Technical Papers | ||
11:50 5mTalk | Patch Me If You Can—Securing the Linux Kernel Industry Track Gunnar Kudrjavets Amazon Web Services, USA Pre-print | ||
11:55 5mTalk | OSS License Identification at Scale: A Comprehensive Dataset Using World of Code Data and Tool Showcase Track Mahmoud Jahanshahi Research Assistant, University of Tennessee Knoxville, David Reid University of Tennessee, Adam McDaniel University of Tennessee Knoxville, Audris Mockus The University of Tennessee | ||
12:00 5mTalk | SCRUBD: Smart Contracts Reentrancy and Unhandled Exceptions Vulnerability Dataset Data and Tool Showcase Track Chavhan Sujeet Yashavant Indian Institute of Technology, Kanpur, Mitrajsinh Chavda Indian Institute of Technology Kanpur, India, Saurabh Kumar Indian Institute of Technology Hyderabad, India, Amey Karkare IIT Kanpur, Angshuman Karmakar Indian Institute of Technology Kanpur, India Pre-print | ||
12:05 5mTalk | ICVul: A Well-labeled C/C++ Vulnerability Dataset with Comprehensive Metadata and VCCs Data and Tool Showcase Track Chaomeng Lu DistriNet Group-T, KU Leuven, Tianyu Li DistriNet Group-T, KU Leuven, Toon Dehaene KU Leuven, Bert Lagaisse DistriNet Group-T, KU Leuven | ||
12:10 5mTalk | A Dataset of Software Bill of Materials for Evaluating SBOM Consumption Tools Data and Tool Showcase Track Rio Kishimoto Osaka University, Tetsuya Kanda Notre Dame Seishin University, Yuki Manabe The University of Fukuchiyama, Katsuro Inoue Nanzan University, Shi Qiu Toshiba, Yoshiki Higo Osaka University | ||
12:15 5mTalk | Wild SBOMs: a Large-scale Dataset of Software Bills of Materials from Public Code Data and Tool Showcase Track Luis Soeiro LTCI, Télécom Paris, Institut Polytechnique de Paris, Thomas Robert LTCI, Télécom Paris, Institut Polytechnique de Paris, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris Pre-print | ||
12:20 5mTalk | MaLAware: Automating the Comprehension of Malicious Software Behaviours using Large Language Models (LLMs) Data and Tool Showcase Track BIKASH SAHA Indian Institute of Technology Kanpur, Nanda Rani Indian Institute of Technology Kanpur, Sandeep K. Shukla Indian Institute of Technology Kanpur |
13:00 - 13:30 | |||
14:00 - 15:30 | MSR 2025 Mining Challenge | Mining Challenge at 215 Chair(s): Joyce El Haddad Université Paris Dauphine - PSL, Damien Jaime Université Paris Nanterre & LIP6, Pascal Poizat Université Paris Nanterre & LIP6 | ||
14:00 4mTalk | Analyzing Dependency Clusters and Security Risks in the Maven Central Repository Mining Challenge | ||
14:04 4mTalk | Chasing the Clock: How Fast Are Vulnerabilities Fixed in the Maven Ecosystem? Mining Challenge Md Fazle Rabbi Idaho State University, Arifa Islam Champa Idaho State University, Rajshakhar Paul Wayne State University, Minhaz F. Zibran Idaho State University | ||
14:08 4mTalk | Decoding Dependency Risks: A Quantitative Study of Vulnerabilities in the Maven Ecosystem Mining Challenge Costain Nachuma Idaho State University, Md Mosharaf Hossan Idaho State University, Asif Kamal Turzo Wayne State University, Minhaz F. Zibran Idaho State University | ||
14:12 4mTalk | Faster Releases, Fewer Risks: A Study on Maven Artifact Vulnerabilities and Lifecycle Management Mining Challenge Md Shafiullah Shafin Rajshahi University of Engineering & Technology (RUET), Md Fazle Rabbi Idaho State University, S. M. Mahedy Hasan Rajshahi University of Engineering & Technology, Minhaz F. Zibran Idaho State University | ||
14:16 4mTalk | Insights into Dependency Maintenance Trends in the Maven Ecosystem Mining Challenge Barisha Chowdhury Rajshahi University of Engineering & Technology, Md Fazle Rabbi Idaho State University, S. M. Mahedy Hasan Rajshahi University of Engineering & Technology, Minhaz F. Zibran Idaho State University | ||
14:20 4mTalk | Insights into Vulnerability Trends in Maven Artifacts: Recurrence, Popularity, and User Behavior Mining Challenge Courtney Bodily Idaho State University, Eric Hill Idaho State University, Andreas Kramer Idaho State University, Leslie Kerby Idaho State University, Minhaz F. Zibran Idaho State University | ||
14:24 4mTalk | Understanding Software Vulnerabilities in the Maven Ecosystem: Patterns, Timelines, and Risks Mining Challenge Md Fazle Rabbi Idaho State University, Rajshakhar Paul Wayne State University, Arifa Islam Champa Idaho State University, Minhaz F. Zibran Idaho State University | ||
14:28 4mTalk | Dependency Update Adoption Patterns in the Maven Software Ecosystem Mining Challenge Baltasar Berretta College of Wooster, Augustus Thomas College of Wooster, Heather Guarnera The College of Wooster | ||
14:32 4mTalk | Analyzing Vulnerability Overestimation in Software Projects Mining Challenge Taha Draoui University of Michigan-Flint, Faten Jebari University of Michigan-Flint, Chawki Ben Slimen University of Michigan-Flint, Munjaap Uppal University of Michigan-Flint, Mohamed Wiem Mkaouer University of Michigan - Flint | ||
14:36 4mTalk | Dependency Dilemmas: A Comparative Study of Independent and Dependent Artifacts in Maven Ecosystem Mining Challenge Mehedi Hasan Shanto Khulna University, Muhammad Asaduzzaman University of Windsor, Manishankar Mondal Khulna University, Shaiful Chowdhury University of Manitoba | ||
14:40 4mTalk | Cascading Effects: Analyzing Project Failure Impact in the Maven Central Ecosystem Mining Challenge Mina Shehata Belmont University, Saidmakhmud Makhkamjonoov Belmont University, Mahad Syed Belmont University, Esteban Parra Belmont University | ||
14:45 4mTalk | Do Developers Depend on Deprecated Library Versions? A Mining Study of Log4j Mining Challenge Haruhiko Yoshioka Nara Institute of Science and Technology, Sila Lertbanjongngam Nara Institute of Science and Technology, Masayuki Inaba Nara Institute of Science and Technology, Youmei Fan Nara Institute of Science and Technology, Takashi Nakano Nara Institute of Science and Technology, Kazumasa Shimari Nara Institute of Science and Technology, Raula Gaikovina Kula Osaka University, Kenichi Matsumoto Nara Institute of Science and Technology | ||
14:49 4mTalk | Mining for Lags in Updating Critical Security Threats: A Case Study of Log4j Library Mining Challenge Hidetake Tanaka Nara Institute of Science and Technology, Kazuma Yamasaki Nara Institute of Science and Technology, Momoka Hirose Nara Institute of Science and Technology, Takashi Nakano Nara Institute of Science and Technology, Youmei Fan Nara Institute of Science and Technology, Kazumasa Shimari Nara Institute of Science and Technology, Raula Gaikovina Kula Osaka University, Kenichi Matsumoto Nara Institute of Science and Technology | ||
14:53 4mTalk | On the Evolution of Unused Dependencies in Java Project Releases: An Empirical Study Mining Challenge Nabhan Suwanachote Nara Institute of Science and Technology, Yagut Shakizada Nara Institute of Science and Technology, Yutaro Kashiwa Nara Institute of Science and Technology, Bin Lin Hangzhou Dianzi University, Hajimu Iida Nara Institute of Science and Technology | ||
14:57 4mTalk | Out of Sight, Still at Risk: The Lifecycle of Transitive Vulnerabilities in Maven Mining Challenge Piotr Przymus Nicolaus Copernicus University in Toruń, Poland, Mikołaj Fejzer Nicolaus Copernicus University in Toruń, Jakub Narębski Nicolaus Copernicus University in Toruń, Krzysztof Rykaczewski Nicolaus Copernicus University in Toruń, Poland, Krzysztof Stencel University of Warsaw | ||
15:01 4mTalk | Popularity and Innovation in Maven Central Mining Challenge Nkiru Ede Victoria University of Wellington, Jens Dietrich Victoria University of Wellington, Ulrich Zülicke Victoria University of Wellington | ||
15:05 4mTalk | Software Bills of Materials in Maven Central Mining Challenge Yogya Gamage Université de Montréal, Nadia Gonzalez Fernandez Université de Montréal, Martin Monperrus KTH Royal Institute of Technology, Benoit Baudry | ||
15:09 4mTalk | The Ripple Effect of Vulnerabilities in Maven Central: Prevalence, Propagation, and Mitigation Challenges Mining Challenge | ||
15:13 4mTalk | Tracing Vulnerabilities in Maven: A Study of CVE lifecycles and Dependency Networks Mining Challenge Pre-print | ||
15:17 4mTalk | Understanding Abandonment and Slowdown Dynamics in the Maven Ecosystem Mining Challenge Kazi Amit Hasan Queen's University, Canada, Jerin Yasmin Queen's University, Canada, Huizi Hao Queen's University, Canada, Yuan Tian Queen's University, Kingston, Ontario, Safwat Hassan University of Toronto, Steven Ding Pre-print | ||
15:21 4mTalk | Characterizing Packages for Vulnerability Prediction Mining Challenge Saviour Owolabi University of Calgary, Francesco Rosati University of Calgary, Ahmad Abdellatif University of Calgary, Lorenzo De Carli University of Calgary, Canada | ||
15:25 4mTalk | Understanding the Popularity of Packages in Maven Ecosystem Mining Challenge Sadman Jashim Sakib University of Windsor, Muhammad Asaduzzaman University of Windsor, Curtis Bright University of Windsor, Cole Morgan University of Windsor |
17:30 - 22:00 | Dinner at Museum of History | Catering / Program / Mining Challenge / Data and Tool Showcase Track / Technical Papers / Keynotes / Industry Track at Grand Hall | ||
18:00 4hDinner | Dinner Catering |
Tue 29 Apr
09:00 - 10:30 | |||
09:00 30mAwards | MSR 2025 Most Influential Paper Award MIP Award | ||
09:30 30mAwards | MSR 2025 Foundational Contribution Award FCA Award | ||
10:00 30mTalk | The Standard of Rigor for MSR Research: A 20-Year Evolution Vision and Reflection Bogdan Vasilescu, Raj Reddy Associate Professor of Software and Societal Systems, Carnegie Mellon University, USA |
11:00 - 12:30 | Build systems and DevOps | Data and Tool Showcase Track / Technical Papers / Tutorials at 215 Chair(s): Sarah Nadi New York University Abu Dhabi | ||
11:00 7mTalk | Build Scripts Need Maintenance Too: A Study on Refactoring and Technical Debt in Build Systems Technical Papers Anwar Ghammam Oakland University, Dhia Elhaq Rzig University of Michigan - Dearborn, Mohamed Almukhtar Oakland University, Rania Khalsi University of Michigan - Flint, Foyzul Hassan University of Michigan at Dearborn, Marouane Kessentini Grand Valley State University | ||
11:07 7mTalk | LLMSecConfig: An LLM-Based Approach for Fixing Software Container Misconfigurations Technical Papers Ziyang Ye The University of Adelaide, Triet Le The University of Adelaide, Muhammad Ali Babar School of Computer Science, The University of Adelaide Pre-print | ||
11:14 7mTalk | How Do Infrastructure-as-Code Practitioners Update Their Dependencies? An Empirical Study on Terraform Module Updates Technical Papers Mahi Begoug, Ali Ouni ETS Montreal, University of Quebec, Moataz Chouchen Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada | ||
11:21 7mTalk | TerraDS: A Dataset for Terraform HCL Programs Data and Tool Showcase Track Christoph Buehler University of St. Gallen, David Spielmann University of St. Gallen, Roland Meier armasuisse, Guido Salvaneschi University of St. Gallen Pre-print | ||
11:28 7mTalk | OSPtrack: A Labeled Dataset Targeting Simulated Execution of Open-Source Software Data and Tool Showcase Track Zhuoran Tan University of Glasgow, Christos Anagnostopoulos University of Glasgow, Jeremy Singer University of Glasgow | ||
11:35 7mTalk | CARDS: A collection of package, revision, and miscelleneous dependency graphs Data and Tool Showcase Track Euxane TRAN-GIRARD LIGM, CNRS, Université Gustave Eiffel, Laurent BULTEAU LIGM, CNRS, Université Gustave Eiffel, Pierre-Yves DAVID Octobus S.c.o.p. Pre-print | ||
11:42 7mTalk | GHALogs: Large-scale dataset of GitHub Actions runs Data and Tool Showcase Track Florent Moriconi EURECOM, AMADEUS, Thomas Durieux TU Delft, Jean-Rémy Falleri Bordeaux INP, Raphaël Troncy EURECOM, Aurélien Francillon EURECOM | ||
11:50 40mTalk | Agents for Software Development Tutorials Graham Neubig Carnegie Mellon University |
14:00 - 15:30 | AI for SE (2) | Technical Papers / Industry Track / Data and Tool Showcase Track at 214 Chair(s): Giuseppe Destefanis Brunel University London, Mohammad Hamdaqa Polytechnique Montréal | ||
14:00 10mTalk | Automatic High-Level Test Case Generation using Large Language Models Technical Papers Navid Bin Hasan Bangladesh University of Engineering and Technology, Md. Ashraful Islam Bangladesh University of Engineering and Technology, Junaed Younus Khan Bangladesh University of Engineering and Technology, Sanjida Senjik Bangladesh University of Engineering and Technology, Anindya Iqbal Bangladesh University of Engineering and Technology Dhaka, Bangladesh | ||
14:10 10mTalk | Prompting in the Wild: An Empirical Study of Prompt Evolution in Software Repositories Technical Papers Mahan Tafreshipour University of California at Irvine, Aaron Imani University of California, Irvine, Eric Huang University of California, Irvine, Eduardo Santana de Almeida Federal University of Bahia, Thomas Zimmermann University of California, Irvine, Iftekhar Ahmed University of California at Irvine Pre-print | ||
14:20 10mTalk | Towards Detecting Prompt Knowledge Gaps for Improved LLM-guided Issue Resolution Technical Papers Ramtin Ehsani Drexel University, Sakshi Pathak Drexel University, Preetha Chatterjee Drexel University, USA Pre-print | ||
14:30 10mTalk | Intelligent Semantic Matching (ISM) for Video Tutorial Search using Transformer Models Technical Papers | ||
14:40 10mTalk | Language Models in Software Development Tasks: An Experimental Analysis of Energy and Accuracy Technical Papers Negar Alizadeh Universiteit Utrecht, Boris Belchev University of Twente, Nishant Saurabh Utrecht University, Patricia Kelbert Fraunhofer IESE, Fernando Castor University of Twente | ||
14:50 10mTalk | TriGraph: A Probabilistic Subgraph-Based Model for Visual Code Completion in Pure Data Technical Papers Anisha Islam Department of Computing Science, University of Alberta, Abram Hindle University of Alberta | ||
15:00 5mTalk | Inferring Questions from Programming Screenshots Technical Papers Faiz Ahmed York University, Xuchen Tan York University, Folajinmi Adewole York University, Suprakash Datta York University, Maleknaz Nayebi York University | ||
15:05 5mTalk | Human-In-The-Loop Software Development Agents: Challenges and Future Directions Industry Track Jirat Pasuksmit Atlassian, Wannita Takerngsaksiri Monash University, Patanamon Thongtanunam University of Melbourne, Kla Tantithamthavorn Monash University, Ruixiong Zhang Atlassian, Shiyan Wang Atlassian, Fan Jiang Atlassian, Jing Li Atlassian, Evan Cook Atlassian, Kun Chen Atlassian, Ming Wu Atlassian | ||
15:10 5mTalk | FormalSpecCpp: A Dataset of C++ Formal Specifications Created Using LLMs Data and Tool Showcase Track Madhurima Chakraborty University of California, Riverside, Peter Pirkelbauer Lawrence Livermore National Laboratory, Qing Yi Lawrence Livermore National Laboratory |
14:00 - 15:30 | Software quality | Technical Papers / Data and Tool Showcase Track at 215 Chair(s): Mohammad Hamdaqa Polytechnique Montréal, Ying Zou Queen's University, Kingston, Ontario | ||
14:00 10mTalk | PyExamine: A Comprehensive, Un-Opinionated Smell Detection Tool for Python Technical Papers | ||
14:10 10mTalk | Does Functional Package Management Enable Reproducible Builds at Scale? Yes. Technical Papers Julien Malka LTCI, Télécom Paris, Institut Polytechnique de Paris, France, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris, Théo Zimmermann Télécom Paris, Polytechnic Institute of Paris Pre-print | ||
14:20 10mTalk | Refactoring for Dockerfile Quality: A Dive into Developer Practices and Automation Potential Technical Papers Emna Ksontini University of Michigan - Dearborn, Meriem Mastouri University of Michigan, Rania Khalsi University of Michigan - Flint, Wael Kessentini DePaul University | ||
14:30 10mTalk | Smells-sus: Sustainability Smells in IaC Technical Papers | ||
14:40 10mTalk | Evidence is All We Need: Do Self-Admitted Technical Debts Impact Method-Level Maintenance? Technical Papers Shaiful Chowdhury University of Manitoba, Hisham Kidwai University of Manitoba, Muhammad Asaduzzaman University of Windsor | ||
14:50 5mTalk | DPy: Code Smells Detection Tool for Python Data and Tool Showcase Track Pre-print | ||
14:55 5mTalk | CoMRAT: Commit Message Rationale Analysis Tool Data and Tool Showcase Track Mouna Dhaouadi University of Montreal, Bentley Oakes Polytechnique Montréal, Michalis Famelis Université de Montréal Media Attached | ||
15:00 5mTalk | E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects Data and Tool Showcase Track Sergio Di Meglio Università degli Studi di Napoli Federico II, Luigi Libero Lucio Starace Università degli Studi di Napoli Federico II, Valeria Pontillo Vrije Universiteit Brussel, Ruben Opdebeeck Vrije Universiteit Brussel, Coen De Roover Vrije Universiteit Brussel, Sergio Di Martino Università degli Studi di Napoli Federico II | ||
15:05 5mTalk | TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest Data and Tool Showcase Track Pre-print | ||
15:10 5mTalk | pyMethods2Test: A Dataset of Python Tests Mapped to Focal Methods Data and Tool Showcase Track Pre-print | ||
15:15 5mTalk | DataTD: A Dataset of Java Projects Including Test Doubles Data and Tool Showcase Track | ||
15:20 5mTalk | JPerfEvo: A Tool for Tracking Method-Level Performance Changes in Java Projects Data and Tool Showcase Track Kaveh Shahedi Polytechnique Montréal, Maxime Lamothe Polytechnique Montreal, Foutse Khomh Polytechnique Montréal, Heng Li Polytechnique Montréal |
16:00 - 17:30 | Plenary: Closing | Program / Vision and Reflection at 214 Chair(s): Gabriele Bavota Software Institute @ Università della Svizzera Italiana | ||
16:00 30mTalk | Future of AI4SE: From Code Generation to Software Engineering? Vision and Reflection Baishakhi Ray Columbia University, New York | ||
16:30 30mTalk | Reshaping MSR (and SE) empirical evaluations in 2030 Vision and Reflection Massimiliano Di Penta University of Sannio, Italy | ||
17:00 15mDay closing | Closing Session Program Bram Adams Queen's University, Olga Baysal Carleton University, Ayushi Rastogi University of Groningen, The Netherlands | ||
17:15 15mDay closing | MSR 2026 Presentation Program |
Unscheduled Events
Not scheduled Day opening | Opening Session and Award Announcement Technical Papers | ||
Not scheduled Keynote | Keynote Technical Papers |
Call for Papers
The International Conference on Mining Software Repositories (MSR) is the premier conference for data science (DS), machine learning (ML), and artificial intelligence (AI) in software engineering. There are vast amounts of data available in software-related repositories, such as source control systems, defect trackers, code review repositories, app stores, archived communications between project personnel, question-and-answer sites, CI build servers, package registries, and run-time telemetry. The MSR conference invites significant research contributions in which software data plays a central role. MSR Technical Track submissions using data from software repositories, either solely or combined with data from other sources, can take many forms, including: studies applying existing DS/ML/AI techniques to better understand the practice of software engineering, software users, and software behavior; empirically-validated applications of existing or novel DS/ML/AI-based techniques to improve software development and support the maintenance of software systems; and cross-cutting concerns around the engineering of DS/ML/AI-enabled software systems.
The 22nd International Conference on Mining Software Repositories will be held on April 28-29, 2025, in Ottawa, Canada.
Evaluation Criteria
We invite both full (maximum ten pages, plus two additional pages of references) and short (four pages, plus references) papers to the Research Track. Full papers are expected to describe new techniques and/or novel research results, to have a high degree of technical rigor, and to be evaluated scientifically. Short papers are expected to discuss controversial issues in the field or present interesting or thought-provoking ideas that are not yet fully developed. Submissions will be evaluated according to the following criteria:
- Soundness: This aspect pertains to how well the paper’s contributions — whether they involve new methodologies, applications of existing techniques to unfamiliar problems, empirical studies, or other research methods — address the research questions posed and are backed by a thorough application of relevant research procedures. For short papers, the expectation is for more limited evaluations given their narrower scope.
- Relevance: The extent to which the paper successfully argues or illustrates that its contributions help bridge a significant knowledge gap or tackle a crucial practical issue within the field of software engineering.
- Novelty: How original the paper’s contributions are in comparison to existing knowledge or how significantly they contribute to the current body of knowledge. Note that this doesn’t discourage well-motivated replication studies.
- Presentation: How well-structured and clear the paper’s argumentation is, how clearly the contributions are articulated, the legibility of figures and tables, and the adequacy of English language usage. All papers should comply with the formatting instructions provided.
- Replicability: The extent to which the paper’s claims can be independently verified through available replication packages and/or sufficient information included in the paper to understand how data was obtained, analyzed, and interpreted, or how a proposed technique works. All submissions are expected to adhere to the Open Science policy below.
Junior PC
Following two successful editions of the MSR Shadow PC in 2021 and 2022 (see also this paper and this presentation for more context), and the success of the Junior PC in MSR 2023 and MSR 2024, MSR 2025 will once again integrate the junior reviewers into the Technical track program committee!
The main goal remains unchanged: to train the next generation of MSR (and, more broadly, SE) reviewers and program committee members, in response to a widely recognized challenge of scaling peer review capacity as the research community and volume of submissions grow over time. As with the previous Shadow and Junior PC, the primary audience for the Junior PC is early-career researchers (PhD students, postdocs, new faculty members, and industry practitioners) who are keen to get more involved in the academic peer-review process but have not yet served on a technical research track program committee at big international SE conferences (e.g., ICSE, ESEC/FSE, ASE, MSR, ICSME, SANER).
Prior to the MSR submission deadline, all PC members, including the junior reviewers, will receive guidance on review quality, confidentiality, and ethics standards, how to write good reviews, and how to participate in discussions (see ACM reviewers’ responsibilities). Junior reviewers will then serve alongside regular PC members on the main technical track PC, participating fully in the review process, including author responses and PC discussions to reach a consensus. In addition, Junior PC members will receive feedback on how to improve their reviews throughout the process.
All submissions to the MSR research track will be reviewed jointly by both regular and junior PC members, as part of the same process. We expect that each paper will receive three reviews from PC members. The final decisions will be made by consensus among all reviewers, as always. Based on our experience with the MSR Shadow and Junior PC, we expect that the addition of junior reviewers to each paper will increase the overall quality of reviews the authors receive, since junior reviewers will typically have a deep understanding of recent topics, and can thus provide deep technical feedback on the subject.
Submission Process
Submissions must conform to the IEEE conference proceedings template, specified in the IEEE Conference Proceedings Formatting Guidelines (title in 24pt font and full text in 10pt type; LaTeX users must use \documentclass[10pt,conference]{IEEEtran} without including the compsoc or compsocconf options).
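As an illustration only, a minimal LaTeX skeleton consistent with the class line above might look like the following sketch. The title, abstract, section text, and the `references` bibliography file name are placeholders assumed for the example, not part of the official template; authors should still start from the official IEEE template files.

```latex
% Minimal sketch of a submission skeleton (placeholders throughout).
% Class options as required above: no compsoc or compsocconf.
\documentclass[10pt,conference]{IEEEtran}

\begin{document}

\title{Placeholder Title}

% Double-anonymous review: real author names and affiliations are omitted.
\author{\IEEEauthorblockN{Anonymous Author(s)}}

\maketitle

\begin{abstract}
Placeholder abstract.
\end{abstract}

\section{Introduction}
Placeholder text.

% Assumes a hypothetical references.bib next to this file.
\bibliographystyle{IEEEtran}
\bibliography{references}

\end{document}
```

The sketch deliberately avoids any spacing or font overrides, since deviations from the formatting instructions may lead to desk rejection.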
Submissions to the Technical Track can be made via the submission site by the submission deadline. However, we encourage authors to submit at least the paper abstract and author details well in advance of the deadline, to leave enough time to properly enter conflicts of interest for anonymous reviewing. All submissions must adhere to the following requirements:
- All submissions must not exceed 10 pages for the main text, inclusive of all figures, tables, appendices, etc. Two more pages containing only references are permitted. All submissions must be in PDF. Accepted papers will be allowed one extra page for the main text of the camera-ready version. The page limit is strict, and it will not be possible to purchase additional pages at any point in the process (including after acceptance).
- Submissions must strictly conform to the IEEE conference proceedings formatting instructions specified above. Alterations of spacing, font size, and other changes that deviate from the instructions may result in desk rejection without further review.
- By submitting to MSR, authors acknowledge that they are aware of and agree to be bound by the ACM Policy and Procedures on Plagiarism and the IEEE Plagiarism FAQ. Papers submitted to MSR 2025 must not have been published elsewhere and must not be under review or submitted for review elsewhere whilst under consideration for MSR 2025. Contravention of this concurrent submission policy will be deemed a serious breach of scientific ethics, and appropriate action will be taken in all such cases. To check for double submission and plagiarism issues, the chairs reserve the right to (1) share the list of submissions with the PC Chairs of other conferences with overlapping review periods and (2) use external plagiarism detection software, under contract to the ACM or IEEE, to detect violations of these policies.
- By submitting your article to an ACM Publication, you are hereby acknowledging that you and your co-authors are subject to all ACM Publications Policies, including ACM’s new Publications Policy on Research Involving Human Participants and Subjects. Alleged violations of this policy or any ACM Publications Policy will be investigated by ACM and may result in a full retraction of your paper, in addition to other potential penalties, as per ACM Publications Policy.
- Please ensure that you and your co-authors obtain an ORCID ID, so you can complete the publishing process for your accepted paper. ACM has been involved in ORCID from the start and ICSE has recently made a commitment to collect ORCID IDs from all of the published authors. We are committed to improving author discoverability, ensuring proper attribution, and contributing to ongoing community efforts around name normalization; your ORCID ID will help in these efforts.
- The MSR 2025 Technical Track will employ a double-anonymous review process. Thus, no submission may reveal its authors’ identities. The authors must make every effort to honor the double-anonymous review process. In particular:
  - Authors’ names must be omitted from the submission.
  - All references to the authors’ prior work should be in the third person.
  - While authors have the right to upload preprints on arXiv or similar sites, they must avoid specifying that the manuscript was submitted to MSR 2025.
  - During review, authors should not publicly use the submission title. We recommend using a different paper title for any pre-print on arXiv or similar websites.
  - Further advice, guidance, and explanation about the double-anonymous review process can be found on the ICSE Q&A page.
- New this year: By submitting to MSR 2025, authors acknowledge that they conform to the authorship policy of the ACM, and the authorship policy of the IEEE. This includes following these points related to the use of Generative AI:
- “Generative AI tools and technologies, such as ChatGPT, may not be listed as authors of an ACM-published Work. The use of generative AI tools and technologies to create content is permitted but must be fully disclosed in the Work. For example, the authors could include the following statement in the Acknowledgements section of the Work: ChatGPT was utilized to generate sections of this Work, including text, tables, graphs, code, data, citations, etc.). If you are uncertain about the need to disclose the use of a particular tool, err on the side of caution, and include a disclosure in the acknowledgments section of the Work.” - ACM
- “The use of artificial intelligence (AI)–generated text in an article shall be disclosed in the acknowledgments section of any paper submitted to an IEEE Conference or Periodical. The sections of the paper that use AI-generated text shall have a citation to the AI system used to generate the text.” - IEEE
- “If you are using generative AI software tools to edit and improve the quality of your existing text in much the same way you would use a typing assistant like Grammarly to improve spelling, grammar, punctuation, clarity, engagement or to use a basic word processing system to correct spelling or grammar, it is not necessary to disclose such usage of these tools in your Work.” - ACM
Submissions should also include a supporting statement on the data availability, per the Open Science policy below.
Any submission that does not comply with these requirements is likely to be desk rejected by the PC Chairs without further review.
Authors will have a chance to see the reviews and respond to reviewer comments before any decision about the submission is made.
Upon notification of acceptance, all authors of accepted papers will be asked to fill out a copyright form and will receive further instructions for preparing the camera-ready version of their papers. At least one author of each paper is expected to register and present the paper at the MSR 2025 conference. All accepted contributions will be published in the electronic proceedings of the conference.
A selection of the best papers will be invited to an Empirical Software Engineering (EMSE) Special Issue. The authors of accepted papers that show outstanding contributions to the FOSS community will have a chance to self-nominate their paper for the MSR FOSS Impact Award.
Open Science Policy
The MSR conference actively supports the adoption of open science principles. Indeed, we consider replicability an explicit evaluation criterion. We expect all contributing authors to disclose the (anonymized and curated) data to increase reproducibility, replicability, and/or recoverability of the studies, provided that there are no ethical, legal, technical, economic, or sensible barriers preventing the disclosure. Please include a supporting statement on data availability in your submitted paper, including an argument for why (some of) the data cannot be made available, if that is the case.
Specifically, we expect all contributing authors to disclose:
- the source code of relevant software used or proposed in the paper, including that used to retrieve and analyze data.
- the data used in the paper (e.g., evaluation data, anonymized survey data, etc.)
- instructions for other researchers describing how to reproduce or replicate the results.
Artifacts should be shared as open data and open source as follows:
- Artifacts should be archived on preserved digital repositories such as zenodo.org, figshare.com, www.softwareheritage.org, osf.io, or institutional repositories. GitHub, GitLab, and similar services for version control systems do not offer properly archived and preserved data. Personal or institutional websites, consumer cloud storage such as Dropbox, or services such as Academia.edu and Researchgate.net may not provide properly archived and preserved data and may increase the risk of violating anonymity if used at submission time.
- Data should be released under a recognized open data license, such as the CC0 dedication or the CC-BY 4.0 license.
- Software should be released under an open source license.
- Different open licenses, if mandated by institutions or regulations, are also permitted.
We encourage authors to make artifacts available upon submission (either privately or publicly) and upon acceptance (publicly).
We recognize that anonymizing artifacts such as source code is more difficult than preserving anonymity in a paper. We ask authors to take a best-effort approach to not reveal their identities. We will also ask reviewers to avoid trying to identify authors by looking at commit histories and other such information that is not easily anonymized. Authors wanting to share GitHub repositories may also look into using https://anonymous.4open.science/, which is an open-source tool that helps you to quickly double-anonymize your repository.
For additional information on creating open artifacts and open access pre- and post-prints, please see this ICSE 2023 page.
Submission Link
Papers must be submitted through HotCRP: https://msr2025-technical.hotcrp.com
Important Dates
- Abstract Deadline: November 6, 2024 AoE
- Paper Deadline: November 9, 2024 AoE
- Author Response Period: December 12 – 15, 2024 AoE
- Author Notification: January 12, 2025 AoE
- Camera Ready Deadline: February 5, 2025 AoE
Accepted Papers and Attendance Expectation
After acceptance, the list of paper authors cannot be changed under any circumstances, and the list of authors on the camera-ready paper must be identical to that on the submitted paper. After acceptance, paper titles cannot be changed except by permission of the Program Co-Chairs, and only when referees have recommended a change for clarity or accuracy with respect to the paper content.
If a submission is accepted, at least one author of the paper is required to register for MSR 2025 and present the paper.