MSR 2025
Mon 28 - Tue 29 April 2025 Ottawa, Ontario, Canada
co-located with ICSE 2025
Tue 29 Apr 2025 11:40 - 11:45 at 215 - Build systems and DevOps Chair(s): Massimiliano Di Penta

In recent years, continuous integration and deployment (CI/CD) has become increasingly popular in both the open-source community and industry. Evaluating CI/CD performance is a critical aspect of software development, as it not only helps minimize execution costs but also ensures faster feedback for developers. Despite its importance, there is limited fine-grained knowledge about the performance of CI/CD processes, while this knowledge is essential for identifying bottlenecks and optimization opportunities. Moreover, the availability of large-scale, publicly accessible datasets of CI/CD logs remains scarce. The few datasets that do exist are often outdated and lack comprehensive coverage. To address this gap, we introduce GHALogs, a new dataset comprising 116k CI/CD workflows executed using GitHub Actions (GHA) across 25k public code projects spanning 20 different programming languages. This dataset includes 513k workflow runs encompassing 2.3 million individual steps. For each workflow run, we provide detailed metadata along with complete run logs. To the best of our knowledge, this is the largest dataset of CI/CD runs that includes full log data. The inclusion of these logs enables more in-depth analysis of CI/CD pipelines, offering insights that cannot be gleaned solely from code repositories. We postulate that this dataset will facilitate future CI/CD pipeline behavior research through log-based analysis. Potential applications include performance evaluation (e.g., measuring task execution times) and root cause analysis (e.g., identifying reasons for pipeline failures).

Tue 29 Apr

Displayed time zone: Eastern Time (US & Canada) change

11:00 - 12:30
Build systems and DevOpsData and Tool Showcase Track / Technical Papers / Tutorials at 215
Chair(s): Massimiliano Di Penta University of Sannio, Italy
11:00
10m
Talk
Build Scripts Need Maintenance Too: A Study on Refactoring and Technical Debt in Build Systems
Technical Papers
Anwar Ghammam Oakland University, Dhia Elhaq Rzig University of Michigan - Dearborn, Mohamed Almukhtar Oakland University, Rania Khalsi University of Michigan - Flint, Foyzul Hassan University of Michigan at Dearborn, Marouane Kessentini Grand Valley State University
11:10
10m
Talk
LLMSecConfig: An LLM-Based Approach for Fixing Software Container Misconfigurations
Technical Papers
Ziyang Ye The University of Adelaide, Triet Le The University of Adelaide, Muhammad Ali Babar School of Computer Science, The University of Adelaide
Pre-print
11:20
10m
Talk
How Do Infrastructure-as-Code Practitioners Update Their Dependencies? An Empirical Study on Terraform Module Updates
Technical Papers
Mahi Begoug , Ali Ouni ETS Montreal, University of Quebec, Moataz Chouchen Department of Electrical and Computer Engineering, Concordia University, Montreal, Canada
11:30
5m
Talk
TerraDS: A Dataset for Terraform HCL Programs
Data and Tool Showcase Track
Christoph Buehler University of St. Gallen, David Spielmann University of St. Gallen, Roland Meier armasuisse, Guido Salvaneschi University of St. Gallen
Pre-print
11:35
5m
Talk
CARDS: A collection of package, revision, and miscelleneous dependency graphs
Data and Tool Showcase Track
Euxane TRAN-GIRARD LIGM, CNRS, Université Gustave Eiffel, Laurent BULTEAU LIGM, CNRS, Université Gustave Eiffel, Pierre-Yves DAVID Octobus S.c.o.p.
Pre-print
11:40
5m
Talk
GHALogs: Large-scale dataset of GitHub Actions runs
Data and Tool Showcase Track
Florent Moriconi EURECOM, AMADEUS, Thomas Durieux TU Delft, Jean-Rémy Falleri Bordeaux INP, Raphaël Troncy EURECOM, Aurélien Francillon EURECOM
11:45
5m
Talk
OSPtrack: A Labeled Dataset Targeting Simulated Execution of Open-Source Software
Data and Tool Showcase Track
Zhuoran Tan University of Glasgow, Christos Anagnostopoulos University of Glasgow, Jeremy Singer University of Glasgow
11:50
40m
Tutorial
Agents for Software Development
Tutorials
Graham Neubig Carnegie Mellon University