MSR 2025
Mon 28 - Tue 29 April 2025 Ottawa, Ontario, Canada
co-located with ICSE 2025
Tue 29 Apr 2025 12:05 - 12:10 at 214 - Software ecosystems and humans Chair(s): Ahmad Abdellatif

Maven Central is the largest open repository for JVM libraries, hosting just under 15 million artifacts as of November 2024. Its popularity has made it a prime target for malicious actors who upload malware or exploit vulnerabilities: in 2023, one in eight open-source downloads was vulnerable. Analyzing the artifacts is therefore essential to understanding and improving software security and safety, both for individual projects and at large scale. However, current implementations of concrete analyses do not separate the infrastructural task of iterating over and accessing artifacts from their domain-specific analysis task. As a result, the same features are implemented many times in different variations, increasing the potential for bugs as well as the overhead in development and maintenance. With this work, we propose MARIN, a framework for conducting analyses targeting software hosted on Maven Central. MARIN handles common infrastructural tasks in such scenarios, including iterating over artifacts, retrieving metadata, parsing binaries, and resolving dependencies. It is designed to have minimal performance overhead, using both internal caches and the local Maven repository to reduce the number of HTTP calls and computations. This way, researchers can focus solely on implementing their domain-specific analysis task, while MARIN provides configurable facilities to execute it for all artifacts on Maven Central.
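The separation described in the abstract can be illustrated with a minimal Python sketch. This is not MARIN's API: the names `Coordinates`, `ArtifactStore`, and `run_analysis` are hypothetical and chosen only for illustration. The sketch relies on two well-known conventions, the standard Maven repository layout (`groupId` with dots turned into directories, then `artifactId/version/artifactId-version.jar`) and the public Maven Central base URL. The download function is injected so the lookup order (in-memory cache, then local repository, then HTTP) can be exercised without network access.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Dict, Iterable, TypeVar

T = TypeVar("T")

CENTRAL_BASE = "https://repo1.maven.org/maven2"  # public Maven Central base URL


@dataclass(frozen=True)
class Coordinates:
    """Maven coordinates (groupId, artifactId, version) of one artifact."""
    group_id: str
    artifact_id: str
    version: str

    def relative_path(self, extension: str = "jar") -> str:
        # Standard Maven repository layout: dots in the groupId become directories.
        group_path = self.group_id.replace(".", "/")
        file_name = f"{self.artifact_id}-{self.version}.{extension}"
        return f"{group_path}/{self.artifact_id}/{self.version}/{file_name}"


class ArtifactStore:
    """Hypothetical infrastructure layer: in-memory cache, local repository, then HTTP."""

    def __init__(self, local_repo: Path, download: Callable[[str], bytes]):
        self.local_repo = local_repo   # e.g. Path.home() / ".m2" / "repository"
        self.download = download       # injected so the sketch runs offline
        self.cache: Dict[Coordinates, bytes] = {}
        self.http_calls = 0            # counts how often the network is used

    def jar_bytes(self, coords: Coordinates) -> bytes:
        if coords in self.cache:                      # 1. internal cache
            return self.cache[coords]
        local_file = self.local_repo / coords.relative_path()
        if local_file.is_file():                      # 2. local Maven repository
            data = local_file.read_bytes()
        else:                                         # 3. fall back to HTTP
            self.http_calls += 1
            data = self.download(f"{CENTRAL_BASE}/{coords.relative_path()}")
        self.cache[coords] = data
        return data


def run_analysis(
    store: ArtifactStore,
    artifacts: Iterable[Coordinates],
    analysis: Callable[[Coordinates, bytes], T],
) -> Dict[Coordinates, T]:
    """Apply a domain-specific analysis to each artifact; iteration and access live here."""
    return {coords: analysis(coords, store.jar_bytes(coords)) for coords in artifacts}
```

In this sketch, an analysis is just a function of the coordinates and the JAR bytes, so it contains no iteration or retrieval logic. A real download function could be as simple as `lambda url: urllib.request.urlopen(url).read()`; how MARIN itself structures these layers is not stated in the abstract.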

Tue 29 Apr

Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:30
Software ecosystems and humans
Data and Tool Showcase Track / Technical Papers at 214
Chair(s): Ahmad Abdellatif University of Calgary
11:00
10m
Talk
The Ecosystem of Open-Source Music Production Software – A Mining Study on the Development Practices of VST Plugins on GitHub
Technical Papers
Andrei Bogdan University of Amsterdam, Mauricio Verano Merino Vrije Universiteit Amsterdam, Ivano Malavolta Vrije Universiteit Amsterdam
Pre-print
11:10
10m
Talk
Can LLMs Replace Manual Annotation of Software Engineering Artifacts?
Technical Papers
Toufique Ahmed IBM Research, Prem Devanbu University of California at Davis, Christoph Treude Singapore Management University, Michael Pradel University of Stuttgart
Pre-print
11:20
10m
Talk
Investigating the Understandability of Review Comments on Code Change Requests
Technical Papers
Md Shamimur Rahman University of Saskatchewan, Canada, Zadia Codabux University of Saskatchewan, Chanchal K. Roy University of Saskatchewan, Canada
11:30
10m
Talk
Mining a Decade of Contributor Dynamics in Ethereum: A Longitudinal Study
Technical Papers
Matteo Vaccargiu University of Cagliari, Sabrina Aufiero University College London (UCL), Cheick Ba Queen Mary University of London, Silvia Bartolucci University College London, Richard Clegg Queen Mary University of London, Daniel Graziotin University of Hohenheim, Rumyana Neykova Brunel University London, Roberto Tonelli University of Cagliari, Giuseppe Destefanis Brunel University London
Pre-print
11:40
10m
Talk
Is it Really Fun? Detecting Low Engagement Events in Video Games
Technical Papers
Emanuela Guglielmi University of Molise, Gabriele Bavota Software Institute @ Università della Svizzera Italiana, Nicole Novielli University of Bari, Rocco Oliveto University of Molise, Simone Scalabrino University of Molise
11:50
5m
Talk
A Dataset of Contributor Activities in the NumFocus Open-Source Community
Data and Tool Showcase Track
Youness Hourri University of Mons, Alexandre Decan University of Mons; F.R.S.-FNRS, Tom Mens University of Mons
Pre-print
11:55
5m
Talk
Jupyter Notebook Activity Dataset
Data and Tool Showcase Track
Tomoki Nakamaru The University of Tokyo, Tomomasa Matsunaga The University of Tokyo, Tetsuro Yamazaki The University of Tokyo
12:00
5m
Talk
CoPhi - Mining C/C++ Packages for Conan Ecosystem Analysis
Data and Tool Showcase Track
Vivek Sarkar University of Washington, Anemone Kampkötter TU Dortmund, Ben Hermann TU Dortmund
Pre-print
12:05
5m
Talk
MARIN: A Research-Centric Interface for Querying Software Artifacts on Maven Repositories
Data and Tool Showcase Track
Johannes Düsing TU Dortmund, Jared Chiaramonte Arizona State University, Ben Hermann TU Dortmund
Pre-print
12:10
5m
Talk
GitProjectHealth: an Extensible Framework for Git Social Platform Mining
Data and Tool Showcase Track
Nicolas Hlad Berger-Levrault, Benoit Verhaeghe Berger-Levrault, Kilian Bauvent Berger-Levrault
12:15
5m
Talk
Myriad People. Open Source Software for New Media Arts
Data and Tool Showcase Track
Benoit Baudry Université de Montréal, Erik Natanael Gustafsson Independent artist, Roni Kaufman Independent artist, Maria Kling Independent artist
Pre-print
12:20
5m
Talk
OpenMent: A Dataset of Mentor-Mentee Interactions in Google Summer of Code
Data and Tool Showcase Track
Erfan Raoofian University of British Columbia, Fatemeh Hendijani Fard University of British Columbia, Ifeoma Adaji University of British Columbia, Gema Rodríguez-Pérez University of British Columbia (UBC)
12:25
5m
Talk
Under the Blueprints: Parsing Unreal Engine’s Visual Scripting at Scale
Data and Tool Showcase Track
Kalvin Eng University of Alberta, Abram Hindle University of Alberta