MSR 2025
Mon 28 - Tue 29 April 2025 Ottawa, Ontario, Canada
co-located with ICSE 2025
Mon 28 Apr 2025 12:20 - 12:25 at 215 - Security and legal aspects Chair(s): Mohammad Ghafari
Mon 28 Apr 2025 13:00 - 14:00 at Canada Hall 3 Poster Area - MSR Poster (Monday)
Tue 29 Apr 2025 13:00 - 14:00 at Canada Hall 3 Poster Area - MSR Poster (Tuesday)

Current malware (malicious software) analysis tools focus on detection and family classification but fail to provide clear, actionable narrative insights into a sample’s malicious activity. There is therefore a need for a tool that translates raw malware data into human-readable descriptions. Such a tool would accelerate incident response, reduce malware analysts’ cognitive load, and enable individuals with limited technical expertise to understand malicious software behaviour. With this objective, we present MaLAware, which automatically summarizes the full spectrum of malicious activity of malware executables. MaLAware leverages Cuckoo Sandbox analysis and large language models (LLMs) to explain malware behaviour. It parses sandbox JSON reports and uses LLMs to correlate executable activities and generate concise summaries. We benchmark the tool’s performance on five open-source LLMs. The evaluation uses a human-written malware behaviour description dataset as ground truth. Each model’s performance is measured using 11 performance metrics, which increases confidence in MaLAware’s effectiveness. The current version of the tool, i.e., v0, supports Qwen2.5-7B, Llama2-7B, Llama3.1-8B, Mistral-7B, and Falcon-7B, along with a quantization feature for resource-constrained environments. MaLAware lays a foundation for future research in malware behaviour explanation, and its extensive evaluation sets a benchmark for LLMs’ ability to narrate malware behaviour in an actionable and comprehensive manner.
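The report-parsing step described in the abstract can be illustrated with a minimal sketch. This is not MaLAware's actual code: the function names (`extract_highlights`, `build_prompt`), the prompt wording, and the sample data are hypothetical, and the report keys used (`signatures`, `behavior.summary`, `network.domains`) are assumed to follow the layout of a typical Cuckoo-style JSON report. The sketch stops at building the prompt text; sending it to an LLM is left out.

```python
import json

# Hypothetical, heavily trimmed sandbox report. The key names are an
# assumption about the report layout, not taken from the MaLAware paper.
SAMPLE_REPORT = json.loads(r"""
{
  "signatures": [
    {"description": "Installs itself for autorun at Windows startup", "severity": 3}
  ],
  "behavior": {
    "summary": {"file_created": ["C:\\Users\\demo\\updater.exe"]}
  },
  "network": {
    "domains": [{"domain": "example.invalid", "ip": "203.0.113.5"}]
  }
}
""")


def extract_highlights(report: dict, max_items: int = 5) -> dict:
    """Pull a few behaviour-relevant fields out of a parsed report.

    Missing sections are tolerated and yield empty lists.
    """
    summary = report.get("behavior", {}).get("summary", {})
    domains = report.get("network", {}).get("domains", [])
    return {
        "signatures": [s.get("description", "") for s in report.get("signatures", [])][:max_items],
        "files_created": summary.get("file_created", [])[:max_items],
        "registry_keys_written": summary.get("regkey_written", [])[:max_items],
        "domains": [d.get("domain", "") for d in domains][:max_items],
    }


def build_prompt(highlights: dict) -> str:
    """Turn the extracted fields into plain prompt text, skipping empty ones."""
    lines = ["Summarize the behaviour of this malware sample in plain language.", ""]
    for key, values in highlights.items():
        if values:
            lines.append(f"{key}: " + "; ".join(values))
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt(extract_highlights(SAMPLE_REPORT)))
```

Trimming each field to a few items is one simple way to keep the prompt within the context window of a 7B or 8B model; the tool itself may select and condense report content differently.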

Mon 28 Apr

Displayed time zone: Eastern Time (US & Canada)

11:00 - 12:30
Security and legal aspects: Industry Track / Data and Tool Showcase Track / Technical Papers at 215
Chair(s): Mohammad Ghafari TU Clausthal
11:00
10m
Talk
Wolves in the Repository: A Software Engineering Analysis of the XZ Utils Supply Chain Attack
Technical Papers
Piotr Przymus Nicolaus Copernicus University in Toruń, Poland, Thomas Durieux TU Delft
Pre-print
11:10
10m
Talk
Software Composition Analysis and Supply Chain Security in Apache Projects: an Empirical Study
Technical Papers
Sabato Nocera University of Salerno, Sira Vegas Universidad Politecnica de Madrid, Giuseppe Scanniello University of Salerno, Natalia Juristo Universidad Politecnica de Madrid
Pre-print
11:20
10m
Talk
Good practice versus reality: a landscape analysis of Research Software metadata adoption in European Open Science Clusters
Technical Papers
Anas El Hounsri Universidad Politécnica de Madrid, Daniel Garijo Universidad Politécnica de Madrid
11:30
10m
Talk
Towards Security Commit Message Standardization
Technical Papers
Sofia Reis Instituto Superior Técnico, U. Lisboa & INESC-ID, Rui Abreu Faculty of Engineering of the University of Porto, Portugal, Corina Pasareanu CMU, NASA, KBR
11:40
10m
Talk
From Industrial Practices to Academia: Uncovering the Gap in Vulnerability Research and Practice
Technical Papers
Zhuang Liu, Xing Hu Zhejiang University, Jiayuan Zhou Queen's University, Xin Xia Huawei
11:50
5m
Talk
Patch Me If You Can—Securing the Linux Kernel
Industry Track
Gunnar Kudrjavets Amazon Web Services, USA
Pre-print
11:55
5m
Talk
OSS License Identification at Scale: A Comprehensive Dataset Using World of Code
Data and Tool Showcase Track
Mahmoud Jahanshahi University of Tennessee, David Reid University of Tennessee, Adam McDaniel University of Tennessee Knoxville, Audris Mockus University of Tennessee
12:00
5m
Talk
SCRUBD: Smart Contracts Reentrancy and Unhandled Exceptions Vulnerability Dataset
Data and Tool Showcase Track
Chavhan Sujeet Yashavant Indian Institute of Technology, Kanpur, Mitrajsinh Chavda Indian Institute of Technology Kanpur, India, Saurabh Kumar Indian Institute of Technology Hyderabad, India, Amey Karkare IIT Kanpur, Angshuman Karmakar Indian Institute of Technology Kanpur, India
Pre-print
12:05
5m
Talk
ICVul: A Well-labeled C/C++ Vulnerability Dataset with Comprehensive Metadata and VCCs
Data and Tool Showcase Track
Chaomeng Lu DistriNet Group-T, KU Leuven, Tianyu Li DistriNet Group-T, KU Leuven, Toon Dehaene KU Leuven, Bert Lagaisse DistriNet Group-T, KU Leuven
12:10
5m
Talk
A Dataset of Software Bill of Materials for Evaluating SBOM Consumption Tools
Data and Tool Showcase Track
Rio Kishimoto Osaka University, Tetsuya Kanda Notre Dame Seishin University, Yuki Manabe The University of Fukuchiyama, Katsuro Inoue Nanzan University, Shi Qiu Toshiba, Yoshiki Higo Osaka University
Pre-print
12:15
5m
Talk
Wild SBOMs: a Large-scale Dataset of Software Bills of Materials from Public Code
Data and Tool Showcase Track
Luis Soeiro LTCI, Télécom Paris, Institut Polytechnique de Paris, Thomas Robert LTCI, Télécom Paris, Institut Polytechnique de Paris, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris
Pre-print
12:20
5m
Talk
MaLAware: Automating the Comprehension of Malicious Software Behaviours using Large Language Models (LLMs)
Data and Tool Showcase Track
BIKASH SAHA Indian Institute of Technology Kanpur, Nanda Rani Indian Institute of Technology Kanpur, Sandeep K. Shukla Indian Institute of Technology Kanpur
Pre-print
13:00 - 14:00
13:00
60m
Talk
SPRINT: An Assistant for Issue Report Management
Data and Tool Showcase Track
Ahmed Adnan, Antu Saha William & Mary, Oscar Chaparro William & Mary
Pre-print
13:00
60m
Talk
Combining Large Language Models with Static Analyzers for Code Review Generation
Technical Papers
Imen Jaoua DIRO, Université de Montréal, Oussama Ben Sghaier DIRO, Université de Montréal, Houari Sahraoui DIRO, Université de Montréal
Pre-print
13:00
60m
Talk
Can LLMs Replace Manual Annotation of Software Engineering Artifacts?
Technical Papers
Toufique Ahmed IBM Research, Prem Devanbu University of California at Davis, Christoph Treude Singapore Management University, Michael Pradel University of Stuttgart
Pre-print
13:00
60m
Talk
Dependency Update Adoption Patterns in the Maven Software Ecosystem
Mining Challenge
Baltasar Berretta College of Wooster, Augustus Thomas College of Wooster, Heather Guarnera The College of Wooster
13:00
60m
Talk
Popularity and Innovation in Maven Central
Mining Challenge
Nkiru Ede Victoria University of Wellington, Jens Dietrich Victoria University of Wellington, Ulrich Zülicke Victoria University of Wellington
Pre-print
13:00
60m
Talk
Chasing the Clock: How Fast Are Vulnerabilities Fixed in the Maven Ecosystem?
Mining Challenge
Md Fazle Rabbi Idaho State University, Arifa Islam Champa Idaho State University, Rajshakhar Paul Wayne State University, Minhaz F. Zibran Idaho State University
Pre-print
13:00
60m
Talk
SCRUBD: Smart Contracts Reentrancy and Unhandled Exceptions Vulnerability Dataset
Data and Tool Showcase Track
Chavhan Sujeet Yashavant Indian Institute of Technology, Kanpur, Mitrajsinh Chavda Indian Institute of Technology Kanpur, India, Saurabh Kumar Indian Institute of Technology Hyderabad, India, Amey Karkare IIT Kanpur, Angshuman Karmakar Indian Institute of Technology Kanpur, India
Pre-print
13:00
60m
Talk
TerraDS: A Dataset for Terraform HCL Programs
Data and Tool Showcase Track
Christoph Buehler University of St. Gallen, David Spielmann University of St. Gallen, Roland Meier armasuisse, Guido Salvaneschi University of St. Gallen
Pre-print
13:00
60m
Talk
Mining a Decade of Contributor Dynamics in Ethereum: A Longitudinal Study
Technical Papers
Matteo Vaccargiu University of Cagliari, Sabrina Aufiero University College London (UCL), Cheick Ba Queen Mary University of London, Silvia Bartolucci University College London, Richard Clegg Queen Mary University London, Daniel Graziotin University of Hohenheim, Rumyana Neykova Brunel University London, Roberto Tonelli University of Cagliari, Giuseppe Destefanis Brunel University London
Pre-print
13:00
60m
Talk
CoMRAT: Commit Message Rationale Analysis Tool
Data and Tool Showcase Track
Mouna Dhaouadi University of Montreal, Bentley Oakes Polytechnique Montréal, Michalis Famelis Université de Montréal
Media Attached, File Attached
13:00
60m
Talk
A Dataset of Software Bill of Materials for Evaluating SBOM Consumption Tools
Data and Tool Showcase Track
Rio Kishimoto Osaka University, Tetsuya Kanda Notre Dame Seishin University, Yuki Manabe The University of Fukuchiyama, Katsuro Inoue Nanzan University, Shi Qiu Toshiba, Yoshiki Higo Osaka University
Pre-print
13:00
60m
Talk
A Dataset of Contributor Activities in the NumFocus Open-Source Community
Data and Tool Showcase Track
Youness Hourri University of Mons, Alexandre Decan University of Mons; F.R.S.-FNRS, Tom Mens University of Mons
Pre-print
13:00
60m
Talk
Does Functional Package Management Enable Reproducible Builds at Scale? Yes.
Technical Papers
Julien Malka LTCI, Télécom Paris, Institut Polytechnique de Paris, France, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris, Théo Zimmermann Télécom Paris, Polytechnic Institute of Paris
Pre-print
13:00
60m
Talk
HaPy-Bug - Human Annotated Python Bug Resolution Dataset
Data and Tool Showcase Track
Piotr Przymus Nicolaus Copernicus University in Toruń, Poland, Mikołaj Fejzer Nicolaus Copernicus University in Toruń, Jakub Narębski Nicolaus Copernicus University in Toruń, Radosław Woźniak Nicolaus Copernicus University in Toruń, Łukasz Halada University of Wrocław, Poland, Aleksander Kazecki Nicolaus Copernicus University in Toruń, Mykhailo Molchanov Igor Sikorsky Kyiv Polytechnic Institute, Ukraine, Krzysztof Stencel University of Warsaw
Pre-print
13:00
60m
Talk
Do LLMs Provide Links to Code Similar to what they Generate? A Study with Gemini and Bing CoPilot
Technical Papers
Daniele Bifolco University of Sannio, Pietro Cassieri University of Salerno, Giuseppe Scanniello University of Salerno, Massimiliano Di Penta University of Sannio, Italy, Fiorella Zampetti University of Sannio, Italy
Pre-print
13:00
60m
Talk
Out of Sight, Still at Risk: The Lifecycle of Transitive Vulnerabilities in Maven
Mining Challenge
Piotr Przymus Nicolaus Copernicus University in Toruń, Poland, Mikołaj Fejzer Nicolaus Copernicus University in Toruń, Jakub Narębski Nicolaus Copernicus University in Toruń, Krzysztof Rykaczewski Nicolaus Copernicus University in Toruń, Poland, Krzysztof Stencel University of Warsaw
Pre-print
13:00
60m
Talk
Refactoring for Dockerfile Quality: A Dive into Developer Practices and Automation Potential
Technical Papers
Emna Ksontini University of Michigan, Meriem Mastouri University of Michigan, Rania Khalsi University of Michigan - Flint, Wael Kessentini DePaul University
13:00
60m
Talk
Cascading Effects: Analyzing Project Failure Impact in the Maven Central Ecosystem
Mining Challenge
Mina Shehata Belmont University, Saidmakhmud Makhkamjonoov Belmont University, Mahad Syed Belmont University, Esteban Parra Belmont University
13:00
60m
Talk
MaLAware: Automating the Comprehension of Malicious Software Behaviours using Large Language Models (LLMs)
Data and Tool Showcase Track
BIKASH SAHA Indian Institute of Technology Kanpur, Nanda Rani Indian Institute of Technology Kanpur, Sandeep K. Shukla Indian Institute of Technology Kanpur
Pre-print
13:00
60m
Talk
Investigating the Understandability of Review Comments on Code Change Requests
Technical Papers
Md Shamimur Rahman University of Saskatchewan, Canada, Zadia Codabux University of Saskatchewan, Chanchal K. Roy University of Saskatchewan, Canada

Tue 29 Apr

13:00 - 14:00
13:00
60m
Talk
Chasing the Clock: How Fast Are Vulnerabilities Fixed in the Maven Ecosystem?
Mining Challenge
Md Fazle Rabbi Idaho State University, Arifa Islam Champa Idaho State University, Rajshakhar Paul Wayne State University, Minhaz F. Zibran Idaho State University
Pre-print
13:00
60m
Talk
MaLAware: Automating the Comprehension of Malicious Software Behaviours using Large Language Models (LLMs)
Data and Tool Showcase Track
BIKASH SAHA Indian Institute of Technology Kanpur, Nanda Rani Indian Institute of Technology Kanpur, Sandeep K. Shukla Indian Institute of Technology Kanpur
Pre-print
13:00
60m
Talk
A Dataset of Contributor Activities in the NumFocus Open-Source Community
Data and Tool Showcase Track
Youness Hourri University of Mons, Alexandre Decan University of Mons; F.R.S.-FNRS, Tom Mens University of Mons
Pre-print
13:00
60m
Talk
Popularity and Innovation in Maven Central
Mining Challenge
Nkiru Ede Victoria University of Wellington, Jens Dietrich Victoria University of Wellington, Ulrich Zülicke Victoria University of Wellington
Pre-print
13:00
60m
Talk
TerraDS: A Dataset for Terraform HCL Programs
Data and Tool Showcase Track
Christoph Buehler University of St. Gallen, David Spielmann University of St. Gallen, Roland Meier armasuisse, Guido Salvaneschi University of St. Gallen
Pre-print
13:00
60m
Talk
SPRINT: An Assistant for Issue Report Management
Data and Tool Showcase Track
Ahmed Adnan, Antu Saha William & Mary, Oscar Chaparro William & Mary
Pre-print
13:00
60m
Talk
Does Functional Package Management Enable Reproducible Builds at Scale? Yes.
Technical Papers
Julien Malka LTCI, Télécom Paris, Institut Polytechnique de Paris, France, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris, Théo Zimmermann Télécom Paris, Polytechnic Institute of Paris
Pre-print
13:00
60m
Talk
Dependency Update Adoption Patterns in the Maven Software Ecosystem
Mining Challenge
Baltasar Berretta College of Wooster, Augustus Thomas College of Wooster, Heather Guarnera The College of Wooster
13:00
60m
Talk
A Dataset of Software Bill of Materials for Evaluating SBOM Consumption Tools
Data and Tool Showcase Track
Rio Kishimoto Osaka University, Tetsuya Kanda Notre Dame Seishin University, Yuki Manabe The University of Fukuchiyama, Katsuro Inoue Nanzan University, Shi Qiu Toshiba, Yoshiki Higo Osaka University
Pre-print
13:00
60m
Talk
Investigating the Understandability of Review Comments on Code Change Requests
Technical Papers
Md Shamimur Rahman University of Saskatchewan, Canada, Zadia Codabux University of Saskatchewan, Chanchal K. Roy University of Saskatchewan, Canada
13:00
60m
Talk
Refactoring for Dockerfile Quality: A Dive into Developer Practices and Automation Potential
Technical Papers
Emna Ksontini University of Michigan, Meriem Mastouri University of Michigan, Rania Khalsi University of Michigan - Flint, Wael Kessentini DePaul University
13:00
60m
Talk
Combining Large Language Models with Static Analyzers for Code Review Generation
Technical Papers
Imen Jaoua DIRO, Université de Montréal, Oussama Ben Sghaier DIRO, Université de Montréal, Houari Sahraoui DIRO, Université de Montréal
Pre-print
13:00
60m
Talk
Cascading Effects: Analyzing Project Failure Impact in the Maven Central Ecosystem
Mining Challenge
Mina Shehata Belmont University, Saidmakhmud Makhkamjonoov Belmont University, Mahad Syed Belmont University, Esteban Parra Belmont University
13:00
60m
Talk
CoMRAT: Commit Message Rationale Analysis Tool
Data and Tool Showcase Track
Mouna Dhaouadi University of Montreal, Bentley Oakes Polytechnique Montréal, Michalis Famelis Université de Montréal
Media Attached, File Attached
13:00
60m
Talk
Can LLMs Replace Manual Annotation of Software Engineering Artifacts?
Technical Papers
Toufique Ahmed IBM Research, Prem Devanbu University of California at Davis, Christoph Treude Singapore Management University, Michael Pradel University of Stuttgart
Pre-print
13:00
60m
Talk
Do LLMs Provide Links to Code Similar to what they Generate? A Study with Gemini and Bing CoPilot
Technical Papers
Daniele Bifolco University of Sannio, Pietro Cassieri University of Salerno, Giuseppe Scanniello University of Salerno, Massimiliano Di Penta University of Sannio, Italy, Fiorella Zampetti University of Sannio, Italy
Pre-print
13:00
60m
Talk
Mining a Decade of Contributor Dynamics in Ethereum: A Longitudinal Study
Technical Papers
Matteo Vaccargiu University of Cagliari, Sabrina Aufiero University College London (UCL), Cheick Ba Queen Mary University of London, Silvia Bartolucci University College London, Richard Clegg Queen Mary University London, Daniel Graziotin University of Hohenheim, Rumyana Neykova Brunel University London, Roberto Tonelli University of Cagliari, Giuseppe Destefanis Brunel University London
Pre-print
13:00
60m
Talk
SCRUBD: Smart Contracts Reentrancy and Unhandled Exceptions Vulnerability Dataset
Data and Tool Showcase Track
Chavhan Sujeet Yashavant Indian Institute of Technology, Kanpur, Mitrajsinh Chavda Indian Institute of Technology Kanpur, India, Saurabh Kumar Indian Institute of Technology Hyderabad, India, Amey Karkare IIT Kanpur, Angshuman Karmakar Indian Institute of Technology Kanpur, India
Pre-print
13:00
60m
Talk
Out of Sight, Still at Risk: The Lifecycle of Transitive Vulnerabilities in Maven
Mining Challenge
Piotr Przymus Nicolaus Copernicus University in Toruń, Poland, Mikołaj Fejzer Nicolaus Copernicus University in Toruń, Jakub Narębski Nicolaus Copernicus University in Toruń, Krzysztof Rykaczewski Nicolaus Copernicus University in Toruń, Poland, Krzysztof Stencel University of Warsaw
Pre-print
13:00
60m
Talk
HaPy-Bug - Human Annotated Python Bug Resolution Dataset
Data and Tool Showcase Track
Piotr Przymus Nicolaus Copernicus University in Toruń, Poland, Mikołaj Fejzer Nicolaus Copernicus University in Toruń, Jakub Narębski Nicolaus Copernicus University in Toruń, Radosław Woźniak Nicolaus Copernicus University in Toruń, Łukasz Halada University of Wrocław, Poland, Aleksander Kazecki Nicolaus Copernicus University in Toruń, Mykhailo Molchanov Igor Sikorsky Kyiv Polytechnic Institute, Ukraine, Krzysztof Stencel University of Warsaw
Pre-print