MaLAware: Automating the Comprehension of Malicious Software Behaviours using Large Language Models (LLMs)
This program is tentative and subject to change.
Current malware (malicious software) analysis tools focus on detection and family classification but fail to provide clear and actionable narrative insights into the malicious activity of the malware. Therefore, there is a need for a tool that translates raw malware data into human-readable descriptions. Such a tool accelerates incident response, reduces malware analysts’ cognitive load, and enables individuals with limited technical expertise to understand malicious software behaviour. With this objective, we present MaLAware, which automatically summarizes the full spectrum of malicious activity of malware executables. MaLAware leverages Cuckoo Sandbox analysis and large language models (LLMs) to explain malware behaviour. It parses sandbox JSON reports and uses LLMs to correlate executable activities and generate concise summaries. We benchmark the tool’s performance on five open-source LLMs. The evaluation uses a human-written malware behaviour description dataset as ground truth. Each model’s performance is measured using 11 performance metrics, which increases confidence in MaLAware’s effectiveness. The current version of the tool, i.e., v0, supports Qwen2.5-7B, Llama2-7B, Llama3.1-8B, Mistral-7B, and Falcon-7B, along with a quantization feature for resource-constrained environments. MaLAware lays a foundation for future research in malware behaviour explanation, and its extensive evaluation sets a benchmark for LLMs’ ability to narrate malware behaviour in an actionable and comprehensive manner.
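As a rough illustration of the report-parsing step described in the abstract, the following minimal sketch extracts a few behaviour fields from a sandbox JSON report and assembles them into a prompt that could be passed to an LLM. It is not MaLAware's implementation: the function names `summarize_report` and `build_prompt` are hypothetical, the field names (`signatures`, `behavior.summary`, `network.hosts`) are assumed to follow a Cuckoo-style report layout, and the sample report is invented for demonstration. The LLM call itself is omitted.

```python
import json


def summarize_report(report: dict, max_items: int = 5) -> dict:
    """Pull a few behaviour-related fields out of a sandbox report dict.

    The key names used here are assumptions about a Cuckoo-style report.
    """
    signatures = [
        sig["description"]
        for sig in report.get("signatures", [])
        if sig.get("description")
    ]
    summary = report.get("behavior", {}).get("summary", {})
    hosts = report.get("network", {}).get("hosts", [])
    return {
        "signatures": signatures[:max_items],
        "files_written": summary.get("file_written", [])[:max_items],
        "registry_keys_written": summary.get("regkey_written", [])[:max_items],
        "network_hosts": hosts[:max_items],
    }


def build_prompt(features: dict) -> str:
    """Turn the extracted features into a plain-text prompt for an LLM."""
    lines = ["Summarize the behaviour of this executable in plain language.", ""]
    for name, values in features.items():
        if values:  # skip empty categories to keep the prompt short
            lines.append(f"{name}:")
            lines.extend(f"- {value}" for value in values)
    return "\n".join(lines)


# Invented sample report, for illustration only.
sample_report = json.loads("""
{
  "signatures": [
    {"name": "creates_appdata_file",
     "description": "Creates a file in the user AppData folder"}
  ],
  "behavior": {"summary": {"file_written": ["C:/Users/demo/AppData/payload.dll"]}},
  "network": {"hosts": ["203.0.113.7"]}
}
""")

features = summarize_report(sample_report)
prompt = build_prompt(features)
print(prompt)
```

Truncating each category with `max_items` is one simple way to keep the prompt within a small model's context window; a real tool would likely need a more careful selection strategy.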
Mon 28 Apr
Displayed time zone: Eastern Time (US & Canada)
11:00 - 12:30
Time | Duration | Type | Title | Track | Authors
11:00 | 10m | Talk | Wolves in the Repository: A Software Engineering Analysis of the XZ Utils Supply Chain Attack | Technical Papers |
11:10 | 10m | Talk | Software Composition Analysis and Supply Chain Security in Apache Projects: an Empirical Study | Technical Papers | Sabato Nocera University of Salerno, Sira Vegas Universidad Politecnica de Madrid, Giuseppe Scanniello University of Salerno, Natalia Juristo Universidad Politecnica de Madrid
11:20 | 10m | Talk | Good practice versus reality: a landscape analysis of Research Software metadata adoption in European Open Science Clusters | Technical Papers |
11:30 | 10m | Talk | Towards Security Commit Message Standardization | Technical Papers | Sofia Reis Instituto Superior Técnico, U. Lisboa & INESC-ID, Rui Abreu INESC-ID; University of Porto, Corina Pasareanu CMU, NASA, KBR
11:40 | 10m | Talk | From Industrial Practices to Academia: Uncovering the Gap in Vulnerability Research and Practice | Technical Papers |
11:50 | 5m | Talk | Patch Me If You Can—Securing the Linux Kernel | Industry Track | Gunnar Kudrjavets Amazon Web Services, USA
11:55 | 5m | Talk | OSS License Identification at Scale: A Comprehensive Dataset Using World of Code | Data and Tool Showcase Track | Mahmoud Jahanshahi Research Assistant, University of Tennessee Knoxville, David Reid University of Tennessee, Adam McDaniel University of Tennessee Knoxville, Audris Mockus The University of Tennessee
12:00 | 5m | Talk | SCRUBD: Smart Contracts Reentrancy and Unhandled Exceptions Vulnerability Dataset | Data and Tool Showcase Track | Chavhan Sujeet Yashavant Indian Institute of Technology, Kanpur, Mitrajsinh Chavda Indian Institute of Technology Kanpur, India, Saurabh Kumar Indian Institute of Technology Hyderabad, India, Amey Karkare IIT Kanpur, Angshuman Karmakar Indian Institute of Technology Kanpur, India
12:05 | 5m | Talk | ICVul: A Well-labeled C/C++ Vulnerability Dataset with Comprehensive Metadata and VCCs | Data and Tool Showcase Track | Chaomeng Lu DistriNet Group-T, KU Leuven, Tianyu Li DistriNet Group-T, KU Leuven, Toon Dehaene KU Leuven, Bert Lagaisse DistriNet Group-T, KU Leuven
12:10 | 5m | Talk | A Dataset of Software Bill of Materials for Evaluating SBOM Consumption Tools | Data and Tool Showcase Track | Rio Kishimoto Osaka University, Tetsuya Kanda Notre Dame Seishin University, Yuki Manabe The University of Fukuchiyama, Katsuro Inoue Nanzan University, Shi Qiu Toshiba, Yoshiki Higo Osaka University
12:15 | 5m | Talk | Wild SBOMs: a Large-scale Dataset of Software Bills of Materials from Public Code | Data and Tool Showcase Track | Luis Soeiro LTCI, Télécom Paris, Institut Polytechnique de Paris, Thomas Robert LTCI, Télécom Paris, Institut Polytechnique de Paris, Stefano Zacchiroli Télécom Paris, Polytechnic Institute of Paris
12:20 | 5m | Talk | MaLAware: Automating the Comprehension of Malicious Software Behaviours using Large Language Models (LLMs) | Data and Tool Showcase Track | BIKASH SAHA Indian Institute of Technology Kanpur, Nanda Rani Indian Institute of Technology Kanpur, Sandeep K. Shukla Indian Institute of Technology Kanpur