pyMethods2Test: A Dataset of Python Tests Mapped to Focal Methods (MSR 2025 - Data and Tool Showcase Track) - MSR 2025

Mon 28 - Tue 29 April 2025 Ottawa, Ontario, Canada

co-located with ICSE 2025

Who

Idriss Abdelmadjid, Robert Dyer

Track

MSR 2025 Data and Tool Showcase Track

Time Zone

The program is currently displayed in (GMT-04:00) Eastern Time (US & Canada).

When

Tue 29 Apr 2025, 15:00 - 15:05, at 215 (Software quality session). Chair(s): Mohammad Hamdaqa

Abstract

Python is one of the fastest-growing programming languages and currently ranks as the top language in many lists, even recently overtaking JavaScript as the top language on GitHub. Given its importance in data science and machine learning, it is imperative to be able to effectively train LLMs to generate good unit test cases for Python code. This motivates the need for a large dataset to provide training and testing data. To date, while other large datasets exist for languages like Java, none publicly exist for Python. Python poses difficult challenges in generating such a dataset, due to its less rigid naming requirements. In this work, we consider two commonly used Python unit testing frameworks: Pytest and unittest. We analyze a large corpus of over 88K open-source GitHub projects utilizing these testing frameworks. Using a carefully designed set of heuristics, we are able to locate over 22 million test methods. We then analyze the test and non-test code and map individual unit tests to the focal method being tested. This provides an explicit traceability link from the test to the tested method. Our pyMethods2Test dataset contains over 2 million of these focal method mappings, as well as the ability to generate useful context for input to LLMs. The pyMethods2Test dataset is publicly available on Zenodo at: https://doi.org/10.5281/zenodo.14264519
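The abstract does not spell out the heuristics used to link a test to its focal method, so the following is only a minimal sketch of one plausible name-based approach, not the authors' implementation. It assumes that test functions start with `test` (the default convention in both Pytest and unittest) and that the focal method is the longest non-test function name matching the start of the test's name. The helper names `find_functions` and `map_tests_to_focal` are hypothetical.

```python
from __future__ import annotations

import ast


def find_functions(source: str) -> list[str]:
    """Return the names of all functions and methods defined in the source."""
    tree = ast.parse(source)
    return [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]


def map_tests_to_focal(test_source: str, code_source: str) -> dict[str, str | None]:
    """Map each test function name to a candidate focal function name.

    Assumed heuristic (illustrative only): strip the leading 'test' and any
    underscores, then pick the longest non-test function whose name equals
    the remainder or is a prefix of it followed by '_'.
    """
    candidates = [n for n in find_functions(code_source) if not n.startswith("test")]
    mapping: dict[str, str | None] = {}
    for name in find_functions(test_source):
        if not name.startswith("test"):
            continue  # helpers, fixtures, setUp, and so on
        stem = name[len("test"):].lstrip("_").lower()
        matches = [
            c for c in candidates
            if stem == c.lower() or stem.startswith(c.lower() + "_")
        ]
        mapping[name] = max(matches, key=len) if matches else None
    return mapping


# Hypothetical non-test code and test code, used only for illustration.
CODE = '''
def parse(text): ...
def parse_date(text): ...
'''

TESTS = '''
import unittest

class ParseTests(unittest.TestCase):
    def test_parse_date_invalid(self): ...
    def test_parse(self): ...

def test_unrelated(): ...
'''

mapping = map_tests_to_focal(TESTS, CODE)
```

Here `test_parse_date_invalid` maps to `parse_date` rather than `parse` because the longest match wins, and `test_unrelated` maps to `None`. A purely name-based rule like this misses tests whose names do not mention the method under test, which is one reason Python's loose naming conventions make the mapping hard.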

Link to Preprint

https://arxiv.org/abs/2502.05143

Idriss Abdelmadjid

University of Nebraska-Lincoln

Robert Dyer

University of Nebraska-Lincoln

United States

Dataset

Session Program

Tue 29 Apr
Displayed time zone: Eastern Time (US & Canada)

	14:00 - 15:30	Software quality (Technical Papers / Data and Tool Showcase Track / Registered Reports) at 215. Chair(s): Mohammad Hamdaqa (Polytechnique Montreal)

	14:00 10m Talk		Does Functional Package Management Enable Reproducible Builds at Scale? Yes. (Technical Track Distinguished Paper Award)	Technical Papers	Julien Malka (LTCI, Télécom Paris, Institut Polytechnique de Paris, France); Stefano Zacchiroli (LTCI, Télécom Paris, Institut Polytechnique de Paris, Palaiseau, France); Théo Zimmermann (Télécom Paris, Polytechnic Institute of Paris)
	14:10 10m Talk		Refactoring for Dockerfile Quality: A Dive into Developer Practices and Automation Potential	Technical Papers	Emna Ksontini (University of Michigan); Meriem Mastouri (University of Michigan); Rania Khalsi (University of Michigan - Flint); Wael Kessentini (DePaul University)
	14:20 10m Talk		Smells-sus: Sustainability Smells in IaC	Technical Papers	Seif Kosbar (Polytechnique Montréal); Mohammad Hamdaqa (Polytechnique Montreal)
	14:30 10m Talk		Evidence is All We Need: Do Self-Admitted Technical Debts Impact Method-Level Maintenance?	Technical Papers	Shaiful Chowdhury (University of Manitoba); Hisham Kidwai (University of Manitoba); Muhammad Asaduzzaman (University of Windsor)
	14:40 5m Talk		DPy: Code Smells Detection Tool for Python	Data and Tool Showcase Track	Aryan Boloori (Dalhousie University); Tushar Sharma (Dalhousie University)
	14:45 5m Talk		CoMRAT: Commit Message Rationale Analysis Tool	Data and Tool Showcase Track	Mouna Dhaouadi (University of Montreal); Bentley Oakes (Polytechnique Montréal); Michalis Famelis (Université de Montréal)
	14:50 5m Talk		E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects (Data/Tool Track Distinguished Dataset Award)	Data and Tool Showcase Track	Sergio Di Meglio (Università degli Studi di Napoli Federico II); Luigi Libero Lucio Starace (Università degli Studi di Napoli Federico II); Valeria Pontillo (Gran Sasso Science Institute); Ruben Opdebeeck (Vrije Universiteit Brussel); Coen De Roover (Vrije Universiteit Brussel); Sergio Di Martino (Università degli Studi di Napoli Federico II)
	14:55 5m Talk		TestMigrationsInPy: A Dataset of Test Migrations from Unittest to Pytest	Data and Tool Showcase Track	Altino Alves Júnior (UFMG); Andre Hora (UFMG)
	15:00 5m Talk		pyMethods2Test: A Dataset of Python Tests Mapped to Focal Methods	Data and Tool Showcase Track	Idriss Abdelmadjid (University of Nebraska-Lincoln); Robert Dyer (University of Nebraska-Lincoln)
	15:05 5m Talk		DataTD: A Dataset of Java Projects Including Test Doubles	Data and Tool Showcase Track	Mengzhen Li (University of Minnesota); Mattia Fazzini (University of Minnesota)
	15:10 5m Talk		JPerfEvo: A Tool for Tracking Method-Level Performance Changes in Java Projects	Data and Tool Showcase Track	Kaveh Shahedi (Polytechnique Montréal); Maxime Lamothe (Polytechnique Montreal); Foutse Khomh (Polytechnique Montréal); Heng Li (Polytechnique Montréal)
	15:15 10m Talk		PyExamine: A Comprehensive, Un-Opinionated Smell Detection Tool for Python	Technical Papers	Karthik Shivashankar (University of Oslo); Antonio Martini (University of Oslo, Norway)
	15:25 5m Talk		How Do Solidity Versions Affect Vulnerability Detection Tools? An Empirical Study	Registered Reports	Gerardo Iuliano (University of Salerno); Davide Corradini (University of Luxembourg); Michele Pasqua (University of Verona); Mariano Ceccato (University of Verona); Dario Di Nucci (University of Salerno)