This paper introduces FormalSpecCpp, a dataset to address the critical need for standardized verification of the C++ program specifications. To the best of our knowledge, this is the first comprehensive collection of C++ programs with well-defined formal specifications. Additionally, we present a specification translation approach that employs rule-based prompting strategies to handle critical aspects of the target programming language such as appropriate type selection, safety checks, and assertion mapping. This work provides a standardized dataset for evaluating specification inference tools and explores how Large Language Models (LLMs) can support automated specification generation across various programming languages. We envision this benchmark advancing research on program verification, automated testing, and specification inference for C++ programs and providing guidelines for using LLMs to generate specifications for other programming languages.
Navid Bin Hasan Bangladesh University of Engineering and Technology, Md. Ashraful Islam Bangladesh University of Engineering and Technology, Junaed Younus Khan Bangladesh University of Engineering and Technology, Sanjida Senjik Bangladesh University of Engineering and Technology, Anindya Iqbal Bangladesh University of Engineering and Technology Dhaka, Bangladesh