50 Years of Programming Language Evolution through the Software Heritage looking glass
Programming languages have evolved rapidly over the past five decades, reflecting broader shifts in software development practices and technological advances. Early on, organizations such as the U.S. Department of Defense recognized the challenges posed by the diversity of programming languages, which led to initiatives such as the Ada programming language. Since then, indexes like TIOBE, RedMonk, and Open Hub have attempted to track language popularity, but their metrics provide only a snapshot view, and they do not make their data available. Software Heritage, the largest public archive of source code, now makes it possible to address this question in a comprehensive, transparent, and reproducible manner through its unified dataset, which includes over 20 billion source files and 4 billion commits. Our study leverages key properties of the Software Heritage graph to analyze five decades of programming language trends by measuring programming activity as seen in the archive. The analysis reveals trends in language adoption, shifts in popularity, and significant transitions linked to technological changes. We compare the results with the existing indexes and provide a reusable open dataset and pipeline to facilitate further research on programming language evolution.