TriGraph: A Probabilistic Subgraph-Based Model for Visual Code Completion in Pure Data
Pure Data (PD) is a visual programming language for computer music in which users build applications through a graph-based, drag-and-drop interface, using objects and connections to manage program flow. Tool support for computer musicians using PD is lacking, particularly for code completion. In this paper, we introduce TriGraph, a graph-based probabilistic model designed specifically for code completion in PD. TriGraph uses statistical analysis of 2-node and 3-node subgraph frequencies to predict nodes and connections in PD graphs. Using a dataset of parsed PD files, we train and evaluate five TriGraph models, assessing their performance in predicting nodes and edges in PD graphs. Our evaluations indicate that the models achieve an average Mean Reciprocal Rank (MRR) of 0.39 for node prediction, which places the correct answer within the top 3 suggestions on average, and that they outperform the \textit{n}-gram-based KenLM model on similar tasks. For edge prediction, the models achieve an average MRR of 0.57, and incorporating both 2-node and 3-node subgraphs yields better results than using only 3-node subgraphs. These findings indicate that TriGraph can enhance the productivity of PD programmers by offering code completion support that speeds up development, reduces errors, and helps users discover available options, marking a significant advance in support tools for end-user programmers in graphical environments.
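To make the frequency-based idea and the MRR metric concrete, the following is a minimal sketch and not the TriGraph implementation. It assumes that a 2-node subgraph can be represented as a (source, target) pair of object names, and it ranks candidate successor nodes purely by how often each pair occurs in a toy training set. The function names and the toy graphs are hypothetical, introduced only for illustration; 3-node subgraphs and edge prediction are omitted.

```python
from collections import Counter, defaultdict


def count_edge_pairs(graphs):
    """Count how often each (source, target) object pair is connected.

    Assumption: each graph is a list of (source, target) name pairs,
    standing in for the 2-node subgraphs mentioned in the abstract.
    """
    counts = defaultdict(Counter)
    for edges in graphs:
        for src, dst in edges:
            counts[src][dst] += 1
    return counts


def rank_next_nodes(counts, src):
    """Rank candidate successors of `src` by descending pair frequency."""
    return [obj for obj, _ in counts[src].most_common()]


def mean_reciprocal_rank(ranked_lists, targets):
    """Average of 1/rank of the correct answer; 0 if it is not ranked."""
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        if target in ranked:
            total += 1.0 / (ranked.index(target) + 1)
    return total / len(targets)


# Hypothetical toy data: three tiny patches built from common PD objects.
train = [
    [("osc~", "*~"), ("*~", "dac~")],
    [("osc~", "*~"), ("osc~", "dac~")],
    [("phasor~", "*~"), ("*~", "dac~")],
]

counts = count_edge_pairs(train)
ranking = rank_next_nodes(counts, "osc~")

# Two queries after "osc~": one expects "*~" (rank 1), one "dac~" (rank 2).
mrr = mean_reciprocal_rank([ranking, ranking], ["*~", "dac~"])
print(ranking, mrr)
```

In this toy setting the two queries have reciprocal ranks 1 and 1/2, so the MRR is 0.75. By the same reading, an MRR near 0.39 corresponds to a typical rank between 2 and 3.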