Implementing Large Language Models in an Educational Refactoring Tutor
Summary
Large Language Models (LLMs) possess capabilities such as code generation, explanation, and refactoring. Given their flexibility, researchers have begun to explore their potential to support programming education. This work investigates how LLMs can improve the Refactoring Programming Tutor (RPT), an educational tool that lets students practice refactoring by improving small programs that are already functionally correct. We replace the original rule-based backend of RPT with an LLM-powered system capable of detecting semantic errors, recognizing a wider range of code quality issues, identifying correct refactorings, and generating hints. We evaluated the new system's performance through expert assessment of its outputs, generated both from a dataset of student submissions to the original system and from a newly designed exercise containing code issues outside the original rule set. Our findings show that the LLM-generated feedback is generally correct, aligned with the submitted code, and adaptable to new exercises. This work demonstrates the potential of LLMs to provide feedback and hints in programming tutors, in particular for refactoring exercises, although future work is needed to evaluate their effectiveness in real classroom settings.