Bridging Logical Error Identification and KC-Based Adaptive Feedback with LLMs in Programming Education
Summary
This thesis presents a modular evaluation and feedback system powered by large language
models (LLMs) for assessing student code in introductory programming courses. The
system mirrors the stepwise reasoning process of a human educator and is composed of
four distinct modules: unit testing, logical error detection, knowledge component (KC)
mapping and grading, and feedback generation.
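
Read as a pipeline, each module consumes the previous module's output. The sketch below illustrates that flow under assumed names; the stage functions and data fields are placeholders standing in for the actual modules, not the thesis's implementation.

from dataclasses import dataclass, field

# Illustrative sketch of the four-module pipeline; all names and stub bodies
# are assumptions for exposition, not the system's actual interfaces.

@dataclass
class EvaluationReport:
    test_results: dict = field(default_factory=dict)    # unit-test outcomes
    logical_errors: list = field(default_factory=list)  # categorized logical errors
    kc_grades: dict = field(default_factory=dict)        # per-KC scores
    feedback: str = ""                                    # final feedback text

def run_unit_tests(code: str, tests: list) -> dict:
    # Placeholder: execute the assignment's test cases against the submission.
    return {test: "passed" for test in tests}

def detect_logical_errors(code: str, test_results: dict) -> list:
    # Placeholder: query an LLM with a structured prompt and parse error categories.
    return []

def grade_knowledge_components(code: str, kcs: list, errors: list) -> dict:
    # Placeholder: grade each KC independently, conditioned on the relevant errors.
    return {kc: 1.0 for kc in kcs}

def generate_feedback(code: str, report: EvaluationReport) -> str:
    # Placeholder: synthesize a human-like report from the accumulated results.
    return "All knowledge components demonstrated; no logical errors detected."

def evaluate_submission(code: str, assignment: dict) -> EvaluationReport:
    """Run the four modules in sequence, mirroring an educator's stepwise reasoning."""
    report = EvaluationReport()
    report.test_results = run_unit_tests(code, assignment["tests"])
    report.logical_errors = detect_logical_errors(code, report.test_results)
    report.kc_grades = grade_knowledge_components(code, assignment["kcs"], report.logical_errors)
    report.feedback = generate_feedback(code, report)
    return report

Keeping the stages separate in this way is what allows each intermediate result to be inspected and evaluated on its own, which underpins the interpretability claim made below.
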
For interactive assignments without fixed inputs, the system simulates user interaction
with a memory-enhanced LLM and evaluates behavioral correctness with a more advanced model. Logical errors are identified through structured prompts and assigned to
predefined categories. Detected issues then inform concept-level grading, with
each KC evaluated independently based on the student's code and the relevant error context.
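
To make the structured prompts and the independent per-KC evaluation concrete, the sketch below shows one plausible prompt layout; the error taxonomy, the JSON output format, and both builder functions are illustrative assumptions rather than the thesis's actual prompts.

import json

# Illustrative error taxonomy and prompt builders; the system's actual
# predefined categories and prompt wording may differ.
ERROR_CATEGORIES = ["off-by-one", "wrong condition", "incorrect loop bound", "missing edge case"]

def build_error_prompt(student_code: str) -> str:
    """Structured prompt asking the model for categorized logical errors as JSON."""
    return (
        "You are reviewing an introductory programming submission.\n"
        f"Allowed error categories: {', '.join(ERROR_CATEGORIES)}.\n"
        "Return a JSON list of objects with keys 'category', 'line', and 'explanation'.\n"
        "Student code:\n" + student_code
    )

def build_kc_prompt(kc: str, student_code: str, related_errors: list) -> str:
    """Each KC is graded in its own prompt, conditioned only on the errors relevant to it."""
    return (
        f"Knowledge component: {kc}\n"
        f"Relevant detected errors: {json.dumps(related_errors)}\n"
        "On a scale from 0 to 1, how well does the code demonstrate this component? "
        "Answer with a single number.\n"
        "Student code:\n" + student_code
    )

Grading each KC in a separate call keeps one concept's score from being skewed by errors that belong to a different concept, which is what makes the resulting per-KC grades interpretable.
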
The final module synthesizes a human-like feedback report that includes performance summaries and tailored suggestions for improvement. Different LLMs are selected according to task
complexity, and zero-temperature settings keep outputs deterministic.
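
As one way to realize that design choice, the configuration sketch below pairs each module with a model tier and pins the temperature at zero; the module keys and model identifiers are placeholders, not the models actually used in the thesis.

# Hypothetical per-module configuration; model identifiers are placeholders.
MODULE_MODELS = {
    "interaction_simulation": {"model": "lightweight-chat-model",   "temperature": 0.0},
    "behavior_evaluation":    {"model": "advanced-reasoning-model", "temperature": 0.0},
    "error_detection":        {"model": "advanced-reasoning-model", "temperature": 0.0},
    "kc_grading":             {"model": "advanced-reasoning-model", "temperature": 0.0},
    "feedback_generation":    {"model": "lightweight-chat-model",   "temperature": 0.0},
}

def settings_for(module: str) -> dict:
    # Temperature 0 makes decoding greedy, so repeated runs on the same
    # submission are expected to yield the same grades and feedback.
    return MODULE_MODELS[module]

Fixing the temperature in this way is what makes re-grading reproducible, which matters for fairness when the same rubric is applied across a large cohort.
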
By separating evaluation into interpretable modules and aligning results with curriculum-based concepts, the system offers more granular, consistent, and pedagogically meaningful assessment than traditional auto-graders or single-prompt LLM systems. The design
supports integration into intelligent tutoring systems (ITSs) and provides a scalable way
to deliver accurate, practical feedback in large-scale educational settings.