dc.description.abstract | This thesis introduces a Collaborative Role-Oriented Workflow for SQL generation
(CROW-SQL), which is a modular multi-agent framework designed to improve the reliability, accuracy, and interpretability of Text-to-SQL generation using Large Language Models
(LLMs). Rather than relying on a monolithic prompting strategy, CROW-SQL decomposes
the Structured Query Language (SQL) generation process into collaborative subtasks (query
generation, schema suggestion, refinement, and orchestration), each handled by an independent, specialized agent. All agents are instantiated from the same LLM backend, primarily
Gemini 2.0 Flash, ensuring a fair and controlled evaluation of agent behavior.
To evaluate the system’s effectiveness, we benchmark CROW-SQL on two datasets:
the academic Spider benchmark and the real-world BIg Bench for LaRge-scale Database
Grounded Text-to-SQL Evaluation (BIRD) dataset. Experiments vary key parameters such
as the Query Generation Budget (QGB), few-shot prompt size, and agent composition. Evaluation metrics include execution accuracy, SQL correctness, structure correctness, skeleton
similarity, Levenshtein distance, and runtime. A comparative study of Gemini 2.0
Flash and a lightweight variant of OpenAI’s Generative Pre-trained Transformer 4o (GPT-4o-mini) highlights Gemini’s stronger structural alignment and execution robustness within the multi-agent context.
The results show that multi-agent configurations significantly outperform single-agent
baselines, especially on complex queries. The Refiner Agent plays a critical role in recovering from execution failures. Optimal performance is achieved with a Query Generation
Budget of 3, beyond which diminishing returns are observed. The modular architecture
also enhances transparency, debugging, and deployability, making CROW-SQL particularly
suitable for enterprise and compliance-focused applications.
This work contributes a reproducible, tool-augmented framework for agent-based Text-to-SQL reasoning, and sets the stage for future research in schema-aware prompting and
adaptive agent routing for SQL generation. | |