Prompt Engineering for Filling in Evidence Tables
Summary
The development of medical guidelines is essential but labor-intensive and time-consuming. This study explores the potential of Large Language Models (LLMs), specifically GPT-4o, to automate the creation of evidence tables from Randomized Controlled Trials (RCTs). We designed and iteratively refined a prompt to optimize data extraction from the studies. We then evaluated the model’s performance column by column, comparing the numbers and texts extracted by the model with those extracted manually by healthcare experts. Numeric data were assessed by comparing the sets of numbers extracted by each, while textual data were analyzed using five similarity measures: TF-IDF, Jaccard, BERT, Sentence-BERT, and spaCy. The results show that GPT-4o effectively extracted and summarized key elements of the studies, demonstrating its potential to streamline the development of medical guidelines. This approach promises to reduce the workload of healthcare professionals, improve efficiency, and ensure that patient care is based on the most current and comprehensive data available.
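The column-wise evaluation described above can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the study's actual code: the helper names (`extract_numbers`, `jaccard_similarity`) and the example table cells are hypothetical, and only the Jaccard measure (one of the five listed) is shown, since it needs no external models.

```python
import re

def extract_numbers(text: str) -> set:
    """Collect numeric tokens (integers and decimals) from a cell as a set."""
    return set(re.findall(r"\d+(?:\.\d+)?", text))

def jaccard_similarity(a: str, b: str) -> float:
    """Jaccard similarity over lowercased word sets: |A ∩ B| / |A ∪ B|."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)

# Numeric check: does the model report the same set of numbers as the expert?
# (hypothetical cell contents for illustration)
model_cell = "Mean age 63.2 years, n=150"
expert_cell = "n = 150; mean age 63.2"
numbers_match = extract_numbers(model_cell) == extract_numbers(expert_cell)

# Textual check: word-overlap similarity between the two cell texts
text_score = jaccard_similarity("open-label randomized trial",
                                "randomized open-label trial")
```

The embedding-based measures (BERT, Sentence-BERT, spaCy) would replace `jaccard_similarity` with a cosine similarity over sentence or document vectors, but the per-column comparison loop stays the same.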