Ontology Learning using public Large Language Models under set conditions, identifying metrics for measuring success
Summary
Capturing natural language text in an ontology that accurately reflects the domain semantics is a labor-intensive task. It requires knowledge of both ontology modeling theory and the domain to be modeled. Large Language Models (LLMs) can help reduce the amount of manual labor required. We test the performance of five LLMs that receive no training specific to the task or domain. Five different approaches to prompting LLMs are applied. Additionally, a subset of prompts instructs the use of a specific Ontology Learning (OL) framework. One novel prompting approach is also applied, in which the LLM is given a description of the task and instructed to design its own prompt. Precision, Recall, and F-score are used to quantify the quality of the ontologies produced. Results indicate that such general-purpose language models can produce semantically and syntactically accurate ontologies. However, performance drops significantly as the domain to be modeled becomes more obscure and semantically complex. This is shown by moving from a baseline experiment about African wildlife to a text concerning regulations in the Dutch Pension Domain (DPD). Instructing the use of a specific method results in ontologies of improved quality, whereas ambiguous prompting degrades results. Overall, untrained LLMs can model ontologies from unstructured data but are not suited to automated and unsupervised annotation, as the results contain too many errors.
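As a minimal sketch of how such metrics can be computed, the snippet below scores an LLM-produced ontology against a gold standard by treating both as sets of axioms (here, hypothetical subclass triples); the triple representation and example data are illustrative assumptions, not the paper's exact evaluation protocol.

```python
def precision_recall_f1(predicted, gold):
    """Set-based Precision, Recall, and F1 over extracted axioms.

    Precision = |predicted ∩ gold| / |predicted|
    Recall    = |predicted ∩ gold| / |gold|
    F1        = harmonic mean of Precision and Recall
    """
    predicted, gold = set(predicted), set(gold)
    tp = len(predicted & gold)  # true positives: axioms found in both sets
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Hypothetical example: LLM-extracted subclass triples vs. a gold standard
pred = {("Lion", "subClassOf", "Carnivore"),
        ("Lion", "subClassOf", "Animal"),
        ("Grass", "subClassOf", "Animal")}   # one wrong axiom
gold = {("Lion", "subClassOf", "Carnivore"),
        ("Lion", "subClassOf", "Animal"),
        ("Grass", "subClassOf", "Plant")}    # one missed axiom
p, r, f = precision_recall_f1(pred, gold)
# p = r = f = 2/3 for this toy example
```

The set-based view is a simplification: in practice, partial matches (e.g., a correct class with a wrong superclass) may be scored with more lenient criteria.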