        Ontology Learning using public Large Language Models under set conditions, identifying metrics for measuring success

        View/Open
        Thesis phase 2 1088149 hand-in v0.2.pdf (1.005Mb)
        Publication date
        2025
        Author
        Elten, Ronald van
        Summary
        Capturing natural language text in an ontology that accurately reflects the domain semantics is a labor-intensive task. It requires knowledge of both ontology modeling theory and the domain to be modeled. Large Language Models (LLMs) can help reduce the amount of manual labor required. We test the performance of five LLMs that receive no training specific to the task or domain. Five different approaches to prompting the LLMs are applied. Additionally, a subset of prompts instructs the use of a specific Ontology Learning (OL) framework. One novel prompting approach provides the LLM with a description of the task and instructs it to design its own prompt. Precision, recall, and F-score are used to quantify the quality of the ontologies produced. Results indicate that such general-purpose language models can produce semantically and syntactically accurate ontologies. However, performance drops significantly when the domain to be modeled becomes more obscure in content and more semantically complex. This is shown by moving from a baseline experiment about African wildlife to a text concerning regulations in the Dutch Pension Domain (DPD). Instructing the use of a specific method for executing the task is shown to yield ontologies of improved quality, whereas ambiguous prompting reduces results. Overall, untrained LLMs can model ontologies from unstructured data but are not suited to automated, unsupervised annotation, as the results contain too many errors.
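
        The summary names precision, recall, and F-score as the quality metrics. The sketch below is not taken from the thesis; it only illustrates, under the assumption that both the LLM output and a gold-standard ontology are represented as sets of (subject, relation, object) triples, how those three metrics could be computed. The function name and the example triples are hypothetical.

# Minimal sketch (assumed representation, not the thesis's implementation):
# precision, recall, and F-score over sets of extracted ontology elements,
# where each element is a (subject, relation, object) triple.

def precision_recall_f1(predicted: set, gold: set) -> tuple[float, float, float]:
    """Compute precision, recall, and F1 for a set of predicted elements."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1

# Hypothetical example: gold-standard vs. LLM-produced class-hierarchy triples.
gold = {("Lion", "subClassOf", "Carnivore"),
        ("Carnivore", "subClassOf", "Animal"),
        ("Impala", "subClassOf", "Herbivore")}
predicted = {("Lion", "subClassOf", "Carnivore"),
             ("Impala", "subClassOf", "Animal")}

p, r, f = precision_recall_f1(predicted, gold)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")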
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/48814
        Collections
        • Theses