        Beyond Keywords: Intent-Driven Semantic Code Search in Software Ecosystems

        Final_Master_Thesis_Chris.pdf (1.699Mb)
        Publication date
        2023
        Author
        Pfaff, Chris
        Summary
        The exponential growth of code repositories has posed significant challenges for developers in efficiently and effectively searching for relevant code snippets. Traditional keyword-based code search engines often struggle to provide accurate results due to the ambiguity inherent in programming language keywords and the semantic gap between the developer’s search intent and the code syntax.

        To address this challenge, this study proposes a novel approach, an advanced semantic code search engine, that harnesses intent modelling and vector embedding techniques to enhance the relevance of search results. Our methodology utilizes machine learning models to extract the developer’s search intent from their query, thereby capturing the underlying meaning of their search. Furthermore, the code snippets are represented using vector embeddings, which capture the semantic context and relationships between different pieces of code. This allows for a more nuanced understanding of the code snippets’ meanings and functionalities.

        The proposed system ranks the code snippets based on their semantic similarity with the user’s search intent. This ranking approach facilitates the delivery of more accurate and relevant search results, providing developers with a more targeted and practical search experience. Moreover, the improved relevance of the search results guides users in the right direction for future searches, fostering an iterative and progressive learning process.

        The findings of this research demonstrate the effectiveness of leveraging intent modelling and vector embedding techniques in enhancing the search capabilities of code repositories. By bridging the gap between the developer’s search intent and the code syntax, the proposed semantic code search engine offers a valuable tool for developers to locate and utilize relevant code snippets effectively.
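The ranking step described in the summary, scoring snippets by their similarity to the search intent, can be illustrated with a minimal sketch. This is not the thesis's implementation: `embed` is a hypothetical stand-in that uses bag-of-words counts, where the thesis describes learned vector embeddings, and the snippet names and descriptions are invented for illustration. Only the overall shape (embed the intent, embed each snippet, rank by cosine similarity) follows the summary.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy stand-in for a learned embedding: a bag-of-words count vector.

    Assumption: a real system would call a trained model here instead.
    """
    tokens = "".join(c.lower() if c.isalnum() else " " for c in text).split()
    return Counter(tokens)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_snippets(intent: str, snippets: dict[str, str]) -> list[tuple[str, float]]:
    """Return (snippet name, score) pairs, most similar to the intent first."""
    query_vec = embed(intent)
    scored = [(name, cosine(query_vec, embed(desc))) for name, desc in snippets.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical snippet index: name -> natural-language description.
snippets = {
    "read_json": "read a json file and parse it into a dictionary",
    "sort_list": "sort a list of numbers in ascending order",
    "http_get": "send an http get request and return the response body",
}

ranking = rank_snippets("parse json file", snippets)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because the toy `embed` only counts shared words, it cannot bridge the semantic gap the summary refers to (for example, it would not match "load" to "read"). Swapping in a learned embedding model while keeping `cosine` and `rank_snippets` unchanged is the part that would address that gap.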
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/44030
        Collections
        • Theses