Foundation Models: Evaluating Alternatives, Benefits, and Challenges in Automating Claims Processing in the Insurance Industry

Dinten, Yvette van

Publication date

2025

Summary

Large Language Models (LLMs) now have millions of users across the globe, and their popularity has raised the question of how companies can use them effectively. Despite great interest, a gap remains in applying LLMs to corporate use cases. This research therefore explores how companies whose data is non-English and non-homogeneous can use LLMs efficiently. Feature selection algorithms were used to reduce the cost and time of using LLMs. In addition, the data was translated to observe whether translation affects the model’s performance. The datasets were evaluated with two models, E5 and Flan, of which E5 is the smaller, with fewer parameters. More specifically, E5 is an encoder-only model, while Flan is an encoder-decoder model. Encoding is the process of embedding the input into vectors, and decoding is the process of mapping these vectors back into natural language. Both models worked best with feature selection algorithms that reduced the datasets to very few features. Results show that the Forward Searching feature selection method combined with the E5 model has the shortest runtime and produces the highest performance. Furthermore, translation is not required: it gives little to no improvement in performance but adds a preprocessing step that increases time and costs. In conclusion, the higher performance of the E5 model suggests that a more complex model is not necessarily a more efficient solution for corporate use cases.
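As an illustration of the forward-search idea mentioned in the summary, the sketch below shows a generic greedy forward selection loop in Python. It is not the thesis's implementation: the function name `forward_select`, the `min_gain` stopping rule, the feature names, and the toy scoring function are all assumptions made for this example. In practice, `score` would presumably train and evaluate a model (for instance, a classifier on E5 embeddings) on the chosen subset.

```python
from typing import Callable, Sequence


def forward_select(
    candidates: Sequence[str],
    score: Callable[[list[str]], float],
    max_features: int,
    min_gain: float = 0.0,
) -> list[str]:
    """Greedy forward selection (illustrative sketch, not the thesis's code).

    Starts from an empty subset and repeatedly adds the single feature that
    yields the highest score, stopping when no candidate improves the score
    by more than `min_gain` or when `max_features` is reached.
    """
    selected: list[str] = []
    best = float("-inf")
    remaining = list(candidates)
    while remaining and len(selected) < max_features:
        # Score every remaining candidate when added to the current subset.
        trial_scores = {f: score(selected + [f]) for f in remaining}
        top = max(trial_scores, key=trial_scores.get)
        # Always accept the first feature; afterwards require a real gain.
        if selected and trial_scores[top] - best <= min_gain:
            break
        selected.append(top)
        remaining.remove(top)
        best = trial_scores[top]
    return selected


# Hypothetical feature names and usefulness weights, for demonstration only.
TOY_WEIGHTS = {
    "claim_description": 0.6,
    "damage_type": 0.25,
    "policy_id": 0.01,
    "free_text_notes": 0.02,
}


def toy_score(subset: list[str]) -> float:
    # Stand-in for model performance: summed usefulness minus a small
    # per-feature penalty representing extra processing cost.
    return sum(TOY_WEIGHTS[f] for f in subset) - 0.05 * len(subset)


print(forward_select(list(TOY_WEIGHTS), toy_score, max_features=3))
# prints ['claim_description', 'damage_type']
```

With this toy score, the loop stops after two features because adding a third lowers the score, which mirrors the summary's observation that very small feature sets can be sufficient.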

URI

https://studenttheses.uu.nl/handle/20.500.12932/48860

Collections

Theses