Text Mining in Financial Industry: Implementing Text Mining Techniques on Bank Policies

Ferati, D.

dc.rights.license	CC-BY-NC-ND
dc.contributor.advisor	Spruit, Marco
dc.contributor.advisor	Brinkhuis, M.J.S.
dc.contributor.author	Ferati, D.
dc.date.accessioned	2017-07-20T17:01:10Z
dc.date.available	2017-07-20T17:01:10Z
dc.date.issued	2017
dc.identifier.uri	https://studenttheses.uu.nl/handle/20.500.12932/26236
dc.description.abstract	As organisations collect and create ever more data, the need to leverage these resources has become apparent. This pool of data comprises two primary structures, namely structured data and unstructured data, each with its own set of techniques for scrutinising the data and subsequently extracting information and knowledge. Besides lacking a predefined structure or representation, unstructured data also comprises roughly 80% of all the data that organisations possess. Policy documents are a good illustration of this kind of data, with their text-heavy format and domain-specific language. As written guidelines of acceptable actions to which organisations must adhere, policy documents are present across industries and in large numbers. This is especially true for organisations in the financial industry, such as banks, which continuously introduce policies in order to be fully compliant with the regulations that governing bodies impose. In an attempt to bring order and some understanding to policies, this research investigates the applicability and benefits of text mining (TM) for processing such documents. Following design science (DS) principles, the literature was first consulted to determine the extent to which such techniques have been applied to policies. This investigation revealed that the use of TM on policy documents fell short in both qualitative and quantitative respects. In addition to the limited number of publications that treated these concepts, the variety of techniques examined was narrow. Hence, through a case study (CS) at one of the biggest banks in the Netherlands, a set of techniques unprecedented in this context was applied to policy documents.
The use of information extraction (IE) to extract references between policies, together with automatic summarization and keyword extraction to retrieve, respectively, a concise representation of the documents and a set of descriptive labels (tags), was evaluated both statistically and by experts. The results showed that, to a large extent, these techniques are capable of analysing internal policies and extracting reliable information from them. Furthermore, this led to the introduction of a new TM framework for processing policies. The framework is a meta-algorithmic model (MAM) of the approach followed in this study, and it represents the harmonious use of three different techniques and the results that derive from their use. Thus, besides unveiling the current state of the literature, this research also introduces a novel approach for processing policies with TM techniques.
dc.description.sponsorship	Utrecht University
dc.language.iso	en
dc.title	Text Mining in Financial Industry: Implementing Text Mining Techniques on Bank Policies
dc.type.content	Master Thesis
dc.rights.accessrights	Open Access
dc.subject.keywords	Policy documents; text mining; natural language processing; banks; information extraction; automatic summarization; keyword extraction
dc.subject.courseuu	Business Informatics

Files in this item

Name:: DrilonFerati_Thesis_v2.pdf
Size:: 1.458Mb
Format:: PDF

Name:: DrilonFerati_Thesis_5676932.pdf
Size:: 1.458Mb
Format:: PDF

This item appears in the following Collection(s)

Theses
