Machine-annotated rationales: faithfully explaining machine learning models for text classification

Herrewijnen, E.

dc.rights.license	CC-BY-NC-ND
dc.contributor.advisor	Nguyen, D.P.
dc.contributor.advisor	Bex, F.J.
dc.contributor.advisor	Mense, J.
dc.contributor.author	Herrewijnen, E.
dc.date.accessioned	2020-09-21T18:00:14Z
dc.date.available	2020-09-21T18:00:14Z
dc.date.issued	2020
dc.identifier.uri	https://studenttheses.uu.nl/handle/20.500.12932/37693
dc.description.abstract	Artificial intelligence is not always interpretable to humans at first sight. Machine learning models with hidden states or high complexity are especially difficult to understand. Explanations for such models can be found, but they are not always faithful, that is, they do not always reflect the reasoning actually performed inside the model. One way to explain a model's output is to find the parts of the input that carry signals for a classification. Natural language explanations are called rationales, and whoever annotates a part of the text as an explanation (rationale) is called the annotator. Texts form decomposable sets of interpretable features, so selections of (sub-)sentences can serve as explanations for model predictions. To find explanations for model predictions in text classification, this study introduces machine-annotated rationales: natural language explanations, taken from the input text, for a model's prediction. Four approaches to finding faithful machine-annotated rationales are proposed. They are evaluated by measuring faithfulness, by set similarity to human-annotated rationales, and through a user evaluation. Results show that faithful machine-annotated rationales can be found for the investigated machine learning models, but that there is a trade-off between faithfulness and end-user interpretability.
dc.description.sponsorship	Utrecht University
dc.format.extent	1574918
dc.format.mimetype	application/pdf
dc.language.iso	en
dc.title	Machine-annotated rationales: faithfully explaining machine learning models for text classification
dc.type.content	Master Thesis
dc.rights.accessrights	Open Access
dc.subject.keywords	explainable AI, neural networks, faithful explainable AI, machine-annotated rationales, annotator rationales, text classification
dc.subject.courseuu	Artificial Intelligence

Files in this item

Name: Thesis_AI_EH.pdf
Size: 1.501 MB
Format: PDF

