Explainable Online Reinforcement Learning Using Abstract Argumentation
Summary
The democratisation of deep learning (DL) in recent years has led to an increasing presence of DL algorithms in our everyday lives, from recommending our next book to deciding whether or not we are granted a loan. Although DL has enabled major performance gains in data-driven applications, the decisions made by neural networks are opaque to humans, which makes their suitability questionable for applications where the model needs to be verifiable and/or explanations must be completely faithful to the model. Prior work tries to overcome this problem by using model extraction to derive an (approximately) equivalent symbolic model that uses a value-based argumentation framework (VAF) as its inference engine. While the resulting model has the advantage of being verifiable and providing faithful explanations, model extraction imposes an exploration boundary on the symbolic model. This thesis proposes a novel approach that integrates formal argumentation into an end-to-end reinforcement learning (RL) pipeline. The benefit of this method is that the model can be trained with online RL instead of via a surrogate model, leading to a potentially better solution while still using a VAF as its inference engine.
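
How the VAF is embedded in the RL pipeline is defined in the thesis itself; purely as background, the following Python sketch illustrates the kind of inference a value-based argumentation framework performs. The argument names, values, and preference ordering are hypothetical and chosen for illustration only: an attack succeeds only if the attacked argument's value is not preferred to the attacker's, and acceptance is then computed under grounded semantics.

    # Minimal VAF sketch (illustrative assumptions, not taken from the thesis).

    def defeats(attacks, value_of, prefers):
        """Keep the attacks that succeed: (a, b) defeats b unless the value
        promoted by b is strictly preferred to the value promoted by a."""
        return {(a, b) for (a, b) in attacks
                if not prefers(value_of[b], value_of[a])}

    def grounded_extension(arguments, defeat):
        """Iteratively accept every argument whose defeaters are all already
        rejected; reject everything an accepted argument defeats."""
        accepted, rejected = set(), set()
        changed = True
        while changed:
            changed = False
            for arg in arguments - accepted - rejected:
                defeaters = {a for (a, b) in defeat if b == arg}
                if defeaters <= rejected:          # all defeaters are out
                    accepted.add(arg)
                    rejected |= {b for (a, b) in defeat if a == arg}
                    changed = True
        return accepted

    # Hypothetical toy audience: arguments, values and the ordering are illustrative.
    arguments = {"a1", "a2", "a3"}
    attacks = {("a1", "a2"), ("a2", "a3")}
    value_of = {"a1": "safety", "a2": "speed", "a3": "safety"}
    ranking = ["safety", "speed"]                  # this audience ranks safety first
    prefers = lambda v, w: ranking.index(v) < ranking.index(w)

    print(grounded_extension(arguments, defeats(attacks, value_of, prefers)))
    # a1 and a3 are accepted: the attack on a3 fails because safety is preferred to speed

Changing the audience's value ranking changes which attacks succeed and hence which arguments are accepted, which is what makes this style of inference engine inspectable in a way a neural network is not.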