dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Wang, Shihan | |
dc.contributor.author | Hanneman, Koen | |
dc.date.accessioned | 2024-03-31T23:01:47Z | |
dc.date.available | 2024-03-31T23:01:47Z | |
dc.date.issued | 2024 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/46228 | |
dc.description.abstract | In a task-oriented dialog system, a core component is the dialog policy, which determines the response action and guides the conversation system to complete the task. Optimizing such a dialog policy is often formulated as a reinforcement learning (RL) problem. However, given the subjectivity and open-ended nature of human conversations, the complexity of dialogs varies greatly, which negatively impacts the training efficiency of RL-based methods. A proven approach to this problem is curriculum learning (CL), which breaks down complex problems and improves learning efficiency by providing a sequence of learning steps of increasing difficulty, similar to human learning. However, existing models construct this sequence by ordering tasks based on complexity alone, without taking task similarity into account. In this thesis, we propose a method that reduces the distance between similar tasks in a curriculum, which is hypothesised to increase training efficiency. To this end, we introduce a curriculum learning model that offline generates a sequence of similar tasks via a graph neural network (GNN) and transfers the low-level dialog policy in each iteration of the curriculum. The performance of this curriculum learning model is then compared, on the MultiWOZ dataset, against that of dialog policy learning without a curriculum; it was found to outperform the baseline model in specific scenarios. | |
dc.description.sponsorship | Utrecht University | |
dc.language.iso | EN | |
dc.subject | Development of a novel curriculum learning model for reinforcement-learning-based dialog policy learning by offline generating a sequence of tasks from the corresponding conversation graphs. The conversation graphs are clustered by similarity with a graph neural network (GNN) and ordered according to a graph complexity metric. | |
dc.title | Task-oriented Dialog Policy Learning via Deep Reinforcement Learning and Automatic Graph Neural Network Curriculum Learning | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | Task-oriented Dialog System, Dialog Policy Learning, Reinforcement Learning, Curriculum Learning, Graph Neural Network, Conversation Graph | |
dc.subject.courseuu | Artificial Intelligence | |
dc.thesis.id | 24863 | |