Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Wang, Shihan
dc.contributor.author: Scholte, Niels
dc.date.accessioned: 2022-02-24T00:00:27Z
dc.date.available: 2022-02-24T00:00:27Z
dc.date.issued: 2022
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/520
dc.description.abstract: In this thesis, we improve reinforcement learning through curriculum learning, pioneering a new approach to curriculum learning based on resimulation. We formulate two approaches to resimulation: Goal-Based Resimulation (GBR), where we resimulate after changing the goal, and Initial-State-Based Resimulation (ISBR), where we resimulate after changing the initial state. We construct one GBR method, G, in which the goal is set to the last state of the resimulated episode. G is shown to enable solving tasks that are solvable neither by Proximal Policy Optimization (PPO) [Schulman et al., 2017], nor by an Intrinsic Curiosity Module (ICM) [Pathak et al., 2017], nor by Hindsight Experience Replay (HER) [Andrychowicz et al., 2017]. We construct two ISBR methods, S+ and S−. Both methods process the advantage estimates to detect swing events: periods with high-amplitude advantage estimates. S+ and S− then resimulate successes and mistakes, respectively, by setting the initial states to the states at the start of swing events. All methods are tested on two tasks that differ only in their level of sparsity, and at three reward ratios controlling the extent to which the ICM is used. Performance is measured by solve rate and learning speed. We find that:
• G enables solving the proposed tasks.
• S+ improves the solve rate, but only on sufficiently sparse tasks and when using the ICM.
• S− improves both the solve rate and the learning speed.
• G, S+ and S− can be used in unison to create a better combined algorithm.
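As a rough illustration of the G method described in the abstract (run an episode, relabel the goal as the episode's final state, then resimulate so the trajectory becomes a guaranteed success), here is a minimal sketch. The toy chain environment, the function names, and the binary reward are illustrative assumptions, not the thesis's actual implementation:

```python
import random

# Toy 1-D chain environment: the agent moves left or right;
# an episode is a fixed-length rollout from a start state.
def rollout(policy, start=0, length=10):
    state, traj = start, [start]
    for _ in range(length):
        state += policy(state)          # action is -1 or +1
        traj.append(state)
    return traj

def goal_based_resimulation(policy, start=0, length=10):
    # 1. Simulate an episode under the current policy.
    traj = rollout(policy, start, length)
    # 2. Set the goal to the last state actually reached (the "G" idea):
    #    for that goal the episode is a guaranteed success, so the learner
    #    receives a reward signal even when the task reward is sparse.
    goal = traj[-1]
    # 3. Resimulate: replay the trajectory labelled with the achieved goal,
    #    emitting (state, goal, reward) training tuples.
    return [(s, goal, 1.0 if s == goal else 0.0) for s in traj]

random.seed(0)
experience = goal_based_resimulation(lambda s: random.choice([-1, 1]))
print(experience[-1])   # the final tuple always carries reward 1.0
```

The design point this sketch captures is why G helps on sparse tasks: regardless of where the random policy ends up, relabeling makes at least one transition per episode rewarding.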
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: In this thesis, we improve reinforcement learning through curriculum learning, pioneering a new approach to curriculum learning based on resimulation. We formulate two approaches to resimulation: Goal-Based Resimulation (GBR), where we resimulate after changing the goal, and Initial-State-Based Resimulation (ISBR), where we resimulate after changing the initial state.
dc.title: Goal, mistake and success learning through resimulation
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Reinforcement learning; Curriculum learning; Resimulation; Machine learning; RL; CL; Goals
dc.subject.courseuu: Artificial Intelligence
dc.thesis.id: 2151

