Distracted Driver Detection: A Safer Reinforcement Learning Approach
Summary
Both driver inattention and driver distraction present significant challenges to road safety, leading to an increasing number of accidents and fatalities every year. When drivers periodically become distracted, their driving performance declines and accidents become more likely. A cooperative lane-keeping assistance system could enhance safety while retaining most of the driver’s autonomy. To train such a system, we use Safe Reinforcement Learning with a memory component. Safe Reinforcement Learning learns policies that maximize reward in settings where maintaining reasonable system performance and safety is crucial during learning or deployment. Adding memory to Deep Reinforcement Learning improves performance in partially observable environments. Since the driver’s psychological state is unknown to the system, the problem is initially formulated as a Partially Observable Markov Decision Process (POMDP). However, during experimentation, memory-based DRL did not show any improvement over regular DRL, so the problem was later reformulated as a Block Markov Decision Process (BMDP). We use First Order Constrained Optimization in Policy Space (FOCOPS), extended with a Long Short-Term Memory (LSTM) layer, to address the distracted-driver problem. The work is divided into three parts: first, exploring the advantage of memory-based Deep Reinforcement Learning in BMDPs; second, examining the benefit of Safe Reinforcement Learning for the distracted-driver problem; and finally, comparing FOCOPS with an LSTM layer to state-of-the-art reinforcement learning methods. We chose Recurrent Safe Reinforcement Learning to speed up learning and improve policy safety, making the approach more suitable for real-world applications.
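To make the architecture concrete, the sketch below shows a Gaussian policy with an LSTM layer of the kind that could serve as the recurrent actor in a FOCOPS-style setup under partial observability. It is a minimal illustration, not the implementation used in this work; the class and parameter names (RecurrentGaussianPolicy, obs_dim, hidden_size, and so on) are assumptions chosen for the example.

import torch
import torch.nn as nn

class RecurrentGaussianPolicy(nn.Module):
    """Gaussian policy whose features come from an LSTM over the observation history."""

    def __init__(self, obs_dim: int, act_dim: int, hidden_size: int = 64):
        super().__init__()
        # The LSTM summarizes past observations, compensating for the
        # unobserved driver state (the partially observable part of the problem).
        self.lstm = nn.LSTM(obs_dim, hidden_size, batch_first=True)
        self.mean_head = nn.Linear(hidden_size, act_dim)
        self.log_std = nn.Parameter(torch.zeros(act_dim))

    def forward(self, obs_seq: torch.Tensor, hidden=None):
        # obs_seq: (batch, time, obs_dim)
        features, hidden = self.lstm(obs_seq, hidden)
        mean = self.mean_head(features)
        std = self.log_std.exp().expand_as(mean)
        return torch.distributions.Normal(mean, std), hidden

# Example usage: sample steering commands for one 10-step observation window.
policy = RecurrentGaussianPolicy(obs_dim=8, act_dim=1)
obs = torch.randn(1, 10, 8)      # one trajectory of 10 timesteps
dist, _ = policy(obs)
action = dist.sample()           # shape (1, 10, 1)

In a FOCOPS-style algorithm, this recurrent actor would be updated by projecting the (reward- and cost-constrained) optimal policy back into the parameter space, while the LSTM hidden state carries information about the driver's recent behavior across timesteps.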