Optimal Control of Lettuce Greenhouse Horticulture using Model-Free Reinforcement Learning
Summary
A greenhouse is an important growing system that provides a controlled climate and allows crops to be grown under a changing outdoor climate. Due to high energy costs and the scarcity of labor and resources, optimal and automated control of greenhouse horticulture is becoming increasingly important, with the aim of optimizing resource usage while maximizing crop production. Outdoor weather is a critical disturbance in greenhouse climate control and complicates both the modelling and the optimization process.
With the development of Artificial Intelligence (AI) and improved sensing techniques, Reinforcement Learning (RL) is receiving increasing attention because it learns control strategies from interaction data rather than relying on an accurate model. Up to now, most RL applications in greenhouse climate control have not taken outdoor weather forecasts into account when making control decisions, so useful information is discarded and the resulting control actions may be suboptimal. Therefore, in this project we investigated how the weather forecast horizon affects optimal control of greenhouse horticulture using reinforcement learning.
Among the deep RL approaches we reviewed, Soft Actor-Critic (SAC) and Twin-Delayed Deep Deterministic Policy Gradient (TD3) stood out because they handle continuous state-action spaces. Since weather predictions are reliable mainly in the short term due to forecast uncertainty, and long-term forecasts add unnecessary noise and enlarge the state space, our work focused on short-term forecast horizons of 0, 3, 7, 11, 15, 19, and 23 time steps of fifteen minutes. To investigate how the horizon affects control performance, these seven horizons were evaluated in experiments with the state-of-the-art continuous control algorithms SAC and TD3.
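As an illustration only (not taken from the thesis itself), the sketch below shows one way an RL observation could be augmented with an H-step weather forecast at fifteen-minute resolution; the function name, state dimensions, and number of weather variables are hypothetical assumptions.

    import numpy as np

    def build_observation(indoor_state, weather_forecast, horizon):
        # Concatenate the indoor climate state with the next `horizon` forecast
        # steps (15-minute resolution); horizon = 0 means no forecast is used.
        forecast_slice = weather_forecast[:horizon].flatten()
        return np.concatenate([indoor_state, forecast_slice])

    # Hypothetical sizes: 4 indoor climate variables, forecasts of 4 weather variables.
    indoor_state = np.zeros(4)
    weather_forecast = np.zeros((23, 4))
    for horizon in (0, 3, 7, 11, 15, 19, 23):
        obs = build_observation(indoor_state, weather_forecast, horizon)
        print(horizon, obs.shape)  # the observation grows with the forecast horizon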
After demonstrating the proposed approaches on a lettuce greenhouse, we found that SAC consistently outperformed TD3, achieving higher rewards in terms of crop production and net profit, while resource use was comparable. Furthermore, including weather forecasts proved essential for the learning stability of both algorithms as well as for their training and generalization performance, resulting in higher yields and net profits while reducing resource use and the number of indoor climate constraint violations. Moreover, we conclude that a four-hour weather forecast is the best option: longer forecasts did not improve performance, whereas shorter forecasts quickly degraded it.
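For illustration, a comparison like the one summarized above could be set up roughly as in the sketch below, which uses the Stable-Baselines3 implementations of SAC and TD3 on a placeholder continuous-control environment; in practice, a greenhouse environment exposing the forecast-augmented observation would be substituted.

    import gymnasium as gym
    from stable_baselines3 import SAC, TD3

    # Placeholder continuous-control task; a forecast-augmented greenhouse
    # environment would be used instead.
    env = gym.make("Pendulum-v1")

    for algo in (SAC, TD3):
        model = algo("MlpPolicy", env, verbose=0)
        model.learn(total_timesteps=10_000)  # train each algorithm on the same task
        model.save(f"{algo.__name__.lower()}_model")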