dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Önal Ertugrul, I. | |
dc.contributor.author | Rosmalen, Yochem van | |
dc.date.accessioned | 2025-04-03T23:01:15Z | |
dc.date.available | 2025-04-03T23:01:15Z | |
dc.date.issued | 2025 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/48808 | |
dc.description.abstract | Camera traps are deployed in Romania to keep bears from entering villages in search of food. These battery-powered, low-energy devices rely on a deep neural network for effective bear detection. However, neural networks typically require a large amount of RAM to store their parameters, which leads to high energy consumption. This creates a challenge for deploying AI models on these edge devices with limited power.
To address this challenge, this thesis introduces a novel training approach that combines two model compression techniques and applies them to an object detection problem, using YOLOv5, a battle-tested object detector based on the convolutional neural network architecture, as the base model. The first technique is self-compression, which allows a model to learn to convert its parameters into smaller data types. The second is online knowledge distillation, in which a smaller model acquires knowledge from a larger, more complex model that is trained simultaneously. The novelty of this approach lies in combining the two: because the distillation happens online, the larger model can account for the smaller model's self-compression process during training. This novel approach aims to optimize model compression while maintaining performance, creating an efficient object detector that can be deployed on devices with limited RAM.
The proposed approach results in a model that requires only 1.4 MB of memory for its parameters. This is almost 60 times less than the 83 MB required by the medium-sized YOLOv5 model, and five times less than the 7.1 MB used by the nano-sized YOLOv5 model. Even with this substantial size reduction, the resulting model achieves an F1-score of 0.971 when classifying bears, which is comparable to the performance of the larger baseline models: the medium-sized YOLOv5 model has an F1-score of 0.985, and the nano-sized model scores 0.977. The results of this thesis demonstrate the potential of combining self-compression and knowledge distillation for energy-efficient object detectors. | |
dc.description.sponsorship | Utrecht University | |
dc.language.iso | EN | |
dc.subject | Bears in Romania enter villages for food, so camera traps use AI to detect them. However, deep neural networks consume too much energy for these battery-powered devices due to high RAM requirements. This thesis proposes a novel training method combining self-compression and online knowledge distillation to reduce model size while maintaining accuracy. The result is a YOLOv5-based model needing just 1.4 MB of memory while achieving an F1-score of 0.971, making it ideal for low-power deployment. | |
dc.title | Compressing Object Detectors for Bear Detection on Edge Devices | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | object detection, network quantization, model compression, edge device, artificial intelligence, neural network, computer vision | |
dc.subject.courseuu | Artificial Intelligence | |
dc.thesis.id | 44830 | |