dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Önal Ertugrul, I. | |
dc.contributor.author | Rosmalen, Yochem van | |
dc.date.accessioned | 2025-04-03T23:01:15Z | |
dc.date.available | 2025-04-03T23:01:15Z | |
dc.date.issued | 2025 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/48808 | |
dc.description.abstract | Camera traps are deployed in Romania to keep bears from entering villages in search of food. These battery-powered, low-energy devices rely on a deep neural network for effective bear detection. However, neural networks typically require a large amount of RAM to store their parameters, which leads to high energy consumption. This creates a challenge for deploying AI models on these edge devices with limited power.
To address this challenge, this thesis introduces a novel training approach that combines two model compression techniques and applies them to an object detection problem, using YOLOv5, a battle-tested object detector based on the convolutional neural network architecture, as the base model. The first technique is self-compression, which allows a model to learn to convert its parameters into smaller data types. The second is online knowledge distillation, in which a smaller model acquires knowledge from a larger, more complex model that is trained simultaneously. The novelty of this approach lies in combining the two: because the distillation happens online, the larger model can account for the smaller model's self-compression process during training. This novel approach aims to optimize model compression while maintaining performance, creating an efficient object detector that can be deployed on devices with limited RAM.
The proposed approach results in a model that requires only 1.4 MB of memory for its parameters. This is almost 60 times less than the 83 MB required by the medium-sized YOLOv5 model, and five times less than the 7.1 MB used by the nano-sized YOLOv5 model. Even with this substantial size reduction, the resulting model achieves an F1-score of 0.971 when classifying bears, which is comparable to the performance of the larger baseline models: the medium-sized YOLOv5 model has an F1-score of 0.985, and the nano-sized model scores 0.977. The results of this thesis demonstrate the potential of combining self-compression and knowledge distillation for energy-efficient object detectors. | |
dc.description.sponsorship | Utrecht University | |
dc.language.iso | EN | |
dc.subject | Bears in Romania enter villages for food, so camera traps use AI to detect them. However, deep neural networks consume too much energy for these battery-powered devices due to high RAM requirements. This thesis proposes a novel training method combining self-compression and online knowledge distillation to reduce model size while maintaining accuracy. The result is a YOLOv5-based model needing just 1.4 MB of memory while achieving an F1-score of 0.971, making it ideal for low-power deployment. | |
dc.title | Compressing Object Detectors for Bear Detection on Edge Devices | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | object detection, network quantization, model compression, edge device, artificial intelligence, neural network, computer vision | |
dc.subject.courseuu | Artificial Intelligence | |
dc.thesis.id | 44830 | |