
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Poppe, R.W.
dc.contributor.advisor: Veltkamp, R.C.
dc.contributor.author: Bos, G.E.
dc.date.accessioned: 2020-02-20T19:03:50Z
dc.date.available: 2020-02-20T19:03:50Z
dc.date.issued: 2019
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/34868
dc.description.abstract: Rapidly developing research on convolutional neural networks allows semantic information to be generated from images in an increasing variety of ways and with ever-improving segmentation performance. This work focuses on one such deep learning objective, which we term material segmentation: determining, for each pixel in an image (in this research, street-level imagery), which of a predefined set of materials it represents. The context for this research was provided by a GIS company, which records and digitizes the public space and whose clients have requested such per-pixel material information. This company also employs deep learning to automatically detect and locate street furniture, and the material information additionally has the potential to improve this object detection. We annotated our own dataset, which is deficient in both the number of annotated images and the ground-truth coverage per image. In addition, the dataset suffers from severe class imbalance. In the first part of this research, we explore techniques to either resolve this class imbalance or mitigate its negative effect on material segmentation performance. In our case, the best loss function turns out to be class-weighted cross-entropy loss, though only by a small margin. We conjecture that our class imbalance is too severe, rendering the dataset intractable without either merging small, poorly performing classes or acquiring more ground truth for them. In addition to colour values, we also have depth information at our disposal, which gives the distance from the camera to the surface at each pixel. Contrary to our expectations, our network does not achieve an increase in performance when working with a unified RGBD representation over RGB colour input alone. Our experiments indicate that, even though depth maps hold some discriminatory value, they become superfluous in the presence of colour information. By visual inspection, we show examples where depth falls short in discriminatory value. Lastly, we assess the extent to which our lack of training data holds performance back. We deduce that two factors are likely to yield the largest gains in performance: increasing the number of segmentations for classes with little ground truth, and increasing the number of training images to around a thousand. Such measures should effectuate satisfactory performance. Further improvements in material segmentation performance are likely to be gained by swapping depth maps for intensity maps, and by performing instance segmentation and material segmentation simultaneously, using a joint network such as Panoptic FPN.
dc.description.sponsorship: Utrecht University
dc.language.iso: en
dc.title: Material Segmentation
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: cnn, neural networks, semantic segmentation, deep learning
dc.subject.courseuu: Game and Media Technology
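The abstract names class-weighted cross-entropy as the best-performing loss for the imbalanced dataset. As a minimal pure-Python sketch of that idea (not the thesis's actual implementation; the function name and the use of per-class weights such as inverse class frequency are our assumptions), the per-pixel losses are scaled by a weight for the pixel's ground-truth class, so errors on rare classes count for more:

```python
import math

def weighted_cross_entropy(probs, labels, class_weights):
    """Class-weighted cross-entropy over a batch of per-pixel predictions.

    probs:         list of per-class probability lists (softmax outputs)
    labels:        list of integer ground-truth class indices
    class_weights: per-class weights, e.g. inverse class frequency
    """
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        # weight the negative log-likelihood of the true class
        total += w * -math.log(p[y] + 1e-12)
        weight_sum += w
    # normalise by the summed weights so the scale stays comparable
    return total / weight_sum
```

With uniform weights this reduces to ordinary mean cross-entropy; raising the weight of an under-represented class pulls the average toward that class's per-pixel loss.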

