
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Poppe, R.W.
dc.contributor.advisor: Veltkamp, R.C.
dc.contributor.author: Bos, G.E.
dc.date.accessioned: 2020-02-20T19:03:50Z
dc.date.available: 2020-02-20T19:03:50Z
dc.date.issued: 2019
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/34868
dc.description.abstract: Rapidly developing research on convolutional neural networks allows semantic information to be generated from images in an increasing variety of ways and with ever-improving segmentation performance. This work focuses on one such deep learning objective, which we term material segmentation: determining, for each pixel in an image (in this research, street-level imagery), which of a predefined set of materials it represents. The context for this research was provided by a GIS company, which records and digitizes the public space and whose clients have requested such per-pixel material information. This company also employs deep learning to automatically detect and locate street furniture, and the material information additionally has the potential to improve this object detection. We annotated our own dataset, which is deficient in both the number of annotated images and the ground-truth coverage per image. In addition, the dataset suffers from severe class imbalance. In the first part of this research, we explore techniques to either resolve this class imbalance or mitigate its negative effect on material segmentation performance. In our case, the best loss function turns out to be class-weighted cross-entropy loss, though only by a small margin. We conjecture that our class imbalance is too severe, rendering the dataset intractable without either merging small, poorly performing classes or acquiring more ground truth for them. In addition to colour values, we also have depth information at our disposal, which gives the distance from the camera to the surface at each pixel. Contrary to our expectations, our network does not achieve an increase in performance when working with a unified RGBD representation over RGB colour input alone. Our experiments indicate that, even though depth maps hold some discriminatory value, they become superfluous in the presence of colour information. By visual inspection, we show examples where depth falls short in discriminatory value. Lastly, we assess the extent to which our lack of training data holds performance back. We deduce that two factors are likely to yield the largest gains in performance: increasing the number of segmentations for classes with little ground truth, and increasing the number of training images to around a thousand. Such measures should effectuate satisfactory performance. Further improvements in material segmentation performance are likely to be gained by swapping depth maps for intensity maps, and by performing instance segmentation and material segmentation simultaneously, using a joint network such as Panoptic FPN.
dc.description.sponsorship: Utrecht University
dc.language.iso: en
dc.title: Material Segmentation
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: cnn, neural networks, semantic segmentation, deep learning
dc.subject.courseuu: Game and Media Technology
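The abstract names class-weighted cross-entropy as the best-performing loss for the imbalanced dataset. As a minimal pure-Python sketch of that idea (not the thesis's actual implementation; the function name and the use of per-class weights such as inverse class frequency are our assumptions), the per-pixel losses are scaled by a weight for the pixel's ground-truth class, so errors on rare classes count for more:

```python
import math

def weighted_cross_entropy(probs, labels, class_weights):
    """Class-weighted cross-entropy over a batch of per-pixel predictions.

    probs:         list of per-class probability lists (softmax outputs)
    labels:        list of integer ground-truth class indices
    class_weights: per-class weights, e.g. inverse class frequency
    """
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        # weight the negative log-likelihood of the true class
        total += w * -math.log(p[y] + 1e-12)
        weight_sum += w
    # normalise by the summed weights so the scale stays comparable
    return total / weight_sum
```

With uniform weights this reduces to ordinary mean cross-entropy; raising the weight of an under-represented class pulls the average toward that class's per-pixel loss.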

