Show simple item record

dc.rights.license  CC-BY-NC-ND
dc.contributor.advisor  Feelders, dr. A.J.
dc.contributor.advisor  Pauwels, dr. E.J.E.M.
dc.contributor.author  Chen, Y.
dc.date.accessioned  2017-08-23T17:01:55Z
dc.date.available  2017-08-23T17:01:55Z
dc.date.issued  2017
dc.identifier.uri  https://studenttheses.uu.nl/handle/20.500.12932/27003
dc.description.abstract  In recent years, Neural Networks (NNs) and Deep Learning (DL) have achieved exceptional performance in a number of applications such as computer vision, natural language processing, audio recognition and machine translation. However, NNs as predictors are usually not interpretable in practice, and their learning mechanism is not yet theoretically well understood, which is why NNs are known as "black boxes". To explain how NNs work, we perform several types of empirical analysis on models trained for a simple supervised classification task on one-dimensional signals, including analysis of hidden-layer activations, visualization by gradient ascent, experiments on learning from noise labels, and measurement of distances in the high-dimensional feature space. On this task, the NN models surpass traditional signal processing methods in performance. Regarding how NNs work, we observe, first, that for certain NN structures with limited expressivity this specific task can be interpreted directly from the weights; second, that the NNs empirically learn a smoothed first-derivative extractor on this task, from which we suggest that NN models learn "principal subpatterns"; and third, by measuring the inner- and inter-class distances of the data samples, that networks learning from real or structured data use their internal hidden layers to shrink the activation representations of samples from the same class into a narrow range of encodings, which differs from the behaviour of networks in the abnormal case of fitting random noise by brute-force memorization. This difference in network behaviour also offers a reasonable answer to the question of why over-parameterized NNs are able to generalize.
dc.description.sponsorship  Utrecht University
dc.format.extent  25991129
dc.format.mimetype  application/pdf
dc.language.iso  en
dc.title  Towards Explaining Neural Networks
dc.type.content  Master Thesis
dc.rights.accessrights  Open Access
dc.subject.keywords  Neural networks, deep learning, interpretation, memorization, generalization, hidden layer behaviour
dc.subject.courseuu  Artificial Intelligence
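The abstract above mentions two of the empirical analyses: visualization by gradient ascent and measurement of inner- and inter-class distances in the hidden-layer feature space. The sketch below illustrates these two techniques in general terms only; it is not code from the thesis, and the names `model`, `signal_length`, `features` and `labels` are placeholders assumed for illustration.

```python
import numpy as np
import torch
from scipy.spatial.distance import pdist

def visualize_by_gradient_ascent(model, signal_length, class_idx,
                                 steps=200, lr=0.1):
    """Gradient ascent on the input: synthesize a 1-D signal that maximally
    activates the output unit for `class_idx` of a trained classifier."""
    x = torch.randn(1, signal_length, requires_grad=True)  # start from noise
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        score = model(x)[0, class_idx]  # output score to maximize
        (-score).backward()             # ascend by minimizing the negative
        opt.step()
    return x.detach().numpy().squeeze()

def inner_and_inter_class_distance(features, labels):
    """Mean pairwise distance within each class (inner) and mean distance
    between class centroids (inter), computed on hidden-layer features."""
    inner, centroids = {}, []
    for c in np.unique(labels):
        f = features[labels == c]
        inner[c] = pdist(f).mean() if len(f) > 1 else 0.0
        centroids.append(f.mean(axis=0))
    inter = pdist(np.stack(centroids)).mean()
    return inner, inter
```

In an analysis of this kind, a small inner-class distance relative to the inter-class distance in deeper layers would be consistent with the "shrinking" behaviour the abstract describes for networks trained on real or structured data.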

