Memorisation and generalisation of shallow random neural networks.
Summary
In the last few years, neural networks have proven to be powerful tools with many applications. For fully connected single-layer feedforward neural networks (SLFNs), it is common practice to optimize all weights and biases. This can lead to overfitting or even complete memorisation of the training dataset, especially when the data are noisy. In the 1990s, a different architecture was proposed: the Random Vector Functional Link network (RVFL). Here, only the output weights of the neural network are optimized, which turns the optimization into a linear problem. This reduces the computational cost but can also lead to underfitting. In this thesis, we propose a novel hybrid approach called the \emph{learned biases architecture}: we take the RVFL architecture but also optimize the biases, while keeping the inner weights fixed. We give an upper bound on the Rademacher complexity of both the learned biases and the RVFL architecture, as well as an upper bound on the VC dimension of the learned biases architecture. In addition, we compare the three architectures empirically. In particular, we are interested in whether these two random neural networks also generalise well on datasets with label noise. The results indicate that the learned biases architecture is a promising alternative to both the RVFL and the fully learned architecture: compared with the RVFL, the risk of underfitting is reduced, while compared with the fully learned SLFN, the risk of overfitting is reduced.
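To make the three settings concrete, the following sketch fixes notation for a single-hidden-layer network of width $N$ with activation $\sigma$; the symbols $w_k$, $b_k$ and $\beta_k$ are illustrative and need not match the notation used later in the thesis:
\[
  f(x) \;=\; \sum_{k=1}^{N} \beta_k \,\sigma\bigl( w_k^{\top} x + b_k \bigr).
\]
In the fully learned SLFN, all of $(w_k, b_k, \beta_k)$ are trained; in the RVFL, the inner weights $w_k$ and the biases $b_k$ are drawn at random and kept fixed, so that only the output weights $\beta_k$ are learned; in the learned biases architecture, the random inner weights $w_k$ stay fixed while the biases $b_k$ are trained together with the output weights $\beta_k$.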