BERT, but Better: Improving Robustness using Human Insights
Summary
Pre-trained transformers are highly effective across numerous Natural Language Processing (NLP) tasks,
yet their ability to generalise to new domains remains a concern due to their tendency to rely on spurious
correlations. This thesis therefore investigates whether token-level human supervision can enhance
BERT's generalisation capabilities. Although token-level insights have been shown to improve
the performance of these models, few studies have examined their effect on
generalisation. To address this gap, this work explores the potential of human supervision to guide BERT's attention
mechanism towards salient features, thereby improving generalisation across domains. Results from
experiments in both binary and multi-label classification scenarios demonstrate not only substantial gains
in out-of-distribution (OOD) performance in few-shot contexts, but also a closer alignment between the
model's attention scores and salient features identified by human annotators. By emphasising the role of human
insight in transformer models, this thesis contributes to the ongoing discourse on enhancing performance
and explainability in NLP applications.