Comparing Text Representations: In Search Of Caring Communities
Summary
Several text representation techniques (BERT, word2vec, LDA topics, etc.) are compared on a text classification task: identifying caring communities in Dutch Chamber of Commerce data with a random forest (RF) classifier. The goal is to find the best-performing text representation. The classifier using the word2vec representation achieves the highest F1-score.
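To make the selection criterion concrete, here is a minimal sketch of how representations can be ranked by F1-score. It is not the project's actual evaluation code: the labels, the predictions, and the names `repr_a`, `repr_b`, `repr_c` are made-up placeholders, and the sketch assumes a binary task where 1 marks a caring community.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (assumption): 1 = caring community, 0 = other organisation.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical predictions from one classifier per text representation.
predictions = {
    "repr_a": [1, 0, 1, 0, 0, 1, 1, 0],
    "repr_b": [1, 0, 1, 1, 0, 0, 0, 0],
    "repr_c": [0, 1, 1, 0, 0, 0, 1, 1],
}

# Score every representation against the same labels, then pick the best.
scores = {name: f1_score(y_true, y_pred) for name, y_pred in predictions.items()}
best = max(scores, key=scores.get)

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: F1 = {score:.3f}")
print("best representation:", best)
```

F1 is a reasonable choice when the positive class is rare, because accuracy alone can look high for a classifier that almost never predicts the positive class.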