Study on Iterative Nullspace Projection Debiasing of Multiple Attributes of Word Embeddings
Fairness is one of the themes of human-centered machine learning. In machine learning and natural language processing (NLP), multiple biases have been reported to arise during word embedding training, and blindly applying biased word embeddings to decision-making tasks leads to fairness problems. It is therefore necessary to find ways to eliminate or reduce these biases. Several debiasing algorithms have been proposed in previous research, including the Iterative Nullspace Projection (INLP) method. However, few studies have analyzed debiasing of multiple attributes together, especially attributes with more than two classes. In this study, we implemented multi-attribute debiasing for gender (binary) and race (three-class) using four strategies: two successive debiasing strategies (gender-race and race-gender) and two simultaneous debiasing strategies (independent grouping and intersectional grouping). We validated the debiasing results by INLP classifier performance, a bias measure (the Word Embedding Association Test, WEAT), and downstream task performance, and then evaluated and compared the strategies and other detailed parameters. We found that simultaneous debiasing tended to produce moderate results, while successive debiasing tended to produce either extremely good or extremely poor results, depending on the order in which the attributes were debiased. In the future, these strategies could be applied to debiasing other attributes. We suggest further theoretical analysis and more comprehensive testing.
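To make the strategies named above concrete, the following is a minimal sketch of INLP-style debiasing on synthetic data. It is an illustration under stated assumptions, not the implementation used in this study. The linear classifier is a plain least-squares fit in place of whatever classifier the study trained. All function names (`fit_linear_classifier`, `nullspace_projection`, `inlp`, `debias_successive`, `debias_intersectional`) are hypothetical. Intersectional grouping is read here as one joint label per (gender, race) pair, which is an assumption. Independent grouping is not sketched because the abstract does not define it.

```python
import numpy as np

def fit_linear_classifier(X, z):
    """Least-squares one-vs-rest linear classifier (assumption: stands in
    for the study's classifier). Returns W with shape (n_classes, d)."""
    classes = np.unique(z)
    Y = (z[:, None] == classes[None, :]).astype(float)
    Y -= Y.mean(axis=0)  # centred one-hot targets
    # Minimum-norm least-squares solution; its columns lie in the row space of X.
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W.T

def nullspace_projection(W, tol=1e-10):
    """Orthogonal projector onto the nullspace of W."""
    _, s, Vt = np.linalg.svd(W, full_matrices=False)
    # Rows of Vt with non-negligible singular values span the row space of W.
    B = Vt[s > tol * s[0]] if s[0] > 0 else Vt[:0]
    return np.eye(W.shape[1]) - B.T @ B

def inlp(X, z, n_iters=10):
    """Repeatedly fit a classifier for z on the projected data and project
    out the directions it uses. Rows of X are embeddings; apply as X @ P."""
    P = np.eye(X.shape[1])
    for _ in range(n_iters):
        W = fit_linear_classifier(X @ P, z)
        P = P @ nullspace_projection(W)
    return P

def debias_successive(X, first, second, n_iters=10):
    """Successive strategy: debias one attribute, then the other."""
    P1 = inlp(X, first, n_iters)
    P2 = inlp(X @ P1, second, n_iters)
    return P1 @ P2

def debias_intersectional(X, gender, race, n_iters=10):
    """Simultaneous strategy (assumed reading): one label per (gender, race) pair."""
    joint = gender * (race.max() + 1) + race
    return inlp(X, joint, n_iters)

# Synthetic stand-in for word embeddings with gender and race signals.
rng = np.random.default_rng(0)
n, d = 300, 20
gender = rng.integers(0, 2, n)  # binary attribute
race = rng.integers(0, 3, n)    # three-class attribute
X = rng.normal(size=(n, d))
X[:, 0] += 2.0 * gender
X[:, 1] += 1.5 * race

P_gr = debias_successive(X, gender, race, n_iters=3)   # gender-race
P_rg = debias_successive(X, race, gender, n_iters=3)   # race-gender
P_joint = debias_intersectional(X, gender, race, n_iters=3)
X_debiased = X @ P_joint
```

In this sketch the two successive orders generally yield different projections, because the second attribute's classifiers are trained on data from which the first attribute has already been projected out. That is one way the order dependence reported above could arise.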