
dc.rights.license: CC-BY-NC-ND
dc.contributor: Sara Marti Marcet, Maike Lea Vaitea Weiper
dc.contributor.advisor: Bagheri, Ayoub
dc.contributor.author: Aktepe, Malka
dc.date.accessioned: 2023-07-27T00:01:54Z
dc.date.available: 2023-07-27T00:01:54Z
dc.date.issued: 2023
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/44348
dc.description.abstract: Resumes constitute an important part of forming an impression of candidate employees during the hiring process. They are shaped by the interplay between societal, personal, and occupational values. One important personal value or norm that shapes the writing style of a resume is the gender of the candidate employee (Guadagno & Cialdini, 2007). Previous research shows that women tend to communicate in a more communal way, while men tend to communicate in a more agentic way. In our current society, feminine gender norms (e.g. care) seem to be less in line with occupational values than masculine gender norms (e.g. competitiveness; Eagly & Karau, 2002). Nevertheless, to our knowledge, previous research has only investigated gender differences in resumes in male-dominated occupations, such as IT careers (Parasurama et al., 2022). Therefore, in this research, we investigated the interrelationship between gender and occupation in careers that are male-dominated, gender-balanced, and female-dominated. In our first study, we identified the most important textual features that differentiate resumes written by men from those written by women in more than 1700 resumes (Yang et al., 2022). Our results indicate that women use more communal language in their resumes, with words such as "assist" or "care" being the most predictive. The features that predicted male resumes were either clearly agentic or indicated that men are the ones who apply to, or have experience in, higher positions or technical fields. Examples of such words are "manager" and "engineering". In a second study, we tested multiple machine learning algorithms to predict gender from the resume texts in different occupations. We found that all models can predict gender from the resume text. The traditional and word-embedding models, alongside DistilBERT, perform well in balanced occupations but fall short when the data is more female- or male-dominated. RoBERTa and Longformer showed steady performance across all occupations, demonstrating the capabilities of newer transformer models. In our third study, we investigated to what extent men and women conform to their respective gender norms and whether this differs across occupations. We found that women communicate significantly less gender-congruently in male-dominated occupations than in gender-balanced and female-dominated occupations. Similarly, men communicate significantly less gender-congruently in female-dominated occupations than in gender-balanced and male-dominated occupations. Thus, even though people might experience social and economic penalties if they communicate in a gender-incongruent way, they still use a different communication style depending on the occupational context to which they apply. To sum up, in this research project, we successfully trained multiple machine learning algorithms to predict gender from textual features in resumes. We found that their performances and predictive scores differ across occupations. In the discussion, we consider the implications of these results with regard to societal norms and values surrounding gender and occupation, and how these influence the hiring process. We argue that our results highlight the need for gender- and context-aware tools to help employers select appropriate candidates for hiring in a fair manner.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Investigating the performance of several word- and contextual embedding models in predicting gender from resumes across different occupational groups
dc.title: Navigating Gender Bias in Resumes: An Investigation of Word and Contextual Embedding Models
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: NLP; word-embedding; contextual embedding; resumes; gender
dc.subject.courseuu: Applied Data Science
dc.thesis.id: 20045

