Natural Language Processing in Resume Data: The Interplay Between Gender and Occupation on Resume Writing Style
Summary
Resumes play an important part in forming an impression of job candidates during the hiring process. They are shaped by the interplay between societal, personal, and occupational values. One important personal factor that shapes the writing style of a resume is the gender of the candidate (Guadagno & Cialdini, 2007). Previous research shows that women tend to communicate in a more communal way, while men tend to communicate in a more agentic way. In our current society, feminine gender norms (e.g., care) seem to be less in line with occupational values than masculine gender norms (e.g., competitiveness; Eagly & Karau, 2002). Nevertheless, to our knowledge, previous research has only investigated gender differences in resumes in male-dominated occupations, such as IT careers (Parasurama et al., 2022). Therefore, in this research, we investigated the interrelationship between gender and occupation in careers that are male-dominated, gender-balanced, and female-dominated.
In our first study, we identified the most important textual features that differentiate resumes written by men from those written by women in a sample of more than 1700 resumes (Yang et al., 2022). Our results indicate that women use more communal language in their resumes, with words such as "assist" or "care" being the most predictive. The features that predicted resumes written by men were clearly agentic or indicated that the candidate was applying for, or had experience in, higher positions or technical fields; examples are "manager" and "engineering".
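As an illustration of how such predictive words can be surfaced, the sketch below ranks words by a smoothed log-odds ratio between two groups of texts. This is a generic technique chosen for illustration, not necessarily the feature-selection method used in the study; the function name `log_odds`, the smoothing constant `alpha`, and the toy snippets are all assumptions.

```python
import math
from collections import Counter


def log_odds(docs_a, docs_b, alpha=1.0):
    """Smoothed log-odds of each word occurring in group A versus group B.

    Positive scores lean toward group A, negative scores toward group B.
    `alpha` is an additive (Laplace) smoothing constant, assumed here.
    """
    counts_a = Counter(w for d in docs_a for w in d.lower().split())
    counts_b = Counter(w for d in docs_b for w in d.lower().split())
    vocab = set(counts_a) | set(counts_b)
    total_a = sum(counts_a.values()) + alpha * len(vocab)
    total_b = sum(counts_b.values()) + alpha * len(vocab)
    scores = {}
    for w in vocab:
        p_a = (counts_a[w] + alpha) / total_a
        p_b = (counts_b[w] + alpha) / total_b
        scores[w] = math.log(p_a / p_b)
    return scores


# Toy, made-up snippets for illustration only (not data from the study).
group_a = ["assist patients with daily care", "assist team and care for clients"]
group_b = ["manager of engineering team", "engineering manager for projects"]

scores = log_odds(group_a, group_b)
# Words sorted from most group-B-leaning to most group-A-leaning.
ranked = sorted(scores, key=scores.get)
```

In this toy setting, words that occur only in one group (such as "assist" or "manager") receive the largest absolute scores, while words shared by both groups stay close to zero.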
In a second study, we tested multiple machine learning algorithms to predict gender from the resume texts in different occupations. We found that all models could predict gender from the resume text. The traditional and word-embedding models, alongside DistilBERT, performed well in gender-balanced occupations but fell short when the data were more female- or male-dominated. RoBERTa and Longformer showed steady performance across all occupations, demonstrating the capabilities of newer transformer models.
In our third study, we investigated to what extent men and women conform to their respective gender norms and whether this differs across occupations. We found that women communicate significantly less gender-congruently in male-dominated occupations than in gender-balanced and female-dominated occupations. Similarly, men communicate significantly less gender-congruently in female-dominated occupations than in gender-balanced and male-dominated occupations. Thus, even though people might experience social and economic penalties if they communicate in a gender-incongruent way, they still use a different communication style depending on the occupational context to which they apply.
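One way to make such a comparison concrete is sketched below, under the assumption that gender congruence is operationalized as the probability a trained classifier assigns to the author's own gender. The summary does not specify the exact measure, so the functions `congruence` and `mean_congruence`, the record layout, and the numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def congruence(p_female, gender):
    """Probability a (hypothetical) classifier assigns to the author's own gender."""
    return p_female if gender == "F" else 1.0 - p_female


def mean_congruence(records):
    """Average congruence per (gender, occupation type) cell.

    Each record is (gender, occupation_type, p_female), where p_female is
    the classifier's predicted probability that the author is a woman.
    """
    cells = defaultdict(list)
    for gender, occupation_type, p_female in records:
        cells[(gender, occupation_type)].append(congruence(p_female, gender))
    return {cell: mean(values) for cell, values in cells.items()}


# Made-up predictions for illustration only (not results from the study).
records = [
    ("F", "male-dominated", 0.55),
    ("F", "male-dominated", 0.65),
    ("F", "female-dominated", 0.85),
    ("M", "male-dominated", 0.10),
    ("M", "female-dominated", 0.40),
]

table = mean_congruence(records)
```

The resulting per-cell means could then be compared across occupation types with a suitable significance test; which test is appropriate depends on the study design and is not shown here.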
To sum up, in this research project, we successfully trained multiple machine learning algorithms to predict gender from textual features in resumes. We found that their performance and predictive scores differ across occupations. In the discussion, we consider the implications of these results with regard to societal norms and values surrounding gender and occupation and how these influence the hiring process. We argue that our results highlight the need for gender- and context-aware tools to help employers select appropriate candidates fairly.