Predicting demographics based on eye tracking data using machine learning
Summary
Eye tracking has long been a topic of interest for researchers, and recent technological advances have extended its applications to many new areas. Although data collection and processing have steadily improved over the years, they remain time-consuming, which limits the number of participants. Many eye tracking studies have no more than 30 participants, because each person has to visit a specialised lab where an observer sets up the equipment and assists with the experiment. In this thesis, data was collected through an unsupervised experiment in an installation at the NEMO science museum in Amsterdam, which provided a uniquely large amount of data for analysis. It is, however, uncertain how the data quality is affected by the quality of the eye tracker, the unsupervised nature of the experiment and the diversity of the participants. The research aimed to provide insight into the possibilities of predicting demographics from gaze behaviour using machine learning techniques. Given the lack of comparable published research on this topic, this research offers a new perspective. Differences in viewing behaviour have been reported between sick and healthy people and between men and women, as well as age-dependent differences. These can manifest as different numbers of fixations, different saccade speeds or different fixation durations.
Data was collected using the Tobii 4C eye tracker in combination with a 1920 by 1080 pixel display in an enclosed installation. Participants were shown a composition for 10 seconds as a free-viewing experiment with no prior instructions. Afterwards, participants were shown their most frequented areas of interest and were asked to enter their gender (male, female, other) and their year of birth (default = 2000). In total there were 5604 participants: 2423 female (43%), 2526 male (45%) and 655 transgender/other (12%). The age of the participants ranged from 2 to 92 years. Participants who selected transgender/other were excluded, as were participants with suspected erroneous data. The final dataset consisted of 624 participants (N = 328 male, N = 296 female) aged 6 to 74 years.
Python was used in combination with the scikit-learn package to apply supervised machine learning algorithms to predict gender and exact year of birth. For the prediction of gender, a random forest, a support vector classifier and a gradient boosted random forest were implemented. For the prediction of exact year of birth, a linear and a ridge regression model were implemented. Several sets of input variables were tested; these always included either gender or year of birth, the total number of fixations and the average fixation duration. In addition to these, either the exact coordinates of fixations, clustered fixations or handcrafted regions of interest were used as input. The number of fixations used ranged from 1 through 9 and the number of clusters was 5, 10, 15 or 20. Each model was tested with each input type 100 times to ensure that outliers were not mistaken for consistent results.
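To make the clustered-fixation input concrete, the following is a minimal sketch of how such a pipeline could look in scikit-learn. It is not the thesis implementation: the function names (`build_features`, `repeated_accuracy`), the 80/20 train/test split, the hyperparameters and the randomly generated stand-in data are all assumptions made for illustration. The sketch replaces each of the first 9 fixations with the label of its k-means cluster, appends the total number of fixations, the average fixation duration and the year of birth, and then trains a random forest on repeated random splits.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def build_features(fix_xy, n_fix, mean_dur, birth_year, n_clusters=5, seed=0):
    """Hypothetical helper: turn (x, y) fixations into cluster-label features.

    fix_xy has shape (participants, fixations, 2). Every fixation is
    replaced by the id of the k-means cluster it falls into.
    Simplification: the clusters are fitted on all participants at once,
    before any train/test split.
    """
    n_participants, n_first, _ = fix_xy.shape
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = km.fit_predict(fix_xy.reshape(-1, 2))
    labels = labels.reshape(n_participants, n_first)
    return np.column_stack([labels, n_fix, mean_dur, birth_year])


def repeated_accuracy(X, y, n_runs=100, test_size=0.2):
    """Hypothetical helper: accuracy over repeated random train/test splits."""
    scores = []
    for run in range(n_runs):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=run, stratify=y
        )
        clf = RandomForestClassifier(n_estimators=100, random_state=run)
        clf.fit(X_tr, y_tr)
        scores.append(accuracy_score(y_te, clf.predict(X_te)))
    return np.array(scores)


# Synthetic stand-in data (NOT the museum data): 200 participants,
# 9 fixations each on a 1920 x 1080 pixel display.
rng = np.random.default_rng(0)
n = 200
fix_xy = rng.uniform([0, 0], [1920, 1080], size=(n, 9, 2))
n_fix = rng.integers(15, 40, size=n)         # total number of fixations
mean_dur = rng.normal(250, 50, size=n)       # average fixation duration (ms)
birth_year = rng.integers(1950, 2015, size=n)
gender = rng.integers(0, 2, size=n)          # random binary labels

X = build_features(fix_xy, n_fix, mean_dur, birth_year, n_clusters=5)
scores = repeated_accuracy(X, gender, n_runs=10)  # 10 runs to keep the demo fast
print(f"mean accuracy over {len(scores)} runs: {scores.mean():.2f}")
```

Because the labels here are random, the reported accuracy should hover around chance level; the point of the sketch is the shape of the feature matrix and the repeated evaluation, not the score. Averaging over many splits, as in the 100 repetitions described above, reduces the chance that a single favourable split is reported as the result.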
Model performance was estimated using accuracy and F1 score for classification, and root mean square error (RMSE) and mean absolute error (MAE) for regression. The best performing model for gender classification was a random forest with 9 fixations and 5 clusters as input, which resulted in an accuracy of 0.71 and an F1 score of 0.72. Age regression performed best with 9 fixations and 10 clusters, which resulted in an RMSE of 8.89 years and an MAE of 7.39 years. This shows that it is possible to use eye tracking data for the prediction of demographics without extensive parameter tweaking. This research can be extended to include more high-quality data, more demographics and more advanced algorithms. Possible applications for this research are in the field of medic