
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Salah, Albert
dc.contributor.author: Dijkstra, Neele
dc.date.accessioned: 2022-06-15T00:00:41Z
dc.date.available: 2022-06-15T00:00:41Z
dc.date.issued: 2022
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/41636
dc.description.abstract: This study analyses the applicability of a cross-modal deep learning model by Li et al. (2020b), trained on online recipe data (Salvador et al., 2017), to recognise ingredients and their proportional amounts in food images from social media, in order to geographically map food consumption within a city. Such a method can provide insights into food consumption at fine spatial granularity, which may contribute to global health, help diminish world hunger, and create opportunities for local production and consumption. Specifically, the study explores whether cross-modal analysis of images and ingredient sets can be used to estimate relative ingredient proportions from images taken in a specific geographical area. These estimated amounts are compared to relative consumption rates derived from a translated dataset of supermarket retail sales within the same region. The research focuses on Baku, Azerbaijan, as a case study. To study consumption in this region, a new dataset, AzerFSQFood, is presented, containing food images from eating establishments around the city. A pretrained cross-modal neural network (Li et al., 2020b) is applied to this novel image dataset, and the results are compared to a new translated supermarket sales dataset based on an existing Azerbaijani dataset from supermarkets in Baku (Zeynalov, 2020). Unfortunately, the existing cross-modal neural network performs poorly on the AzerFSQFood dataset for both ingredient detection and relative amount prediction. This indicates that existing cross-modal models are not yet readily applicable to novel datasets, and highlights the need for publicly available models and curated datasets. The study additionally finds a significant correlation between the relative ingredient consumption derived from Foursquare images and that derived from supermarket sales around Baku.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Analysis of the applicability of a pretrained cross-modal deep learning model to geographically map food consumption across Baku, Azerbaijan. The model is applied to food images of eating establishments sourced from Foursquare. In addition, food consumption is mapped based on supermarket sales across the city.
dc.title: Cross-modal recipe analysis for fine-grained geographical mapping of food consumption from images and supermarket sales
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: deep learning; cross-modal analysis; food consumption; geographical mapping; computer vision; natural language processing
dc.subject.courseuu: Artificial Intelligence
dc.thesis.id: 4448
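
The abstract above describes comparing relative ingredient proportions predicted from food images against proportions derived from supermarket sales. The sketch below illustrates one plausible form of that comparison; the ingredient names, proportion values, and the choice of Spearman rank correlation are illustrative assumptions, not reproduced from the thesis.

    # Minimal sketch: correlating image-derived ingredient proportions with
    # sales-derived proportions. All values here are hypothetical placeholders.
    from scipy.stats import spearmanr

    # Relative ingredient proportions estimated from food images (hypothetical).
    image_props = {"tomato": 0.18, "rice": 0.25, "beef": 0.12, "flour": 0.30, "onion": 0.15}

    # Relative proportions derived from supermarket sales (hypothetical).
    sales_props = {"tomato": 0.20, "rice": 0.22, "beef": 0.10, "flour": 0.33, "onion": 0.15}

    # Align both vectors on the shared ingredient vocabulary before correlating.
    ingredients = sorted(set(image_props) & set(sales_props))
    x = [image_props[i] for i in ingredients]
    y = [sales_props[i] for i in ingredients]

    rho, p_value = spearmanr(x, y)
    print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")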

