Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Liu, Alison
dc.contributor.author: Berg, Maarten van den
dc.date.accessioned: 2023-05-23T00:00:51Z
dc.date.available: 2023-05-23T00:00:51Z
dc.date.issued: 2023
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/43909
dc.description.abstract: Online advertising is an important way for companies that sell products on the internet to find customers. These companies often use an e-commerce system, which stores data on the products they sell. If a company's inventory changes often, it may be desirable to automate the creation and updating of advertisements, reducing the work of keeping them up to date. Channable is a company that provides a tool for automatically creating advertisements from the data in a company's e-commerce system. Channable offers a product feed processing system that connects to a company's e-commerce system and regularly downloads information on the company's inventory. Once this data has been downloaded, the system can apply customer-defined processing rules to it and convert it to a format suitable for submission to one or more advertising platforms or marketplaces. The heavy computational lifting in this system is performed by rule processing servers, which accept inventory data and customer-defined processing rules and produce a data stream that has been processed according to those rules. Channable runs several of these rule processing servers for redundancy and performance, so it must balance the workload between them. The current method for assigning work to the servers uses a distributed scheduler. This scheduler has limitations that cause it to distribute work unevenly, so that some servers regularly become overloaded while others sit idle. In this thesis we implement a better method for assigning work to the rule processing servers, based on a centralised scheduler. We compare two methods for detecting overloaded servers and one alternative algorithm for assigning work to servers against the current approach, using experiments on real-world data to determine which scheduling approach performs best.
We show that our scheduler can significantly improve the performance of the rule processing system: our best-performing scheduling algorithm reduces the average duration of low-priority jobs by a factor of 2.2 and their average waiting time by a factor of 3.5. We also significantly reduce the variance in waiting time between the rule processing servers, making the performance of the product feed processing system more predictable.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: This thesis describes a project carried out for an external company, Channable. In this project, the performance of one of the company's distributed systems is improved by changing how work is assigned to the servers in the system. The new work assignment method is evaluated by running experiments on a partial copy of the company's production environment.
dc.title: Scheduling data feed processing jobs
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.courseuu: Computing Science
dc.thesis.id: 16855


