View Item 
        •   Utrecht University Student Theses Repository Home
        • UU Theses Repository
        • Theses
        • View Item

        Forecasting SARS-CoV-2 Virus Load in Sewage Using Autoregression Models for Time Series Data

        View/Open
        Research_Project_final_version.pdf (5.510Mb)
        Publication date
        2023
        Author
        Slegers, Fleur
        Summary
        Wastewater is a reservoir of human excretion and contains virus particles shed by people. Its analysis can provide information on the prevalence of infectious diseases. In the Netherlands, sewage surveillance has been used to monitor the COVID-19 pandemic by detecting and measuring SARS-CoV-2 virus particles in sewage at over 300 sewage treatment plants (STPs) multiple times a week. The result is an extensive data set of multivariate time series. In this thesis, linear regression models are used to model and forecast the virus load time series for each variable (STP). We compare vector autoregressive (VAR) models enriched with different variable selection methods, based on K-Nearest Neighbours, correlations between time series, and principal component analysis. How much the inclusion of multiple variables improves predictions depends strongly on the number and choice of variables, on the smoothness of the time series, on the length of the training set, and on the time between training and testing. A remarkable result is that models using an intermediate number of variables produced the largest test errors. We also found that the models performed worse for STPs with small catchment areas. With this research, we shed light on how relationships between STPs can be incorporated into multivariate linear time series models and introduce three novel variable selection methods for VAR.
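
To make the idea of a VAR model with correlation-based variable selection concrete, here is a minimal sketch in Python using only NumPy. It is not the thesis's implementation: the function names (`select_by_correlation`, `fit_var`, `forecast_one_step`), the use of absolute Pearson correlation, the ordinary least squares fit, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def select_by_correlation(series, target, k):
    """Return the target column index followed by the k columns whose
    absolute Pearson correlation with the target is largest.
    (Hypothetical helper; the selection rule is an assumption.)"""
    corr = np.abs(np.corrcoef(series, rowvar=False)[target])
    corr[target] = -np.inf  # never pick the target as its own neighbour
    neighbours = np.argsort(corr)[::-1][:k]
    return np.concatenate(([target], neighbours))


def fit_var(series, p):
    """Fit a VAR(p) model y_t = c + A_1 y_{t-1} + ... + A_p y_{t-p} + e_t
    by ordinary least squares. Returns an array of shape (1 + n*p, n):
    row 0 is the intercept c, the following rows stack A_1^T, ..., A_p^T."""
    T, n = series.shape
    lagged = [series[p - lag:T - lag] for lag in range(1, p + 1)]
    X = np.hstack([np.ones((T - p, 1))] + lagged)
    Y = series[p:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return coef


def forecast_one_step(series, coef, p):
    """Predict the next observation from the last p rows of `series`."""
    x = np.concatenate([[1.0]] + [series[-lag] for lag in range(1, p + 1)])
    return x @ coef


if __name__ == "__main__":
    # Synthetic stand-in for STP virus load series (not real data).
    rng = np.random.default_rng(0)
    data = rng.normal(size=(200, 5)).cumsum(axis=0)
    cols = select_by_correlation(data, target=0, k=2)
    coef = fit_var(data[:, cols], p=2)
    # First entry of the forecast corresponds to the target series.
    print(forecast_one_step(data[:, cols], coef, p=2)[0])
```

Restricting the model to a handful of selected series keeps the number of coefficients, `1 + n*p` per equation, small relative to the length of the training set, which is one plausible reason the number and choice of variables matters for forecast error.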
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/43719
        Collections
        • Theses