
dc.rights.license CC-BY-NC-ND
dc.contributor.advisor Qahtan, Hakim
dc.contributor.author Hou, Yuxuan Hou
dc.date.accessioned 2024-01-01T01:03:36Z
dc.date.available 2024-01-01T01:03:36Z
dc.date.issued 2024
dc.identifier.uri https://studenttheses.uu.nl/handle/20.500.12932/45760
dc.description.abstract In this study, we aim to help real estate evaluators avoid serious errors in the evaluation process caused by carelessness or subjective bias. To achieve this, we primarily use real Dutch real estate valuation data provided by KATE Innovations and build a data prediction framework with three key components: data imputation, feature selection, and data prediction. Within this framework, we build a data imputation method called “Bucket4Imp” and an improved version of the Fast Correlation-Based Filter (FCBF) as an optional feature selection preprocessing step. For data imputation and data prediction, we adopt a bucket-like ensemble model incorporating three models: K-Nearest Neighbors, Decision Trees, and Random Forest. Through these model and module configurations, we comprehensively address classification and regression problems involving high-dimensional and missing data. Experiments on diverse datasets demonstrate that our data imputation preprocessing method significantly improves predictive accuracy. Furthermore, our improved feature selection technique contributes to better predictive performance. However, the decision to employ feature selection should be based on the characteristics of the feature set. The study also conducts a comparative analysis between the machine learning-based Bucket4Imp method and the statistics-based Mode method. The results show that our Bucket4Imp method performs significantly better than the statistical approach, particularly in multi-class scenarios. Moreover, the findings highlight the challenges of predicting multi-class features, particularly when class distributions are even and the number of class labels is large.
dc.description.sponsorship Utrecht University
dc.language.iso EN
dc.subject Recovering missing data and making predictions about the target variables
dc.title Improving Accuracy in Real Estate Valuation: A Study on Data Imputation and Prediction Models
dc.type.content Master Thesis
dc.rights.accessrights Open Access
dc.subject.keywords Missing Data, Data Prediction, Feature Engineering, Data Imputation
dc.subject.courseuu Computing Science
dc.thesis.id 24067