dc.description.abstract | Recommender systems are becoming increasingly important in the growing online society. DeepICF and DeepICF+a are recommender system algorithms which have been claimed to provide more suitable recommendations than alternative algorithms. However, DeepICF and DeepICF+a have only been tested on movies and pictures, using HR and NDCG as evaluation metrics. This paper explores the generalizability of both algorithms, using the same evaluation metrics, in several steps. First, it reproduces the original experiments with DeepICF and DeepICF+a and shows that both algorithms achieve lower accuracy in the reproduction than originally reported. Then, the differences between explicit and implicit ratings, as well as the effects of pre-training, are investigated. Using implicit ratings instead of explicit ratings has no negative influence on the performance of DeepICF and DeepICF+a, while the absence of pre-training data does have a negative impact on their accuracy. Finally, DeepICF and DeepICF+a are applied to the Million Song Dataset to examine the generalizability of these algorithms, with FISM as the baseline algorithm. Compared to the originally used data sets, all three algorithms returned less accurate recommendation lists on the Million Song Dataset. The FISM implementation used produced faulty output on this data set, so it remains unproven that DeepICF and DeepICF+a outperform FISM on the Million Song Dataset. Still, it is found that DeepICF+a is more sensitive to parameter settings than DeepICF, and yields worse results when not tuned properly. For practical implementations, DeepICF is therefore the better option of the two; in fact, it has yet to be proven inferior to any other recommender system algorithm. | |