dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Keller, G.K. | |
dc.contributor.author | Prasetya, Naraenda | |
dc.date.accessioned | 2022-08-17T23:00:32Z | |
dc.date.available | 2022-08-17T23:00:32Z | |
dc.date.issued | 2022 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/42329 | |
dc.description.abstract | Efficient cache utilization is critical in programs with high data throughput. Improving performance in this area often requires niche knowledge of computer architecture, extensive benchmarking, and algorithms that do more than is intuitively required. By changing the order in which tasks are executed, we change the order in which memory is accessed, and thereby how caches are used. This thesis proposes a column iterator that reschedules a 2D workload. We show that performance can be increased compared to the naive method by implementing the proposed method in C++ with CUDA and as an extension to the data-parallel DSL Accelerate. | |
dc.description.sponsorship | Utrecht University | |
dc.language.iso | EN | |
dc.subject | This thesis shows that the cache efficiency of stencil and matrix multiplication workloads can be improved, compared to the naive implementation and the more common tiling approach, by rescheduling via index mapping. While the main focus lies on improving performance on GPUs, the techniques presented can also be applied on a CPU. | |
dc.title | Improving Cache Performance in Structured GPGPU Workloads via Specialized Thread Schedules | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | GPU, GPGPU, parallel computing, cache, optimization, scheduling, multi-threading | |
dc.subject.courseuu | Computing Science | |
dc.thesis.id | 8765 | |