Low-Level GPU Performance Characteristics Using Vendor-Independent Benchmarks
Summary
GPUs are among the most commonly used devices for parallel computing. Each GPU architecture differs from the others and usually has different performance characteristics under different loads. For an application to run at optimal performance when using a GPU as its compute device, it is necessary to know the low-level behavior of that device. In this study, we first create a device-independent framework that can be compiled by each vendor's native compiler. We then classify the GPU hardware modules that have the most impact on performance and create a series of benchmarks with different execution patterns and loads to build an overview of the GPU's performance characteristics. Other applications can then use these characteristics as a basis for optimizing for a particular device or for deciding which device provides the best performance for that application.