83. THE CRISP PERFORMANCE MODEL FOR GPGPUS
Name: Rajib Kumar Nath
Grad Year: 2017
My research aims to develop a hardware-counter-based analytical performance model for NVIDIA GPGPUs. Such models can be instrumental in driving on-chip dynamic power, performance, and energy optimization decisions, e.g., DVFS, cache replacement, selective caching, warp scheduling, and thread block scheduling. To this end, we have developed CRISP, the first runtime analytical model of performance in the face of changing frequency in a GPGPU. It shows that prior models not targeted at a GPGPU fail to account for important characteristics of GPGPU execution, including the high degree of overlap between memory access and computation and the frequency of store-related stalls. CRISP, for example, carefully models the overlap between loads and computation, enabling it to predict at what frequency the overlapped computation will completely cover the loads and make the application compute bound. CRISP provides significantly greater accuracy than prior runtime performance models, staying within 4% on average when scaling frequency by up to 7X. Using CRISP to drive a runtime energy efficiency controller yields a 10.7% improvement in energy-delay product, vs. 6.2% attainable via the best prior performance model. My current work involves enhancing the CRISP analytical model to: (a) find optimum DVFS settings for multiple clock domains in GPGPUs (e.g., memory, interconnect, last level cache) for different optimization scenarios: performance vs. power vs. energy; (b) find the optimum number of thread blocks for kernels showing significant thrashing in caches and memory row buffers; (c) identify critical warps to drive the warp scheduling policy efficiently; and (d) assist the caching policy by classifying critical cache lines.
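The load/compute overlap idea can be illustrated with a deliberately simplified toy model. This is a sketch under stated assumptions, not CRISP's actual formulation: it assumes memory time is independent of core frequency, that compute cycles split cleanly into an overlapped part and a serial part, and that power follows a hypothetical static-plus-cubic form. All function names, parameters, and numbers are illustrative.

```python
def predicted_time(freq_ghz, overlap_cycles, serial_cycles, mem_time_ns):
    """Toy execution time in ns at a core frequency in GHz (cycles per ns).

    Assumption: compute that overlaps with loads hides behind memory time
    (or vice versa), so the overlapped phase costs the larger of the two.
    """
    overlapped_phase = max(overlap_cycles / freq_ghz, mem_time_ns)
    return overlapped_phase + serial_cycles / freq_ghz


def crossover_frequency(overlap_cycles, mem_time_ns):
    """Frequency (GHz) at or below which overlapped compute fully covers
    the loads, making the toy application compute bound."""
    return overlap_cycles / mem_time_ns


def best_edp_frequency(freqs_ghz, overlap_cycles, serial_cycles, mem_time_ns,
                       static_w=1.0, dyn_coeff=1.0):
    """Pick the candidate frequency minimizing energy-delay product.

    Assumption: power = static_w + dyn_coeff * f**3 (hypothetical model),
    so EDP = power * time**2 in arbitrary units.
    """
    def edp(f):
        t = predicted_time(f, overlap_cycles, serial_cycles, mem_time_ns)
        return (static_w + dyn_coeff * f ** 3) * t * t

    return min(freqs_ghz, key=edp)


if __name__ == "__main__":
    # Illustrative counter values, not measurements.
    overlap, serial, mem_ns = 1000, 200, 2000
    print("crossover (GHz):", crossover_frequency(overlap, mem_ns))
    for f in (0.25, 0.5, 1.0, 2.0):
        print(f, "GHz ->", predicted_time(f, overlap, serial, mem_ns), "ns")
    print("best EDP frequency:",
          best_edp_frequency([0.25, 0.5, 1.0, 2.0], overlap, serial, mem_ns))
```

In this toy setting, raising the frequency above the crossover point shortens only the serial part, which is why a controller driven by such a model may prefer a lower frequency when optimizing energy-delay product.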