NUST Institutional Repository

A Framework for Predicting Optimal Tuning Parameters for GPU Compute Kernels using Machine Learning

dc.contributor.author Mahmood, Khawir
dc.contributor.advisor Dr. Hammad Afzal
dc.date.accessioned 2020-11-17T04:55:44Z
dc.date.available 2020-11-17T04:55:44Z
dc.date.issued 2019-09
dc.identifier.other TCS-442
dc.identifier.other MSCS / MSSE-24
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/12270
dc.description.abstract Finding optimal tuning parameters for Graphics Processing Unit (GPU) compute kernels is typically treated as an optimization problem solved through search techniques. Because the optimization surface is unknown, an exhaustive search is required for the best results. Such a methodology consumes considerable compute resources and time, which makes it impractical for production software. This thesis describes a framework that uses deep learning sequence models to predict the optimal tuning parameters for GPU compute kernels solely on the basis of input tensor parameter values. The models are first trained on the available dataset, which contains both the input and the corresponding optimum output parameter values. From within the large search space, the model learns the underlying multi-dimensional optimization surface, or manifold, and can predict the optimal tuning configuration with high accuracy, even for unseen sets of input tensor values. A modified beam search technique has also been proposed and incorporated into the prediction stage of the framework, ensuring that the predicted output parameters satisfy the hardware constraints dictated by the GPU architecture. The framework has been tested on the half and full precision modes of four different kernels from the MIOpen dataset. By incorporating beam search and output parameter constraint satisfaction, the framework predicts, with more than 90% accuracy, the optimum parameters for kernels that otherwise take hours to tune. As a result, it substantially reduces the development time and compute resources required to tune unseen input configurations in a production environment, which in turn translates to shorter development cycles, reduced development costs, and a better user experience. en_US
dc.language.iso en en_US
dc.title A Framework for Predicting Optimal Tuning Parameters for GPU Compute Kernels using Machine Learning en_US
dc.type Thesis en_US
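
The abstract describes two technical pieces: a sequence model that scores candidate values for each tuning parameter in turn, and a modified beam search that discards candidates violating GPU hardware constraints. Below is a minimal Python sketch of how such a constrained beam search could fit together. The parameter names, candidate ranges, constraint limits, and the predict_scores stand-in are all illustrative assumptions, not the thesis's actual implementation.

import heapq
import math
from typing import Dict, List, Tuple

# Candidate values for each tuning parameter, chosen one step at a time
# (hypothetical parameters for a GEMM-like kernel; not from the thesis).
PARAM_CHOICES: List[Tuple[str, List[int]]] = [
    ("tile_m", [16, 32, 64, 128]),
    ("tile_n", [16, 32, 64, 128]),
    ("threads_per_block", [64, 128, 256, 512, 1024]),
]

def predict_scores(prefix: Dict[str, int], choices: List[int]) -> List[float]:
    # Stand-in for the trained sequence model: return one log-probability
    # per candidate value, conditioned on the parameters chosen so far.
    # (A real model would also condition on the input tensor shapes.)
    return [math.log(1.0 / len(choices))] * len(choices)

def satisfies_constraints(config: Dict[str, int]) -> bool:
    # Hardware feasibility check with an assumed limit, e.g. the
    # 1024-thread workgroup cap common to current GPU architectures.
    return config.get("threads_per_block", 0) <= 1024

def constrained_beam_search(beam_width: int = 4):
    # Keep the top `beam_width` partial configurations at each step and
    # prune any branch that violates the hardware constraints, so every
    # configuration that survives to the end is feasible.
    beam: List[Tuple[float, Dict[str, int]]] = [(0.0, {})]
    for name, choices in PARAM_CHOICES:
        candidates = []
        for score, prefix in beam:
            for value, logp in zip(choices, predict_scores(prefix, choices)):
                config = {**prefix, name: value}
                if satisfies_constraints(config):
                    candidates.append((score + logp, config))
        beam = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return beam

if __name__ == "__main__":
    for score, config in constrained_beam_search():
        print(f"{score:.3f}  {config}")

In this setup an infeasible value never reaches the final beam, which matches the abstract's claim that constraint satisfaction is built into the prediction stage rather than checked afterwards.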

