NUST Institutional Repository

A Framework for Predicting Optimal Tuning Parameters for GPU Compute Kernels using Machine Learning

dc.contributor.author Mahmood, Khawir
dc.contributor.advisor Dr. Hammad Afzal
dc.date.accessioned 2020-11-17T04:55:44Z
dc.date.available 2020-11-17T04:55:44Z
dc.date.issued 2019-09
dc.identifier.other TCS-442
dc.identifier.other MSCS / MSSE-24
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/12270
dc.description.abstract Finding optimal tuning parameters for Graphics Processing Unit (GPU) compute kernels is typically treated as an optimization problem solved through search techniques. Because the optimization surface is unknown, an exhaustive search is required for the best results. Such a methodology consumes considerable compute resources and time, which makes it impractical for production software. This thesis describes a framework that uses deep learning sequence models to predict the optimal tuning parameters for GPU compute kernels solely on the basis of input tensor parameter values. The models are first trained on the available dataset, which contains both the input and the corresponding optimum output parameter values. From within the large search space, the model learns the underlying multi-dimensional optimization surface, or manifold, and can predict the optimal tuning configuration with high accuracy, even for unseen sets of input tensor values. A modified beam search technique has also been proposed and incorporated into the prediction stage of the framework, ensuring that the predicted output parameters satisfy the hardware constraints dictated by the GPU architecture. The framework has been tested on the half and full precision modes of four different kernels from the MIOpen dataset. By incorporating beam search and output parameter constraint satisfaction, the framework predicts, with more than 90% accuracy, the optimum parameters for kernels that otherwise take hours to tune. As a result, it substantially reduces the development time and compute resources required to tune unseen input configurations in a production environment, which in turn translates to shorter development cycles, reduced development costs, and a better user experience. en_US
dc.language.iso en en_US
dc.title A Framework for Predicting Optimal Tuning Parameters for GPU Compute Kernels using Machine Learning en_US
dc.type Thesis en_US
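
The abstract describes two technical pieces: a sequence model that scores candidate values for each tuning parameter in turn, and a modified beam search that discards candidates violating GPU hardware constraints. Below is a minimal Python sketch of how such a constrained beam search could fit together. The parameter names, candidate ranges, constraint limits, and the predict_scores stand-in are all illustrative assumptions, not the thesis's actual implementation.

import heapq
import math
from typing import Dict, List, Tuple

# Candidate values for each tuning parameter, chosen one step at a time
# (hypothetical parameters for a GEMM-like kernel; not from the thesis).
PARAM_CHOICES: List[Tuple[str, List[int]]] = [
    ("tile_m", [16, 32, 64, 128]),
    ("tile_n", [16, 32, 64, 128]),
    ("threads_per_block", [64, 128, 256, 512, 1024]),
]

def predict_scores(prefix: Dict[str, int], choices: List[int]) -> List[float]:
    # Stand-in for the trained sequence model: return one log-probability
    # per candidate value, conditioned on the parameters chosen so far.
    # (A real model would also condition on the input tensor shapes.)
    return [math.log(1.0 / len(choices))] * len(choices)

def satisfies_constraints(config: Dict[str, int]) -> bool:
    # Hardware feasibility check with an assumed limit, e.g. the
    # 1024-thread workgroup cap common to current GPU architectures.
    return config.get("threads_per_block", 0) <= 1024

def constrained_beam_search(beam_width: int = 4):
    # Keep the top `beam_width` partial configurations at each step and
    # prune any branch that violates the hardware constraints, so every
    # configuration that survives to the end is feasible.
    beam: List[Tuple[float, Dict[str, int]]] = [(0.0, {})]
    for name, choices in PARAM_CHOICES:
        candidates = []
        for score, prefix in beam:
            for value, logp in zip(choices, predict_scores(prefix, choices)):
                config = {**prefix, name: value}
                if satisfies_constraints(config):
                    candidates.append((score + logp, config))
        beam = heapq.nlargest(beam_width, candidates, key=lambda c: c[0])
    return beam

if __name__ == "__main__":
    for score, config in constrained_beam_search():
        print(f"{score:.3f}  {config}")

In this setup an infeasible value never reaches the final beam, which matches the abstract's claim that constraint satisfaction is built into the prediction stage rather than checked afterwards.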

