dc.description.abstract |
This project aims to achieve computational acceleration of the MATLAB image processing toolbox using NVIDIA's advanced parallel computing architecture, CUDA (Compute Unified Device
Architecture), on standard PC hardware. Over the last few years, graphics
processors have experienced a tremendous performance increase, and their
instruction set has become remarkably general purpose. Several research projects have
been started to formulate adequate programming abstractions for using GPUs as coprocessors; one example is the Compute Unified Device Architecture (CUDA) by graphics
card vendor NVIDIA.
The advent of multicore CPUs and manycore GPUs means that mainstream processor
chips are now parallel systems. Furthermore, their parallelism continues to scale with
Moore’s law. The challenge is to develop application software that transparently scales its
parallelism to leverage the increasing number of processor cores, much as 3D graphics
applications transparently scale their parallelism to manycore GPUs with widely varying
numbers of cores.
NVIDIA is a world leader in visual computing technologies and the inventor of the GPU
(Graphics Processing Unit). CUDA, developed by NVIDIA, is a technology for general-purpose computing on the GPU that lets users develop GPU programs
easily. CUDA gives developers access to the native instruction set and memory of
the parallel computational elements in CUDA GPUs. Using CUDA, the latest NVIDIA
GPUs effectively become open architectures like CPUs. Unlike CPUs, however, GPUs
have a parallel "many-core" architecture whose cores together can run thousands of
threads simultaneously; if an application is suited to this kind of architecture, the GPU
can offer large performance benefits. This approach of solving general-purpose problems
on GPUs is known as GPGPU.
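The execution model described above can be sketched as a minimal CUDA program (an illustrative example, not code from the project): a kernel is launched over a grid of thread blocks, and each thread computes its own global index to process one element.

```cuda
#include <cuda_runtime.h>

// Minimal sketch of the CUDA many-thread model: each GPU thread handles
// one array element, identified by its block and thread indices.
__global__ void scale(const float *in, float *out, float factor, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)                                      // guard against overrun
        out[i] = factor * in[i];
}

int main(void)
{
    const int n = 1 << 20;
    float *in, *out;
    cudaMallocManaged(&in,  n * sizeof(float));     // unified memory, for brevity
    cudaMallocManaged(&out, n * sizeof(float));
    for (int i = 0; i < n; ++i) in[i] = 1.0f;

    int threads = 256;                              // threads per block
    int blocks  = (n + threads - 1) / threads;      // enough blocks to cover n
    scale<<<blocks, threads>>>(in, out, 2.0f, n);   // launch thousands of threads
    cudaDeviceSynchronize();                        // wait for the GPU to finish

    cudaFree(in);
    cudaFree(out);
    return 0;
}
```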
We implemented in CUDA some of the important and widely used image processing algorithms, such as the Hough transform, RGB-to-grayscale conversion, convolution, repmat, the discrete cosine
transform, optical flow, and many others. Implementing these MATLAB functions on
CUDA yielded a significant increase in performance and efficiency. These functions find
application in different engineering and scientific domains, including medicine, space,
and multimedia, among others.
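As one concrete example, an RGB-to-grayscale kernel might look like the following hypothetical sketch (it is not the project's actual code), with one thread per pixel and the luma weights used by MATLAB's rgb2gray:

```cuda
// Hypothetical RGB-to-grayscale kernel: each thread converts one pixel
// of an interleaved 8-bit RGB image using ITU-R BT.601 luma weights.
__global__ void rgb2gray(const unsigned char *rgb, unsigned char *gray,
                         int width, int height)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;  // pixel column
    int y = blockIdx.y * blockDim.y + threadIdx.y;  // pixel row
    if (x < width && y < height) {
        int i = y * width + x;                      // linear pixel index
        float r = rgb[3 * i + 0];
        float g = rgb[3 * i + 1];
        float b = rgb[3 * i + 2];
        gray[i] = (unsigned char)(0.2989f * r + 0.5870f * g + 0.1140f * b);
    }
}
```

A launch would use a 2D grid covering the image, e.g. `dim3 block(16, 16);` and `dim3 grid((width + 15) / 16, (height + 15) / 16);`, so that every pixel is assigned to exactly one thread.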
For parallel computing with CUDA, two points deserve attention. First, the allocation of data
to each thread is important: better schemes for partitioning the input data among threads would improve the
efficiency of the image algorithms. Second, the memory bandwidth
between host and device is the bottleneck of the whole computation, so fast transfer of the input data is also
very important and deserves emphasis. CUDA thus provides a
novel, massively data-parallel general computing method that is also cheap in hardware
terms. In future work, we will address more of MATLAB's image processing
algorithms and their optimization strategies, and study how to make the best use of the
increasing number of GPU cores for parallel computing. |
en_US |