NUST Institutional Repository

Parallel Architectures for Data Mining and Machine Learning Algorithms

dc.contributor.author Amna Tehreem
dc.date.accessioned 2020-12-31T06:47:03Z
dc.date.available 2020-12-31T06:47:03Z
dc.date.issued 2016
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/20148
dc.description Supervisor Dr. Shoab Ahmad Khan en_US
dc.description.abstract Data mining and machine learning algorithms deal with large amounts of data, and the volume of data has grown massively with the invention of cost-efficient devices. Many algorithms from these domains are not used in real-time systems because of their computational complexity and the large data sets on which they must operate. Many such algorithms are being implemented on parallel processing platforms such as GPUs and FPGAs to achieve the desired speed. The purpose of this thesis is to provide parallel processing models of the mean shift clustering algorithm and the frequent pattern growth (FP-growth) algorithm, targeted to run on an FPGA. The general model consists of multiple homogeneous processing entities (PEs) connected through a bus. The PEs work collaboratively: each PE works independently and communicates with its peers as the algorithm requires. Two architectures for the mean shift clustering algorithm are proposed. The first is a general architecture that reduces the computational load in each successive iteration by decreasing the number of windows to be processed. The second is proposed and implemented on an FPGA for one-dimensional data. The algorithm is tested on 20 images from a segmentation evaluation database for different numbers of PEs and different numbers of fractional bits used to represent the mean. With a clock frequency of approximately 120 MHz, the algorithm segments an image in 2.47 ms with 1 PE and 7 fractional bits, and in 0.114 ms with 16 PEs and 0 fractional bits, compared with 6.44 minutes per image for the conventional mean shift algorithm. The simplicity of the algorithm results in very low utilization of the Spartan 6 FPGA's resources. A parallel architecture for implementing the FP-growth algorithm is also proposed, which divides the task efficiently among the PEs. The parallel algorithm is tested on databases from the UCI machine learning repository and the frequent itemset mining dataset repository. The speedup achieved with 2 PEs is approximately 1.99; increasing the number of PEs to 16 raises the speedup to approximately 15.5. The processing requirements of the algorithms show that they can be used in real-time systems. en_US
dc.publisher CEME, National University of Sciences and Technology, Islamabad en_US
dc.subject Computer Engineering en_US
dc.title Parallel Architectures for Data Mining and Machine Learning Algorithms en_US
dc.type Thesis en_US
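
For readers unfamiliar with the mean shift clustering mentioned in the abstract, the following is a minimal software sketch of textbook flat-kernel mean shift on one-dimensional data. It is not the thesis's FPGA architecture: the function name `mean_shift_1d`, the `bandwidth`, `max_iter` and `tol` parameters, and the sample values are all assumptions made for illustration.

```python
def mean_shift_1d(values, bandwidth, max_iter=100, tol=1e-3):
    """Flat-kernel mean shift on 1-D data (illustrative sketch only).

    Returns one converged mode per input value; values that end at the
    same mode belong to the same cluster.
    """
    # Hypothetical setup: every data point starts as its own window centre.
    modes = [float(v) for v in values]
    for _ in range(max_iter):
        moved = False
        for i, centre in enumerate(modes):
            # Collect the data points that fall inside the current window.
            window = [v for v in values if abs(v - centre) <= bandwidth]
            if not window:
                continue  # defensive guard; leave the centre where it is
            new_centre = sum(window) / len(window)
            if abs(new_centre - centre) > tol:
                moved = True
            # Shift the window centre to the mean of the points it covers.
            modes[i] = new_centre
        if not moved:
            break  # every centre has converged
    return modes


# Assumed sample data: two well-separated groups of grey levels.
samples = [10, 12, 11, 200, 202, 201]
modes = mean_shift_1d(samples, bandwidth=5)
```

Each window update depends only on the data and on that window's own centre, which is one reason the per-window work can be distributed across independent processing entities.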


This item appears in the following Collection(s)

  • MS [331]
