dc.contributor.author |
Abdul Latif |
|
dc.date.accessioned |
2020-11-04T11:17:32Z |
|
dc.date.available |
2020-11-04T11:17:32Z |
|
dc.date.issued |
2007 |
|
dc.identifier.uri |
http://10.250.8.41:8080/xmlui/handle/123456789/9837 |
|
dc.description |
Advisor: Dr. Arshad Ali |
en_US |
dc.description.abstract |
Grid data mining has emerged as an important field: data continues to be produced at astounding rates, and efficient data mining techniques are required to get the most out of it. Various classes of algorithms have been developed. Because of the increasingly large volume of data to be processed, the Grid has become a natural platform for data mining. Recently, many frameworks have been developed that facilitate Grid data mining. However, Grids, being very application-specific, vary in terms of data distribution and scale. The performance of different classes of data mining algorithms against varying levels of data distribution has not been studied. This project aims to cover this domain by benchmarking data mining algorithms on Grids in order to determine whether some classes of algorithms are more suitable for specific levels of data distribution in Grids. This work will facilitate the deployment of optimized data mining algorithms on application-specific Grids and may lead to a generic, adaptive Grid data mining framework in the future.
The simulations are run on a Sun Fire V890 system. The Java-based data mining platform Weka is used along with GridSim, a Grid simulation environment. This analysis will yield a relation between the number of computation nodes, the data fragmentation level, and performance in terms of time. On that basis, a new data mining platform can be designed for distributed Grid infrastructure, in which data is distributed efficiently and intelligently to Grid resources in order to minimize the overall data mining time. |
en_US |
dc.publisher |
SEECS, National University of Sciences & Technology, Islamabad. |
en_US |
dc.subject |
Information Technology, Analyzing Effect of Data Fragmentation |
en_US |
dc.title |
Analyzing Effect of Data Fragmentation on Data Mining Algorithms in Distributed Grid Environment |
en_US |
dc.type |
Thesis |
en_US |