Abstract:
Data mining, in this age of data science, spans statistical analysis, machine learning, information retrieval, visualization, and other related fields. It helps companies and other organizations make strategic decisions based on trend analysis, and helps businesses stay ahead of their competitors.
Many data mining software packages have emerged during the last decade, such as WEKA, R, and RapidMiner, but they are mostly desktop-based. On the one hand, this approach limits collaboration among data mining experts on the predictive models they develop; on the other, it restricts data processing capability to the computing power of the desktop. Huge volumes of data are available on the web, yet they remain under-utilized relative to the benefits they could offer if properly analyzed and used for decision support.
The main goal of this project is to develop a data mining framework that follows the “software as a service” (SaaS) model. WEKA, an open-source data mining API, is used as the core of our framework. It provides a large collection of machine learning and data mining algorithms for solving real-world problems. Our hypothesis is that the software-as-a-service approach of cloud computing is critical to the widespread adoption of data mining tools.
The data preprocessing (data filtering), classification, clustering, association, and attribute selection features of WEKA have been successfully ported as RESTful services. Moreover, interactive visualization of results has been developed for consumption over the web. Users can use this application without any installation or configuration overhead. World Wide WEKA will not only target the business community but will also be useful for academic purposes.