Abstract:
“An Optimized Data Clustering Algorithm for Analysis of Software Architectural Styles Data Set” proposes a clustering algorithm that improves the accuracy and efficiency of data clustering by eliminating initial value selection. The proposed algorithm can cluster any numerical data; to evaluate it, we designed a special data set of software architectural styles based on an online survey. The survey was conducted through a detailed online questionnaire. Well-known participants from academia and industry were selected for data retrieval, and more than one thousand responses were recorded. All attributes in the designed data set are of numerical data types. The data set has been donated to a prominent data repository website and is available online for further refinement and improvement through high-quality research. The accuracy of the proposed algorithm has been evaluated against K-Means, Fuzzy C-Means, and Agglomerative clustering through implementations in MATLAB 2015 and RapidMiner Studio 7.3. The results were found to be 100% accurate. The performance of the proposed algorithm has been evaluated based on CPU time, in which Q-Means has the best CPU time among the algorithms compared. The proposed algorithm has one key drawback: unlike other clustering algorithms, it can generate only two clusters. It is expected that this deficiency will be resolved in future research.