Abstract:
Grid computing is a promising next-generation computing platform because it enables the sharing of resources distributed across the world. Managing such large computing systems becomes increasingly complex as more heterogeneous components are added. Given the size of the Global Grid, manual management is not feasible, and there is a definite need for self-managing Grid systems.
The aim of this thesis was to devise methods, techniques, or algorithms that enable autonomic grid management. Autonomic grid management includes reducing the work and complexity associated with large systems, responding better to sudden changes in the system, and adjusting settings appropriately.
To achieve autonomic grid management, self-managing systems have to be developed, which raises a number of research questions. How can we develop systems that are (a) self-configuring, i.e., the system adapts its configuration as its environment changes; (b) self-optimizing, i.e., the system continuously looks for ways to optimize its performance; and (c) self-healing, i.e., the system recognizes abnormal conditions or problems that may harm its operation and recovers from them?
To conform to international standards, the proposed architecture is based on the Grid Monitoring Architecture (GMA) proposed by the Global Grid Forum (GGF). The self-managing system developed in this project collects critical information through its sensors and maintains both historical and up-to-date data. This data is analyzed, and appropriate decisions are then made, resulting in adjustments to the system. The system provides the four core functionalities outlined by IBM: (a) Monitoring provides the mechanisms that collect, aggregate, filter, correlate, and report details collected from manageable resources. (b) Analysis provides the mechanisms that model complex situations, allowing the system to learn about the environment and helping it predict future situations. (c) Planning provides the mechanisms that construct the actions needed to achieve goals and objectives. (d) Execution provides the mechanisms that control the execution of a plan.
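The four functions listed above can be read as one control loop. The following is a minimal Python sketch of such a loop, not the implementation developed in the thesis. Every name in it (Resource, AutonomicManager, max_jobs), the load thresholds, and the averaging window are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Dict, List


@dataclass
class Resource:
    """Hypothetical managed resource (illustrative, not from the thesis)."""
    name: str
    load: float    # current utilisation in [0, 1], as a sensor would report it
    max_jobs: int  # tunable setting that the loop adjusts


@dataclass
class AutonomicManager:
    """Sketch of a monitor/analyze/plan/execute loop with assumed thresholds."""
    high: float = 0.8  # assumed overload threshold
    low: float = 0.3   # assumed underload threshold
    window: int = 3    # number of recent samples used in the analysis
    history: Dict[str, List[float]] = field(default_factory=dict)

    def monitor(self, resources: List[Resource]) -> None:
        # Monitoring: collect a sample per resource and keep historical data.
        for r in resources:
            self.history.setdefault(r.name, []).append(r.load)

    def analyze(self) -> Dict[str, float]:
        # Analysis: summarise recent behaviour as a moving average of the load.
        return {
            name: mean(samples[-self.window:])
            for name, samples in self.history.items()
        }

    def plan(self, averages: Dict[str, float]) -> Dict[str, int]:
        # Planning: turn the analysis into per-resource adjustments.
        actions: Dict[str, int] = {}
        for name, avg in averages.items():
            if avg > self.high:
                actions[name] = -1  # overloaded: accept fewer jobs
            elif avg < self.low:
                actions[name] = 1   # underused: accept more jobs
        return actions

    def execute(self, resources: List[Resource], actions: Dict[str, int]) -> None:
        # Execution: apply the plan, never dropping below one job slot.
        for r in resources:
            r.max_jobs = max(1, r.max_jobs + actions.get(r.name, 0))

    def step(self, resources: List[Resource]) -> Dict[str, int]:
        # One full pass of the loop; returns the actions that were applied.
        self.monitor(resources)
        actions = self.plan(self.analyze())
        self.execute(resources, actions)
        return actions


nodes = [
    Resource("node-a", 0.95, 4),
    Resource("node-b", 0.10, 4),
    Resource("node-c", 0.50, 4),
]
manager = AutonomicManager()
actions = manager.step(nodes)
```

In this sketch, a single pass lowers the job limit of the overloaded node, raises that of the underused node, and leaves the third unchanged. A real system would replace the threshold rule with whatever analysis and planning policy the deployment requires.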