Abstract:
Water is essential for life: it supports biological processes, nourishes crops, keeps ecosystems
healthy, and underpins human health, economic development, and productivity.
In this study, 80 surface water samples were collected at random from each of two lakes, Hanna
Lake and Spin Karez, over 8 months (10 samples per month), and their physicochemical and
microbiological characteristics were investigated with laboratory tests to establish whether the
water was drinkable. Eighteen Hanna Lake samples and eight Spin Karez samples were deemed unsafe
to drink. Each sample was classified as drinkable or non-drinkable based on its drinkability
value. The first six months of drinkability data were used to train the algorithms, and
drinkability for the remaining two months was then predicted.
We used confusion matrices to evaluate the prediction performance of seven built-in machine
learning algorithms: Linear Regression (LR), Decision Tree Classifier (DTC), Random Forest (RF),
XGBoost (XGB), K-Nearest Neighbours (KNN), Support Vector Machines (SVM), and AdaBoost. The DTC
algorithm outperformed all other algorithms for both Hanna Lake and Spin Karez. Based on these
observations, a water treatment process was developed to remove the parameters found at high
concentrations in both lakes and so produce purified water.
Researchers interested in using machine learning to improve water quality can benefit from the
water quality prediction models described in this article.
Keywords: Hanna Lake, Spin Karez, Physicochemical parameters, Microbial activities, WHO, NEQs,
Machine Learning
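
The evaluation protocol summarized in the abstract (train on the first six months, predict the remaining two, score with a confusion matrix) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the data are synthetic, the two features (pH and turbidity) and the labelling rule are hypothetical placeholders, and a small hand-written k-nearest-neighbours vote stands in for the seven algorithms compared in the study.

```python
import random
from collections import Counter


def split_by_month(samples, train_months=6):
    """Chronological split: months 1..train_months for training, the rest for testing."""
    train = [s for s in samples if s["month"] <= train_months]
    test = [s for s in samples if s["month"] > train_months]
    return train, test


def knn_predict(train, features, k=3):
    """Majority vote among the k training samples closest in Euclidean distance."""
    nearest = sorted(
        train,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(s["features"], features)),
    )[:k]
    votes = Counter(s["drinkable"] for s in nearest)
    return votes.most_common(1)[0][0]


def confusion_matrix(y_true, y_pred):
    """Count TP, FP, FN, TN, treating 'drinkable' (True) as the positive class."""
    cm = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            cm["TP"] += 1
        elif not truth and pred:
            cm["FP"] += 1
        elif truth and not pred:
            cm["FN"] += 1
        else:
            cm["TN"] += 1
    return cm


# Synthetic placeholder data: 8 months x 10 samples for a single lake.
rng = random.Random(0)
samples = []
for month in range(1, 9):
    for _ in range(10):
        ph = rng.uniform(5.5, 9.5)
        turbidity = rng.uniform(0.0, 10.0)
        # Hypothetical labelling rule, for illustration only.
        drinkable = 6.5 <= ph <= 8.5 and turbidity < 5.0
        samples.append(
            {"month": month, "features": (ph, turbidity), "drinkable": drinkable}
        )

train, test = split_by_month(samples)
y_true = [s["drinkable"] for s in test]
y_pred = [knn_predict(train, s["features"]) for s in test]

cm = confusion_matrix(y_true, y_pred)
accuracy = (cm["TP"] + cm["TN"]) / len(test)
print(cm, f"accuracy={accuracy:.2f}")
```

Splitting by month rather than at random keeps the test set strictly later in time than the training set, which matches the forecasting setup described above. Repeating the same split and confusion matrix for each candidate algorithm and each lake gives a like-for-like comparison.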