dc.contributor.author |
Naveed, Maryam |
|
dc.date.accessioned |
2023-08-07T10:30:14Z |
|
dc.date.available |
2023-08-07T10:30:14Z |
|
dc.date.issued |
2022 |
|
dc.identifier.other |
277111 |
|
dc.identifier.uri |
http://10.250.8.41:8080/xmlui/handle/123456789/35746 |
|
dc.description |
Supervisor: Dr. Arslan Shaukat |
en_US |
dc.description.abstract |
For any organization, customers are the basis of success, so Customer Relationship Management
(CRM) is an integral department. CRM research shows that retaining existing customers yields a
higher return than acquiring new ones, which costs five times as much. For this reason,
organizations aim to minimize churn. Churn is defined as a customer ending a subscription or
ceasing to use a service provided by an entity. Customer churn occurs across various business
domains and has a considerable impact on revenue generation. For companies to retain their essential
customers, those at risk of churning must be identified in good time. Once identified, they can be
targeted with retention strategies. When the characteristics of likely churners are known, it is
also much easier to target a specific group of customers than all of them. This makes churn
identification and classification very important for the growth of a business.
This research aims to provide a generalized system, including pre-processing and feature selection,
that can be used with different components and business rules to identify customers on the verge of churning.
A centralized hybrid algorithm has been devised to identify customers who may be at risk. This
addresses the gap that arises when a researcher has to rely on trial and error to find the best
algorithm for their problem. Telecommunications data is widely available and is
used as the benchmark for testing the proposed methodology. We use the available IBM Watson
and Cell2Cell datasets and a locally sourced dataset. Classifiers such as Support Vector Machines with RBF
kernel, GP-AdaBoost, and Random Forest are used with SMOTE-ENN sampling, RFE feature
selection, and normalization techniques. A combination of classification evaluation metrics is
employed for thorough testing, supported by 10-fold cross-validation. Experiments have been
performed with varying parameters and components. We achieve an accuracy of
0.984 on IBM Watson and 0.994 on Cell2Cell. The locally sourced dataset has not been used in previous
research, so it was used as scoring data, on which we achieved an accuracy greater than 0.990.
The results achieved on the two benchmark datasets using our proposed system are competitive
with those reported in previous literature. |
en_US |
dc.language.iso |
en |
en_US |
dc.publisher |
College of Electrical & Mechanical Engineering (CEME), NUST |
en_US |
dc.subject |
Keywords: Customer Churn, SVM, Random Forest, GP-AdaBoost, SMOTE-ENN |
en_US |
dc.title |
Generalized Churn Classification Across Multiple Business Domains |
en_US |
dc.type |
Thesis |
en_US |