NUST Institutional Repository

E-commerce Churn: Definition and Prediction – The Best Modelling Approach

dc.contributor.author Ghaznavi, Syed Muhammad Ameer
dc.date.accessioned 2023-03-03T05:59:18Z
dc.date.available 2023-03-03T05:59:18Z
dc.date.issued 2023-02-23
dc.identifier.other RCMS003384
dc.identifier.uri http://10.250.8.41:8080/xmlui/handle/123456789/32499
dc.description.abstract The wide accessibility and advancement of the internet have made it an essential component of modern companies and organizations. Particularly since the emergence of the COVID-19 pandemic, businesses have increasingly adopted online platforms and digital solutions to connect with their customers. E-commerce refers to the buying and selling of products or services over the internet. Online firms interact with clients under non-contractual terms, making it difficult to track customer retention. One of the major challenges encountered in e-commerce is churn, which refers to the situation in which a customer stops buying a product or service for a prolonged period. The churn rate in e-commerce is closely linked to a company's revenue, as retaining customers leads to higher margins than acquiring new customers. It is estimated that the cost of acquiring a new customer is four to five times that of retaining an existing one. The foremost objective of this research is to determine the most effective approach for identifying potential customer churn in the e-commerce industry. To carry out the analysis, an unlabelled dataset obtained from an e-commerce store is used to gain insights into customer purchasing patterns. The data undergoes several stages of preprocessing, during which new features are derived from the original dataset. To label the customer data, three distinct churn indicator techniques are applied: a comparison of the average purchase duration of customers, the RFM (recency, frequency, and monetary) method, and the K-means unsupervised learning algorithm. Finally, a comparative analysis of several machine learning classification algorithms is performed to develop an accurate churn prediction model.
This study constructed nine models by combining the Random Forest, Support Vector Machine, and Extreme Gradient Boosting algorithms with the three churn-defining criteria. These models were then evaluated on a range of performance metrics, including precision, recall, F1-score, accuracy, and AUROC. The models attained their highest accuracy when trained on data that had been labelled using the RFM method, with accuracies of 86% and 82%, respectively. Additionally, the memory and time consumption of the models were assessed: the Support Vector Machine classifier used the least memory, while the Extreme Gradient Boosting approach was the most time-efficient. en_US
dc.description.sponsorship Dr. Mehak Rafiq en_US
dc.language.iso en_US en_US
dc.publisher SINES NUST. en_US
dc.subject E-commerce Churn en_US
dc.title E-commerce Churn: Definition and Prediction – The Best Modelling Approach en_US
dc.type Thesis en_US
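
The abstract names three ways of labelling churn in an unlabelled purchase log. As a rough illustration of the first two ideas only, the sketch below derives recency, frequency, and monetary values per customer and flags a customer as churned when the time since the last purchase exceeds a multiple of that customer's average gap between purchases. It is not the thesis's implementation: the sample transactions, the snapshot date, the function names, and the `factor=2.0` threshold are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction log: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2022, 1, 5), 40.0),
    ("c1", date(2022, 2, 4), 25.0),
    ("c1", date(2022, 3, 6), 30.0),
    ("c2", date(2022, 1, 10), 80.0),
    ("c2", date(2022, 1, 20), 15.0),
    ("c3", date(2022, 3, 1), 60.0),
]
snapshot = date(2022, 3, 31)  # assumed observation date


def rfm_table(rows, as_of):
    """Derive recency (days), frequency, monetary and average purchase gap."""
    by_customer = defaultdict(list)
    for customer_id, day, amount in rows:
        by_customer[customer_id].append((day, amount))

    table = {}
    for customer_id, purchases in by_customer.items():
        days = sorted(day for day, _ in purchases)
        # Average gap is undefined for customers with a single purchase.
        avg_gap = None
        if len(days) > 1:
            avg_gap = (days[-1] - days[0]).days / (len(days) - 1)
        table[customer_id] = {
            "recency": (as_of - days[-1]).days,
            "frequency": len(days),
            "monetary": sum(amount for _, amount in purchases),
            "avg_gap": avg_gap,
        }
    return table


def label_by_purchase_gap(table, factor=2.0):
    """Flag churn when recency exceeds `factor` times the average gap.

    `factor` is an illustrative threshold, not a value from the thesis.
    Customers with one purchase get None because the rule cannot be applied.
    """
    labels = {}
    for customer_id, row in table.items():
        if row["avg_gap"] is None:
            labels[customer_id] = None
        else:
            labels[customer_id] = row["recency"] > factor * row["avg_gap"]
    return labels


table = rfm_table(transactions, snapshot)
labels = label_by_purchase_gap(table)
for customer_id in sorted(table):
    print(customer_id, table[customer_id], labels[customer_id])
```

Labels produced this way could then serve as the target column for a classifier such as the ones compared in the abstract. Single-purchase customers are left unlabelled here; how such customers should be treated is a separate modelling decision.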


This item appears in the following Collection(s)

  • MS [234]
