Abstract:
The telecommunications industry is becoming increasingly competitive, making it necessary for businesses to use data-driven insights to develop effective marketing strategies.
One of the most important pieces of customer data is demographic information, such as
gender and age. This information can be used to segment customers into groups with
similar interests and needs, and these segments can then drive more targeted marketing
campaigns. However, demographic information is not always available. In some cases,
customers may choose not to provide their demographic information. In other cases, the
demographic information may be inaccurate. This can make it difficult for businesses to
target their marketing campaigns effectively. This thesis presents a novel methodology
for estimating the demographic attributes (age and gender) of unlabeled telecom
customers based on their individual calling behavior and the topology of the communication graph. The proposed methodology is based on machine learning algorithms,
including k-nearest neighbors (k-NN), support vector machines (SVM), Neighborhood
Components Analysis (NCA), logistic regression, and decision trees. The methodology
was evaluated using a real-world dataset with millions of users. The results showed
that the proposed methodology can accurately predict the gender and age group of unlabeled telecom customers. The findings of this thesis have important implications for
the telecommunications industry. By accurately predicting the gender and age group
of their customers, businesses can develop more targeted marketing campaigns that are
more likely to be successful. This can lead to increased customer retention and revenue.