dc.description.abstract |
In the age of advanced technology, the internet has transformed our lives in many ways. Activities that were traditionally performed offline have largely moved online, and seeking a job or hiring employees is no exception. Most organizations publish job ads that describe their requirements, and job seekers apply for them according to their interests. Online recruitment systems are beneficial, but they can also be harmful if not administered carefully: job seekers risk losing their privacy, their money, and sometimes even their current jobs. It is therefore necessary to detect fake job postings in order to curb online job scams. Recent studies have applied traditional machine learning and deep learning algorithms to classify job postings as fraudulent or non-fraudulent, but more attention must be paid to overcoming the class imbalance problem. Detecting fake job postings is strongly affected by class imbalance, which causes high predictive accuracy for the frequent class and low predictive accuracy for the infrequent class. Therefore,
this research applies the transformer-based deep learning classification models BERT and RoBERTa to fake job postings to precisely classify them as fraudulent or non-fraudulent. To handle the class imbalance problem, ten top-performing SMOTE variants, namely Polynom-fit-SMOTE, ProWSyn, SMOTE-IPF, Lee-SMOTE, SMOBD, G-SMOTE, CCR, LVQ-SMOTE, Assembled-SMOTE, and SMOTE-TomekLinks, were implemented. The benchmark dataset used in recent studies is outdated, containing job postings advertised between 2012 and 2014. Hence, we extended it with more recent job postings, posted between July 2019 and March 2021, for better evaluation.
The models' performance on data balanced by each of the above-mentioned SMOTE variants was analyzed and compared. All implemented approaches performed notably well; however, the classification results show that BERT achieved the highest balanced accuracy of about 90% on data balanced with SMOBD, whereas RoBERTa achieved its highest balanced accuracy of about 83% on data balanced with G-SMOTE. |
en_US |