Abstract:
This research fine-tunes a BERT model and augments it with long short-term memory (LSTM) layers. BERT, already well established for text classification, is combined with additional layers to further improve classification accuracy. Many task-specific BERT models are available, but each is trained for only a single task. This paper demonstrates the classification of four different classes: chats, emails, news, and tweets. The method is straightforward: first, a dataset for each target class is collected and preprocessed with NLP libraries to remove extraneous and redundant data. A data loader is then prepared to feed the training and validation data into a BERT-base model. Before that, the BERT tokenizer is applied, since BERT only accepts input presented in a specific format with the special tokens [CLS] and [SEP].
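As a minimal sketch of this tokenization step, assuming the Hugging Face transformers implementation of the BERT-base tokenizer (the specific library and checkpoint name are assumptions, not stated in the paper):

```python
from transformers import BertTokenizer

# Load a pretrained BERT-base tokenizer (checkpoint name is an assumption).
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# encode_plus adds the [CLS] and [SEP] special tokens, pads or truncates
# to a fixed length, and returns an attention mask for the padded positions.
encoded = tokenizer.encode_plus(
    "Example chat message to classify",
    add_special_tokens=True,    # prepend [CLS], append [SEP]
    max_length=128,
    padding="max_length",
    truncation=True,
    return_attention_mask=True,
    return_tensors="pt",        # PyTorch tensors, ready for a data loader
)

# The first token decodes to [CLS], confirming the required input format.
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"][0][:6].tolist()))
```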
Following a recommended approach, BERT is fine-tuned separately for each target class. The innovation is the introduction of LSTM layers combined with fully connected (FC) layers and, where needed, pooling layers. The output of the fine-tuned BERT, i.e., the BERT embeddings, is used as input to the LSTM model. Although BERT alone performs well, these additional layers provide an edge on more complex datasets. The LSTM layers are applied bidirectionally to capture additional features in the text and classify it more effectively.
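A minimal PyTorch sketch of this BERT-plus-BiLSTM design may make it concrete; the class name, hidden size, and max-pooling choice are illustrative assumptions rather than the paper's exact configuration:

```python
import torch
import torch.nn as nn
from transformers import BertModel

class BertBiLSTMClassifier(nn.Module):
    """Hypothetical sketch: BERT token embeddings -> bidirectional LSTM
    -> pooling -> fully connected classification head."""

    def __init__(self, num_classes, lstm_hidden=256):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-uncased")
        self.lstm = nn.LSTM(
            input_size=self.bert.config.hidden_size,  # 768 for BERT-base
            hidden_size=lstm_hidden,
            batch_first=True,
            bidirectional=True,   # forward and backward passes over tokens
        )
        self.fc = nn.Linear(2 * lstm_hidden, num_classes)

    def forward(self, input_ids, attention_mask):
        # Per-token BERT embeddings: (batch, seq_len, 768)
        outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
        # Bidirectional LSTM over the token sequence
        lstm_out, _ = self.lstm(outputs.last_hidden_state)
        # Max-pool over the sequence dimension, then classify
        pooled, _ = lstm_out.max(dim=1)  # (batch, 2 * lstm_hidden)
        return self.fc(pooled)
```

Because the LSTM runs in both directions, its output concatenates the forward and backward hidden states, which is why the FC layer takes 2 * lstm_hidden inputs.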
Overall, accuracy is improved. For binary classification on limited datasets, introducing LSTM layers produces only a minor change in accuracy, but for multi-class classification on complex data the improvement is noticeable. The achieved accuracies are 99% for chats, 98% for emails, 97% for news, and 85% for tweets (complex data with multi-label sentiment analysis). These accuracies surpass those of the standalone BERT model.