Abstract:
Software bug prediction is an active research area that is widely explored with the
help of machine learning. Because bug prediction is now considered an important part of
the software development life cycle (SDLC), optimized techniques for building
predictive models are needed.
Transfer learning and ensemble learning approaches are currently receiving considerable
research attention. However, previous studies on these approaches are not sufficient.
This paper therefore presents a framework that combines multiple techniques in one
model to explore their joint effectiveness. Feature selection is used to reduce the
dimensionality and redundancy of the features and retain only the relevant ones;
transfer learning is used to train and test the model on different datasets and analyze
how much of the learning carries over to another dataset; and an ensemble method is
used to explore the gain in performance from combining multiple classifiers in a model.
Four NASA and four Promise datasets are used in the study. The results show that
combining different classifiers in the model improves its performance, yielding better
AUC-ROC values. This indicates that an amalgam of techniques such as the one used in
this study (feature selection, transfer learning, and ensemble methods) helps optimize
software bug prediction models and yields a high-performing, useful end model.
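
The three-stage idea summarized above can be illustrated with a minimal sketch. It is
not the framework evaluated in the paper: the abstract does not name the selection
method, the classifiers, or the transfer setup, so the choices here (scikit-learn,
univariate F-test selection with k=8, a soft-voting ensemble of logistic regression,
naive Bayes, and random forest) are all assumptions. The data are synthetic stand-ins
for a source and a target project that share the same metric columns; they are not the
NASA or Promise datasets.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced "defect" data (assumption: ~20% buggy modules).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5, n_redundant=5,
    weights=[0.8, 0.2], random_state=0,
)

# Source project: used for training only.
X_source, y_source = X[:400], y[:400]

# Target project: same columns, perturbed to mimic a distribution shift.
rng = np.random.default_rng(0)
X_target = X[400:] + rng.normal(scale=0.3, size=X[400:].shape)
y_target = y[400:]

# Ensemble stage: soft voting averages the predicted class probabilities.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("nb", GaussianNB()),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    voting="soft",
)

# Feature selection stage: keep the 8 features with the highest F-score.
model = make_pipeline(StandardScaler(), SelectKBest(f_classif, k=8), ensemble)

# Cross-dataset evaluation: fit on the source, score on the unseen target.
model.fit(X_source, y_source)
auc = roc_auc_score(y_target, model.predict_proba(X_target)[:, 1])
print(f"cross-dataset AUC-ROC: {auc:.3f}")
```

Fitting the selector inside the pipeline means the feature subset is chosen from the
source data alone, so no information from the target dataset leaks into training.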