Abstract:
Accurately estimating the effort required to complete user stories is a crucial activity in software development, with a significant impact on both the predictability and the efficiency of the agile development cycle. Nevertheless, many teams struggle with estimation, which leads to postponed timelines and missed deadlines across various domains. Although several methods exist for estimating the effort needed to complete a user story in story points, research has shown that they fail to capture the precise contextual requirements of the user. In addition, the machine learning and deep learning techniques applied in this domain suffer from high time complexity and limited precision. Pre-trained transformers such as GPT have contributed substantially to overcoming these challenges. However, although several GPT variants have performed satisfactorily in prior work, some attention heads may not contribute effectively to story point estimation, leading to suboptimal results. The GPT2++ approach was evaluated on 23,313 issues across 16 open-source software projects and compared against five existing baseline approaches in within-project and cross-project scenarios. GPT2++ achieved an accuracy of 92% and a Mean Absolute Error (MAE) of 1.18. Specifically, it outperformed the baseline approaches in two ways: (1) for within-project estimation, GPT2++ was 23% to 59% more accurate than the baselines; (2) for cross-project estimation, GPT2++ was 3% to 46% more accurate than the baselines.
An ablation study shows that the GPT-2 architecture employed in this approach improves the performance of GPT2++ by 6% to 47% and boosts the F1 score by 87%. These results underscore the progress of AI in Agile story point estimation.