Publication Detail
Investigating the Effectiveness of Clustering for Story Point Estimation
Publication Type: Conference
Authors: Tawosi V, Al-Subaihin A, Sarro F
Publisher: IEEE
Publication date: 03/2022
Status: Accepted
Name of conference: 26th IEEE International Conference on Software Analysis, Evolution and Reengineering
Conference place: Hawaii (Virtual)
Conference start date: 15/03/2022
Conference finish date: 18/03/2022
Language: English
Keywords: Software Effort Estimation, Story Point Estimation, Latent Dirichlet Allocation, Hierarchical Clustering
Abstract
Automated techniques to estimate Story Points (SP) for user stories in agile software development came to the fore a decade ago. Yet the accuracy of state-of-the-art estimation techniques still has room for improvement.
In this paper, we present a new approach for SP estimation, based on analysing textual features of software issues by employing latent Dirichlet allocation (LDA) and clustering. We first use LDA to represent issue reports in a new space of generated topics. We then use hierarchical clustering to agglomerate issues into clusters based on their topic similarities. Next, we build an estimation model from the issues in each cluster. Finally, for a new incoming issue, we find the closest cluster and use that cluster's model to estimate the SP.
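The clustering and estimation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the LDA topic vectors have already been computed (the toy vectors below are made up), and it uses centroid-based merging with Euclidean distance and a per-cluster median as the estimation model. All of these choices, and every function name, are illustrative assumptions rather than details taken from the paper.

```python
from statistics import median


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def agglomerate(vectors, n_clusters):
    """Bottom-up clustering: repeatedly merge the two clusters whose
    centroids are closest (an assumed linkage rule) until n_clusters
    remain. Returns a list of clusters, each a list of indices."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ci = centroid([vectors[k] for k in clusters[i]])
                cj = centroid([vectors[k] for k in clusters[j]])
                d = euclidean(ci, cj)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


def fit(vectors, story_points, n_clusters):
    """Build one (centroid, model) pair per cluster. The 'model' here is
    just the median story point of the cluster, an assumed stand-in."""
    models = []
    for members in agglomerate(vectors, n_clusters):
        center = centroid([vectors[k] for k in members])
        estimate_sp = median(story_points[k] for k in members)
        models.append((center, estimate_sp))
    return models


def estimate(models, vector):
    """Pick the cluster with the nearest centroid and return its estimate."""
    _, sp = min(models, key=lambda m: euclidean(m[0], vector))
    return sp


# Hypothetical two-topic vectors for six past issues and their story points.
topic_vectors = [
    [0.90, 0.10], [0.80, 0.20], [0.85, 0.15],
    [0.10, 0.90], [0.20, 0.80], [0.15, 0.85],
]
story_points = [1, 2, 3, 8, 13, 8]

models = fit(topic_vectors, story_points, n_clusters=2)
print(estimate(models, [0.75, 0.25]))
print(estimate(models, [0.15, 0.85]))
```

In this toy setup a new issue is assigned to whichever topic-space cluster lies nearest, and it inherits that cluster's median story point. A fuller pipeline would derive the topic vectors from issue text with an LDA implementation and could replace the median with any per-cluster estimator.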
Our approach is evaluated on a dataset of 26 open source projects with a total of 31,960 issues and compared against both baselines and state-of-the-art SP estimation techniques.
The results show that the estimation performance of our proposed approach is as good as the state-of-the-art. However, none of these approaches is statistically significantly better than more naive estimators in all cases, so their additional complexity is not justified. We therefore encourage future work to develop alternative strategies for story point estimation.
The experimental data and scripts we used in this work are publicly available to allow for replication and extension.