Please report any queries concerning the funding data grouped in the sections named "Externally Awarded" or "Internally Disbursed" (shown on the profile page) to
your Research Finance Administrator. You can find your Research Finance Administrator at https://www.ucl.ac.uk/finance/research/rs-contacts.php by entering your department.
Please report any queries concerning the student data shown on the profile page to:
Email: portico-services@ucl.ac.uk
Help Desk: http://www.ucl.ac.uk/ras/portico/helpdesk
Publication Detail
Agile Effort Estimation: Have We Solved the Problem Yet? Insights From A Replication Study
Publication Type: Journal article
Authors: Tawosi V, Moussa R, Sarro F
Publication date: 14/01/2022
Keywords: cs.SE, cs.LG, stat.ML
Notes: Accepted for publication in IEEE Transactions on Software Engineering (TSE, 2022)
Abstract
In the last decade, several studies have explored automated techniques to
estimate the effort of agile software development. We perform a close
replication and extension of a seminal work proposing the use of Deep Learning
for Agile Effort Estimation (namely Deep-SE), which has set the
state-of-the-art since. Specifically, we replicate three of the original
research questions aiming at investigating the effectiveness of Deep-SE for
both within-project and cross-project effort estimation. We benchmark Deep-SE
against three baselines (i.e., Random, Mean and Median effort estimators) and a
previously proposed method to estimate agile software project development
effort (dubbed TF/IDF-SVM), as done in the original study. To this end, we use
the data from the original study and an additional dataset of 31,960 issues
mined from TAWOS, as using more data allows us to strengthen the confidence in
the results, and to further mitigate external validity threats. The results of
our replication show that Deep-SE outperforms the Median baseline estimator and
TF/IDF-SVM in only very few cases with statistical significance (8/42 and 9/32
cases, respectively), thus confounding previous findings on the efficacy of
Deep-SE. The two additional RQs revealed that neither augmenting the training
set nor pre-training Deep-SE leads to an improvement of its accuracy and
convergence speed. These results suggest that using semantic similarity is not
enough to differentiate user stories with respect to their story points; thus,
future work has yet to explore and find new techniques and features that obtain
accurate agile software development estimates.
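
The three baselines named in the abstract (Random, Mean and Median effort estimators) can be illustrated with a minimal sketch. The abstract does not define them or name an error metric, so the following are assumptions: the Mean and Median baselines predict the mean or median story points of the training issues for every test issue, the Random baseline predicts the story points of a randomly chosen training issue, and accuracy is measured with mean absolute error. The function names and the story-point values are hypothetical and chosen only for illustration.

```python
import random
import statistics


def baseline_predictions(train_points, n_test, kind, seed=0):
    """Return n_test story-point predictions from a naive baseline.

    Assumed definitions (not taken from the abstract):
      mean   - always predict the mean of the training story points
      median - always predict the median of the training story points
      random - predict the story points of a randomly drawn training issue
    """
    if kind == "mean":
        return [statistics.mean(train_points)] * n_test
    if kind == "median":
        return [statistics.median(train_points)] * n_test
    if kind == "random":
        rng = random.Random(seed)  # seeded so runs are repeatable
        return [rng.choice(train_points) for _ in range(n_test)]
    raise ValueError(f"unknown baseline: {kind}")


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted story points."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical story points, made up for illustration only.
train = [1, 2, 3, 5, 8, 13]
test = [2, 3, 5, 8]

for kind in ("random", "mean", "median"):
    preds = baseline_predictions(train, len(test), kind)
    print(kind, round(mean_absolute_error(test, preds), 2))
```

A learned estimator is only worth its cost if its error is reliably lower than the errors of such baselines, which is the comparison the abstract reports for Deep-SE against the Median baseline.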