Effort Estimation in Agile Software Development Using Autoencoders

Bibliographic Details
Published in: 2023 12th International Conference On Software Process Improvement (CIMPS), pp. 1-7
Main Authors: Sanchez, Eduardo Rodriguez; Santacruz, Eduardo Vazquez; Maceda, Humberto Cervantes
Format: Conference Proceeding
Language: English
Published: IEEE, 18.10.2023
DOI: 10.1109/CIMPS61323.2023.10528839

Summary: Effort estimation is important for correctly planning the use of resources in a software project. Adopting Agile in IT raises business value in both performance and quality. A gap in agile effort estimation is the lack of research combining software engineering models with deep learning techniques. During the planning phase, the team makes an approximate estimate of time and cost based on artifacts and requirements obtained from initial interviews with clients and stakeholders. This paper contributes a hybrid effort estimation model that estimates the completion time and total cost of a project developed with agile methods such as Scrum. The model uses story points, which measure the amount of effort needed to accomplish the project; team velocity, which measures how many units of effort the team completes in a typical Sprint; and category size labels for effort, time and cost. The machine learning techniques used are neural networks, specifically autoencoders and several of their variations. The learning capabilities are assessed through 10-fold cross-validation, and the estimates are compared with the original dataset and with results reported in the literature. This research uses 21 projects developed by six software houses; a set of 42 noisy data points, created with a data augmentation technique, is used for training. Each project has two dependent variables that the algorithm tries to estimate: completion time, measured in days, and total cost, valued in Pakistani rupees. The proposed approach compares using the original data as input against using the original data with the addition of category size labels. The main idea is that every project has three main features: scope, time and cost. Since the current work is based on historical data, the scope is always fixed, and a single project can be estimated according to a hypothetical time or cost, which can be small, medium or large.
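
The data preparation steps named in the summary (noisy data augmentation, small/medium/large size labels, and 10-fold cross-validation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the multiplicative Gaussian noise, the choice of two noisy copies per project, the tertile cut points for the size labels, and the example values are all assumptions made for illustration only.

```python
import random
import statistics


def augment_with_noise(rows, copies=2, rel_sigma=0.05, seed=0):
    """Return `copies` noisy variants of every row.

    Assumption: noise is multiplicative Gaussian with relative standard
    deviation `rel_sigma`; the summary does not say which noise model is used.
    """
    rng = random.Random(seed)
    noisy = []
    for row in rows:
        for _ in range(copies):
            noisy.append([v * (1.0 + rng.gauss(0.0, rel_sigma)) for v in row])
    return noisy


def size_labels(values):
    """Label each value small, medium or large.

    Assumption: the categories are split at the tertiles of the values.
    """
    low, high = statistics.quantiles(values, n=3)
    return [
        "small" if v <= low else "medium" if v <= high else "large"
        for v in values
    ]


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k disjoint test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


# Hypothetical placeholder rows: (story points, velocity, completion days).
# These are made-up values, not the dataset used in the paper.
projects = [[100.0 + 10 * i, 2.0 + 0.1 * i, 40.0 + 3 * i] for i in range(21)]

training_set = augment_with_noise(projects)              # 21 projects, 2 copies each
day_labels = size_labels([row[2] for row in projects])   # one label per project
folds = k_fold_indices(len(training_set), k=10)          # 10 test folds
```

Under these assumptions, two noisy copies of each of the 21 projects give a 42-row training set, and each fold in `folds` serves once as the held-out test set while the remaining nine folds are used for training.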