An investigation of machine learning based prediction systems

Bibliographic Details
Published in	The Journal of Systems and Software, Vol. 53, no. 1, pp. 23-29
Main Authors	Mair, Carolyn; Kadoda, Gada; Lefley, Martin; Phalp, Keith; Schofield, Chris; Shepperd, Martin; Webster, Steve
Format	Journal Article
Language	English
Published	New York: Elsevier Inc, 15.07.2000; Elsevier Sequoia S.A.
Subjects	Case-based reasoning; Information systems; Learning; Machine learning; Machinery; Neural net; Neural networks; Prediction system; Predictions; Rule induction; Software cost model; Software effort estimation; Studies; Canada
Summary:	Traditionally, researchers have used either off-the-shelf models such as COCOMO, or developed local models using statistical techniques such as stepwise regression, to obtain software effort estimates. More recently, attention has turned to a variety of machine learning methods such as artificial neural networks (ANNs), case-based reasoning (CBR) and rule induction (RI). This paper outlines some comparative research into the use of these three machine learning methods to build software effort prediction systems. We briefly describe each method and then apply the techniques to a dataset of 81 software projects derived from a Canadian software house in the late 1980s. We compare the prediction systems in terms of three factors: accuracy, explanatory value and configurability. We show that ANN methods have superior accuracy and that RI methods are least accurate. However, this view is somewhat counteracted by problems with explanatory value and configurability. For example, we found that considerable effort was required to configure the ANN and that this compared very unfavourably with the other techniques, particularly CBR and least squares regression (LSR). We suggest that further work be carried out, both to further explore interaction between the end-user and the prediction system, and also to facilitate configuration, particularly of ANNs.
ISSN:	0164-1212 1873-1228
DOI:	10.1016/S0164-1212(00)00005-4