Feature Engineering for Enhanced Model Performance in Software Effort Estimation

Bibliographic Details
Published in International journal of recent technology and engineering Vol. 8; no. 3; pp. 6053-6063
Main Authors Pillai, Sreekumar P., Radharamanan, Dr. T., Madhukumar, Dr. S.D.
Format Journal Article
Language English
Published 30.09.2019
Abstract Many new methodologies have been proposed in the domain of Software Effort Estimation over the last two decades. They include manual methods based on expert judgment, analogy-based models, parametric models, regression models, machine learning models, and, more recently, deep learning models. All of these except the manual methods depend heavily on data. The lack of quality data in this domain motivates exploring ways to make the most of the sparse data available. Machine learning algorithms depend on domain features, and on the ability of those features to represent and model the domain, to solve a problem, whether it is classification or regression, image or voice synthesis. Research continues into how best to represent a problem through the right feature space. While most traditional research relies on the original dataset and concentrates on feature selection, modern-day approaches explore creating additional features that have the potential to extend the model's representational space.
This research builds on our previous work, exploring the potential to improve Software Effort Estimation accuracy by employing engineered features in addition to the original ones. The features are created manually based on the literature. Through the engineered features, we captured additional representational characteristics such as missingness and the proportion of categorical data available in the dataset. We present the rationale for the features generated and compare the prediction accuracy of a model using the original dataset with that of a model using the engineered dataset.
Our experiments in Feature Engineering are innovative in the Software Estimation domain, and the results are conclusive, establishing its use in predicting Software Effort. We report an improved accuracy of 38% with engineered features at PRED(15), and an 11% improvement at PRED(20).
The quantitative gain in accuracy that we achieved is promising enough for this approach to be adopted as a standard in future research on the subject and in practical applications.
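The abstract names two kinds of engineered feature (missingness and the proportion of categorical data) and reports accuracy with PRED(k). The following is a minimal sketch of how such quantities might be computed; it is not the authors' implementation. The function names, the use of None to mark a missing field, and the treatment of string values as categorical are all assumptions made for illustration. PRED(k) is taken in its usual sense, the share of estimates whose magnitude of relative error is at most k percent, and actual effort values are assumed to be positive.

```python
from typing import Optional, Sequence


def missingness_ratio(row: Sequence[Optional[object]]) -> float:
    """Fraction of fields in one project record that are missing.

    Assumption: a missing field is represented by None.
    """
    if not row:
        return 0.0
    return sum(value is None for value in row) / len(row)


def categorical_ratio(row: Sequence[Optional[object]]) -> float:
    """Fraction of the non-missing fields that are categorical.

    Assumption: categorical values are stored as strings.
    """
    present = [value for value in row if value is not None]
    if not present:
        return 0.0
    return sum(isinstance(value, str) for value in present) / len(present)


def pred(actual: Sequence[float], predicted: Sequence[float], k: float) -> float:
    """PRED(k): share of estimates with |actual - predicted| / actual <= k%.

    Assumption: every actual effort value is strictly positive.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equal in length")
    hits = sum(abs(a - p) / a <= k / 100 for a, p in zip(actual, predicted))
    return hits / len(actual)


if __name__ == "__main__":
    # Hypothetical project record: language, team size (missing), duration, sector.
    record = ["Java", None, 12.0, "Banking"]
    print(missingness_ratio(record), categorical_ratio(record))

    # Hypothetical actual and estimated efforts, in person-hours.
    print(pred([100.0, 200.0, 400.0], [110.0, 260.0, 390.0], 15))
```

In this sketch the two ratios would be appended to each record as extra columns before training, and pred would be evaluated once for the model trained on the original columns and once for the model trained on the extended ones.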
CorporateAuthor Head of Department, School of Management Studies, NITC, Kozhikode, Kerala, India
Professor, Dean (SW), Dept. of Computer Science, NITC, Kozhikode, Kerala, India
Research Scholar, School of Management Studies, National Institute of Technology Calicut, Kozhikode, Kerala, India
DOI 10.35940/ijrte.C5602.098319
Discipline Engineering
EISSN 2277-3878
ISSN 2277-3878
OpenAccessLink https://doi.org/10.35940/ijrte.c5602.098319
PageCount 11