Extremely Accurate Symbolic Regression for Large Feature Problems

Bibliographic Details
Published in Genetic Programming Theory and Practice XII, pp. 109-131
Main Author Korns, Michael F.
Format Book Chapter
Language English
Published Cham, Springer International Publishing, 05.06.2015
Series Genetic and Evolutionary Computation
ISBN 331916029X
9783319160290
ISSN 1932-0167
DOI 10.1007/978-3-319-16030-6_7

Keywords grammar, nonlinear regression, generalized linear models (GLM), basis function, maximum binary tree, Regression Query Language (RQL), island, elitist, constraint, extreme accuracy, stepwise regression, heuristic, ridge regression, polynomial
Abstract As symbolic regression (SR) has advanced into the early stages of commercial exploitation, the poor accuracy of SR, which still plagues even the most advanced commercial packages, has become an issue for early adopters. Users expect the correct formula to be returned, especially in cases with zero noise and only one basis function with minimally complex grammar depth. At a minimum, users expect the response surface of the SR tool to be easily understood, so that they can know a priori on which classes of problems to expect excellent, average, or poor accuracy. Poor or unknown accuracy is a hindrance to greater academic and industrial acceptance of SR tools. In a previous paper, we published a complex algorithm for modern symbolic regression which is extremely accurate for a large class of symbolic regression problems. The class of problems on which SR is extremely accurate was described in detail. This algorithm was extremely accurate, on a single processor, for up to 25 features (columns), and a cloud configuration was used to extend the extreme accuracy to as many as 100 features. While the previous algorithm's extreme accuracy for deep problems with a small number of features (25–100) was an impressive advance, there are many very important academic and industrial SR problems requiring from 100 to 1000 features. In this chapter we extend the previous algorithm such that high accuracy is achieved on a wide range of problems, from 25 to 3000 features, using only a single processor. The class of problems on which the enhanced algorithm is highly accurate is described in detail. A definition of extreme accuracy is provided, and an informal argument for high SR accuracy is outlined in this chapter.
The new enhanced algorithm is tested on a set of representative problems. The enhanced algorithm is shown to be robust, performing well even in the face of testing data containing up to 3000 features.
Author Korns, Michael F.
Author_xml – sequence: 1
  givenname: Michael F.
  surname: Korns
  fullname: Korns, Michael F.
  email: mkorns@korns.com
ContentType Book Chapter
Copyright Springer International Publishing Switzerland 2015
Copyright_xml – notice: Springer International Publishing Switzerland 2015
DOI 10.1007/978-3-319-16030-6_7
Discipline Engineering
Computer Science
EISBN 3319160303
9783319160306
Editor Riolo, Rick
Worzel, William P.
Kotanchek, Mark
Editor_xml – sequence: 1
  givenname: Rick
  surname: Riolo
  fullname: Riolo, Rick
  email: rlriolo@umich.edu
– sequence: 2
  givenname: William P.
  surname: Worzel
  fullname: Worzel, William P.
  email: billwzel@gmail.com
– sequence: 3
  givenname: Mark
  surname: Kotanchek
  fullname: Kotanchek, Mark
  email: mark@evolved-analytics.com
EndPage 131
ISBN 331916029X
9783319160290
ISSN 1932-0167
IngestDate Tue Jul 29 20:29:46 EDT 2025
IsPeerReviewed true
IsScholarly true
Language English
PageCount 23
PublicationCentury 2000
PublicationDate 20150605
PublicationDateYYYYMMDD 2015-06-05
PublicationDate_xml – month: 6
  year: 2015
  text: 20150605
  day: 5
PublicationDecade 2010
PublicationPlace Cham
PublicationPlace_xml – name: Cham
PublicationSeriesTitle Genetic and Evolutionary Computation
PublicationSeriesTitleAlternate Genetic,Evolutionary Computation
PublicationTitle Genetic Programming Theory and Practice XII
PublicationYear 2015
Publisher Springer International Publishing
Publisher_xml – name: Springer International Publishing
RelatedPersons Koza, John R.
Goldberg, David E.
RelatedPersons_xml – sequence: 1
  givenname: David E.
  surname: Goldberg
  fullname: Goldberg, David E.
– sequence: 2
  givenname: John R.
  surname: Koza
  fullname: Koza, John R.
SourceID springer
SourceType Publisher
StartPage 109
SubjectTerms Abstract expression grammars
Genetic algorithms
Grammar template genetic programming
Particle swarm
Symbolic regression
Title Extremely Accurate Symbolic Regression for Large Feature Problems
URI http://link.springer.com/10.1007/978-3-319-16030-6_7