Statistical Learning from a Regression Perspective

Statistical Learning from a Regression Perspective considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. As a firs...

Bibliographic Details
Main Author Berk, Dr. Richard A
Format eBook Book
Language English
Published New York, NY Springer-Verlag 2008
Springer
Springer New York
Edition 1. Aufl.
Series Springer Series in Statistics
Subjects
Online Access Get full text
ISBN 9780387775005
0387775005
ISSN 0172-7397
DOI 10.1007/978-0-387-77501-2

Abstract Statistical Learning from a Regression Perspective considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. As a first approximation, this can be seen as an extension of nonparametric regression. Among the statistical learning procedures examined are bagging, random forests, boosting, and support vector machines. Response variables may be quantitative or categorical. Real applications are emphasized, especially those with practical implications. One important theme is the need to explicitly take into account asymmetric costs in the fitting process. For example, in some situations false positives may be far less costly than false negatives. Another important theme is to not automatically cede modeling decisions to a fitting algorithm. In many settings, subject-matter knowledge should trump formal fitting criteria. Yet another important theme is to appreciate the limitations of one’s data and not apply statistical learning procedures that require more than the data can provide. The material is written for graduate students in the social and life sciences and for researchers who want to apply statistical learning procedures to scientific and policy problems. Intuitive explanations and visual representations are prominent. All of the analyses included are done in R.
AbstractList Statistical Learning from a Regression Perspective considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. As a first approximation, this can be seen as an extension of nonparametric regression. Among the statistical learning procedures examined are bagging, random forests, boosting, and support vector machines. Response variables may be quantitative or categorical. Real applications are emphasized, especially those with practical implications. One important theme is the need to explicitly take into account asymmetric costs in the fitting process. For example, in some situations false positives may be far less costly than false negatives. Another important theme is to not automatically cede modeling decisions to a fitting algorithm. In many settings, subject-matter knowledge should trump formal fitting criteria. Yet another important theme is to appreciate the limitations of one’s data and not apply statistical learning procedures that require more than the data can provide. The material is written for graduate students in the social and life sciences and for researchers who want to apply statistical learning procedures to scientific and policy problems. Intuitive explanations and visual representations are prominent. All of the analyses included are done in R.
This book considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response.
Author Berk, Dr. Richard A
Author_xml – sequence: 1
  fullname: Berk, Dr. Richard A
BackLink https://cir.nii.ac.jp/crid/1130000795624046592$$DView record in CiNii
BookMark eNpVUE1LAzEQjdiKbe0P8LaCeFs7STY7yVFL_YCCouI1pGm2ROtuTVbBf2-268U5zDCP9x4zb0wGdVM7Qk4pXFIAnCmUOeRcYo4ogObsgEwTBgnZA-zw3w5iQEZAkeXIFQ7JmAFIxUopyyMyYsg4lLLAYzKN8Q1SMUGhhBFhz61pfWy9Ndts6Uyofb3JqtB8ZCZ7cpvgYvRNnT26EHfOtv7bnZBhZbbRTf_mhLzeLF7md_ny4fZ-frXMDXJOIbfUgFDIrF0LUBWurEDDgVpnaLVWBp1JvbBQcOBUCi5XpgLHKiWoLIqST8isN467kK5yQa-a5j1qCrrLSKf_NeiUgN5HollSXPSKXWg-v1xstesk1tVtMFu9uJ5zZKhkIp73xNp7bX3XKeVdMKhEyQooSqE6v7OeZr2p141Od3yY8KP7NTkVyH8BO8R1mg
ContentType eBook
Book
Copyright Springer-Verlag New York 2008
Copyright_xml – notice: Springer-Verlag New York 2008
DBID 08O
RYH
DEWEY 519.536
DOI 10.1007/978-0-387-77501-2
DatabaseName ciando eBooks
CiNii Complete
DatabaseTitleList

DeliveryMethod fulltext_linktorsrc
Discipline Mathematics
Applied Sciences
Statistics
Public Health
EISBN 9780387775012
0387775013
Edition 1. Aufl.
1
ExternalDocumentID 138331
EBC372798
BA86940138
ciando27947
GroupedDBID -T.
089
08O
0DA
0DF
0E8
20A
38.
A4I
A4J
AABBV
AABFA
AAJYQ
AATVQ
AAUKK
AAWJG
ABARN
ABBUY
ABCYT
ABGTP
ABIAV
ABMNI
ABQPQ
ACAMX
ACANT
ACDTA
ACDUY
ACKTP
ACLGV
ACPRQ
ADHDZ
ADNMO
ADOGT
ADVEM
ADVHH
AEHEY
AEKFX
AERYV
AETDV
AEZAY
AFOJC
AFPTF
AHNNE
AHWGJ
AJFER
AKHYG
ALMA_UNASSIGNED_HOLDINGS
AMYDA
ATJMZ
AZZ
BBABE
CVDBJ
CZZ
E6I
GEOUK
HF4
IEZ
IWG
MNA
MYL
N2R
NUU
PQQKQ
SBO
SVJCK
TPJZQ
Z5O
Z7R
Z7U
Z7Y
Z81
Z83
Z87
Z88
RYH
RSU
ID FETCH-LOGICAL-a73310-c1a05972ccd509f7bc57a301cea1fd9a7ead9a4c0430318538baf0e2f95184463
ISBN 9780387775005
0387775005
ISSN 0172-7397
IngestDate Tue Oct 01 20:04:43 EDT 2024
Wed Sep 03 04:15:03 EDT 2025
Thu Jun 26 21:04:37 EDT 2025
Wed Sep 03 05:25:34 EDT 2025
IsPeerReviewed false
IsScholarly false
Keywords Technik
Wissen Statistik
LCCN 2008926886
LCCallNum_Ident QA273.A1-274.9
Language English
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-a73310-c1a05972ccd509f7bc57a301cea1fd9a7ead9a4c0430318538baf0e2f95184463
Notes Includes bibliographical references (p. [343]-353) and index
OCLC 272306847
PQID EBC372798
PageCount 377
ParticipantIDs springer_books_10_1007_978_0_387_77501_2
proquest_ebookcentral_EBC372798
nii_cinii_1130000795624046592
ciando_primary_ciando27947
PublicationCentury 2000
PublicationDate 2008
c2008
20080607
PublicationDateYYYYMMDD 2008-01-01
2008-06-07
PublicationDate_xml – year: 2008
  text: 2008
PublicationDecade 2000
PublicationPlace New York, NY
PublicationPlace_xml – name: New York
– name: New York, NY
PublicationSeriesTitle Springer Series in Statistics
PublicationSeriesTitleAlternate Springer Ser.Statistics
PublicationYear 2008
Publisher Springer-Verlag
Springer
Springer New York
Publisher_xml – name: Springer-Verlag
– name: Springer
– name: Springer New York
SSID ssj0000251060
ssj0000615877
Score 1.8470842
Snippet Statistical Learning from a Regression Perspective considers statistical learning applications when interest centers on the conditional distribution of the...
This book considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of...
SourceID springer
proquest
nii
ciando
SourceType Publisher
SubjectTerms Mathematics
Mathematics and Statistics
Methodology of the Social Sciences
Probability Theory and Stochastic Processes
Psychological Methods/Evaluation
Public Health
Regression analysis
Statistical Theory and Methods
Statistics for Social Sciences, Humanities, Law
TableOfContents Intro -- CONTENTS -- Preface -- 1 Statistical Learning as a Regression Problem -- 1.1 Getting Started -- 1.2 Setting the Regression Context -- 1.3 The Transition to Statistical Learning -- 1.3.1 Some Goals of Statistical Learning -- 1.3.2 Statistical Inference -- 1.3.3 Some Initial Cautions -- 1.3.4 A Cartoon Illustration -- 1.3.5 A Taste of Things to Come -- 1.4 Some Initial Concepts and Definitions -- 1.4.1 Overall Goals -- 1.4.2 Loss Functions and Related Concepts -- 1.4.3 Linear Estimators -- 1.4.4 Degrees of Freedom -- 1.4.5 Model Evaluation -- 1.4.6 Model Selection -- 1.4.7 Basis Functions -- 1.5 Some Common Themes -- 1.6 Summary and Conclusions -- 2 Regression Splines and Regression Smoothers -- 2.1 Introduction -- 2.2 Regression Splines -- 2.2.1 Applying a Piecewise Linear Basis -- 2.2.2 Polynomial Regression Splines -- 2.2.3 Natural Cubic Splines -- 2.2.4 B-Splines -- 2.3 Penalized Smoothing -- 2.3.1 Shrinkage -- 2.3.2 Shrinkage and Statistical Inference -- 2.3.3 Shrinkage: So What? -- 2.4 Smoothing Splines -- 2.4.1 An Illustration -- 2.5 Locally Weighted Regression as a Smoother -- 2.5.1 Nearest Neighbor Methods -- 2.5.2 Locally Weighted Regression -- 2.6 Smoothers for Multiple Predictors -- 2.6.1 Smoothing in Two Dimensions -- 2.6.2 The Generalized Additive Model -- 2.7 Smoothers with Categorical Variables -- 2.7.1 An Illustration -- 2.8 Locally Adaptive Smoothers -- 2.9 The Role of Statistical Inference -- 2.9.1 Some Apparent Prerequisites -- 2.9.2 Confidence Intervals -- 2.9.3 Statistical Tests -- 2.9.4 Can Asymptotics Help? -- 2.10 Software Issues -- 2.11 Summary and Conclusions -- 3 Classification and Regression Trees (CART) -- 3.1 Introduction -- 3.2 An Overview of Recursive Partitioning with CART -- 3.2.1 Tree Diagrams -- 3.2.2 Classification and Forecasting with CART -- 3.2.3 Confusion Tables
3.2.4 CART as an Adaptive Nearest Neighbor Method -- 3.2.5 What CART Needs to Do -- 3.3 Splitting a Node -- 3.4 More on Classification -- 3.4.1 Fitted Values and Related Terms -- 3.4.2 An Example -- 3.5 Classification Errors and Costs -- 3.5.1 Default Costs in CART -- 3.5.2 Prior Probabilities and Costs -- 3.6 Pruning -- 3.6.1 Impurity Versus R_α(T) -- 3.7 Missing Data -- 3.7.1 Missing Data with CART -- 3.8 Statistical Inference with CART -- 3.9 Classification Versus Forecasting -- 3.10 Varying the Prior, Costs, and the Complexity Penalty -- 3.11 An Example with Three Response Categories -- 3.12 CART with Highly Skewed Response Distributions -- 3.13 Some Cautions in Interpreting CART Results -- 3.13.1 Model Bias -- 3.13.2 Model Variance -- 3.14 Regression Trees -- 3.14.1 An Illustration -- 3.14.2 Some Extensions -- 3.14.3 Multivariate Adaptive Regression Splines (MARS) -- 3.15 Software Issues -- 3.16 Summary and Conclusions -- 4 Bagging -- 4.1 Introduction -- 4.2 Overfitting and Cross-Validation -- 4.3 Bagging as an Algorithm -- 4.3.1 Margins -- 4.3.2 Out-Of-Bag Observations -- 4.4 Some Thinking on Why Bagging Works -- 4.4.1 More on Instability in CART -- 4.4.2 How Bagging Can Help -- 4.4.3 A Somewhat More Formal Explanation -- 4.5 Some Limitations of Bagging -- 4.5.1 Sometimes Bagging Does Not Help -- 4.5.2 Sometimes Bagging Can Make the Bias Worse -- 4.5.3 Sometimes Bagging Can Make the Variance Worse -- 4.5.4 Losing the Trees for the Forest -- 4.5.5 Bagging Is Only an Algorithm -- 4.6 An Example -- 4.7 Bagging a Quantitative Response Variable -- 4.8 Software Considerations -- 4.9 Summary and Conclusions -- 5 Random Forests -- 5.1 Introduction and Overview -- 5.1.1 Unpacking How Random Forests Works -- 5.2 An Initial Illustration -- 5.3 A Few Formalities -- 5.3.1 What Is a Random Forest?
5.3.2 Margins and Generalization Error for Classifiers in General -- 5.3.3 Generalization Error for Random Forests -- 5.3.4 The Strength of a Random Forest -- 5.3.5 Dependence -- 5.3.6 Implications -- 5.4 Random Forests and Adaptive Nearest Neighbor Methods -- 5.5 Taking Costs into Account in Random Forests -- 5.5.1 A Brief Illustration -- 5.6 Determining the Importance of the Predictors -- 5.6.1 Contributions to the Fit -- 5.6.2 Contributions to Forecasting Skill -- 5.7 Response Functions -- 5.7.1 An Example -- 5.8 The Proximity Matrix -- 5.8.1 Clustering by Proximity Values -- 5.8.2 Using Proximity Values to Impute Missing Data -- 5.8.3 Using Proximities to Detect Outliers -- 5.9 Quantitative Response Variables -- 5.10 Tuning Parameters -- 5.11 An Illustration Using a Binary Response Variable -- 5.12 An Illustration Using a Quantitative Response Variable -- 5.13 Software Considerations -- 5.14 Summary and Conclusions -- 5.14.1 Problem Set 1 -- 5.14.2 Problem Set 2 -- 5.14.3 Problem Set 3 -- 6 Boosting -- 6.1 Introduction -- 6.2 Adaboost -- 6.2.1 A Toy Numerical Example of Adaboost -- 6.2.2 A Statistical Perspective on Adaboost -- 6.3 Why Does Adaboost Work So Well? -- 6.3.1 Least Angle Regression (LARS) -- 6.4 Stochastic Gradient Boosting -- 6.4.1 Tuning Parameters -- 6.4.2 Output -- 6.5 Some Problems and Some Possible Solutions -- 6.5.1 Some Potential Problems -- 6.5.2 Some Potential Solutions -- 6.6 Some Examples -- 6.6.1 A Garden Variety Data Analysis -- 6.6.2 Inmate Misconduct Again -- 6.6.3 Homicides and the Impact of Executions -- 6.6.4 Imputing the Number of Homeless -- 6.6.5 Estimating Conditional Probabilities -- 6.7 Software Considerations -- 6.8 Summary and Conclusions -- 7 Support Vector Machines -- 7.1 A Simple Didactic Illustration -- 7.2 Support Vector Machines in Pictures -- 7.2.1 Support Vector Classifiers
7.2.2 Support Vector Machines -- 7.3 Support Vector Machines in Statistical Notation -- 7.3.1 Support Vector Classifiers -- 7.3.2 Support Vector Machines -- 7.3.3 SVM for Regression -- 7.4 A Classification Example -- 7.4.1 SVM Analysis with a Linear Kernel -- 7.4.2 SVM Analysis with a Radial Kernel -- 7.4.3 Varying Tuning Parameters -- 7.4.4 Taking the Costs of Classification Errors into Account -- 7.4.5 Comparisons to Logistic Regression -- 7.5 Software Considerations -- 7.6 Summary and Conclusions -- 8 Broader Implications and a Bit of Craft Lore -- 8.1 Some Fundamental Limitations of Statistical Learning -- 8.2 Some Assets of Statistical Learning -- 8.2.1 The Attitude Adjustment -- 8.2.2 Selectively Better Performance -- 8.2.3 Improving Other Procedures -- 8.3 Some Practical Suggestions -- 8.3.1 Matching Tools to Jobs -- 8.3.2 Getting to Know Your Software -- 8.3.3 Not Forgetting the Basics -- 8.3.4 Getting Good Data -- 8.3.5 Being Sensitive to Overtuning -- 8.3.6 Matching Your Goals to What You Can Credibly Do -- 8.4 Some Concluding Observations -- References -- Index
Title Statistical Learning from a Regression Perspective
URI http://ebooks.ciando.com/book/index.cfm/bok_id/27947
https://cir.nii.ac.jp/crid/1130000795624046592
https://ebookcentral.proquest.com/lib/[SITE_ID]/detail.action?docID=372798
http://link.springer.com/10.1007/978-0-387-77501-2
hasFullText 1
inHoldings 1
isFullTextHit
isPrint
link http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwnV3NS8MwFA9uXvTkJ87PHjwIoyP9WtqjykSEeVLxFtKkkYFMcfPiX-_vNW1npyB6CU0IDbzXJr_38Xth7DRKjBDact9kKQwUy5WvEhX4uQq50iIyvLy1ZHw7vL6Pbx6Tx8X1XCW7ZJ4P9MePvJL_aBVj0CuxZP-g2ealGMAz9IsWGka7BH6bbsXfoAD6zLmhn2vnRskUUf234snltlJ6eItJ6fg4Lje6ItT3zwctuz9dsvtrv1_LHqRYtAAGKJnM33fHRUKEq6lLU2v2Ybvo9MV5OszI9Eo7rCMEtotVHJOjceO-ItOED3kV_3ZrVvWMmn4dRG7V8a3WxIFPXhvzgjN9Opm08P1SSLo86e82WJfYH5tspZhusfVxU9R2ts3CL1L3aql7JHVPeQupe1-kvsMerkZ3l9d-deWEr-jySu7rQAFwilBrAyhlRa4TobAJ6kIF1mRK4M_LVKypVBoRz6M0V5YXoQVSTWFaR7usO32ZFnvMG8bamhxTuDZxEfAsLIAFjbU5_oYkTHps34lAvrrCItJ1Q2yRoseOIBaMUBtQzBFihC0LABZTJLzHTmqByTKsXuXyytHFZQRMmqU9dlbLUdKEmawrVEMZkksoQ5bKkOH-L4sdsLXFF3jIuvO39-IIWGyeH1ffxScK_ycM
linkProvider ProQuest Ebooks
openUrl ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.title=Statistical+learning+from+a+regression+perspective&rft.au=Berk%2C+Richard+A.&rft.date=2008-01-01&rft.pub=Springer&rft.isbn=9780387775005&rft_id=info:doi/10.1007%2F978-0-387-77501-2&rft.externalDocID=BA86940138
thumbnail_s http://utb.summon.serialssolutions.com/2.0.0/image/custom?url=https%3A%2F%2Fmedia.springernature.com%2Fw306%2Fspringer-static%2Fcover-hires%2Fbook%2F978-0-387-77501-2