Statistical Learning from a Regression Perspective
Statistical Learning from a Regression Perspective considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response.
Main Author | Berk, Richard A. |
---|---|
Format | eBook; Book |
Language | English |
Published | New York, NY : Springer-Verlag, 2008 |
Edition | 1. Aufl. |
Series | Springer Series in Statistics |
Subjects | |
Online Access | Get full text |
ISBN | 9780387775005; 0387775005 |
ISSN | 0172-7397 |
DOI | 10.1007/978-0-387-77501-2 |
Abstract | Statistical Learning from a Regression Perspective considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. As a first approximation, this can be seen as an extension of nonparametric regression. Among the statistical learning procedures examined are bagging, random forests, boosting, and support vector machines. Response variables may be quantitative or categorical. Real applications are emphasized, especially those with practical implications. One important theme is the need to explicitly take into account asymmetric costs in the fitting process. For example, in some situations false positives may be far less costly than false negatives. Another important theme is to not automatically cede modeling decisions to a fitting algorithm. In many settings, subject-matter knowledge should trump formal fitting criteria. Yet another important theme is to appreciate the limitations of one's data and not apply statistical learning procedures that require more than the data can provide. The material is written for graduate students in the social and life sciences and for researchers who want to apply statistical learning procedures to scientific and policy problems. Intuitive explanations and visual representations are prominent. All of the analyses included are done in R. |
---|---|
AbstractList | Statistical Learning from a Regression Perspective considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. As a first approximation, this can be seen as an extension of nonparametric regression. Among the statistical learning procedures examined are bagging, random forests, boosting, and support vector machines. Response variables may be quantitative or categorical. Real applications are emphasized, especially those with practical implications. One important theme is the need to explicitly take into account asymmetric costs in the fitting process. For example, in some situations false positives may be far less costly than false negatives. Another important theme is to not automatically cede modeling decisions to a fitting algorithm. In many settings, subject-matter knowledge should trump formal fitting criteria. Yet another important theme is to appreciate the limitations of one's data and not apply statistical learning procedures that require more than the data can provide. The material is written for graduate students in the social and life sciences and for researchers who want to apply statistical learning procedures to scientific and policy problems. Intuitive explanations and visual representations are prominent. All of the analyses included are done in R. This book considers statistical learning applications when interest centers on the conditional distribution of the response variable, given a set of predictors, and when it is important to characterize how the predictors are related to the response. |
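One theme the abstract highlights is taking asymmetric costs into account: when a false negative is costlier than a false positive, the usual 0.5 classification threshold is no longer the right one. The book's own analyses are in R; the following is only an illustrative Python sketch of that idea, with hypothetical function names (`expected_cost`, `best_threshold`) and made-up toy numbers, not code from the book: it searches for the threshold on predicted probabilities that minimizes average misclassification cost under unequal costs.

```python
# Illustrative sketch (not from the book): cost-sensitive thresholding.
# Each observation is (predicted probability of class 1, true label).

def expected_cost(threshold, scores_labels, cost_fp=1.0, cost_fn=5.0):
    """Average cost of classifying as 1 whenever p >= threshold."""
    total = 0.0
    for p, y in scores_labels:
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 0:
            total += cost_fp   # false positive: cheap here
        elif pred == 0 and y == 1:
            total += cost_fn   # false negative: expensive here
    return total / len(scores_labels)

def best_threshold(scores_labels, cost_fp=1.0, cost_fn=5.0):
    """Grid-search the threshold that minimizes expected cost."""
    candidates = [i / 100 for i in range(1, 100)]
    return min(candidates,
               key=lambda t: expected_cost(t, scores_labels, cost_fp, cost_fn))

# Toy data: (predicted probability, true label)
data = [(0.9, 1), (0.7, 1), (0.4, 1), (0.35, 0), (0.2, 0), (0.1, 0)]
```

With false negatives five times as costly as false positives, the chosen threshold falls below 0.5, so borderline cases are pushed toward the positive class, which is the behavior the abstract's asymmetric-cost theme describes.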
Author | Berk, Dr. Richard A |
Author_xml | – sequence: 1 fullname: Berk, Dr. Richard A |
BackLink | https://cir.nii.ac.jp/crid/1130000795624046592$$DView record in CiNii |
ContentType | eBook Book |
Copyright | Springer-Verlag New York 2008 |
Copyright_xml | – notice: Springer-Verlag New York 2008 |
DEWEY | 519.536 |
DOI | 10.1007/978-0-387-77501-2 |
DatabaseName | ciando eBooks CiNii Complete |
DatabaseTitleList | |
DeliveryMethod | fulltext_linktorsrc |
Discipline | Mathematics Applied Sciences Statistics Public Health |
EISBN | 9780387775012 0387775013 |
Edition | 1. Aufl. |
ExternalDocumentID | 138331 EBC372798 BA86940138 ciando27947 |
ISBN | 9780387775005 0387775005 |
ISSN | 0172-7397 |
IsPeerReviewed | false |
IsScholarly | false |
Keywords | Technik (Technology); Wissen (Knowledge); Statistik (Statistics) |
LCCN | 2008926886 |
LCCallNum_Ident | QA273.A1-274.9 |
Language | English |
LinkModel | OpenURL |
Notes | Includes bibliographical references (p. [343]-353) and index |
OCLC | 272306847 |
PQID | EBC372798 |
PageCount | 377 |
ParticipantIDs | springer_books_10_1007_978_0_387_77501_2 proquest_ebookcentral_EBC372798 nii_cinii_1130000795624046592 ciando_primary_ciando27947 |
PublicationCentury | 2000 |
PublicationDate | 2008 c2008 20080607 |
PublicationDateYYYYMMDD | 2008-01-01 2008-06-07 |
PublicationDate_xml | – year: 2008 text: 2008 |
PublicationDecade | 2000 |
PublicationPlace | New York, NY |
PublicationPlace_xml | – name: New York – name: New York, NY |
PublicationSeriesTitle | Springer Series in Statistics |
PublicationSeriesTitleAlternate | Springer Ser.Statistics |
PublicationYear | 2008 |
Publisher | Springer-Verlag; Springer; Springer New York |
Publisher_xml | – name: Springer-Verlag – name: Springer – name: Springer New York |
SSID | ssj0000251060 ssj0000615877 |
SourceID | springer proquest nii ciando |
SourceType | Publisher |
SubjectTerms | Mathematics Mathematics and Statistics Methodology of the Social Sciences Probability Theory and Stochastic Processes Psychological Methods/Evaluation Public Health Regression analysis Statistical Theory and Methods Statistics for Social Sciences, Humanities, Law |
TableOfContents | Intro -- CONTENTS -- Preface --
1 Statistical Learning as a Regression Problem -- 1.1 Getting Started -- 1.2 Setting the Regression Context -- 1.3 The Transition to Statistical Learning -- 1.3.1 Some Goals of Statistical Learning -- 1.3.2 Statistical Inference -- 1.3.3 Some Initial Cautions -- 1.3.4 A Cartoon Illustration -- 1.3.5 A Taste of Things to Come -- 1.4 Some Initial Concepts and Definitions -- 1.4.1 Overall Goals -- 1.4.2 Loss Functions and Related Concepts -- 1.4.3 Linear Estimators -- 1.4.4 Degrees of Freedom -- 1.4.5 Model Evaluation -- 1.4.6 Model Selection -- 1.4.7 Basis Functions -- 1.5 Some Common Themes -- 1.6 Summary and Conclusions --
2 Regression Splines and Regression Smoothers -- 2.1 Introduction -- 2.2 Regression Splines -- 2.2.1 Applying a Piecewise Linear Basis -- 2.2.2 Polynomial Regression Splines -- 2.2.3 Natural Cubic Splines -- 2.2.4 B-Splines -- 2.3 Penalized Smoothing -- 2.3.1 Shrinkage -- 2.3.2 Shrinkage and Statistical Inference -- 2.3.3 Shrinkage: So What? -- 2.4 Smoothing Splines -- 2.4.1 An Illustration -- 2.5 Locally Weighted Regression as a Smoother -- 2.5.1 Nearest Neighbor Methods -- 2.5.2 Locally Weighted Regression -- 2.6 Smoothers for Multiple Predictors -- 2.6.1 Smoothing in Two Dimensions -- 2.6.2 The Generalized Additive Model -- 2.7 Smoothers with Categorical Variables -- 2.7.1 An Illustration -- 2.8 Locally Adaptive Smoothers -- 2.9 The Role of Statistical Inference -- 2.9.1 Some Apparent Prerequisites -- 2.9.2 Confidence Intervals -- 2.9.3 Statistical Tests -- 2.9.4 Can Asymptotics Help? -- 2.10 Software Issues -- 2.11 Summary and Conclusions --
3 Classification and Regression Trees (CART) -- 3.1 Introduction -- 3.2 An Overview of Recursive Partitioning with CART -- 3.2.1 Tree Diagrams -- 3.2.2 Classification and Forecasting with CART -- 3.2.3 Confusion Tables -- 3.2.4 CART as an Adaptive Nearest Neighbor Method -- 3.2.5 What CART Needs to Do -- 3.3 Splitting a Node -- 3.4 More on Classification -- 3.4.1 Fitted Values and Related Terms -- 3.4.2 An Example -- 3.5 Classification Errors and Costs -- 3.5.1 Default Costs in CART -- 3.5.2 Prior Probabilities and Costs -- 3.6 Pruning -- 3.6.1 Impurity Versus Rα(T) -- 3.7 Missing Data -- 3.7.1 Missing Data with CART -- 3.8 Statistical Inference with CART -- 3.9 Classification Versus Forecasting -- 3.10 Varying the Prior, Costs, and the Complexity Penalty -- 3.11 An Example with Three Response Categories -- 3.12 CART with Highly Skewed Response Distributions -- 3.13 Some Cautions in Interpreting CART Results -- 3.13.1 Model Bias -- 3.13.2 Model Variance -- 3.14 Regression Trees -- 3.14.1 An Illustration -- 3.14.2 Some Extensions -- 3.14.3 Multivariate Adaptive Regression Splines (MARS) -- 3.15 Software Issues -- 3.16 Summary and Conclusions --
4 Bagging -- 4.1 Introduction -- 4.2 Overfitting and Cross-Validation -- 4.3 Bagging as an Algorithm -- 4.3.1 Margins -- 4.3.2 Out-Of-Bag Observations -- 4.4 Some Thinking on Why Bagging Works -- 4.4.1 More on Instability in CART -- 4.4.2 How Bagging Can Help -- 4.4.3 A Somewhat More Formal Explanation -- 4.5 Some Limitations of Bagging -- 4.5.1 Sometimes Bagging Does Not Help -- 4.5.2 Sometimes Bagging Can Make the Bias Worse -- 4.5.3 Sometimes Bagging Can Make the Variance Worse -- 4.5.4 Losing the Trees for the Forest -- 4.5.5 Bagging Is Only an Algorithm -- 4.6 An Example -- 4.7 Bagging a Quantitative Response Variable -- 4.8 Software Considerations -- 4.9 Summary and Conclusions --
5 Random Forests -- 5.1 Introduction and Overview -- 5.1.1 Unpacking How Random Forests Works -- 5.2 An Initial Illustration -- 5.3 A Few Formalities -- 5.3.1 What Is a Random Forest? -- 5.3.2 Margins and Generalization Error for Classifiers in General -- 5.3.3 Generalization Error for Random Forests -- 5.3.4 The Strength of a Random Forest -- 5.3.5 Dependence -- 5.3.6 Implications -- 5.4 Random Forests and Adaptive Nearest Neighbor Methods -- 5.5 Taking Costs into Account in Random Forests -- 5.5.1 A Brief Illustration -- 5.6 Determining the Importance of the Predictors -- 5.6.1 Contributions to the Fit -- 5.6.2 Contributions to Forecasting Skill -- 5.7 Response Functions -- 5.7.1 An Example -- 5.8 The Proximity Matrix -- 5.8.1 Clustering by Proximity Values -- 5.8.2 Using Proximity Values to Impute Missing Data -- 5.8.3 Using Proximities to Detect Outliers -- 5.9 Quantitative Response Variables -- 5.10 Tuning Parameters -- 5.11 An Illustration Using a Binary Response Variable -- 5.12 An Illustration Using a Quantitative Response Variable -- 5.13 Software Considerations -- 5.14 Summary and Conclusions -- 5.14.1 Problem Set 1 -- 5.14.2 Problem Set 2 -- 5.14.3 Problem Set 3 --
6 Boosting -- 6.1 Introduction -- 6.2 Adaboost -- 6.2.1 A Toy Numerical Example of Adaboost -- 6.2.2 A Statistical Perspective on Adaboost -- 6.3 Why Does Adaboost Work So Well? -- 6.3.1 Least Angle Regression (LARS) -- 6.4 Stochastic Gradient Boosting -- 6.4.1 Tuning Parameters -- 6.4.2 Output -- 6.5 Some Problems and Some Possible Solutions -- 6.5.1 Some Potential Problems -- 6.5.2 Some Potential Solutions -- 6.6 Some Examples -- 6.6.1 A Garden Variety Data Analysis -- 6.6.2 Inmate Misconduct Again -- 6.6.3 Homicides and the Impact of Executions -- 6.6.4 Imputing the Number of Homeless -- 6.6.5 Estimating Conditional Probabilities -- 6.7 Software Considerations -- 6.8 Summary and Conclusions --
7 Support Vector Machines -- 7.1 A Simple Didactic Illustration -- 7.2 Support Vector Machines in Pictures -- 7.2.1 Support Vector Classifiers -- 7.2.2 Support Vector Machines -- 7.3 Support Vector Machines in Statistical Notation -- 7.3.1 Support Vector Classifiers -- 7.3.2 Support Vector Machines -- 7.3.3 SVM for Regression -- 7.4 A Classification Example -- 7.4.1 SVM Analysis with a Linear Kernel -- 7.4.2 SVM Analysis with a Radial Kernel -- 7.4.3 Varying Tuning Parameters -- 7.4.4 Taking the Costs of Classification Errors into Account -- 7.4.5 Comparisons to Logistic Regression -- 7.5 Software Considerations -- 7.6 Summary and Conclusions --
8 Broader Implications and a Bit of Craft Lore -- 8.1 Some Fundamental Limitations of Statistical Learning -- 8.2 Some Assets of Statistical Learning -- 8.2.1 The Attitude Adjustment -- 8.2.2 Selectively Better Performance -- 8.2.3 Improving Other Procedures -- 8.3 Some Practical Suggestions -- 8.3.1 Matching Tools to Jobs -- 8.3.2 Getting to Know Your Software -- 8.3.3 Not Forgetting the Basics -- 8.3.4 Getting Good Data -- 8.3.5 Being Sensitive to Overtuning -- 8.3.6 Matching Your Goals to What You Can Credibly Do -- 8.4 Some Concluding Observations -- References -- Index |
Title | Statistical Learning from a Regression Perspective |
URI | http://ebooks.ciando.com/book/index.cfm/bok_id/27947 https://cir.nii.ac.jp/crid/1130000795624046592 https://ebookcentral.proquest.com/lib/[SITE_ID]/detail.action?docID=372798 http://link.springer.com/10.1007/978-0-387-77501-2 |
linkProvider | ProQuest Ebooks |