A finite-sample simulation study of cross validation in tree-based models
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Seoung Bum | - |
dc.contributor.author | Huo, Xiaoming | - |
dc.contributor.author | Tsui, Kwok-Leung | - |
dc.date.accessioned | 2021-09-08T11:23:51Z | - |
dc.date.available | 2021-09-08T11:23:51Z | - |
dc.date.created | 2021-06-11 | - |
dc.date.issued | 2009-12 | - |
dc.identifier.issn | 1385-951X | - |
dc.identifier.uri | https://scholar.korea.ac.kr/handle/2021.sw.korea/118878 | - |
dc.description.abstract | Cross validation (CV) has been widely used for choosing and evaluating statistical models. The main purpose of this study is to explore the behavior of CV in tree-based models. We achieve this goal by an experimental approach, which compares a cross-validated tree classifier with the Bayes classifier that is ideal for the underlying distribution. The main observation of this study is that the difference between the testing and training errors from a cross-validated tree classifier and the Bayes classifier empirically has a linear regression relation. The slope and the coefficient of determination of the regression model can serve as performance measures of a cross-validated tree classifier. Moreover, simulation reveals that the performance of a cross-validated tree classifier depends on the geometry, parameters of the underlying distributions, and sample sizes. Our study can explain, evaluate, and justify the use of CV in tree-based models when the sample size is relatively small. | - |
dc.language | English | - |
dc.language.iso | en | - |
dc.publisher | SPRINGER | - |
dc.subject | SELECTION CRITERIA | - |
dc.subject | PREDICTION RULE | - |
dc.subject | ERROR RATE | - |
dc.subject | BOUNDS | - |
dc.title | A finite-sample simulation study of cross validation in tree-based models | - |
dc.type | Article | - |
dc.contributor.affiliatedAuthor | Kim, Seoung Bum | - |
dc.identifier.doi | 10.1007/s10799-009-0052-7 | - |
dc.identifier.wosid | 000272915200005 | - |
dc.identifier.bibliographicCitation | INFORMATION TECHNOLOGY & MANAGEMENT, v.10, no.4, pp.223 - 233 | - |
dc.relation.isPartOf | INFORMATION TECHNOLOGY & MANAGEMENT | - |
dc.citation.title | INFORMATION TECHNOLOGY & MANAGEMENT | - |
dc.citation.volume | 10 | - |
dc.citation.number | 4 | - |
dc.citation.startPage | 223 | - |
dc.citation.endPage | 233 | - |
dc.type.rims | ART | - |
dc.type.docType | Article; Proceedings Paper | - |
dc.description.journalClass | 1 | - |
dc.description.journalRegisteredClass | scie | - |
dc.description.journalRegisteredClass | scopus | - |
dc.relation.journalResearchArea | Information Science & Library Science | - |
dc.relation.journalResearchArea | Business & Economics | - |
dc.relation.journalWebOfScienceCategory | Information Science & Library Science | - |
dc.relation.journalWebOfScienceCategory | Management | - |
dc.subject.keywordPlus | SELECTION CRITERIA | - |
dc.subject.keywordPlus | PREDICTION RULE | - |
dc.subject.keywordPlus | ERROR RATE | - |
dc.subject.keywordPlus | BOUNDS | - |
dc.subject.keywordAuthor | Cross validation | - |
dc.subject.keywordAuthor | Bayes classifier | - |
dc.subject.keywordAuthor | Trees-based models | - |
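
The kind of experiment the abstract describes can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' design: the one-dimensional two-Gaussian setup, the depth grid, the fold count, the sample sizes, and all function names (`sample`, `fit_tree`, `cv_tree`, `simulate`) are hypothetical choices made for illustration. It reads the abstract as regressing the tree's testing-minus-training error on the same gap for the Bayes rule, and reports the slope and the coefficient of determination.

```python
import random
import statistics

def sample(n, mu, rng):
    """Draw n labeled points: class 0 ~ N(0, 1), class 1 ~ N(mu, 1), equal priors."""
    return [(rng.gauss(mu * y, 1.0), y) for y in (rng.randrange(2) for _ in range(n))]

def fit_tree(data, depth):
    """Greedy 1-D classification tree; a leaf is a label, a node is (threshold, left, right)."""
    data = sorted(data)
    n, ones = len(data), sum(y for _, y in data)
    majority = int(2 * ones >= n)
    if depth == 0 or ones in (0, n):
        return majority
    best_cost, best_i, left_ones = min(ones, n - ones), None, 0
    for i in range(1, n):  # split between data[i-1] and data[i]
        left_ones += data[i - 1][1]
        right_ones = ones - left_ones
        cost = min(left_ones, i - left_ones) + min(right_ones, n - i - right_ones)
        if cost < best_cost:
            best_cost, best_i = cost, i
    if best_i is None:  # no split reduces the training error
        return majority
    threshold = (data[best_i - 1][0] + data[best_i][0]) / 2
    return (threshold,
            fit_tree(data[:best_i], depth - 1),
            fit_tree(data[best_i:], depth - 1))

def predict(tree, x):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree

def err(rule, data):
    """Misclassification rate of a rule x -> label on labeled data."""
    return sum(rule(x) != y for x, y in data) / len(data)

def cv_tree(data, depths=(0, 1, 2, 3), k=5):
    """Choose the tree depth by k-fold CV (ties go to the smaller depth), then refit."""
    folds = [data[i::k] for i in range(k)]
    def cv_error(depth):
        wrong = 0
        for i in range(k):
            train = [p for j, fold in enumerate(folds) if j != i for p in fold]
            tree = fit_tree(train, depth)
            wrong += sum(predict(tree, x) != y for x, y in folds[i])
        return wrong / len(data)
    return fit_tree(data, min(depths, key=cv_error))

def slope_r2(xs, ys):
    """Least-squares slope of ys on xs and the coefficient of determination."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, sxy * sxy / (sxx * syy)

def simulate(reps=30, n=60, m=2000, mu=1.5, seed=0):
    """Regress the tree's (test - train) error gap on the Bayes rule's gap."""
    rng = random.Random(seed)
    bayes = lambda x: int(x > mu / 2)  # Bayes rule for equal priors and variances
    bayes_gap, tree_gap = [], []
    for _ in range(reps):
        train, test = sample(n, mu, rng), sample(m, mu, rng)
        tree = cv_tree(train)
        rule = lambda x, t=tree: predict(t, x)
        bayes_gap.append(err(bayes, test) - err(bayes, train))
        tree_gap.append(err(rule, test) - err(rule, train))
    return slope_r2(bayes_gap, tree_gap)

if __name__ == "__main__":
    slope, r2 = simulate()
    print(f"slope = {slope:.3f}, R^2 = {r2:.3f}")
```

Varying `n`, `mu`, or the class geometry in such a sketch is one way to see how the slope and the coefficient of determination respond to sample size and distribution parameters, which is the dependence the abstract reports.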