Robustness of model averaging methods for the violation of standard linear regression assumptions

Authors
Lee, Yongsu; Song, Juwon
Issue Date
Mar-2021
Publisher
KOREAN STATISTICAL SOC
Keywords
model averaging; stacking regression; Bayesian model averaging; outliers; misspecified distribution
Citation
COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS, v.28, no.2, pp.189 - 204
Indexed
SCOPUS
KCI
OTHER
Journal Title
COMMUNICATIONS FOR STATISTICAL APPLICATIONS AND METHODS
Volume
28
Number
2
Start Page
189
End Page
204
URI
https://scholar.korea.ac.kr/handle/2021.sw.korea/128524
DOI
10.29220/CSAM.2021.28.2.189
ISSN
2287-7843
Abstract
In regression analysis, a single best model is usually selected from among several candidate models. However, it is often useful to combine several candidate models to achieve better performance, especially from a prediction standpoint. Model combining methods such as stacking and Bayesian model averaging (BMA) have been proposed as ways of averaging candidate models. When the candidate models include the true model, BMA is generally expected to perform better than stacking. On the other hand, when the candidate models do not include the true model, stacking is known to outperform BMA. Because stacking and BMA have different properties, it is difficult to determine which method is more appropriate in other situations. In particular, it is not easy to find research papers that compare stacking and BMA when regression model assumptions are violated. Therefore, in this paper, we compare the performance of model averaging methods, as well as that of a single best model, in linear regression analysis when standard linear regression assumptions are violated. Simulations were conducted to compare model averaging methods with linear regression both when the data include outliers and when they do not. We also compared them when the errors follow a non-normal distribution. The model averaging methods were applied to water pollution data, which exhibit strong multicollinearity among variables. The simulation studies showed that stacking tends to perform better than BMA or standard linear regression analysis (including the stepwise selection method) in terms of risk (see (3.1)) or prediction error (see (3.2)) when typical linear regression assumptions are violated.
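The two weighting schemes contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: the simulated data with heavy-tailed (t-distributed) errors, the three nested candidate models, stacking weights obtained by least squares on leave-one-out predictions under a non-negative, sum-to-one constraint, and BMA weights approximated through BIC. All function names are hypothetical.

```python
import numpy as np

def loo_predictions(X, y):
    """Leave-one-out OLS predictions via the hat matrix: y_i - e_i / (1 - h_ii)."""
    H = X @ np.linalg.pinv(X)
    resid = y - H @ y
    return y - resid / (1.0 - np.diag(H))

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def stacking_weights(Z, y, n_iter=2000):
    """Minimise ||y - Z w||^2 over the simplex by projected gradient descent.

    Z holds one column of leave-one-out predictions per candidate model.
    The sum-to-one constraint is an assumption of this sketch.
    """
    L = np.linalg.eigvalsh(Z.T @ Z).max()  # Lipschitz constant of the gradient
    w = np.full(Z.shape[1], 1.0 / Z.shape[1])
    for _ in range(n_iter):
        w = project_simplex(w - Z.T @ (Z @ w - y) / L)
    return w

def bic_weights(designs, y):
    """Approximate BMA weights proportional to exp(-BIC / 2), equal model priors."""
    n = len(y)
    bics = []
    for X in designs:
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = np.sum((y - X @ beta) ** 2)
        bics.append(n * np.log(rss / n) + X.shape[1] * np.log(n))
    bics = np.array(bics)
    w = np.exp(-0.5 * (bics - bics.min()))
    return w / w.sum()

# Simulated data (an assumption of this sketch): heavy-tailed errors violate normality.
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=(n, 3))
y = 1.0 + 2.0 * x[:, 0] - 1.0 * x[:, 1] + rng.standard_t(df=2, size=n)

# Three nested candidate models, each with an intercept column.
ones = np.ones((n, 1))
designs = [np.hstack([ones, x[:, :k]]) for k in (1, 2, 3)]

Z = np.column_stack([loo_predictions(X, y) for X in designs])
w_stack = stacking_weights(Z, y)
w_bma = bic_weights(designs, y)

print("stacking weights:", np.round(w_stack, 3))
print("BIC-based BMA weights:", np.round(w_bma, 3))
```

Stacking chooses its weights from out-of-sample (leave-one-out) predictions, whereas the BIC approximation relies on the in-sample Gaussian likelihood, which is one reason the two can behave differently when the error distribution is misspecified.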
Appears in Collections
College of Political Science & Economics > Department of Statistics > 1. Journal Articles