TY - JOUR
T1 - Alternatives to default shrinkage methods can improve prediction accuracy, calibration, and coverage
T2 - A methods comparison study
AU - van de Wiel, Mark A.
AU - Leday, Gwenaël G. R.
AU - Heymans, Martijn W.
AU - van Zwet, Erik W.
AU - Zwinderman, Ailko H.
AU - Hoogland, Jeroen
N1 - Publisher Copyright:
© The Author(s) 2025. This article is distributed under the terms of the Creative Commons Attribution 4.0 License (https://creativecommons.org/licenses/by/4.0/) which permits any use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access page (https://us.sagepub.com/en-us/nam/open-access-at-sage).
PY - 2025/7
Y1 - 2025/7
N2 - While shrinkage is essential in high-dimensional settings, its use for low-dimensional regression-based prediction has been debated. It reduces variance, often leading to improved prediction accuracy. However, it also inevitably introduces bias, which may harm two other measures of predictive performance: calibration and coverage of confidence intervals. Here, the latter evaluates whether the amount of uncertainty is correctly quantified. Much of the criticism stems from the usage of standard shrinkage methods, such as lasso and ridge with a single, cross-validated penalty. Our aim is to show that readily available alternatives may improve predictive performance, in terms of accuracy, calibration or coverage. We study linear and logistic regression. For linear regression, we use small sample splits of a large, fairly typical epidemiological data set to illustrate that usage of differential ridge penalties for covariate groups may enhance prediction accuracy, while calibration and coverage benefit from additional shrinkage of the penalties. Bayesian hierarchical modeling facilitates the latter, including local shrinkage. In the logistic regression setting, we apply an external simulation to illustrate that local shrinkage may improve calibration with respect to global shrinkage, while providing better prediction accuracy than other solutions, like Firth’s correction. The potential benefits of the alternative shrinkage methods are easily accessible via example implementations in R, including the estimation of multiple penalties. A synthetic copy of the large data set is shared for reproducibility.
KW - Shrinkage
KW - calibration
KW - coverage
KW - prediction
KW - regression
UR - https://www.scopus.com/pages/publications/105007011916
U2 - 10.1177/09622802251338440
DO - 10.1177/09622802251338440
M3 - Article
C2 - 40437980
SN - 0962-2802
VL - 34
SP - 1342
EP - 1355
JO - Statistical methods in medical research
JF - Statistical methods in medical research
IS - 7
M1 - 09622802251338440
ER -