
Using leave-one-out cross validation (LOO) in a multilevel regression and poststratification (MRP) workflow: A cautionary tale.

Kuh, S ; Kennedy, L ; et al.
In: Statistics in medicine, vol. 43 (2024-02-28), issue 5, pp. 953-982
Online academic journal

Title:
Using leave-one-out cross validation (LOO) in a multilevel regression and poststratification (MRP) workflow: A cautionary tale.
Author / Contributor: Kuh, S ; Kennedy, L ; Chen, Q ; Gelman, A
Journal: Statistics in medicine, vol. 43 (2024-02-28), issue 5, pp. 953-982
Publication: Chichester ; New York : Wiley, c1982-, 2024
Media type: academic journal
ISSN: 1097-0258 (electronic)
DOI: 10.1002/sim.9964
Subject terms:
  • Humans
  • United States
  • Nutrition Surveys
  • Bayes Theorem
  • Workflow
  • Computer Simulation
  • Research Design
Other:
  • Indexed in: MEDLINE
  • Language: English
  • Publication Type: Journal Article
  • [Stat Med] 2024 Feb 28; Vol. 43 (5), pp. 953-982. Date of Electronic Publication: 2023 Dec 26.
  • MeSH Terms: Research Design* ; Humans ; United States ; Nutrition Surveys ; Bayes Theorem ; Workflow ; Computer Simulation
  • Grant Information: 5R01AG067149-02 National Institutes of Health
  • Contributed Indexing: Keywords: LOO; MRP; model validation; population estimand; small-area estimation
  • Entry Date(s): Date Created: 20231226 Date Completed: 20240221 Latest Revision: 20240221
  • Update Code: 20240221
