Hello,

the following questions will no doubt reveal some fundamental ignorance, but hopefully you can still help me out.
I'd like to bootstrap a quantity derived from the coefficients of a logistic regression model: the mean difference in predicted probabilities between two groups, where each predict() call uses as its newdata argument a data frame of the same size as the original data frame.

I've got 130,000 rows and 7 columns in my data frame. The glm model uses all variables (as well as two 2-way interactions).

System:
- R version: 2.12.2
- OS: Windows XP Pro, 32-bit
- 3.16 GHz Intel dual-core processor, 2.9 GB RAM

I'm using the boot package to obtain the standard error of this difference, but even with only 10 replications this takes quite a long time: 216 seconds. (Perhaps this is partly due to the inefficiently programmed function underlying my boot() call; I'm also looking into that.)

I wanted to try calculating a BCa bootstrap confidence interval, which as I understand it requires many more replications than a normal-theory interval. Drawing on John Fox's appendix to his "An R Companion to Applied Regression", I was thinking of trying 2000 replications, but this will take several hours to compute on my system (which isn't in itself a major issue).

My questions:
- Let's say I try bootstrapping with 2000 replications. Can I be certain that the memory available to R will be sufficient for this operation?
- (This relates to statistics more generally.) Is it a good idea, in your opinion, to try BCa bootstrapping, or can it be assumed that a normal-theory confidence interval will be a sufficiently good approximation (letting me get away with, say, 500 replications)?

Best,
Esther

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
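
For concreteness, the kind of computation described in the message might look roughly like the minimal sketch below. It is not the poster's actual code: the data are simulated and much smaller than 130,000 rows, and the variable names (group, x1, x2, y), the model formula, and the helper diff_stat are all hypothetical stand-ins. Only boot() and boot.ci() from the boot package, and glm() and predict() from base R, are taken from the post.

```r
library(boot)  # recommended package that ships with R

set.seed(1)

# Simulated stand-in data; every variable name here is hypothetical.
n <- 2000
dat <- data.frame(
  group = factor(sample(c("a", "b"), n, replace = TRUE)),
  x1    = rnorm(n),
  x2    = rnorm(n)
)
eta   <- -0.5 + 0.8 * (dat$group == "b") + 0.5 * dat$x1 - 0.3 * dat$x2
dat$y <- rbinom(n, 1, plogis(eta))

# Bootstrap statistic: refit the logistic model on the resampled rows, then
# predict for all resampled rows twice (everyone set to group "a", then "b")
# and return the difference in mean predicted probability.
diff_stat <- function(data, idx) {
  d   <- data[idx, ]
  fit <- glm(y ~ group * x1 + x2, family = binomial, data = d)
  d_a <- d
  d_a$group <- factor("a", levels = levels(data$group))
  d_b <- d
  d_b$group <- factor("b", levels = levels(data$group))
  mean(predict(fit, newdata = d_b, type = "response")) -
    mean(predict(fit, newdata = d_a, type = "response"))
}

# A small number of replications, purely to keep the sketch quick to run.
b <- boot(dat, diff_stat, R = 200)

# Normal-theory and percentile intervals for the difference.
ci <- boot.ci(b, type = c("norm", "perc"))
print(ci)
```

Timing a call like this with system.time() for a few small values of R should give a rough idea of how the run time scales before committing to 2000 replications.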