Hi folks, I'm trying to estimate bias-corrected percentile (BCP) confidence intervals on a vector from a simple for loop used for resampling. I am attempting to follow steps in Manly, B. 1998. Randomization, bootstrap and monte carlo methods in biology. 2nd edition., p. 48. PDF of the approach/steps should be available here: https://wyocoopunit.box.com/s/9vm4vgmbx5h7um809bvg6u7wr392v6i9 If this is too statsy for the R list, I can try stackexchange or crossval, but the R list provides all kinds of great help so thought I'd try it first. I aware of boot::boot but am hoping to avoid it for the current analysis I am working on (the boot function became quite challenging, for me, for a few reasons).
I cannot figure out where I'm going wrong but the estimates from my attempt at the BCP CI are different enough from other methods that I assume I'm doing something wrong. require(boot) data("mtcars") # 1) Bootstrap 95% CI for R-Squared via boot::boot # statmethods.net/advstats/bootstrapping.html # Function for boot rsq <- function(formula, data, indices) { d <- data[indices,] fit <- lm(formula, data=d) return(summary(fit)$r.square) } # bootstrapping with 1000 replications results <- boot(data=mtcars, statistic=rsq, R=1000, formula = mpg ~ wt + disp) # get 95% confidence interval rsq.ci <- boot.ci(results, type="all") # 2) Bootstrap via for loop and estimate bias-corrected pecentile CI's # Fit LM with all the data lm.obs <- lm(mpg ~ wt + disp, mtcars) # Bootstrap with for loop iterations <- 1000 # # of iterations rsq.out <- vector() # to hold loop output for(i in 1:iterations){ boot.df <- mtcars[sample(nrow(mtcars), nrow(mtcars), replace = TRUE), ] boot.lm <- lm(mpg ~ wt + disp, boot.df) boot.rsq <- summary(boot.lm)$r.squared rsq.out <- c(rsq.out, boot.rsq) } hist(rsq.out) # Bias-corrected percentile CI # Manly, B. 1998. Randomization, bootstrap and monte carlo methods in biology. 2nd edition. # - p. 48 # Proportion of times that the boot estimate exceeds observed estimate mtcar.boot.rsq <- data.frame(boot.rsq = rsq.out, obs.rsq = summary(lm.obs)$r.squared) head(mtcar.boot.rsq) # If boot mean is > observed mean, code 1, otherwise 0 mtcar.boot.rsq$boot.high <- ifelse(mtcar.boot.rsq$boot.rsq > mtcar.boot.rsq$obs.rsq, 1, 0) # mean is the proportion of times boot mean > obs mean mean(mtcar.boot.rsq$boot.high) head(mtcar.boot.rsq) # Use proportion to get Z score, then use that to calculate the new bias-correct # Z score to look up the new proportion to use in quantile() rsq.bc <- quantile(mtcar.boot.rsq$boot.rsq, c(pnorm((2*pnorm(mean(mtcar.boot.rsq$boot.high))) - 1.96), pnorm((2*pnorm(mean(mtcar.boot.rsq$boot.high))) + 1.96))) # ESTIMATES # My attempt at BCP CI (ehhhh) rsq.bc 31.59952% -- 0.7747525 99.97103% -- 0.9034800 # Boot package rsq.ci Intervals : Level Normal Basic 95% ( 0.6690, 0.8659 ) ( 0.6818, 0.8824 ) Level Percentile BCa 95% ( 0.6795, 0.8800 ) ( 0.6044, 0.8511 ) # Simple percentile CI quantile(rsq.out, c(0.025, 0.975)) 2.5% 97.5% 0.6871754 0.8809823 boot.ci(results, type="perc") Level Percentile 95% ( 0.6795, 0.8800 ) I appreciate any advice! Thanks. ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.