Hi all,

Below are the first six rows of my data. In total I have 1269 rows. My goal is to conduct a nonparametric bootstrap with case resampling. I would like to randomly select 100 of the 1269 rows and then bootstrap those 100 rows.
I assume I need to set the seed to conduct this randomization, as with bootstrapping you would get varied results each time the code is run.

##   NAR  SQRTNAR NIC  SQRTNIC
## 1 2.6 1.612452 5.6 2.366432
## 2 8.1 2.846050 9.9 3.146427
## 3 5.7 2.387467 7.1 2.664583
## 4 8.3 2.880972 8.1 2.846050
## 5 7.3 2.701851 9.9 3.146427
## 6 4.9 2.213594 8.6 2.932576

Here is my definition of the DataSummary function.

DataSummary <- function(df, indices){
  sample <- df[indices, ]

  sumry_for_NAR <- summary(sample$NAR)
  nms <- names(sumry_for_NAR)
  nms <- c(nms, 'std')
  out_for_NAR <- c(sumry_for_NAR, sd(sample$NAR))
  names(out_for_NAR) <- nms

  sumry_for_SQRTNAR <- summary(sample$SQRTNAR)
  nms <- names(sumry_for_SQRTNAR)
  nms <- c(nms, 'std')
  out_for_SQRTNAR <- c(sumry_for_SQRTNAR, sd(sample$SQRTNAR))
  names(out_for_SQRTNAR) <- nms

  sumry_for_NIC <- summary(sample$NIC)
  nms <- names(sumry_for_NIC)
  nms <- c(nms, 'std')
  out_for_NIC <- c(sumry_for_NIC, sd(sample$NIC))
  names(out_for_NIC) <- nms

  sumry_for_SQRTNIC <- summary(sample$SQRTNIC)
  nms <- names(sumry_for_SQRTNIC)
  nms <- c(nms, 'std')
  out_for_SQRTNIC <- c(sumry_for_SQRTNIC, sd(sample$SQRTNIC))
  names(out_for_SQRTNIC) <- nms

  OUT <- c(out_for_NAR, out_for_SQRTNAR, out_for_NIC, out_for_SQRTNIC)
  return(OUT)
}

Again, here is my attempt at bootstrapping.

result <- boot(n_data, statistic = DataSummary, R = 100)
result

Per suggestions, would I go with this code to achieve my goal? So, the best reference/resource is the boot help page. I found code through various sites and I got really confused because they were very different from each other.

> set.seed(1007)
>
> x <- rnorm(100)
> y <- x + rnorm(100)
> dat <- data.frame(x, y)
> stat2 <- function(DF, f){
>     model <- lm(y ~ x, data = DF[f,])
>     coef(model)
> }
>
> boot(dat, stat1, R = 100)
> boot(dat, stat2, R = 100)

Bryan Mac
bryanmac...@gmail.com

> On Oct 2, 2016, at 5:37 AM, ruipbarra...@sapo.pt wrote:
>
> Right.
> To see it in action just compare the results of the two calls to boot.
>
> library(boot)
>
> set.seed(1007)
>
> x <- rnorm(100)
> y <- x + rnorm(100)
> dat <- data.frame(x, y)
>
> #Wrong
> stat1 <- function(DF, f){
>     model <- lm(DF$y ~ DF$x, data = DF[f,]) #Doesn't bootstrap DF
>     coef(model)
> }
>
> #Correct
> stat2 <- function(DF, f){
>     model <- lm(y ~ x, data = DF[f,])
>     coef(model)
> }
>
> boot(dat, stat1, R = 100)
> boot(dat, stat2, R = 100)
>
>
> Rui Barradas
>
>
> Quoting peter dalgaard <pda...@gmail.com>:
>
>>> On 01 Oct 2016, at 16:11 , Daniel Nordlund <djnordl...@gmail.com> wrote:
>>>
>>> You haven't told us anything about the structure of your data, or the
>>> definition of the DataSummary function.
>>
>> Yes. Just let me add that a common error with boot() is not to pay attention
>> to the required form of the statistic= function argument. It should depend
>> on the data and a set of indices and (for nonparametric bootstrap) it is the
>> indices that are random.
>>
>> Typical mistakes are to completely ignore the index argument, or to write
>> clumsy code that ignores the data specification, as in
>> coef(lm(df$y~df$x, data=d[f])).
>>
>>
>> --
>> Peter Dalgaard, Professor,
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Office: A 4.23
>> Email: pd....@cbs.dk Priv: pda...@gmail.com
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
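
For what it's worth, the two steps asked about at the top of the thread (pick 100 of the 1269 rows at random, then bootstrap those 100 rows) could be sketched roughly as follows. This is only an illustration under stated assumptions: the data frame here is simulated as a stand-in for the real 1269 rows, and `sub_idx`, `sub_data` and `col_stats` are hypothetical names, not part of the original code. The statistic is a shortened replacement for DataSummary that returns only column means and standard deviations. The point is the shape of the call: subsample once with sample(), then pass a function of (data, indices) to boot().

```r
library(boot)  # recommended package shipped with standard R installations

set.seed(1007)  # one seed makes both the subsample and the resamples reproducible

# Simulated stand-in for the 1269-row data frame (assumption: NOT the real data)
n_data <- data.frame(NAR = runif(1269, 0, 10), NIC = runif(1269, 0, 10))
n_data$SQRTNAR <- sqrt(n_data$NAR)
n_data$SQRTNIC <- sqrt(n_data$NIC)

# Step 1: randomly select 100 of the 1269 rows, without replacement
sub_idx  <- sample(nrow(n_data), size = 100)
sub_data <- n_data[sub_idx, ]

# Step 2: a statistic in the form boot() requires, function(data, indices);
# the indices are what boot() randomizes, so all subsetting goes through them
col_stats <- function(df, indices) {
  d <- df[indices, , drop = FALSE]
  c(mean = colMeans(d), sd = sapply(d, sd))
}

# Case-resampling bootstrap of the 100 selected rows, 100 replicates
result <- boot(sub_data, statistic = col_stats, R = 100)
result
```

Calling set.seed() once before sample() fixes which 100 rows are chosen as well as the bootstrap replicates; a different seed gives a different subsample and different replicates.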