On Nov 21, 2014, at 6:52 AM, ivan wrote:

> I am aware of the fact that bootstrapping produces different CIs with every 
> run. I still believe that there is a difference between both types of 
> procedures. My understanding is that setting "w" in the boot() function 
> influences the "importance" of observations or how the bootstrap selects the 
> observations. I.e, observation i does not have the same probability of being 
> chosen as observation j when "w" is defined in the boot() function. If you 
> return res_boot you will notice that with "w" being set in the boot() 
> function, the function call states "weighted bootstrap". If not, it states 
> "ordinary nonparametric bootstrap". But maybe I am wrong. 

OK. So in the second call w affects the probability of a case being sent to 
the boot-function as well as being used in the boot-function; while with the 
"non-weighted call" the w's are only affecting the individual mean estimates. 
So the second one is different. And as I suggested earlier you never described 
the goals of the investigation or the meaning of the variables. 
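
A minimal sketch of that distinction, reusing the data setup quoted further 
down. The seed value 2014 and the object names ord1, ord2 and wtd are 
assumptions made for illustration, and the argument is spelled out as 
`weights` (the name in boot()'s signature) rather than abbreviated to `w`. 
Fixing the seed before each call makes the two ordinary runs reproduce each 
other exactly, so any remaining difference from the weighted run comes from 
how cases are sampled rather than from the random-number stream:

```r
library(boot)

# Same data setup as in the quoted message below
set.seed(1111)
x <- rnorm(50)
y <- rnorm(50)
wts <- runif(50)
wts <- wts / sum(wts)
dataset <- cbind(x, y, weights = wts)

# Statistic: weighted mean of the differences for the resampled rows i
vw_m_diff <- function(dataset, i) {
  differences <- dataset[i, 1] - dataset[i, 2]
  weighted.mean(x = differences, w = dataset[i, "weights"])
}

# Ordinary bootstrap, run twice from the same seed (2014 is arbitrary)
set.seed(2014)
ord1 <- boot(dataset, statistic = vw_m_diff, R = 1000)
set.seed(2014)
ord2 <- boot(dataset, statistic = vw_m_diff, R = 1000)
identical(ord1$t, ord2$t)  # same seed, same replicates

# Weighted bootstrap: the weights also drive which rows get resampled
set.seed(2014)
wtd <- boot(dataset, statistic = vw_m_diff, R = 1000,
            weights = dataset[, "weights"])

# Compare the spread of the replicates under the two schemes
c(ordinary = sd(ord1$t), weighted = sd(wtd$t))
```
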

   I can tell you that when Davison and Hinkley offered examples of a 
bootstrap for a weighted mean, they compared an analysis where the weighting 
was used only in the inner function (example 3.2, practical 3.14, pp 72, 131 
of their book) with one where the strata parameter was used. But so far I 
don't think you have ever described what sort of weights these actually are. 
In that example the weights were the inverse variances of the sample groups. 
They didn't use a 'weights' parameter in the boot call. I do not know if it 
was part of the S package that was being used at the time.
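
A rough sketch of that kind of comparison, on made-up data rather than the 
data or code from the book: the three groups, their sizes and spreads, the 
seed, and the names dat, ivw_mean, res_plain and res_strat are all 
hypothetical. The weighting lives only inside the statistic, and the `strata` 
argument controls whether resampling is done within groups:

```r
library(boot)

set.seed(42)
# Hypothetical data: three sample groups with different spreads
g <- rep(1:3, times = c(10, 15, 20))
dat <- data.frame(y = rnorm(length(g), mean = 5, sd = c(1, 2, 4)[g]),
                  g = g)

# Weighted mean of the group means; each weight is the inverse of the
# estimated variance of that group's mean (n_g / s_g^2)
ivw_mean <- function(d, i) {
  d <- d[i, ]
  m <- tapply(d$y, d$g, mean)
  v <- tapply(d$y, d$g, var)
  n <- tapply(d$y, d$g, length)
  w <- n / v
  sum(w * m) / sum(w)
}

# Stratified resampling: every replicate keeps the original group sizes
res_strat <- boot(dat, statistic = ivw_mean, R = 500, strata = dat$g)
res_strat
```

Without `strata`, a replicate could by chance contain only one or two cases 
from a small group, which makes the within-group variance unstable or 
undefined.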

I tried to find an example of a weighted bootstrap in V&R 4e but did not see 
one. Prof Ripley is the maintainer of the boot package. In the V&R book, Angelo 
Canty is given the credit for writing the boot package for S. I think you 
should consult the code, first. And you should also look at the `stype` 
parameter where "w" is one option.
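
For orientation, a small sketch of what stype = "w" changes, as I read the 
documentation: the second argument passed to the statistic is then a vector 
of resampling weights rather than a vector of indices. The names d, wmean and 
res_w are made up for this illustration, and this is a different use of 
weights from the `weights` argument of boot():

```r
library(boot)

set.seed(1111)
x <- rnorm(50)
y <- rnorm(50)
d <- x - y

# With stype = "w" the second argument is a weight per observation,
# so the statistic is written as a weighted average of the data
wmean <- function(data, w) sum(data * w) / sum(w)

res_w <- boot(d, statistic = wmean, R = 1000, stype = "w")

# On the original sample all weights are equal, so the observed
# statistic should be the plain mean of the differences
all.equal(res_w$t0, mean(d))
```
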

-- 
David.

> 
> On Thu, Nov 20, 2014 at 8:19 PM, David Winsemius <dwinsem...@comcast.net> 
> wrote:
> 
> On Nov 20, 2014, at 2:23 AM, i.petzev wrote:
> 
> > Hi David,
> >
> > sorry, I was not clear.
> 
> Right. You never were clear about what you wanted and your example was so 
> statistically symmetric that it is still hard to see what is needed. The 
> examples below show CI's that are arguably equivalent. I can be faulted for 
> attempting to provide code that produced a sensible answer to a vague 
> question whose intent I was only guessing at.
> 
> 
> > The difference comes from defining or not defining “w” in the boot() 
> > function. The results with your function and your approach are thus:
> >
> > set.seed(1111)
> > x <- rnorm(50)
> > y <- rnorm(50)
> > weights <- runif(50)
> > weights <- weights / sum(weights)
> > dataset <- cbind(x,y,weights)
> >
> > vw_m_diff <- function(dataset,w) {
> >   differences <- dataset[w,1]-dataset[w,2]
> >   weights <- dataset[w, "weights"]
> >   return(weighted.mean(x=differences, w=weights))
> > }
> > res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000, w=dataset[,3])
> > boot.ci(res_boot)
> >
> > BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
> > Based on 1000 bootstrap replicates
> >
> > CALL :
> > boot.ci(boot.out = res_boot)
> >
> > Intervals :
> > Level      Normal              Basic
> > 95%   (-0.5657,  0.4962 )   (-0.5713,  0.5062 )
> >
> > Level     Percentile            BCa
> > 95%   (-0.6527,  0.4249 )   (-0.5579,  0.5023 )
> > Calculations and Intervals on Original Scale
> >
> > ********************************************************************************************************************
> >
> > However, without defining “w” in the bootstrap function, i.e., running an 
> > ordinary and not a weighted bootstrap, the results are:
> >
> > res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000)
> > boot.ci(res_boot)
> >
> > BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
> > Based on 1000 bootstrap replicates
> >
> > CALL :
> > boot.ci(boot.out = res_boot)
> >
> > Intervals :
> > Level      Normal              Basic
> > 95%   (-0.6265,  0.4966 )   (-0.6125,  0.5249 )
> 
> I hope you are not saying that because those CI's are different there is 
> some meaning in that difference. Bootstrap runs will always differ from 
> each other unless you call set.seed(.) with the same value before each run.
> 
> >
> > Level     Percentile            BCa
> > 95%   (-0.6714,  0.4661 )   (-0.6747,  0.4559 )
> > Calculations and Intervals on Original Scale
> >
> > On 19 Nov 2014, at 17:49, David Winsemius <dwinsem...@comcast.net> wrote:
> >
> >>>> vw_m_diff <- function(dataset,w) {
> >>>>   differences <- dataset[w,1]-dataset[w,2]
> >>>>   weights <- dataset[w, "weights"]
> >>>>   return(weighted.mean(x=differences, w=weights))
> >>>> }
> >
> 
> David Winsemius
> Alameda, CA, USA
> 
> 

David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
