Your post suggests statistical confusion on several levels. But as this concerns statistics, not R, consult your local statistician or post on a statistical help list (e.g. stats.stackexchange.com).
Incidentally, the short answer to your question about the overlap is: why shouldn't they? You will hopefully receive a more informative longer answer from the above suggested (or similar) resources. -- Bert On Thu, May 3, 2012 at 9:37 AM, Malcolm Fairbrother <m.fairbrot...@bristol.ac.uk> wrote: > Dear list, > > I'm a bit perplexed why the 95% confidence bands for the predicted > probabilities for units where x=0 and x=1 overlap in the following instance. > > I've simulated binary data to which I've then fitted a simple logistic > regression model, with one covariate, and the coefficient on x is > statistically significant at the 0.05 level. I've then used two different > methods to generate 95% confidence bands for predicted probabilities, for > each of two possible values of x. Given the result of the model, I expected > the bands not to overlap… but they do. > > Can anyone please explain where I've gone wrong with the following code, OR > why it makes sense for the bands to overlap, despite the statistically > significant beta coefficient? There may be a good statistical reason for > this, but I'm not aware of it. > > Many thanks, > Malcolm Fairbrother > > > n <- 120 > set.seed(030512) > x <- rbinom(n, 1, 0.5) > dat <- within(data.frame(x), ybe <- rbinom(n, 1, plogis(-0.5 + x))) > mod1 <- glm(ybe ~ x, dat, family=binomial) > summary(mod1) # coefficient on x is statistically significant at the 0.05 > level… almost at the 0.01 level > > pred <- predict(mod1, newdata=data.frame(x=c(0,1)), se.fit=T) > with(pred, cbind(low = plogis(fit - 1.96*se.fit), est = plogis(fit), up = > plogis(fit + 1.96*se.fit))) # confidence bands based on SEs > > # simulation-based confidence bands: > sims <- t(replicate(200, coef(glm(simulate(mod1)$sim_1 ~ x, data=dat, > family=binomial)))) > pred0 <- plogis(quantile(sims%*%c(1,0), c(0.025, 0.5, 0.975))) > pred1 <- plogis(quantile(sims%*%c(1,1), c(0.025, 0.5, 0.975))) > rbind(pred0, pred1) > > # the upper bound of the prediction for x=0 is greater than the lower bound > of the prediction for x=1, using both methods > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.