Surely Faraway does not suggest using the Wald statistic in preference to the deviance?

Even if the distribution of the deviance is not exactly chi-square, it appears generally accepted that comparing the difference in deviance to a chi-square distribution is better than using the ratio of beta to se(beta), which is what that "Pr(>|z|)" number is based on.
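
For concreteness, a minimal sketch of the two tests being contrasted
(assuming the fitted model object l from the thread below):

l0 <- update(l, . ~ 1)           # intercept-only (null) model
anova(l0, l, test = "Chisq")     # difference in deviance referred to chi-square
summary(l)$coefficients          # Wald test: beta/se(beta) and its Pr(>|z|)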

Your permutation results look sensible and could conceivably be considered the gold standard.

--
David


On Apr 21, 2009, at 5:31 PM, ehud cohen wrote:

I thought of testing the difference in deviance between the null model
and the fitted model, assuming it is chi-square distributed. However,
Faraway writes that if the outcome is binary, the distribution of the
deviance is far from chi-square.
I've done a permutation test:

N <- 5000  # towards the upper limit, as there are only choose(17, 5) = 6188
           # distinct arrangements of the T/F data I have
dev <- rep(0, N)
for (i in 1:N) {
        l1 <- glm(sample(p) ~ w, family = binomial)
        dev[i] <- l1$deviance
}
print(mean(dev < l$deviance))

and the outcome is 0.005, which is close to the t-test result.

I've repeated the same procedure, computing the statistic from the
z-value in summary(l1) each time instead of the deviance, and got a
comparable result.
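
For reference, a sketch of that z-value variant (same loop as above,
with p, w, l and N as already defined):

z.obs <- summary(l)$coefficients["w", "z value"]
z <- rep(0, N)
for (i in 1:N) {
        l1 <- glm(sample(p) ~ w, family = binomial)
        z[i] <- summary(l1)$coefficients["w", "z value"]
}
print(mean(abs(z) >= abs(z.obs)))  # two-sided permutation p-value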

I think it means that David is right: the Pr(>|z|) in the glm output
does not mean much here. I still don't know exactly what it does mean.

Regarding your suggestion of using car's Anova:

Anova(l)
Anova Table (Type II tests)

Response: p
 LR Chisq Df Pr(>Chisq)
w   9.4008  1   0.002169 **

which is identical to:

pchisq(l$null.deviance - l$deviance, 1, lower.tail = FALSE)

which seems too low - probably due to the binary response.

Would you think the permutation method is appropriate to use in this
case, and could it also be extended to a case with several covariates?
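
In case it helps, here is one way I would imagine extending it
(hypothetical covariates w1 and w2, not part of my data; N as above):
permute the response, refit the full model, and compare the observed
drop in deviance from the null model to its permutation distribution.

l.full <- glm(p ~ w1 + w2, family = binomial)
drop.obs <- l.full$null.deviance - l.full$deviance
drop <- rep(0, N)
for (i in 1:N) {
        li <- glm(sample(p) ~ w1 + w2, family = binomial)
        drop[i] <- li$null.deviance - li$deviance
}
print(mean(drop >= drop.obs))  # permutation p-value for the joint effect of w1 and w2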



On Tue, Apr 21, 2009 at 10:34 PM,  <markle...@verizon.net> wrote:
hi: I would wait for one of the guRus to say something, but my take (take it
with a grain of salt) is that the results are not so contradictory. the test
of the significance of the coefficient in the GLM gives 0.06, and the test
that the means are different gives a p-value of 0.004. a couple of reasons
why this might not be so contradictory:

a) the t-test gives greater significance, but it's not really testing the
same thing. the t-test is only testing that the means of the covariate
differ between the two groups. the glm is testing whether the log odds of
the outcome (pass vs. fail) are linearly related to the covariate.

b) your t-test is a little shaky because it has only got a sample of five in
one of the 2 groups, and I'm not clear on whether it's assuming equal
variances and then pooling (the relevant option for t.test is var.equal =
TRUE; the default is FALSE, i.e. Welch's unequal-variance test).
definitely that's not a large sample size regardless of the pooling issue.
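
for what it's worth, a quick way to see both versions (p and w as in the
original post quoted below):

t.test(w[p], w[!p])                    # Welch, unequal variances (the default)
t.test(w[p], w[!p], var.equal = TRUE)  # pooled, equal-variance t-test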

c) when you test significance in a glm you need to compare the deviance of
the model to the deviance of the nested null model. John Fox's book
describes this, but I don't think it's the same as looking at the
significance in the table output of glm: that's a wald test, not the same
as the deviance comparison (essentially a likelihood ratio test, I think).
with small sample sizes, I think the differences between these various
tests can be large. check out John Fox's text for a nice description of
testing in the generalized linear model framework. you can use Anova from
his car package to do this.
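
something like this sketch, assuming the fitted model object is the l
from the original post below:

library(car)
Anova(l)                  # likelihood-ratio chi-square test for w
anova(l, test = "Chisq")  # same comparison from the sequential deviance table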

hopefully someone else will say something though, because I'd be curious to
see where I'm wrong/right, or to learn something new.
good luck.

On Apr 21, 2009, ehud cohen <ehudco.l...@gmail.com> wrote:

Hi,

We have an experiment with pass/fail outcome, and a continuous
parameter which may contribute to the outcome.

First, we've analyzed it by:

p <- c(F,T,F,F,F,T,T,T,T,T,T,T,F,T,T,T,T)
w <- c(53,67,59,59,53,89,72,56,65,63,62,58,59,72,61,68,63)
l <- glm(p ~ w, family = binomial)
summary(l)

This turned out to be non-significant.

Then, we thought of comparing the parameters of the two groups (passed
vs. failed)

t.test(w[which(p)], w[which(!p)], alternative = "two.sided")

which turned out to be highly significant.

I'd appreciate some insight...

Thanks, Ehud.

David Winsemius, MD
Heritage Laboratories
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
