Re: [R] Categorical Response Query

Greg Snow Tue, 21 Oct 2008 10:22:26 -0700

The second case also needs the argument weights=n (the proportion r/n alone 
does not tell glm how many trials it is based on).  Then all 3 models should 
give the same fit (same coefficients, same predicted probabilities).
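
A minimal sketch of that claim, on simulated data rather than the original 
poster's: the data frames d and g, the values in them, and the simplified 
formula age + sex (instead of age*sex*class) are all assumptions made for 
illustration only.

```r
# Simulated individual trials; names mirror the thread, values are made up.
set.seed(1)
d <- expand.grid(age = c(20, 30, 40), sex = c("F", "M"), rep = 1:10)
d$success <- rbinom(nrow(d), 1, plogis(-1 + 0.05 * d$age))

# Grouped form: r successes out of n trials per age/sex combination.
g <- aggregate(success ~ age + sex, data = d, FUN = sum)
names(g)[names(g) == "success"] <- "r"
g$n <- aggregate(success ~ age + sex, data = d, FUN = length)$success

# Case 1: one row per trial, 0/1 response.
m1 <- glm(success ~ age + sex, family = binomial, data = d)
# Case 2: proportion response, with the number of trials as weights.
m2 <- glm(r / n ~ age + sex, family = binomial, weights = n, data = g)
# Case 3: two-column (successes, failures) response.
m3 <- glm(cbind(r, n - r) ~ age + sex, family = binomial, data = g)

# The coefficients should agree up to the convergence tolerance of glm.
all.equal(coef(m1), coef(m2), tolerance = 1e-6)
all.equal(coef(m1), coef(m3), tolerance = 1e-6)
```

Dropping weights = n from the second call would treat each proportion as if 
it came from a single trial, so the fit would no longer match the other two.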


The differences are subtle and may not be of interest.  Conceptually, ask 
yourself:  did you run 10 trials under a set of conditions (age=x, sex=y, 
class=z) and 9 of them were successes? This is model 2/3.  Or did you run a 
bunch of individual trials and just by chance 10 of them happened to have the 
same conditions (age=x, sex=y, class=z) and 9 of those 10 were successes? This 
is model 1.

The biggest visible difference is in the deviance calculations.  In model 1 
the saturated model can fit every point exactly (since the responses are all 
0 or 1).  In the other 2 the saturated model gives each combination of 
predictors its observed proportion, and these proportions are generally not 
0 or 1.
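
To make the deviance point concrete, here is a small sketch on simulated data 
(again, d, g, the made-up values, and the simplified formula age + sex are 
assumptions for illustration, not the original poster's setup).  It compares 
the residual deviance and residual degrees of freedom of the ungrouped and 
grouped fits, and also checks the standard result that the change in deviance 
between nested models is the same either way, since the two deviances differ 
only by a term that does not depend on the fitted model.

```r
set.seed(1)
d <- expand.grid(age = c(20, 30, 40), sex = c("F", "M"), rep = 1:10)
d$success <- rbinom(nrow(d), 1, plogis(-1 + 0.05 * d$age))

# Grouped counts per age/sex combination.
g <- aggregate(success ~ age + sex, data = d, FUN = sum)
names(g)[names(g) == "success"] <- "r"
g$n <- aggregate(success ~ age + sex, data = d, FUN = length)$success

m1 <- glm(success ~ age + sex, family = binomial, data = d)
m3 <- glm(cbind(r, n - r) ~ age + sex, family = binomial, data = g)

# Residual deviance and degrees of freedom differ between the two forms.
c(ungrouped = deviance(m1), grouped = deviance(m3))
c(ungrouped = df.residual(m1), grouped = df.residual(m3))

# The drop in deviance from adding sex should agree between the two forms.
m1_0 <- update(m1, . ~ . - sex)
m3_0 <- update(m3, . ~ . - sex)
c(ungrouped = deviance(m1_0) - deviance(m1),
  grouped   = deviance(m3_0) - deviance(m3))
```

So a goodness-of-fit reading of the residual deviance depends on which form 
you fit, while likelihood-ratio comparisons between nested models do not.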

The most important difference comes when you decide to extend the model 
(mixed effects, bootstrapping), because the observational unit is different 
between model 1 and models 2 & 3 (I don't know of any differences between 
2 & 3 other than looks/convenience).
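
As a rough sketch of what "observational unit" means for a bootstrap: with 
the ungrouped data a resample draws individual trials, while with the grouped 
data it draws whole predictor combinations.  The data below are simulated, 
and the names d, g, i, j, b1 and b3 are hypothetical; a real bootstrap would 
repeat the draw many times and summarize the results, and which scheme is 
appropriate depends on how the data were actually collected.

```r
set.seed(2)
d <- expand.grid(age = seq(20, 60, by = 5), sex = c("F", "M"), rep = 1:5)
d$success <- rbinom(nrow(d), 1, plogis(-1 + 0.03 * d$age))

# Grouped counts per age/sex combination.
g <- aggregate(success ~ age + sex, data = d, FUN = sum)
names(g)[names(g) == "success"] <- "r"
g$n <- aggregate(success ~ age + sex, data = d, FUN = length)$success

# Model 1 view: resample individual trials (rows of d).
i <- sample(nrow(d), replace = TRUE)
b1 <- coef(glm(success ~ age + sex, family = binomial, data = d[i, ]))

# Models 2/3 view: resample whole age/sex combinations (rows of g),
# so all n trials of a combination enter or leave together.
j <- sample(nrow(g), replace = TRUE)
b3 <- coef(glm(cbind(r, n - r) ~ age + sex, family = binomial, data = g[j, ]))

rbind(trials = b1, combinations = b3)
```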

Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
801.408.8111


> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
> project.org] On Behalf Of andyer weng
> Sent: Monday, October 20, 2008 4:39 PM
> To: r-help@r-project.org
> Subject: Re: [R] Categorical Response Query
>
> Hi all,
>
> I have a question about a categorical response.
>
> I have a data frame containing age, sex, class, success (1=success,
> 0=no success).
> age, sex, and class are the explanatory variables, and success is the
> response variable.  I can also get n (the number of times each age
> occurs) and r (the number of successes at that age).
>
> When I try to fit a regression for these variables, I
> have seen many different cases; I just wonder which one fits my
> situation best.
>
> 1st case,
> xxx.glm<-glm(success~age*sex*class,family=binomial, data=xxx.data)
>
> 2nd case
>
> xxx.glm<-glm(r/n~age*sex*class,family=binomial, data=xxx.data)
>
> 3rd case
>
> xxx.glm<-glm(cbind(r,n-r)~age*sex*class,family=binomial, data=xxx.data)
>
> What is the difference between the above 3 cases? Which one is the best to
> use?
>
> If I don't group the data, can I use the 1st case? If I group the
> data, can I use the 2nd or 3rd case?
>
> please advise.
>
> Cheers.
> Andyer
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

