Re: [R] Logistic regression on aggregate data

Peter Dalgaard Mon, 11 Aug 2008 01:35:41 -0700

Vibhanshu Abhishek wrote:
> Hi,
>
> I want to run a logistic regression on summary data and I was not able to
> find the adequate R function to do so. The data is summarized daily and
> instead of the binary y_t I have n_d and m_d, where n_d is the number of
> instances in which a choice could have been made and m_d is the number of
> instances in which the choice was made. The other characteristics have been
> averaged throughout the day.
>
> sample file:
> impression clicks pos
> 5049 1 16.68251
> 4983 1 16.75457
> 4908 1 16.76956
> 5093 0 16.70803
> 5049 254 2.299663
> 5023 307 2.300418
> 4946 252 2.315609
> 4932 247 2.303933
>
> Thanks for the help.
> Vibhanshu
>   
There are two ways:


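# Read the sample table (copied to the clipboard) into a data frame
d <- read.table("clipboard", header=TRUE)

# 1) response as a proportion, with the number of trials as weights
glm(clicks/impression ~ pos, data=d, family=binomial, weights=impression)
# 2) response as a two-column matrix: events, non-events
glm(cbind(clicks, noclicks=impression - clicks) ~ pos, data=d, family=binomial)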

I.e. the LHS can be a proportion, in which case you need weights (the
number of trials) to tell the difference between 5/10 and 500/1000, or
it can be a matrix with two columns, events and non-events (NB: not the
number of trials!)
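
As a quick check that the two forms describe the same model, here is a minimal sketch that types in the eight sample rows from the question by hand instead of reading them from the clipboard. The object names fit_rate and fit_counts are made up for illustration; everything else is base R.

```r
# Sample data copied from the question (impression = trials, clicks = events)
d <- data.frame(
  impression = c(5049, 4983, 4908, 5093, 5049, 5023, 4946, 4932),
  clicks     = c(1, 1, 1, 0, 254, 307, 252, 247),
  pos        = c(16.68251, 16.75457, 16.76956, 16.70803,
                 2.299663, 2.300418, 2.315609, 2.303933)
)

# Form 1: proportion on the LHS, number of trials passed as weights
fit_rate <- glm(clicks / impression ~ pos, data = d,
                family = binomial, weights = impression)

# Form 2: two-column matrix of (events, non-events) on the LHS
fit_counts <- glm(cbind(clicks, impression - clicks) ~ pos, data = d,
                  family = binomial)

# Compare the estimated coefficients of the two fits
all.equal(coef(fit_rate), coef(fit_counts))
```

The comparison on the last line should report that the coefficients agree up to numerical tolerance, since both calls fit the same binomial likelihood. Dropping weights=impression from the first call would fit a different model (and R warns about non-integer successes), which is the 5/10 versus 500/1000 point.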

I think you'll find this in most textbooks on R...

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])              FAX: (+45) 35327907

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
