Vibhanshu Abhishek wrote:
> Hi,
>
> I want to run a logistic regression on summary data, and I was not able to
> find an adequate R function to do so. The data is summarized daily, and
> instead of the binary y_t I have n_d and m_d, where n_d is the number of
> instances in which a choice could have been made and m_d is the number of
> instances in which the choice was made. The other characteristics have been
> averaged throughout the day.
>
> sample file:
> impression clicks pos
> 5049       1      16.68251
> 4983       1      16.75457
> 4908       1      16.76956
> 5093       0      16.70803
> 5049       254    2.299663
> 5023       307    2.300418
> 4946       252    2.315609
> 4932       247    2.303933
>
> Thanks for the help.
> Vibhanshu
>

There are two ways:

d <- read.table("clipboard", header=TRUE)
glm(clicks/impression ~ pos, d, family=binomial, weights=impression)
glm(cbind(clicks, noclicks=impression - clicks) ~ pos, d, family=binomial)

I.e., the LHS can be a rate, in which case you need weights to tell the
difference between 5/10 and 500/1000, or it can be a matrix with two
columns: events and non-events (NB: not the number of trials!).

I think you'll find this in most textbooks on R...

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - ([EMAIL PROTECTED])                     FAX: (+45) 35327907
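
For anyone who wants to check that the two forms agree, here is a minimal self-contained sketch. It types in the sample data from the question instead of reading the clipboard. The object names fit_rate, fit_counts and same_coef are illustrative only and do not come from the original thread.

```r
# Sample data copied from the question: impressions, clicks, average position.
d <- data.frame(
  impression = c(5049, 4983, 4908, 5093, 5049, 5023, 4946, 4932),
  clicks     = c(1, 1, 1, 0, 254, 307, 252, 247),
  pos        = c(16.68251, 16.75457, 16.76956, 16.70803,
                 2.299663, 2.300418, 2.315609, 2.303933)
)

# Form 1: the response is a proportion; weights supply the number of trials.
fit_rate <- glm(clicks / impression ~ pos, data = d,
                family = binomial, weights = impression)

# Form 2: the response is a two-column matrix of (events, non-events).
fit_counts <- glm(cbind(clicks, impression - clicks) ~ pos, data = d,
                  family = binomial)

# Both forms describe the same binomial likelihood, so the coefficient
# estimates should match up to numerical tolerance.
same_coef <- isTRUE(all.equal(coef(fit_rate), coef(fit_counts),
                              tolerance = 1e-6))
print(same_coef)
```

Calling summary() on either fit should then report the same estimates and standard errors, so the choice between the two forms is mostly a matter of which layout your data already has.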