On Apr 16, 2013, at 22:20, Noah Silverman wrote:

> @Duncan, You make a very good point. Somehow I overlooked that 0 is not
> positive. I guess that rules out the log normal model.
>
> My challenge here is finding the right model for this data. Originally it
> was a nice count of students. Relatively easy to model with a zero inflated
> Poisson model. The resulting residuals seemed reasonable.
>
> However, I was then instructed to change the count of students to a "rate"
> which was calculated as students / population (Each school has its own
> population.) This is now no longer a count variable, but a proportion
> between 0 and 1.
>
> This "rate" (students/population) is no longer Poisson, but is certainly not
> normal either. So, I'm a bit lost as to the appropriate distribution to
> represent it.
>
> Any thoughts?
>

Off the cuff: Could it be more natural to model as a ZIP with log(pop) as an offset on the log-lambda scale?

>
> --
> Noah Silverman, M.S.
> UCLA Department of Statistics
> 8117 Math Sciences Building
> Los Angeles, CA 90095
>
> On Apr 16, 2013, at 12:44 PM, Thomas Lumley <tlum...@uw.edu> wrote:
>
>> On Wed, Apr 17, 2013 at 5:19 AM, Noah Silverman <noahsilver...@ucla.edu>
>> wrote:
>> Hi,
>>
>> I have some data, that when plotted looks very close to a log-normal
>> distribution. My goal is to build a regression model to test how this
>> variable responds to several independent variables.
>>
>> [snip]
>>
>> When I try to build a simple model, I also get an error:
>>
>> l <- glm(y~ x, family=gaussian(link="log"))
>>
>> Error in eval(expr, envir, enclos) : cannot find valid starting values:
>> please specify some
>>
>>
>> Duncan has described the problems with the lognormal. I will just point out
>> that this 'simple model' is not lognormal. It is a model with normal errors
>> and log link, ie.
>>
>> y ~ N(mu, sigma^2)
>> log(mu) = x \beta
>>
>>
>> -thomas
>>
>> --
>> Thomas Lumley
>> Professor of Biostatistics
>> University of Auckland
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com
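
A rough sketch of the offset suggestion above, in case it helps. The idea is to keep the student count as the response and put log(pop) on the log-lambda scale with its coefficient fixed at 1, so that log(lambda) = log(pop) + x beta and the covariates effectively act on the rate lambda/pop. Everything below is an assumption made for illustration: the data are simulated, the names (pop, x, students, zip_nll) are hypothetical, there is a single covariate, and the zero-inflation probability is held constant. The ZIP likelihood is written out and maximised with optim() purely to keep the example in base R; it is not meant as the recommended way to fit such a model.

```r
# Simulated data: all names and parameter values here are made up.
set.seed(1)
n   <- 500
pop <- round(runif(n, 200, 2000))      # school population (the exposure)
x   <- rnorm(n)                        # a single covariate
lambda <- pop * exp(-4 + 0.5 * x)      # log(lambda) = log(pop) + b0 + b1 * x
zero   <- rbinom(n, 1, 0.3)            # 30% structural zeros
students <- ifelse(zero == 1, 0L, rpois(n, lambda))

# Negative log-likelihood of a zero-inflated Poisson with log(pop) as offset.
# par = (b0, b1, g0); plogis(g0) is the zero-inflation probability.
zip_nll <- function(par, y, x, pop) {
  lambda <- exp(log(pop) + par[1] + par[2] * x)   # offset enters with coefficient 1
  p0     <- plogis(par[3])
  ll <- ifelse(y == 0,
               log(p0 + (1 - p0) * exp(-lambda)),          # structural or Poisson zero
               log(1 - p0) + dpois(y, lambda, log = TRUE)) # positive count
  -sum(ll)
}

# Crude starting values: overall log rate, no covariate effect, p0 = 0.5.
start <- c(log(mean(students) / mean(pop)), 0, 0)
fit <- optim(start, zip_nll, y = students, x = x, pop = pop,
             method = "BFGS", control = list(maxit = 500))

fit$par            # estimates of b0, b1 and the logit of the zero-inflation probability
plogis(fit$par[3]) # estimated zero-inflation probability
```

In practice one would probably reach for zeroinfl() in the pscl package instead, which accepts offset(log(pop)) in the count part of the formula, e.g. students ~ x + offset(log(pop)) | 1.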