On Thursday 06 March 2008 (07:03:34), Prof Brian Ripley wrote:
> The only thing you are adding to earlier replies is incorrect:
>
> fitting by least squares does not imply a normal distribution.
Thanks for the clarification; 'implies' is too strong. I should have written 'suggests' or 'is often motivated by'. When I wrote my post, there was only one earlier reply, which mentioned the objective function. The point I (rather clumsily) tried to make is: (a) and (b)+(c) differ insofar as under (a) y may take zero or negative values, while under (b) and (c) y may only take values above zero. So the models do not just differ in the objective function but also in their substantive interpretation, which may help in deciding which is the 'correct' coefficient b. (A small simulated illustration is appended at the very end of this message.)

> For a regression model, least-squares is in various senses optimal when
> the errors are i.i.d. and normal, but it is a reasonable procedure for
> many other situations (but not for modestly long-tailed distributions,
> the point of robust statistics).
>
> Although values from -Inf to +Inf are theoretically possible for a normal,
> it has very little mass in the tails and is often used as a model for
> non-negative quantities (and e.g. the justification of Box-Cox estimation
> relies on this).
>
> On Wed, 5 Mar 2008, Martin Elff wrote:
> > On Wednesday 05 March 2008 (14:53:27), Wolfgang Waser wrote:
> >> Dear all,
> >>
> >> I did a non-linear least squares model fit
> >>
> >> y ~ a * x^b
> >>
> >> (a) > nls(y ~ a * x^b, start=list(a=1,b=1))
> >>
> >> to obtain the coefficients a & b.
> >>
> >> I did the same with the linearized formula, including a linear model
> >>
> >> log(y) ~ log(a) + b * log(x)
> >>
> >> (b) > nls(log10(y) ~ log10(a) + b*log10(x), start=list(a=1,b=1))
> >> (c) > lm(log10(y) ~ log10(x))
> >>
> >> I expected coefficient b to be identical for all three cases. However,
> >> using my dataset, coefficient b was:
> >> (a) 0.912
> >> (b) 0.9794
> >> (c) 0.9794
> >>
> >> Coefficient a also varied between option (a) and (b), 107.2 and 94.7,
> >> respectively.
> >
> > Models (a) and (b) entail different distributions of the dependent
> > variable y and different ranges of values that y may take.
> > (a) implies that y has, conditionally on x, a normal distribution and
> > has a range of feasible values from -Inf to +Inf.
> > (b) and (c) imply that log(y) has a normal distribution, that is,
> > y has a log-normal distribution and can take values from zero to +Inf.
> >
> >> Is this supposed to happen?
> >
> > Given the above considerations, different results with respect to the
> > intercept are definitely to be expected.
> >
> >> Which is the correct coefficient b?
> >
> > That depends - is y strictly non-negative or not ...
> >
> > Just my 20 cents...

-- 
10.0 times 0.1 is hardly ever 1.0 ---- Kernighan and Plauger
-------------------------------------------------
Dr. Martin Elff
Faculty of Social Sciences
LSPWIVS (van Deth)
University of Mannheim
A5, 6
68131 Mannheim
Germany

Phone: +49-621-181-2093
Fax: +49-621-181-2099
E-Mail: [EMAIL PROTECTED]
Web: http://webrum.uni-mannheim.de/sowi/elff/
http://www.sowi.uni-mannheim.de/lspwivs/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
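P.S. A minimal sketch of the point above, using simulated data (Wolfgang's
dataset is not at hand, so the 'true' a and b and the noise level below are
made up). With multiplicative, log-normal errors the log-scale fits (b) and
(c) agree with each other, while the original-scale fit (a), which assumes
additive constant-variance errors and gives most weight to the large-y
observations, will in general return somewhat different coefficients:

## simulate y = a * x^b with multiplicative log-normal noise
set.seed(1)                                   # arbitrary seed
a_true <- 100; b_true <- 1                    # made-up 'true' values
x <- runif(100, 1, 50)
y <- a_true * x^b_true * exp(rnorm(100, sd = 0.3))

## (a) nls on the original scale: additive, constant-variance errors
fit_a <- nls(y ~ a * x^b, start = list(a = 1, b = 1))

## (b) nls on the log10 scale: same error assumption as (c)
fit_b <- nls(log10(y) ~ log10(a) + b * log10(x), start = list(a = 1, b = 1))

## (c) lm on the log10 scale: intercept is log10(a), slope is b
fit_c <- lm(log10(y) ~ log10(x))

coef(fit_a)                                   # typically differs from (b)/(c)
coef(fit_b)
c(a = 10^coef(fit_c)[[1]], b = coef(fit_c)[[2]])   # back-transform intercept

If y were instead generated with additive normal noise, say
y <- a_true * x^b_true + rnorm(100, sd = 20), the comparison reverses:
then (a) is the fit whose error assumption matches the data.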