Toby Marthews wrote: > Dear R-help, > > Here's a simple example of nonlinear curve fitting where nls seems to get > the answer wrong on a very simple exponential fit (my R version 2.7.2). > > Look at this code below for a very basic curve fit using nls to fit to (a) > a logarithmic and (b) an exponential curve. I did the fits using > self-start functions and I compared the results with a more simple fit > using a straight lm() command. > > The logarithmic fit works 100% correctly below. The problem is with the > exponential fit: the nls command gives the wrong values and I have > completely failed to see why this should happen. I can't see any misake in > my self-start function, so why the mismatch?? Even worse, if I give nls a > trivial fit by removing the "#&#&" on line 6, it fails to converge at all > when the 'simpler' method with lm() easily gives the right answer (r=0.03 > and A=5). > > I did the same exp and ln fits using MS Excel and the numbers returned are > exactly as for the 'simpler ways', so nls is definitely in the wrong, but > the self-start is almost identical to the one for the logarithmic fit, > which works perfectly .... > > I can't see my mistake at all. Can anyone help?? > It's in the theory. A least squares fit to log(Y) is not equivalent to a least squares fit to Y. The latter assigns more weight to larger values. Fitting on a log scale is warranted if the error variance after transformation is independent of the magnitude of the response. This is pretty patently wrong for your data: look at plot(log(height)~dbh), which by the way also shows that the exponential model is wrong for these data.
So EXCEL is definitely in the wrong!!!!!! > Thanks, > Toby Marthews > > > traceson=FALSE;maxint=10000;minstepfactor=2^(-30);tolerance=1e-7 > xtxt="DBH in mm";ytxt="Height in m" > dbh=c(0.9,1.0,1.1,1.2,4.8,4.9,5.0,5.1,8.9,9.0,9.1,9.2,11.8,11.9,12.0,12.1,14.9,15.1,15.2,15.5) > height=c(5.770089,5.154794,4.888847,6.356770,14.849109,14.973146,15.423816,14.865038,21.335985,20.477624,20.915823,21.886004,23.714724,24.182210,23.954929,23.784659,25.334360,25.411320,26.218614,25.612478) > #&#&#height=5*exp(0.03*dbh) > > logarithmicInit=function(mCall,LHS,data) { > xy=sortedXyData(mCall[["x"]],LHS,data) > aaa=0.01 #Just guess aaa=0.01 to start the fit > bbb=min(xy$y) #Use the minimum y value as an initial guess > for bbb > value=c(aaa,bbb) > names(value)=mCall[c("aaa","bbb")] > return(value) > } > fmla=as.formula("~(aaa*log(x)+bbb)") > SSlogarithmic=selfStart(fmla,initial=logarithmicInit,parameters=c("aaa","bbb")) > > exponenInit=function(mCall,LHS,data) { > xy=sortedXyData(mCall[["x"]],LHS,data) > r=0.01 #Just guess r=0.01 to start the fit > A=min(xy$y) #Use the minimum y value as an initial guess for A > value=c(r,A) > names(value)=mCall[c("r","A")] > return(value) > } > fmla=as.formula("~(A*exp(r*x))") > SSexponen=selfStart(fmla,initial=exponenInit,parameters=c("r","A")) > > cat("Logarithmic fit (here, the self-start and the 'simpler' way match):\n") > tmp=getInitial(height~SSlogarithmic(dbh,aaa,bbb),data=data.frame(dbh,height)) > cat("(Starting values for the logarithmic fit were: aaa=",tmp[1],"and > bbb=",tmp[2],")\n") > modelfit=nls(height~SSlogarithmic(x=dbh,aaa,bbb),trace=traceson,control=list(maxiter=maxint,tol=tolerance,minFactor=minstepfactor)) > bfaaa=summary(modelfit)$coefficients[1];bfbbb=summary(modelfit)$coefficients[2] > cat("Best logarithmic fit is (",ytxt,")=(aaa*log(",xtxt,")+bbb) with > parameters aaa=",bfaaa,"and bbb=",bfbbb,"\n") > #Produces: Best logarithmic fit is ( Height in m )=(aaa*log( DBH in mm > )+bbb) with parameters aaa= 7.551666 and bbb= 4.594367 > > lnfit=lm(height~log(dbh)) #This is doing the ln fit without a self-start > function > cat("Done another, simpler, way, the best logarithmic fit has parameters: > aaa=",lnfit$coefficients[2],"and bbb=",lnfit$coefficients[1],"\n") > #Produces: Done another, simpler, way, the best logarithmic fit has > parameters: aaa= 7.551666 and bbb= 4.594367 > > cat("Exponential fit (here, the self-start and the 'simpler' way DON'T > match):\n") > tmp=getInitial(height~SSexponen(dbh,r,A),data=data.frame(dbh,height)) > cat("(Starting values for the exponential fit: r=",tmp[1],"and > A=",tmp[2],")\n") > modelfit=nls(height~SSexponen(x=dbh,r,A),trace=traceson,control=list(maxiter=maxint,tol=tolerance,minFactor=minstepfactor)) > bfr=summary(modelfit)$coefficients[1];bfA=summary(modelfit)$coefficients[2] > cat("Best exponential fit is (",ytxt,")=(A*exp(r*(",xtxt,"))) with > parameters r=",bfr,"and A=",bfA,"\n") > #Produces: Best exponential fit is ( Height in m )=(A*exp(r*( DBH in mm > ))) with parameters r= 0.07134055 and A= 9.47907 > > expfit=lm(log(height)~dbh) #This is doing the exp fit without a self-start > function > cat("Done another, simpler, way, the best exponential fit has parameters: > r=",expfit$coefficients[2],"and A=",exp(expfit$coefficients[1]),"\n") > #Produces: Done another, simpler, way, the best exponential fit has > parameters: r= 0.1028805 and A= 6.75045 > > ______________________________________________ > R-help@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > -- O__ ---- Peter Dalgaard Ă˜ster Farimagsgade 5, Entr.B c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K (*) \(*) -- University of Copenhagen Denmark Ph: (+45) 35327918 ~~~~~~~~~~ - ([EMAIL PROTECTED]) FAX: (+45) 35327907 ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.