Re: [R] How to calculate standard error of estimate (S) for my non-linear regression model?

Jeff Newmiller Sat, 26 Sep 2015 13:29:35 -0700

You may not be a statistician, but you should at least learn about thecalculations you are making. You cannot expect to convince others thatyour calculations are right just because "Peter on the internet said theywere right".

To give you a gentle push in this direction, I have reproduced thecalculations on that reference web page using R, so you can get a headstart on understanding how to perform them on your data. (Hint: about allyou should need to do is give your dta.nls to predict and use yourcolumn names instead of X and Y.) Once you have convinced yourself thatthe pre-defined functions are doing what you expect, then you can omit thedo-it-yourself option with confidence in the results.

Please note that no use of attach is included here... that usually ends inunhappiness at some point, so prefer to use the data argument instead.


###
dta <- data.frame( X=c(1,2,3,4,5), Y=c(1,2,1.3,3.75,2.25) )
nrow( dta )
# linear regression
y.lm <- lm( Y~X, data=dta )
# compute predictions from original data
## Be aware that you can also "predict" using a nonlinear regression fit
## Also be aware that you can compute estimates using different data if you
## specify the "newdata" argument... see the help for predict.lm
## ?predict.lm
dta$Yprime <- predict( y.lm )
# ?with
dta$YlessYprime <- with( dta, Y - Yprime )
dta$YlessYprime2 <- with( dta, YlessYprime^2 )

# confirm sums
# ?sum
sum( dta$X )
sum( dta$Y )
sum( dta$Yprime )
sum( dta$YlessYprime )
sum( dta$YlessYprime2 )

# standard error of the estimate for population data
sigma.est <- sqrt( sum( dta$YlessYprime2 ) / nrow( dta ) )
sigma.est

# sd function assumes sample standard deviation, can correct the result if# you want

# ?sd
# ?all.equal

all.equal( sigma.est, sd( dta$YlessYprime ) * sqrt( ( nrow( dta ) - 1 ) /nrow( dta ) ) )


# alternate formulation
SSY <- sum( ( dta$Y - mean( dta$Y ) )^2 )
rho <- with( dta, cor( Y, X ) )
all.equal( sigma.est, sqrt( (1-rho^2)*SSY/nrow(dta) ) )

# when working with a sample...
s.est <- sqrt( sum( dta$YlessYprime2 ) / ( nrow( dta ) - 2 ) )
s.est

####

dta <- data.frame( X=c(1,2,3,4,5), Y=c(1,2,1.3,3.75,2.25) )
nrow( dta )

[1] 5

# linear regression
y.lm <- lm( Y~X, data=dta )
# compute predictions from original data
## Be aware that you can also "predict" using a nonlinear regression fit
## Also be aware that you can compute estimates using different data if you
## specify the "newdata" argument... see the help for predict.lm
## ?predict.lm
dta$Yprime <- predict( y.lm )
# ?with
dta$YlessYprime <- with( dta, Y - Yprime )
dta$YlessYprime2 <- with( dta, YlessYprime^2 )

# confirm sums
# ?sum
sum( dta$X )

[1] 15

sum( dta$Y )

[1] 10.3

sum( dta$Yprime )

[1] 10.3

sum( dta$YlessYprime )

[1] 2.220446e-16

sum( dta$YlessYprime2 )

[1] 2.79075


# standard error of the estimate for population data
sigma.est <- sqrt( sum( dta$YlessYprime2 ) / nrow( dta ) )
sigma.est

[1] 0.7470944

# sd function assumes sample standard deviation, can correct the result# if you want
# ?sd
# ?all.equal
all.equal( sigma.est, sd( dta$YlessYprime ) * sqrt( ( nrow( dta ) - 1 )

/ nrow( dta ) ) )
[1] TRUE


# alternate formulation
SSY <- sum( ( dta$Y - mean( dta$Y ) )^2 )
rho <- with( dta, cor( Y, X ) )
all.equal( sigma.est, sqrt( (1-rho^2)*SSY/nrow(dta) ) )

[1] TRUE


# when working with a sample...
s.est <- sqrt( sum( dta$YlessYprime2 ) / ( nrow( dta ) - 2 ) )
s.est

[1] 0.9644947



On Sat, 26 Sep 2015, Michael Eisenring wrote:

Dear Peter,
Thank you for your answer.
If I look at my summary I see there a Residual standard error: 1394 on 53
degrees of freedom.
This number is very high (the fit of the curve is pretty bad I know but
still...). Are you sure the residual standard error given in the summary is
the same as the one described on this page:
http://onlinestatbook.com/2/regression/accuracy.html
I am basically just looking for a value that describes the goodness of fit
for my non-linear regression model.


This is probably a pretty obvious question, but I am not a statistician and
as you said the terminology is sometimes pretty confusing.
Thanks mike

-----Urspr?ngliche Nachricht-----
Von: peter dalgaard [mailto:pda...@gmail.com]
Gesendet: Samstag, 26. September 2015 01:43
An: Michael Eisenring <michael.eisenr...@gmx.ch>
Cc: r-help@r-project.org
Betreff: Re: [R] How to calculate standard error of estimate (S) for my
non-linear regression model?

This is one area in which terminology in (computational) statistics has gone
a bit crazy. The thing some call "standard error of estimate" is actually
the residual standard deviation in the regression model, not to be confused
with the standard errors that are associated with parameter estimates. In
summary(nls(...)) (and summary(lm()) for that matter), you'll find it as
"residual standard error",  and even that is a bit of a misnomer.

-pd

On 26 Sep 2015, at 07:08 , Michael Eisenring <michael.eisenr...@gmx.ch>

wrote:


Hi all,

I am looking for something that indicates the goodness of fit for my
non linear regression model (since R2 is not very reliable).

I read that the standard error of estimate (also known as standard
error of the regression) is a good alternative.



The standard error of estimate is described on this page (including
the
formula) http://onlinestatbook.com/2/regression/accuracy.html
<https://3c.gmx.net/mail/client/dereferrer?redirectUrl=http%3A%2F%2Fon
linest atbook.com%2F2%2Fregression%2Faccuracy.html>

Unfortunately however, I have no clue how to programm it in R. Does
anyone know and could help me?

Thank you very much.



I added an example of my model and a dput() of my data

#CODE

dta<-read.csv("Regression_exp2.csv",header=T, sep = ",")
attach(dta)      # tells R to do the following analyses on this dataset
head(dta)



# loading packages: analysis of mixed effect models
library(nls2)#model

#Aim: fit equation to data: y~yo+a*(1-b^x) : Two parameter exp. single
rise to the maximum # y =Gossypol (from my data set) x= Damage_cm
(from my data set) #The other 3 parameters are unknown: yo=Intercept,
a= assymptote ans b=slope

plot(Gossypol~Damage_cm, dta)
# Looking at the plot, 0 is a plausible estimate for y0:
# a+y0 is the asymptote, so estimate about 4000; # b is between 0 and
1, so estimate .5 dta.nls <- nls(Gossypol~y0+a*(1-b^Damage_cm), dta,
              start=list(y0=0, a=4000, b=.5))

xval <- seq(0, 10, 0.1)
lines(xval, predict(dta.nls, data.frame(Damage_cm=xval)))
profile(dta.nls, alpha= .05)


summary(dta.nls)







#INPUT

structure(list(Gossypol = c(948.2418407, 1180.171957, 3589.187889,
450.7205451, 349.0864019, 592.3403778, 723.885643, 2005.919344,
720.9785449, 1247.806111, 1079.846532, 1500.863038, 4198.569251,
3618.448997, 4140.242559, 1036.331811, 1013.807628, 2547.326207,
2508.417927, 2874.651764, 1120.955, 1782.864308, 1517.045807,
2287.228752, 4171.427741, 3130.376482, 1504.491931, 6132.876396,
3350.203452, 5113.942098, 1989.576826, 3470.09352, 4576.787021,
4854.985845, 1414.161257, 2608.716056, 910.8879471, 2228.522959,
2952.931863, 5909.068158, 1247.806111, 6982.035521, 2867.610671,
5629.979049, 6039.995102, 3747.076592, 3743.331903, 4274.324792,
3378.151945, 3736.144027, 5654.858696, 5972.926124, 3723.629772,
3322.115942, 3575.043632, 2818.419785), Treatment = structure(c(5L,
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 3L, 3L, 3L, 3L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L,
3L, 3L, 3L, 2L, 2L, 2L, 4L, 2L, 4L, 4L, 2L, 4L, 2L, 2L, 4L, 4L, 4L,
4L, 4L, 4L, 2L), .Label = c("1c_2d", "1c_7d", "3c_2d", "9c_2d", "C"),
class = "factor"), Damage_cm = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0.142, 0.4035, 0.4435, 0.491, 0.4955, 0.578, 0.5895, 0.6925, 0.6965,
0.756, 0.8295, 1.0475, 1.313, 1.516, 1.573, 1.62, 1.8115, 1.8185,
1.8595, 1.989, 2.129, 2.171, 2.3035, 2.411, 2.559, 2.966, 2.974,
3.211, 3.2665, 3.474, 3.51, 3.547, 4.023, 4.409, 4.516, 4.7245, 4.809,
4.9835, 5.568, 5.681, 5.683, 7.272, 8.043, 9.437, 9.7455),
Damage_groups = c(0.278, 1.616, 2.501, 3.401, 4.577, 5.644, 7.272,
8.043, 9.591, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA),
Gossypol_Averaged = c(1783.211, 3244.129, 2866.307, 3991.809,
4468.809, 5121.309, 3723.629772, 3322.115942, 3196.731, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), Groups = c(42006L, 42038L,
42067L, 42099L, 42130L, 42162L, 42193L, 42225L, 42257L, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA)), .Names = c("Gossypol",
"Treatment", "Damage_cm", "Damage_groups", "Gossypol_Averaged",
"Groups"), class = "data.frame", row.names = c(NA, -56L))




        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


--
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000
Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] How to calculate standard error of estimate (S) for my non-linear regression model?

Reply via email to