----------------------------------------
> From: greg.s...@imail.org
> To: ghe...@blm.gov; r-help@r-project.org
> Date: Fri, 4 Feb 2011 17:49:51 -0700
> Subject: Re: [R] Quadratic regression: estimating the maximizing value
>
> No, your approach is not correct. For one you have not taken the covariance 
> between B and C into account, another thing is that the ratio of 2 normal 
> random variables is not necessarily normal, in some cases it can even follow 
> a Cauchy distribution. Also note that with only 1 degree of freedom the 
> Central Limit Theorem is not justified for using normal theory for non normal 
> distributions. So the normal based confidence intervals on the coefficients 
> are only reasonable if you are certain that the true error distribution is 
> normal or nearly normal.


To make this df issue more clear, take some code like this,

> for ( i in 1:4)
+ {
+ Ldat <- data.frame("X"=c(3,7,14,24), "Y"=c(1,5,8,0))
+ Ldat=Ldat[which( 1:4 != i),]
+ LM<-lm(formula = Y ~ X + I(X^2), data = Ldat)
+ print ( DZ(LM$coefficients[2],LM$coefficients[3]))
+ }
       X
13.46512
      X
13.1519
       X
13.11364
     X
14.625

and try to compute "confidence intervals" with 3 data points fitted to 3 
parameters.
What exactly does this mean? The result is exact if you only have 3 points or
that there is really no way to tell until you have a lot more data than 
parameters?


With your original data, look at residuals,

plot(LM) 

and note the residuals are all zero with any 3 data points. 

There is only so much you can do with limited data esp as number of points
approaches number of fitting parameters ( this is a common joke, always have
more parameters than data points ). Whatever reasons you have for expecting
the form to fit and things like measurement noise may be important to include
in your analysis to get anything meaningful out of it. 




>
> Some possibilities that may work for you:
>
> Use the nls function with the curve parameterized to use the vertex point as 
> one of the parameters (still need normality).
>
> Do bootstrapping (with only 4 points you are not going to get much with 
> non-parametric bootstrap, but parametric with some assumptions may work).
>
> Use a Bayesian approach (this may be best if you can find some meaningful 
> prior information).
>
> --
> Gregory (Greg) L. Snow Ph.D.
> Statistical Data Center
> Intermountain Healthcare
> greg.s...@imail.org
> 801.408.8111
>
>
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
> > project.org] On Behalf Of ghe...@blm.gov
> > Sent: Friday, February 04, 2011 2:34 PM
> > To: r-help@r-project.org
> > Subject: [R] Quadratic regression: estimating the maximizing value
> >
> >
> > A bioligist colleague sent me the following data.
> >
> > x Y
> > 3 1
> > 7 5
> > 14 8
> > 24 0
> >
> > (Yes, only four data points.) I don't know much about the
> > application, but apparently there are good empirical
> > reasons to use a quadratic model.
> >
> > The goal is to find the X value which maximizes the
> > response Y, and to find a confidence interval for this X
> > value.
> >
> > Finding the maximizing X value is pretty straightforward:
> >
> > >Ldat <- data.frame("X"=c(3,7,14,24), "Y"=c(1,5,8,0))
> > >(LM<-lm(formula = Y ~ X + I(X^2), data = Ldat))
> > Call:
> > lm(formula = Y ~ X + I(X^2), data = Ldat)
> >
> > Coefficients:
> > (Intercept) X I(X^2)
> > -3.86978 1.77762 -0.06729
> >
> > > DZ<-function(B,C) { (-B)/(2*C) } # Solve d/dx(A + Bx + Cx^2) = 0
> > > DZ(LM$coefficients[2],LM$coefficients[3])
> > X
> > 13.20961
> >
> >
> > To find a confidence interval, I used "confint()".
> > Default confidence level of 95% was not useful; used 80% instead,
> > and then computed DZ with the extreme X and I(X^2) coefficients:
> >
> > >(CI80<-confint(LM,level=0.8))
> >
> > 10 % 90 %
> > (Intercept) -5.6147948 -2.12476138
> > X 1.4476460 2.10759306
> > I(X^2) -0.0790312 -0.05553898
> >
> > > DZ(CI80[2,1],CI80[3,1])
> > [1] 9.1587
> > > DZ(CI80[2,2],CI80[3,2])
> > [1] 18.97400
> >
> > Conclusion: the 80% confidence level for the maximizing X value is
> > included in the range (9.158, 18.974)
> >
> > #################
> > Questions:
> >
> > 1) Is this the right procedure, or totally off base?
> >
> > 2) The coefficient of the "Intercept" is irrelevant to calculating
> > the maximizing value of X. Is there a way to reduce the size of
> > the confidence interval by doing a computation that leaves out this
> > parameter?
> >
> > 3) I believe that confint() indicates the axes of an ellipsoid,
> > rather than the corners of a box, in parameter space;
> > so that the above procedure is (slightly) too conservative.
> >
> > 4) Where are the calculations for confint() documented ?
> >
> > Thanks,
> > George Heine
> >
> > (Embedded image moved to file: pic23206.gif)Please consider the
> > environment
> > before printing this page
> >
> >
> >
> > <>=<>=<>=<>=<>=<>
> > =<>=<>=<>=<>=<>
> >
> > George Heine, PhD
> >
> > Mathematical Analyst
> >
> > National Operations
> > Center
> >
> > Bureau of Land Management
> >
> > voice (303) 236-0099
> >
> > fax (303) 236-1974
> >
> > cell (303) 905-5296
> >
> > <>=<>=<>=<>=<>=<>
> > =<>=<>=<>=<>=<>
> >
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
                                          
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to