Thanks to all of you for your suggestions and comments. I really appreciate it.

Some replies to Dennis' comments:
1) I am not concerned about predicting outside the original range. That would be nonsense anyway considering the physical phenomenon I am modeling. I am, however, concerned that the bootstrapping leads to extremely wide CIs at the extremes of the range when there are few data points. But I guess there is not much I can do about that as long as I rely on bootstrapping?

2) I have made a function that does the interpolation to the requested new x values from the original modeling data to get the residual variance and the model variance. Then it interpolates the combined SDs back to the new x values. See below.

3) I understand that. For this project it is not that important that the final prediction intervals are super accurate, but I need to hit the ballpark. I am only trying to do something that doesn't grossly underestimate the prediction error and doesn't make statisticians lose their lunch at first glance. I also cannot avoid that my data contains erroneous values, and I will need to build many models unsupervised. But the fit should be good enough that I plan to eliminate values outside some multiple of the prediction interval and then re-calculate. And if the model is not good in any range I will throw it out completely.
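
The trim-and-refit idea in point 3 could look roughly like the sketch below. It is written in Python rather than R only to keep it self-contained, and it rests on loud assumptions: a straight-line fit (`fit_line`) stands in for the loess + monoproc model, and "some multiple of the prediction interval" is approximated crudely by `k` times the residual SD. The names `fit_line`, `trim_and_refit`, `k` and `max_rounds` are hypothetical and not taken from the linked script.

```python
import math


def fit_line(xs, ys):
    """Stand-in model (assumption): ordinary least-squares straight line.

    Returns a function that predicts y for a given x.
    """
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def trim_and_refit(xs, ys, fit, k=3.0, max_rounds=5):
    """Drop points whose residual exceeds k residual SDs, refit, and repeat.

    The residual SD uses n - 2 degrees of freedom, which is only right for
    the two-parameter stand-in line used here.
    """
    for _ in range(max_rounds):
        predict = fit(xs, ys)
        resid = [y - predict(x) for x, y in zip(xs, ys)]
        sd = math.sqrt(sum(r ** 2 for r in resid) / (len(resid) - 2))
        keep = [abs(r) <= k * sd for r in resid]
        if all(keep):
            break  # nothing left to eliminate
        xs = [x for x, kp in zip(xs, keep) if kp]
        ys = [y for y, kp in zip(ys, keep) if kp]
    return xs, ys, fit(xs, ys)


# Made-up data: a noisy line with one erroneous value at x = 10.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.1 if i % 2 == 0 else -0.1) for i, x in enumerate(xs)]
ys[10] += 15.0

kept_x, kept_y, predict = trim_and_refit(xs, ys, fit_line)
```

One thing the sketch makes visible: a single bad value inflates the residual SD it is judged against, so with few points a large `k` may never flag anything. That is one reason to iterate, and to check the threshold against the number of points per model.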


Based on the formula from my last message I have made a function that at least gives less optimistic intervals than what I could get with the other methods I have tried. The function and example data can be found here: https://github.com/stanstrup/retpred_shiny/blob/master/retdb_admin/make_predictions_CI_tests.R in case anyone has any comments, suggestions or expletives about my implementation.
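
For readers who don't want to open the script, the general idea of combining the two sources of error can be sketched as follows. This is not the linked function, and the formula from the earlier message is not reproduced here. The sketch assumes that the model variance and the residual variance are independent and simply add, so the combined SD at each original x is sqrt(model_sd^2 + residual_sd^2), which is then linearly interpolated to the new x values. It is in Python rather than R only to keep it self-contained; `combined_sd` and `interp_linear` are hypothetical names and all numbers are invented.

```python
import math
from bisect import bisect_right


def combined_sd(model_sd, residual_sd):
    """Combine per-point SDs, assuming the two variances are independent and add."""
    return [math.sqrt(m ** 2 + r ** 2) for m, r in zip(model_sd, residual_sd)]


def interp_linear(x_new, xs, ys):
    """Linearly interpolate ys (given at strictly increasing xs) to x_new.

    Values outside [xs[0], xs[-1]] are rejected, since predicting outside
    the original range is not intended here.
    """
    if not xs[0] <= x_new <= xs[-1]:
        raise ValueError("x_new is outside the range of the modeling data")
    i = bisect_right(xs, x_new) - 1
    if i == len(xs) - 1:  # x_new equals the last x
        return ys[-1]
    frac = (x_new - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + frac * (ys[i + 1] - ys[i])


# Made-up numbers purely for illustration.
x_orig = [1.0, 2.0, 3.0, 4.0]
model_sd = [0.3, 0.1, 0.1, 0.4]     # e.g. SD of the bootstrapped fits at each x
residual_sd = [0.4, 0.4, 0.4, 0.3]  # e.g. local SD of the residuals at each x

sd_orig = combined_sd(model_sd, residual_sd)
sd_new = [interp_linear(x, x_orig, sd_orig) for x in [1.5, 3.0]]
```

An approximate interval at a new x would then be the fitted value plus or minus some multiple of the interpolated SD. Choosing that multiple (for example from a normal quantile) is a further assumption about the error distribution.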


----------------------
Jan Stanstrup
Postdoc

Metabolomics
Food Quality and Nutrition
Fondazione Edmund Mach



On 08/12/2014 05:40 PM, Bert Gunter wrote:
PI's of what? -- future individual values or mean values?

I assume quantreg provides quantiles for the latter, not the former.
(See ?predict.lm for a terse explanation of the difference). Both are
obtainable from bootstrapping but the details depend on what you are
prepared to assume. Consult references or your local statistician for
help if needed.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
Clifford Stoll




On Tue, Aug 12, 2014 at 8:20 AM, David Winsemius <dwinsem...@comcast.net> wrote:
On Aug 12, 2014, at 12:23 AM, Jan Stanstrup wrote:

Hi,

I am trying to find a way to estimate prediction intervals (PI) for a monotonic 
loess curve using bootstrapping.

At the moment my approach is to use the boot function from the boot package to 
bootstrap my loess model, which consists of loess + monoproc from the monoproc 
package (to force the fit to be monotonic which gives me much improved results 
with my particular data). The output from the monoproc package is simply the 
fitted y values at each x-value.
I then use boot.ci (again from the boot package) to get confidence intervals. The problem 
is that this gives me confidence intervals (CI) for the "fit" (is there a 
proper way to specify this?) and not a prediction interval. The interval is thus way too 
optimistic to give me an idea of the confidence interval of a predicted value.

For linear models predict.lm can give PI instead of CI by setting interval = 
"prediction". Further discussion of that here:
http://stats.stackexchange.com/questions/82603/understanding-the-confidence-band-from-a-polynomial-regression
http://stats.stackexchange.com/questions/44860/how-to-prediction-intervals-for-linear-regression-via-bootstrapping.

However I don't see a way to do that for boot.ci. Does there exist a way to get 
PIs after bootstrapping? If some sample code is required I am more than happy 
to supply it but I thought the question was general enough to be understandable 
without it.

Why not use the quantreg package to estimate the quantiles of interest to you? 
That way you would not be depending on Normal theory assumptions which you 
apparently don't trust. I've used it with the `cobs` function from the package 
of the same name to implement the monotonic constraint. I think there is a 
worked example in the quantreg package, but since I bought Koenker's book, I 
may be remembering from there.
--

David Winsemius
Alameda, CA, USA

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

