On Mon, Aug 30, 2010 at 01:50:03PM +0100, Prof Brian Ripley wrote:

> The underlying problem is your expectations.
>
> R (unlike S) was set up many years ago to use na.omit as the
> default, and when fitting both lm() and loess() silently omit cases
> with missing values. So why should prediction from 'newdata' be
> different unless documented to be so (which it is nowadays for
> predict.lm, even though you are adding to the evidence that was a
> mistake)?

Thanks for your insights into the underlying philosophy. I agree that
na.omit is a sensible default for model fitting. But I am not so sure
that quietly omitting unpredictable values is such a good idea,
especially if predict methods for different types of model implement
inconsistent approaches. I see no disadvantage in returning NA where
no prediction/computation is possible -- the value is 'Not Available',
after all. (And the length of the result vector would match
nrow(newdata), which would be handy for most practical purposes.)

> loess() is somewhat different from lm() in that it does not in
> general allow extrapolation, and the prediction for Inf and NaN is
> simply undefined.

Of course this is correct, but I still think that predict.loess not
only acts in a way that will surprise most users but is also
inconsistent with itself (Inf vs. NA/NaN). If extrapolation is the
problem, Inf should not yield anything, but it does (and the same
applies to values outside of the original x-range):

x <- rnorm(15)
y <- rnorm(15)
model.loess <- loess(y~x)
predict(model.loess, data.frame(x=c(0.5, Inf)))
# [1] -0.02508801          NA
predict(model.loess, data.frame(x=min(x)-10))
# [1] NA

Actually, while tracking down my problem I did consider that
extrapolation could be the cause and, following the last example in
?loess, tried to set control = loess.control(surface = "direct"). To
my surprise, now even Inf fails, although I am much happier with
getting an error message than with silent omission.

Anyway, writing a little wrapper that puts NAs back into the results
is not a big deal, and in that respect my problem is solved.

> Nevertheless, take a look at the version in R-devel (pre-2.12.0)
> which give you more options.

Thanks for that information - I will definitely have a look at that.

cu
	Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
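
For the archives, the "little wrapper that puts NAs back" mentioned in the message might look roughly like the sketch below. It is only an illustration under stated assumptions: the function name predict_keep_na and its xname argument are hypothetical (not part of R), the model is assumed to have a single numeric predictor, and non-finite inputs (NA, NaN, Inf) are all mapped to NA by choice.

```r
# Hypothetical helper (not part of R): predict from a loess fit and
# return exactly one value per row of newdata.
# Assumption: a single numeric predictor whose column name is 'xname'.
predict_keep_na <- function(model, newdata, xname = "x") {
  out <- rep(NA_real_, nrow(newdata))   # start with NA everywhere
  ok  <- is.finite(newdata[[xname]])    # FALSE for NA, NaN, Inf, -Inf
  if (any(ok)) {
    # Only rows with usable predictor values are passed to predict(),
    # so the result has length sum(ok) and can be put back in place.
    out[ok] <- predict(model, newdata[ok, , drop = FALSE])
  }
  out
}

set.seed(1)                             # only to make the sketch repeatable
x <- rnorm(15)
y <- rnorm(15)
model.loess <- loess(y ~ x)

nd  <- data.frame(x = c(0.5, NA, NaN, Inf, min(x) - 10))
res <- predict_keep_na(model.loess, nd)
length(res) == nrow(nd)
```

Values outside the original x-range still come back as NA from predict() itself (with the default interpolating surface), so the result always lines up with the rows of newdata.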