Hi Katharina,
Thanks for sending this.
The problem is that 'site' in the prediction data contains levels that
are not present in the (usable, non-NA) data used for fitting...
> levels(m$model$site)
[1] "KRB" "NP.FOR" "WKS.FRE" "WKS.KRE" "WKS.RIE" "WKS.WUE"
> levels(gapData$site)
[1] "KRB" "NP.FOR" "RIE.2" "WKS.BBR" "WKS.FRE" "WKS.HOE" "WKS.KRE"
[8] "WKS.RIE" "WKS.WUE"
predict.lm has a check for this, and so fails with a rather more
informative error message, e.g.
m0 <- lm(sensor1 ~ sensor2 + site + site:NthSampling,
data=xylemRohWeekXnn2011,na.action=na.omit)
predict(m0,gapData)
... factor site has new levels RIE.2, WKS.BBR
I'll add a better check to predict.gam.
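Until then, a manual check along these lines might help. This is only a sketch on made-up toy data: fit_data, new_data, used and unseen are illustrative names, not objects from your example, and it assumes that rows with any NA are dropped before fitting (as na.omit would do).

```r
# Toy data (made up): two sites only ever occur in rows with a missing response
fit_data <- data.frame(
  y    = c(1.2, NA, 0.7, 1.9, NA),
  site = factor(c("KRB", "RIE.2", "NP.FOR", "KRB", "WKS.BBR"))
)
new_data <- data.frame(site = factor(c("KRB", "RIE.2", "WKS.BBR")))

# Rows that would actually be used in the fit, with unused levels dropped
used <- droplevels(fit_data[complete.cases(fit_data), ])

# Levels requested for prediction that the fit never saw
unseen <- setdiff(as.character(unique(new_data$site)), levels(used$site))

if (length(unseen) > 0) {
  message("factor site has new levels: ", paste(unseen, collapse = ", "))
}

# One option: predict only for rows whose site was present in the fit
ok <- new_data[!new_data$site %in% unseen, , drop = FALSE]
```

Dropping the affected rows is just one way to proceed; whether that is acceptable depends on what the predictions are for.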
best,
Simon
PS. If you want predictions with the random effects for site set to zero,
one trick is to use terms like s(site,bs="re",by=dum) in fitting,
with dum set to 1. Then, in prediction, you can set 'site' to any existing
level and dum to zero, in order to get a prediction for the missing
level with the 'site' effect set to zero.
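A minimal sketch of that trick on simulated data. The data frame dat, its columns x and y, and the four site names are made up for illustration; only the s(site, bs="re", by=dum) term comes from the suggestion above.

```r
library(mgcv)

set.seed(1)
n <- 200

# Simulated data (illustrative only): four sites with different intercepts
dat <- data.frame(
  site = factor(sample(c("A", "B", "C", "D"), n, replace = TRUE)),
  x    = runif(n)
)
site_eff <- c(A = -1, B = 0.5, C = 1, D = -0.5)
dat$y   <- 2 + 3 * dat$x + unname(site_eff[as.character(dat$site)]) +
  rnorm(n, sd = 0.3)
dat$dum <- 1  # dummy 'by' variable, equal to 1 for every row when fitting

fit <- gam(y ~ x + s(site, bs = "re", by = dum), data = dat, method = "REML")

# To predict for a site that was not in the fit: borrow any existing level
# and set dum to 0, which multiplies the random effect columns by zero
newd <- data.frame(
  x    = c(0.2, 0.8),
  site = factor("A", levels = levels(dat$site)),
  dum  = 0
)
p0 <- predict(fit, newd)

# With dum = 0 only the parametric part of the model contributes
fixed_only <- coef(fit)[["(Intercept)"]] + coef(fit)[["x"]] * newd$x
```

Here p0 and fixed_only should agree, since the choice of 'site' level in newd has no influence once dum is zero.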
On 02/02/14 17:52, Katharina May wrote:
Hi Simon,
thank you for your reply, I really appreciate any help to understand
the problem here...
Unfortunately, the package upgrade didn't fix this issue.
An example reproducing the error and my current sessionInfo() output
can be found below.
Many thanks once again,
Katharina
R Code Example
<snip>
library(RCurl)
library(mgcv)
# Retrieve the xylemRohWeekXnn2011 test data frame
eval(parse(text =
    getURL("https://webdisk.ads.mwn.de/Handlers/AnonymousDownload.ashx?folder=1a7cbaa4&path=xylemRohWeekXnn2011.R")
))

# Fit with a random intercept and a random NthSampling slope for each site
xylemRohWeekXnn.fit.bam <- bam(sensor1 ~ sensor2 + s(site, bs = "re")
    + s(site, NthSampling, bs = "re"), data = xylemRohWeekXnn2011,
    na.action = na.omit)

# Subset of the data containing gaps, used for prediction
gapData <- xylemRohWeekXnn2011[is.na(xylemRohWeekXnn2011[, 2]) &
    !is.na(xylemRohWeekXnn2011[, 11]), c(2:3, 6:7, 11)]

# This is the call that fails
xylemRohWeekXnnSite.fit <-
    predict.gam(xylemRohWeekXnn.fit.bam, gapData, type = "response",
        se.fit = FALSE)
</snap>
My current session information (sessionInfo() output); the problem
occurs on both Windows and Mac OS X:
<snip>
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] mgcv_1.7-28 nlme_3.1-113 RCurl_1.95-4.1 bitops_1.0-6
loaded via a namespace (and not attached):
[1] grid_3.0.2 lattice_0.20-24 Matrix_1.1-2 tools_3.0.2
</snap>
On 31/01/14 12:57, Simon Wood wrote:
Hi Katharina,
Could you try upgrading to mgcv_1.7-28, please? There was an occasional
problem to do with matching factor levels, which is fixed, but I'm not
very confident that is what is going on.
If upgrading doesn't work, is there any chance you could send me a small
example dataset and code that produces the error, and I'll look at it?
best,
Simon
--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603 http://people.bath.ac.uk/sw283