Hi Mark,
The problem here is that the constructor expects there to be at least
one observation per location. The nb.l list has neighbourhood
information for 166 locations, while the 'obs' data contains
observations for only 99 of them (unique(obs$xy.idx)).
The solution probably requires more complicated construction of nb.l.
You can't just drop locations from the existing nb.l because that messes
up the internal indexing of nb.l. You could add a dummy observation with
zero weight for each of the extra locations, but I guess that isn't what
you really want to do for this application, as presumably the
neighbourhood structure is not supposed to lead to smoothing across the
gap between the two arms of the sausage...
best,
Simon
On 08/05/14 15:17, Mark Payne wrote:
Hi Roger and Simon,
Thanks for the replies. Simon's suggestion of an isolated or missing
neighbourhood doesn't hold either.
I've attached the code below - its my attempt to solve the FELSPLINE
sausage using mrf rather than a soap smoother. Its a bit convoluted, but
should run ok. I thought this would be a good starting example to get a
GMRF running, but then hit the problem mentioned. My attempts to track
the bug suggest that there is something wierd in the knots argument that
is being supplied to smooth.construct.mrf.smooth.spec() - but haven't
come so much further than that.
Code follows.
Mark
#GMRF Example
#Solves the classic FELSPINE problem using
#GMRF in mgcv. This example is a modified version of the
#example from smooth.construct.so.smooth.spec()
#in the mgcv package
rm(list=ls())
library(mgcv)
#Extract boundary
fsb <- fs.boundary()
#Create an underlying grid and evaluate the function on it
#Based on mgcv::fs.boundary() example
dx<-0.2;dy<-0.2 #Grid steps
id.fmt <- "%i/%i"
x.vals <- seq(-1,4,by=dx)
y.vals<-seq(-1,1,by=dy)
grd <- expand.grid(x=x.vals,y=y.vals)
tru.mat <- matrix(fs.test(grd$x,grd$y),length(x.vals),length(y.vals))
grd$truth <- as.vector(tru.mat)
grd$x.idx <- as.numeric(factor(grd$x,x.vals))
grd$y.idx <- as.numeric(factor(grd$y,y.vals))
grd$xy.idx <- sprintf(id.fmt,grd$x.idx,grd$y.idx)
grd <- subset(grd,!is.na <http://is.na>(truth))
## Simulate some fitting data, inside boundary...
n.samps<-250
x <- runif(n.samps)*5-1
y <- runif(n.samps)*2-1
obs <- data.frame(x=x,y=y)
obs$truth <- fs.test(obs$x,obs$y,b=1)
obs$z <- obs$truth + rnorm(n.samps)*.3 ## add noise
pt.inside <- inSide(fsb,x=x,y=y) ## remove outsiders
## Associate observation with grid cell
obs$x.rnd <- round(obs$x/dx)*dx
obs$y.rnd <- round(obs$y/dy)*dy
obs$x.idx <- as.numeric(factor(obs$x.rnd,x.vals))
obs$y.idx <- as.numeric(factor(obs$y.rnd,y.vals))
obs$xy.idx <- sprintf(id.fmt,obs$x.idx,obs$y.idx)
obs$xy.idx <- factor(obs$xy.idx,levels=grd$xy.idx)
#Filter observations that are outside or don't have an associated grid cell
obs <- subset(obs,pt.inside & xy.idx %in% grd$xy.idx )
## plot boundary with truth and data locations
par(mfrow=c(1,2))
image(x.vals,y.vals,tru.mat,col=heat.colors(100),xlab="x",ylab="y")
contour(x.vals,y.vals,tru.mat,levels=seq(-5,5,by=.25),add=TRUE)
lines(fsb$x,fsb$y);
points(obs$x,obs$y,pch=3);
#Plot grid
plot(y~x,grd)
lines(fsb$x,fsb$y);
#Setup neighbourhood adjancey
nb <- grd[,c("x.idx","y.idx","xy.idx")]
nb$N <- factor(sprintf(id.fmt,nb$x.idx,nb$y.idx+1),levels=nb$xy.idx)
nb$S <- factor(sprintf(id.fmt,nb$x.idx,nb$y.idx-1),levels=nb$xy.idx)
nb$E <- factor(sprintf(id.fmt,nb$x.idx+1,nb$y.idx),levels=nb$xy.idx)
nb$W <- factor(sprintf(id.fmt,nb$x.idx-1,nb$y.idx),levels=nb$xy.idx)
nb.mat <- sapply(nb[,c("N","S","E","W")],as.numeric)
nb.l <- lapply(split(nb.mat,nb$xy.idx),function(x) x[!is.na
<http://is.na>(x)])
#Fit MRF gam
mdl <- gam(z ~ s(xy.idx,bs="mrf",xt=list(nb=nb.l)),data=obs,method="REML")
On 8 May 2014 15:15, Simon Wood <s.w...@bath.ac.uk
<mailto:s.w...@bath.ac.uk>> wrote:
Hi Mark,
I'm not sure what is happening here - there is no chance that nb.l
contains a neighbourhood not in the levels of obs$xy.idx, I suppose?
i.e. is
all(names(nb.l)%in%levels(obs$__xy.idx))
also TRUE? Here is some code illustrating what nb should look like
(and in response to Roger Bivand's suggestion I also tried this
replacing all the labels with things like "x/y", but it still works).
## example mrf fit using polygons....
library(mgcv)
## Load Columbus Ohio crime data (see ?columbus for details and credits)
data(columb) ## data frame
data(columb.polys) ## district shapes list
xt <- list(polys=columb.polys) ## neighbourhood structure info for MRF
par(mfrow=c(2,2))
## First a full rank MRF...
b0 <- gam(crime ~
s(district,bs="mrf",xt=xt),__data=columb,method="REML")
## same fit based on direct neighbour spec...
nb <- mgcv:::pol2nb(columb.polys)$nb
xt <- list(nb=nb)
b <- gam(crime ~ s(district,bs="mrf",xt=xt),__data=columb,method="REML")
best,
Simon
On 08/05/14 01:58, Mark Payne wrote:
Hi,
Does anyone have an example of a Markov Random Field smoother
(MRF) in MGCV
where they have specified the neighbourhood directly, rather
than supplying
polygons? Does anyone understand how the rules should be? Based
on the
columb example, I have setup my data set and neighbourhood like so:
head(nb.l)
$`10/10`
[1] 135 155 153
$`10/2`
[1] 27 8 6
$`10/3`
[1] 48 7 28 26
$`10/4`
[1] 69 27 49 47
$`10/5`
[1] 48 70 68
$`10/7`
[1] 115 95 93
head(obs)
x y truth x.idx y.idx xy.idx
24 1.4835147 0.8026673 2.3605204 13 10 13/10
26 1.0452111 0.4673685 1.8316741 11 8 11/8
43 2.1514977 -0.2640058 -2.8812026 17 5 17/5
46 2.8473951 0.5445714 3.6347799 20 9 20/9
53 1.7983253 <tel:53%20%201.7983253> -0.6905912 -2.5473984 15
<tel:2.5473984%20%20%20%2015> 3 15/3
86 -0.1839814 -0.7824026 -0.5776616 5 2 5/2
but get the following error:
mdl <- gam(truth ~
s(xy.idx,bs="mrf",xt=list(nb=__nb.l)),data=obs,method="REML")
Error in smooth.construct.mrf.smooth.__spec(object, dk$data,
dk$knots) :
mismatch between nb/polys supplied area names and data area
names
However, there is a perfect match between the nb list names and
the data
area names:
all(levels(obs$xy.idx) %in% names(nb.l))
[1] TRUE
Any suggestions where to start?
Mark
[[alternative HTML version deleted]]
________________________________________________
R-help@r-project.org <mailto:R-help@r-project.org> mailing list
https://stat.ethz.ch/mailman/__listinfo/r-help
<https://stat.ethz.ch/mailman/listinfo/r-help>
PLEASE do read the posting guide
http://www.R-project.org/__posting-guide.html
<http://www.R-project.org/posting-guide.html>
and provide commented, minimal, self-contained, reproducible code.
--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603 <tel:%2B44%20%280%291225%20386603>
http://people.bath.ac.uk/sw283
--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603 http://people.bath.ac.uk/sw283
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.