Re: [R] MRF smoothers in MGCV - specifying neighbours

Simon Wood Thu, 08 May 2014 08:23:49 -0700

Hi Mark,

The problem here is that the constructor expects there to be at leastone observation per location. The nb.l list has neighbourhoodinformation for 166 locations, while the 'obs' data containsobservations for only 99 of them (unique(obs$xy.idx)).

The solution probably requires more complicated construction of nb.l.You can't just drop locations from the existing nb.l because that messesup the internal indexing of nb.l. You could add a dummy observation withzero weight for each of the extra locations, but I guess that isn't whatyou really want to do for this application, as presumably theneighbourhood structure is not supposed to lead to smoothing across thegap between the two arms of the sausage...


best,
Simon

On 08/05/14 15:17, Mark Payne wrote:

Hi Roger and Simon,

Thanks for the replies. Simon's suggestion of an isolated or missing
neighbourhood doesn't hold either.

I've attached the code below - its my attempt to solve the FELSPLINE
sausage using mrf rather than a soap smoother. Its a bit convoluted, but
should run ok. I thought this would be a good starting example to get a
GMRF running, but then hit the problem mentioned. My attempts to track
the bug suggest that there is something wierd in the knots argument that
is being supplied to smooth.construct.mrf.smooth.spec() - but haven't
come so much further than that.

Code follows.

Mark

#GMRF Example
#Solves the classic FELSPINE problem using
#GMRF in mgcv. This example is a modified version of the
#example from smooth.construct.so.smooth.spec()
#in the mgcv package

rm(list=ls())
library(mgcv)

#Extract boundary
fsb <- fs.boundary()

#Create an underlying grid and evaluate the function on it
#Based on mgcv::fs.boundary() example
dx<-0.2;dy<-0.2    #Grid steps
id.fmt <- "%i/%i"
x.vals <- seq(-1,4,by=dx)
y.vals<-seq(-1,1,by=dy)
grd <- expand.grid(x=x.vals,y=y.vals)
tru.mat <- matrix(fs.test(grd$x,grd$y),length(x.vals),length(y.vals))
grd$truth <- as.vector(tru.mat)
grd$x.idx <- as.numeric(factor(grd$x,x.vals))
grd$y.idx <- as.numeric(factor(grd$y,y.vals))
grd$xy.idx <- sprintf(id.fmt,grd$x.idx,grd$y.idx)
grd <- subset(grd,!is.na <http://is.na>(truth))

## Simulate some fitting data, inside boundary...
n.samps<-250
x <- runif(n.samps)*5-1
y  <- runif(n.samps)*2-1
obs <- data.frame(x=x,y=y)
obs$truth <- fs.test(obs$x,obs$y,b=1)
obs$z <- obs$truth + rnorm(n.samps)*.3 ## add noise
pt.inside <- inSide(fsb,x=x,y=y) ## remove outsiders

## Associate observation with grid cell
obs$x.rnd <- round(obs$x/dx)*dx
obs$y.rnd <- round(obs$y/dy)*dy
obs$x.idx <- as.numeric(factor(obs$x.rnd,x.vals))
obs$y.idx <- as.numeric(factor(obs$y.rnd,y.vals))
obs$xy.idx <- sprintf(id.fmt,obs$x.idx,obs$y.idx)
obs$xy.idx <- factor(obs$xy.idx,levels=grd$xy.idx)

#Filter observations that are outside or don't have an associated grid cell
obs <- subset(obs,pt.inside & xy.idx %in% grd$xy.idx )

## plot boundary with truth and data locations
par(mfrow=c(1,2))
image(x.vals,y.vals,tru.mat,col=heat.colors(100),xlab="x",ylab="y")
contour(x.vals,y.vals,tru.mat,levels=seq(-5,5,by=.25),add=TRUE)
lines(fsb$x,fsb$y);
points(obs$x,obs$y,pch=3);

#Plot grid
plot(y~x,grd)
lines(fsb$x,fsb$y);

#Setup neighbourhood adjancey
nb <- grd[,c("x.idx","y.idx","xy.idx")]
nb$N <- factor(sprintf(id.fmt,nb$x.idx,nb$y.idx+1),levels=nb$xy.idx)
nb$S <- factor(sprintf(id.fmt,nb$x.idx,nb$y.idx-1),levels=nb$xy.idx)
nb$E <- factor(sprintf(id.fmt,nb$x.idx+1,nb$y.idx),levels=nb$xy.idx)
nb$W <- factor(sprintf(id.fmt,nb$x.idx-1,nb$y.idx),levels=nb$xy.idx)
nb.mat <- sapply(nb[,c("N","S","E","W")],as.numeric)
nb.l <- lapply(split(nb.mat,nb$xy.idx),function(x) x[!is.na
<http://is.na>(x)])

#Fit MRF gam
mdl <- gam(z ~ s(xy.idx,bs="mrf",xt=list(nb=nb.l)),data=obs,method="REML")










On 8 May 2014 15:15, Simon Wood <s.w...@bath.ac.uk
<mailto:s.w...@bath.ac.uk>> wrote:

    Hi Mark,

    I'm not sure what is happening here - there is no chance that nb.l
    contains a neighbourhood not in the levels of obs$xy.idx, I suppose?
    i.e. is

    all(names(nb.l)%in%levels(obs$__xy.idx))

    also TRUE? Here is some code illustrating what nb should look like
    (and in response to Roger Bivand's suggestion I also tried this
    replacing all the labels with things like "x/y", but it still works).


    ## example mrf fit using polygons....
    library(mgcv)
    ## Load Columbus Ohio crime data (see ?columbus for details and credits)
    data(columb)       ## data frame
    data(columb.polys) ## district shapes list
    xt <- list(polys=columb.polys) ## neighbourhood structure info for MRF
    par(mfrow=c(2,2))
    ## First a full rank MRF...
    b0 <- gam(crime ~
    s(district,bs="mrf",xt=xt),__data=columb,method="REML")

    ## same fit based on direct neighbour spec...
    nb <- mgcv:::pol2nb(columb.polys)$nb
    xt <- list(nb=nb)
    b <- gam(crime ~ s(district,bs="mrf",xt=xt),__data=columb,method="REML")

    best,
    Simon




    On 08/05/14 01:58, Mark Payne wrote:

        Hi,

        Does anyone have an example of a Markov Random Field smoother
        (MRF) in MGCV
        where they have specified the neighbourhood directly, rather
        than supplying
        polygons? Does anyone understand how the rules should be? Based
        on the
        columb example, I have setup my data set and neighbourhood like so:

            head(nb.l)

        $`10/10`
        [1] 135 155 153

        $`10/2`
        [1] 27  8  6

        $`10/3`
        [1] 48  7 28 26

        $`10/4`
        [1] 69 27 49 47

        $`10/5`
        [1] 48 70 68

        $`10/7`
        [1] 115  95  93

            head(obs)

                      x          y      truth x.idx y.idx xy.idx
        24  1.4835147  0.8026673  2.3605204    13    10  13/10
        26  1.0452111  0.4673685  1.8316741    11     8   11/8
        43  2.1514977 -0.2640058 -2.8812026    17     5   17/5
        46  2.8473951  0.5445714  3.6347799    20     9   20/9
        53 1.7983253 <tel:53%20%201.7983253> -0.6905912 -2.5473984 15
        <tel:2.5473984%20%20%20%2015>     3   15/3
        86 -0.1839814 -0.7824026 -0.5776616     5     2    5/2



        but get the following error:

            mdl <- gam(truth ~

        s(xy.idx,bs="mrf",xt=list(nb=__nb.l)),data=obs,method="REML")
        Error in smooth.construct.mrf.smooth.__spec(object, dk$data,
        dk$knots) :
            mismatch between nb/polys supplied area names and data area
        names

        However, there is a perfect match between the nb list names and
        the data
        area names:

            all(levels(obs$xy.idx) %in% names(nb.l))

        [1] TRUE



        Any suggestions where to start?

        Mark

                 [[alternative HTML version deleted]]

        ________________________________________________
        R-help@r-project.org <mailto:R-help@r-project.org> mailing list
        https://stat.ethz.ch/mailman/__listinfo/r-help
        <https://stat.ethz.ch/mailman/listinfo/r-help>
        PLEASE do read the posting guide
        http://www.R-project.org/__posting-guide.html
        <http://www.R-project.org/posting-guide.html>
        and provide commented, minimal, self-contained, reproducible code.



    --
    Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
    +44 (0)1225 386603 <tel:%2B44%20%280%291225%20386603>
    http://people.bath.ac.uk/sw283



--
Simon Wood, Mathematical Science, University of Bath BA2 7AY UK
+44 (0)1225 386603               http://people.bath.ac.uk/sw283

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] MRF smoothers in MGCV - specifying neighbours

Reply via email to