Thanks.

This is in the netlib loess code: the size is passed to Fortran as an INTEGER, so we cannot increase it. I've added a check that throws an error if the dimension is too large.


On 01/03/2013 11:27, Hiroyuki Kawakatsu wrote:
Hi,

I am segfaulting when using predict.loess() (checked with r62092).
I've traced the source with the help of valgrind (output pasted
below) and it appears that this is due to int overflow when
allocating an int work array in loess_workspace():

     liv = 50 + ((int)pow((double)2, (double)D) + 4) * nvmax + 2 * N;

where liv is a (global) int. For D=1 (one x variable), this
overflows at approx N = 4089 where N is the fitted sample size (not
prediction sample size).

I am aware that you are in the process of introducing long vectors
but a quick fix would be to error when predict.loess(..., se=TRUE)
and N is too large. (Ideally, one would use long int but does
fortran portably support long int?) The threshold N value may depend
on surface type (above is for surface=="interpolate").

The following sample code does not result in segfault but when run
with valgrind, it produces the warning about large range. (In the
code that segfaults N is about 77,000).

set.seed(1)
n = 5000      # n=4000 seems ok
x = rnorm(n)
y = x + rnorm(n)
yf = loess(y~x, span=0.75, control=loess.control(trace.hat="approximate"))
print( predict(yf, data.frame(x=1), se=TRUE) )

##---valgrind output with segfault (abridged):

test4()
==30841== Warning: set address range perms: large range [0x3962a040, 0x5fb42608) (defined)
==30841== Warning: set address range perms: large range [0x5fb43040, 0xf8c8e130) (defined)
==30841== Invalid write of size 4
==30841==    at 0xCD719F0: ehg139_ (loessf.f:1444)
==30841==    by 0xCD72E0C: ehg131_ (loessf.f:467)
==30841==    by 0xCD73A5A: lowesb_ (loessf.f:1530)
==30841==    by 0xCD2C774: loess_ise (loessc.c:219)
==30841==    by 0x486C7F: do_dotCode (dotcode.c:1744)
==30841==    by 0x4AB040: bcEval (eval.c:4544)
==30841==    by 0x4B6B3F: Rf_eval (eval.c:498)
==30841==    by 0x4BAD87: Rf_applyClosure (eval.c:960)
==30841==    by 0x4B6D5E: Rf_eval (eval.c:611)
==30841==    by 0x4B7A1E: do_eval (eval.c:2193)
==30841==    by 0x4AB040: bcEval (eval.c:4544)
==30841==    by 0x4B6B3F: Rf_eval (eval.c:498)
==30841==  Address 0xf8cd4144 is not stack'd, malloc'd or (recently) free'd
==30841==

  *** caught segfault ***
address 0xf8cd4144, cause 'memory not mapped'

Traceback:
  1: predLoess(y, x, newx, s, weights, pars$robust, pars$span,
pars$degree,     pars$normalize, pars$parametric, pars$drop.square,
pars$surface,     pars$cell, pars$family, kd, divisor, se = se)
  2: eval(expr, envir, enclos)
  3: eval(substitute(expr), data, enclos = parent.frame())
  4: with.default(object, predLoess(y, x, newx, s, weights,
pars$robust,     pars$span, pars$degree, pars$normalize,
pars$parametric,     pars$drop.square, pars$surface, pars$cell,
pars$family, kd,     divisor, se = se))
  5: with(object, predLoess(y, x, newx, s, weights, pars$robust,
pars$span,     pars$degree, pars$normalize, pars$parametric,
pars$drop.square,     pars$surface, pars$cell, pars$family, kd,
divisor, se = se))
  6: predict.loess(y2, data.frame(hours = xmin), se = TRUE)
  7: predict(y2, data.frame(hours = xmin), se = TRUE)
  8: test4()
aborting ...
==30841==




--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
