from:"Matteo Bertini"

[Rd] key-value mapping in C inside R?

2010-12-30 Thread Matteo Bertini


I'm testing some modifications in arima.c.
I've noticed that a big internal array of double (rbar) is usually 
sparse and I'd like to add an option to store it as key-value mapping.


Is there a library function or some other approach already used inside 
the R core for key-value mappings?


Thanks,
Matteo Bertini

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] key-value mapping in C inside R?

2010-12-30 Thread Matteo Bertini

Il giorno 30/dic/2010, alle ore 16.03, Simon Urbanek ha scritto:

> On Dec 30, 2010, at 7:50 AM, Matteo Bertini wrote:
> 
>> I'm testing some modifications in arima.c.
>> I've noticed that a big internal array of double (rbar) is usually sparse 
>> and I'd like to add an option to store it as key-value mapping.
>> 
>> Is there a library function or some other approach already used inside the R 
>> core for key-value mappings?
>> 
> 
> environment created with hash=TRUE, e.g.:
> e = new.env(TRUE, emptyenv())
> e[["foo"]] = "bar"
> 
> You can even set "size" with the expected size if you know it in advance.

Interesting, but is it possible to use this from the C code?

Thanks,
Matteo
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] key-value mapping in C inside R?

2011-01-03 Thread Matteo Bertini

On Thu, Dec 30, 2010 at 9:34 PM, Hadley Wickham  wrote:
> Why not use a sparse Matrix package?
> Hadley

In the original arima.c code the matrix is represented as an array of double.
I'd like to add a minimal overhead/changes on the code.

I think I'll use a simple sorted linked list to store the key-value map.
It will be enough for testing and I'll use something better if needed.

Thanks,
Matteo Bertini

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] stats/arima.c memory allocation

2011-04-09 Thread Matteo Bertini

Looking at the arima.c code related to arima fitting I noticed that the code
is mainly a merge of:

- Gardner, G, Harvey, A. C. and Phillips, G. D. A. (1980) Algorithm AS154.
An algorithm for exact maximum likelihood estimation of
autoregressive-moving average models by means of Kalman filtering. Applied
Statistics 29, 311322.
- Jones, R. H. (1980) Maximum likelihood fitting of ARMA models to time
series with missing observations. Technometrics 20 389395.

The first is used to fit the initial P0 matrix, and the second to do the
forecasts.

The AS154 implementation of P0 computation is O(r^4/8) in memory
requirements, where r is roughly the period length.

This is the origin of the ugly:

  src/library/stats/src/arima.c:838:if(r > 350) error(_("maximum
supported lag is 350"));

I noted on the same AS154 paper that the initial P0 verify this equation:

  P0 = T P0 T' + R R'

So I modified the arima.c code to find iteratively the solution of this
equation (starting from P0 = I)

The resulting code finds a solution very similar to the one of the original
code in a fraction of the occupied memory and in a time that is similar for
small lags and faster for bigger lags (without the r<350 limit).

Here the modified code: https://gist.github.com/911292

The question is, there are theoretical guarantees that the iterative
solution is the right solution?

Some hints/directions/books?
Matteo Bertini

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] maximum supported lag is 350

2009-07-22 Thread Matteo Bertini


I have found a strange error:

> fit = Arima(flow, c(1,0,1), list(order=c(0,1,1), period=96*7))
Error in makeARIMA(trarma[[1L]], trarma[[2L]], Delta, kappa) :
  maximum supported lag is 350

Is in fact quite common to have a lag > 350 using a weekly period in 
15min steps = 672 (standard in traffic flow prediction litterature for 
example).


Is there any reasons against doing what is suggested in the comment and 
avoid the hardcoded limit?


Thanks,
Matteo Bertini



# https://svn.r-project.org/R/trunk/src/library/stats/src/arima.c
SEXP getQ0(SEXP sPhi, SEXP sTheta)
{

[8<]

/* This is the limit using an int index.  We could use
   size_t and get more on a 64-bit system,
   but there seems no practical need. */
if(r > 350) error(_("maximum supported lag is 350"));

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] maximum supported lag is 350

2009-07-23 Thread Matteo Bertini


Il 22-07-2009 16:26, Matteo Bertini ha scritto:

I have found a strange error:

 > fit = Arima(flow, c(1,0,1), list(order=c(0,1,1), period=96*7))
Error in makeARIMA(trarma[[1L]], trarma[[2L]], Delta, kappa) :
maximum supported lag is 350

Is in fact quite common to have a lag > 350 using a weekly period in
15min steps = 672 (standard in traffic flow prediction litterature for
example).

Is there any reasons against doing what is suggested in the comment and
avoid the hardcoded limit?

Thanks,
Matteo Bertini



# https://svn.r-project.org/R/trunk/src/library/stats/src/arima.c
SEXP getQ0(SEXP sPhi, SEXP sTheta)
{

[8<]

/* This is the limit using an int index. We could use
size_t and get more on a 64-bit system,
but there seems no practical need. */
if(r > 350) error(_("maximum supported lag is 350"));



Can someone review this analysis?
I can propose a patch to fix the problem if it is ok.

As I can see from the code:

/* NB: nrbar could overflow */
int r = max(p, q + 1);
size_t np = r * (r + 1) / 2, nrbar = np * (np - 1) / 2;

where we want

nrbar <= SIZE_MAX/2

considering that:

np = r * (r + 1) / 2 < (r + 1)^2 / 2 = r2

and

nrbar = np * (np - 1) / 2 < np^2 / 2 < r2^2 / 2

we have a protective:

nrbar < (r + 1)^4 / 8

and we can check for:

(r + 1)^4 / 8 <= SIZE_MAX/2

mat...@zarathustra:~/Documents/src$ cat size_max.c
#include 
#include 
#include 

#define MAX_LAG (int)(sqrt(sqrt(8)) * sqrt(sqrt(SIZE_MAX/2)) - 1)

int main(void) {
int r;
size_t np, nrbar;
printf("MAX_LAG = %d\n", MAX_LAG);
for (r=MAX_LAG-5; rbut perhaps I was missing some detail given the overflow occurs at 
SIZE_MAX/2


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] key-value mapping in C inside R?

Re: [Rd] key-value mapping in C inside R?

Re: [Rd] key-value mapping in C inside R?

[Rd] stats/arima.c memory allocation

[Rd] maximum supported lag is 350

Re: [Rd] maximum supported lag is 350

6 matches

Site Navigation

Mail list logo

Footer information