[Rd] CRAN task views work only once per session (PR#9330)

2006-11-06 Thread Keith Ponting
Thank you to those who have replied to this thread.

I have now reproduced similar effects in a way that does not directly
involve CRAN task views. (I have also reproduced the original problem
on a different machine within our site, using a different mirror.)

The following sequence of commands works repeatedly within plain Rterm
(actually running under XEmacs) on Windows XP:

x <- url("http://www.sourcekeg.co.uk/cran/src/contrib/Views.rds", open = "rb")
showConnections()
.readRDS(x);
showConnections()
close(x)
showConnections()

However, under RGui on the same system, second and subsequent attempts
time out in .readRDS (see session log below). Note that although the
connection is shown as open immediately before the second .readRDS
call, it is shown as closed immediately after that (failing) call.

I wonder whether something in RGui is holding on to connections after
they have been closed. Even while .readRDS is timing out in one RGui
session, I can run that sequence of calls in plain Rterm or (once!) in
another RGui session without problems, which I hope eliminates the
possibility that it is our company firewall or something else on my PC
holding on to the connection in some way.

-- session log starts


R version 2.4.0 (2006-10-03)
Copyright (C) 2006 The R Foundation for Statistical Computing
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> x <- url("http://www.sourcekeg.co.uk/cran/src/contrib/Views.rds", open = "rb")
> showConnections()
  description                                             class mode text
3 "http://www.sourcekeg.co.uk/cran/src/contrib/Views.rds" "url" "rb" "binary"
  isopen   can read can write
3 "opened" "yes"    "no"
> .readRDS(x);
> showConnections()
  description                                                    class   mode
3 "gzcon(http://www.sourcekeg.co.uk/cran/src/contrib/Views.rds)" "gzcon" "rb"
  text     isopen   can read can write
3 "binary" "opened" "yes"    "no"
> close(x)
> showConnections()
 description class mode text isopen can read can write
> x <- url("http://www.sourcekeg.co.uk/cran/src/contrib/Views.rds", open = "rb")
> showConnections()
  description                                             class mode text
3 "http://www.sourcekeg.co.uk/cran/src/contrib/Views.rds" "url" "rb" "binary"
  isopen   can read can write
3 "opened" "yes"    "no"
> .readRDS(x);
Error in .readRDS(x) : connection is not open
In addition: Warning message:
InternetOpenUrl failed: 'The operation timed out' 
> showConnections()
 description class mode text isopen can read can write
> close(x)
> showConnections()
 description class mode text isopen can read can write
> x <- url("http://www.sourcekeg.co.uk/cran/src/contrib/Views.rds", open = "rb")
> showConnections()
  description                                             class mode text
3 "http://www.sourcekeg.co.uk/cran/src/contrib/Views.rds" "url" "rb" "binary"
  isopen   can read can write
3 "opened" "yes"    "no"
> .readRDS(x);
Error in .readRDS(x) : connection is not open
In addition: Warning message:
InternetOpenUrl failed: 'The operation timed out' 
> showConnections()
 description class mode text isopen can read can write
> close(x)
> showConnections()
 description class mode text isopen can read can write
> 
> sessionInfo()
R version 2.4.0 (2006-10-03) 
i386-pc-mingw32 

locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

attached base packages:
[1] "methods"   "stats"     "graphics"  "grDevices" "utils"     "datasets"
[7] "base"
> 

-- session log ends


If I remove the .readRDS(x) call, there is no problem; the following
command sequence does work repeatedly:
x <- url("http://www.sourcekeg.co.uk/cran/src/contrib/Views.rds", open = "rb")
showConnections()
close(x)
showConnections()
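
For comparison, a pattern that always releases the connection slot even
when the read fails (a sketch using standard R functions; it would not
avoid the RGui timeout itself, but it keeps showConnections() clean
after errors):

```r
readViews <- function(u) {
  con <- url(u, open = "rb")
  # on.exit runs on error as well as on normal return; the isOpen guard
  # is needed because .readRDS may already have closed the connection
  on.exit(if (isOpen(con)) close(con))
  .readRDS(con)
}
views <- readViews("http://www.sourcekeg.co.uk/cran/src/contrib/Views.rds")
```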

Dr. Keith Ponting
Principal Scientist

Aurix Ltd
Malvern Hills Science Park
Geraldine Rd  Malvern  Worcestershire  WR14 3SZ  UK

__
This email has been scanned by the MessageLabs Email Security System.
For more information visit http://www.virtual-email.net/messagelabs.htm

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] CRAN task views work only once per session (PR#9330)

2006-11-06 Thread Prof Brian Ripley
I think the following item in NEWS for R-patched may be relevant:

 o   load()ing from a connection had a logic bug in when it closed
 the connection. (PR#9271)

so please try R-patched.
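
The class of bug described in that NEWS item can be sketched as follows
(illustrative R only, not the actual load() source): a reader that may
open a connection itself must close it only in that case, otherwise it
frees a connection slot the caller still believes it owns.

```r
readFromConnection <- function(con) {
  wasOpen <- isOpen(con)
  if (!wasOpen) {
    open(con, "rb")
    on.exit(close(con))  # close only what this function opened itself
  }
  # closing unconditionally here would destroy the caller's connection,
  # producing the "shown open, then suddenly closed" symptom above
  readBin(con, what = "raw", n = 16L)
}
```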


On Mon, 6 Nov 2006, Keith Ponting wrote:

> Thankyou to those who have replied to this thread.
>
> I have now reproduced similar effects in a way which does not directly
> involve CRAN task views. (I have also reproduced the original problem on
> a different machine within
> our site and using a different mirror.)
>
> The following sequence of commands works repeatedly within plain Rterm
> (actually running under XEmacs) on Windows XP:
>
> x <- url("http://www.sourcekeg.co.uk/cran/src/contrib/Views.rds", open = "rb")
> showConnections()
> .readRDS(x);
> showConnections()
> close(x)
> showConnections()
>
> However under RGui on the same system, second and subsequent attempts
> time out in .readRDS (see session log below) and note that although the
> connection is shown as open immediately before the second .readRDS call,
> it is shown as closed immediately after that (failing) call.
>
> I wonder whether something in RGui is holding on to connections after
> they have been closed. Even while .readRDS is timing out in one RGui
> session, I can run that sequence of calls in plain Rterm or (once!) in
> another RGui session without problems, which I hope eliminates the
> possibility that it is our company firewall or something else on my PC
> holding on to the connection in some way.
>
> [session log and repeated commands snipped; identical to the message above]
>
> Dr. Keith Ponting
> Principal Scientist
>
> Auri

Re: [Rd] CRAN task views work only once per session (PR#9330)

2006-11-06 Thread Keith Ponting
> -Original Message-
> From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] 
> Sent: 06 November 2006 11:42
> To: Keith Ponting
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] CRAN task views work only once per session (PR#9330)
> 
> I think the following item in NEWS for R-patched may be relevant:
> 
>  o   load()ing from a connection had a logic bug in when it closed
>  the connection. (PR#9271)
> 
> so please try R-patched.

R-patched runs the offending code repeatedly without complaint (and my
original problem with ctv::available.views is also solved), so I will
look forward to the next release.

Thank you for your time (and to you and all responsible for R itself!).

Keith Ponting 




Re: [Rd] lme4 install, was Re: [R] Which executable is associated with R CMD INSTALL?

2006-11-06 Thread Douglas Bates
On 31 Oct 2006 12:05:21 +0100, Peter Dalgaard <[EMAIL PROTECTED]> wrote:
>
> [move to r-devel, put maintainer in loop]
>
> Patrick Connolly <[EMAIL PROTECTED]> writes:
>
> > On Mon, 30-Oct-2006 at 04:44PM -0500, Duncan Murdoch wrote:
> >
> >
> > |> Try "R CMD printenv R_HOME" and you'll find which R home directory it is
> > |> using.  You can see a lot more with "R CMD printenv" or various options
> > |> to "R CMD config".
> >
> > Thanks for that information.  It knocks my theory on the head.  Pity
> > that, because I might have been able to do something about it if that
> > was the problem.  Now I'm at a loss to work out why lme4 installation
> > cannot find Matrix, and more strangely, why nothing else gave a
> > similar problem.  I think I've tried every version of lme4 and Matrix
> > that has emerged since R-2.4.0 has been released.
> >
> > The fact that no other Red hat user has a problem indicates the
> > problem is this end; but I'm short of ideas about where to look.
> > Looks like it's back to versions 6 months old -- like proprietary
> > software users have to put up with. :-(
>
> Hmmm, I can provoke this on SUSE too (I was going to ask you to try
> this!):
>
> mkdir foo
> MY_R=~/misc/r-release-branch/BUILD/bin/R
> R_LIBS=foo $MY_R --vanilla << EOF
>  options(repos=c(CRAN="http://cran.dk.r-project.org"))
>  install.packages("lme4", depend=TRUE)
> EOF
>
> This installs Matrix alright, then dies with
>
> Warning in install.packages("lme4", depend = TRUE) :
>  argument 'lib' is missing: using foo
> trying URL 'http://cran.dk.r-project.org/src/contrib/lme4_0.9975-8.tar.gz'
> Content type 'application/x-tar' length 235617 bytes
> opened URL
> ==
> downloaded 230Kb
>
> * Installing *source* package 'lme4' ...
> ** libs
>
> gcc -I/usr/local/src/pd/r-release-branch/BUILD/include
>   -I/usr/local/src/pd/r-release-branch/BUILD/include
>   -I/usr/local/include -fpic -g -O2 -std=gnu99 -c Wishart.c -o Wishart.o
>
> In file included from Wishart.h:4,
>  from Wishart.c:1:
> lme4_utils.h:9:20: Matrix.h: No such file or directory
> In file included from Wishart.h:4,
>  from Wishart.c:1:
> lme4_utils.h:25: error: syntax error before "c"
> lme4_utils.h:25: warning: type defaults to `int' in declaration of `c'
> lme4_utils.h:25: warning: data definition has no type or storage class
> lme4_utils.h:163: error: syntax error before "cholmod_factor"
> lme4_utils.h:177: error: syntax error before "cholmod_factor"
> make: *** [Wishart.o] Error 1
> ERROR: compilation failed for package 'lme4'
>
> And Matrix.h is sitting happily in foo/Matrix/include/ but obviously
> not getting included. Should there have been a copy inside lme4?

Not according to my understanding of how the "LinkingTo:"
specification in the DESCRIPTION file is supposed to work.

The headers for the C functions exported by the Matrix package are
defined in Matrix/inst/include so they can be changed in one place if
the API changes.  The "LinkingTo:" specification is designed to put
these headers on the -I path when the lme4 package's source code is
compiled.  For some reason it did not do that in this case.
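
As a sketch, the mechanism relies on DESCRIPTION fields like these
(version numbers and the Depends line are illustrative):

```
Package: lme4
Version: 0.9975-8
Depends: Matrix
LinkingTo: Matrix
```

with R CMD INSTALL then expected to translate the "LinkingTo: Matrix"
entry into an extra -I<library>/Matrix/include flag on the compiler
command line when building lme4's C sources.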

> What is not obvious to me is how this can work anywhere...
>
> I also tried unpacking the lme4 directory from its tarball, dropping
> all files from the Matrix installed include dir into lme4/src and then
>
> $MY_R CMD INSTALL -l foo lme4
>
> but no go
>
> gcc -shared -L/usr/local/lib64 -o lme4.so Matrix_stubs.o Wishart.o
> glmer.o init.o lme4_utils.o lmer.o local_stubs.o pedigree.o
> -L/usr/local/src/pd/r-release-branch/BUILD/lib -lRlapack
> -L/usr/local/src/pd/r-release-branch/BUILD/lib -lRblas -lg2c -lm
> -lgcc_s
>
> local_stubs.o(.text+0x0): In function `M_numeric_as_chm_dense':
> /home/bs/pd/tmp/lme4/src/Matrix_stubs.c:420: multiple definition of 
> `M_numeric_as_chm_dense'
> Matrix_stubs.o(.text+0x0):/home/bs/pd/tmp/lme4/src/Matrix_stubs.c:420: first 
> defined here

You don't want to compile both Matrix_stubs.c and local_stubs.c.  All
that local_stubs.c does is include Matrix_stubs.c; the reason is that
those wrapper functions are needed locally, but it is much cleaner if
they are defined in only one place.  The file Matrix_stubs.c is never
compiled by itself; it is just there to provide the definitions of the
functions.  Check Luke's documentation and examples for using
R_RegisterCCallable and R_GetCCallable.
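
The registry idea behind R_RegisterCCallable/R_GetCCallable can be
sketched in plain C with a stand-in table (the names, types, and
single-key lookup here are invented for illustration; the real API
lives in R's headers and keys on package name as well as symbol name):

```c
#include <stddef.h>
#include <string.h>

/* Stand-in registry: the providing package ("Matrix") registers a
 * function pointer under a name at load time; the client ("lme4")
 * resolves it by name instead of linking against Matrix's .so. */
typedef double (*DL_FUNC)(double);

static struct { const char *name; DL_FUNC fun; } registry[16];
static size_t n_entries = 0;

void register_callable(const char *name, DL_FUNC fun) {
    registry[n_entries].name = name;
    registry[n_entries].fun = fun;
    n_entries++;
}

DL_FUNC get_callable(const char *name) {
    for (size_t i = 0; i < n_entries; i++)
        if (strcmp(registry[i].name, name) == 0)
            return registry[i].fun;
    return NULL;
}

/* Provider side: the real implementation, registered once in the
 * package's load hook (R_init_Matrix in the real package). */
double twice(double x) { return 2.0 * x; }

/* Client side: a stub with the same signature that forwards through
 * the registry -- this is what Matrix_stubs.c provides, and why it
 * must be compiled into the client exactly once. */
double M_twice(double x) { return get_callable("twice")(x); }
```

Compiling the stub file twice, as in the log above, defines M_twice in
two object files and produces exactly the "multiple definition" linker
errors shown.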


> local_stubs.o(.text+0x70): In function `M_dpoMatrix_chol':
> /home/bs/pd/tmp/lme4/src/Matrix_stubs.c:412: multiple definition of 
> `M_dpoMatrix_chol'
> Matrix_stubs.o(.text+0x70):/home/bs/pd/tmp/lme4/src/Matrix_stubs.c:412: first 
> defined here
> ..etc..
>
> --
>O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
>   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
>  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
> ~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907
>

Re: [Rd] lme4 install, was Re: [R] Which executable is associated with R CMD INSTALL?

2006-11-06 Thread Peter Dalgaard
"Douglas Bates" <[EMAIL PROTECTED]> writes:

> On 31 Oct 2006 12:05:21 +0100, Peter Dalgaard <[EMAIL PROTECTED]> wrote:
> > [earlier quoted material snipped]

> > And Matrix.h is sitting happily in foo/Matrix/include/ but obviously
> > not getting included. Should there have been a copy inside lme4?
> 
> Not according to my understanding of how the "LinkingTo:"
> specification in the DESCRIPTION file is to work.

Kurt pointed out the real reason: Using a relative path for R_LIBS was
asking for trouble.

So it was my own fault; I just got blinded by seeing symptoms so
close to those reported by Patrick.
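
The failure mode can be sketched in shell terms: a relative R_LIBS is
resolved against whatever the current directory happens to be during
the build, so normalizing it to an absolute path first avoids the
problem (the directory name is illustrative):

```shell
lib="foo"                      # relative library path, as in the report
mkdir -p "$lib"
case "$lib" in
  /*) : ;;                     # already absolute, leave it alone
  *)  lib="$(pwd)/$lib" ;;     # resolve against the current directory
esac
echo "R_LIBS=$lib"             # safe to export before R CMD INSTALL
```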

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [Rd] lme4 install, was Re: [R] Which executable is associated with R CMD INSTALL?

2006-11-06 Thread Douglas Bates
On 06 Nov 2006 15:41:11 +0100, Peter Dalgaard <[EMAIL PROTECTED]> wrote:
> "Douglas Bates" <[EMAIL PROTECTED]> writes:
>
> > > [earlier quoted material snipped]
> 
> > > And Matrix.h is sitting happily in foo/Matrix/include/ but obviously
> > > not getting included. Should there have been a copy inside lme4?
> >
> > Not according to my understanding of how the "LinkingTo:"
> > specification in the DESCRIPTION file is to work.
>
> Kurt pointed out the real reason: Using a relative path for R_LIBS was
> asking for trouble.
>
> So it was my own fault, I just got blinded by seeing symptoms so
> close to those reported by Patrick.

Thanks for the clarification.

However, I think it exposes a problem with R CMD INSTALL.  If there is
a "LinkingTo:" directive in the DESCRIPTION file and the referenced
package's include directory cannot be found, I think R CMD INSTALL
should abort.
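
A pre-flight check of the kind suggested could be approximated outside
R CMD INSTALL itself (a hypothetical wrapper; the package and path
names are illustrative):

```shell
# Before installing a package whose DESCRIPTION says "LinkingTo: Matrix",
# verify that the header directory actually resolves, and complain early
# rather than letting the C compiler fail with "Matrix.h: No such file".
R_LIBS="${R_LIBS:-$HOME/R/library}"
dep="Matrix"
if [ ! -d "$R_LIBS/$dep/include" ]; then
    echo "LinkingTo: $dep but $R_LIBS/$dep/include not found" >&2
    status=missing
else
    status=ok
fi
echo "header check for $dep: $status"
```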



[Rd] memory issues with new release (PR#9344)

2006-11-06 Thread delmeric
Full_Name: Derek Elmerick
Version: 2.4.0
OS: Windows XP
Submission from: (NULL) (38.117.162.243)



hello -

i have some code that i run regularly using R version 2.3.x . the final step of
the code is to build a multinomial logit model. the dataset is large; however, i
have not had issues in the past. i just installed the 2.4.0 version of R and now
have memory allocation issues. to verify, i ran the code again against the 2.3
version and no problems. since i have set the memory limit to the max size, i
have no alternative but to downgrade to the 2.3 version. thoughts?

thanks,
derek



Re: [Rd] memory issues with new release (PR#9344)

2006-11-06 Thread Peter Dalgaard
[EMAIL PROTECTED] writes:

> Full_Name: Derek Elmerick
> Version: 2.4.0
> OS: Windows XP
> Submission from: (NULL) (38.117.162.243)
> 
> 
> 
> hello -
> 
> i have some code that i run regularly using R version 2.3.x . the final step 
> of
> the code is to build a multinomial logit model. the dataset is large; 
> however, i
> have not had issues in the past. i just installed the 2.4.0 version of R and 
> now
> have memory allocation issues. to verify, i ran the code again against the 2.3
> version and no problems. since i have set the memory limit to the max size, i
> have no alternative but to downgrade to the 2.3 version. thoughts?

And what do you expect the maintainers to do about it? (I.e. why are
you filing a bug report.)

You give absolutely no handle on what the cause of the problem might
be, or even a way to reproduce it. It may be a bug, or maybe just R
requiring more memory to run than previously.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907



Re: [Rd] memory issues with new release (PR#9344)

2006-11-06 Thread Derek Stephen Elmerick
thanks for the friendly reply. i think my description was fairly clear: i
import a large dataset and run a model. using the same dataset, the
process worked previously and it doesn't work now. if the new version of R
requires more memory and this compromises some basic data analyses, i would
label this as a bug. if this memory issue was mentioned in the
documentation, then i apologize. this email was clearly not well received,
so if there is a more appropriate place to post these sorts of questions,
that would be helpful.

-derek




On 06 Nov 2006 18:20:33 +0100, Peter Dalgaard <[EMAIL PROTECTED]>
wrote:
>
> [quoted message snipped]




Re: [Rd] memory issues with new release (PR#9344)

2006-11-06 Thread Kasper Daniel Hansen
It would be helpful to produce a script that reproduces the error on
your system, and to include details on the size of your data set and
what you are doing with it. It is unclear which function is actually
causing the error. Really, in order to do something about it you need
to show how to actually obtain the error.

To my knowledge nothing _major_ has happened with the memory
consumption, but of course R could use slightly more memory for
specific purposes.

But chances are that this is not really memory related but more
related to the functions you are using - perhaps a bug or perhaps a
user error.
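
Concretely, a report along these lines would give the maintainers
something to work with (standard R functions; nnet::multinom is only a
guess at what "multinomial logit" refers to here, and the file name is
illustrative):

```r
sessionInfo()                 # R version, platform, attached packages
memory.limit()                # the Windows memory cap in force
d <- read.csv("big.csv")      # state dim(d) and column types in the report
gc(reset = TRUE)              # memory in use before the failing step
fit <- nnet::multinom(y ~ ., data = d)  # the exact call that fails in 2.4.0
gc()                          # peak memory used by that step
```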

Kasper

On Nov 6, 2006, at 10:20 AM, Derek Stephen Elmerick wrote:

> [quoted message snipped]





Re: [Rd] memory issues with new release (PR#9344)

2006-11-06 Thread Peter Dalgaard
"Derek Stephen Elmerick" <[EMAIL PROTECTED]> writes:

> thanks for the friendly reply. i think my description was fairly clear: i
> import a large dataset and run a model. using the same dataset, the
> process worked previously and it doesn't work now. if the new version of R
> requires more memory and this compromises some basic data analyses, i would
> label this as a bug. if this memory issue was mentioned in the
> documentation, then i apologize. this email was clearly not well received,
> so if there is a more appropriate place to post these sort of questions,
> that would be helpful.

We have mailing lists. 

[EMAIL PROTECTED]

would be appropriate. 

You're still not giving sufficient information for anyone to come up
with a sensible reply, though.

Apologies if the tone came out a bit sharp, but allow me to
remind you that the first line of the report form you used reads:

"Before submitting a bug report, please read Chapter `R Bugs' of `The R
FAQ'. It describes what a bug is and how to report a bug."

You do have to realize that from the point of view of improving R, it
simply is not very informative that there is a user who has a problem
which (just?) fitted in available memory in a previous version, but
doesn't anymore.

> 
> [quoted message snipped]

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907





Re: [Rd] memory issues with new release (PR#9344)

2006-11-06 Thread Derek Stephen Elmerick

Thanks for the replies. Point taken regarding submission protocol. I have
included a text file attachment that shows the R output with version 2.3.0 and
2.4.0. A label distinguishing the version is included in the comments.

A quick background on the attached example. The dataset has 650,000 records
and 32 variables. the response is dichotomous (0/1) and i ran a logistic
model (i previously mentioned multinomial, but decided to start simple for
the example). Covariates in the model may be continuous or categorical, but
all are numeric. You'll notice that the code is the same for both versions;
however, there is a memory error with the 2.4.0 version. i ran this several
times and in different orders to make sure it was not some sort of hardware
issue.

If there is some sort of additional output that would be helpful, I can
provide as well. Or, if there is nothing I can do, that is fine also.

-Derek


On 11/6/06, Kasper Daniel Hansen <[EMAIL PROTECTED]> wrote:


It would be helpful to produce a script that reproduces the error on
your system. And include details on the size of your data set and
what you are doing with it. It is unclear what function is actually
causing the error and such. Really, in order to do something about it
you need to show how to actually obtain the error.

To my knowledge nothing _major_ has happened with the memory
consumption, but of course R could use slightly more memory for
specific purposes.

But chances are that this is not really memory related but more
related to the functions you are using - perhaps a bug or perhaps a
user error.

Kasper

On Nov 6, 2006, at 10:20 AM, Derek Stephen Elmerick wrote:

> thanks for the friendly reply. i think my description was fairly
> clear: i
> import a large dataset and run a model. using the same dataset, the
> process worked previously and it doesn't work now. if the new
> version of R
> requires more memory and this compromises some basic data analyses,
> i would
> label this as a bug. if this memory issue was mentioned in the
> documentation, then i apologize. this email was clearly not well
> received,
> so if there is a more appropriate place to post these sort of
> questions,
> that would be helpful.
>
> -derek
>
>
>
>
> On 06 Nov 2006 18:20:33 +0100, Peter Dalgaard
> <[EMAIL PROTECTED]>
> wrote:
>>
>> [EMAIL PROTECTED] writes:
>>
>>> Full_Name: Derek Elmerick
>>> Version: 2.4.0
>>> OS: Windows XP
>>> Submission from: (NULL) ( 38.117.162.243)
>>>
>>>
>>>
>>> hello -
>>>
>>> i have some code that i run regularly using R version 2.3.x . the
>>> final
>> step of
>>> the code is to build a multinomial logit model. the dataset is
>>> large;
>> however, i
>>> have not had issues in the past. i just installed the 2.4.0
>>> version of R
>> and now
>>> have memory allocation issues. to verify, i ran the code again
>>> against
>> the 2.3
>>> version and no problems. since i have set the memory limit to the
>>> max
>> size, i
>>> have no alternative but to downgrade to the 2.3 version. thoughts?
>>
>> And what do you expect the maintainers to do about it? ( I.e. why are
>> you filing a bug report.)
>>
>> You give absolutely no handle on what the cause of the problem might
>> be, or even to reproduce it. It may be a bug, or maybe just R
>> requiring more memory to run than previously.
>>
>> --
>>   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
>> c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
>> (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)
>> 35327918
>> ~~ - ([EMAIL PROTECTED])  FAX: (+45)
>> 35327907
>>
>
>   [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel




> ##
> ### R 2.4.0
> ##
> 
> rm(list=ls(all=TRUE))
> memory.limit(size=4095)
NULL
> 
> clnt=read.table(file="K:\\all_data_reduced_vars.dat",header=T,sep="\t")
> 
> chk.rsp=glm(formula = resp_chkonly ~ x1 + x2 + x3 + x4 +
+ x5 + x6 + x7 + x8 + x9 + x10 + x11 + x12 + x13 +
+ x14 + x15 + x16 + x17 + x18 + x19 +x20 +
+ x21 + x22 +x23 + x24 + x25 + x26 +x27 + 
+ x28 + x29 + x30 + x27*x29 + x28*x30, family = binomial, 
+ data = clnt)
Error: cannot allocate vector of size 167578 Kb
> 
> dim(clnt)
[1] 650000 32
> sum(clnt)
[1] 112671553493
> 

##
##

> ##
> ### R 2.3.0
> ##
> 
> rm(list=ls(all=TRUE))
> memory.limit(size=4095)
NULL
> 
> clnt=read.table(file="K:\\all_data_reduced_vars.dat",header=T,sep="\t")
> 
> chk.rsp=glm(formula = resp_chkonly ~ x1 + x2 + x3 + x4 +
+ x5 + x6 + x7 + x8 + x9 + x10 + x11 + x12 + x13 +
+ x14 + x15 + x16 + x17 + x18 + x19 +x20 +
+ x21 + x22 +x23 + x24 + x25 + x26 +x27 + 
+ x28 + x29 + x30 + x27*x29 + x28*x30, family = binomial, 
+ data = clnt)
> 
> dim(clnt)
[1] 650000 32
> sum(clnt)
[1] 112671553493

Re: [Rd] memory issues with new release (PR#9344)

2006-11-06 Thread Peter Dalgaard
"Derek Stephen Elmerick" <[EMAIL PROTECTED]> writes:

> Thanks for the replies. Point taken regarding submission protocol. I have
> included a text file attachment that shows the R output with version 2.3.0 and
> 2.4.0. A label distinguishing the version is included in the comments.
> 
> A quick background on the attached example. The dataset has 650,000 records
> and 32 variables. the response is dichotomous (0/1) and i ran a logistic
> model (i previously mentioned multinomial, but decided to start simple for
> the example). Covariates in the model may be continuous or categorical, but
> all are numeric. You'll notice that the code is the same for both versions;
> however, there is a memory error with the 2.4.0 version. i ran this several
> times and in different orders to make sure it was not some sort of hardware
> issue.
> 
> If there is some sort of additional output that would be helpful, I can
> provide as well. Or, if there is nothing I can do, that is fine also.

I don't think it was ever possible to request 4GB on XP. The version
difference might be caused by different response to invalid input in
memory.limit(). What does memory.limit(NA) tell you after the call to
memory.limit(4095) in the two versions? 

If that is not the reason: What is the *real* restriction of memory on
your system? Do you actually have 4GB in your system (RAM+swap)? 

Your design matrix is on the order of 160 MB, so shouldn't be a
problem with a GB-sized workspace. However, three copies of it will
brush against 512 MB, and it's not unlikely to have that many copies
around. 
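Peter's size estimate can be checked with a hedged back-of-envelope sketch; the 33-column figure is an assumption (intercept + 30 main effects + 2 interaction columns), not something stated in the report:

```r
# Back-of-envelope check of the sizes quoted above (assumed column count:
# intercept + 30 main effects + 2 interaction columns = 33).
rows <- 650000
cols <- 33
one_copy_mb <- rows * cols * 8 / 2^20   # doubles are 8 bytes each
one_copy_mb        # ~163.7 MB: "on the order of 160 MB"
3 * one_copy_mb    # ~491 MB: three copies "brush against 512 MB"
```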


 
> -Derek
> 
> 
> On 11/6/06, Kasper Daniel Hansen <[EMAIL PROTECTED]> wrote:
> >
> > It would be helpful to produce a script that reproduces the error on
> > your system. And include details on the size of your data set and
> > what you are doing with it. It is unclear what function is actually
> > causing the error and such. Really, in order to do something about it
> > you need to show how to actually obtain the error.
> >
> > To my knowledge nothing _major_ has happened with the memory
> > consumption, but of course R could use slightly more memory for
> > specific purposes.
> >
> > But chances are that this is not really memory related but more
> > related to the functions you are using - perhaps a bug or perhaps a
> > user error.
> >
> > Kasper
> >
> > On Nov 6, 2006, at 10:20 AM, Derek Stephen Elmerick wrote:
> >
> > > thanks for the friendly reply. i think my description was fairly
> > > clear: i
> > > import a large dataset and run a model. using the same dataset, the
> > > process worked previously and it doesn't work now. if the new
> > > version of R
> > > requires more memory and this compromises some basic data analyses,
> > > i would
> > > label this as a bug. if this memory issue was mentioned in the
> > > documentation, then i apologize. this email was clearly not well
> > > received,
> > > so if there is a more appropriate place to post these sort of
> > > questions,
> > > that would be helpful.
> > >
> > > -derek
> > >
> > >
> > >
> > >
> > > On 06 Nov 2006 18:20:33 +0100, Peter Dalgaard
> > > <[EMAIL PROTECTED]>
> > > wrote:
> > >>
> > >> [EMAIL PROTECTED] writes:
> > >>
> > >>> Full_Name: Derek Elmerick
> > >>> Version: 2.4.0
> > >>> OS: Windows XP
> > >>> Submission from: (NULL) ( 38.117.162.243)
> > >>>
> > >>>
> > >>>
> > >>> hello -
> > >>>
> > >>> i have some code that i run regularly using R version 2.3.x . the
> > >>> final
> > >> step of
> > >>> the code is to build a multinomial logit model. the dataset is
> > >>> large;
> > >> however, i
> > >>> have not had issues in the past. i just installed the 2.4.0
> > >>> version of R
> > >> and now
> > >>> have memory allocation issues. to verify, i ran the code again
> > >>> against
> > >> the 2.3
> > >>> version and no problems. since i have set the memory limit to the
> > >>> max
> > >> size, i
> > >>> have no alternative but to downgrade to the 2.3 version. thoughts?
> > >>
> > >> And what do you expect the maintainers to do about it? ( I.e. why are
> > >> you filing a bug report.)
> > >>
> > >> You give absolutely no handle on what the cause of the problem might
> > >> be, or even to reproduce it. It may be a bug, or maybe just R
> > >> requiring more memory to run than previously.
> > >>
> > >> --
> > >>   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
> > >> c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
> > >> (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)
> > >> 35327918
> > >> ~~ - ([EMAIL PROTECTED])  FAX: (+45)
> > >> 35327907
> > >>
> > >
> > >   [[alternative HTML version deleted]]
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> >
> 
> 
> 
> > ##
> > ### R 2.4.0
> > ##
> > 
> > rm(list=ls(all=TRUE))
> > memory.limit(size=4095)
> NULL
> > 
> > clnt=read.table(file="K


Re: [Rd] allocVector bug ?

2006-11-06 Thread Vladimir Dergachev

Hi Luke, 

   Thank you for the patient reply ! 

   I have looked into the issue a little deeper, comments below:

On Thursday 02 November 2006 11:26 pm, Luke Tierney wrote:
> On Wed, 1 Nov 2006, Vladimir Dergachev wrote:
> > Hi all,
> >
> >  I was looking at the following piece of code in src/main/memory.c,
> > function allocVector :
> >
> >if (size <= NodeClassSize[1]) {
> > node_class = 1;
> > alloc_size = NodeClassSize[1];
> >}
> >else {
> > node_class = LARGE_NODE_CLASS;
> > alloc_size = size;
> > for (i = 2; i < NUM_SMALL_NODE_CLASSES; i++) {
> > if (size <= NodeClassSize[i]) {
> > node_class = i;
> > alloc_size = NodeClassSize[i];
> > break;
> > }
> > }
> >}
> >
> >
> > It appears that for LARGE_NODE_CLASS the variable alloc_size should not
> > be size, but something far less as we are not using vector heap, but
> > rather calling malloc directly in the code below (and from discussions I
> > read on this mailing list I think that these two are different - please
> > let me know if I am wrong).
> >
> > So when allocate a large vector the garbage collector goes nuts trying to
> > find all that space which is not going to be needed after all.
>
> This is as intended, not a bug. The garbage collector does not "go
> nuts" -- it is doing a garbage collection that may release memory in
> advance of making a large allocation.  The size of the current
> allocation request is used as part of the process of deciding when to
> satisfy an allocation by malloc (of a single large noda or a page) and
> when to first do a gc.  It is essential to do this for large
> allocations as well to keep the memory footprint down and help reduce
> fragmentation.

I generally agree with this; however, I believe the current logic breaks down 
for large allocation sizes, and my code ends up spending 70% (and up) of 
its time spinning inside the garbage collector (I ran oprofile to observe 
what is going on).

I do realize that garbage collection is not an easy problem  and that hardware 
and software environments change - my desire is simply to have a version of R 
that is usable for the problems I am dealing with as, aside from slowdown 
with large vector sizes, I find R a very capable tool.

I would greatly appreciate if you could comment on the following observations:

  1. The time spent during a single garbage collector run grows with the number 
of nodes - from looking at the code I believe it is linear, but I am not 
certain.

  2. In my case the data.frame contains a few string vectors. These allocate 
lots of CHARSXPs, which are the main cause of the slowdown of each garbage 
collector run. Would you have any suggestions for optimizing this particular 
situation ? 

  3. Any time a data.frame is created, or one performs an attach() operation, 
there is a series of allocations - and if one of them causes memory to expand, 
all the rest will too. 

  I put in a fprintf() statement to show alloc_size, VHEAP_FREE and RV_size 
when allocVector is called (this is done only for node_class == 
LARGE_NODE_CLASS).

  First output snippet is from the time the script starts and tries to create 
data.frame:

alloc_size=128 VHEAP_FREE=604182 R_VSize=786432
alloc_size=88 VHEAP_FREE=660051 R_VSize=786432
alloc_size=88 VHEAP_FREE=659963 R_VSize=786432
alloc_size=4078820 VHEAP_FREE=659874 R_VSize=786432
alloc_size=4078820 VHEAP_FREE=260678 R_VSize=4465461
alloc_size=4078820 VHEAP_FREE=260678 R_VSize=8544282
alloc_size=4078820 VHEAP_FREE=260678 R_VSize=12623103
...
alloc_size=4078820 VHEAP_FREE=260677 R_VSize=271628325
alloc_size=4078820 VHEAP_FREE=260677 R_VSize=275707147

As you can see, VHEAP_FREE() is always smaller than alloc_size, so every large
allocation triggers a collection. Next I did attach(B):
alloc_size=4078820 VHEAP_FREE=1274112 R_VSize=294022636
alloc_size=4078820 VHEAP_FREE=499351 R_VSize=297325768
...
alloc_size=4078820 VHEAP_FREE=602082 R_VSize=568670030
alloc_size=4078820 VHEAP_FREE=602082 R_VSize=572748850
alloc_size=4078820 VHEAP_FREE=602082 R_VSize=576827670
alloc_size=88 VHEAP_FREE=602082 R_VSize=580906490
alloc_size=88 VHEAP_FREE=601915 R_VSize=580906490
alloc_size=88 VHEAP_FREE=601798 R_VSize=580906490
alloc_size=88 VHEAP_FREE=601678 R_VSize=580906490
...
alloc_size=44 VHEAP_FREE=591581 R_VSize=580906490
alloc_size=88 VHEAP_FREE=591323 R_VSize=580906490
alloc_size=44 VHEAP_FREE=591220 R_VSize=580906490

So we have the same behaviour as before - the garbage collector gets run every 
time attach creates a new large vector, but it functions perfectly for smaller 
vector sizes.

Next, I did detach(B) (which freed up memory) followed by "F<-B[,1]":

alloc_size=113 VHEAP_FREE=588448 R_VSize=580906490
alloc_size=618 VHEAP_FREE=588335 R_VSize=580906490
alloc_size=618 VHEAP_FREE=587717 R_VSize=580906490
alloc_size=128 VHEAP_FREE=587099 R_VSize=580906490
alloc_size=88 VHEAP_FREE=586825 R_VSize=580906490
alloc_size=4078820 VHEAP_FREE=586737 R_VSize=580906490
alloc_size=4078820 VHEAP_FREE=284079854 R_VSize=580906490
alloc_size=4078820 VHE

Re: [Rd] memory issues with new release (PR#9344)

2006-11-06 Thread Derek Stephen Elmerick

Peter,

I ran the memory limit function you mention below and both versions provide
the same result:



memory.limit(size=4095)

NULL

memory.limit(NA)

[1] 4293918720



I do have 4 GB RAM on my PC. As a more reproducible form of the test, I
have attached output that uses a randomly generated dataset after fixing the
seed. Same result as last time: works with 2.3.0 and not 2.4.0. I guess the
one caveat here is that I just increased the dataset size until I got the
memory issue with at least one of the R versions. It's okay. No need to
spend more time on this. I really don't mind using the previous version.
Like you mentioned, probably just a function of the new version requiring
more memory.

Thanks,
Derek



On 06 Nov 2006 21:42:04 +0100, Peter Dalgaard <[EMAIL PROTECTED]>
wrote:


"Derek Stephen Elmerick" <[EMAIL PROTECTED]> writes:

> Thanks for the replies. Point taken regarding submission protocol. I
have
> included a text file attachment that shows the R output with version
2.3.0 and
> 2.4.0. A label distinguishing the version is included in the comments.
>
> A quick background on the attached example. The dataset has 650,000
records
> and 32 variables. the response is dichotomous (0/1) and i ran a logistic
> model (i previously mentioned multinomial, but decided to start simple
for
> the example). Covariates in the model may be continuous or categorical,
but
> all are numeric. You'll notice that the code is the same for both
versions;
> however, there is a memory error with the 2.4.0 version. i ran this
several
> times and in different orders to make sure it was not some sort of
hardware
> issue.
>
> If there is some sort of additional output that would be helpful, I can
> provide as well. Or, if there is nothing I can do, that is fine also.

I don't think it was ever possible to request 4GB on XP. The version
difference might be caused by different response to invalid input in
memory.limit(). What does memory.limit(NA) tell you after the call to
memory.limit(4095) in the two versions?

If that is not the reason: What is the *real* restriction of memory on
your system? Do you actually have 4GB in your system (RAM+swap)?

Your design matrix is on the order of 160 MB, so shouldn't be a
problem with a GB-sized workspace. However, three copies of it will
brush against 512 MB, and it's not unlikely to have that many copies
around.



> -Derek
>
>
> On 11/6/06, Kasper Daniel Hansen < [EMAIL PROTECTED]> wrote:
> >
> > It would be helpful to produce a script that reproduces the error on
> > your system. And include details on the size of your data set and
> > what you are doing with it. It is unclear what function is actually
> > causing the error and such. Really, in order to do something about it
> > you need to show how to actually obtain the error.
> >
> > To my knowledge nothing _major_ has happened with the memory
> > consumption, but of course R could use slightly more memory for
> > specific purposes.
> >
> > But chances are that this is not really memory related but more
> > related to the functions you are using - perhaps a bug or perhaps a
> > user error.
> >
> > Kasper
> >
> > On Nov 6, 2006, at 10:20 AM, Derek Stephen Elmerick wrote:
> >
> > > thanks for the friendly reply. i think my description was fairly
> > > clear: i
> > > import a large dataset and run a model. using the same dataset, the
> > > process worked previously and it doesn't work now. if the new
> > > version of R
> > > requires more memory and this compromises some basic data analyses,
> > > i would
> > > label this as a bug. if this memory issue was mentioned in the
> > > documentation, then i apologize. this email was clearly not well
> > > received,
> > > so if there is a more appropriate place to post these sort of
> > > questions,
> > > that would be helpful.
> > >
> > > -derek
> > >
> > >
> > >
> > >
> > > On 06 Nov 2006 18:20:33 +0100, Peter Dalgaard
> > > < [EMAIL PROTECTED]>
> > > wrote:
> > >>
> > >> [EMAIL PROTECTED] writes:
> > >>
> > >>> Full_Name: Derek Elmerick
> > >>> Version: 2.4.0
> > >>> OS: Windows XP
> > >>> Submission from: (NULL) ( 38.117.162.243 )
> > >>>
> > >>>
> > >>>
> > >>> hello -
> > >>>
> > >>> i have some code that i run regularly using R version 2.3.x . the
> > >>> final
> > >> step of
> > >>> the code is to build a multinomial logit model. the dataset is
> > >>> large;
> > >> however, i
> > >>> have not had issues in the past. i just installed the 2.4.0
> > >>> version of R
> > >> and now
> > >>> have memory allocation issues. to verify, i ran the code again
> > >>> against
> > >> the 2.3
> > >>> version and no problems. since i have set the memory limit to the
> > >>> max
> > >> size, i
> > >>> have no alternative but to downgrade to the 2.3 version. thoughts?
> > >>
> > >> And what do you expect the maintainers to do about it? ( I.e. why
are
> > >> you filing a bug report.)
> > >>
> > >> You give absolutely no handle on what the cause of the problem
might
> > >> be, or even to reproduce it.

[Rd] gc()$Vcells < 0 (PR#9345)

2006-11-06 Thread dmaszle
Full_Name: Don Maszle
Version: 2.3.0
OS: x86_64-unknown-linux-gnu
Submission from: (NULL) (206.86.87.3)


# On our new 32 GB x86_64 machine

R : Copyright 2006, The R Foundation for Statistical Computing
Version 2.3.0 (2006-04-24)
ISBN 3-900051-07-0

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> R.version
   _ 
platform   x86_64-unknown-linux-gnu  
arch   x86_64
os linux-gnu 
system x86_64, linux-gnu 
status   
major  2 
minor  3.0   
year   2006  
month  04
day24
svn rev37909 
language   R 
version.string Version 2.3.0 (2006-04-24)

> x<-matrix(nrow=44000,ncol=48000)
> y<-matrix(nrow=44000,ncol=48000)
> z<-matrix(nrow=44000,ncol=48000)
> gc()
              used    (Mb) gc trigger    (Mb) max used    (Mb)
Ncells      177801     9.5     407500    21.8   350000    18.7
Vcells -1126881981 24170.6         NA 24173.4       NA 24170.6
>
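The negative Vcells figure looks consistent with a signed 32-bit overflow of the cell counter; the following arithmetic sketch is my reconstruction of the reported numbers, not a confirmed diagnosis:

```r
# Each 44000 x 48000 logical matrix holds 4-byte elements; Vcells are
# counted in 8-byte units, so one matrix accounts for ~1.056e9 Vcells.
cells_per_matrix <- 44000 * 48000 * 4 / 8
total <- 3 * cells_per_matrix   # ~3.168e9 Vcells, above 2^31 - 1
total - 2^32                    # ~ -1.127e9: the sign and magnitude of
                                # the reported -1126881981
total * 8 / 2^20                # ~24170 Mb, matching the (Mb) column
```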



Re: [Rd] memory issues with new release (PR#9344)

2006-11-06 Thread Peter Dalgaard
"Derek Stephen Elmerick" <[EMAIL PROTECTED]> writes:

> Peter,
> 
> I ran the memory limit function you mention below and both versions provide
> the same result:
> 
> >
> > memory.limit(size=4095)
> NULL
> > memory.limit(NA)
> [1] 4293918720
> >
> I do have 4GB ram on my PC. As a more reproducible form of the test, I
> have attached output that uses a randomly generated dataset after fixing the
> seed. Same result as last time: works with 2.3.0 and not 2.4.0. I guess the
> one caveat here is that I just increased the dataset size until I got the
> memory issue with at least one of the R versions. It's okay. No need to
> spend more time on this. I really don't mind using the previous version.
> Like you mentioned, probably just a function of the new version requiring
> more memory.


Hmm, you might want to take a final look at the Windows FAQ 2.9. I am
still not quite convinced you're really getting more than the default
1.5 GB.

Also, how much can you increase the problem size on 2.3.0 before it
breaks? If you can only go to say 39 or 40 variables, then there's
probably not much we can do. If it is orders of magnitude, then we may
have a real bug (or not: sometimes we fix bugs resulting from things
not being duplicated when they should have been, the fixed code then
uses more memory than the unfixed code.)

 
> Thanks,
> Derek
> 
> 
> 
> On 06 Nov 2006 21:42:04 +0100, Peter Dalgaard <[EMAIL PROTECTED]>
> wrote:
> >
> > "Derek Stephen Elmerick" <[EMAIL PROTECTED]> writes:
> >
> > > Thanks for the replies. Point taken regarding submission protocol. I
> > have
> > > included a text file attachment that shows the R output with version
> > 2.3.0and
> > > 2.4.0. A label distinguishing the version is included in the comments.
> > >
> > > A quick background on the attached example. The dataset has 650,000
> > records
> > > and 32 variables. the response is dichotomous (0/1) and i ran a logistic
> > > model (i previously mentioned multinomial, but decided to start simple
> > for
> > > the example). Covariates in the model may be continuous or categorical,
> > but
> > > all are numeric. You'll notice that the code is the same for both
> > versions;
> > > however, there is a memory error with the 2.3.0 version. i ran this
> > several
> > > times and in different orders to make sure it was not some sort of
> > hardware
> > > issue.
> > >
> > > If there is some sort of additional output that would be helpful, I can
> > > provide as well. Or, if there is nothing I can do, that is fine also.
> >
> > I don't think it was ever possible to request 4GB on XP. The version
> > difference might be caused by different response to invalid input in
> > memory.limit(). What does memory.limit(NA) tell you after the call to
> > memory.limit(4095) in the two versions?
> >
> > If that is not the reason: What is the *real* restriction of memory on
> > your system? Do you actually have 4GB in your system (RAM+swap)?
> >
> > Your design matrix is on the order of 160 MB, so shouldn't be a
> > problem with a GB-sized workspace. However, three copies of it will
> > brush against 512 MB, and it's not unlikely to have that many copies
> > around.
> >
> >
> >
> > > -Derek
> > >
> > >
> > > On 11/6/06, Kasper Daniel Hansen < [EMAIL PROTECTED]> wrote:
> > > >
> > > > It would be helpful to produce a script that reproduces the error on
> > > > your system. And include details on the size of your data set and
> > > > what you are doing with it. It is unclear what function is actually
> > > > causing the error and such. Really, in order to do something about it
> > > > you need to show how to actually obtain the error.
> > > >
> > > > To my knowledge nothing _major_ has happened with the memory
> > > > consumption, but of course R could use slightly more memory for
> > > > specific purposes.
> > > >
> > > > But chances are that this is not really memory related but more
> > > > related to the functions your are using - perhaps a bug or perhaps a
> > > > user error.
> > > >
> > > > Kasper
> > > >
> > > > On Nov 6, 2006, at 10:20 AM, Derek Stephen Elmerick wrote:
> > > >
> > > > > thanks for the friendly reply. i think my description was fairly
> > > > > clear: i
> > > > > import a large dataset and run a model. using the same dataset, the
> > > > > process worked previously and it doesn't work now. if the new
> > > > > version of R
> > > > > requires more memory and this compromises some basic data analyses,
> > > > > i would
> > > > > label this as a bug. if this memory issue was mentioned in the
> > > > > documentation, then i apologize. this email was clearly not well
> > > > > received,
> > > > > so if there is a more appropriate place to post these sort of
> > > > > questions,
> > > > > that would be helpful.
> > > > >
> > > > > -derek
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On 06 Nov 2006 18:20:33 +0100, Peter Dalgaard
> > > > > < [EMAIL PROTECTED]>
> > > > > wrote:
> > > > >>
> > > > >> [EMAIL PROTECTED] writes:
> > > > >>
> > > > >

Re: [Rd] gc()$Vcells < 0 (PR#9345)

2006-11-06 Thread Peter Dalgaard
[EMAIL PROTECTED] writes:

> Full_Name: Don Maszle
> Version: 2.3.0
> OS: x86_64-unknown-linux-gnu
> Submission from: (NULL) (206.86.87.3)
> 
> 
> # On our new 32 GB x86_64 machine
> 
> R : Copyright 2006, The R Foundation for Statistical Computing
> Version 2.3.0 (2006-04-24)
> ISBN 3-900051-07-0
> 
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
> 
>   Natural language support but running in an English locale
> 
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
> 
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
> 
> > R.version
>_ 
> platform   x86_64-unknown-linux-gnu  
> arch   x86_64
> os linux-gnu 
> system x86_64, linux-gnu 
> status   
> major  2 
> minor  3.0   
> year   2006  
> month  04
> day24
> svn rev37909 
> language   R 
> version.string Version 2.3.0 (2006-04-24)
> 
> > x<-matrix(nrow=44000,ncol=48000)
> > y<-matrix(nrow=44000,ncol=48000)
> > z<-matrix(nrow=44000,ncol=48000)
> > gc()
> >              used    (Mb) gc trigger    (Mb) max used    (Mb)
> > Ncells     177801     9.5     407500    21.8              3518.7
> > Vcells -1126881981 24170.6         NA 24173.4           NA 24170.6

Sorry, can't reproduce that. Please send 32GB machine ;-)

You might want to retry with 2.4.0(-patched) for good measure, though.

-- 
   O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] gc()$Vcells < 0 (PR#9345)

2006-11-06 Thread Vladimir Dergachev
On Monday 06 November 2006 6:12 pm, [EMAIL PROTECTED] wrote:
> version.string Version 2.3.0 (2006-04-24)
>
> > x<-matrix(nrow=44000,ncol=48000)
> > y<-matrix(nrow=44000,ncol=48000)
> > z<-matrix(nrow=44000,ncol=48000)
> > gc()
>
>              used    (Mb) gc trigger    (Mb) max used    (Mb)
> Ncells     177801     9.5     407500    21.8              3518.7
> Vcells -1126881981 24170.6         NA 24173.4           NA 24170.6
>

Happens to me with versions 2.4.0 and 2.3.1. The culprit is this line
in src/main/memory.c:

INTEGER(value)[1] = R_VSize - VHEAP_FREE();

Since the amount used is greater than 4G and INTEGER is 32 bits wide
(even on 64-bit machines), this returns (harmless) nonsense.

The megabyte value nearby is correct and gc trigger and max used fields are 
marked as NA already.

  best

 Vladimir Dergachev

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] data frame subscription operator

2006-11-06 Thread Vladimir Dergachev

Hi all, 

   I was looking at the data frame subscription operator (attached in the end 
of this e-mail) and got puzzled by the following line:

class(x) <- attr(x, "row.names") <- NULL

This appears to set the class and row.names attributes of the incoming data
frame to NULL. So far I have not been able to figure out why this is
necessary - could anyone help?

The reason I am looking at it is that changing attributes forces duplication 
of the data frame and this is the largest cause of slowness of data.frames in 
general.
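The duplication being described can be observed directly with tracemem().
A short sketch, added for illustration (not from the original post; it
requires an R build with memory profiling enabled):

```r
# A small data frame; size is irrelevant for seeing the copies,
# though the cost discussed above only bites for large ones.
df <- data.frame(a = 1:5, b = letters[1:5])

# tracemem() prints a message each time R duplicates the object.
tracemem(df)

# Each attribute assignment below - the same pattern as the line in
# `[.data.frame` - triggers a duplication of df.
class(df) <- NULL
attr(df, "row.names") <- NULL

# After stripping the class, df behaves as a plain list, so later
# subsetting such as df[j] uses internal list subsetting rather than
# re-dispatching to `[.data.frame`.
stopifnot(is.list(df))
```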

   thank you very much !

Vladimir Dergachev


> `[.data.frame`
function (x, i, j, drop = if (missing(i)) TRUE else length(cols) == 1)
{
    mdrop <- missing(drop)
    Narg <- nargs() - (!mdrop)
    if (Narg < 3) {
        if (!mdrop)
            warning("drop argument will be ignored")
        if (missing(i))
            return(x)
        if (is.matrix(i))
            return(as.matrix(x)[i])
        y <- NextMethod("[")
        nm <- names(y)
        if (!is.null(nm) && any(is.na(nm)))
            stop("undefined columns selected")
        if (any(duplicated(nm)))
            names(y) <- make.unique(nm)
        return(structure(y, class = oldClass(x), row.names = attr(x,
            "row.names")))
    }
    rows <- attr(x, "row.names")
    cols <- names(x)
    cl <- oldClass(x)
    class(x) <- attr(x, "row.names") <- NULL
    if (missing(i)) {
        if (!missing(j))
            x <- x[j]
        cols <- names(x)
        if (any(is.na(cols)))
            stop("undefined columns selected")
    }
    else {
        if (is.character(i))
            i <- pmatch(i, as.character(rows), duplicates.ok = TRUE)
        rows <- rows[i]
        if (!missing(j)) {
            x <- x[j]
            cols <- names(x)
            if (any(is.na(cols)))
                stop("undefined columns selected")
        }
        for (j in seq_along(x)) {
            xj <- x[[j]]
            x[[j]] <- if (length(dim(xj)) != 2)
                xj[i]
            else xj[i, , drop = FALSE]
        }
    }
    if (drop) {
        drop <- FALSE
        n <- length(x)
        if (n == 1) {
            x <- x[[1]]
            drop <- TRUE
        }
        else if (n > 1) {
            xj <- x[[1]]
            nrow <- if (length(dim(xj)) == 2)
                dim(xj)[1]
            else length(xj)
            if (!mdrop && nrow == 1) {
                drop <- TRUE
                names(x) <- cols
                attr(x, "row.names") <- NULL
            }
        }
    }
    if (!drop) {
        names(x) <- cols
        if (any(is.na(rows) | duplicated(rows))) {
            rows[is.na(rows)] <- "NA"
            rows <- make.unique(rows)
        }
        if (any(duplicated(nm <- names(x))))
            names(x) <- make.unique(nm)
        attr(x, "row.names") <- rows
        class(x) <- cl
    }
    x
}


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] rbind with auto-row-named data.frame + list (PR#9346)

2006-11-06 Thread Mark . Bravington
There's a problem new to R2.4.0 when rbinding an auto-row-named
data.frame to a list:

> rbind( data.frame( x=1), list( x=2))
Error in attr(value, "row.names") <- rlabs : 
row names must be 'character' or 'integer', not 'double'

Works OK with 2 data.frames or 2 lists.
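A self-contained sketch of the three cases, added for illustration; the
as.data.frame() coercion in the last line is a plausible workaround, not
something suggested in the original report:

```r
# The failing case from the report, wrapped in try() so the script
# continues past the error raised under R 2.4.0.
print(try(rbind(data.frame(x = 1), list(x = 2))))

# The two cases the report says work.
rbind(data.frame(x = 1), data.frame(x = 2))
rbind(list(x = 1), list(x = 2))

# Coercing the list to a data.frame first sidesteps the problem.
rbind(data.frame(x = 1), as.data.frame(list(x = 2)))
```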

Mark Bravington
CSIRO Mathematical & Information Sciences
Marine Laboratory
Castray Esplanade
Hobart 7001
TAS

ph (+61) 3 6232 5118
fax (+61) 3 6232 5012
mob (+61) 438 315 623

--please do not edit the information below--

Version:
 platform = i386-pc-mingw32
 arch = i386
 os = mingw32
 system = i386, mingw32
 status = 
 major = 2
 minor = 4.0
 year = 2006
 month = 10
 day = 03
 svn rev = 39566
 language = R
 version.string = R version 2.4.0 (2006-10-03)

Windows XP Professional (build 2600) Service Pack 2.0

Locale:
LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MON
ETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252

Search Path:
 .GlobalEnv, package:methods, package:stats, package:graphics,
package:grDevices, package:utils, package:datasets, Autoloads,
package:base

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] lme4 install, was Re: [R] Which executable is associated with R CMD INSTALL?

2006-11-06 Thread Prof Brian Ripley
On Mon, 6 Nov 2006, Douglas Bates wrote:

[...]

> However, I think it exposes a problem with R CMD INSTALL.  If there is
> a LinkingTo: directive in the DESCRIPTION file and the package's
> include directory cannot be found I think that R CMD INSTALL should
> abort.

That was a deliberate design decision: LinkingTo means 'add to the C 
include path if found'.  I think rather a package that cannot work without 
the link should ensure that the compilation fails cleanly -- I was trying 
to allow the package writer the freedom to ignore the link if it was not 
available.

-- 
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel