Re: [Rd] S3 methods with full name in documentation?

2012-03-21 Thread Spencer Graves

Hi, Duncan:


On 3/20/2012 6:43 PM, Duncan Murdoch wrote:

On 12-03-20 4:40 PM, Spencer Graves wrote:

Hello:


Is there a recommended way to inform "R CMD check" that a
function like "as.numeric" is NOT a method for the S3 generic function
"as" for objects of class "numeric"?


I ask, because I'm getting "NOTE" messages for many function
names like this (e.g., "density.fd" in the "fda" package):  If there
were a way to tell "R CMD check" that a function name is NOT a method
for an S3 generic, it would make it easier for me to see the messages I
should take seriously.


I don't think so.  The problem you are seeing is that "density" is a 
generic, so density.fd looks like a method for it.  In fact, if you 
define an object of class "fd" and call density() on it while fda is 
attached, your density.fd function will be called:


> x <- structure(1, class="fd")
> density(x)
Error in inherits(WfdParobj, "fdPar") :
  argument "WfdParobj" is missing, with no default

So in fact, density.fd *is* an S3 method, even though you didn't know it.



  Well, yes, I guess I knew that, but I also knew that "density.fd" 
should not be called with a first argument of class "fd", any more than 
"as.numeric" should be treated as the "numeric" method for the presumed 
generic function "as".  [Yes, I know that "as" does not use method 
dispatch via "UseMethod" the way "density" does, so "as.numeric" is 
different from that perspective.]




Nowadays every package has a namespace, and eventually maybe S3 
methods that aren't declared in the namespace as S3 methods won't be 
recognized as S3 methods.  But for now, the only real way around these 
warnings is not to name things in a way that makes them appear to be 
S3 methods.
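For reference, the declaration Duncan mentions lives in the package's NAMESPACE file. A minimal sketch (using the fda example; treat the details as illustrative):

```
# NAMESPACE fragment (sketch)
S3method(density, fd)   # register density.fd as the S3 method for density()
export(density.fd)      # optionally also export it for direct calls
```

A function whose name merely contains a dot would be exported with export() alone and never declared via S3method(); the point of the thread is that, as of this writing, "R CMD check" does not yet use that distinction to suppress the NOTE.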



  Thanks.  I asked, because I thought there might be something one 
could do in a NAMESPACE to avoid this.  "density.fd" has been around for 
some time, and picking a different name for it (and similar functions in 
the fda package) would break too much existing code and impose an 
unacceptable burden on long-time fda users to justify that option -- 
similar to "as.numeric".



  Thanks, again.


  Best Wishes,
  Spencer


Duncan Murdoch


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] enableJIT() and internal R completions (was: [ESS-bugs] ess-mode 12.03; ess hangs emacs)

2012-03-21 Thread Vitalie Spinu

Hello, 

The JIT compiler interferes with internal R completions:

compiler::enableJIT(2)
utils:::functionArgs("density", '')

gives:

utils:::functionArgs("density", '')
Note: no visible global function definition for 'bw.nrd0' 
Note: no visible global function definition for 'bw.nrd' 
Note: no visible global function definition for 'bw.ucv' 
Note: no visible global function definition for 'bw.bcv' 
Note: no visible global function definition for 'bw.SJ' 
Note: no visible global function definition for 'bw.SJ' 
Note: no visible binding for global variable 'C_massdist' 
Note: no visible global function definition for 'dnorm' 
Note: no visible global function definition for 'fft' 
Note: no visible global function definition for 'fft' 
Note: no visible global function definition for 'fft' 
Note: no visible global function definition for 'approx' 
[... the same twelve notes are printed twice more ...]
 [1] "x="          "...="        "bw="         "adjust="     "kernel="     "weights="
 [7] "window="     "width="      "give.Rkern=" "n="          "from="       "to="
[13] "cut="        "na.rm="


Disabling the JIT warnings removes the notes, but the call remains
extremely slow, as the compiler still processes all the exceptions.
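For anyone who just wants to silence the notes, the compiler package exposes options for this. A sketch (assuming compiler::setCompilerOptions() and its suppressAll option; check ?compile for the options your R version supports):

```r
library(compiler)
enableJIT(2)
# Suppress the "no visible ..." notes emitted while compiling
setCompilerOptions(suppressAll = TRUE)
utils:::functionArgs("density", "")   # quiet now, though still slow
```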

Thanks, 
Vitalie.

 Sam Steingold on Tue, 20 Mar 2012 13:09:07 -0400 wrote:

  > ess hangs emacs completely.

  > I just created a brand new Rprofile

  > options(error = utils::recover)
  > library(compiler)
  > compiler::enableJIT(3)


  > and now and I start R and start typing I see this:
  >> options(STERM=.)
  >> matrix(nrow=|)
  > ("|" stands for my cursor)
  > and "nrow: x" in the minibuffer.
  > that's it.
  > emacs is stuck.

  > Program received signal SIGTSTP, Stopped (user).
  > 0x005a115a in exec_byte_code (bytestr=8748337, vector=8748373, maxdepth=<optimized out>, args_template=11970946, nargs=<optimized out>, args=<optimized out>) at /home/sds/src/emacs/trunk/src/bytecode.c:487
  > 487   stack.pc = stack.byte_string_start = SDATA (bytestr);
  > (gdb) where
  > #0  0x005a115a in exec_byte_code (bytestr=8748337, vector=8748373, maxdepth=<optimized out>, args_template=11970946, nargs=<optimized out>, args=<optimized out>) at /home/sds/src/emacs/trunk/src/bytecode.c:487
  > #1  0x00569111 in funcall_lambda (fun=8748253, nargs=<optimized out>, arg_vector=0x7fffb0b8) at /home/sds/src/emacs/trunk/src/eval.c:3233
  > #2  0x0056948b in Ffuncall (nargs=3, args=0x7fffb0b0) at /home/sds/src/emacs/trunk/src/eval.c:3063
  > #3  0x005a1d46 in exec_byte_code (bytestr=<optimized out>, vector=<optimized out>, maxdepth=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at /home/sds/src/emacs/trunk/src/bytecode.c:785
  > #4  0x00569111 in funcall_lambda (fun=20119685, nargs=<optimized out>, arg_vector=0x7fffb278) at /home/sds/src/emacs/trunk/src/eval.c:3233
  > #5  0x0056948b in Ffuncall (nargs=4, args=0x7fffb270) at /home/sds/src/emacs/trunk/src/eval.c:3063
  > #6  0x005a1d46 in exec_byte_code (bytestr=<optimized out>, vector=<optimized out>, maxdepth=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at /home/sds/src/emacs/trunk/src/bytecode.c:785
  > #7  0x00569111 in funcall_lambda (fun=20126229, nargs=<optimized out>, arg_vector=0x7fffb448) at /home/sds/src/emacs/trunk/src/eval.c:3233
  > #8  0x0056948b in Ffuncall (nargs=6, args=0x7fffb440) at /home/sds/src/emacs/trunk/src/eval.c:3063
  > #9  0x005a1d46 in exec_byte_code (bytestr=<optimized out>, vector=<optimized out>, maxdepth=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at /home/sds/src/emacs/trunk/src/bytecode.c:785
  > #10 0x00569111 in funcall_lambda (fun=19976549, nargs=<optimized out>, arg_vector=0x7fffb618) at /home/sds/src/emacs/trunk/src/eval.c:3233
  > #11 0x005694

[Rd] enableJIT() prohibits usual R debugging

2012-03-21 Thread Vitalie Spinu

Hi, 

Browser doesn't work properly with the compiler enabled. It might be
intended behavior, but it's not documented.


compiler::enableJIT(1)
foo <- function(){
browser()
cat("here\n")
}



Browser doesn't stop, and I am getting:

> foo()
Called from: foo()
Browse[1]> here
> 

Thanks, 
Vitalie.

> sessionInfo()
R version 2.14.2 (2012-02-29)
Platform: i686-pc-linux-gnu (32-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C   LC_TIME=en_US.UTF-8  
 
 [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=C LC_NAME=C  LC_ADDRESS=C 
 
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C  
 

attached base packages:
[1] compiler  stats graphics  grDevices utils datasets  methods   base  
   

loaded via a namespace (and not attached):
[1] tools_2.14.2
>



Re: [Rd] enableJIT() prohibits usual R debugging

2012-03-21 Thread Feng Li

FYI

https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=14594


On 03/21/2012 10:19 AM, Vitalie Spinu wrote:


Hi,

Browser doesn't work properly with the compiler enabled. It might be
intended behavior, but it's not documented.


compiler::enableJIT(1)
foo<- function(){
 browser()
 cat("here\n")
}



Browser doesn't stop, and I am getting:


foo()

Called from: foo()
Browse[1]>  here




Thanks,
Vitalie.


sessionInfo()

R version 2.14.2 (2012-02-29)
Platform: i686-pc-linux-gnu (32-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C   LC_TIME=en_US.UTF-8
  [4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=C LC_NAME=C  LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] compiler  stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.14.2






--
Feng Li
Department of Statistics
Stockholm University
SE-106 91 Stockholm, Sweden
http://feng.li/



Re: [Rd] enableJIT() prohibits usual R debugging

2012-03-21 Thread luke-tierney

I can't reproduce this in either 2.14.1 or R-devel.

luke

On Wed, 21 Mar 2012, Vitalie Spinu wrote:



Hi,

Browser doesn't work properly with the compiler enabled. It might be
intended behavior, but it's not documented.


compiler::enableJIT(1)
foo <- function(){
   browser()
   cat("here\n")
}



Browser doesn't stop, and I am getting:


foo()

Called from: foo()
Browse[1]> here




Thanks,
Vitalie.


sessionInfo()

R version 2.14.2 (2012-02-29)
Platform: i686-pc-linux-gnu (32-bit)

locale:
[1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C   LC_TIME=en_US.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8
LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=C LC_NAME=C  LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] compiler  stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tools_2.14.2







--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu



Re: [Rd] enableJIT() prohibits usual R debugging

2012-03-21 Thread Vitalie Spinu
 
 on Wed, 21 Mar 2012 07:46:21 -0500 wrote:

  > I can't reproduce this in either 2.14.1 or R-devel.

Hmm ... I cannot reproduce it anymore, neither with the latest R-devel nor
with 2.14.2. Some local glitch or something ...

Vitalie.



[Rd] uncompressed saves warning

2012-03-21 Thread Michael Friendly

[Env:  Windows XP Pro / R 2.14.1 / StatET / R-Forge]

A package of mine now generates a Warning under R 2.15.0 beta on CRAN 
checks:


* checking data for ASCII and uncompressed saves ... WARNING

  Note: significantly better compression could be obtained
by using R CMD build --resave-data
                 old_size new_size compress
  gfrance.rda      300Kb    179Kb       xz
  gfrance85.rda    295Kb    176Kb       xz

What is the equivalent R command to compress these files in my project tree?

--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:   http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA



Re: [Rd] enableJIT() prohibits usual R debugging

2012-03-21 Thread Vitalie Spinu
 Vitalie Spinu on Wed, 21 Mar 2012 14:39:52 +0100 wrote:

 
 on Wed, 21 Mar 2012 07:46:21 -0500 wrote:

  >> I can't reproduce this in either 2.14.1 or R-devel.

  > Hm .. I cannot reproduce it, nor with the latest R-devel, nor with 2.14.2
  > anymore. Some local glitch or something ...

Instead, I can reproduce a similar problem to the one Feng pointed out:

compiler::enableJIT(1)
foo <- function(){
browser()
cat("here\n")
cat("here\n")
}

foo()

and then "n RET". Everything is skipped. 

If this is the intended behavior then a very loud note would be really
welcome in the "compile" help page.

Thanks, 
Vitalie.

> version
   _ 
platform   i686-pc-linux-gnu 
arch   i686  
os linux-gnu 
system i686, linux-gnu   
status Under development (unstable)  
major  2 
minor  16.0  
year   2012  
month  03
day20
svn rev58793 
language   R 
version.string R Under development (unstable) (2012-03-20 r58793)
>



Re: [Rd] enableJIT() prohibits usual R debugging

2012-03-21 Thread luke-tierney

On Wed, 21 Mar 2012, Vitalie Spinu wrote:


Vitalie Spinu on Wed, 21 Mar 2012 14:39:52 +0100 wrote:




on Wed, 21 Mar 2012 07:46:21 -0500 wrote:


 >> I can't reproduce this in either 2.14.1 or R-devel.

 > Hmm ... I cannot reproduce it anymore, neither with the latest R-devel nor
 > with 2.14.2. Some local glitch or something ...

Instead, I can reproduce a similar problem to the one Feng pointed out:

compiler::enableJIT(1)
foo <- function(){
   browser()
   cat("here\n")
   cat("here\n")
}

foo()

and then "n RET". Everything is skipped.


Then the function call continues as it would for 'c' -- you can't
single-step through compiled code (and debugging a compiled function
switches to the interpreted version for that reason). I thought I had
put a note about this in ?browser and ?debug but apparently not --
will do soon. Thanks for pointing this out.
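In practice this suggests two workarounds, sketched below (assuming the standard compiler and base debugging tools, per the explanation above):

```r
# Workaround 1: switch the JIT off while debugging
compiler::enableJIT(0)

# Workaround 2: use debug(), which runs the interpreted version of the
# function, so single-stepping with 'n' works again
foo <- function() {
  browser()
  cat("here\n")
  cat("here\n")
}
debug(foo)
foo()
```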

luke



If this is the intended behavior then a very loud note would be really
welcome in the "compile" help page.

Thanks,
Vitalie.


version

  _
platform   i686-pc-linux-gnu
arch   i686
os linux-gnu
system i686, linux-gnu
status Under development (unstable)
major  2
minor  16.0
year   2012
month  03
day20
svn rev58793
language   R
version.string R Under development (unstable) (2012-03-20 r58793)






--
Luke Tierney
Chair, Statistics and Actuarial Science
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa  Phone: 319-335-3386
Department of Statistics andFax:   319-335-3017
   Actuarial Science
241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu



Re: [Rd] uncompressed saves warning

2012-03-21 Thread Uwe Ligges



On 21.03.2012 14:58, Michael Friendly wrote:

[Env: Windows XP Pro / R 2.14.1 / StatET / R-Forge]

A package of mine now generates a Warning under R 2.15.0 beta on CRAN
checks:

* checking data for ASCII and uncompressed saves ... WARNING

Note: significantly better compression could be obtained
by using R CMD build --resave-data
              old_size new_size compress
gfrance.rda      300Kb    179Kb       xz
gfrance85.rda    295Kb    176Kb       xz

What is the equivalent R command to compress these files in my project
tree?



Michael,

if you use
R CMD build --resave-data
to build the tar archive, the versions therein are recompressed.

Otherwise, you can also open the files and resave them via save() and 
appropriate arguments.


Or use resaveRdaFiles() in package tools to run it on a whole folder 
automatically.
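Concretely, a sketch of both suggestions (the paths, and the assumption that each .rda file holds an object of the same name, are illustrative):

```r
# Recompress all .rda files under the package's data directory
tools::resaveRdaFiles("mypackage/data", compress = "xz")

# Or resave one file by hand with explicit compression
load("mypackage/data/gfrance.rda")
save(gfrance, file = "mypackage/data/gfrance.rda", compress = "xz")
```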


Best,
uwe



Re: [Rd] overriding "summary.default" or "summary.data.frame". How?

2012-03-21 Thread Uwe Ligges

Simple answer: Never ever override R base functionality.

Best,
Uwe Ligges





On 20.03.2012 16:24, Paul Johnson wrote:

I suppose everybody who makes a package for the first time thinks "I
can change anything!" and then runs into this same question. Has
anybody written out information on how a package can override
functions in R base in the R 2.14 (mandatory NAMESPACE era)?

Suppose I want to alphabetize variables in a summary.data.frame, or
return the standard deviation with the mean in summary output.  I'm
pasting in a working example below.  It has new "summary.factor"
method. It also has a function summarize that I might like to use in
place of summary.data.frame.

How would my new methods "drop on top" of R base functions?  It
appears my functions (summarizeFactors) can find my summary.factor,
but R's own summary uses its own summary.factor.


## summarizeNumerics takes a data frame or matrix, scans the columns
## to select only the numeric variables.  By default it alphabetizes
## the columns (use alphaSort = FALSE to stop that). It then
## calculates the quantiles for each variable, as well as the mean,
## standard deviation, and variance, and then packs those results into
## a matrix. The main benefits from this compared to R's default
## summary are 1) more summary information is returned for each
## variable, and 2) the results are returned in a matrix that is easy
## to use in further analysis.
summarizeNumerics <- function(dat, alphaSort = TRUE, digits = max(3,
getOption("digits") - 3)){
   if (!is.data.frame(dat)) dat <- as.data.frame(dat)
   nums <- sapply(dat, is.numeric)
   datn <- dat[ , nums, drop = FALSE]
   if (alphaSort) datn <- datn[ , sort(colnames(datn)), drop = FALSE]
   sumdat <- apply(datn, 2, stats::quantile, na.rm = TRUE)
   sumdat <- rbind(sumdat, mean = apply(datn, 2, mean, na.rm = TRUE))
   sumdat <- rbind(sumdat, sd = apply(datn, 2, sd, na.rm = TRUE))
   sumdat <- rbind(sumdat, var = apply(datn, 2, var, na.rm = TRUE))
   sumdat <- rbind(sumdat, "NA's" = apply(datn, 2, function(x) sum(is.na(x))))
   signif(sumdat, digits)
}


summary.factor <- function(y, numLevels) {
   ## 5 nested functions to be used later

   divr <- function(p = 0){
     ifelse(p > 0 & p < 1, -p * log2(p), 0)
   }
   entropy <- function(p){
     sum(divr(p))
   }
   maximumEntropy <- function(N) -log2(1/N)
   normedEntropy <- function(x) entropy(x) / maximumEntropy(length(x))
   nas <- is.na(y)
   y <- factor(y)
   ll <- levels(y)
   tbl <- table(y)
   tt <- c(tbl)
   names(tt) <- dimnames(tbl)[[1L]]
   o <- sort.list(tt, decreasing = TRUE)
   if (length(ll) > numLevels){
     toExclude <- numLevels:length(ll)
     tt <- c(tt[o[-toExclude]], `(All Others)` = sum(tt[o[toExclude]]),
             `NA's` = sum(nas))
   } else {
     tt <- c(tt[o], `NA's` = sum(nas))
   }
   props <- prop.table(tbl)
   tt <- c(tt, "Entropy" = entropy(props), "NormedEntropy" = normedEntropy(props))
   tt
}


## Takes a data frame or matrix, scans the columns to find the
## variables that are not numeric and keeps them. By default it
## alphabetizes them (alphaSort = FALSE to stop that).  It then treats
## all non-numeric variables as if they were factors, and summarizes
## each in a way that I find useful. In particular, for each factor,
## it provides a table of the most frequently occurring values (the
## top "numLevels" values are represented).  As a diversity indicator,
## it calculates the Entropy and NormedEntropy values for each
## variable.  Note not all of this is original. It combines my code
## and R code from base/summary.R
summarizeFactors <- function(dat = NULL, numLevels = 10, alphaSort =
TRUE, digits = max(3, getOption("digits") - 3))
{

   ## copied from R base summary.R, summary.data.frame
   ncw <- function(x) {
     z <- nchar(x, type = "w")
     if (any(na <- is.na(z))) {
       # FIXME: can we do better
       z[na] <- nchar(encodeString(z[na]), "b")
     }
     z
   }

   if (!is.data.frame(dat)) dat <- as.data.frame(dat)
   ## treat any nonnumeric as a factor
   factors <- sapply(dat, function(x) {!is.numeric(x)})
   ## If only one factor, need drop = FALSE.
   datf <- dat[ , factors, drop = FALSE]
   if (alphaSort) datf <- datf[ , sort(colnames(datf)), drop = FALSE]
   z <- lapply(datf, summary.factor, numLevels = numLevels)
   nv <- length(datf)
   nm <- names(datf)
   lw <- numeric(nv)
   nr <- max(unlist(lapply(z, NROW)))
   for(i in 1L:nv) {
     sms <- z[[i]]
     lbs <- format(names(sms))
     sms <- paste(lbs, ":", format(sms, digits = digits), "  ",
                  sep = "")
     lw[i] <- ncw(lbs[1L])
     length(sms) <- nr
     z[[i]] <- sms
   }
   z <- unlist(z, use.names = TRUE)
   dim(z) <- c(nr, nv)
   if (any(is.na(lw)))
     warning("probably wrong encoding in names(.) of column ",
             paste(which(is.na(lw)), collapse = ", "))
   blanks <- paste(character(max(lw, na.rm = TRUE) + 2L), collapse = " ")
   pad <- floor(lw - ncw(nm)/2)
   nm <- paste(substring(blanks, 1, pad), nm, sep = "")
   dimnames(z) <- list(rep.int("", nr), nm)
   attr(z, "class") <- c("table")
   z
}

#

Re: [Rd] bzip2'ed data under data/

2012-03-21 Thread Prof Brian Ripley

On 19/03/2012 20:25, Sebastian P. Luque wrote:

Hi,

R CMD check PACKAGE_VERSION_tar.gz gives warning:

Files not of a type allowed in a ‘data’ directory:
   ‘tser1.csv.bz2’ ‘tser2.csv.bz2’
Please use e.g. ‘inst/extdata’ for non-R data files

which I didn't expect, based on section 1.1.5 (Data in packages) of the
Writing R Extensions manual:

Tables (`.tab', `.txt', or `.csv' files) can be compressed by
`gzip', `bzip2' or `xz', optionally with additional extension `.gz',
`.bz2' or `.xz'.  However, such files can only be used with R 2.10.0 or
later, and so the package should have an appropriate `Depends' entry in
its DESCRIPTION file.

In this case, I have a Depends: R (>= 2.13.0), and the package was built
with R version 2.15.0 beta (2012-03-16 r58769), Platform:
x86_64-pc-linux-gnu (64-bit), so I don't understand the warning.

Cheers,



Well, the extension is allowed 'optionally' to be .csv.bz2, but that 
does not make it good practice and I would suggest not using it.


But the fact that 'check' picked it up was due to a typo in the code 
'check' uses to specify the types of data() files, corrected since your 
build of R, so I would expect current R-devel or R-pre-release not to 
give the NOTE.  I am not sure whether or not that has any ramifications 
for users of the package with older versions of R, but we know calling 
the compressed file foo.csv would work.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595



Re: [Rd] enableJIT() and internal R completions (was: [ESS-bugs] ess-mode 12.03; ess hangs emacs)

2012-03-21 Thread luke-tierney

The compiler/JIT is behaving as expected. The idea of compiling on
duplicate, which level 2 enables, is to address some idioms where
functions are modified at runtime.  Ideally it would be good to avoid
these idioms, and we may eventually get there, but for now they are an
issue.

In this particular case getAnywhere contains

if (is.function(x))
environment(x) <- baseenv()
x

and the new function is duplicated and compiled in an lapply() call,
but the function doesn't make much sense with the baseenv()
environment, hence all the warnings.  This seems to be done so that
identical() can be used to compare functions and consider functions
that only differ in the environment to be identical. It would probably
be better to do this in another more explicit way -- I'll look into
it.

(This does not occur in 2.14.1 because of a bug, since fixed, by which
lapply() failed to duplicate.)

luke

On Wed, 21 Mar 2012, Vitalie Spinu wrote:



Hello,

The JIT compiler interferes with internal R completions:

compiler::enableJIT(2)
utils:::functionArgs("density", '')

gives:

utils:::functionArgs("density", '')
Note: no visible global function definition for 'bw.nrd0'
Note: no visible global function definition for 'bw.nrd'
Note: no visible global function definition for 'bw.ucv'
Note: no visible global function definition for 'bw.bcv'
Note: no visible global function definition for 'bw.SJ'
Note: no visible global function definition for 'bw.SJ'
Note: no visible binding for global variable 'C_massdist'
Note: no visible global function definition for 'dnorm'
Note: no visible global function definition for 'fft'
Note: no visible global function definition for 'fft'
Note: no visible global function definition for 'fft'
Note: no visible global function definition for 'approx'
[... the same twelve notes are printed twice more ...]
 [1] "x="          "...="        "bw="         "adjust="     "kernel="     "weights="
 [7] "window="     "width="      "give.Rkern=" "n="          "from="       "to="
[13] "cut="        "na.rm="


Disabling the JIT warnings removes the notes, but the call remains
extremely slow, as the compiler still processes all the exceptions.

Thanks,
Vitalie.


Sam Steingold on Tue, 20 Mar 2012 13:09:07 -0400 wrote:


 > ess hangs emacs completely.

 > I just created a brand new Rprofile

 > options(error = utils::recover)
 > library(compiler)
 > compiler::enableJIT(3)


 > and now and I start R and start typing I see this:
 >> options(STERM=.)
 >> matrix(nrow=|)
 > ("|" stands for my cursor)
 > and "nrow: x" in the minibuffer.
 > that's it.
 > emacs is stuck.

 > Program received signal SIGTSTP, Stopped (user).
 > 0x005a115a in exec_byte_code (bytestr=8748337, vector=8748373, maxdepth=<optimized out>, args_template=11970946, nargs=<optimized out>, args=<optimized out>) at /home/sds/src/emacs/trunk/src/bytecode.c:487
 > 487   stack.pc = stack.byte_string_start = SDATA (bytestr);
 > (gdb) where
 > #0  0x005a115a in exec_byte_code (bytestr=8748337, vector=8748373, maxdepth=<optimized out>, args_template=11970946, nargs=<optimized out>, args=<optimized out>) at /home/sds/src/emacs/trunk/src/bytecode.c:487
 > #1  0x00569111 in funcall_lambda (fun=8748253, nargs=<optimized out>, arg_vector=0x7fffb0b8) at /home/sds/src/emacs/trunk/src/eval.c:3233
 > #2  0x0056948b in Ffuncall (nargs=3, args=0x7fffb0b0) at /home/sds/src/emacs/trunk/src/eval.c:3063
 > #3  0x005a1d46 in exec_byte_code (bytestr=<optimized out>, vector=<optimized out>, maxdepth=<optimized out>, args_template=<optimized out>, nargs=<optimized out>, args=<optimized out>) at /home/sds/src/emacs/trunk/src/bytecode.c:785
 > #4  0x00569111 in funcall_lambda (fun=20119685, nargs=<optimized out>, arg_vector=0x7fffb278) at /home/sds/src/emacs/trunk/src/eval.c:3233
 > #

[Rd] Why is there no within.environment function?

2012-03-21 Thread Richard Cotton
If I want to assign some variables into an environment, it seems
natural to do something like

e <- new.env()
within(e,
{
  x <- 1:5
  y <- runif(5)
}
)

This throws an error, since within.environment doesn't exist.  I
realise I can work around it using

as.environment(within(as.list(e),
{
  x <- 1:5
  y <- runif(5)
}
))

Just wondering why I can't use within directly with environments.

--
4dpiecharts.com



Re: [Rd] Why is there no within.environment function?

2012-03-21 Thread William Dunlap
Wouldn't within.environment be identical to with.environment?
  > e <- new.env()
  > with(e, { One <- 1 ; Two <- 2+2i ; Theee <- One + Two })
  > objects(e)
  [1] "One"   "Theee" "Two"
It might make the transition between lists and environments
simpler if within.environment  existed.
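A minimal sketch of what such a method could look like (hypothetical code, not part of base R):

```r
# Hypothetical within.environment: evaluate expr inside the environment
# and return it, by analogy with within.data.frame
within.environment <- function(data, expr, ...) {
  eval(substitute(expr), envir = data)
  data
}

e <- new.env()
within(e, {
  x <- 1:5
  y <- runif(5)
})
ls(e)   # "x" "y"
```

Because an environment is not copied on modification, the method mutates `e` in place rather than returning a changed copy, which is the design wrinkle discussed below.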

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


> -Original Message-
> From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] On 
> Behalf
> Of Richard Cotton
> Sent: Wednesday, March 21, 2012 2:51 PM
> To: r-devel@r-project.org
> Subject: [Rd] Why is there no within.environment function?
> 
> If I want to assign some variables into an environment, it seems
> natural to do something like
> 
> e <- new.env()
> within(e,
> {
>   x <- 1:5
>   y <- runif(5)
> }
> )
> 
> This throws an error, since within.environment doesn't exist.  I
> realise I can work around it using
> 
> as.environment(within(as.list(e),
> {
>   x <- 1:5
>   y <- runif(5)
> }
> ))
> 
> Just wondering why I can't use within directly with environments.
> 
> --
> 4dpiecharts.com
> 


Re: [Rd] Why is there no within.environment function?

2012-03-21 Thread Gabor Grothendieck
On Wed, Mar 21, 2012 at 5:51 PM, Richard Cotton  wrote:
> If I want to assign some variables into an environment, it seems
> natural to do something like
>
> e <- new.env()
> within(e,
>    {
>      x <- 1:5
>      y <- runif(5)
>    }
> )
>
> This throws an error, since within.environment doesn't exist.  I
> realise I can work around it using
>

'with' already does that:

e <- new.env()
with(e, {
   x <- 1.5
   y <- runif(5)
})
ls(e) # lists x and y

Also since proto objects are environments this works:

library(proto)
p <- proto(x = 1.5, y = runif(5))
p$ls() # lists x and y

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [Rd] Why is there no within.environment function?

2012-03-21 Thread peter dalgaard

On Mar 21, 2012, at 23:01 , William Dunlap wrote:

> Wouldn't within.environment be identical to with.environment?
>> e <- new.env()
>> with(e, { One <- 1 ; Two <- 2+2i ; Theee <- One + Two })
>> objects(e)
>  [1] "One"   "Theee" "Two"
> It might make the transition between lists and environments
> simpler if within.environment  existed.
> 

evalq() would be rather more to the point. Then again, with() _is_ really just 
a sugar-coated evalq(). 

within() was quite specifically created because you couldn't do the same kind 
of things with data frames that you could do with environments, so the current 
thread does seem a bit peculiar to me... (The original design of within() would 
modify the object in-place, like fix(), but Luke objected.)
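To make the evalq() point concrete, the original poster's example works directly with it:

```r
e <- new.env()
evalq({
  x <- 1:5
  y <- runif(5)
}, envir = e)
ls(e)   # "x" "y"
```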


  


> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
> 
> 
>> -Original Message-
>> From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] 
>> On Behalf
>> Of Richard Cotton
>> Sent: Wednesday, March 21, 2012 2:51 PM
>> To: r-devel@r-project.org
>> Subject: [Rd] Why is there no within.environment function?
>> 
>> If I want to assign some variables into an environment, it seems
>> natural to do something like
>> 
>> e <- new.env()
>> within(e,
>>{
>>  x <- 1:5
>>  y <- runif(5)
>>}
>> )
>> 
>> This throws an error, since within.environment doesn't exist.  I
>> realise I can work around it using
>> 
>> as.environment(within(as.list(e),
>>{
>>  x <- 1:5
>>  y <- runif(5)
>>}
>> ))
>> 
>> Just wondering why I can't use within directly with environments.
>> 
>> --
>> 4dpiecharts.com
>> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] Why is there no within.environment function?

2012-03-21 Thread Gavin Simpson
On Wed, 2012-03-21 at 22:01 +, William Dunlap wrote:
> Wouldn't within.environment be identical to with.environment?
>   > e <- new.env()
>   > with(e, { One <- 1 ; Two <- 2+2i ; Theee <- One + Two })
>   > objects(e)
>   [1] "One"   "Theee" "Two"
> It might make the transition between lists and environments
> simpler if within.environment  existed.
> 
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com

One doesn't normally think of `with()` as changing its `data` argument,
which might be one reason the connection to `with.environment()` was not
made here.

> d <- data.frame()
> with(d, {
+ A <- 1:3
+ B <- 1:3
+ })
> d
data frame with 0 columns and 0 rows

The behaviour of `with.environment()` makes sense once you think about
it: there is only one environment `e` (from your example), and when it
is updated during the call to `with()` it isn't a copy that is being
updated but the real thing. So I can see why it was overlooked.
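
For what it's worth, a sketch of what such a method could look like
(hypothetical, not part of base R; modelled on the usual
eval(substitute(expr), ...) idiom that with.default() and
within.data.frame() use):

```r
## Hypothetical within.environment(): evaluate expr inside the environment
## and return it, mirroring within.data.frame()'s return-the-data contract.
within.environment <- function(data, expr, ...) {
  eval(substitute(expr), envir = data)
  data
}

e <- new.env()
e <- within(e, {
  x <- 1:5
  y <- runif(5)
})
ls(e)  # "x" "y"
```

Since environments have reference semantics, the `e <- within(e, ...)`
reassignment is redundant for environments, but it keeps the call pattern
uniform with data frames.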

G

> 
> > -Original Message-
> > From: r-devel-boun...@r-project.org [mailto:r-devel-boun...@r-project.org] 
> > On Behalf
> > Of Richard Cotton
> > Sent: Wednesday, March 21, 2012 2:51 PM
> > To: r-devel@r-project.org
> > Subject: [Rd] Why is there no within.environment function?
> > 
> > If I want to assign some variables into an environment, it seems
> > natural to do something like
> > 
> > e <- new.env()
> > within(e,
> > {
> >   x <- 1:5
> >   y <- runif(5)
> > }
> > )
> > 
> > This throws an error, since within.environment doesn't exist.  I
> > realise I can work around it using
> > 
> > as.environment(within(as.list(e),
> > {
> >   x <- 1:5
> >   y <- runif(5)
> > }
> > ))
> > 
> > Just wondering why I can't use within directly with environments.
> > 
> > --
> > 4dpiecharts.com
> > 

-- 
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
 Dr. Gavin Simpson [t] +44 (0)20 7679 0522
 ECRC, UCL Geography,  [f] +44 (0)20 7679 0565
 Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
 Gower Street, London  [w] http://www.ucl.ac.uk/~ucfagls/
 UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%



Re: [Rd] R's copying of arguments (Re: Julia)

2012-03-21 Thread Simon Urbanek

On Mar 20, 2012, at 3:08 PM, Hervé Pagès wrote:

> Hi Oliver,
> 
> On 03/17/2012 08:35 AM, oliver wrote:
>> Hello,
>> 
>> regarding the copying issue,
>> I would like to point to the
>> 
>> "Writing R-Extensions" documentation.
>> 
>> There it is mentioned that functions of extensions
>> that use the .C interface normally do get their arguments
>> pre-copied...
>> 
>> 
>> In section 5.2:
>> 
>>   "There can be up to 65 further arguments giving R objects to be
>>   passed to compiled code. Normally these are copied before being
>>   passed in, and copied again to an R list object when the compiled
>>   code returns."
>> 
>> But for the .Call and .External interfaces this is NOT the case.
>> 
>> 
>> 
>> In section 5.9:
>>   "The .Call and .External interfaces allow much more control, but
>>   they also impose much greater responsibilities so need to be used
>>   with care. Neither .Call nor .External copy their arguments. You
>>   should treat arguments you receive through these interfaces as
>>   read-only."
>> 
>> 
>> Why is read-only preferred?
>> 
>> Please, see the discussion in section 5.9.10.
>> 
>> It's mentioned there that a copy of an object in the R language
>> does not necessarily make a real copy of that object; instead,
>> just a "reference" to the real data is created (two names
>> referring to one bulk of data). That's typical functional
>> programming: not a variable, but a name (and possibly more than one
>> name) bound to an object.
>> 
>> 
>> Of course, if you changed the original named value without a copy
>> having been made first, then both names would refer to the changed
>> data. Of course that is not what is wanted.
>> 
>> But what you can also see in section 5.9.10 is that
>> there already is a mechanism (reference counting) that allows
>> one to distinguish between unnamed and named objects.
>> 
>> So, this is directly addressing the points you have mentioned in your
>> examples.
>> 
>> So, at least in principle, R allows in-place modification
>> of objects with the .Call interface.
>> 
>> You seem to refer to the .C interface, and I had explored the .Call
>> interface. That's the reason why you may insist on "it's copied
>> always" and I wondered what you were talking about, because the
>> .Call interface allowed me a rather C-like raw style of programming
>> (and let its user decide whether copying will be done or not).
>> 
>> The mechanism to decide whether copying should be done
>> is also mentioned in section 5.9.10: the NAMED and SET_NAMED macros;
>> with NAMED you can get the number of references.
>> 
>> But later in that section it is mentioned that - at least for now -
>> NAMED always returns the value 2.
>> 
>> 
>>   "Currently all arguments to a .Call call will have NAMED set to 2,
>>   and so users must assume that they need to be duplicated before
>>   alteration."
>>(section 5.9.10, last sentence)
>> 
>> 
>> So, in-place modification can already be done with the .Call
>> interface, for example. But deciding whether it is safe or not
>> is not supported at the moment.
>> 
>> So the situation is somewhere between "it is possible" and
>> "R does not support a safe decision on whether what is possible
>> can also be recommended".
>> At the moment R rather discourages in-place modification by default
>> (the safe way, and I agree with this default).
>> 
>> But it's not true, that R in general copies arguments.
>> 
>> But this seems to be true for the .C interface.
>> 
>> Maybe a lot of performance-/memory-problems can be solved
>> by rewriting already existing packages, by providing them
>> via .Call instead of .C.
> 
> My understanding is that most packages use the .C interface
> because it's simpler to deal with and because they don't need
> to pass complicated objects at the C level, just atomic vectors.
> My guess is that it's probably rarely the case that the cost
> of copying the arguments passed to .C is significant, but,
> if that was the case, then they could always call .C() with
> DUP=FALSE. However, using DUP=FALSE is dangerous (see Warning
> section in the man page).
> 
> No need to switch to .Call
> 

I strongly disagree. I'm appalled to see that sentence here. The overhead is 
significant for any large vector and it is in particular unnecessary since in 
.C you have to allocate *and copy* space even for results (twice!). Also it is 
very error-prone, because you have no information about the length of vectors 
so it's easy to run out of bounds and there is no way to check. IMHO .C should 
not be used for any code written in this century (the only exception may be if 
you are passing no data, e.g. if all you do is to pass a flag and expect no 
result, you can get away with it even if it is more dangerous). It is a legacy 
interface that dates way back and is essentially just a renamed .Fortran 
interface. Again, I would strongly recommend the use of .Call in any recent 
code because it is safer and more efficient (if you don't care about either 
attribute, well, feel free ;)).

Re: [Rd] R's copying of arguments (Re: Julia)

2012-03-21 Thread Hervé Pagès

On 03/21/2012 06:23 PM, Simon Urbanek wrote:


> On Mar 20, 2012, at 3:08 PM, Hervé Pagès wrote:
> 
>> Hi Oliver,
>> 
>> On 03/17/2012 08:35 AM, oliver wrote:
>>> [Oliver's message, quoted in full above, snipped]


>> My understanding is that most packages use the .C interface
>> because it's simpler to deal with and because they don't need
>> to pass complicated objects at the C level, just atomic vectors.
>> My guess is that it's probably rarely the case that the cost
>> of copying the arguments passed to .C is significant, but,
>> if that was the case, then they could always call .C() with
>> DUP=FALSE. However, using DUP=FALSE is dangerous (see Warning
>> section in the man page).
>> 
>> No need to switch to .Call



> I strongly disagree. I'm appalled to see that sentence here.


Come on!


> The overhead is significant for any large vector and it is in particular
> unnecessary since in .C you have to allocate *and copy* space even for results
> (twice!). Also it is very error-prone, because you have no information about
> the length of vectors so it's easy to run out of bounds and there is no way to
> check. IMHO .C should not be used for any code written in this century (the
> only exception may be if you are passing no data, e.g. if all you do is to
> pass a flag and expect no result, you can get away with it even if it is more
> dangerous). It is a legacy interface that dates way back and is essentially
> just a renamed .Fortran interface. Again, I would strongly recommend the use
> of .Call in any recent code because it is safer and more efficient (if you
> don't care about either attribute, well, feel free ;)).


So aleph will not support the .C interface? ;-)

H.



> Cheers,
> Simon

Re: [Rd] R's copying of arguments (Re: Julia)

2012-03-21 Thread Simon Urbanek

On Mar 21, 2012, at 9:31 PM, Hervé Pagès wrote:

> On 03/21/2012 06:23 PM, Simon Urbanek wrote:
>> 
>> On Mar 20, 2012, at 3:08 PM, Hervé Pagès wrote:
>> 
>>> Hi Oliver,
>>> 
>>> On 03/17/2012 08:35 AM, oliver wrote:
>>>> [Oliver's message, quoted in full above, snipped]
>>> 
>>> My understanding is that most packages use the .C interface
>>> because it's simpler to deal with and because they don't need
>>> to pass complicated objects at the C level, just atomic vectors.
>>> My guess is that it's probably rarely the case that the cost
>>> of copying the arguments passed to .C is significant, but,
>>> if that was the case, then they could always call .C() with
>>> DUP=FALSE. However, using DUP=FALSE is dangerous (see Warning
>>> section in the man page).
>>> 
>>> No need to switch to .Call
>>> 
>> 
>> I strongly disagree. I'm appalled to see that sentence here.
> 
> Come on!
> 
>> The overhead is significant for any large vector and it is in particular 
>> unnecessary since in .C you have to allocate *and copy* space even for 
>> results (twice!). Also it is very error-prone, because you have no 
>> information about the length of vectors so it's easy to run out of bounds 
>> and there is no way to check. IMHO .C should not be used for any code 
>> written in this century (the only 

[Rd] Thai vignette, cross-compile for Mac OS X, universal/multiarch (Fwd: Mac OS X builds of CelQuantileNorm, vcftools/samtools/tabix, and snpStats)

2012-03-21 Thread Hin-Tak Leung

FYI.

There is a Thai vignette - and it goes a lot further, doing some Thai text 
processing in R, than the earlier Chinese/Tibetan/Liangshan Yi/Arabic vignette, 
which was in reality just Chinese + Cairo graphics.


Managed to cross-compile an R package for Mac OS X from Linux, and it seems to 
be working. See screenshots below. I'd be interested to know if there are less 
obvious bugs; however, any numerical difference from the native snpStats 1.5.5 
is more likely because 1.5.5 is buggy than because 1.5.5.1 was mis-compiled.


Also noticed that R on Mac OS X is universal, with symlinks to the per-arch 
multiarch dylibs. Are there any plans for R packages to switch over to 
universal instead of multi-arch?
(The sentence below is, strictly speaking, wrong - the R package was 
cross-compiled to multi-arch rather than universal.)


FYI, a few other not-so-relevant things that may be useful to some.

 Original Message 
Subject: Mac OS X builds of CelQuantileNorm, vcftools/samtools/tabix, and 
snpStats
Date: Wed, 14 Mar 2012 02:59:42 + (GMT)
From: Hin-Tak Leung 
Reply-To: ht...@users.sourceforge.net
To: bonsai list 

CelQuantileNorm and vcftools/samtools/tabix are built for Mac OS X. These are 
universal binaries and work for all recent variants of PowerPC and Intel 
32-bit/64-bit Macs.


snpStats 1.5.5 (out a week ago) turned out to be "differently buggy" from the 
previous version, so there are Windows and Mac OS X builds of 1.5.5.1. The Mac 
OS X build is also universal and was tested on both Intel 32-bit and 64-bit 
Macs. See also

http://outmodedbonsai.sourceforge.net/InstallingLocalRPackages.html
for a few brief instructions and screenshots for installing R packages from 
downloads.


Here is the new Mac OS X area:
http://sourceforge.net/projects/outmodedbonsai/files/Packages%20for%20Mac%20OS%20X/


 Original Message 
Subject: Two new vignettes, and a bunch of updates here and elsewhere
Date: Tue, 06 Mar 2012 06:39:38 +
From: Hin-Tak Leung 
To: outmodedbonsai-annou...@lists.sourceforge.net

There is a new vignette, "Algorithms and Thailand", which covers
text/stream/sequence and spatial algorithms, and the use of Thai language in a
vignette.
(Chinese/Tibetan/Arabic/Liangshan Yi was covered in a different vignette
previously).
More on this and emacs's multilingual extension further down.

"snpMatrix Tutorial 2007" was the first ever tutorial written in spring 2007
[1]; it was pre-vignette and therefore had all the hapmap/pedfile output
recorded verbatim; newly revived to work as a testsuite for input/output of
hapmap/ped files (and flagged and fixed a few bugs introduced in the last few
years). The bug fixes will appear in the upcoming snpMatrix 1.19.0.12, which
hopefully will include some continual work to add read.tped() and read.vcf().

The usual suspects:

- new linux and win32 mono 2.10.8.1 builds with the large heap patch;

- "less buggy" snpStats 1.4.1.1, 1.5.2.1, 1.5.4.1 ;

- snpMatrix124 1.2.4.6, fixes a small bug;

- new BeanSprout manual, which has a bit more information about various Genome
Studio versions;

- new linux native and non-native build of the whole BeanSprout family against
Genome Studio 2011.1 (they were supposed to happen when I release the android
port, but then I forgot).

There are the 2nd monthly snapshots of win32 and android builds of vcftools,
tabix, samtools. I'd probably continue on a semi-regular monthly basis until
after snpMatrix 1.19.0.12 before I tidy up the adaptations and send upstream.

Besides the two 2.10.8.1 builds, there is a mono 2.11pre snapshot, (mono
2.11pre-push-1480-g0ed3827.x86_64.fc16.tbz2) to address a bug [2].

Lastly, while writing Thai vignette, I fixed a problem which was filed against
emacs [3] ; My notes on 'Typesetting Thai', the emacs lisp script which fixed
that problem and other enhancements mentioned in the notes, a snapshot of
CJK/LaTeX with those enhancements, and perl/python scripts for correctly 
unpacking zip files generated on Chinese Windows onto Linux/Macs, are housed
elsewhere [4]. While these have nothing to do with science, people processing
non-english data (and specifically Chinese data) may find them useful.

Unless otherwise stated, all of these are somewhere under:
http://sourceforge.net/projects/outmodedbonsai/files/

[1] The original is in the 'historical' section and the updated in 
'next/1.19.0.12'.

[2] https://bugzilla.novell.com/show_bug.cgi?id=720031

[3] http://debbugs.gnu.org/cgi/bugreport.cgi?bug=8108

[4] http://htl10.users.sourceforge.net/Languages/



[Rd] R 2.14.1 memory management under Windows

2012-03-21 Thread Spencer Graves
I computed "system.time(diag(30000))" with R 2.12.0 on Fedora 13 Linux 
with 4 GB RAM and with R 2.14.1 on Windows 7 with 8 GB RAM:



Linux (4 GB RAM):  0, 0.21, 0.21 -- a fifth of a second


Windows 7 (8 GB RAM):  11.37 7.47 93.19 -- over 1.5 minutes.  Moreover, 
during most of that time, I could not switch windows or get any response 
from the system.  When I first encountered this, I thought Windows was 
hung permanently and the only way out was a hard reset and reboot.



  On both systems, diag(30000) generated "Error:  cannot allocate 
vector of size ___ Gb", with "___" = 3.4 for Linux with 4 GB RAM and 6.7 
for Windows with 8 GB RAM.  Linux with half the RAM and an older version 
of R was done with this in 0.21 seconds.  Windows 7 went into suspension 
for over 93 seconds -- 1.5 minutes before giving an error message.
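
(For scale, a quick back-of-the-envelope check, not from the original post:
the two sizes quoted in the error messages are consistent with a square
matrix of roughly 30000 rows, assuming 8 bytes per double for the full
allocation and a 4-byte-per-element intermediate for the smaller figure.)

```r
## Back-of-the-envelope check on the sizes reported in the error messages.
n <- 30000
8 * n^2 / 2^30  # ~6.71 GiB -- matches the 6.7 Gb reported on Windows
4 * n^2 / 2^30  # ~3.35 GiB -- matches the 3.4 Gb reported on Linux
```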



   I don't know how easy this would be to fix under Windows, but I 
felt a need to report it.



  Best Wishes,
  Spencer


--
Spencer Graves, PE, PhD
President and Chief Technology Officer
Structure Inspection and Monitoring, Inc.
751 Emerson Ct.
San José, CA 95126
ph:  408-655-4567
web:  www.structuremonitoring.com



Re: [Rd] R 2.14.1 memory management under Windows

2012-03-21 Thread Joshua Wiley
On Wed, Mar 21, 2012 at 10:14 PM, Spencer Graves
 wrote:
> I computed "system.time(diag(30000))" with R 2.12.0 on Fedora 13 Linux with
> 4 GB RAM and with R 2.14.1 on Windows 7 with 8 GB RAM:
>
>
> Linux (4 GB RAM):  0, 0.21, 0.21 -- a fifth of a second
>
>
> Windows 7 (8 GB RAM):  11.37 7.47 93.19 -- over 1.5 minutes.  Moreover,
> during most of that time, I could not switch windows or get any response
> from the system.  When I first encountered this, I thought Windows was hung
> permanently and the only way out was a hard reset and reboot.
>
>
>      On both systems, diag(30000) generated "Error:  cannot allocate vector
> of size ___ Gb", with "___" = 3.4 for Linux with 4 GB RAM and 6.7 for
> Windows with 8 GB RAM.  Linux with half the RAM and an older version of R
> was done with this in 0.21 seconds.  Windows 7 went into suspension for over
> 93 seconds -- 1.5 minutes before giving an error message.
>
>
>       I don't know how easy this would be to fix under Windows, but I felt a
> need to report it.

This seems like it may be an issue with paging, which Windows has
traditionally not excelled at.  That said, on Windows 7 x64 with 6GB
RAM and another 6GB paging file with R version 2.14.1 (2011-12-22), I
get:

> system.time(diag(30000))
Error: cannot allocate vector of size 3.4 Gb
Timing stopped at: 0.01 0 0.01

so the timing is comparable to *nix.

Cheers,

Josh
>
>
>      Best Wishes,
>      Spencer
>
>
> --
> Spencer Graves, PE, PhD
> President and Chief Technology Officer
> Structure Inspection and Monitoring, Inc.
> 751 Emerson Ct.
> San José, CA 95126
> ph:  408-655-4567
> web:  www.structuremonitoring.com
>



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
