[Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}

2015-01-08 Thread Martin Maechler
In November, we had a "bug repository conversation"
with Peter Hagerty and myself:

  https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16065

where the bug report title started with

 --->>  "exists" is a bottleneck for dispatch and package loading, ...

Peter proposed an extra simplified and hence faster version of exists(),
and I commented

> --- Comment #2 from Martin Maechler  ---
> I'm very grateful that you've started exploring the bottlenecks of loading
> packages with many S4 classes (and methods)...
> and I hope we can make real progress there rather sooner than later.

> OTOH, your `summaryRprof()` in your vignette indicates that exists() may use
> up to 10% of the time spent in library(reportingTools),  and your speedup
> proposals of exists()  may go up to ca 30%  which is good and well worth
> considering,  but still we can only expect 2-3% speedup for package loading
> which unfortunately is not much.

> Still I agree it is worth looking at exists() as you did  ... and 
> consider providing a fast simplified version of it in addition to current
> exists() [I think].

> BTW, as we talk about enhancements here, maybe consider a further possibility:
> My subjective guess is that probably more than half of exists() uses are of the
> form

> if(exists(name, where, ...)) {
>    get(name, where, ...)
>    ..
> } else {
>    NULL / error() / .. or similar
> }

> i.e. many exists() calls when returning TRUE are immediately followed by the
> corresponding get() call which repeats quite a bit of the lookup that exists()
> has done.

> Instead, I'd imagine a function, say  getifexists(name, ...) that does both at
> once in the "exists is TRUE" case but in a way we can easily keep the if(.) ..
> else clause above.  One already existing approach would use

> if(!inherits(tryCatch(xx <- get(name, where, ...), error=function(e)e), "error")) {

>   ... (( work with xx )) ...

> } else  {
>    NULL / error() / .. or similar
> }

> but of course our C implementation would be more efficient and use more concise
> syntax {which should not look like error handling}.   Follow ups to this idea
> should really go to R-devel (the mailing list).

and now I do follow up here myself :

I found that  'getifexists()' is actually very simple to implement,
I have already tested it a bit, but not yet committed to R-devel
(the "R trunk" aka "master branch") because I'd like to get
public comments {RFC := Request For Comments}.
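
For readers who want to experiment with the proposed semantics, here is a minimal sketch in plain R. It is only an approximation under stated assumptions: the name getifexists_sketch is hypothetical, it still performs the lookup twice (once in exists(), once in get()), and it says nothing about how the C-level version is written.

```r
# Hypothetical pure-R approximation of the proposed getifexists().
# The proposal itself is a C-level function doing a single lookup;
# this sketch only mimics the return values.
getifexists_sketch <- function(x, envir = parent.frame(), mode = "any",
                               inherits = TRUE, value.if.not = NULL) {
  if (exists(x, envir = envir, mode = mode, inherits = inherits))
    get(x, envir = envir, mode = mode, inherits = inherits)
  else
    value.if.not
}

e <- new.env()
assign("a", 42, envir = e)

found   <- getifexists_sketch("a", envir = e, inherits = FALSE)
absent  <- getifexists_sketch("b", envir = e, inherits = FALSE)
absent2 <- getifexists_sketch("b", envir = e, inherits = FALSE,
                              value.if.not = NA)
```

Readers on R 3.2.0 or later may want to compare this with base R's get0(), which combines exists() and get() in one call and takes an 'ifnotfound' argument.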

My version of the help file {for both exists() and getifexists()}
rendered in text is

-- help(getifexists) ---
Is an Object Defined?

Description:

 Look for an R object of the given name and possibly return it

Usage:

 exists(x, where = -1, envir = , frame, mode = "any",
        inherits = TRUE)

 getifexists(x, where = -1, envir = as.environment(where),
             mode = "any", inherits = TRUE, value.if.not = NULL)
 
Arguments:

       x: a variable name (given as a character string).

   where: where to look for the object (see the details section); if
          omitted, the function will search as if the name of the
          object appeared unquoted in an expression.

   envir: an alternative way to specify an environment to look in, but
          it is usually simpler to just use the ‘where’ argument.

   frame: a frame in the calling list.  Equivalent to giving ‘where’ as
          ‘sys.frame(frame)’.

    mode: the mode or type of object sought: see the ‘Details’ section.

inherits: should the enclosing frames of the environment be searched?

value.if.not: the return value of ‘getifexists(x, *)’ when ‘x’ does not
          exist.

Details:

 The ‘where’ argument can specify the environment in which to look
 for the object in any of several ways: as an integer (the position
 in the ‘search’ list); as the character string name of an element
 in the search list; or as an ‘environment’ (including using
 ‘sys.frame’ to access the currently active function calls).  The
 ‘envir’ argument is an alternative way to specify an environment,
 but is primarily there for back compatibility.

 This function looks to see if the name ‘x’ has a value bound to it
 in the specified environment.  If ‘inherits’ is ‘TRUE’ and a value
 is not found for ‘x’ in the specified environment, the enclosing
 frames of the environment are searched until the name ‘x’ is
 encountered.  See ‘environment’ and the ‘R Language Definition’
 manual for details about the structure of environments and their
 enclosures.

 *Warning:* ‘inherits = TRUE’ is the default behaviour for R but
 not for S.

 If ‘mode’ is specified then only objects of that type are sought.
 The ‘mode’ may specify one of the collections ‘"numeric"’ and
 ‘"

Re: [Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}

2015-01-08 Thread Duncan Murdoch
On 08/01/2015 4:16 AM, Martin Maechler wrote:
> I found that  'getifexists()' is actually very simple to implement,
> I have already tested it a bit, but not yet committed to R-devel
> (the "R trunk" aka "master branch") because I'd like to get
> public comments {RFC := Request For Comments}.
> 

I don't like the name -- I'd prefer getIfExists.  As Baath (2012, R
Journal) pointed out, R names are very inconsistent in naming
conventions, but lowerCamelCase is the most common choice.  Second most
common is period.separated, so an argument could be made for
get.if.exists, but there's still the possibility of confusion with S3
methods, and users of other languages where "." is an operator find it a
little strange.

If you don't like lowerCamelCase (and a lot of people don't), then I
think underscore_separated is the next best choice, so would use
get_if_exists.

Another possibility is to make no new name at all, and just add an
optional parameter to get() (which if present acts as your value.if.not
parameter, if not present keeps the current "object not found" error).

Duncan Murdoch



Re: [Rd] gsub with perl=TRUE results in 'this version of PCRE is not compiled with Unicode property support' in R-devel

2015-01-08 Thread Prof Brian Ripley
Why are you reporting that your PCRE library does not have something 
which the R-admin manual says it should preferably have?  To wit, 
footnote 37 says


'and not PCRE2, which started at version 10.0. PCRE must be built with 
UTF-8 support (not the default) and support for Unicode properties is 
assumed by some R packages. Neither are tested by configure. JIT support 
is desirable.'


That certainly does not fail on my Linux, Windows and OS X builds of 
R-devel.  (Issues about pre-built binaries, if that is what you used, 
should be reported to their maintainers, not here.)


And the help does say in ?regex

 In UTF-8 mode, some Unicode properties may be supported via
 ‘\p{xx}’ and ‘\P{xx}’ which match characters with and without
 property ‘xx’ respectively.

Note the 'may'.
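
Since neither UTF-8 nor Unicode property support is tested by configure, code that relies on (*UCP) could probe for it at run time. The following is a minimal sketch, not an official check: the helper name pcre_has_ucp is hypothetical, and it simply tries the pattern from the report and treats any warning or error as "not supported".

```r
# Hypothetical helper: TRUE if this build's PCRE accepts the (*UCP) verb,
# FALSE if compiling the pattern raises a warning or an error.
pcre_has_ucp <- function() {
  tryCatch({
    gsub("(*UCP)\\b(i)\\b", "", "nhgrimelanomaclass", perl = TRUE)
    TRUE
  },
  warning = function(w) FALSE,
  error   = function(e) FALSE)
}

ucp_ok <- pcre_has_ucp()
if (!ucp_ok)
  message("PCRE lacks Unicode property support; avoid (*UCP) patterns")
```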




On 07/01/2015 23:25, Dan Tenenbaum wrote:

The following code:

res <- gsub("(*UCP)\\b(i)\\b",
 "", "nhgrimelanomaclass", perl = TRUE)

results in:

Error in gsub(sprintf("(*UCP)\\b(%s)\\b", "i"), "", "nhgrimelanomaclass",  :
   invalid regular expression '(*UCP)\b(i)\b'
In addition: Warning message:
In gsub(sprintf("(*UCP)\\b(%s)\\b", "i"), "", "nhgrimelanomaclass",  :
   PCRE pattern compilation error
'this version of PCRE is not compiled with Unicode property support'
at '(*UCP)\b(i)\b'

on

R Under development (unstable) (2015-01-01 r67290)
Platform: x86_64-apple-darwin13.4.0 (64-bit)
Running under: OS X 10.9.5 (Mavericks)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

And also on the same version of R-devel on Snow Leopard, Windows, and Linux. 
But it does not produce an error on

R version 3.1.2 (2014-10-31)
Platform: x86_64-apple-darwin13.4.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

Dan

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
1 South Parks Road, Oxford OX1 3TG, UK



[Rd] On base::rank

2015-01-08 Thread Arunkumar Srinivasan
Have a look at the following, taken from base::rank:

...
if (!is.na(na.last) && any(nas)) {
    yy <- integer(length(x)) # <~
    storage.mode(yy) <- storage.mode(y) # <
    yy <- NA
    NAkeep <- (na.last == "keep")
    if (NAkeep || na.last) {
        yy[!nas] <- y
        if (!NAkeep)
            yy[nas] <- (length(y) + 1L):length(yy)
    }
...

Alternatively, look at lines 36 and 37 here:
https://github.com/wch/r-source/blob/fbf5cdf29d923395b537a9893f46af1aa75e38f3/src/library/base/R/rank.R#L36

There seems to be no need for those lines, IIUC, is there? 'yy' is
replaced with NA in the very next line.
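
A small experiment along these lines illustrates the point. It is only a sketch: fill_ranks is a hypothetical stand-alone copy of the excerpt above (not the real base::rank), with a 'preallocate' switch added to turn the two marked lines on and off.

```r
# Hypothetical stand-alone copy of the excerpt; 'preallocate' toggles
# the two lines under discussion.
fill_ranks <- function(y, nas, na.last = TRUE, preallocate = FALSE) {
  if (preallocate) {
    yy <- integer(length(nas))
    storage.mode(yy) <- storage.mode(y)
  }
  yy <- NA                       # overwrites whatever yy held before
  NAkeep <- (na.last == "keep")
  if (NAkeep || na.last) {
    yy[!nas] <- y                # logical index stretches yy to length(nas)
    if (!NAkeep)
      yy[nas] <- (length(y) + 1L):length(yy)
  }
  yy
}

x   <- c(3, NA, 1, 2)
nas <- is.na(x)
y   <- rank(x[!nas])

with_lines    <- fill_ranks(y, nas, preallocate = TRUE)
without_lines <- fill_ranks(y, nas, preallocate = FALSE)
identical(with_lines, without_lines)
```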

Best,
Arun.



Re: [Rd] New version of Rtools for Windows

2015-01-08 Thread Avraham Adler
Very timely, as this is how I got into the problem I posted about
earlier; maybe some of the problems I ran into will mean more to
you and the experts on this thread, Dr. Murdoch. For reference, I run
Windows 7 64-bit, and I am trying to build a 64-bit version of R-3.1.2.

As we discussed offline, Dr. Murdoch, I've been trying to build R
using more recent tools than the GCC 4.6.3 prerelease. Ruben Von Boxen
(rubenvb) told me he is no longer developing his own builds of GCC,
but is focusing on MSYS2 and the mingw64 personal builds. So, similar
to what Jeroen said, I first installed MSYS2, whose initial
installation on Windows is not so simple[1]. After the initial
install, the following packages need to be manually installed: make,
tar, zip, unzip, zlib, and rsync. I also installed base-devel, which
is way more than necessary, but there may be packages in there which
are necessary.

I originally installed the most up-to-date version of GCC (4.9.2)[2],
and I picked the -seh version: since I install (almost) all
packages from source (the one exception being nloptr for now), the
exception handling should be consistent, and it is supposed to be up to
~15% faster[3].

The initial build crashed with the following error:

gcc -std=gnu99 -m64 -I../../include -I. -DHAVE_CONFIG_H  -O3 -Wall
-pedantic -mtune=core2   -c xmalloc.c -o xmalloc.o
ar crs libtre.a regcomp.o regerror.o regexec.o tre-ast.o tre-compile.o
tre-match-approx.o tre-match-backtrack.o tre-match-parallel.o
tre-mem.o tre-parse.o tre-stack.o xmalloc.o
gcc -std=gnu99 -m64   -O3 -Wall -pedantic -mtune=core2   -c compat.c -o compat.o
compat.c:65:5: error: redefinition of 'snprintf'
 int snprintf(char *buffer, size_t max, const char *format, ...)
 ^
In file included from compat.c:3:0:
F:/MinGW64/x86_64-w64-mingw32/include/stdio.h:553:5: note: previous
definition of 'snprintf' was here
 int snprintf (char * __restrict__ __stream, size_t __n, const char *
__restrict__ __format, ...)
 ^
compat.c:75:5: error: redefinition of 'vsnprintf'
 int vsnprintf(char *buffer, size_t bufferSize, const char *format,
va_list args)
 ^
In file included from compat.c:3:0:
F:/MinGW64/x86_64-w64-mingw32/include/stdio.h:543:7: note: previous
definition of 'vsnprintf' was here
   int vsnprintf (char * __restrict__ __stream, size_t __n, const char
* __restrict__ __format, va_list __local_argv)
   ^
../../gnuwin32/MkRules:218: recipe for target 'compat.o' failed
make[4]: *** [compat.o] Error 1
Makefile:120: recipe for target 'rlibs' failed
make[3]: *** [rlibs] Error 1
Makefile:179: recipe for target '../../bin/x64/R.dll' failed
make[2]: *** [../../bin/x64/R.dll] Error 2
Makefile:104: recipe for target 'rbuild' failed
make[1]: *** [rbuild] Error 2
Makefile:14: recipe for target 'all' failed
make: *** [all] Error 2

After doing some checking (for example see [4]), I asked Duncan about
the problem, and he suggested moving the #ifndef _W64 in compat.c up
above the offending lines (65-75). That did not work, so, I figured
(it seems mistakenly from the other thread) that if those functions
are included from stdio already, I can just delete them from compat.c.
The specific lines are:

int snprintf(char *buffer, size_t max, const char *format, ...)
{
    int res;
    va_list(ap);
    va_start(ap, format);
    res = trio_vsnprintf(buffer, max, format, ap);
    va_end(ap);
    return res;
}

int vsnprintf(char *buffer, size_t bufferSize, const char *format, va_list args)
{
    return trio_vsnprintf(buffer, bufferSize, format, args);
}

Continuing the build using 4.9.2 crashed again at the following point:

gcc -std=gnu99 -m64 -I../include -I. -I../extra -DHAVE_CONFIG_H
-DR_DLL_BUILD  -O3 -Wall -pedantic -mtune=core2   -c malloc.c -o
malloc.o
windres -F pe-x86-64  -I../include -i dllversion.rc -o dllversion.o
gcc -std=gnu99 -m64 -shared -s -mwindows -o R.dll R.def console.o
dynload.o editor.o embeddedR.o extra.o opt.o pager.o preferences.o
psignal.o rhome.o rt_complete.o rui.o run.o shext.o sys-win32.o
system.o dos_wglob.o malloc.o ../main/libmain.a ../appl/libappl.a
../nmath/libnmath.a getline/gl.a ../extra/xdr/libxdr.a
../extra/pcre/libpcre.a ../extra/bzip2/libbz2.a
../extra/intl/libintl.a ../extra/trio/libtrio.a ../extra/tzone/libtz.a
../extra/tre/libtre.a ../extra/xz/liblzma.a dllversion.o -fopenmp -L.
-lgfortran -lRblas -L../../bin/x64 -lRzlib -lRgraphapp -lRiconv
-lcomctl32 -lversion
collect2.exe: error: ld returned 5 exit status
Makefile:150: recipe for target 'R.dll' failed
make[3]: *** [R.dll] Error 1
Makefile:179: recipe for target '../../bin/x64/R.dll' failed
make[2]: *** [../../bin/x64/R.dll] Error 2
Makefile:104: recipe for target 'rbuild' failed
make[1]: *** [rbuild] Error 2
Makefile:14: recipe for target 'all' failed
make: *** [all] Error 2

As all those files existed in their correct places, the only reason I
could think of that this would fail here is that GCC version 4.9 did
make some changes to enhance link-time optimization [5], and probably
something isn't com

Re: [Rd] New version of Rtools for Windows

2015-01-08 Thread Henric Winell

On 2015-01-08 14:18, Avraham Adler wrote:



Re: [Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}

2015-01-08 Thread John Nolan
Adding an optional argument to get (and mget) like

val <- get(name, where, ..., value.if.not.found=NULL )   (*)

would be useful for many.  HOWEVER, it is possible that there could be 
some confusion here: (*) can give a NULL because either x exists and 
has value NULL, or because x doesn't exist.   If that matters, the user 
would need to be careful about specifying a value.if.not.found that cannot 
be confused with a valid value of x.  

To avoid this difficulty, perhaps we want both: have Martin's getifexists( ) 
return a list with two values: 
  - a boolean variable 'found'  # = value returned by exists( )
  - a variable 'value'

Then implement get( ) as:

get <- function(x, ..., value.if.not.found) {

  if (missing(value.if.not.found)) {
    a <- getifexists(x, ...)
    if (!a$found) stop("x not found")
  } else {
    a <- getifexists(x, ..., value.if.not.found)
  }
  return(a$value)
}

Note that value.if.not.found has no default value in above.
It behaves exactly like current get does if value.if.not.found 
is not specified, and if it is specified, it would be faster 
in the common situation mentioned below:   
 if(exists(x,...)) { get(x,...) }
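
To make the NULL ambiguity concrete, here is a minimal sketch of the two-value return. It rests on assumptions: find_object is a hypothetical name, it is written in plain R on top of exists() and get() rather than as the single-lookup C function under discussion, and it only covers the 'envir' and 'inherits' arguments.

```r
# Hypothetical helper returning both a flag and a value, so that
# "exists with value NULL" and "does not exist" can be told apart.
find_object <- function(x, envir = parent.frame(), inherits = TRUE) {
  found <- exists(x, envir = envir, inherits = inherits)
  list(found = found,
       value = if (found) get(x, envir = envir, inherits = inherits) else NULL)
}

e <- new.env()
assign("a", NULL, envir = e)   # 'a' exists, and its value is NULL

r1 <- find_object("a", envir = e, inherits = FALSE)
r2 <- find_object("b", envir = e, inherits = FALSE)

# Both values are NULL; only the 'found' flag distinguishes the cases.
c(r1$found, r2$found)
```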

John

P.S. if you like dromedaries call it valueIfNotFound ...

 ..
 John P. Nolan
 Math/Stat Department
 227 Gray Hall,   American University
 4400 Massachusetts Avenue, NW
 Washington, DC 20016-8050

 jpno...@american.edu   voice: 202.885.3140  
 web: academic2.american.edu/~jpnolan
 ..



Re: [Rd] On base::rank

2015-01-08 Thread Martin Maechler
> Arunkumar Srinivasan 
> on Thu, 8 Jan 2015 13:46:57 +0100 writes:

> Have a look at the following, taken from base::rank:

> ...
> if (!is.na(na.last) && any(nas)) {
>     yy <- integer(length(x)) # <~
>     storage.mode(yy) <- storage.mode(y) # <
>     yy <- NA
>     NAkeep <- (na.last == "keep")
>     if (NAkeep || na.last) {
>         yy[!nas] <- y
>         if (!NAkeep)
>             yy[nas] <- (length(y) + 1L):length(yy)
>     }
> ...

> Alternatively, look at lines 36 and 37 here:
> https://github.com/wch/r-source/blob/fbf5cdf29d923395b537a9893f46af1aa75e38f3/src/library/base/R/rank.R#L36

> There seems to be no need for those lines, IIUC, is there?
> 'yy' is replaced with NA in the very next line.

Indeed.   Interesting that nobody has noticed till now,
even though that part has been world readable since at least 2008-08-25.

Note that the R source code is at 
 http://svn.r-project.org/R/
and the file in question at
 http://svn.r-project.org/R/trunk/src/library/base/R/rank.R

where you can already see the new code
(given that 'x' was no longer needed, there's no need for 'xx').

Martin Maechler, 
ETH Zurich



Re: [Rd] On base::rank

2015-01-08 Thread Arunkumar Srinivasan
> Indeed.   Interesting that nobody has noticed till now,
> even though that part has been world readable since at least 2008-08-25.

That was what made me a bit unsure :-).

> Note that the R source code is at
>  http://svn.r-project.org/R/
> and the file in question at
>  http://svn.r-project.org/R/trunk/src/library/base/R/rank.R

Okay, thanks.

> where you can already see the new code
> (given that 'x' was no longer needed, there's no need for 'xx').

Great! thanks again.

> Martin Maechler,
> ETH Zurich

Best,
Arun.



Re: [Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}

2015-01-08 Thread Duncan Murdoch

On 08/01/2015 9:03 AM, John Nolan wrote:

> Adding an optional argument to get (and mget) like
>
> val <- get(name, where, ..., value.if.not.found=NULL )   (*)


That would be a bad idea, as it would change the behaviour of existing uses
of get().  What I suggested would not give a default.  If the arg was
missing, we'd get the old behaviour; if the arg was present, we'd use it.


I'm not sure this is preferable to the separate function 
implementation.  This makes the documentation and implementation of 
get() more complicated, and it would probably be slower for everyone.
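
The "no default" behaviour can be sketched in plain R as follows. This is an illustration under assumptions, not a patch to get(): get_or is a hypothetical wrapper, it only handles 'envir' and 'inherits', and it uses missing() to decide between the old error and the supplied fallback.

```r
# Hypothetical wrapper: without 'value.if.not' it behaves like get()
# (error if the object is absent); with it, the fallback is returned.
get_or <- function(x, envir = parent.frame(), inherits = TRUE, value.if.not) {
  if (missing(value.if.not))
    return(get(x, envir = envir, inherits = inherits))
  if (exists(x, envir = envir, inherits = inherits))
    get(x, envir = envir, inherits = inherits)
  else
    value.if.not
}

e <- new.env()
assign("a", 1, envir = e)

present  <- get_or("a", envir = e, inherits = FALSE)
fallback <- get_or("b", envir = e, inherits = FALSE, value.if.not = NA)
old_way  <- tryCatch(get_or("b", envir = e, inherits = FALSE),
                     error = function(cond) cond)
```

The extra missing() branch is the kind of added complexity mentioned above: every call pays for the check, whether or not the new argument is used.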


Duncan Murdoch



would be useful for many.  HOWEVER, it is possible that there could be
some confusion here: (*) can give a NULL because either x exists and
has value NULL, or because x doesn't exist.   If that matters, the user
would need to be careful about specifying a value.if.not.found that cannot
be confused with a valid value of x.

To avoid this difficulty, perhaps we want both: have Martin's getifexists( )
return a list with two values:
   - a boolean variable 'found'  # = value returned by exists( )
   - a variable 'value'

Then implement get( ) as:

get <- function(x,...,value.if.not.found ) {

   if( missing(value.if.not.found) ) {
 a <- getifexists(x,... )
 if (!a$found) error("x not found")
   } else {
 a <- getifexists(x,...,value.if.not.found )
   }
   return(a$value)
}

Note that value.if.not.found has no default value in above.
It behaves exactly like current get does if value.if.not.found
is not specified, and if it is specified, it would be faster
in the common situation mentioned below:
  if(exists(x,...)) { get(x,...) }

John

P.S. if you like dromedaries call it valueIfNotFound ...

  ..
  John P. Nolan
  Math/Stat Department
  227 Gray Hall,   American University
  4400 Massachusetts Avenue, NW
  Washington, DC 20016-8050

  jpno...@american.edu   voice: 202.885.3140
  web: academic2.american.edu/~jpnolan
  ..


-"R-devel"  wrote: -
To: Martin Maechler , R-devel@r-project.org
From: Duncan Murdoch
Sent by: "R-devel"
Date: 01/08/2015 06:39AM
Subject: Re: [Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}

On 08/01/2015 4:16 AM, Martin Maechler wrote:
> In November, we had a "bug repository conversation"
> with Peter Hagerty and myself:
>
>   https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16065
>
> where the bug report title started with
>
>  --->>  "exists" is a bottleneck for dispatch and package loading, ...
>
> Peter proposed an extra simplified and hence faster version of exists(),
> and I commented
>
> > --- Comment #2 from Martin Maechler  ---
> > I'm very grateful that you've started exploring the bottlenecks of loading
> > packages with many S4 classes (and methods)...
> > and I hope we can make real progress there rather sooner than later.
>
> > OTOH, your `summaryRprof()` in your vignette indicates that exists() may use
> > up to 10% of the time spent in library(reportingTools),  and your speedup
> > proposals of exists()  may go up to ca 30%  which is good and well worth
> > considering,  but still we can only expect 2-3% speedup for package loading
> > which unfortunately is not much.
>
> > Still I agree it is worth looking at exists() as you did  ... and
> > consider providing a fast simplified version of it in addition to current
> > exists() [I think].
>
> > BTW, as we talk about enhancements here, maybe consider a further possibility:
> > My subjective guess is that probably more than half of exists() uses are of the
> > form
>
> > if(exists(name, where, ...)) {
> >    get(name, where, )
> >    ..
> > } else {
> >    NULL / error() / .. or similar
> > }
>
> > i.e. many exists() calls when returning TRUE are immediately followed by the
> > corresponding get() call which repeats quite a bit of the lookup that exists()
> > has done.
>
> > Instead, I'd imagine a function, say  getifexists(name, ...) that does both at
> > once in the "exists is TRUE" case but in a way we can easily keep the if(.) ..
> > else clause above.  One already existing approach would use
>
> > if(!inherits(tryCatch(xx <- get(name, where, ...), error=function(e)e), "error")) {
>
> >   ... (( work with xx )) ...
>
> > } else  {
> >    NULL / error() / .. or similar
> > }
>
> > but of course our C implementation would be more efficient and use more concise
> > syntax {which should not look like error handling}.   Follow ups to this idea
> > should really go to R-devel (the mailing list).
>
> and now I do follow up here myself :
>
> I found that  'getifexists()' is actually very simple to implement,
> I have already tested it a bit, but not yet committed to R-devel
> (the "R trunk" aka "master branch") because I'd like to get

Re: [Rd] gsub with perl=TRUE results in 'this version of PCRE is not compiled with Unicode property support' in R-devel

2015-01-08 Thread Kasper Daniel Hansen
Dan, for OS X, there is a new pcre library posted at
http://r.research.att.com/libs/ with a date stamp of Dec 28.  This fixes
this problem.  You can test for this by running
  make check
post compilation.  It'll bang out with a failure if this is not in order.
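
As a quicker check from inside a running R session, the pattern from Dan's report can be wrapped in tryCatch(). This is only a sketch: has_ucp() is a hypothetical helper name, and it merely tests whether the (*UCP) verb compiles, not everything that make check covers.

```r
# Hypothetical helper: TRUE if a perl = TRUE pattern using (*UCP) compiles,
# FALSE if the PCRE in use rejects it (e.g. no Unicode property support).
has_ucp <- function() {
  res <- tryCatch(
    suppressWarnings(gsub("(*UCP)\\b(i)\\b", "", "i", perl = TRUE)),
    error = function(e) e
  )
  !inherits(res, "error")
}

has_ucp()
```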

(And I know that all of this is described in R-admin).

It would be helpful (time saving) if a message is posted to r-sig-mac
whenever a new (version of a) library is added to
http://r.research.att.com/libs/
I know it is adding more work to the helpful people who are doing all the
heavy lifting.

Kasper

On Thu, Jan 8, 2015 at 7:06 AM, Prof Brian Ripley 
wrote:

> Why are you reporting that your PCRE library does not have something which
> the R-admin manual says it should preferably have?  To wit, footnote 37 says
>
> 'and not PCRE2, which started at version 10.0. PCRE must be built with
> UTF-8 support (not the default) and support for Unicode properties is
> assumed by some R packages. Neither are tested by configure. JIT support is
> desirable.'
>
> That certainly does not fail on my Linux, Windows and OS X builds of
> R-devel.  (Issues about pre-built binaries, if that is what you used,
> should be reported to their maintainers, not here.)
>
> And the help does say in ?regex
>
>  In UTF-8 mode, some Unicode properties may be supported via
>  ‘\p{xx}’ and ‘\P{xx}’ which match characters with and without
>  property ‘xx’ respectively.
>
> Note the 'may'.
>
>
>
>
>
> On 07/01/2015 23:25, Dan Tenenbaum wrote:
>
>> The following code:
>>
>> res <- gsub("(*UCP)\\b(i)\\b",
>>  "", "nhgrimelanomaclass", perl = TRUE)
>>
>> results in:
>>
>> Error in gsub(sprintf("(*UCP)\\b(%s)\\b", "i"), "",
>> "nhgrimelanomaclass",  :
>>invalid regular expression '(*UCP)\b(i)\b'
>> In addition: Warning message:
>> In gsub(sprintf("(*UCP)\\b(%s)\\b", "i"), "", "nhgrimelanomaclass",  :
>>PCRE pattern compilation error
>> 'this version of PCRE is not compiled with Unicode property
>> support'
>> at '(*UCP)\b(i)\b'
>>
>> on
>>
>> R Under development (unstable) (2015-01-01 r67290)
>> Platform: x86_64-apple-darwin13.4.0 (64-bit)
>> Running under: OS X 10.9.5 (Mavericks)
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods   base
>>
>> And also on the same version of R-devel on Snow Leopard, Windows, and
>> Linux. But it does not produce an error on
>>
>> R version 3.1.2 (2014-10-31)
>> Platform: x86_64-apple-darwin13.4.0 (64-bit)
>>
>> locale:
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>> attached base packages:
>> [1] stats graphics  grDevices utils datasets  methods   base
>>
>> Dan
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>
> --
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Emeritus Professor of Applied Statistics, University of Oxford
> 1 South Parks Road, Oxford OX1 3TG, UK
>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}

2015-01-08 Thread Michael Lawrence
If we do add an argument to get(), then it should be named consistently
with the ifnotfound argument of mget(). As mentioned, the possibility of a
NULL value is problematic. One solution is a sentinel value that indicates
an unbound value (like R_UnboundValue).
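
At the R level, the sentinel idea can already be tried with the existing ifnotfound argument of mget(). The sketch below assumes a freshly created environment as the sentinel (the name not_found is made up here), since such an object is identical() only to itself and so cannot collide with a legitimate value such as NULL.

```r
e <- new.env()
assign("a", NULL, envir = e)   # "a" is bound, and its value is NULL

# Hypothetical sentinel: a fresh environment is identical() only to itself.
not_found <- new.env()

# ifnotfound is a list; a length-one list is used for every missing name.
res <- mget(c("a", "b"), envir = e, ifnotfound = list(not_found))

# TRUE where a binding exists, FALSE where the sentinel came back.
is_bound <- !vapply(res, identical, logical(1), not_found)
```

With this, a bound NULL ("a") and an unbound name ("b") are told apart without a separate exists() call.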

But another idea (and one pretty similar to John's) is to follow the SYMSXP
design at the C level, where there is a structure that points to the name
and a value. We already have SYMSXPs at the R level of course (name
objects) but they do not provide access to the value, which is typically
R_UnboundValue. But this does not even need to be implemented with SYMSXP.
The design would allow something like:

binding <- getBinding("x", env)
if (hasValue(binding)) {
  x <- value(binding) # throws an error if none
  message(name(binding), " has value ", x)
}

That, I think, is a bit verbose but readable and could be made fast. And I
think binding objects would be useful in other ways, as they are
essentially a "named object", for example when iterating over an
environment.
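
A rough R-level mock-up of such binding objects, purely to make the interface concrete: none of getBinding(), hasValue(), value() or name() exist in base R, the class name is invented, and a real version would presumably live in C on top of the environment lookup.

```r
# Hypothetical mock-up: a "binding" is a name plus (possibly) a value.
getBinding <- function(name, env = parent.frame()) {
  found <- exists(name, envir = env, inherits = FALSE)
  structure(
    list(name  = name,
         found = found,
         value = if (found) get(name, envir = env, inherits = FALSE)),
    class = "binding_sketch"
  )
}

hasValue <- function(b) b$found

value <- function(b) {
  if (!b$found) stop("binding '", b$name, "' has no value")
  b$value
}

name <- function(b) b$name

env <- new.env()
assign("x", 1, envir = env)

binding <- getBinding("x", env)
if (hasValue(binding)) {
  x <- value(binding)
  message(name(binding), " has value ", x)
}
```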

Michael




On Thu, Jan 8, 2015 at 6:03 AM, John Nolan  wrote:

> Adding an optional argument to get (and mget) like
>
> val <- get(name, where, ..., value.if.not.found=NULL )   (*)
>
> would be useful for many.  HOWEVER, it is possible that there could be
> some confusion here: (*) can give a NULL because either x exists and
> has value NULL, or because x doesn't exist.   If that matters, the user
> would need to be careful about specifying a value.if.not.found that cannot
> be confused with a valid value of x.
>
> To avoid this difficulty, perhaps we want both: have Martin's getifexists(
> )
> return a list with two values:
>   - a boolean variable 'found'  # = value returned by exists( )
>   - a variable 'value'
>
> Then implement get( ) as:
>
> get <- function(x, ..., value.if.not.found) {
>
>   if (missing(value.if.not.found)) {
>     a <- getifexists(x, ...)
>     if (!a$found) stop("object '", x, "' not found")
>   } else {
>     a <- getifexists(x, ..., value.if.not.found)
>   }
>   return(a$value)
> }
>
> Note that value.if.not.found has no default value in above.
> It behaves exactly like current get does if value.if.not.found
> is not specified, and if it is specified, it would be faster
> in the common situation mentioned below:
>  if(exists(x,...)) { get(x,...) }
>
> John
>
> P.S. if you like dromedaries call it valueIfNotFound ...
>
>  ..
>  John P. Nolan
>  Math/Stat Department
>  227 Gray Hall,   American University
>  4400 Massachusetts Avenue, NW
>  Washington, DC 20016-8050
>
>  jpno...@american.edu   voice: 202.885.3140
>  web: academic2.american.edu/~jpnolan
>  ..
>
>

Re: [Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}

2015-01-08 Thread Martin Maechler

> Adding an optional argument to get (and mget) like
> val <- get(name, where, ..., value.if.not.found=NULL )   (*)

> would be useful for many.  HOWEVER, it is possible that there could be 
> some confusion here: (*) can give a NULL because either x exists and 
> has value NULL, or because x doesn't exist.   If that matters, the user 
> would need to be careful about specifying a value.if.not.found that cannot 
> be confused with a valid value of x.  

Exactly -- well, of course: That problem { NULL can be the legit value of what
you want to get() } was the only reason to have a 'value.if.not' argument at all.

Note that this is not about a universal replacement of 
the  if(exists(..)) { .. get(..) } idiom, but rather a
replacement of these in the cases where speed matters very much,
which is e.g. in the low level support code for S4 method dispatch.

'value.if.not.found':
Note that CRAN checks require all arguments to be written in
full length.  Even though we have auto completion in ESS,
RStudio or other good R IDEs,  I very much like to keep
function calls somewhat compact.

And yes, as you mention the dromedars aka 2-hump camels:  
getIfExist is already horrible to my taste (and "_" is not S-like; 
yes that's all very much a matter of taste and yes I'm from the
20th century).

> To avoid this difficulty, perhaps we want both: have Martin's getifexists( ) 
> return a list with two values: 
>   - a boolean variable 'found'  # = value returned by exists( )
>   - a variable 'value'

> Then implement get( ) as:

> get <- function(x, ..., value.if.not.found) {

>   if (missing(value.if.not.found)) {
>     a <- getifexists(x, ...)
>     if (!a$found) stop("object '", x, "' not found")
>   } else {
>     a <- getifexists(x, ..., value.if.not.found)
>   }
>   return(a$value)
> }

Interesting...
Note that the above get() implementation would just be "conceptually", as 
all of this is also quite a bit about speed, and we do the
different cases in C anyway [via 'op' code].

> Note that value.if.not.found has no default value in above.
> It behaves exactly like current get does if value.if.not.found 
> is not specified, and if it is specified, it would be faster 
> in the common situation mentioned below:   
>  if(exists(x,...)) { get(x,...) }

Good... Let's talk about your getifexists() as I argue we'd keep
get() exactly as it is now anyway, if we use a new 3rd function (which I keep
calling 'getifexists()' for now):

I think in that case, getifexists() would not even *need* an argument 
'value.if.not' (or 'value.if.not.found'); it rather would return a 
  list(found = *, value = *)
in any case.
Alternatively, it could return
  structure(<found>, value = *)

In the first case, our main use case would be

  if((r <- getifexists(x, *))$found) {
 ## work with  r$value
  }

in the 2nd case {structure} :

  if((r <- getifexists(x, *))) {
 ## work with  attr(r,"value")
  }

I think that (both cases) would still be a bit slower (for the above
most important use case) but probably not much,
and it would look slightly more readable than my

   if (!is.null(r <- getifexists(x, *))) {
  ## work with  r
   }

After all of this, I think I'd still somewhat prefer my original proposal,
but not strongly -- I had originally also thought of returning the
two parts explicitly, but then tended to prefer the version that
behaved exactly like get() in the case the object is found.
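
To compare the call sites side by side, here is an R-level sketch of the original proposal and of the structure() variant. The function names are hypothetical, and both are built on exists() and get(), so they only illustrate the calling convention; the point of the real thing would be a single lookup in C.

```r
# Hypothetical names; each does two lookups, unlike the intended C version.

# Original proposal: behaves like get() when found, else returns value.if.not.
getifexists_orig <- function(x, envir = parent.frame(), inherits = TRUE,
                             value.if.not = NULL) {
  if (exists(x, envir = envir, inherits = inherits)) {
    get(x, envir = envir, inherits = inherits)
  } else {
    value.if.not
  }
}

# structure() variant: a logical 'found' flag carrying the value as attribute.
getifexists_attr <- function(x, envir = parent.frame(), inherits = TRUE) {
  if (exists(x, envir = envir, inherits = inherits)) {
    structure(TRUE, value = get(x, envir = envir, inherits = inherits))
  } else {
    FALSE
  }
}

e <- new.env()
assign("n", 10, envir = e)

if (!is.null(r <- getifexists_orig("n", e))) print(r)
if ((r <- getifexists_attr("n", e))) print(attr(r, "value"))
```

Note that the structure() variant cannot distinguish a found NULL from a missing attribute by looking at attr(r, "value") alone; the logical itself has to be consulted for that.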

... Nice interesting ideas! ... 
let the proposals and consideration flow ...

Martin


> John

> P.S. if you like dromedaries call it valueIfNotFound ...

:-) ;-)  
I don't .. as I said above, I already strongly dislike more than one hump. 
[ Each capital is one key stroke ("Shift") more ,
  and each "_" is two key strokes more on most key boards...,
  and I do like identifiers that I can also quickly pronounce on
  the phone or in teaching .. ]

>  ..
>  John P. Nolan
>  Math/Stat Department
>  227 Gray Hall,   American University
>  4400 Massachusetts Avenue, NW
>  Washington, DC 20016-8050
>  ..



[Rd] unloadNamespace

2015-01-08 Thread Paul Gilbert
In the documentation the closest thing I see to an explanation of this is 
that ?detach says "Unloading some namespaces has undesirable side effects"


Can anyone explain why unloading tseries will load zoo? I don't think 
this behavior is specific to tseries, it's just an example. I realize 
one would not usually unload something that is not loaded, but I would 
expect it to do nothing or give an error. I only discovered this when 
trying to clean up to debug another problem.


R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet"
and
R Under development (unstable) (2015-01-02 r67308) -- "Unsuffered 
Consequences"

...
Type 'q()' to quit R.

> loadedNamespaces()
[1] "base"  "datasets"  "graphics"  "grDevices" "methods"   "stats"
[7] "utils"
> unloadNamespace("tseries") # loads zoo ?
> loadedNamespaces()
 [1] "base"  "datasets"  "graphics"  "grDevices" "grid" "lattice"
 [7] "methods"   "quadprog"  "stats" "utils" "zoo"
>
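
One way to sidestep the surprise while cleaning up, sketched under the assumption that simply skipping namespaces that are not loaded is acceptable: check loadedNamespaces() first. The helper name unload_if_loaded() is made up for this example, and it does not explain why the unguarded call loads anything.

```r
# Hypothetical helper: only call unloadNamespace() for namespaces that are
# actually loaded, so nothing gets loaded as a side effect of "unloading".
unload_if_loaded <- function(pkg) {
  if (pkg %in% loadedNamespaces()) {
    unloadNamespace(pkg)
    TRUE
  } else {
    FALSE
  }
}

before <- loadedNamespaces()
unload_if_loaded("tseries")   # FALSE, and nothing changes, if tseries is not loaded
```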

Somewhat related, is there an easy way to get back to a "clean" state 
for loaded and attached things, as if R had just been started? I'm 
trying to do this in a vignette so it is not easy to stop and restart R.


Paul



Re: [Rd] New version of Rtools for Windows

2015-01-08 Thread Hin-Tak Leung

The r.dll crash is easy - you need to be using gcc-ar for ar, and gcc-ranlib 
for ranlib. I also posted a patch to fix the check failure for stack probing, 
as lto optimizes away the stack probing code, as it should.

yes, lto build's speed gain is very impressive.



--
On Thu, Jan 8, 2015 2:01 PM GMT Henric Winell wrote:

>On 2015-01-08 14:18, Avraham Adler wrote:
>
>> Very timely, as this is how I got into the problem I posted about
>> earlier; maybe some of the problems I ran into will mean more to
>> you and the experts on this thread, Dr. Murdoch. For reference, I run
>> Windows 7 64bit, and I am trying to build a 64 bit version of R-3.1.2.
>>
>> As we discussed offline, Dr. Murdoch, I've been trying to build R
>> using more recent tools than GCC4.6.3 prerelease. Ruben Von Boxen
>> (rubenvb) told me he is no longer developing his own builds of GCC,
>> but is focusing on MSYS2 and the mingw64 personal builds. So, similar
>> to what Jeroen said, I first installed MSYS2, whose initial
>> installation on windows is not so simple[1]. After the initial
>> install, the following packages need to be manually installed: make,
>> tar, zip, unzip, zlib, and rsync. I also installed base-devel, which
>> is way more than necessary, but there may be packages in there which
>> are necessary.
>>
>> I originally installed the most up-to-date version of GCC (4.9.2)[2],
>> and I did pick the -seh version, since I install (almost) all
>> packages from source (the one exception being nloptr for now), so the
>> exception handling should be consistent, and it is supposed to be up to
>> ~15% faster[3].
>>
>> The initial build crashed with the following error:
>>
>> gcc -std=gnu99 -m64 -I../../include -I. -DHAVE_CONFIG_H  -O3 -Wall
>> -pedantic -mtune=core2   -c xmalloc.c -o xmalloc.o
>> ar crs libtre.a regcomp.o regerror.o regexec.o tre-ast.o tre-compile.o
>> tre-match-approx.o tre-match-backtrack.o tre-match-parallel.o
>> tre-mem.o tre-parse.o tre-stack.o xmalloc.o
>> gcc -std=gnu99 -m64   -O3 -Wall -pedantic -mtune=core2   -c compat.c -o 
>> compat.o
>> compat.c:65:5: error: redefinition of 'snprintf'
>>   int snprintf(char *buffer, size_t max, const char *format, ...)
>>   ^
>> In file included from compat.c:3:0:
>> F:/MinGW64/x86_64-w64-mingw32/include/stdio.h:553:5: note: previous
>> definition of 'snprintf' was here
>>   int snprintf (char * __restrict__ __stream, size_t __n, const char *
>> __restrict__ __format, ...)
>>   ^
>> compat.c:75:5: error: redefinition of 'vsnprintf'
>>   int vsnprintf(char *buffer, size_t bufferSize, const char *format,
>> va_list args)
>>   ^
>> In file included from compat.c:3:0:
>> F:/MinGW64/x86_64-w64-mingw32/include/stdio.h:543:7: note: previous
>> definition of 'vsnprintf' was here
>> int vsnprintf (char * __restrict__ __stream, size_t __n, const char
>> * __restrict__ __format, va_list __local_argv)
>> ^
>> ../../gnuwin32/MkRules:218: recipe for target 'compat.o' failed
>> make[4]: *** [compat.o] Error 1
>> Makefile:120: recipe for target 'rlibs' failed
>> make[3]: *** [rlibs] Error 1
>> Makefile:179: recipe for target '../../bin/x64/R.dll' failed
>> make[2]: *** [../../bin/x64/R.dll] Error 2
>> Makefile:104: recipe for target 'rbuild' failed
>> make[1]: *** [rbuild] Error 2
>> Makefile:14: recipe for target 'all' failed
>> make: *** [all] Error 2
>>
>> After doing some checking (for example see [4]), I asked Duncan about
>> the problem, and he suggested moving the #ifndef _W64 in compat.c up
>> above the offending lines (65-75). That did not work, so, I figured
>> (it seems mistakenly from the other thread) that if those functions
>> are included from stdio already, I can just delete them from compat.c.
>> The specific lines are:
>>
>> int snprintf(char *buffer, size_t max, const char *format, ...)
>> {
>>  int res;
>>  va_list(ap);
>>  va_start(ap, format);
>>  res = trio_vsnprintf(buffer, max, format, ap);
>>  va_end(ap);
>>  return res;
>> }
>>
>> int vsnprintf(char *buffer, size_t bufferSize, const char *format, va_list 
>> args)
>> {
>>  return trio_vsnprintf(buffer, bufferSize, format, args);
>> }
>>
>> Continuing the build using 4.9.2 crashed again at the following point:
>>
>> gcc -std=gnu99 -m64 -I../include -I. -I../extra -DHAVE_CONFIG_H
>> -DR_DLL_BUILD  -O3 -Wall -pedantic -mtune=core2   -c malloc.c -o
>> malloc.o
>> windres -F pe-x86-64  -I../include -i dllversion.rc -o dllversion.o
>> gcc -std=gnu99 -m64 -shared -s -mwindows -o R.dll R.def console.o
>> dynload.o editor.o embeddedR.o extra.o opt.o pager.o preferences.o
>> psignal.o rhome.o rt_complete.o rui.o run.o shext.o sys-win32.o
>> system.o dos_wglob.o malloc.o ../main/libmain.a ../appl/libappl.a
>> ../nmath/libnmath.a getline/gl.a ../extra/xdr/libxdr.a
>> ../extra/pcre/libpcre.a ../extra/bzip2/libbz2.a
>> ../extra/intl/libintl.a ../extra/trio/libtrio.a ../extra/tzone/libtz.a
>> ../extra/tre/libtre.a ../extra/xz/liblzma.a dllversion.

Re: [Rd] unloadNamespace

2015-01-08 Thread Gabriel Becker
Paul,

My switchr package (https://github.com/gmbecker/switchr) has the
flushSession function which does what you want and seems to work (on my
test machine at least).

I haven't tested it under a recent R-devel, or with that specific package;
however, I will soon, as the overarching model of switchr relies on this
working.

If you do try it before me with that package, please let me know whether it
works or not.

~G

On Thu, Jan 8, 2015 at 7:45 AM, Paul Gilbert  wrote:

> In the documentation the closest thing I see to an explanation of this is
> that ?detach says "Unloading some namespaces has undesirable side effects"
>
> Can anyone explain why unloading tseries will load zoo? I don't think this
> behavior is specific to tseries, it's just an example. I realize one would
> not usually unload something that is not loaded, but I would expect it to
> do nothing or give an error. I only discovered this when trying to clean up
> to debug another problem.
>
> R version 3.1.2 (2014-10-31) -- "Pumpkin Helmet"
> and
> R Under development (unstable) (2015-01-02 r67308) -- "Unsuffered
> Consequences"
> ...
> Type 'q()' to quit R.
>
> > loadedNamespaces()
> [1] "base"  "datasets"  "graphics"  "grDevices" "methods"   "stats"
> [7] "utils"
> > unloadNamespace("tseries") # loads zoo ?
> > loadedNamespaces()
>  [1] "base"  "datasets"  "graphics"  "grDevices" "grid" "lattice"
>  [7] "methods"   "quadprog"  "stats" "utils" "zoo"
> >
>
> Somewhat related, is there an easy way to get back to a "clean" state for
> loaded and attached things, as if R had just been started? I'm trying to do
> this in a vignette so it is not easy to stop and restart R.
>
> Paul
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>



-- 
Gabriel Becker, PhD
Alumnus
Statistics Department
University of California, Davis



Re: [Rd] New version of Rtools for Windows

2015-01-08 Thread Avraham Adler
On Thu, Jan 8, 2015 at 10:48 AM, Hin-Tak Leung
 wrote:
>
> The r.dll crash is easy - you need to be using gcc-ar for ar, and gcc-ranlib 
> for ranlib. I also posted a patch to fix the check failure for stack probing, 
> as lto optimizes away the stack probing code, as it should.
>
> yes, lto build's speed gain is very impressive.
>
>


I apologize for my ignorance, but how would I do that? I tried by
changing the following in src/gnuwin32/MkRules.local:

# prefix for 64-bit: path or x86_64-w64-mingw32-
BINPREF64 = x86_64-w64-mingw32-gcc-

I added the gcc- as the suffix there, but I guess that is insufficient
as I still get the following error using 4.9.2:

windres -F pe-x86-64  -I../include -i dllversion.rc -o dllversion.o
gcc -std=gnu99 -m64 -shared -s -mwindows -o R.dll R.def console.o
dynload.o editor.o embeddedR.o extra.o opt.o pager.o preferences.o
psignal.o rhome.o rt_complete.o rui.o run.o shext.o sys-win32.o
system.o dos_wglob.o malloc.o ../main/libmain.a ../appl/libappl.a
../nmath/libnmath.a getline/gl.a ../extra/xdr/libxdr.a
../extra/pcre/libpcre.a ../extra/bzip2/libbz2.a
../extra/intl/libintl.a ../extra/trio/libtrio.a ../extra/tzone/libtz.a
../extra/tre/libtre.a ../extra/xz/liblzma.a dllversion.o -fopenmp -L.
-lgfortran -lRblas -L../../bin/x64 -lRzlib -lRgraphapp -lRiconv
-lcomctl32 -lversion
collect2.exe: error: ld returned 5 exit status
Makefile:150: recipe for target 'R.dll' failed
make[3]: *** [R.dll] Error 1
Makefile:179: recipe for target '../../bin/x64/R.dll' failed
make[2]: *** [../../bin/x64/R.dll] Error 2
Makefile:104: recipe for target 'rbuild' failed
make[1]: *** [rbuild] Error 2
Makefile:14: recipe for target 'all' failed
make: *** [all] Error 2

I still had to delete those lines in compat.c, so this build, were it
to have completed, is still subject to the non-conformance of
scientific notation printing that was discussed earlier.

Hin-tak, any suggestions for this error (and the compat.c for that
matter) that you, or any reader of this list, may have would be
greatly appreciated.

Thank you!

Avi




Re: [Rd] New version of Rtools for Windows

2015-01-08 Thread Hin-Tak Leung
Oh, I forgot to mention that besides setting AR, RANLIB and the stack probing 
fix, you also need a very up-to-date binutils. 2.25 was out in December. Even 
with that, if your linker's default is not what you are compiling for (i.e. a 
multiarch toolchain), you need to set GNUTARGET also, i.e. -m32/-m64 is not 
enough. Some fix to autodetect non-default targets went in after Christmas, 
before the new year, but I am not brave enough to try that on a daily basis yet 
(I only tested it and reported it, then reverted the change - how gcc invokes 
the linker is rather complicated and it is not easy to have two binutils 
installed...) - setting GNUTARGET seems safer :-).
Whether you need that depends on whether you are compiling for your toolchain's 
default target architecture.

AR, RANLIB, GNUTARGET are all environment variables - you set them the usual 
way. The stack probing fix is for passing "make check", once make has finished.

--
On Thu, Jan 8, 2015 6:14 PM GMT Avraham Adler wrote:

>On Thu, Jan 8, 2015 at 10:48 AM, Hin-Tak Leung
> wrote:
>>
>> The r.dll crash is easy - you need to be using gcc-ar for ar, and gcc-ranlib 
>> for ranlib. I also posted a patch to fix the check failure for stack 
>> probing, as lto optimizes away the stack probing code, as it should.
>>
>> yes, lto build's speed gain is very impressive.
>>
>
>
>I apologize for my ignorance, but how would I do that? I tried by
>changing the following in src/gnuwin32/MkRules.local:
>
># prefix for 64-bit: path or x86_64-w64-mingw32-
>BINPREF64 = x86_64-w64-mingw32-gcc-
>
>I added the gcc- as the suffix there, but I guess that is insufficient
>as I still get the following error using 4.9.2:
>
>windres -F pe-x86-64  -I../include -i dllversion.rc -o dllversion.o
>gcc -std=gnu99 -m64 -shared -s -mwindows -o R.dll R.def console.o
>dynload.o editor.o embeddedR.o extra.o opt.o pager.o preferences.o
>psignal.o rhome.o rt_complete.o rui.o run.o shext.o sys-win32.o
>system.o dos_wglob.o malloc.o ../main/libmain.a ../appl/libappl.a
>../nmath/libnmath.a getline/gl.a ../extra/xdr/libxdr.a
>../extra/pcre/libpcre.a ../extra/bzip2/libbz2.a
>../extra/intl/libintl.a ../extra/trio/libtrio.a ../extra/tzone/libtz.a
>../extra/tre/libtre.a ../extra/xz/liblzma.a dllversion.o -fopenmp -L.
>-lgfortran -lRblas -L../../bin/x64 -lRzlib -lRgraphapp -lRiconv
>-lcomctl32 -lversion
>collect2.exe: error: ld returned 5 exit status
>Makefile:150: recipe for target 'R.dll' failed
>make[3]: *** [R.dll] Error 1
>Makefile:179: recipe for target '../../bin/x64/R.dll' failed
>make[2]: *** [../../bin/x64/R.dll] Error 2
>Makefile:104: recipe for target 'rbuild' failed
>make[1]: *** [rbuild] Error 2
>Makefile:14: recipe for target 'all' failed
>make: *** [all] Error 2
>
>I still had to delete those lines in compat.c, so this build, were it
>to have completed, is still subject to the non-conformance of
>scientific notation printing that was discussed earlier.
>
>Hin-tak, any suggestions for this error (and the compat.c for that
>matter) that you, or any reader of this list, may have would be
>greatly appreciated.
>
>Thank you!
>
>Avi
>
>
>> --
>> On Thu, Jan 8, 2015 2:01 PM GMT Henric Winell wrote:
>>
>>On 2015-01-08 14:18, Avraham Adler wrote:
>>
>>> Very timely, as this is how I got into the problem I posted about
>>> earlier; maybe some of the problems I ran into will mean more to
>>> you and the experts on this thread, Dr. Murdoch. For reference, I run
>>> Windows 7 64-bit, and I am trying to build a 64-bit version of R-3.1.2.
>>>
>>> As we discussed offline, Dr. Murdoch, I've been trying to build R
>>> using more recent tools than GCC4.6.3 prerelease. Ruben Von Boxen
>>> (rubenvb) told me he is no longer developing his own builds of GCC,
>>> but is focusing on MSYS2 and the mingw64 personal builds. So, similar
>>> to what Jeroen said, I first installed MSYS2, whose initial
>>> installation on windows is not so simple[1]. After the initial
>>> install, the following packages need to be manually installed: make,
>>> tar, zip, unzip, zlib, and rsync. I also installed base-devel, which
>>> is way more than necessary, but there may be packages in there which
>>> are necessary.
>>>
>>> I originally installed the most up-to-date version of GCC (4.9.2)[2],
>>> and I picked the -seh version: since I install (almost) all
>>> packages from source (the one exception being nloptr for now), the
>>> exception handling should be consistent, and it is supposed to be up to
>>> ~15% faster[3].
>>>
>>> The initial build crashed with the following error:
>>>
>>> gcc -std=gnu99 -m64 -I../../include -I. -DHAVE_CONFIG_H  -O3 -Wall
>>> -pedantic -mtune=core2   -c xmalloc.c -o xmalloc.o
>>> ar crs libtre.a regcomp.o regerror.o regexec.o tre-ast.o tre-compile.o
>>> tre-match-approx.o tre-match-backtrack.o tre-match-parallel.o
>>> tre-mem.o tre-parse.o tre-stack.o xmalloc.o
>>> gcc -std=gnu99 -m64   -O3 -Wall -pedantic -mtune=core2   -c compat.c -o 
>>> c

Re: [Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}

2015-01-08 Thread luke-tierney

On Thu, 8 Jan 2015, Michael Lawrence wrote:


> If we do add an argument to get(), then it should be named consistently
> with the ifnotfound argument of mget(). As mentioned, the possibility of a
> NULL value is problematic. One solution is a sentinel value that indicates
> an unbound value (like R_UnboundValue).


A null default is fine -- it's a default; if it isn't right for a
particular case you can provide something else.



> But another idea (and one pretty similar to John's) is to follow the SYMSXP
> design at the C level, where there is a structure that points to the name
> and a value. We already have SYMSXPs at the R level of course (name
> objects) but they do not provide access to the value, which is typically
> R_UnboundValue. But this does not even need to be implemented with SYMSXP.
> The design would allow something like:
>
> binding <- getBinding("x", env)
> if (hasValue(binding)) {
>   x <- value(binding) # throws an error if none
>   message(name(binding), " has value ", x)
> }
>
> That, I think, is a bit verbose but readable and could be made fast. And I
> think binding objects would be useful in other ways, as they are
> essentially a "named object", for example when iterating over an
> environment.


This would need a lot more thought. Directly exposing the internals is
definitely not something we want to do as we may well want to change
that design. But there are lots of other corner issues that would have
to be thought through before going forward, such as what happens if an
rm occurs between obtaining a binding object and doing something with
it. Serialization would also need thinking through. This doesn't seem
like a worthwhile place to spend our efforts to me.

Adding getIfExists, or .get, or get0, or whatever seems fine. Adding
an argument to get() with missing giving current behavior may be OK
too. Rewriting exists and get as .Primitives may be sufficient though.

Best,

luke
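
The naming and NULL-default points above can be illustrated with a small sketch. It assumes a hypothetical helper, get_or_default(), which is not part of base R; it only wraps the existing mget() and reuses its ifnotfound argument name. The sentinel object is likewise just one way a caller could tell "unbound" apart from "bound to NULL".

```r
# Hypothetical helper (not in base R): one lookup through mget(),
# reusing the 'ifnotfound' argument name that mget() already has.
get_or_default <- function(name, envir, ifnotfound = NULL, inherits = FALSE) {
  mget(name, envir = envir, ifnotfound = list(ifnotfound),
       inherits = inherits)[[1L]]
}

e <- new.env()
assign("a", NULL, envir = e)            # 'a' exists and its value is NULL

# With the NULL default, "not found" and "bound to NULL" look the same.
is.null(get_or_default("a", e))         # TRUE
is.null(get_or_default("nope", e))      # TRUE

# A caller who needs the distinction can supply a private sentinel.
sentinel <- new.env()
identical(get_or_default("nope", e, ifnotfound = sentinel), sentinel)  # TRUE
identical(get_or_default("a", e, ifnotfound = sentinel), sentinel)     # FALSE
```

This matches the remark that a NULL default is only a default: code that must distinguish the two cases passes something else.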



Michael




On Thu, Jan 8, 2015 at 6:03 AM, John Nolan  wrote:


Adding an optional argument to get (and mget) like

val <- get(name, where, ..., value.if.not.found=NULL )   (*)

would be useful for many.  HOWEVER, it is possible that there could be
some confusion here: (*) can give a NULL because either x exists and
has value NULL, or because x doesn't exist.   If that matters, the user
would need to be careful about specifying a value.if.not.found that cannot
be confused with a valid value of x.

To avoid this difficulty, perhaps we want both: have Martin's getifexists()
return a list with two values:
  - a boolean variable 'found'  # = value returned by exists( )
  - a variable 'value'

Then implement get() as:

get <- function(x, ..., value.if.not.found) {

  if (missing(value.if.not.found)) {
    a <- getifexists(x, ...)
    if (!a$found) stop("object '", x, "' not found")
  } else {
    a <- getifexists(x, ..., value.if.not.found)
  }
  return(a$value)
}

Note that value.if.not.found has no default value in above.
It behaves exactly like current get does if value.if.not.found
is not specified, and if it is specified, it would be faster
in the common situation mentioned below:
 if(exists(x,...)) { get(x,...) }
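
A self-contained sketch of the two-value idea follows. The names getifexists() and get2() and their arguments are assumptions made for illustration (get2 is used so that base::get is not masked), and the lookup is done in plain R through mget(); a genuinely fast version would presumably need C-level support.

```r
# Hypothetical getifexists(): a single lookup that reports both whether the
# name was found and, if so, its value.
getifexists <- function(x, envir, inherits = TRUE, value.if.not.found = NULL) {
  unbound <- new.env()                  # private marker, cannot be a user value
  val <- mget(x, envir = envir, ifnotfound = list(unbound),
              inherits = inherits)[[1L]]
  if (identical(val, unbound)) {
    list(found = FALSE, value = value.if.not.found)
  } else {
    list(found = TRUE, value = val)
  }
}

# Hypothetical wrapper: errors like get() unless a fallback value is supplied.
get2 <- function(x, envir, ..., value.if.not.found) {
  if (missing(value.if.not.found)) {
    a <- getifexists(x, envir, ...)
    if (!a$found) stop("object '", x, "' not found")
  } else {
    a <- getifexists(x, envir, ..., value.if.not.found = value.if.not.found)
  }
  a$value
}

e <- new.env(parent = emptyenv())
assign("a", NULL, envir = e)

r <- getifexists("a", e)
r$found                                  # TRUE, although the value is NULL
getifexists("b", e)$found                # FALSE
get2("b", e, value.if.not.found = 42)    # 42
```

The 'found' flag is what removes the ambiguity between a missing object and an object whose value is NULL.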

John

P.S. if you like dromedaries call it valueIfNotFound ...

 ..
 John P. Nolan
 Math/Stat Department
 227 Gray Hall,   American University
 4400 Massachusetts Avenue, NW
 Washington, DC 20016-8050

 jpno...@american.edu   voice: 202.885.3140
 web: academic2.american.edu/~jpnolan
 ..


-"R-devel"  wrote: -
To: Martin Maechler , R-devel@r-project.org
From: Duncan Murdoch
Sent by: "R-devel"
Date: 01/08/2015 06:39AM
Subject: Re: [Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}

On 08/01/2015 4:16 AM, Martin Maechler wrote:
> In November, we had a "bug repository conversation"
> with Peter Haverty and myself:
>
>   https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16065
>
> where the bug report title started with
>
>  --->>  "exists" is a bottleneck for dispatch and package loading, ...
>
> Peter proposed an extra simplified and hence faster version of exists(),
> and I commented
>
> > --- Comment #2 from Martin Maechler ---
> > I'm very grateful that you've started exploring the bottlenecks of loading
> > packages with many S4 classes (and methods)...
> > and I hope we can make real progress there rather sooner than later.
>
> > OTOH, your `summaryRprof()` in your vignette indicates that exists() may use
> > up to 10% of the time spent in library(reportingTools), and your speedup
> > proposals of exists() may go up to ca. 30%, which is good and well worth
> > considering, but still we can only expect 2-3% speedup for package loading
> > which unfortunately is not much.
>
> > Still I agree it is worth looking at exists() as you did ... and
> > consider providing a fast simplified version of it in addition to current
> > exists() [I think].
>
> > BTW,

[Rd] Testing R packages on Solaris Studio

2015-01-08 Thread Jeroen Ooms
I have set up a Solaris server to test packages before submitting to
CRAN, in order to catch problems that might not reveal themselves on
Fedora, Debian, OSX or Windows. The machine runs a Solaris 11.2 vm
with Solaris Studio 12.3.

I was able to compile current r-devel using the suggested environment
variables from "R Installation and Administration" and:

  ./configure --prefix=/opt/R-devel --with-blas='-library=sunperf' --with-lapack

All works great (fast too), except that some CRAN packages with C++
code won't build. The compiler itself works: most packages (including
e.g. MCMCpack) build OK. However, packages like Rcpp and RJSONIO fail
with the errors shown here:
https://gist.github.com/jeroenooms/f1b6a172320a32f59c82.

I tried installing with GNU make, but that does not seem to be the problem:

  configure.vars = "MAKE=/opt/csw/bin/gmake"

I am aware that I can work around it by compiling with gcc instead of
solaris studio, but I would specifically like to replicate the setup
from CRAN.

Which additional args/vars/dependencies do I need to make Rcpp and
RJSONIO build as they do on the CRAN Solaris server?

> sessionInfo()
R Under development (unstable) (2015-01-07 r67351)
Platform: i386-pc-solaris2.11 (32-bit)
Running under: Solaris 11

locale:
[1] C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

loaded via a namespace (and not attached):
[1] tcltk_3.2.0 tools_3.2.0

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}

2015-01-08 Thread Jeroen Ooms
On Thu, Jan 8, 2015 at 6:36 AM, Duncan Murdoch  wrote:
>> val <- get(name, where, ..., value.if.not.found=NULL )   (*)
>
> That would be a bad idea, as it would change behaviour of existing uses of
> get().

Another approach would be if the "not found" behavior consists of a
callback, e.g. an expression or function:

  get(name, where, ..., not.found=stop("object ", name, " not found"))

This would cover the case of not.found=NULL, but also allows for
writing code with syntax similar to tryCatch

  obj <- get("foo", not.found = someDefaultValue())

Not sure what this would do to performance though.
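
A rough sketch of the callback idea, written as a separate wrapper rather than a change to get(). The name get_or() is hypothetical. It relies on R's lazy argument evaluation, so the not.found expression is only evaluated when the lookup fails; it still does exists() followed by get(), so it shows the semantics only and says nothing about speed.

```r
# Hypothetical wrapper: 'not.found' is an ordinary, lazily evaluated argument.
get_or <- function(name, envir,
                   not.found = stop("object '", name, "' not found"),
                   inherits = FALSE) {
  if (exists(name, envir = envir, inherits = inherits)) {
    get(name, envir = envir, inherits = inherits)
  } else {
    not.found                           # forced only on this branch
  }
}

e <- new.env()
assign("foo", 1:3, envir = e)

calls <- 0L
someDefaultValue <- function() {
  calls <<- calls + 1L                  # count how often the fallback runs
  "default"
}

get_or("foo", e, not.found = someDefaultValue())  # 1 2 3, fallback not evaluated
get_or("bar", e, not.found = someDefaultValue())  # "default"
calls                                             # 1
```

Leaving not.found unspecified keeps the usual error, and not.found = NULL gives the silent-NULL behaviour.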



Re: [Rd] setequal: better readability, reduced memory footprint, and minor speedup

2015-01-08 Thread peter dalgaard
If you look at the definition of %in%, you'll find that it is implemented using 
match, so if we did as you suggest, I give it about three days before someone 
suggests to inline the function call... Readability of source code is not 
usually our prime concern.

The && idea does have some merit, though. 
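
For illustration, the two variants can be put side by side. The function names setequal_c() and setequal_and() are made up here so that base::setequal is left alone; the bodies are the ones quoted below.

```r
setequal_c <- function(x, y) {
  x <- as.vector(x)
  y <- as.vector(y)
  all(c(x %in% y, y %in% x))       # builds one logical vector of length(x) + length(y)
}

setequal_and <- function(x, y) {
  x <- as.vector(x)
  y <- as.vector(y)
  all(x %in% y) && all(y %in% x)   # second test is skipped when the first is FALSE
}

cases <- list(list(1:3, c(3, 2, 1, 1)),
              list(1:3, 1:4),
              list(letters, "a"),
              list(integer(0), integer(0)))

# Both variants should agree on every pair.
agree <- vapply(cases, function(p) {
  identical(setequal_c(p[[1]], p[[2]]), setequal_and(p[[1]], p[[2]]))
}, logical(1))
all(agree)                         # TRUE
```

The && form never concatenates the two logical vectors, which is where the reduced memory footprint would come from.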

Apropos, why is there no setcontains()?

-pd

> On 06 Jan 2015, at 22:02 , Hervé Pagès  wrote:
> 
> Hi,
> 
> Current implementation:
> 
> setequal <- function (x, y)
> {
>  x <- as.vector(x)
>  y <- as.vector(y)
>  all(c(match(x, y, 0L) > 0L, match(y, x, 0L) > 0L))
> }
> 
> First what about replacing 'match(x, y, 0L) > 0L' and 'match(y, x, 0L) > 0L'
> with 'x %in% y' and 'y %in% x', respectively. They're strictly
> equivalent but the latter form is a lot more readable than the former
> (isn't this the "raison d'être" of %in%?):
> 
> setequal <- function (x, y)
> {
>  x <- as.vector(x)
>  y <- as.vector(y)
>  all(c(x %in% y, y %in% x))
> }
> 
> Furthermore, replacing 'all(c(x %in% y, y %in% x))' with
> 'all(x %in% y) && all(y %in% x)' improves readability even more and,
> more importantly, reduces memory footprint significantly on big vectors
> (e.g. by 15% on integer vectors with 15M elements):
> 
> setequal <- function (x, y)
> {
>  x <- as.vector(x)
>  y <- as.vector(y)
>  all(x %in% y) && all(y %in% x)
> }
> 
> It also seems to speed up things a little bit (not in a significant
> way though).
> 
> Cheers,
> H.
> 
> -- 
> Hervé Pagès
> 
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
> 
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319
> 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}

2015-01-08 Thread Peter Haverty
For what it's worth, I think we would need a new function if the default
behavior changes.  Since we already have "get" and "mget", maybe "cget" for
"conditional get"?  "if get", "safe get", ...

I like the idea of keeping the original "not found" behavior if the
"if.not.found" arg is missing. However, it will be important to keep the
number of arguments down.  (I noticed that Martin's example lacks a "frame"
argument.)  I've heard rumors that there are plans to reduce the function
call overhead, so perhaps this matters less now.

I like Luke's idea of making exists/get/etc. .Primitives. I think that will
be necessary in order to go fast.  For my two cents, I also think
get/assign should just be synonyms for the "[[" .Primitive.  That could
actually simplify things a bit. One might add "inherits=FALSE" and
"if.not.found" arguments to the environment "[[" code, for example.

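As a small illustration of how "[[" already behaves on environments in current R (nothing here is new API): it looks only in the given environment and returns NULL for a missing name instead of signalling an error.

```r
e <- new.env()

e[["x"]] <- 1                       # same effect as assign("x", 1, envir = e)
e[["x"]]                            # 1, like get("x", envir = e, inherits = FALSE)

e[["no_such_name"]]                 # NULL, no error

# No inheritance: base::c is visible through get()/exists(), but not through "[[".
exists("c", envir = e)              # TRUE (found in an enclosing environment)
is.null(e[["c"]])                   # TRUE
```

So "[[" already acts like a get() with inherits = FALSE and a NULL fallback, which is the starting point for the suggestion above.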
Regards,
Pete


Pete


Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com


Re: [Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}

2015-01-08 Thread Peter Haverty
Michael's idea has an interesting bonus that he and I discussed earlier.
It would be very convenient to have a container of key/value pairs.  I
imagine many people often write this:

x <- mapply(names(x), x, FUN = function(k, v) {
  # work with key and value
})

especially ex-Perl people accustomed to

while ( my ($key, $value) = each(%some_hash) ) { }

Perhaps there is room for additional discussion of using lists of SYMSXPs
in this manner. (If SYMSXPs are not that safe, perhaps a looping construct
for named vectors that gave the illusion of iterating over a list of
two-tuples.)
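
For reference, a sketch of the key/value idiom as it can be written today, for a named list and for an environment. The data and the helper describe() are invented for the example.

```r
x <- list(a = 1, b = "two", c = 3:5)

# Hypothetical per-pair worker: gets the key and the value together.
describe <- function(k, v) paste(k, class(v), length(v))

# Named list: walk names and values in parallel.
desc <- mapply(describe, names(x), x)

# Environment: convert to a named list first (element order is not guaranteed).
e <- list2env(x)
kv <- as.list(e)
desc_env <- mapply(describe, names(kv), kv)

unname(desc)   # "a numeric 1" "b character 1" "c integer 3"
```

A container of key/value pairs, or a dedicated looping construct, would remove the need to pass names(x) and x separately.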



Pete


Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com


Re: [Rd] RFC: getifexists() {was [Bug 16065] "exists" ...}

2015-01-08 Thread Michael Lawrence
On Thu, Jan 8, 2015 at 11:57 AM,  wrote:

> On Thu, 8 Jan 2015, Michael Lawrence wrote:
>
>> If we do add an argument to get(), then it should be named consistently
>> with the ifnotfound argument of mget(). As mentioned, the possibility of a
>> NULL value is problematic. One solution is a sentinel value that indicates
>> an unbound value (like R_UnboundValue).
>>
>
> A null default is fine -- it's a default; if it isn't right for a
> particular case you can provide something else.
>
>
>> But another idea (and one pretty similar to John's) is to follow the SYMSXP
>> design at the C level, where there is a structure that points to the name
>> and a value. We already have SYMSXPs at the R level of course (name
>> objects) but they do not provide access to the value, which is typically
>> R_UnboundValue. But this does not even need to be implemented with SYMSXP.
>> The design would allow something like:
>>
>> binding <- getBinding("x", env)
>> if (hasValue(binding)) {
>>   x <- value(binding) # throws an error if none
>>   message(name(binding), " has value ", x)
>> }
>>
>> That, I think, is a bit verbose but readable and could be made fast. And I
>> think binding objects would be useful in other ways, as they are
>> essentially a "named object", for example when iterating over an
>> environment.
>>
>
> This would need a lot more thought. Directly exposing the internals is
> definitely not something we want to do as we may well want to change
> that design. But there are lots of other corner issues that would have
> to be thought through before going forward, such as what happens if an
> rm occurs between obtaining a binding object and doing something with
> it. Serialization would also need thinking through. This doesn't seem
> like a worthwhile place to spend our efforts to me.
>
>

Just wanted to be clear that I was not suggesting to expose any internals.
We could implement the behavior using SYMSXP, or not. Nor would the binding
need to be mutable. The binding would be considered independent of the
environment from which it was retrieved. As Pete has mentioned, it could be
a useful abstraction to have in general.
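
To make the "immutable and independent of the environment" reading concrete, here is a minimal sketch in plain R. The function names follow the earlier example (getBinding, hasValue, value, name), but the implementation is purely an assumption: the binding is a snapshot stored in an ordinary list, with no SYMSXP involved.

```r
# Hypothetical binding object: a snapshot of (name, found, value).
getBinding <- function(name, envir, inherits = FALSE) {
  unbound <- new.env()                 # private marker for "no value"
  val <- mget(name, envir = envir, ifnotfound = list(unbound),
              inherits = inherits)[[1L]]
  found <- !identical(val, unbound)
  structure(list(name = name, found = found, value = if (found) val),
            class = "binding")
}

hasValue <- function(b) b$found
name <- function(b) b$name
value <- function(b) {
  if (!b$found) stop("binding '", b$name, "' has no value")
  b$value
}

env <- new.env()
assign("x", 42, envir = env)
binding <- getBinding("x", env)

rm("x", envir = env)                   # a later rm() does not touch the snapshot

if (hasValue(binding)) {
  message(name(binding), " has value ", value(binding))
}
```

Under this reading, an rm() between obtaining the binding and using it is harmless, because the binding never refers back to the environment.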



[Rd] latex warning

2015-01-08 Thread Gábor Csárdi
Dear all,

I am getting an R CMD check warning about the PDF manual. I am having a
hard time finding out what is wrong; here is the log of the Rd2pdf call.

The full check (and other) log is at
https://api.travis-ci.org/jobs/46373922/log.txt?deansi=true if anybody is
interested, and the package itself is here:
https://github.com/metacran/r-builder/tree/bintex/rbuildertest

Thanks, Best,
Gabor


+cat ./rbuildertest.Rcheck/Rdlatex.log
Hmm ... looks like a package
This is pdfTeX, Version 3.14159265-2.6-1.40.15 (TeX Live 2014) (preloaded
format=pdflatex)
 restricted \write18 enabled.

kpathsea: Running mktexfmt pdflatex.fmt
fmtutil: running `pdftex -ini   -jobname=pdflatex -progname=pdflatex
-translate-file=cp227.tcx *pdflatex.ini' ...
This is pdfTeX, Version 3.14159265-2.6-1.40.15 (TeX Live 2014) (INITEX)
 restricted \write18 enabled.
 (/home/travis/R-bin/texlive/texmf-dist/web2c/cp227.tcx)
entering extended mode
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/latexconfig/pdflatex.ini
(/home/travis/R-bin/texlive/texmf-config/tex/generic/config/pdftexconfig.tex)
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/latex.ltx
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/texsys.cfg)
./texsys.aux found


\@currdir set to: ./.


Assuming \openin and \input
have the same search path.


Defining UNIX/DOS style filename parser.

catcodes, registers, compatibility for TeX 2,  parameters,
LaTeX2e <2014/05/01>
hacks, control, par, spacing, files, font encodings, lengths,


Local config file fonttext.cfg used


(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/fonttext.cfg
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/fonttext.ltx
=== Don't modify this file, use a .cfg file instead ===

(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/omlenc.def)
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/t1enc.def)
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/ot1enc.def)
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/omsenc.def)
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/t1cmr.fd)
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/ot1cmr.fd)
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/ot1cmss.fd)
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/ot1cmtt.fd)))


Local config file fontmath.cfg used


(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/fontmath.cfg
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/fontmath.ltx
=== Don't modify this file, use a .cfg file instead ===

(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/omlcmm.fd)
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/omscmsy.fd)
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/omxcmex.fd)
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/ucmr.fd)))


Local config file preload.cfg used

=
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/preload.cfg
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/preload.ltx)) page
nos.,
x-ref, environments, center, verbatim, math definitions, boxes, title,
sectioning, contents, floats, footnotes, index, bibliography, output,
===
Local configuration file hyphen.cfg used
===
(/home/travis/R-bin/texlive/texmf-dist/tex/generic/babel/hyphen.cfg
(/home/travis/R-bin/texlive/texmf-dist/tex/generic/babel/switch.def)
(/home/travis/R-bin/texlive/texmf-dist/tex/generic/hyphen/hyphen.tex)
(/home/travis/R-bin/texlive/texmf-dist/tex/generic/hyphen/dumyhyph.tex)
(/home/travis/R-bin/texlive/texmf-dist/tex/generic/hyphen/zerohyph.tex))
=
Applying patch file ltpatch.ltx
=
(/home/travis/R-bin/texlive/texmf-dist/tex/latex/base/ltpatch.ltx)
 ) )
Beginning to dump on file pdflatex.fmt
 (preloaded format=pdflatex 2015.1.8)
4976 strings of total length 68991
45099 memory locations dumped; current usage is 144&43215
3320 multiletter control sequences
\font\nullfont=nullfont
\font\OMX/cmex/m/n/10=cmex10
\font\tenln=line10
\font\tenlnw=linew10
\font\tencirc=lcircle10
\font\tencircw=lcirclew10
\font\OT1/cmr/m/n/5=cmr5
\font\OT1/cmr/m/n/7=cmr7
\font\OT1/cmr/m/n/10=cmr10
\font\OML/cmm/m/it/5=cmmi5
\font\OML/cmm/m/it/7=cmmi7
\font\OML/cmm/m/it/10=cmmi10
\font\OMS/cmsy/m/n/5=cmsy5
\font\OMS/cmsy/m/n/7=cmsy7
\font\OMS/cmsy/m/n/10=cmsy10
3633 words of font info for 14 preloaded fonts
14 hyphenation exceptions
Hyphenation trie of length 6081 has 183 ops out of 35111
  2 for language 1
  181 for language 0
0 words of pdfTeX memory
0 indirect objects
No pages of output.
Transcript written on pdflatex.log.
fmtutil: /home/travis/.texlive2014/texmf-var/web2c/pdftex/pdflatex.fmt
installed.
fmtutil: No errors, exiting successfully.
entering extended mode
(./Rd2.tex
LaTeX2e <2014/05/01>
Babel <3.9l> and hyphe

Re: [Rd] setequal: better readability, reduced memory footprint, and minor speedup

2015-01-08 Thread William Dunlap
> why is there no setcontains()?

Several packages define is.subset(), which I am assuming is what you are
proposing, but with its arguments reversed.  E.g., package:algstat has
   is.subset <- function(x, y) all(x %in% y)
   containsQ <- function(y, x) all(x %in% y)
and package:rje has essentially the same is.subset.

package:arulesSequences and package:arules have an S4 generic called
is.subset, which is entirely different (it is not a predicate, but returns
a matrix).
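
The argument-order point can be shown in a few lines. The names is_subset() and set_contains() are hypothetical and chosen so as not to clash with any of the packages mentioned.

```r
# is x a subset of y?
is_subset <- function(x, y) all(as.vector(x) %in% as.vector(y))

# does y contain x?  Same test, arguments swapped.
set_contains <- function(y, x) is_subset(x, y)

is_subset(1:2, 1:5)        # TRUE
set_contains(1:5, 1:2)     # TRUE
is_subset(1:5, 1:2)        # FALSE
```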


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Jan 8, 2015 at 1:30 PM, peter dalgaard  wrote:

> If you look at the definition of %in%, you'll find that it is implemented
> using match, so if we did as you suggest, I give it about three days before
> someone suggests to inline the function call... Readability of source code
> is not usually our prime concern.
>
> The && idea does have some merit, though.
>
> Apropos, why is there no setcontains()?
>
> -pd
>
> > On 06 Jan 2015, at 22:02 , Hervé Pagès  wrote:
> >
> > Hi,
> >
> > Current implementation:
> >
> > setequal <- function (x, y)
> > {
> >  x <- as.vector(x)
> >  y <- as.vector(y)
> >  all(c(match(x, y, 0L) > 0L, match(y, x, 0L) > 0L))
> > }
> >
> > First what about replacing 'match(x, y, 0L) > 0L' and 'match(y, x, 0L) >
> 0L'
> > with 'x %in% y' and 'y %in% x', respectively. They're strictly
> > equivalent but the latter form is a lot more readable than the former
> > (isn't this the "raison d'être" of %in%?):
> >
> > setequal <- function (x, y)
> > {
> >  x <- as.vector(x)
> >  y <- as.vector(y)
> >  all(c(x %in% y, y %in% x))
> > }
> >
> > Furthermore, replacing 'all(c(x %in% y, y %in% x))' with
> > 'all(x %in% y) && all(y %in% x)' improves readability even more and,
> > more importantly, reduces memory footprint significantly on big vectors
> > (e.g. by 15% on integer vectors with 15M elements):
> >
> > setequal <- function (x, y)
> > {
> >  x <- as.vector(x)
> >  y <- as.vector(y)
> >  all(x %in% y) && all(y %in% x)
> > }
> >
> > It also seems to speed up things a little bit (not in a significant
> > way though).
> >
> > Cheers,
> > H.
> >
> > --
> > Hervé Pagès
> >
> > Program in Computational Biology
> > Division of Public Health Sciences
> > Fred Hutchinson Cancer Research Center
> > 1100 Fairview Ave. N, M1-B514
> > P.O. Box 19024
> > Seattle, WA 98109-1024
> >
> > E-mail: hpa...@fredhutch.org
> > Phone:  (206) 667-5791
> > Fax:(206) 667-1319
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd@cbs.dk  Priv: pda...@gmail.com
>


Re: [Rd] setequal: better readability, reduced memory footprint, and minor speedup

2015-01-08 Thread Peter Haverty
How about calling unique() on both and comparing the lengths?  It's less
work, especially in allocation.



Pete


Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com

On Thu, Jan 8, 2015 at 1:30 PM, peter dalgaard  wrote:

> If you look at the definition of %in%, you'll find that it is implemented
> using match, so if we did as you suggest, I give it about three days before
> someone suggests to inline the function call... Readability of source code
> is not usually our prime concern.
>
> The && idea does have some merit, though.
>
> Apropos, why is there no setcontains()?
>
> -pd
>


Re: [Rd] setequal: better readability, reduced memory footprint, and minor speedup

2015-01-08 Thread Michael Lawrence
Currently unique() calls duplicated() internally and then extracts the
non-duplicated elements. One could write a countUnique() that simply counts,
rather than allocating the logical return value of duplicated(). So much of
the cost is in the hash operation, though, that it probably won't help much;
that might depend on the sizes of things. The more unique elements, the
better it would perform.
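
As an R-level illustration of that relationship (not the internal C code), unique(x) gives the same result as subsetting by !duplicated(x). The helper name count_unique below is hypothetical; it still allocates the logical vector returned by duplicated(), which is exactly the allocation a C-level countUnique would avoid.

```r
x <- c(3L, 1L, 3L, 2L, 1L)

# unique() keeps the first occurrence of each value,
# which is what subsetting by !duplicated() does.
identical(unique(x), x[!duplicated(x)])   # TRUE

# Hypothetical R-level stand-in for countUnique(): it counts without
# extracting, but still allocates duplicated()'s logical vector.
count_unique <- function(x) sum(!duplicated(x))

count_unique(x) == length(unique(x))      # TRUE
```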


On Thu, Jan 8, 2015 at 2:06 PM, Peter Haverty 
wrote:

> How about unique them both and compare the lengths?  It's less work,
> especially allocation.
>
>
>
> Pete
>
> 
> Peter M. Haverty, Ph.D.
> Genentech, Inc.
> phave...@gene.com
>


Re: [Rd] setequal: better readability, reduced memory footprint, and minor speedup

2015-01-08 Thread Peter Haverty
Try this out. It looks like a 2X speedup for some cases and a wash in
others.  "unique" does two allocations, but skipping the "> 0L" allocation
could make up for it.

library(microbenchmark)
library(RUnit)

x = sample.int(1e4, 1e5, TRUE)
y = sample.int(1e4, 1e5, TRUE)

set_equal <- function(x, y) {
    xu = .Internal(unique(x, FALSE, FALSE, NA))
    yu = .Internal(unique(y, FALSE, FALSE, NA))
    if (length(xu) != length(yu)) {
        return(FALSE)
    }
    return( all(match(xu, yu, 0L) > 0L) )
}

set_equal2 <- function(x, y) {
    xu = .Internal(unique(x, FALSE, FALSE, NA))
    yu = .Internal(unique(y, FALSE, FALSE, NA))
    if (length(xu) != length(yu)) {
        return(FALSE)
    }
    return( !anyNA(match(xu, yu)) )
}

microbenchmark(
    a = setequal(x, y),
    b = set_equal(x, y),
    c = set_equal2(x, y)
)
checkIdentical(setequal(x, y), set_equal(x, y))
checkIdentical(setequal(x, y), set_equal2(x, y))

x = y
microbenchmark(
    a = setequal(x, y),
    b = set_equal(x, y),
    c = set_equal2(x, y)
)
checkIdentical(setequal(x, y), set_equal(x, y))
checkIdentical(setequal(x, y), set_equal2(x, y))


Sorry, I'm probably over-posting today.

Regards,



Re: [Rd] setequal: better readability, reduced memory footprint, and minor speedup

2015-01-08 Thread Peter Haverty
I was thinking something like:

setequal <- function(x,y) {
    xu = unique(x)
    yu = unique(y)
    if (length(xu) != length(yu)) { return(FALSE) }
    return (all( match( xu, yu, 0L ) > 0L ) )
}

This lets you fail early for cheap (skipping the allocation from the
">0L"s).  Whether or not this goes fast depends a lot on the uniqueness of
x and y and whether or not you want to optimize for the TRUE or FALSE case.
You'd do much better to make some real hashes in C and compare the keys,
but it's probably not worth the complexity.
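
A few quick checks of that idea, as a sketch. The function name set_equal_u is made up for this example, and it uses only base R calls (unique(), match()). The early length test is valid because, once both inputs are de-duplicated and have the same length, finding every element of xu in yu implies the two sets coincide.

```r
# Hypothetical name; compares sets via de-duplicated vectors.
set_equal_u <- function(x, y) {
  xu <- unique(as.vector(x))
  yu <- unique(as.vector(y))
  # Different numbers of distinct values: cannot be equal sets.
  if (length(xu) != length(yu)) return(FALSE)
  # Same number of distinct values: a one-directional check suffices.
  all(match(xu, yu, 0L) > 0L)
}

cases <- list(
  list(c(1, 2, 2, 3), c(3, 1, 2)),    # equal as sets
  list(c(1, 2, 3),    c(1, 2, 4)),    # same size, different members
  list(c(1, 2),       c(1, 2, 3)),    # different sizes (early exit)
  list(integer(0),    integer(0))     # both empty
)

# The sketch agrees with base setequal() on each case.
for (cs in cases) {
  stopifnot(identical(set_equal_u(cs[[1]], cs[[2]]),
                      setequal(cs[[1]], cs[[2]])))
}
```

The third case is the one that benefits from the early exit: no match() call is made at all once the lengths differ.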




Pete


Peter M. Haverty, Ph.D.
Genentech, Inc.
phave...@gene.com

On Thu, Jan 8, 2015 at 2:06 PM, Peter Haverty  wrote:

> How about unique them both and compare the lengths?  It's less work,
> especially allocation.
>
>
>
> Pete
>
> 
> Peter M. Haverty, Ph.D.
> Genentech, Inc.
> phave...@gene.com
>


Re: [Rd] setequal: better readability, reduced memory footprint, and minor speedup

2015-01-08 Thread Hervé Pagès

On 01/08/2015 01:30 PM, peter dalgaard wrote:

> If you look at the definition of %in%, you'll find that it is implemented using
> match, so if we did as you suggest, I give it about three days before someone
> suggests to inline the function call...

But you wouldn't bet money on that, right? Because you know you would
lose.

> Readability of source code is not usually our prime concern.

Don't sacrifice readability if you do not have a good reason for it.
What's your reason here? Are you seriously suggesting that inlining
makes a significant difference? As Michael pointed out, the expensive
operation here is the hashing. But sadly some people like inlining and
want to use it everywhere: it's easy and they feel good about it, even
if it hurts readability and maintainability. If you use x %in% y
instead of the inlined version, then the day someone changes the
implementation of x %in% y for something faster, or fixes a bug
in it, your code will automatically benefit; right now it won't.

More simply put: good readability generally leads to better code.
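
For reference, a small sketch of the equivalence being discussed. It compares x %in% y with the expanded match() call, and shows that && skips its right-hand side when the left-hand side is already FALSE. The vectors are arbitrary illustrative data, not taken from the thread.

```r
x <- c(1L, 5L, 9L)
y <- 1:6

# %in% and the inlined match() form give the same logical vector.
identical(x %in% y, match(x, y, nomatch = 0L) > 0L)   # TRUE

# all(x %in% y) is FALSE here (9 is not in y), so with && the second
# operand is never evaluated; stop() would raise an error if it were.
all(x %in% y) && stop("not reached")                   # FALSE
```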



> The && idea does have some merit, though.
>
> Apropos, why is there no setcontains()?

Wait... shouldn't everybody use all(match(x, y, nomatch = 0L) > 0L) ?

H.

> -pd



--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:(206) 667-1319



Re: [Rd] New version of Rtools for Windows

2015-01-08 Thread Avraham Adler
Regarding the redefinition error, I've asked on StackOverflow for
advice [1], but I have noticed the following; perhaps someone here can
understand what changed between the stdio.h of 4.6.3 and the stdio.h
of 4.8.4. In GCC 4.8.4, the section of stdio.h which is referenced in
the errors is the following:

#if !defined (__USE_MINGW_ANSI_STDIO) || __USE_MINGW_ANSI_STDIO == 0
/* this is here to deal with software defining
 * vsnprintf as _vsnprintf, eg. libxml2.  */
#pragma push_macro("snprintf")
#pragma push_macro("vsnprintf")
# undef snprintf
# undef vsnprintf
  int __cdecl __ms_vsnprintf(char * __restrict__ d,size_t n,
      const char * __restrict__ format,va_list arg)
    __MINGW_ATTRIB_DEPRECATED_MSVC2005 __MINGW_ATTRIB_DEPRECATED_SEC_WARN;

  __mingw_ovr
  __MINGW_ATTRIB_NONNULL(3)
  int vsnprintf (char * __restrict__ __stream, size_t __n,
      const char * __restrict__ __format, va_list __local_argv)
  {
    return __ms_vsnprintf (__stream, __n, __format, __local_argv);
  }

  int __cdecl __ms_snprintf(char * __restrict__ s, size_t n,
      const char * __restrict__  format, ...);

#ifndef __NO_ISOCEXT
__mingw_ovr
__MINGW_ATTRIB_NONNULL(3)
int snprintf (char * __restrict__ __stream, size_t __n,
    const char * __restrict__ __format, ...)
{
  register int __retval;
  __builtin_va_list __local_argv; __builtin_va_start( __local_argv, __format );
  __retval = __ms_vsnprintf (__stream, __n, __format, __local_argv);
  __builtin_va_end( __local_argv );
  return __retval;
}
#endif /* !__NO_ISOCEXT */

#pragma pop_macro ("vsnprintf")
#pragma pop_macro ("snprintf")
#endif

The corresponding section in 4.6.3 as found in the Rtools for Windows
installation is:

#if !defined (__USE_MINGW_ANSI_STDIO) || __USE_MINGW_ANSI_STDIO == 0
/* this is here to deal with software defining
 * vsnprintf as _vsnprintf, eg. libxml2.  */
#pragma push_macro("snprintf")
#pragma push_macro("vsnprintf")
# undef snprintf
# undef vsnprintf
  int __cdecl vsnprintf(char * __restrict__ d,size_t n,
      const char * __restrict__ format,va_list arg)
    __MINGW_ATTRIB_DEPRECATED_MSVC2005 __MINGW_ATTRIB_DEPRECATED_SEC_WARN;

#ifndef __NO_ISOCEXT
  int __cdecl snprintf(char * __restrict__ s, size_t n,
      const char * __restrict__  format, ...);
#ifndef __CRT__NO_INLINE
  __CRT_INLINE int __cdecl vsnprintf(char * __restrict__ d,size_t n,
      const char * __restrict__ format,va_list arg)
  {
    return _vsnprintf (d, n, format, arg);
  }
#endif /* !__CRT__NO_INLINE */
#endif /* !__NO_ISOCEXT */
#pragma pop_macro ("vsnprintf")
#pragma pop_macro ("snprintf")
#endif

The latter does not have a direct redefinition of the two functions. I
still don't know why the #undef directives do not work [1].

Thank you,

Avi

[1] https://stackoverflow.com/questions/27853225/is-there-a-way-to-include-stdio-h-but-ignore-some-of-the-functions-therein

On Thu, Jan 8, 2015 at 2:27 PM, Hin-Tak Leung
 wrote:
> Oh, I forgot to mention that besides setting AR, RANLIB and the stack probing
> fix, you also need a very up to date binutils. 2.25 was out in December. Even
> with that, if your linker's default is not what you are compiling for (i.e. a
> multiarch toolchain), you need to set GNUTARGET also, i.e. -m32/-m64 is not
> enough. Some fix to autodetect non-default targets went in after Christmas
> before the new year, but I am not brave enough to try that on a daily basis
> yet (only tested it and reported it, then reverted the change - how gcc
> invokes the linker is rather complicated and it is not easy to have two
> binutils installed...) - setting GNUTARGET seems safer :-).
> Whether you need that depends on whether you are compiling for your
> toolchain's default target architecture.
>
> AR, RANLIB, GNUTARGET are all environment variables - you set them the usual 
> way. The stack probing fix is for passing "make check", when you finish make.
>
> --
> On Thu, Jan 8, 2015 6:14 PM GMT Avraham Adler wrote:
>
>>On Thu, Jan 8, 2015 at 10:48 AM, Hin-Tak Leung
>> wrote:
>>>
>>> The r.dll crash is easy - you need to be using gcc-ar for ar, and 
>>> gcc-ranlib for ranlib. I also posted a patch to fix the check failure for 
>>> stack probing, as lto optimizes away the stack probing code, as it should.
>>>
>>> yes, lto build's speed gain is very impressive.
>>>
>>
>>
>>I apologize for my ignorance, but how would I do that? I tried by
>>changing the following in src/gnuwin32/MkRules.local:
>>
>># prefix for 64-bit: path or x86_64-w64-mingw32-
>>BINPREF64 = x86_64-w64-mingw32-gcc-
>>
>>I added the gcc- as the suffix there, but I guess that is insufficient
>>as I still get the following error using 4.9.2:
>>
>>windres -F pe-x86-64  -I../include -i dllversion.rc -o dllversion.o
>>gcc -std=gnu99 -m64 -shared -s -mwindows -o R.dll R.def console.o
>>dynload.o editor.o embeddedR.o extra.o opt.o pager.o preferences.o
>>psignal.o rhome.o rt_complete.o rui.o run.o shext.o sys-win32.o
>>system.o dos_wglob.o malloc.o ../main/libmain.a ../appl/libappl.a
>>../nmath/libnmath.a getline/gl.a ../