[Rd] openNLP Segmentation Fault (PR#13986)

2009-10-06 Thread michael
Full_Name: Michael Olschimke
Version: 2.9
OS: Ubuntu 8.04 LTS
Submission from: (NULL) (188.96.220.32)


When using openNLP (0.0-7) with rJava (0.7-0) and java-6-sun-1.6.0.07 on Ubuntu
8.04 LTS (32-bit), I get the error message below when calling tagPOS. I
installed the packages as superuser and am also running R as superuser.
Sometimes the message does not appear and the desired behavior occurs (this has
happened once or twice).

tagPOS("dies ist ein test", language="de")
[itchy:08338] *** Process received signal ***
[itchy:08338] Signal: Segmentation fault (11)
[itchy:08338] Signal code: Invalid permissions (2)
[itchy:08338] Failing at address: 0xb7535100
[itchy:08338] [ 0] [0xb7eef440]
[itchy:08338] [ 1] [0xb4b2dc9f]
[itchy:08338] [ 2] [0xb4b36707]
[itchy:08338] [ 3] [0xb4b3658a]
[itchy:08338] [ 4] [0xb4b37496]
[itchy:08338] [ 5] [0xb4a9ed6a]
[itchy:08338] [ 6] [0xb4a9c249]
[itchy:08338] [ 7] 
/usr/lib/jvm/java-6-sun-1.6.0.07/jre/lib/i386/client/libjvm.so
[0x621c63d]
[itchy:08338] [ 8] 
/usr/lib/jvm/java-6-sun-1.6.0.07/jre/lib/i386/client/libjvm.so
[0x63107b8]
[itchy:08338] [ 9] 
/usr/lib/jvm/java-6-sun-1.6.0.07/jre/lib/i386/client/libjvm.so
[0x621c4d0]
[itchy:08338] [10] 
/usr/lib/jvm/java-6-sun-1.6.0.07/jre/lib/i386/client/libjvm.so
[0x6245d67]
[itchy:08338] [11] 
/usr/lib/jvm/java-6-sun-1.6.0.07/jre/lib/i386/client/libjvm.so
[0x622a1ca]
[itchy:08338] [12] 
/usr/local/lib/R/site-library/rJava/libs/rJava.so(RcallMethod+0x6e1)
[0xb758c951]
[itchy:08338] [13] /usr/lib/R/lib/libR.so [0xb7c811a9]
[itchy:08338] [14] /usr/lib/R/lib/libR.so(Rf_eval+0x714) [0xb7ca9d04]
[itchy:08338] [15] /usr/lib/R/lib/libR.so [0xb7cad00e]
[itchy:08338] [16] /usr/lib/R/lib/libR.so(Rf_eval+0x451) [0xb7ca9a41]
[itchy:08338] [17] /usr/lib/R/lib/libR.so [0xb7cac28c]
[itchy:08338] [18] /usr/lib/R/lib/libR.so(Rf_eval+0x451) [0xb7ca9a41]
[itchy:08338] [19] /usr/lib/R/lib/libR.so [0xb7cab280]
[itchy:08338] [20] /usr/lib/R/lib/libR.so(Rf_eval+0x451) [0xb7ca9a41]
[itchy:08338] [21] /usr/lib/R/lib/libR.so(Rf_applyClosure+0x2ac) [0xb7cad97c]
[itchy:08338] [22] /usr/lib/R/lib/libR.so(Rf_eval+0x349) [0xb7ca9939]
[itchy:08338] [23] /usr/lib/R/lib/libR.so [0xb7cab280]
[itchy:08338] [24] /usr/lib/R/lib/libR.so(Rf_eval+0x451) [0xb7ca9a41]
[itchy:08338] [25] /usr/lib/R/lib/libR.so(Rf_applyClosure+0x2ac) [0xb7cad97c]
[itchy:08338] [26] /usr/lib/R/lib/libR.so(Rf_eval+0x349) [0xb7ca9939]
[itchy:08338] [27] /usr/lib/R/lib/libR.so [0xb7caee4f]
[itchy:08338] [28] /usr/lib/R/lib/libR.so(Rf_eval+0x1d8) [0xb7ca97c8]
[itchy:08338] [29] /usr/lib/R/lib/libR.so(Rf_eval+0x622) [0xb7ca9c12]
[itchy:08338] *** End of error message ***
Segmentation fault

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] build packages with vignettes in Windows

2008-04-29 Thread Michael

I've been trying to build a Windows binary of a package of mine without
success.  It seems that the files under inst/doc throw the script off.

I am using the command 'Rcmd INSTALL --build'.

-- Making package genepi 
  adding build stamp to DESCRIPTION
  installing NAMESPACE file and metadata
  installing R files
  installing inst files
FIND: Parameter format not correct
make[2]: *** [C:/Library/R/genepi/inst] Error 2
make[1]: *** [all] Error 2
make: *** [pkg-genepi] Error 2
*** Installation of genepi failed ***

I also tried a couple of packages downloaded from CRAN.  Those without an
inst/doc directory worked fine, and those that have one didn't.

I'm using a fresh install of R-2.7.0 and Rtools-2.7. 

Any clue as to what is wrong with my setup?

Thanks,

Michael

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] build packages with vignettes in Windows

2008-04-29 Thread Michael
On 29 Apr 2008, Duncan Murdoch wrote:

> On 29/04/2008 12:54 PM, Michael wrote:
> > I've been trying to build a Windows binary of a package of mine without
> > success.  It seems that the files under inst/doc throw the script off.
> >

> > I am using the command 'Rcmd INSTALL --build'.
> >
> > -- Making package genepi 
> >   adding build stamp to DESCRIPTION
> >   installing NAMESPACE file and metadata
> >   installing R files
> >   installing inst files
> > FIND: Parameter format not correct

> That looks as though you don't have the tools installed correctly, you have
> some other "find" earlier on your path.

Thanks. I didn't let the Rtools installer update the PATH variable for me
because it says a restart is required, which I wanted to avoid, since I can
update the environment variable without rebooting Windows.
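
One quick check, from within R, is to confirm which 'find' ends up first on the
PATH that R actually sees (the Rtools path shown in the comment is only
illustrative; yours may differ):

Sys.which("find")    ## should resolve to the Rtools copy, e.g. "C:/Rtools/bin/find.exe",
                     ## not the Windows system find.exe that emits the
                     ## "FIND: Parameter format not correct" error shown above
Sys.getenv("PATH")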

Michael

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] build packages with vignettes in Windows

2008-04-30 Thread Michael
On 29 Apr 2008, Duncan Murdoch wrote:
 
> Right, you don't need to set the system path for everything, but you do
> need to set it in CMD (or other shell) before running Rcmd.

For Win 2K/XP/Vista, the system path can be set (through the GUI, though I'm
not sure how to do it with scripts) without restarting; the change applies to
new CMD processes started afterwards.

Michael

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] variable scope in update(): bug or feature?

2006-12-21 Thread Michael

I stumbled upon this when using update() (specifically update.lm()).  Suppose
the original call to lm() is, say,

a <- lm (y ~ x + z, data = mydata)

where y and z are in the data frame mydata but x is in the global environment.

If I later run

a0 <- update (a, ~ . - z)

then a0$model will contain the current values of x from the global environment,
which may well be different, and even of a different length, from mydata$y.
Somehow, update() pads a0$model to have the same number of rows as the length
of x.

I would think it desirable to use the x stored in a$model rather than the
global one.

Is this a bug or a deliberate feature?

Thanks,

Michael

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] variable scope in update(): bug or feature?

2006-12-22 Thread Michael

On 22 Dec 2006, Martin Maechler wrote:

> - use a simple reproducible example -- 
> just for the convenience of your readers

Sending email directly to r-devel doesn't seem to work for me, so I'm
resending this via gmane.

Here is an example:

> rm (list = ls())
> x <- 1:10
> mdata <- data.frame (z = rnorm (10), y = x + 3)
> m1 <- lm (y ~ x + z, data = mdata)
> summary (m1)

Call:
lm(formula = y ~ x + z, data = mdata)

Residuals:
       Min         1Q     Median         3Q        Max
-4.950e-16 -8.107e-17  2.085e-17  9.043e-17  3.787e-16

Coefficients:
              Estimate Std. Error    t value Pr(>|t|)
(Intercept)  3.000e+00  1.923e-16  1.560e+16   <2e-16 ***
x            1.000e+00  2.881e-17  3.472e+16   <2e-16 ***
z           -8.717e-17  1.149e-16 -7.590e-01    0.473
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 2.6e-16 on 7 degrees of freedom
Multiple R-Squared: 1,  Adjusted R-squared: 1
F-statistic: 6.103e+32 on 2 and 7 DF,  p-value: < 2.2e-16

> x <- rep (1:2, each = 5)
> m2 <- update (m1, ~ . - z)
> summary (m2)

Call:
lm(formula = y ~ x, data = mdata)

Residuals:
       Min         1Q     Median         3Q        Max
-2.000e+00 -1.000e+00  2.086e-16  1.000e+00  2.000e+00

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)    1.000      1.581   0.632  0.54474
x              5.000      1.000   5.000  0.00105 **
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.581 on 8 degrees of freedom
Multiple R-Squared: 0.7576, Adjusted R-squared: 0.7273
F-statistic:    25 on 1 and 8 DF,  p-value: 0.001053

This is R 2.4.1 on Mac OS X 10.4.8.

> - use R-help.  This is really a question about R.

I think this could be a bug (at least it is not doing what I expected)
so I emailed R-devel.

Michael

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] variable scope in update(): bug or feature?

2006-12-22 Thread Michael
On 22 Dec 2006, Brian Ripley wrote:

> 'update' will update and (by default) re-fit a model. It does this
> by extracting the call stored in the object, updating the call and
> (by default) evaluating that call. Sometimes it is useful to call
> 'update' with only one argument, for example if the data frame has
> been corrected.

Thanks.  I understand now that this is the expected behavior per the
documentation.

It is just that when I call 'update (m1, ~ . - z)' I did not expect x to
change (or, more generally, everything can be different if mydata has changed).
I only wanted to re-fit the old model with the old data, just without the z
variable.

Would it be useful to have an option in update() not to update the data
(i.e., to behave more like drop1())?
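
One workaround that seems to achieve this today (assuming m1 was fitted with
the default model = TRUE, so that the fitted model frame is stored in the
object) is to pass that stored frame back in explicitly, e.g.

m2 <- update (m1, ~ . - z, data = model.frame (m1))

Since model.frame() on a fitted lm object returns the stored m1$model rather
than re-evaluating the original data argument, the dropped-z fit then uses the
old x values.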

Michael

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] density() fails if x contains Inf or -Inf (PR#8033)

2005-07-25 Thread michael . beer
Full_Name: Michael Beer
Version: 2.1.0 and 2.1.1
OS: linux-gnu and mingw32
Submission from: (NULL) (134.21.49.141)


The command "density(c(0.5,0.6,Inf,0.7))" fails saying "Wrong type for argument
2 in call to massdist". This problem can be resolved by replacing "nx = nx" by
"nx = as.integer(nx)" in the call to massdist. When there are infinite values,
the value of sum(x.finite) is attributed to nx before. Thus nx is changed to
class "numeric" instead of "integer".

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] strptime problem for 2004-10-03 02:00:00

2005-10-24 Thread Michael Sumner
Hello, I at first thought this was a system or locale issue, but since it
occurs on both Windows and Linux, and only for 2004 (AFAIK), I am reporting it.

I have a problem with as.POSIXct for the hour between
"2004-10-03 02:00:00 GMT" and "2004-10-03 02:59:59 GMT".  

In short, the 2 AM (GMT) hour in 2004 (but not in other years) is 
interpreted as 1 AM by strptime:
(I use ISOdatetime as a convenience).

## Windows XP

ISOdatetime(2004, 10, 3, 2, 0, 0,  tz = "GMT")
##[1] "2004-10-03 01:00:00 GMT"
ISOdatetime(2004, 10, 3, 1, 0, 0,  tz = "GMT")
##[1] "2004-10-03 01:00:00 GMT"

 ISOdatetime(2005, 10, 3, 2, 0, 0,  tz = "GMT")
##[1] "2005-10-03 02:00:00 GMT"
 ISOdatetime(2005, 10, 3, 1, 0, 0,  tz = "GMT")
## [1] "2005-10-03 01:00:00 GMT"


I've not explored other years, but it is not a problem for the same time in
the previous and following years.  I only found it because I have a continuous
sequence of date-times that covers that period; the problem is not created by
traversing that time with seq.POSIXt.

I usually use Windows XP; below I also give results on Linux (kernel release
"2.4.21-37.ELsmp").  On that machine the times are incorrect in the other
direction (in 2004, 2 AM is interpreted as 3 AM).

My (Windows) system is set to adjust automatically for daylight saving time,
and if I uncheck this and restart R the problem is "fixed".  I don't know how
I would do that on Linux, but it's a server anyway so I couldn't.

## R 2.2.0
## Windows XP, SP2
## System time is set to (GMT+10:00) Hobart - Tasmanian Summer Time (1 hour forward of GMT+10)
Sys.getlocale()
## [1] 
"LC_COLLATE=English_Australia.1252;LC_CTYPE=English_Australia.1252;LC_MONETARY=English_Australia.1252;LC_NUMERIC=C;LC_TIME=English_Australia.1252"

## Try for 2004

(t1 <- ISOdatetime(2004, 10, 3, 1, 0, 0, tz = "GMT"))
##[1] "2004-10-03 01:00:00 GMT"
(t2 <- ISOdatetime(2004, 10, 3, 2, 0, 0, tz = "GMT"))
##[1] "2004-10-03 01:00:00 GMT"

## no difference - why?  
 t2 - t1
## Time difference of 0 secs

## Try for 2005

(t1 <- ISOdatetime(2005, 10, 3, 1, 0, 0, tz = "GMT"))
##[1] "2005-10-03 01:00:00 GMT"

(t2 <- ISOdatetime(2005, 10, 3, 2, 0, 0, tz = "GMT") )
##[1] "2005-10-03 02:00:00 GMT"

## 1 hour difference - as expected  
t2 - t1
## Time difference of 1 hours



## LINUX


## R 2.2.0
## Linux (release "2.4.21-37.ELsmp")
## System time is set to (EST) - summer time (I don't know how to find out more about this)
Sys.getlocale()
## [1] 
"LC_CTYPE=en_AU.UTF-8;LC_NUMERIC=C;LC_TIME=en_AU.UTF-8;LC_COLLATE=en_AU.UTF-8;LC_MONETARY=en_AU.UTF-8;LC_MESSAGES=en_AU.UTF-8;LC_PAPER=C;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=C;LC_IDENTIFICATION=C"


## Try for 2004

(t1 <- ISOdatetime(2004, 10, 3, 1, 0, 0, tz = "GMT"))
## [1] "2004-10-03 01:00:00 GMT"

 (t2 <- ISOdatetime(2004, 10, 3, 2, 0, 0, tz = "GMT"))
##[1] "2004-10-03 03:00:00 GMT"

## difference of 2 hours - why?
 t2 - t1
## Time difference of 2 hours


## Try for 2005

 (t1 <- ISOdatetime(2005, 10, 3, 1, 0, 0, tz = "GMT"))
##[1] "2005-10-03 01:00:00 GMT"

(t2 <- ISOdatetime(2005, 10, 3, 2, 0, 0, tz = "GMT") )
##[1] "2005-10-03 02:00:00 GMT"


## one hour difference as expected
 t2 - t1
## Time difference of 1 hours

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] strptime problem for 2004-10-03 02:00:00

2005-10-26 Thread Michael Sumner
Great.  Thanks for confirmation.

Cheers, Mike.
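
(For anyone hitting this before the fix propagates, here is a sketch of the
workaround Brian describes below for Linux, written with current R's
Sys.setenv(); whether setting TZ inside an already-running session is enough
may depend on the R version, so setting the environment variable before
starting R is the safer form:)

Sys.setenv(TZ = "GMT")   ## or start R with TZ=GMT set in the environment
ISOdatetime(2004, 10, 3, 2, 0, 0, tz = "GMT")
## expected once the DST lookup is bypassed: "2004-10-03 02:00:00 GMT"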


----

Message: 16
Date: Tue, 25 Oct 2005 21:33:54 +0100 (BST)
From: Prof Brian Ripley <[EMAIL PROTECTED]>
Subject: Re: [Rd] strptime problem for 2004-10-03 02:00:00
To: Michael Sumner <[EMAIL PROTECTED]>
Cc: r-devel@stat.math.ethz.ch
Message-ID: <[EMAIL PROTECTED]>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

It's a bug, a rather rare one.  2004-10-03 02:00:00 does not exist in your 
time zone, and in trying to find out if the time is on DST or not it has 
failed to find out.  It needs to be told that GMT is never on DST so not 
to bother.

On Linux, running R with TZ="GMT" set should fix this.  Windows is harder 
to control since it does not distinguish the UK timezone from GMT.

There is a bugfix now in R-devel.  It may migrate to R-patched in due 
course.

On Tue, 25 Oct 2005, Michael Sumner wrote:


>> Hello, I at first thought this was a system or locale issue, but since
>> it occurs on
>> both Windows and Linux and only for 2004 (AFAIK) I report it.
>>
>> I have a problem with as.POSIXct for the hour between
>> "2004-10-03 02:00:00 GMT" and "2004-10-03 02:59:59 GMT".
>>
>> In short, the 2 AM (GMT) hour in 2004 (but not in other years) is
>> interpreted as 1 AM by strptime:
>> (I use ISOdatetime as a convenience).
>>
>> ## Windows XP
>>
>> ISOdatetime(2004, 10, 3, 2, 0, 0,  tz = "GMT")
>> ##[1] "2004-10-03 01:00:00 GMT"
>> ISOdatetime(2004, 10, 3, 1, 0, 0,  tz = "GMT")
>> ##[1] "2004-10-03 01:00:00 GMT"
>>
>> ISOdatetime(2005, 10, 3, 2, 0, 0,  tz = "GMT")
>> ##[1] "2005-10-03 02:00:00 GMT"
>> ISOdatetime(2005, 10, 3, 1, 0, 0,  tz = "GMT")
>> ## [1] "2005-10-03 01:00:00 GMT"
>>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R (>= 3.4.0): integer-to-double coercion in comparisons no longer done (a good thing)

2018-01-27 Thread Michael Lawrence
Thanks for highlighting this. I just made the change one day. Guess I
should have mentioned it in the NEWS.

Michael

On Sat, Jan 27, 2018 at 3:01 PM, Henrik Bengtsson <
henrik.bengts...@gmail.com> wrote:

> Hi,
>
> there was a memory improvement done in R going from R 3.3.3 to R 3.4.0
> when it comes to comparing an integer 'x' and a double 'y' (either may be
> scalar or vector).
>
> For example, in R 3.3.3, I get:
>
> > getRversion()
> [1] '3.3.3'
> > x <- integer(1000)
> > y <- double(1000)
> > profmem::profmem(z <- (x < y))
> Rprofmem memory profiling of:
> z <- (x < y)
>
> Memory allocations:
>   bytes  calls
> 1  8040 
> 2  4040 
> total 12080
> >
>
> and in R 3.4.0, I get:
>
> > getRversion()
> [1] '3.4.0'
> > x <- integer(1000)
> > y <- double(1000)
> > profmem::profmem(z <- (x < y))
> Rprofmem memory profiling of:
> z <- (x < y)
>
> Memory allocations:
>   bytes  calls
> 1  4040 
> total  4040
>
> Note how in R (<= 3.3.3), the (x < y) comparison will cause an
> internal coercion of integer vector 'x' into a double, which then can
> be compared to double 'y'.  In R (>= 3.4.0), it appears that this
> coercion is done per element and will therefore avoid introducing a
> new, temporary copy internally.  The same is observed when
> comparing with (x == y).
>
> This is a great improvement that I think deserves more credit.  I
> couldn't find any mentioning of it in the R 3.4.0 NEWS
> (https://cran.r-project.org/doc/manuals/r-release/NEWS.html).  Does
> anyone know whether this was a specific improvement for such
> comparison, or a side effect of something else, e.g. an improved byte
> compiler?
>
> Thanks,
>
> Henrik
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] bug [methods]: double execution of `insertSource` within the same session does not work

2018-01-29 Thread Michael Lawrence
Thanks, I will fix this.

On Mon, Jan 29, 2018 at 8:06 AM, Demetrio Rodriguez T. <
demetrio.rodrigue...@gmail.com> wrote:

> Hello everyone,
>
>
> I hope this reaches someone at all. It's my first bug report to the R-core,
> and, apparently, bugzilla is locked from new reports for now.
>
> I was using `methods::insertSource` to debug and successfully fix another
> package, until it suddenly stopped working. I figured out, that it is
> because I am using it on the same function multiple times within one
> session. It also produces warnings even during the first call, but somehow
> still works. Below I provide a reproducible example:
>
> SETUP:
> ```bash
> demetrio@laptop:[folder_structure]/Bugreports/methods_insertSource$ ls -a
> .  ..  gmapsdistance_fix.R  methods_insertSource.R
> ```
>
> file `gmapsdistance_fix.R`
> ```R
> gmapsdistance = function(param) {
> print('I am a bug report, my params are:')
> print(param)
> }
> ```
>
>
> file `methods_insertSource.R`
> ```R
> library(gmapsdistance)  # works with any package
>
> methods::insertSource('gmapsdistance_fix.R',
> package = 'gmapsdistance',
> functions = 'gmapsdistance',
> force = T
> )
> buggy = gmapsdistance('Works?')
> ```
>
>
> TO REPRODUCE:
> in that directory `R --vanilla` then
> ```R
> > source('methods_insertSource.R')
> Modified functions inserted through trace(): gmapsdistance
> [1] "I am a bug report, my params are:"
> [1] "Works?"
> Warning message:
> In methods::insertSource("gmapsdistance_fix.R", package = "gmapsdistance",
> :
>   cannot insert these (not found in source): "gmapsdistance"
> # Works, but gives the warning that it does not
>
> # repeat:
> > source('methods_insertSource.R')
> Error in assign(this, thisObj, envir = envwhere) :
>   cannot change value of locked binding for 'gmapsdistance'
> In addition: Warning message:
> In methods::insertSource("gmapsdistance_fix.R", package = "gmapsdistance",
> :
>   cannot insert these (not found in source): "gmapsdistance"
>
> # does not work, and gets even more confusing: so is it that the object is
> not found, or is it about a locked object?
> ```
>
> I think it's a bug.
>
> - BUG REPORT END 
>
>
> I looked into it a bit myself, in case you are interested:
>
> ```R
> # lines 20-22
> if (is(source, "environment"))
> env <- source
> else env <- evalSource(source, package, FALSE)
> # We're in the second case I guess
>
> # Browse[2]> env
> # Object of class "sourceEnvironment"
> # Source environment created  2017-12-01 05:19:51
> # For package "gmapsdistance"
> # From source file "gmapsdistancefix.R"
>
>
> # later, before line 52:
> x = env
> Browse[2]> package
> [1] "gmapsdistance"
>
> # evaluate 52
> packageSlot(env) <- package
>
> # objects x and env are still identical
> # Browse[2]> class(env)
> # [1] "sourceEnvironment"
> # attr(,"package")
> # [1] "methods"
> # Browse[2]> class(x)
> # [1] "sourceEnvironment"
> # attr(,"package")
> # [1] "methods"
>
> # Browse[2]> env
> # Object of class "sourceEnvironment"
> # Source environment created  2017-12-01 05:19:51
> # For package "gmapsdistance"
> # From source file "gmapsdistancefix.R"
> # Browse[2]> x
> # Object of class "sourceEnvironment"
> # Source environment created  2017-12-01 05:19:51
> # For package "gmapsdistance"
> # From source file "gmapsdistancefix.R"
>
> # so:
> Browse[2]>  names(env)
> NULL
>
> # which is why 53-60 do not work:
> allObjects <- names(env)
> if (!missing(functions)) {
> notThere <- is.na(match(functions, allObjects))
> if (any(notThere)) {
> warning(gettextf("cannot insert these (not found in source): %s",
> paste("\"", functions[notThere], "\"", sep = "",
> collapse = ", ")), domain = NA)
> }
> }
> ```
>
> Looking forward to your feedback!
>
> Cheers,
> Demetrio
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] as.list method for by Objects

2018-01-30 Thread Michael Lawrence
I agree that it would make sense for the object to have c("by", "list") as
its class attribute, since the object is known to behave as a list.
However, it may be too disruptive to make this change at this point.
Hard to predict.
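
A small illustration of what the suggestion would mean in practice (the class
is set by hand here purely to show the effect; this is not what by() currently
does):

b <- by(warpbreaks[, 1:2], warpbreaks[, "tension"], summary)
class(b)                      ## "by"
class(b) <- c("by", "list")   ## the suggested class attribute
inherits(b, "list")           ## TRUE: class-based checks and S3 dispatch now also see "list"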

Michael

On Mon, Jan 29, 2018 at 5:00 PM, Dario Strbenac 
wrote:

> Good day,
>
> I'd like to suggest the addition of an as.list method for a by object that
> actually returns a list of class "list". This would make it safer to do
> type-checking, because is.list also returns TRUE for a data.frame variable
> and using class(result) == "list" is an alternative that only returns TRUE
> for lists. It's also confusing initially that
>
> > class(x)
> [1] "by"
> > is.list(x)
> [1] TRUE
>
> since there's no explicit class definition for "by" and no mention if it
> has any superclasses.
>
> --
> Dario Strbenac
> University of Sydney
> Camperdown NSW 2050
> Australia
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] as.list method for by Objects

2018-01-30 Thread Michael Lawrence
by() does not always return a list. In Gabe's example, it returns an
integer, thus it is coerced to a list. as.list() means that it should be a
VECSXP, not necessarily with "list" in the class attribute.

Michael

On Tue, Jan 30, 2018 at 2:41 PM, Hervé Pagès  wrote:

> Hi Gabe,
>
> Interestingly the behavior of as.list() on by objects seems to
> depend on the object itself:
>
> > b1 <- by(1:2, 1:2, identity)
> > class(as.list(b1))
> [1] "list"
>
> > b2 <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
> > class(as.list(b2))
> [1] "by"
>
> This is with R 3.4.3 and R devel (2017-12-11 r73889).
>
> H.
>
> On 01/30/2018 02:33 PM, Gabriel Becker wrote:
>
>> Dario,
>>
>> What version of R are you using?  In my mildly old 3.4.0 installation and
>> in the version of R-devel I have lying around (also mildly old...)  I don't
>> see the behavior I think you are describing
>>
>> > b = by(1:2, 1:2, identity)
>>
>> > class(as.list(b))
>>
>> [1] "list"
>>
>> > sessionInfo()
>>
>> R Under development (unstable) (2017-12-19 r73926)
>>
>> Platform: x86_64-apple-darwin15.6.0 (64-bit)
>>
>> Running under: OS X El Capitan 10.11.6
>>
>>
>> Matrix products: default
>>
>> BLAS:
>> /Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resour
>> ces/lib/libRblas.dylib
>>
>> LAPACK:
>> /Users/beckerg4/local/Rdevel/R.framework/Versions/3.5/Resour
>> ces/lib/libRlapack.dylib
>>
>>
>> locale:
>>
>> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>>
>> attached base packages:
>>
>> [1] stats graphics  grDevices utils datasets  methods   base
>>
>>
>> loaded via a namespace (and not attached):
>>
>> [1] compiler_3.5.0
>>
>> >
>>
>>
>> As for by not having a class definition, no S3 class has an explicit
>> definition, so this is somewhat par for the course here...
>>
>> did I misunderstand something?
>>
>>
>> ~G
>>
>> On Tue, Jan 30, 2018 at 2:24 PM, Hervé Pagès > <mailto:hpa...@fredhutch.org>> wrote:
>>
>> I agree that it makes sense to expect as.list() to perform
>> a "strict coercion" i.e. to return an object of class "list",
>> *even* on a list derivative. That's what as( , "list") does
>> by default:
>>
>># on a data.frame object
>>as(data.frame(), "list")  # object of class "list"
>>  # (but strangely it drops the names)
>>
>># on a by object
>>x <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
>>as(x, "list")  # object of class "list"
>>
>> More generally speaking as() is expected to perform "strict
>> coercion" by default, unless called with 'strict=FALSE'.
>>
>> That's also what as.list() does on a data.frame:
>>
>>as.list(data.frame())  # object of class "list"
>>
>> FWIW as.numeric() also performs "strict coercion" on an integer
>> vector:
>>
>>as.numeric(1:3)  # object of class "numeric"
>>
>> So an as.list.env method that does the same as as(x, "list")
>> would bring a small touch of consistency in an otherwise
>> quite inconsistent world of coercion methods(*).
>>
>> H.
>>
>> (*) as(data.frame(), "list", strict=FALSE) doesn't do what you'd
>>  expect (just one of many examples)
>>
>>
>> On 01/29/2018 05:00 PM, Dario Strbenac wrote:
>>
>> Good day,
>>
>> I'd like to suggest the addition of an as.list method for a by
>> object that actually returns a list of class "list". This would
>> make it safer to do type-checking, because is.list also returns
>> TRUE for a data.frame variable and using class(result) == "list"
>> is an alternative that only returns TRUE for lists. It's also
>> confusing initially that
>>
>> class(x)
>>
>> [1] "by"
>>
>> is.list(x)
>>
>> [1] TRUE
>>
>> since there's no explicit class defini

Re: [Rd] as.list method for by Objects

2018-01-30 Thread Michael Lawrence
I just meant that the minimal contract for as.list() appears to be that it
returns a VECSXP. To the user, we might say that is.list() will always
return TRUE. I'm not sure we can expect consistency across methods beyond
that, nor is it feasible at this point to match the semantics of the
methods package. It deals in "class space" while as.list() deals in
"typeof() space".

Michael

On Tue, Jan 30, 2018 at 3:47 PM, Hervé Pagès  wrote:

> On 01/30/2018 02:50 PM, Michael Lawrence wrote:
>
>> by() does not always return a list. In Gabe's example, it returns an
>> integer, thus it is coerced to a list. as.list() means that it should be a
>> VECSXP, not necessarily with "list" in the class attribute.
>>
>
> The documentation is not particularly clear about what as.list()
> means for list derivatives. IMO clarifications should stick to
> simple concepts and formulations like "is.list(x) is TRUE" or
> "x is a list or a list derivative" rather than "x is a VECSXP".
> Coercion is useful beyond the use case of implementing a .C entry
> point and calling as.numeric/as.list/etc... on its arguments.
>
> This is why I was hoping that we could maybe discuss the possibility
> of making the as.list() contract less vague than just "as.list()
> must return a list or a list derivative".
>
> Again, I think that 2 things weigh quite a lot in that discussion:
>   1) as.list() returns an object of class "list" on a
>  data.frame (strict coercion). If all that as.list() needed to
>  do was to return a VECSXP, then as.list.default() already does
>  this on a data.frame so why did someone bother adding an
>  as.list.data.frame method that does strict coercion?
>   2) The S4 coercion system based on as() does strict coercion by
>  default.
>
> H.
>
>
>> Michael
>>
>>
>> On Tue, Jan 30, 2018 at 2:41 PM, Hervé Pagès > <mailto:hpa...@fredhutch.org>> wrote:
>>
>> Hi Gabe,
>>
>> Interestingly the behavior of as.list() on by objects seems to
>> depend on the object itself:
>>
>>  > b1 <- by(1:2, 1:2, identity)
>>  > class(as.list(b1))
>> [1] "list"
>>
>>  > b2 <- by(warpbreaks[, 1:2], warpbreaks[,"tension"], summary)
>>  > class(as.list(b2))
>> [1] "by"
>>
>> This is with R 3.4.3 and R devel (2017-12-11 r73889).
>>
>> H.
>>
>> On 01/30/2018 02:33 PM, Gabriel Becker wrote:
>>
>> Dario,
>>
>> What version of R are you using?  In my mildly old 3.4.0
>> installation and in the version of R-devel I have lying around
>> (also mildly old...)  I don't see the behavior I think you are
>> describing
>>
>>  > b = by(1:2, 1:2, identity)
>>
>>  > class(as.list(b))
>>
>>  [1] "list"
>>
>>  > sessionInfo()
>>
>>  R Under development (unstable) (2017-12-19 r73926)
>>
>>  Platform: x86_64-apple-darwin15.6.0 (64-bit)
>>
>>  Running under: OS X El Capitan 10.11.6
>>
>>
>>  Matrix products: default
>>
>>  BLAS:
>> /Users/beckerg4/local/Rdevel/R
>> .framework/Versions/3.5/Resources/lib/libRblas.dylib
>>
>>  LAPACK:
>> /Users/beckerg4/local/Rdevel/R
>> .framework/Versions/3.5/Resources/lib/libRlapack.dylib
>>
>>
>>  locale:
>>
>>  [1]
>> en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>>
>>
>>  attached base packages:
>>
>>  [1] stats graphics  grDevices utils datasets
>>  methods   base
>>
>>
>>  loaded via a namespace (and not attached):
>>
>>  [1] compiler_3.5.0
>>
>>  >
>>
>>
>> As for by not having a class definition, no S3 class has an
>> explicit definition, so this is somewhat par for the course
>> here...
>>
>> did I misunderstand something?
>>
>>
>> ~G
>>
>> On Tue, Jan 30, 2018 at 2:24 PM, Hervé Pagès
>> mailto:hpa...@fredhutch.org>
>> <mailto:hpa...@fredhutch.org <mailto:hpa...@fredhutch.org>>>
>> wrote:
>>
>>  I agree that it makes sense to expect as.list() to perform
>&

Re: [Rd] Best practices in developing package: From a single file

2018-01-31 Thread Michael Lawrence
I pretty much agree. I tried using roxygen when it was first released but
couldn't stand putting documentation in comments, especially for complex,
S4-based software. Rd is easy to read and write and lets me focus on the
task of writing documentation (focus is the hardest part of any task for
me). Probably the best feature of roxygen is that it automatically
generates \usage{}, which is otherwise completely redundant with the code.

I think the preceding systems like doxygen, javadoc, gtk-doc, qtdoc, etc.,
found a nice compromise through templating, where the bulk of the details
are written into the template, and just the essentials (usage, arguments,
return value) were embedded in the source file. I think this is even more
important for R, since we're often describing complex algorithms, while
most C/C++/Java software is oriented around complex classes containing many
relatively simple methods.

Michael


On Tue, Jan 30, 2018 at 11:53 AM, Duncan Murdoch 
wrote:

> On 30/01/2018 11:29 AM, Brian G. Peterson wrote:
>
>> On Tue, 2018-01-30 at 17:00 +0100, Suzen, Mehmet wrote:
>>
>>> Dear R developers,
>>>
>>> I am wondering what are the best practices for developing an R
>>> package. I am aware of Hadley Wickham's best practice
>>> documentation/book (http://r-pkgs.had.co.nz/).  I recall a couple of
>>> years ago there were some tools for generating a package out of a
>>> single file, such as using package.skeleton, but no auto-generated
>>> documentation. Do you know a way to generate documentation and a
>>> package out of single R source file, or from an environment?
>>>
>>
>> Mehmet,
>>
>> This list is for development of the R language itself and closely
>> related tools.  There is a separate list, R-pkg-devel, for development
>> of packages.
>>
>> Since you're here, I'll try to answer your question.
>>
>> package.skeleton can create a package from all the R functions in a
>> specified environment.  So if you load all the functions that you want
>> in your new package into your R environment, then call
>> package.skeleton, you'll have a starting point.
>>
>> At that point, I would probably recommend moving to RStudio, and using
>> RStudio to generate markdown comments for roxygen for all your newly
>> created function files.  Then you could finish off the documentation by
>> writing it in these roxygen skeletons or copying and pasting from
>> comments in your original code files.
>>
>
> I'd agree about moving to RStudio, but I think Roxygen is the wrong
> approach for documentation.  package.skeleton() will have done the boring
> mechanical part of setting up your .Rd files; all you have to do is edit
> some content into them.  (Use prompt() to add a new file if you add a new
> function later, don't run package.skeleton() again.)
>
> This isn't the fashionable point of view, but I think it is easier to get
> good documentation that way than using Roxygen.  (It's easier to get bad
> documentation using Roxygen, but who wants that?)
>
> The reason I think this is that good documentation requires work and
> thought.  You need to think about the markup that will get your point
> across, you need to think about putting together good examples, etc.
> This is *harder* in Roxygen than if you are writing Rd files, because
> Roxygen is a thin front end to produce Rd files from comments in your .R
> files.  To get good stuff in the help page, you need just as much work as
> in writing the .Rd file directly, but then you need to add another layer on
> top to put it in a comment.  Most people don't bother.
>
> I don't know any packages with what I'd consider to be good documentation
> that use Roxygen.  It's just too easy to write minimal documentation that
> passes checks, so Roxygen users don't keep refining it.
>
> (There are plenty of examples of packages that write bad documentation
> directly to .Rd as well.  I just don't know of examples of packages with
> good documentation that use Roxygen.)
>
> Based on my criticism last week of git and Github, I expect to be called a
> grumpy old man for holding this point of view.  I'd actually like to be
> proven wrong.  So to anyone who disagrees with me:  rather than just
> calling me names, how about some examples of Roxygen-using packages that
> have good help pages with good explanations, and good examples in them?
>
> Back to Mehmet's question:  I think Hadley's book is pretty good, and I'd
> recommend most of it, just not the Roxygen part.
>
> Duncan Murdoch
>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] bug [methods]: double execution of `insertSource` within the same session does not work

2018-01-31 Thread Michael Lawrence
The issue should be resolved in R-devel. It was actually deeper and more
important than this obscure insertSource() function. names() was not doing
the right thing on S4 objects derived from "environment".
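
A minimal sketch of that underlying issue (the class name "MyEnv" is invented
here for illustration; "sourceEnvironment" in the report above is one such
class):

setClass("MyEnv", contains = "environment")
e <- new("MyEnv")
env <- as(e, "environment")   ## the underlying environment (reference semantics)
env$a <- 1
env$b <- 2
ls(env)       ## "a" "b"
names(e)      ## returned NULL before the fix; should list the bindings, like ls()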

On Mon, Jan 29, 2018 at 11:02 AM, Michael Lawrence 
wrote:

> Thanks, I will fix this.
>
> On Mon, Jan 29, 2018 at 8:06 AM, Demetrio Rodriguez T. <
> demetrio.rodrigue...@gmail.com> wrote:
>
>> Hello everyone,
>>
>>
>> I hope this reaches someone at all. It's my first bug report to the
>> R-core,
>> and, apparently, bugzilla is locked from new reports for now.
>>
>> I was using `methods::insertSource` to debug and successfully fix another
>> package, until it suddenly stopped working. I figured out, that it is
>> because I am using it on the same function multiple times within one
>> session. It also produces warnings even during the first call, but somehow
>> still works. Below I provide a reproducible example:
>>
>> SETUP:
>> ```bash
>> demetrio@laptop:[folder_structure]/Bugreports/methods_insertSource$ ls -a
>> .  ..  gmapsdistance_fix.R  methods_insertSource.R
>> ```
>>
>> file `gmapsdistance_fix.R`
>> ```R
>> gmapsdistance = function(param) {
>> print('I am a bug report, my params are:')
>> print(param)
>> }
>> ```
>>
>>
>> file `methods_insertSource.R`
>> ```R
>> library(gmapsdistance)  # works with any package
>>
>> methods::insertSource('gmapsdistance_fix.R',
>> package = 'gmapsdistance',
>> functions = 'gmapsdistance',
>> force = T
>> )
>> buggy = gmapsdistance('Works?')
>> ```
>>
>>
>> TO REPRODUCE:
>> in that directory `R --vanilla` then
>> ```R
>> > source('methods_insertSource.R')
>> Modified functions inserted through trace(): gmapsdistance
>> [1] "I am a bug report, my params are:"
>> [1] "Works?"
>> Warning message:
>> In methods::insertSource("gmapsdistance_fix.R", package =
>> "gmapsdistance",
>> :
>>   cannot insert these (not found in source): "gmapsdistance"
>> # Works, but gives the warning that it does not
>>
>> # repeat:
>> > source('methods_insertSource.R')
>> Error in assign(this, thisObj, envir = envwhere) :
>>   cannot change value of locked binding for 'gmapsdistance'
>> In addition: Warning message:
>> In methods::insertSource("gmapsdistance_fix.R", package =
>> "gmapsdistance",
>> :
>>   cannot insert these (not found in source): "gmapsdistance"
>>
>> # does not work, and gets even more confusing: so is it that the object is
>> not found, or is it about a locked object?
>> ```
>>
>> I think it's a bug.
>>
>> - BUG REPORT END 
>>
>>
>> I looked into it a bit myself, in case you are interested:
>>
>> ```R
>> # lines 20-22
>> if (is(source, "environment"))
>> env <- source
>> else env <- evalSource(source, package, FALSE)
>> # We're in the second case I guess
>>
>> # Browse[2]> env
>> # Object of class "sourceEnvironment"
>> # Source environment created  2017-12-01 05:19:51
>> # For package "gmapsdistance"
>> # From source file "gmapsdistancefix.R"
>>
>>
>> # later, before line 52:
>> x = env
>> Browse[2]> package
>> [1] "gmapsdistance"
>>
>> # evaluate 52
>> packageSlot(env) <- package
>>
>> # objects x and env are still identical
>> # Browse[2]> class(env)
>> # [1] "sourceEnvironment"
>> # attr(,"package")
>> # [1] "methods"
>> # Browse[2]> class(x)
>> # [1] "sourceEnvironment"
>> # attr(,"package")
>> # [1] "methods"
>>
>> # Browse[2]> env
>> # Object of class "sourceEnvironment"
>> # Source environment created  2017-12-01 05:19:51
>> # For package "gmapsdistance"
>> # From source file "gmapsdistancefix.R"
>> # Browse[2]> x
>> # Object of class "sourceEnvironment"
>> # Source environment created  2017-12-01 05:19:51
>> # For package "gmapsdistance"
>> # From source file "gmapsdistancefix.R"
>>
>> # so:
>> Browse[2]>  names(env)
>> NULL
>>
>> # which is why 53-60 do not work:
>> allObjects <- names(env)
>> if (!missing(functions)) {
>> notThere <- is.na(match(functions, allObjects))
>> if (any(notThere)) {
>> warning(gettextf("cannot insert these (not found in source): %s",
>> paste("\"", functions[notThere], "\"", sep = "",
>> collapse = ", ")), domain = NA)
>> }
>> }
>> ```
>>
>> Looking forward to your feedback!
>>
>> Cheers,
>> Demetrio
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] as.list method for by Objects

2018-02-01 Thread Michael Lawrence
On Thu, Feb 1, 2018 at 1:21 AM, Martin Maechler 
wrote:

> >>>>> Michael Lawrence 
> >>>>> on Tue, 30 Jan 2018 10:37:38 -0800 writes:
>
> > I agree that it would make sense for the object to have c("by",
> "list") as
> > its class attribute, since the object is known to behave as a list.
>
> Well, but that (list behavior) applies to most non-simple S3
> classed objects, say "data.frame", say "lm" to start with real basic ones.
>
> The later part of the discussion, seems more relevant to me.
> Adding "list" to the class attribute seems as wrong to me as
> e.g. adding "double" to "Date" or "POSIXct" (and many more such cases).
>
>
There's a distinction though. Date and POSIXct should not really behave as
double values (an implementation detail), but "by" is expected to behave as
a list (when it is one).

> For the present case, we should stay with focusing on is.list()
> being true after as.list() .. the same we would do with
> as.numeric() and is.numeric().
>
> Martin
>
> > However, it may be too disruptive to make this change at this
> point.
> > Hard to predict.
>
> > Michael
>
> > On Mon, Jan 29, 2018 at 5:00 PM, Dario Strbenac <
> dstr7...@uni.sydney.edu.au>
> > wrote:
>
> >> Good day,
> >>
> >> I'd like to suggest the addition of an as.list method for a by
> object that
> >> actually returns a list of class "list". This would make it safer
> to do
> >> type-checking, because is.list also returns TRUE for a data.frame
> variable
> >> and using class(result) == "list" is an alternative that only
> returns TRUE
> >> for lists. It's also confusing initially that
> >>
> >> > class(x)
> >> [1] "by"
> >> > is.list(x)
> >> [1] TRUE
> >>
> >> since there's no explicit class definition for "by" and no mention
> if it
> >> has any superclasses.
> >>
> >> --
> >> Dario Strbenac
> >> University of Sydney
> >> Camperdown NSW 2050
> >> Australia
> >>
> >> __
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >>
> >>
>
> > [[alternative HTML version deleted]]
>
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Fwd: Re: Best practices in developing package:

2018-02-01 Thread Michael Lawrence
Folding is a simple solution, but there are intrinsic problems, like the
need to embed the documentation in comments. If the user already has to
expand a fold to edit the docs, the IDE could instead just provide a link
or shortcut that jumps to a separate documentation file, written in
whatever language, Rd, markdown, docbook. For example, I could imagine
RStudio showing the rendered documentation in a side pane when the cursor
is on the function name/signature, and the user could somehow switch modes
to edit it. But there would be no need to mix two different languages in
the same file, and thus no ugly escaping, and no documentation obscuring
the code, or vice versa.

On Thu, Feb 1, 2018 at 7:20 AM, Lionel Henry  wrote:

>
> > On 1 févr. 2018, at 06:51, Therneau, Terry M., Ph.D. 
> wrote:
> >
> > A second is that I care a lot about documentation so my help files are
> > fairly long, so much so that the advantage of having the documentation
> of an argument
> > "close" to the declaration of said argument is lost.
>
> Good point. It suggests editors need folding support for roxygen sections.
>
> Lionel
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Best practices in developing package: From a single file

2018-02-02 Thread Michael Lawrence
On Thu, Feb 1, 2018 at 9:20 AM, Gabriel Becker  wrote:

> On Thu, Feb 1, 2018 at 5:24 AM, Lionel Henry  wrote:
>
> > On 31 janv. 2018, at 09:08, Gabriel Becker  wrote:
> >
> > > it *actively discourages* the bits it doesn't directly support.
> >
> > It may be discouraging to include Rd syntax in roxygen docs but only
> > because the LaTeX-like syntax of Rd is burdensome, not because of
> > roxygen. It is still handy to have inlined Rd as a backup and we do
> > use it for the cases where we need finer grained control.
> >
>
> I only somewhat agree with this. Part of it is the Rd specifically, I
> agree, but part of it is just the fact that it is a different syntax at
> all. People who write roxygen documentation tend to think about and write
> it in roxygen, I think. Any switch out to another syntax, thus introducing
> two syntaxes side-by-side, is discouraged by the very fact that they are
> thinking in roxygen comments.
>
> Again, this is a "discouragement", not a disallowing. I know that people
> who care deeply about writing absolutely top notch documentation, and who
> also use roxygen will do the switch when called for, but the path of least
> resistance, i.e. the pattern of behavior that is *encouraged* by roxygen2
> is to not do that, and simply write documentation using only the supported
> roxygen2 tags. I'm not saying this makes the system bad, per se. As I
> pointed out, I use it in many of my packages (and it was my choice to do
> so, not because I inherited code from someone who already did), but
> pretending it doesn't encourage certain types of behavior doesn't seem like
> the right way to go either.
>
>
> >
> > I agree with your sentiment that roxygen encourages writing of
> > documentation for time-constrained users.
> >
>
> I do think it does that, but that was really only half of what I said, I
> said it encourages time constrained users to write middling (i.e. not
> great) documentation. Another person pointed out that structurally it
> really encourages terseness in the explanations of parameters, which I
> think is very true and have heard independently from others when talking
> about it as well. This is again not a requirement, but it is a real thing.
>
>
> >
> > I'll add that the major problem of documentation is not fancy
> > formatting but the content getting out of sync with the codebase.
> > Having documentation sitting next to the code is the preferred
> > antidote to doc rot, e.g. docstrings in lisp languages, Julia and
> > Python, the Linux kernel-doc system, doxygen, javadoc, ...
> >
>
> I mean, it is *an *antidote to doc rot. And sure, one that is used
> elsewhere. You could easily imagine one that didn't require it though.
> Perhaps doc files associated with objects (including closures) could embed
> a hash of the object they document, then you could see which things have
> changed since the documentation was updated and look at which documentation
> is still ok and which needs updating. That's just off the top of my head,
> I'm sure you could make the detection much more sophisticated.
>
> Or perhaps you could imagine two help systems, akin to --help and man for
> command line tools, one of which is minimalist showing usage, etc,
> generated by roxygen comments, and one of which is much more extensive, and
> not tied to (what could be extremely large) comments in the same .R file as
> the code.
>
>
This is basically what I meant by the template-based approach. Have the
details in the template, and the vitals in the doc comment block. Combine
the two and view the docs in different ways, dynamically.


> Best,
> ~G
>
>
> > It is true that R CMD check extensive checks help a lot as well in
> > this regard though only for things that can be checked automatically.
> >
> > Best,
> > Lionel
> >
> >
>
>
> --
> Gabriel Becker, PhD
> Scientist (Bioinformatics)
> Genentech Research
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] scale.default gives an incorrect error message when is.numeric() fails on a sparse row matrix (dgeMatrix)

2018-02-27 Thread Michael Chirico
I am attempting to use the lars package with a sparse input feature matrix,
but the following fails:

library(Matrix)
library(lars)
data(diabetes)
attach(diabetes)
x = as(as.matrix(as.data.frame(x)), 'dgCMatrix')
lars(x, y, intercept = FALSE)

Error in scale.default(x, FALSE, normx) :
  length of 'scale' must equal the number of columns of 'x'
>
More specifically, scale.default fails:

normx = new(
  "dgeMatrix",
  x = c(1.04, 1, 1.09, 1.01, 1.01, 0.992, 1.04, 0.975, 1.06, 1.06),
  Dim = c(1L, 10L),
  Dimnames = list(NULL, c("x.age", "x.sex", "x.bmi", "x.map", "x.tc",
                          "x.ldl", "x.hdl", "x.tch", "x.ltg", "x.glu")),
  factors = list()
)

scale(x, FALSE, normx)

The problem is that this check fails because is.numeric(normx) is FALSE:

if (is.numeric(scale) && length(scale) == nc)

So, the error message is misleading. In fact length(scale) is the same as
nc.

At a minimum, the error message needs to be repaired; do we also want to
attempt as.numeric(normx) (which I believe would have allowed scale to work
in this case)?
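
For what it's worth, a sketch of that workaround at the call site (coercing the
dgeMatrix to a plain numeric vector before it reaches scale.default(); Martin's
reply further down confirms this form works):

scale(x, center = FALSE, scale = as.numeric(normx))
## as.numeric() drops the dgeMatrix class, so the is.numeric(scale) check passes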

(I'm aware that there are some import issues in lars, as the offending line
to create normx *should* work, since is.numeric(sqrt(drop(rep(1, nrow(x)) %*%
(x^2)))) is TRUE -- it's simply that lars doesn't import the appropriate S4
methods)

Michael Chirico

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] scale.default gives an incorrect error message when is.numeric() fails on a dgeMatrix

2018-03-01 Thread Michael Chirico
Thanks. I know the setup code is a mess; I just duct-taped something together
from the examples in lars (which are a mess in turn). In fact, when I messaged
Prof. Hastie he recommended using glmnet. I wonder why lars is kept on CRAN if
they have no intention of maintaining it... but I digress...

On Mar 2, 2018 1:52 AM, "Martin Maechler" 
wrote:

> >>>>> Michael Chirico 
> >>>>> on Tue, 27 Feb 2018 20:18:34 +0800 writes:
>
> Slightly amended 'Subject': (unimportant mistake: a dgeMatrix is *not*
> sparse)
>
> MM: modified to commented R code,  slightly changed from your post:
>
>
> ## I am attempting to use the lars package with a sparse input feature
> matrix,
> ## but the following fails:
>
> library(Matrix)
> library(lars)
> data(diabetes) # from 'lars'
> ##UAagghh! not like this -- both attach() *and*   as.data.frame()  are
> horrific!
> ##UA  attach(diabetes)
> ##UA  x = as(as.matrix(as.data.frame(x)), 'dgCMatrix')
> x <- as(unclass(diabetes$x), "dgCMatrix")
> lars(x, y, intercept = FALSE)
> ## Error in scale.default(x, FALSE, normx) :
> ##   length of 'scale' must equal the number of columns of 'x'
>
> ## More specifically, scale.default fails as called from lars():
> normx <- new("dgeMatrix",
>   x = c(4, 0, 9, 1, 1, -1, 4, -2, 6, 6)*1e-14, Dim = c(1L, 10L),
>   Dimnames = list(NULL,
>   c("x.age", "x.sex", "x.bmi", "x.map", "x.tc",
> "x.ldl", "x.hdl", "x.tch", "x.ltg", "x.glu")))
> scale.default(x, center=FALSE, scale = normx)
> ## Error in scale.default(x, center = FALSE, scale = normx) :
> ##   length of 'scale' must equal the number of columns of 'x'
>
> >  The problem is that this check fails because is.numeric(normx) is FALSE:
>
> >  if (is.numeric(scale) && length(scale) == nc)
>
> >  So, the error message is misleading. In fact length(scale) is the same
> as
> >  nc.
>
> Correct, twice.
>
> >  At a minimum, the error message needs to be repaired; do we also want to
> >  attempt as.numeric(normx) (which I believe would have allowed scale to
> work
> >  in this case)?
>
> It seems sensible to allow  both 'center' and 'scale' to only
> have to *obey*  as.numeric(.)  rather than fulfill is.numeric(.).
>
> Though that is not a bug in scale()  as its help page has always
> said that 'center' and 'scale' should either be a logical value
> or a numeric vector.
>
> For that reason I can really claim a bug in 'lars' which should
> really not use
>
>scale(x, FALSE, normx)
>
> but rather
>
>scale(x, FALSE, scale = as.numeric(normx))
>
> and then all would work.
>
> > -
>
> >  (I'm aware that there's some import issues in lars, as the offending
> line
> >  to create normx *should* work, as is.numeric(sqrt(drop(rep(1, nrow(x))
> %*%
> >  (x^2 is TRUE -- it's simply that lars doesn't import the
> appropriate S4
> >  methods)
>
> >  Michael Chirico
>
> Yes, 'lars' has _not_ been updated since  Spring 2013, notably
> because its authors have been saying (for rather more than 5
> years I think) that one should really use
>
>  require("glmnet")
>
> instead.
>
> Your point is still valid that it would be easy to enhance
> base :: scale.default()  so it'd work in more cases.
>
> Thank you for that.  I do plan to consider such a change in
> R-devel (planned to become R 3.5.0 in April).
>
> Martin Maechler,
> ETH Zurich
>
>
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] model.frame strips class as promised, but fails to strip OBJECT in C

2018-03-05 Thread Michael Chirico
Full thread here:

https://github.com/tidyverse/broom/issues/287

Reproducible example:

is.object(freeny$y)
# [1] TRUE
attr(freeny$y, 'class')
# [1] "ts"
class(freeny$y)
# [1] "ts"

# ts attribute wiped by model.frame
class(model.frame(y ~ ., data = freeny)$y)
# [1] "numeric"
attr(model.frame(y ~ ., data = freeny)$y, 'class')
# NULL

# but still:
is.object(model.frame(y ~ ., data = freeny)$y)
# [1] TRUE

That is, given a numeric vector with class "ts", model.frame strips the
"ts" attribute but keeps the is.object property.

This behavior is alluded to in ?model.frame:

> Unless na.action = NULL, time-series attributes will be removed from the
> variables found (since they will be wrong if NAs are removed).

And in fact explicitly setting na.action = NULL prevents dropping the class:

class(model.frame(y ~ ., data = freeny, na.action = NULL)$y)
# [1] "ts"

The reason this looks especially like a bug is that it differs from how
na.omit behaves:

DF <- data.frame(x = c(1, 2, 3), y = c(0, 10, NA))
is.object(DF$y)
# [1] FALSE
class(DF$y) = 'foo'
is.object(DF$y)
# [1] TRUE
class(na.omit(DF)$y)
# [1] "numeric"
is.object(na.omit(DF)$y)
# [1] FALSE


That is, similarly presented with a classed object, na.omit strips the
class *and* the OBJECT attribute.

Thanks,
Michael Chirico

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Workflow for translations?

2018-03-25 Thread Michael Lawrence
Hi Detlef,

Sorry, this is something that I was supposed to be doing. I will send
out a call soon.

Michael

On Sun, Mar 25, 2018 at 10:00 AM, Detlef Steuer  wrote:

> Hi friends,
>
> what happened to the "call for translation" that in former times was a clear
> signal to start working on an update of the translations?
>
> What is the supposed workflow for translations at the moment?
> Did I miss some policy change there? Can I read it up somewhere?
>
> All the best
> Detlef
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] odd assignInNamespace / setGeneric interaction

2018-04-18 Thread Michael Lawrence
Hi Bill,

Ideally, your coworker would just make an alias (or shortcut or
whatever) for R that passed --no-save to R. I'll try to look into this
though.
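
(For reference, a sketch of the environment-based fix Bill mentions at the end
of his message below; qNew is just an illustrative name for the replacement:)

utils::assignInNamespace(".qOrig", base::q, "base")
qNew <- function(save = "no", ...) base:::.qOrig(save = save, ...)
## give the replacement the base namespace as its environment, so that a later
## setGeneric("q") does not try to load .GlobalEnv as a namespace
environment(qNew) <- getNamespace("base")
utils::assignInNamespace("q", qNew, "base")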

Michael

On Wed, Apr 18, 2018 at 1:38 PM, William Dunlap via R-devel
 wrote:
> A coworker got tired of having to type 'yes' or 'no' after quitting R: he
> never wanted to save the R workspace when quitting.  So he added
> assignInNamespace lines to his .Rprofile file to replace base::q with
> one that, by default, called the original with save="no"..
>
>   utils::assignInNamespace(".qOrig", base::q, "base")
>   utils::assignInNamespace("q", function(save = "no", ...)
> base:::.qOrig(save = save, ...), "base")
>
> This worked fine until he decide to load the distr package:
>
>   > suppressPackageStartupMessages(library(distr))
>   Error: package or namespace load failed for ‘distr’ in
> loadNamespace(name):
>there is no package called ‘.GlobalEnv’
>
> distr calls setGeneric("q"), which indirectly causes the environment
> of base::q, .GlobalEnv, to be loaded as a namespace, causing the error.
> Giving his replacement q function the environment getNamespace("base")
> avoids the problem.
>
> I can reproduce the problem by making a package that just calls
> setGeneric("as.hexmode",...) and a NAMEPACE file with
> exportMethods("as.hexmode").  If my .Rprofile puts a version of as.hexmode
> with environment .GlobalEnv into the base namespace, then I get the same
> error when trying to load the package.
>
> I suppose this is mostly a curiosity and unlikely to happen to most people
> but it did confuse us for a while.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] odd assignInNamespace / setGeneric interaction

2018-04-19 Thread Michael Lawrence
To clarify, I am going to fix the issue in the methods package
(actually I already have but need to test further). There's no intent
to change the behavior of q().

On Thu, Apr 19, 2018 at 8:39 AM, William Dunlap  wrote:
> The problem is not specific to redefining the q function, but to
> the interaction of assignInNamespace and setGeneric.  The
> latter requires, roughly, that the environment of the function
> being replaced by an S4 generic is (or is a descendant of)
> the environment in which it lives.
>
> E.g., the following demonstrates the problem
>
> % R --quiet --vanilla
>> assignInNamespace("plot", function(x, ...) stop("No plotting allowed!"),
>> getNamespace("graphics"))
>> library(stats4)
> Error: package or namespace load failed for ‘stats4’ in loadNamespace(name):
>  there is no package called ‘.GlobalEnv’
>
> and defining the bogus plot function in the graphics namespace avoids the
> problem
>
> % R --quiet --vanilla
>>  assignInNamespace("plot", with(getNamespace("graphics"), function(x, ...)
>> stop("No plotting allowed!")), getNamespace("graphics"))
>> library(stats4)
>>
>
> I suppose people who use assignInNamespace get what they deserve.
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Thu, Apr 19, 2018 at 2:33 AM, Martin Maechler
>  wrote:
>>
>> >>>>> Michael Lawrence 
>> >>>>> on Wed, 18 Apr 2018 14:16:37 -0700 writes:
>>
>> > Hi Bill,
>> > Ideally, your coworker would just make an alias (or shortcut or
>> > whatever) for R that passed --no-save to R. I'll try to look into
>> this
>> > though.
>>
>> > Michael
>>
>> Yes, indeed!
>>
>> As some of you know, I've been using R (for ca 23 years now)
>> almost only from ESS (Emacs Speaks Statistics).
>>
>> There, I've activated '--no-save' for ca 20 years or so,
>> nowadays (since Emacs has adopted "custom") I have had this in
>> my ~/.emacs  custom lines
>>
>>  '(inferior-R-args "--no-restore-history --no-save ")
>>
>> standalone (to paste into your own ~/.emacs ) :
>>
>> (custom-set-variables '(inferior-R-args "--no-restore-history --no-save
>> "))
>>
>> 
>>
>> The current fashionable IDE to R,
>> Rstudio, also allows to set such switches by its GUI:
>>
>> Menu [Tools]
>>   --> (bottom) entry [Global Options]
>> --> the first sidebar entry  [R General]:
>> Look for two lines mentioning "workspace" or ".RData" and
>> change to 'save never' ( == --no-save),
>> and nowadays I also recommend my students to not *read*
>> these, i.e., '--no-restore'
>>
>> ---
>>
>> @Michael: I'm not sure what you're considering.  I feel that in
>>  general, there are already too many R startup tweaking
>>  possibilities, notably via environment variables.
>> [e.g., the current ways to pre-determine the active .libPaths() in R,
>>  and the fact the R calls R again during 'R CMD check' etc,
>>  sometimes drives me crazy when .libPaths() become incompatible
>>  for too many reasons  yes, I'm diverting: that's another story]
>>
>> If we'd want to allow using (yet  another!) environment variable
>> here, I'd at least would  make sure they are not consulted when
>> explicit --no-save or --vanilla, etc are used.
>>
>> Martin
>>
>>
>> > On Wed, Apr 18, 2018 at 1:38 PM, William Dunlap via R-devel
>> >  wrote:
>> >> A coworker got tired of having to type 'yes' or 'no' after quitting
>> R: he
>> >> never wanted to save the R workspace when quitting.  So he added
>> >> assignInNamespace lines to his .Rprofile file to replace base::q
>> with
>> >> one that, by default, called the original with save="no"..
>> >>
>> >> utils::assignInNamespace(".qOrig", base::q, "base")
>> >> utils::assignInNamespace("q", function(save = "no", ...)
>> >> base:::.qOrig(save = save, ...), "base")
>> >>
>> >> This worked fine until he decide to load the distr package:
>> >>
>> >> > suppressPackageStartupMessages(library(distr))
>> >> Error: package or

Re: [Rd] readLines() for non-blocking pipeline behaves differently in R 3.5

2018-04-25 Thread Michael Lawrence
Probably related to the switch to buffered connections. I will look
into this soon.

On Wed, Apr 25, 2018 at 2:34 PM, Randy Lai  wrote:
> It seems that the behavior of readLines() in R 3.5 has changed for 
> non-blocking pipeline.
>
>
> Consider the following R script, which reads from STDIN line by line.
> ```
> con <- file("stdin")
> open(con, blocking = FALSE)
>
> while (TRUE) {
>   txt <- readLines(con, 1)
>   if (length(txt) > 0) {
>     cat(txt, "\n", file = stdout())
>   }
>   Sys.sleep(0.1)
> }
> close(con)
>
> ```
>
> In R 3.4.4, it works as expected.
>
> ```
> (randymbpro)-Desktop$ echo "abc\nfoo" | R --slave -f test.R
> abc
> foo
> ```
>
> In R 3.5, only the first line is printed
> ```
> (randymbpro)-Desktop$ echo "abc\nfoo" | R --slave -f test.R
> abc
> ```
>
> Is this change expected?  If I change `blocking` to `TRUE` above, the above 
> code would
> work. But I need a non-blocking connection in my use case of piping a buffer
> from another program.
>
> Best,
>
> R 3.5 @ macOS 10.13
>
>
> Randy
>
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
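
For reference, a sketch of the blocking-connection variant the poster mentions
as working. With blocking = TRUE, readLines() waits for a complete line and
returns character(0) only at end of input, so the Sys.sleep() polling loop is
not needed; whether that suits the surrounding piping setup depends on the use
case.

```r
con <- file("stdin")
open(con, blocking = TRUE)
while (length(txt <- readLines(con, n = 1)) > 0) {
  cat(txt, "\n", file = stdout())
}
close(con)
```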


Re: [Rd] readLines() for non-blocking pipeline behaves differently in R 3.5

2018-04-26 Thread Michael Lawrence
The issue is that readLines() tries to seek (for reasons I don't
understand) in the non-blocking case, but silently fails for "stdin"
since it's a stream. This confused the buffering logic. The fix is to
mark "stdin" as unable to seek, but I do wonder why readLines() is
seeking in the first place.

Anyway, I'll get this into patched ASAP. Thanks for the report.

Michael


On Wed, Apr 25, 2018 at 5:13 PM, Michael Lawrence  wrote:
> Probably related to the switch to buffered connections. I will look
> into this soon.
>
> On Wed, Apr 25, 2018 at 2:34 PM, Randy Lai  wrote:
>> It seems that the behavior of readLines() in R 3.5 has changed for 
>> non-blocking pipeline.
>>
>>
>> Consider the following R script, which reads from STDIN line by line.
>> ```
>> con <- file("stdin")
>> open(con, blocking = FALSE)
>>
>> while (TRUE) {
>> txt <- readLines(con, 1)
>> if (length(txt) > 0) {
>> cat(txt, "\n", file = stdout())
>> }
>> Sys.sleep(0.1)
>> }
>> close(con)
>>
>> ```
>>
>> In R 3.4.4, it works as expected.
>>
>> ```
>> (randymbpro)-Desktop$ echo "abc\nfoo" | R --slave -f test.R
>> abc
>> foo
>> ```
>>
>> In R 3.5, only the first line is printed
>> ```
>> (randymbpro)-Desktop$ echo "abc\nfoo" | R --slave -f test.R
>> abc
>> ```
>>
>> Is this change expected?  If I change `blocking` to `TRUE` above, the above 
>> code would
>> work. But I need a non-blocking connection in my use case of piping a buffer
>> from another program.
>>
>> Best,
>>
>> R 3.5 @ macOS 10.13
>>
>>
>> Randy
>>
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] readLines() for non-blocking pipeline behaves differently in R 3.5

2018-04-26 Thread Michael Lawrence
Thanks for the clear explanation. At first glance seeking to the
current position seemed like it would be a no-op, but obviously things
are more complicated under the hood.

On Thu, Apr 26, 2018 at 11:35 AM, Gábor Csárdi  wrote:
> I suspect the reason for the seek is this:
>
> cat("1\n", file = "foobar")
> f  <- file("foobar", blocking = FALSE, open = "r")
> readLines(f)
> #> [1] "1"
>
> cat("2\n", file = "foobar", append = TRUE)
> readLines(f)
> #> [1] "2"
>
> cat("3\n", file = "foobar", append = TRUE)
> readLines(f)
> #> [1] "3"
>
> I.e. R can emulate a file connection with non-blocking reads.
> AFAICT there is no such thing, in Unix at least.
> For this  emulation, it needs to seek to the "current" position.
>
> Gabor
>
> On Thu, Apr 26, 2018 at 7:21 PM, Michael Lawrence
>  wrote:
>> The issue is that readLines() tries to seek (for reasons I don't
>> understand) in the non-blocking case, but silently fails for "stdin"
>> since it's a stream. This confused the buffering logic. The fix is to
>> mark "stdin" as unable to seek, but I do wonder why readLines() is
>> seeking in the first place.
>>
>> Anyway, I'll get this into patched ASAP. Thanks for the report.
>>
>> Michael
>>
>>
>> On Wed, Apr 25, 2018 at 5:13 PM, Michael Lawrence  wrote:
>>> Probably related to the switch to buffered connections. I will look
>>> into this soon.
>>>
>>> On Wed, Apr 25, 2018 at 2:34 PM, Randy Lai  wrote:
>>>> It seems that the behavior of readLines() in R 3.5 has changed for 
>>>> non-blocking pipeline.
>>>>
>>>>
>>>> Consider the following R script, which reads from STDIN line by line.
>>>> ```
>>>> con <- file("stdin")
>>>> open(con, blocking = FALSE)
>>>>
>>>> while (TRUE) {
>>>> txt <- readLines(con, 1)
>>>> if (length(txt) > 0) {
>>>> cat(txt, "\n", file = stdout())
>>>> }
>>>> Sys.sleep(0.1)
>>>> }
>>>> close(con)
>>>>
>>>> ```
>>>>
>>>> In R 3.4.4, it works as expected.
>>>>
>>>> ```
>>>> (randymbpro)-Desktop$ echo "abc\nfoo" | R --slave -f test.R
>>>> abc
>>>> foo
>>>> ```
>>>>
>>>> In R 3.5, only the first line is printed
>>>> ```
>>>> (randymbpro)-Desktop$ echo "abc\nfoo" | R --slave -f test.R
>>>> abc
>>>> ```
>>>>
>>>> Is this change expected?  If I change `blocking` to `TRUE` above, the 
>>>> above code would
>>>> work. But I need a non-blocking connection in my use case of piping a
>>>> buffer from another program.
>>>>
>>>> Best,
>>>>
>>>> R 3.5 @ macOS 10.13
>>>>
>>>>
>>>> Randy
>>>>
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> __
>>>> R-devel@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>
>>
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] readLines() behaves differently for gzfile connection

2018-05-10 Thread Michael Lawrence
Would it be possible to get that file or a representative subset of it
somewhere so that I can reproduce this?

Thanks,
Michael

On Thu, May 10, 2018 at 3:31 PM, Ben Heavner  wrote:
> When I read a .gz file with readLines() in 3.4.3, it returns text (and a
> warning). In 3.5.0, it gives a warning, but no text. Is this expected
> behavior or a bug?
>
> 3.4.3:
>> source_file = "1k_annotation.gz"
>> readfile_con <- gzfile(source_file, "r")
>> readLines(readfile_con, n = 5)
> [1] "#chr\tpos\tref\talt\t
>
> 
>
> Warning message:
> In readLines(readfile_con, n = 5) :
>   seek on a gzfile connection returned an internal error
>
>> close(readfile_con)
>
>> sessionInfo()
> R version 3.4.3 (2017-11-30)
> Platform: x86_64-apple-darwin15.6.0 (64-bit)
> Running under: macOS Sierra 10.12.6
>
> Matrix products: default
> BLAS:
> /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
> LAPACK:
> /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_3.4.3
>
> -
>
> 3.5.0:
>> source_file = "1k_annotation.gz"
>> readfile_con <- gzfile(source_file, "r")
>> readLines(readfile_con, n = 5)
> [1] "" "" "" "" ""
> Warning message:
> In readLines(readfile_con, n = 5) :
>   seek on a gzfile connection returned an internal error
>> close(readfile_con)
>> sessionInfo()
> R version 3.5.0 (2018-04-23)
> Platform: x86_64-pc-linux-gnu (64-bit)
> Running under: Debian GNU/Linux 9 (stretch)
>
> Matrix products: default
> BLAS: /usr/lib/openblas-base/libblas.so.3
> LAPACK: /usr/lib/libopenblasp-r0.2.19.so
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=C
>  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
>  [9] LC_ADDRESS=C   LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> loaded via a namespace (and not attached):
> [1] compiler_3.5.0
>
> 
> (note: I'm running 3.5.0 via the docker rocker/tidyverse:3.5 container, and
> 3.4.3 on my mac desktop machine)
>
> Thanks!
> Ben Heavner
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] readLines() behaves differently for gzfile connection

2018-05-14 Thread Michael Lawrence
I haven't been able to reproduce the empty lines issue on my Mac or
Linux laptop, but I have yet to try that container.

The warning is because of a SEEK_SET to -1, which apparently is
unsupported by zlib. Maybe the zlib version in that container is
getting confused. I'm not sure why readLines() wants to seek to -1
instead of 0, but it only does that on non-blocking connections. The
compressed file connections are effectively blocking but are marked as
non-blocking. Marking them as blocking removes the warning. I will get
that into devel and release soon. Hopefully that fixes the empty lines
issue also.

Michael

On Thu, May 10, 2018 at 4:21 PM, Ben Heavner  wrote:
> You bet - it's available on github at
> https://github.com/UW-GAC/wgsaparsr/blob/master/tests/testthat/1k_annotation.gz
>
> -Ben
>
> On Thu, May 10, 2018 at 4:17 PM, Michael Lawrence
>  wrote:
>>
>> Would it be possible to get that file or a representative subset of it
>> somewhere so that I can reproduce this?
>>
>> Thanks,
>> Michael
>>
>> On Thu, May 10, 2018 at 3:31 PM, Ben Heavner  wrote:
>> > When I read a .gz file with readLines() in 3.4.3, it returns text (and a
>> > warning). In 3.5.0, it gives a warning, but no text. Is this expected
>> > behavior or a bug?
>> >
>> > 3.4.3:
>> >> source_file = "1k_annotation.gz"
>> >> readfile_con <- gzfile(source_file, "r")
>> >> readLines(readfile_con, n = 5)
>> > [1] "#chr\tpos\tref\talt\t
>> >
>> > 
>> >
>> > Warning message:
>> > In readLines(readfile_con, n = 5) :
>> >   seek on a gzfile connection returned an internal error
>> >
>> >> close(readfile_con)
>> >
>> >> sessionInfo()
>> > R version 3.4.3 (2017-11-30)
>> > Platform: x86_64-apple-darwin15.6.0 (64-bit)
>> > Running under: macOS Sierra 10.12.6
>> >
>> > Matrix products: default
>> > BLAS:
>> >
>> > /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRblas.0.dylib
>> > LAPACK:
>> >
>> > /Library/Frameworks/R.framework/Versions/3.4/Resources/lib/libRlapack.dylib
>> >
>> > locale:
>> > [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
>> >
>> > attached base packages:
>> > [1] stats graphics  grDevices utils datasets  methods   base
>> >
>> > loaded via a namespace (and not attached):
>> > [1] compiler_3.4.3
>> >
>> > -
>> >
>> > 3.5.0:
>> >> source_file = "1k_annotation.gz"
>> >> readfile_con <- gzfile(source_file, "r")
>> >> readLines(readfile_con, n = 5)
>> > [1] "" "" "" "" ""
>> > Warning message:
>> > In readLines(readfile_con, n = 5) :
>> >   seek on a gzfile connection returned an internal error
>> >> close(readfile_con)
>> >> sessionInfo()
>> > R version 3.5.0 (2018-04-23)
>> > Platform: x86_64-pc-linux-gnu (64-bit)
>> > Running under: Debian GNU/Linux 9 (stretch)
>> >
>> > Matrix products: default
>> > BLAS: /usr/lib/openblas-base/libblas.so.3
>> > LAPACK: /usr/lib/libopenblasp-r0.2.19.so
>> >
>> > locale:
>> >  [1] LC_CTYPE=en_US.UTF-8   LC_NUMERIC=C
>> >  [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8
>> >  [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=C
>> >  [7] LC_PAPER=en_US.UTF-8   LC_NAME=C
>> >  [9] LC_ADDRESS=C   LC_TELEPHONE=C
>> > [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>> >
>> > attached base packages:
>> > [1] stats graphics  grDevices utils datasets  methods   base
>> >
>> > loaded via a namespace (and not attached):
>> > [1] compiler_3.5.0
>> >
>> > 
>> > (note: I'm running 3.5.0 via the docker rocker/tidyverse:3.5 container,
>> > and
>> > 3.4.3 on my mac desktop machine)
>> >
>> > Thanks!
>> > Ben Heavner
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > __
>> > R-devel@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>> >
>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
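
A self-contained way to check a given installation, using a throwaway gzip
file rather than the 1k_annotation.gz data (the file below is just a tempfile,
not the original data):

```r
tmp <- tempfile(fileext = ".gz")
out <- gzfile(tmp, "w")
writeLines(c("line 1", "line 2", "line 3"), out)
close(out)

readfile_con <- gzfile(tmp, "r")
readLines(readfile_con, n = 2)   # on affected builds: "" "" plus the seek warning
close(readfile_con)
unlink(tmp)
```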


Re: [Rd] Dispatch mechanism seems to alter object before calling method on it

2018-05-15 Thread Michael Lawrence
My understanding is that array (or any other structure) does not
"simply" inherit from vector, because structures are not vectors in
the strictest sense. Basically, once a vector gains attributes, it is
a structure, not a vector. The methods package accommodates this by
defining an "is" relationship between "structure" and "vector" via an
"explicit coerce", such that any "structure" passed to a "vector"
method is first passed to as.vector(), which strips attributes. This
is very much by design.

Michael


On Tue, May 15, 2018 at 5:25 PM, Hervé Pagès  wrote:
> Hi,
>
> This was quite unexpected:
>
>   setGeneric("foo", function(x) standardGeneric("foo"))
>
>   setMethod("foo", "vector", identity)
>
>   foo(matrix(1:12, ncol=3))
>   # [1]  1  2  3  4  5  6  7  8  9 10 11 12
>
>   foo(array(1:24, 4:2))
>   # [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
> 24
>
> If I define a method for array objects, things work as expected though:
>
>   setMethod("foo", "array", identity)
>
>   foo(matrix(1:12, ncol=3))
>   #      [,1] [,2] [,3]
>   # [1,]    1    5    9
>   # [2,]    2    6   10
>   # [3,]    3    7   11
>   # [4,]    4    8   12
>
> So, luckily, I have a workaround.
>
> But shouldn't the dispatch mechanism stay away from the business of
> altering objects before passed to it?
>
> Thanks,
> H.
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
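
The coercion described above can also be seen from the method lookup itself; a
minimal sketch restating the example:

```r
library(methods)
setGeneric("foo", function(x) standardGeneric("foo"))
setMethod("foo", "vector", identity)

m <- matrix(1:12, ncol = 3)
selectMethod("foo", "matrix")    # the "vector" method is selected for "matrix"
identical(foo(m), as.vector(m))  # TRUE: the argument is passed through as.vector() first
```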


Re: [Rd] Dispatch mechanism seems to alter object before calling method on it

2018-05-16 Thread Michael Lawrence
Factors and data.frames are not structures, because they must have a
class attribute. Just call them "objects". They are higher level than
structures, which in practice just shape data without adding a lot of
semantics. Compare getClass("matrix") and getClass("factor").

I agree that inheritance through explicit coercion is confusing. As
far as I know, there are only 2 places where it is used:
1) Objects with attributes but no class, basically "structure" and its
subclasses "array" <- "matrix"
2) Classes that extend a reference type ("environment", "name" and
"externalptr") via hidden delegation (@.xData)

I'm not sure if anyone should be doing #2. For #1, a simple "fix"
would be just to drop inheritance of "structure" from "vector". I
think the intent was to mimic base R behavior, where it will happily
strip (or at least ignore) attributes when passing an array or matrix
to an internal function that expects a vector.

A related problem, which explains why factor and data.frame inherit
from "vector" even though they are objects, is that any S4 object
derived from those needs to be (for pragmatic compatibility reasons)
an integer vector or list, respectively, internally (the virtual
@.Data slot). Separating that from inheritance would probably be
difficult.

Yes, we can consider these to be problems, to some extent stemming
from the behavior and design of R itself, but I'm not sure it's worth
doing anything about them at this point.

Michael

On Wed, May 16, 2018 at 8:33 AM, Hervé Pagès  wrote:
> On 05/15/2018 09:13 PM, Michael Lawrence wrote:
>>
>> My understanding is that array (or any other structure) does not
>> "simply" inherit from vector, because structures are not vectors in
>> the strictest sense. Basically, once a vector gains attributes, it is
>> a structure, not a vector. The methods package accommodates this by
>> defining an "is" relationship between "structure" and "vector" via an
>> "explicit coerce", such that any "structure" passed to a "vector"
>> method is first passed to as.vector(), which strips attributes. This
>> is very much by design.
>
>
> It seems that the problem is really with matrices and arrays, not
> with "structures" in general:
>
>   f <- factor(c("z", "x", "z"), levels=letters)
>   m <- matrix(1:12, ncol=3)
>   df <- data.frame(f=f)
>   x <- structure(1:3, titi="A")
>
> Only the matrix looses its attributes when passed to a "vector"
> method:
>
>   setGeneric("foo", function(x) standardGeneric("foo"))
>   setMethod("foo", "vector", identity)
>
>   foo(f) # attributes are preserved
>   # [1] z x z
>   # Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z
>
>   foo(m) # attributes are stripped
>   # [1]  1  2  3  4  5  6  7  8  9 10 11 12
>
>   foo(df)# attributes are preserved
>   #   f
>   # 1 z
>   # 2 x
>   # 3 z
>
>   foo(x) # attributes are preserved
>   # [1] 1 2 3
>   # attr(,"titi")
>   # [1] "A"
>
> Also if structures are passed to as.vector() before being passed to
> a "vector" method, shouldn't as.vector() and foo() be equivalent on
> them? For 'f' and 'x' they're not:
>
>   as.vector(f)
>   # [1] "z" "x" "z"
>
>   as.vector(x)
>   # [1] 1 2 3
>
> Finally note that for factors and data frames the "vector" method gets
> selected despite the fact that is( , "vector") is FALSE:
>
>   is(f, "vector")
>   # [1] FALSE
>
>   is(m, "vector")
>   # [1] TRUE
>
>   is(df, "vector")
>   # [1] FALSE
>
>   is(x, "vector")
>   # [1] TRUE
>
> Couldn't we recognize these problems as real, even if they are by
> design? Hopefully we can all agree that:
> - the dispatch mechanism should only dispatch, not alter objects;
> - is() and selectMethod() should not contradict each other.
>
> Thanks,
> H.
>
>>
>> Michael
>>
>>
>> On Tue, May 15, 2018 at 5:25 PM, Hervé Pagès  wrote:
>>>
>>> Hi,
>>>
>>> This was quite unexpected:
>>>
>>>setGeneric("foo", function(x) standardGeneric("foo"))
>>>
>>>setMethod("foo", "vector", identity)
>>>
>>>foo(matrix(1:12, ncol=3))
>>># [1]  1  2  3  4  5  6  7  8  9 10 11 12
>>>
>>> 

Re: [Rd] Dispatch mechanism seems to alter object before calling method on it

2018-05-16 Thread Michael Lawrence
On Wed, May 16, 2018 at 12:23 PM, Hervé Pagès  wrote:
> On 05/16/2018 10:22 AM, Michael Lawrence wrote:
>>
>> Factors and data.frames are not structures, because they must have a
>> class attribute. Just call them "objects". They are higher level than
>> structures, which in practice just shape data without adding a lot of
>> semantics. Compare getClass("matrix") and getClass("factor").
>>
>> I agree that inheritance through explicit coercion is confusing. As
>> far as I know, there are only 2 places where it is used:
>> 1) Objects with attributes but no class, basically "structure" and its
>> subclasses "array" <- "matrix"
>> 2) Classes that extend a reference type ("environment", "name" and
>> "externalptr") via hidden delegation (@.xData)
>>
>> I'm not sure if anyone should be doing #2. For #1, a simple "fix"
>> would be just to drop inheritance of "structure" from "vector". I
>> think the intent was to mimic base R behavior, where it will happily
>> strip (or at least ignore) attributes when passing an array or matrix
>> to an internal function that expects a vector.
>>
>> A related problem, which explains why factor and data.frame inherit
>> from "vector" even though they are objects, is that any S4 object
>> derived from those needs to be (for pragmatic compatibility reasons)
>> an integer vector or list, respectively, internally (the virtual
>> @.Data slot). Separating that from inheritance would probably be
>> difficult.
>>
>> Yes, we can consider these to be problems, to some extent stemming
>> from the behavior and design of R itself, but I'm not sure it's worth
>> doing anything about them at this point.
>
>
> Thanks for the informative discussion. It still doesn't explain
> why 'm' gets its attributes stripped and 'x' does not though:
>
>   m <- matrix(1:12, ncol=3)
>   x <- structure(1:3, titi="A")
>
>   setGeneric("foo", function(x) standardGeneric("foo"))
>   setMethod("foo", "vector", identity)
>
>   foo(m)
>   # [1]  1  2  3  4  5  6  7  8  9 10 11 12
>
>   foo(x)
>   # [1] 1 2 3
>   # attr(,"titi")
>   # [1] "A"
>
> If I understand correctly, both are "structures", not "objects".
>

The structure 'x' has no class, so nothing special is going to happen.
As you know, S4 has a well-defined class hierarchy. Just look at
getClass("structure") to see its subclasses. There was at some point
an attempt to create a sort of dynamic inheritance, where a 'test'
function would be called and could figure this out. However, that was
never implemented. For one thing, it would be even more confusing.

> Why aren't these problems worth fixing? More generally speaking
> the erratic behavior of the S4 system with respect to S3 objects
> has been a plague since the beginning of the methods package.
> And many people have complained about this in many occasions in
> one way or another. For the record, here are some of the most
> notorious problems:
>
>   class(as.numeric(1:4))
>   # [1] "numeric"
>   class(as(1:4, "numeric"))
>   # [1] "integer"
>

This is not really a problem with the methods package. is.numeric(1L)
is TRUE, thus integer extends numeric, so coercing an integer to
numeric is a no-op. as.numeric() should really be called as.double()
or something. But that's not going to change, of course.

>   is.vector(matrix())
>   # [1] FALSE
>   is(matrix(), "vector")
>   # [1] TRUE
>

We already discussed this in the context of "structure" inheriting
from "vector" and explicit coercion.

>   is.list(data.frame())
>   # [1] TRUE
>   is(data.frame(), "list")
>   # [1] FALSE
>   extends("data.frame", "list")
>   # [1] TRUE
>

This is a compromise for compatibility with inherits(), since the
result of data.frame() is an S3 object.

>
>   is(data.frame(), "vector")
>   # [1] FALSE
>   is(data.frame(), "factor")
>   # [1] FALSE
>   is(data.frame(), "vector_OR_factor")
>   # [1] TRUE
>

The question is: which inheritance to follow, S3 or S4? Since "vector"
is a basic class, inheritance follows S3 rules. But the class union is
an S4 class, so it follows S4 rules.

>   etc...
>
> Many people stay away from S4 because of these incomprehensible
> behaviors.
>
> Finally note that even pure S3 operations can produce output that
> doesn't

Re: [Rd] Dispatch mechanism seems to alter object before calling method on it

2018-05-16 Thread Michael Lawrence
On Wed, May 16, 2018 at 3:45 PM, Hervé Pagès  wrote:
> On 05/16/2018 01:24 PM, Michael Lawrence wrote:
>>
>> On Wed, May 16, 2018 at 12:23 PM, Hervé Pagès 
>> wrote:
>>>
>>> On 05/16/2018 10:22 AM, Michael Lawrence wrote:
>>>>
>>>>
>>>> Factors and data.frames are not structures, because they must have a
>>>> class attribute. Just call them "objects". They are higher level than
>>>> structures, which in practice just shape data without adding a lot of
>>>> semantics. Compare getClass("matrix") and getClass("factor").
>>>>
>>>> I agree that inheritance through explicit coercion is confusing. As
>>>> far as I know, there are only 2 places where it is used:
>>>> 1) Objects with attributes but no class, basically "structure" and its
>>>> subclasses "array" <- "matrix"
>>>> 2) Classes that extend a reference type ("environment", "name" and
>>>> "externalptr") via hidden delegation (@.xData)
>>>>
>>>> I'm not sure if anyone should be doing #2. For #1, a simple "fix"
>>>> would be just to drop inheritance of "structure" from "vector". I
>>>> think the intent was to mimic base R behavior, where it will happily
>>>> strip (or at least ignore) attributes when passing an array or matrix
>>>> to an internal function that expects a vector.
>>>>
>>>> A related problem, which explains why factor and data.frame inherit
>>>> from "vector" even though they are objects, is that any S4 object
>>>> derived from those needs to be (for pragmatic compatibility reasons)
>>>> an integer vector or list, respectively, internally (the virtual
>>>> @.Data slot). Separating that from inheritance would probably be
>>>> difficult.
>>>>
>>>> Yes, we can consider these to be problems, to some extent stemming
>>>> from the behavior and design of R itself, but I'm not sure it's worth
>>>> doing anything about them at this point.
>>>
>>>
>>>
>>> Thanks for the informative discussion. It still doesn't explain
>>> why 'm' gets its attributes stripped and 'x' does not though:
>>>
>>>m <- matrix(1:12, ncol=3)
>>>x <- structure(1:3, titi="A")
>>>
>>>setGeneric("foo", function(x) standardGeneric("foo"))
>>>setMethod("foo", "vector", identity)
>>>
>>>foo(m)
>>># [1]  1  2  3  4  5  6  7  8  9 10 11 12
>>>
>>>foo(x)
>>># [1] 1 2 3
>>># attr(,"titi")
>>># [1] "A"
>>>
>>> If I understand correctly, both are "structures", not "objects".
>>>
>>
>> The structure 'x' has no class, so nothing special is going to happen.
>> As you know, S4 has a well-defined class hierarchy. Just look at
>> getClass("structure") to see its subclasses. There was at some point
>> an attempt to create a sort of dynamic inheritance, where a 'test'
>> function would be called and could figure this out. However, that was
>> never implemented. For one thing, it would be even more confusing.
>>
>>> Why aren't these problems worth fixing? More generally speaking
>>> the erratic behavior of the S4 system with respect to S3 objects
>>> has been a plague since the beginning of the methods package.
>>> And many people have complained about this in many occasions in
>>> one way or another. For the record, here are some of the most
>>> notorious problems:
>>>
>>>class(as.numeric(1:4))
>>># [1] "numeric"
>>>class(as(1:4, "numeric"))
>>># [1] "integer"
>>>
>>
>> This is not really a problem with the methods package. is.numeric(1L)
>> is TRUE, thus integer extends numeric, so coercing an integer to
>> numeric is a no-op.
>
>
> Only as(1:4, "numeric", strict=FALSE) should be a no-op.
> as(1:4, "numeric") should still coerce because as() is supposed
> to perform strict coercion by default.
>
>> as.numeric() should really be called as.double()
>> or something. But that's not going to change, of course.
>
>
> as.numeric() is doing the right thing (i.e. strict coercion) so there
> is no 

[Rd] Patch for bug 17256 'possible bug in writeForeignSAS in the foreign library when string is NA'

2018-05-17 Thread NELSON, Michael
Attached is a patch to fix 
https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17256

'possible bug in writeForeignSAS in the foreign library when string is NA'



The patch
1. fixes the case where there are NAs within a character column, and the
case where all values are strings of length 0
2. generally replaces calls to `sapply` with `vapply` (and replaces
any(is.na()) with anyNA()).



Happy to add it in Bugzilla (but I don't have an account there)

Regards

Michael Nelson

___
Disclaimer: This message is intended for the addressee named and may contain 
confidential information.
If you are not the intended recipient, please delete it and notify the sender.
Views expressed in this message are those of the individual sender, and are not 
necessarily the views of the NSW Ministry of Health.
___
This email has been scanned for the NSW Ministry of Health by the Websense 
Hosted Email Security System.
Emails and attachments are monitored to ensure compliance with the NSW Ministry 
of Health's Electronic Messaging Policy.
___
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
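
A generic illustration of the two mechanical changes described in point 2 of
the patch (this is just the pattern, not the writeForeignSAS code itself):

```r
x <- c("a", NA, "bc")

## any(is.na(.)) -> anyNA(.): same answer without building the intermediate logical vector
anyNA(x)                                               # TRUE

## sapply(.) -> vapply(., FUN, template): the result type and length are checked
vapply(x, function(s) if (is.na(s)) "" else s, character(1L))
```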


Re: [Rd] Patch for bug 17256 'possible bug in writeForeignSAS in the foreign library when string is NA'

2018-05-17 Thread NELSON, Michael
Hi Martin,

Thanks - I will continue in Bugzilla.

Michael

-Original Message-
From: Martin Maechler [mailto:maech...@stat.math.ethz.ch] 
Sent: Thursday, 17 May 2018 11:34 PM
To: NELSON, Michael
Cc: r-devel@r-project.org
Subject: Re: [Rd] Patch for bug 17256 'possible bug in writeForeignSAS in the 
foreign library when string is NA'

>>>>> NELSON, Michael 
>>>>> on Thu, 17 May 2018 11:53:27 + writes:

> Attached is a patch to fix
> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17256
> 'possible bug in writeForeignSAS in the foreign library
> when string is NA'



> The patch 1. fixing the case where there were NA within a
> character column, and the case where all values are
> strings of length 0 2. general replacement of calls to
> `sapply` with `vapply` (and replacing any(is.na()) with
> anyNA.



> Happy to add in bugzilla (but don't have an account there)

The patch attachment did not make it through the (antispam / antivirus /... ) 
filters:
Such attachments should have MIME type text/plain.

I have created a bugzilla account for you(r e-mail) and you should've gotten an 
auto-email with info.

Thank you in advance for helping with this.

In this special case, we'd also be happy for other users testing the problem 
and the fix, as access to SAS may have become difficult for most R core members.

Martin Maechler
ETH Zurich
__
This email has been scanned for the NSW Ministry of Health by the Websense 
Hosted Email Security System.
Emails and attachments are monitored to ensure compliance with the NSW Ministry 
of health's Electronic Messaging Policy.
__
___
Disclaimer: This message is intended for the addressee named and may contain 
confidential information.
If you are not the intended recipient, please delete it and notify the sender.
Views expressed in this message are those of the individual sender, and are not 
necessarily the views of the NSW Ministry of Health.
___
This email has been scanned for the NSW Ministry of Health by the Websense 
Hosted Email Security System.
Emails and attachments are monitored to ensure compliance with the NSW Ministry 
of Health's Electronic Messaging Policy.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Error message truncation

2018-05-18 Thread Michael Chirico
Help pages for stop/warning reference the option "warning.length", e.g.
from ?stop:

> Errors will be truncated to getOption("warning.length") characters, default
> 1000.


Essentially the same is in ?warning.

Neither of these mention the hard-coded limits on the acceptable values of
this option in options.c
<https://github.com/wch/r-source/blob/a7356bf91b511287aacd3a992abfbcb75b60d93c/src/main/options.c#L546-L552>
:

if (streql(CHAR(namei), "warning.length")) {
    int k = asInteger(argi);
    if (k < 100 || k > 8170)
        error(_("invalid value for '%s'"), CHAR(namei));
    R_WarnLength = k;
    SET_VECTOR_ELT(value, i, SetOption(tag, argi));
}

Further, it appears there's a physical limit on the length of the error
message itself which is only slightly larger than 8170:

set.seed(1023)
NN = 1L
str = paste(sample(letters, NN, TRUE), collapse = '')
# should of course be 1
tryCatch(stop(str), error = function(e) nchar(e$message))
# [1] 8190

My questions are:


   - Can we add some information to the help pages indicating valid values
   of options('warning.length')?
   - Is there any way to increase the limit on error message length? I
   understand having such a limit is safer than potentially crashing a system
   that wants to print a massive error string.

This came up in relation to this SO Q&A:

https://stackoverflow.com/a/50387968/3576984

The user is submitting a database query; the error message will first
reproduce the entirety of the query and then give some diagnostic
information. Queries can get quite long, so it stands to reason that this
8190-length limit might be binding.

Thanks,
Michael Chirico

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
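
For reference, a small sketch of the two limits discussed above (the option's
accepted range and the hard cap on the message itself); the numbers come from
the post rather than from independent verification.

```r
## warning.length only accepts values in [100, 8170]:
try(options(warning.length = 10000))    # rejected: invalid value for 'warning.length'
old <- options(warning.length = 8170)   # the effective ceiling

## The condition message itself is cut off only slightly above 8170, whatever the option:
long <- paste(rep("x", 2e4), collapse = "")
tryCatch(stop(long), error = function(e) nchar(conditionMessage(e)))

options(old)
```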


Re: [Rd] Creating S3 methods for S4 classes (coming from r-package-devel)

2018-05-24 Thread Michael Lawrence
You only have to make an S4 method if there is already an S4 generic.
If there is just an S3 generic, then just define S3 methods on it. I
think we should stay away from defining S4 generics when there is no
good reason for them. Good reasons include multiple dispatch, or a
non-default signature. Neither of those apply in this case.

Michael

On Thu, May 24, 2018 at 6:39 AM, Joris Meys  wrote:
>  Dear all,
>
> I asked this question on r-package-devel but Martin Maechler pointed out
> this was more suited on R-devel. So here it goes:
>
> per the manual, one should create and register both the S3 and a S4 method
> if one needs a method for an S4 class for a function using S3 dispatching.
> This is cumbersome, and not very optimal.
>
> I was wondering if there's a better way to do this. Currently I recreate a
> generic in my package and create a default method that sends all the other
> classes to the S3 generic, eg:
>
> setGeneric("predict")
> setMethod("predict", "ANY", stats::predict)
>
> I'm not sure whether this has any adverse consequences, as it is not the
> recommended approach.
>
> It would be great if these generics could be made available through stats4.
> If this would be the preferred route, I volunteer to create the patch for
> that.
>
> Any thoughts?
> Cheers
> Joris
>
> (Original mail on r-package-devel :
> https://stat.ethz.ch/pipermail/r-package-devel/2018q2/002757.html )
>
>
> --
> Joris Meys
> Statistical consultant
>
> Department of Data Analysis and Mathematical Modelling
> Ghent University
> Coupure Links 653, B-9000 Gent (Belgium)
> <https://maps.google.com/?q=Coupure+links+653,%C2%A0B-9000+Gent,%C2%A0Belgium&entry=gmail&source=g>
>
> ---
> Biowiskundedagen 2017-2018
> http://www.biowiskundedagen.ugent.be/
>
> ---
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
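
A sketch of the approach recommended above, with hypothetical class and slot
names. The S3 method is found through the S4 object's class attribute, so no
setGeneric("predict") is needed; in a package, the method would also be
registered with S3method(predict, myFit) in the NAMESPACE.

```r
library(methods)

setClass("myFit", slots = c(coef = "numeric"))

predict.myFit <- function(object, newdata, ...) {
  drop(as.matrix(newdata) %*% object@coef)
}

fit <- new("myFit", coef = c(1, 2))
predict(fit, data.frame(a = 1:3, b = 4:6))   # S3 dispatch on the S4 class name
```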


Re: [Rd] Creating S3 methods for S4 classes (coming from r-package-devel)

2018-05-24 Thread Michael Lawrence
On Thu, May 24, 2018 at 10:47 AM, Joris Meys  wrote:
>
>
> On Thu, May 24, 2018 at 6:20 PM, Michael Lawrence
>  wrote:
>>
>> You only have to make an S4 method if there is already an S4 generic.
>> If there is just an S3 generic, then just define S3 methods on it.
>
>
> I was refering to the recommendations in ?Methods_for_S3
> (https://stat.ethz.ch/R-manual/R-devel/library/methods/html/Methods_for_S3.html).
> :
>
> "Two possible mechanisms for implementing a method corresponding to an S4
> class, there are two possibilities are to register it as an S3 method with
> the S4 class name or to define and set an S4 method, which will have the
> side effect of creating an S4 generic version of this function.
>
> For most situations either works, but the recommended approach is to do
> both:"
>
> The reasoning is described there as well, and I have no reason to believe
> that information is not up to date. I can get away with defining an S3
> generic, but this stops being useful when using superclasses for reasons
> mentioned in the documentation.
>

The reason for having an S4 method is that if there is an S4 generic,
an S4 method (potentially on a superclass) will take precedence. But
if there is no S4 generic, then there are no S4 methods.

It is possible for another package to create an S4 generic. In that
case it would be defensive to define another generic locally. If they
use the same implicit generic, then it should be merged with the local
generic, and things should work. But that is assuming a lot. It would
be preferable for no package to define an S4 generic for predict(),
since there is no reason for it, and it only complicates things.

>
>> I
>> think we should stay away from defining S4 generics when there is no
>> good reason for them. Good reasons include multiple dispatch, or a
>> non-default signature. Neither of those apply in this case.
>
>
> I would personally prefer to use dispatching that's tailored to the type of
> class I work with, as that seems more consistent. But I agree we should
> avoid defining generics for the same function in different packages, hence
> my proposal about stats4.
>

Single dispatch should be consistent between S3 and S4. I think we
should keep things simple and just have one generic, the one we
already have.

>
> --
> Joris Meys
> Statistical consultant
>
> Department of Data Analysis and Mathematical Modelling
> Ghent University
> Coupure Links 653, B-9000 Gent (Belgium)
>
> ---
> Biowiskundedagen 2017-2018
> http://www.biowiskundedagen.ugent.be/
>
> ---
> Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] encoding argument of source() in 3.5.0

2018-06-04 Thread NELSON, Michael



On R 3.5.0 (Mac) 

The issue appears when using the default (libcurl) method and specifying the
encoding.

Note that using method='internal' causes a segfault if used in conjunction with
encoding (and works when the encoding is not set).

urlR <- "http://home.versanet.de/~s-berman/source2.R";
# works 
url_default <- url(urlR)
scan(url_default, "")
# Read 7 items
# [1] "source.test2"   "<-" "function()" "{"
  "print(\"Non-ascii:" "äöüß\")"   
# [7] "}" 

url_default_en <- url(urlR, encoding = "UTF-8")
scan(url_default_en, "")
# Read 0 items
# character(0)
url_internal <- url(urlR, method = 'internal')
scan(url_internal, "")
# Read 7 items
# [1] "source.test2"   "<-" "function()" "{"
  "print(\"Non-ascii:" "äöüß\")"   
# [7] "}" 

url_internal_en <- url(urlR, encoding = "UTF-8", method = 'internal')
#scan(url_internal_en, "")
#*** caught segfault ***
#  address 0x0, cause 'memory not mapped'

url_libcurl <- url(urlR, method = 'libcurl')
scan(url_libcurl, "")
# Read 7 items
# [1] "source.test2"   "<-" "function()" "{"
  "print(\"Non-ascii:" "äöüß\")"   
# [7] "}" 
url_libcurl_en <- url(urlR, encoding = "UTF-8", method = 'libcurl')
scan(url_libcurl_en, "")
# Read 0 items
# character(0)


Michael


From: R-devel [r-devel-boun...@r-project.org] on behalf of Stephen Berman 
[stephen.ber...@gmx.net]
Sent: Monday, 4 June 2018 7:26 PM
To: Martin Maechler
Cc: R-devel
Subject: Re: [Rd] encoding argument of source() in 3.5.0

On Mon, 4 Jun 2018 10:44:11 +0200 Martin Maechler  
wrote:

>>>>>> peter dalgaard
>>>>>> on Sun, 3 Jun 2018 23:51:24 +0200 writes:
>
> > Looks like this actually comes from readLines(), nothing
> > to do with source() as such: In current R-devel (still):
>
> >> f <- file("http://home.versanet.de/~s-berman/source2.R",
> encoding="UTF-8")
> >> readLines(f)
> > character(0)
> >> close(f)
> >> f <- file("http://home.versanet.de/~s-berman/source2.R")
> >> readLines(f)
> > [1] "source.test2 <- function() {"   "print(\"Non-ascii: äöüß\")"
> > [3] "}"
>
> > -pd
>
> and that's not even readLines(), but rather how exactly the
> connection is defined [even in your example above]
>
>   > urlR <- "http://home.versanet.de/~s-berman/source2.R"
>   > readLines(urlR, encoding="UTF-8")
>   [1] "source.test2 <- function() {"   "print(\"Non-ascii: äöüß\")"
>   [3] "}"
>   > f <- file(urlR, encoding = "UTF-8")
>   > readLines(f)
>   character(0)
>
> and the same behavior with scan()  instead of readLines() :
>
>> scan(urlR,"") # works
> Read 7 items
> [1] "source.test2"   "<-" "function()" "{"

> [5] "print(\"Non-ascii:" "äöüß\")""}"
>> scan(f,"") # fails
> Read 0 items
> character(0)
>>
>
> So it seems as if the bug is in the file() [or url()] C code ..

Yes, the problem seems to be restricted to loading files from a
(non-local) URL; i.e. this works fine on my computer:

  > source("file:///home/steve/prog/R/source2.R", encoding="UTF-8")

Also, I noticed this works too:

  > read.table("http://home.versanet.de/~s-berman/table2", encoding="UTF-8", 
skip=1)

where (if I read the source correctly) using `skip=1' makes read.table()
call readLines().  (The read.table() invocation also works without
`skip'.)

> But then we also have to consider Windows .. where I think most changes have
> happened during the  R-3.4.4 --> R-3.5.0  transition.

Yes, please.  I need (or at least it would be convenient) to be able to
load R code containing non-ascii characters from the web under
MS-Windows.

Steve Berman

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
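
One possible user-level workaround while this is unresolved, sketched under
the assumption of a UTF-8 locale: read the lines without declaring an encoding
on the connection, then declare it on the resulting strings before parsing.

```r
urlR <- "http://home.versanet.de/~s-berman/source2.R"
lines <- readLines(url(urlR))   # no encoding= on the connection
Encoding(lines) <- "UTF-8"      # declare (not convert) the encoding
eval(parse(text = lines))
source.test2()
```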

Re: [Rd] Subsetting the "ROW"s of an object

2018-06-08 Thread Michael Lawrence
There probably should be an abstraction for this. In S4Vectors, we
have extractROWS().

Michael

On Fri, Jun 8, 2018 at 8:45 AM, Hadley Wickham  wrote:
> Hi all,
>
> Is there a better to way to subset the ROWs (in the sense of NROW) of
> an vector, matrix, data frame or array than this?
>
> subset_ROW <- function(x, i) {
>   nd <- length(dim(x))
>   if (nd <= 1L) {
> x[i]
>   } else {
> dims <- rep(list(quote(expr = )), nd - 1L)
> do.call(`[`, c(list(quote(x), quote(i)), dims, list(drop = FALSE)))
>   }
> }
>
> subset_ROW(1:10, 4:6)
> #> [1] 4 5 6
>
> str(subset_ROW(array(1:10, c(10)), 2:4))
> #>  int [1:3(1d)] 2 3 4
> str(subset_ROW(array(1:10, c(10, 1)), 2:4))
> #>  int [1:3, 1] 2 3 4
> str(subset_ROW(array(1:10, c(5, 2)), 2:4))
> #>  int [1:3, 1:2] 2 3 4 7 8 9
> str(subset_ROW(array(1:10, c(10, 1, 1)), 2:4))
> #>  int [1:3, 1, 1] 2 3 4
>
> subset_ROW(data.frame(x = 1:10, y = 10:1), 2:4)
> #>   x y
> #> 2 2 9
> #> 3 3 8
> #> 4 4 7
>
> It seems like there should be a way to do this that doesn't require
> generating a call with missing arguments, but I can't think of it.
>
> Thanks!
>
> Hadley
>
> --
> http://hadley.nz
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
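
A sketch of the S4Vectors abstraction mentioned above, assuming the
Bioconductor S4Vectors package is installed (exact method coverage may differ
between versions):

```r
if (requireNamespace("S4Vectors", quietly = TRUE)) {
  S4Vectors::extractROWS(1:10, 4:6)
  S4Vectors::extractROWS(array(1:10, c(5, 2)), 2:4)
  S4Vectors::extractROWS(data.frame(x = 1:10, y = 10:1), 2:4)
}
```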


Re: [Rd] Subsetting the "ROW"s of an object

2018-06-08 Thread Michael Lawrence
Actually, it's sort of the opposite. Everything becomes a sequence of
integers internally, even when the argument is missing. So the same
amount of work is done, basically. ALTREP will let us improve this
sort of thing.

Michael

On Fri, Jun 8, 2018 at 1:49 PM, Hadley Wickham  wrote:
> Hmmm, yes, there must be some special case in the C code to avoid
> recycling a length-1 logical vector:
>
> dims <- c(4, 4, 4, 1e5)
>
> arr <- array(rnorm(prod(dims)), dims)
> dim(arr)
> #> [1]  4  4  4 10
> i <- c(1, 3)
>
> bench::mark(
>   arr[i, TRUE, TRUE, TRUE],
>   arr[i, , , ]
> )[c("expression", "min", "mean", "max")]
> #> # A tibble: 2 x 4
> #>   expression                    min     mean      max
> #> 
> #> 1 arr[i, TRUE, TRUE, TRUE]   41.8ms   43.6ms   46.5ms
> #> 2 arr[i, , , ]   41.7ms   43.1ms   46.3ms
>
>
> On Fri, Jun 8, 2018 at 12:31 PM, Berry, Charles  wrote:
>>
>>
>>> On Jun 8, 2018, at 11:52 AM, Hadley Wickham  wrote:
>>>
>>> On Fri, Jun 8, 2018 at 11:38 AM, Berry, Charles  wrote:
>>>>
>>>>
>>>>> On Jun 8, 2018, at 10:37 AM, Hervé Pagès  wrote:
>>>>>
>>>>> Also the TRUEs cause problems if some dimensions are 0:
>>>>>
>>>>>> matrix(raw(0), nrow=5, ncol=0)[1:3 , TRUE]
>>>>> Error in matrix(raw(0), nrow = 5, ncol = 0)[1:3, TRUE] :
>>>>>   (subscript) logical subscript too long
>>>>
>>>> OK. But this is easy enough to handle.
>>>>
>>>>>
>>>>> H.
>>>>>
>>>>> On 06/08/2018 10:29 AM, Hadley Wickham wrote:
>>>>>> I suspect this will have suboptimal performance since the TRUEs will
>>>>>> get recycled. (Maybe there is, or could be, ALTREP, support for
>>>>>> recycling)
>>>>>> Hadley
>>>>
>>>>
>>>> AFAICS, it is not an issue. Taking
>>>>
>>>> arr <- array(rnorm(2^22),c(2^10,4,4,4))
>>>>
>>>> as a test case
>>>>
>>>> and using a function that will either use the literal code 
>>>> `x[i, , , , drop=FALSE]' or `eval(mc)':
>>>>
>>>> subset_ROW4 <-
>>>> function(x, i, useLiteral=FALSE)
>>>> {
>>>>     literal <- quote(x[i, , , , drop=FALSE])
>>>>     mc <- quote(x[i])
>>>>     nd <- max(1L, length(dim(x)))
>>>>     mc[seq(4,length=nd-1L)] <- rep(TRUE, nd-1L)
>>>>     mc[["drop"]] <- FALSE
>>>>     if (useLiteral)
>>>>         eval(literal)
>>>>     else
>>>>         eval(mc)
>>>> }
>>>>
>>>> I get identical times with
>>>>
>>>> system.time(for (i in 1:1) 
>>>> subset_ROW4(arr,seq(1,length=10,by=100),TRUE))
>>>>
>>>> and with
>>>>
>>>> system.time(for (i in 1:1) 
>>>> subset_ROW4(arr,seq(1,length=10,by=100),FALSE))
>>>
>>> I think that's because you used a relatively low precision timing
>>> mechnaism, and included the index generation in the timing. I see:
>>>
>>> arr <- array(rnorm(2^22),c(2^10,4,4,4))
>>> i <- seq(1,length = 10, by = 100)
>>>
>>> bench::mark(
>>>  arr[i, TRUE, TRUE, TRUE],
>>>  arr[i, , , ]
>>> )
>>> #> # A tibble: 2 x 1
>>> #>   expression          min    mean  median      max  n_gc
>>> #>
>>> #> 1 arr[i, TRUE,…   7.4µs  10.9µs  10.66µs   1.22ms 2
>>> #> 2 arr[i, , , ]   7.06µs   8.8µs   7.85µs 538.09µs 2
>>>
>>> So not a huge difference, but it's there.
>>
>>
>> Funny. I get similar results to yours above albeit with smaller differences. 
>> Usually < 5 percent.
>>
>> But with subset_ROW4 I see no consistent difference.
>>
>> In this example, it runs faster on average using `eval(mc)' to return the 
>> result:
>>
>>> arr <- array(rnorm(2^22),c(2^10,4,4,4))
>>> i <- seq(1,length=10,by=100)
>>> bench::mark(subset_ROW4(arr,i,FALSE), subset_ROW4(arr,i,TRUE))[,1:8]
>> # A tibble: 2 x 8
>>   expression  min mean   median  max `itr/sec` 
>> mem_alloc  n_gc
>>  
>>  
>> 1 subset_ROW4(arr, i, FALSE)   28.9µs   34.9µs   32.1µs   1.36ms28686.   
>>  5.05KB 5
>> 2 subset_ROW4(arr, i, TRUE)28.9µs 35µs   32.4µs 875.11µs28572.   
>>  5.05KB 5
>>>
>>
>> And on subsequent reps the lead switches back and forth.
>>
>>
>> Chuck
>>
>
>
>
> --
> http://hadley.nz
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] readLines function with R >= 3.5.0

2018-06-12 Thread Michael Lawrence
Hi Jen,

This was already resolved for R 3.5.1 by just disabling buffering on
terminal file connections like stdin.

Sounds like you might want to be running a web service or something
instead though.

Michael

On Tue, Jun 12, 2018 at 4:46 PM, Jennifer Lyon
 wrote:
> Hi:
>
> I have also just stumbled into this bug. Unfortunately, I can not
> change the data my program receives from stdin. My code runs in a
> larger system and stdin is sent to a Docker container running my R
> code. The protocol is I read a line, readLines("stdin", n=1), do some
> actions, send output on stdout, and wait for the next set of data.  I
> don't have control over this protocol, so I can't use the ^D
> workaround.
>
> I am open for other workaround suggestions. The single line is
> actually JSON and can be quite large. If there isn't something else
> cleaner, I am going to try readChar() in a while loop looking for \n
> but I'm guessing that would likely be too slow.  I am open to other
> workaround solutions. For the moment I have reverted back to R 3.4.4.
>
> Thanks for any suggestions.
>
> Jen.
>
>
>>> >>>>> Martin Maechler
>>> >>>>> on Mon, 28 May 2018 10:28:01 +0200 writes:
>>>
>>> >>>>> Ralf Stubner
>>> >>>>> on Fri, 25 May 2018 19:18:58 +0200 writes:
>>>
>>> >> Dear all, I would like to draw you attention to this
>>> >> question on SO:
>>> >>
> https://stackoverflow.com/questions/50372043/readlines-function-with-new-version-of-r
>>>
>>>
>>> >> Based on the OP's code I used the script
>>>
>>> >> ###
>>> >> create_matrix <- function() {
>>> >> cat("Write the numbers of vertices: ")
>>> >> user_input <- readLines("stdin", n=1)
>>> >> user_input <- as.numeric(user_input)
>>> >> print(user_input)
>>> >> }
>>> >> create_matrix()
>>> >> ###
>>>
>>> >> and called it with "R -f " from the command line.
>>>
>>> >> With 'R version 3.4.4 (2018-03-15) -- "Someone to Lean On"' the
> script
>>> >> prints the inputed number as expected. With both 'R version 3.5.0
>>> >> (2018-04-23) -- "Joy in Playing"' and 'R Under development
> (unstable)
>>> >> (2018-05-19 r74746) -- "Unsuffered Consequences"' the script does
> not
>>> >> continue after inputing a number.
>>>
>>> > I can confirm.
>>> > It "works" if you additionally (the [Enter], i.e., EOL) you also
>>> > "send" an EOF -- in Unix alikes via  -D
>>>
>>> > The same happens if you use  'Rscript '
>>>
>>> > I'm not the expert here, but am close to sure that we (R core)
>>> > did not intend this change, when fixing other somewhat subtle
>>> > bugs in Rscript / 'R -f'
>>>
>>> > Martin Maechler
>>>
>>> The same behavior in regular R , no need for a script etc.
>>>
>>> > str(readLines("stdin", n=1))
>>>
>>> then in addition to the input  you need to "give" an EOF (Ctrl D) in R
>>= 3.5.0
>>>
>>> Interestingly, everything works fine if you use  stdin() instead
>>> of "stdin" :
>>>
>>> > rr <- readLines(stdin(), n=1)
>>> foo
>>> > rr
>>> [1] "foo"
>>> >
>>> --
>>>
>>> So, for now use  stdin()  which is much clearer than the string
>>> "stdin" anyway
>>>
>>> Martin Maechler
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] readLines function with R >= 3.5.0

2018-06-13 Thread Michael Lawrence
Are you sure it's not available in patched? It's definitely in the
source since 6/1.

Michael


On Wed, Jun 13, 2018 at 2:19 AM, Martin Maechler
 wrote:
>>>>>> Michael Lawrence
>>>>>> on Tue, 12 Jun 2018 19:27:49 -0700 writes:
>
> > Hi Jen, This was already resolved for R 3.5.1 by just
> > disabling buffering on terminal file connections like stdin.
>
> and before R 3.5.1 exists, *and*
> as the change is also not yet available in R patched (!)
> this means using a version of
> "R-devel", e.g. for Windows available from
>https://cloud.r-project.org/bin/windows/base/rdevel.html
>
> Martin
>
> > Sounds like you might want to be running a web service or
> > something instead though.
>
> > Michael
>
> > On Tue, Jun 12, 2018 at 4:46 PM, Jennifer Lyon
> >  wrote:
> >> Hi:
> >>
> >> I have also just stumbled into this bug. Unfortunately, I
> >> can not change the data my program receives from
> >> stdin. My code runs in a larger system and stdin is sent
> >> to a Docker container running my R code. The protocol is
> >> I read a line, readLines("stdin", n=1), do some actions,
> >> send output on stdout, and wait for the next set of data.
> >> I don't have control over this protocol, so I can't use
> >> the ^D workaround.
> >>
> >> I am open for other workaround suggestions. The single
> >> line is actually JSON and can be quite large. If there
> >> isn't something else cleaner, I am going to try
> >> readChar() in a while loop looking for \n but I'm
> >> guessing that would likely be too slow.  I am open to
> >> other workaround solutions. For the moment I have
> >> reverted back to R 3.4.4.
> >>
> >> Thanks for any suggestions.
> >>
> >> Jen.
> >>
> >>
> >>>> >>>>> Martin Maechler >>>>> on Mon, 28 May 2018
> >>>> 10:28:01 +0200 writes:
> >>>>
> >>>> >>>>> Ralf Stubner >>>>> on Fri, 25 May 2018 19:18:58
> >>>> +0200 writes:
> >>>>
> >>>> >> Dear all, I would like to draw you attention to this
> >>>> >> question on SO:
> >>>> >>
> >> 
> https://stackoverflow.com/questions/50372043/readlines-function-with-new-version-of-r
> >>>>
> >>>>
> >>>> >> Based on the OP's code I used the script
> >>>>
> >>>> >> ###
> >>>> >> create_matrix <- function() { >> cat("Write the
> >>>> numbers of vertices: ") >> user_input <-
> >>>> readLines("stdin", n=1) >> user_input <-
> >>>> as.numeric(user_input) >> print(user_input) >> } >>
> >>>> create_matrix()
> >>>> >> ###
> >>>>
> >>>> >> and called it with "R -f " from the
> >>>> command line.
> >>>>
> >>>> >> With 'R version 3.4.4 (2018-03-15) -- "Someone to
> >>>> Lean On"' the
> >> script
> >>>> >> prints the inputed number as expected. With both 'R
> >>>> version 3.5.0 >> (2018-04-23) -- "Joy in Playing"' and
> >>>> 'R Under development
> >> (unstable)
> >>>> >> (2018-05-19 r74746) -- "Unsuffered Consequences"'
> >>>> the script does
> >> not
> >>>> >> continue after inputing a number.
> >>>>
> >>>> > I can confirm.  > It "works" if you additionally (the
> >>>> [Enter], i.e., EOL) you also > "send" an EOF -- in Unix
> >>>> alikes via -D
> >>>>
> >>>> > The same happens if you use 'Rscript '
> >>>>
> >>>> > I'm not the expert here, but am close to sure that we
> >>>> (R core) > did not intend this change, when fixing
> >>>> other somewhat subtle > bugs in Rscript / 'R -f'
> >>>>
> >>>> > Martin Maechler
> >>>>
> >>>> The same behavior in regular R , no need for a script
> >>>> etc.
> >>>>
> >>>> > str(readLines("stdin", n=1))
> >>>>
> >>>> then in addition to the input you need to "give" an EOF
> >>>> (Ctrl D) in R
> >>> = 3.5.0
> >>>>
> >>>> Interestingly, everything works fine if you use stdin()
> >>>> instead of "stdin" :
> >>>>
> >>>> > rr <- readLines(stdin(), n=1) foo > rr [1] "foo"
> >>>> >
> >>>> --
> >>>>
> >>>> So, for now use stdin() which is much clearer than the
> >>>> string "stdin" anyway
> >>>>
> >>>> Martin Maechler
> >>
> >> [[alternative HTML version deleted]]
> >>
> >> __
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] readLines function with R >= 3.5.0

2018-06-19 Thread Michael Lawrence
Hi Jen,

Please provide a reproducible example, since the original stack
overflow example works in both trunk and patched.

Thanks,
Michael

On Tue, Jun 19, 2018 at 3:45 PM, Jennifer Lyon
 wrote:
> Hi Michael:
>
> I can confirm Martin's comment. I tested my software with r-devel (r74914)
> and it works, while with r-patched (r74914) it does not work (it hangs, as
> it did in R 3.5.0). I apologize for it taking so long for me to test this,
> but is there any chance this fix could make into R 3.5.1?
>
> Thanks.
>
> Jen.
>
> On Wed, Jun 13, 2018 at 6:24 AM, Michael Lawrence
>  wrote:
>>
>> Are you sure it's not available in patched? It's definitely in the
>> source since 6/1.
>>
>> Michael
>>
>>
>> On Wed, Jun 13, 2018 at 2:19 AM, Martin Maechler
>>  wrote:
>> >>>>>> Michael Lawrence
>> >>>>>> on Tue, 12 Jun 2018 19:27:49 -0700 writes:
>> >
>> > > Hi Jen, This was already resolved for R 3.5.1 by just
>> > > disabling buffering on terminal file connections like stdin.
>> >
>> > and before R 3.5.1 exists, *and*
>> > as the change is also not yet available in R patched (!)
>> > this means using a version of
>> > "R-devel", e.g. for Windows available from
>> >https://cloud.r-project.org/bin/windows/base/rdevel.html
>> >
>> > Martin
>> >
>> > > Sounds like you might want to be running a web service or
>> > > something instead though.
>> >
>> > > Michael
>> >
>> > > On Tue, Jun 12, 2018 at 4:46 PM, Jennifer Lyon
>> > >  wrote:
>> > >> Hi:
>> > >>
>> > >> I have also just stumbled into this bug. Unfortunately, I
>> > >> can not change the data my program receives from
>> > >> stdin. My code runs in a larger system and stdin is sent
>> > >> to a Docker container running my R code. The protocol is
>> > >> I read a line, readLines("stdin", n=1), do some actions,
>> > >> send output on stdout, and wait for the next set of data.
>> > >> I don't have control over this protocol, so I can't use
>> > >> the ^D workaround.
>> > >>
>> > >> I am open for other workaround suggestions. The single
>> > >> line is actually JSON and can be quite large. If there
>> > >> isn't something else cleaner, I am going to try
>> > >> readChar() in a while loop looking for \n but I'm
>> > >> guessing that would likely be too slow.  I am open to
>> > >> other workaround solutions. For the moment I have
>> > >> reverted back to R 3.4.4.
>> > >>
>> > >> Thanks for any suggestions.
>> > >>
>> > >> Jen.
>> > >>
>> > >>
>> > >>>> >>>>> Martin Maechler >>>>> on Mon, 28 May 2018
>> > >>>> 10:28:01 +0200 writes:
>> > >>>>
>> > >>>> >>>>> Ralf Stubner >>>>> on Fri, 25 May 2018 19:18:58
>> > >>>> +0200 writes:
>> > >>>>
>> > >>>> >> Dear all, I would like to draw you attention to this
>> > >>>> >> question on SO:
>> > >>>> >>
>> > >>
>> > https://stackoverflow.com/questions/50372043/readlines-function-with-new-version-of-r
>> > >>>>
>> > >>>>
>> > >>>> >> Based on the OP's code I used the script
>> > >>>>
>> > >>>> >> ###
>> > >>>> >> create_matrix <- function() {
>> > >>>> >>   cat("Write the numbers of vertices: ")
>> > >>>> >>   user_input <- readLines("stdin", n=1)
>> > >>>> >>   user_input <- as.numeric(user_input)
>> > >>>> >>   print(user_input)
>> > >>>> >> }
>> > >>>> >> create_matrix()
>> > >>>> >> ###
>> > >>>>
>> > >>>> >> and called it with "R -f " from the
>> &g

Re: [Rd] list of methods

2018-06-26 Thread Michael Lawrence
While it's easy to conceive of a utility that found all generics for
which there is no non-default method for a given class vector, it's
not clear it would be useful, because it depends on the nature of the
object. Surv objects are vector-like, so they need to implement the
"vector API", which is not formally defined. You could look at the
S4Vectors package or the date/time classes for reference. But Surv
gets a lot less for free since length() returns twice their logical
length, an unfortunate inconsistency.

Michael
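
One rough way to get at the question quoted below, as a sketch only (it
assumes the survival package is attached, and the "wanted" list is just an
example of a hand-picked vector API, not an official one):

library(survival)
## method names registered for class Surv, e.g. "print.Surv", "[.Surv", ...
have   <- sub("\\.Surv$", "", rownames(attr(utils::methods(class = "Surv"), "info")))
wanted <- c("print", "format", "summary", "head", "tail", "rev",
            "unique", "length", "c", "[", "as.character")
setdiff(wanted, have)   # generics in the wish list with no Surv-specific method yet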



On Tue, Jun 26, 2018 at 11:24 AM, Therneau, Terry M., Ph.D. via
R-devel  wrote:
> I recently got a request to add head() and tail() methods for Surv objects, 
> which is quite
> reasonable, but not unlike other requests for logLik,  vcov, extractAIC, ...  
>  What they
> all have in common is that they are methods added since the creation of the survival
> package, and
> that I didn't know they existed.
>
> To try and get ahead of the curve, is there a way to list names of all of the 
> default
> methods?   There are functions to get all the instances of a method by name, 
> e.g.
> methods("extractAIC") or find all the methods already implemented for a 
> class, but I don't
> see something that gives me a list of the ones that I haven't created yet.
>
> Terry T.
>
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R History: Why is there no importFrom() function?

2018-07-22 Thread Michael Lawrence
I can't speak to the history per se, but I can give an opinion on the
current situation. R is a programming language, as is Python, but R is
also a system for interactive data analysis. Outside of the
software/package context, library() is almost always sufficient. When
it is not, consider "foo <- package::foo". While not as well
structured as an importFrom() call, it does make the side effect
explicit. When that becomes onerous, it is probably past time to
transition to a package.

Michael
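
To make the "foo <- package::foo" idiom concrete (the helpers named below
are only examples):

file_ext <- tools::file_ext   # explicit, single-symbol "import"; the side effect
md5sum   <- tools::md5sum     # is visible at the top of the script, not hidden in library()
file_ext("analysis.Rmd")      # "Rmd"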

On Sat, Jul 21, 2018 at 7:54 PM, Aaron Jacobs  wrote:
> Excuse me if this is inappropriate content for this list, but I thought it
> might be the best place -- and the best audience -- to ask about a design
> decision for the R language.
>
> Programs or analyses written in R typically use library() to pull in
> functions from non-core packages. This differs markedly from most
> languages*, which usually offer some way to selectively import symbols. For
> example, in Python you'd see "from random import randint", and so on.
>
> Within R packages, the NAMESPACE file provides this exact functionality
> with the importFrom() directive, but the R language itself does not expose
> this as a function for regular users.
>
> I know that R did not have namespaces for some of its early existence, but
> I'm curious as to why the language never acquired an import() or
> importFrom() replacement for library() when it did get them. Was it purely
> for compatibility with S and earlier R versions? Or was there a principled
> difference of opinion on how R code should be written at stake?
>
> Any insight from those of you familiar with R's history would be deeply
> appreciated.
>
> Regards,
> Aaron
>
> ps. I am aware of the very clever "import" package, which provides exactly
> this feature -- I am more wondering why such an approach was never adopted
> by the language itself.
>
> * Most languages with real modules/namespaces, I mean.
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] WishList: Remove Generic Arguments

2018-08-09 Thread Michael Lawrence
A generic function is not simply a way to name two functions (methods)
the same. It has a particular purpose, and the argument names are
aligned with and convey that purpose. The methods only implement
polymorphism; they don't change the purpose. Changing the purpose
would make code unreadable.

Michael
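
A sketch of the usual compromise: keep the generic's argument name in the
method signature, so R CMD check is satisfied, and rename inside the body if
another name reads better (the body below is purely illustrative):

print.myfunction <- function(x, ...) {
  f <- x   # use the preferred name inside the body
  cat("a 'myfunction' object with", length(formals(f)), "formal argument(s)\n")
  invisible(x)
}
m <- structure(function(a, b) a + b, class = "myfunction")
print(m)   # dispatches with no check NOTE about mismatched argument names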

On Thu, Aug 9, 2018 at 2:45 PM, Abs Spurdle  wrote:
>  I apologize if this issue has been raised before.
>
> I really like object oriented S3 programming.
> However, there's one feature of object oriented S3 programming that I don't
> like.
> Generic functions can have arguments other than dots.
>
> Let's say you have an R package with something like:
>
> print.myfunction <- function (f, ...)
> {   dosomething (f, ...)
> }
>
> Noting that I use function objects a lot.
>
> R CMD check will generate a warning because you've named your object f
> rather than x.
>
> I don't want to name my object x.
> I want to name my object f.
> Naming the object x makes the program unreadable.
> Especially if f contains an attribute or an argument named x.
>
> There's a work around.
> You can redefine the print function, using something like:
>
> print = function (...) base::print (...)
>
> However, you have to export and document the function.
>
> I think that it would be better if generic functions didn't have any
> arguments except for dots.
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Bug 17432 in readLines with R >= 3.5.0 still a problem

2018-09-13 Thread Michael Lawrence
Thanks, I responded to this on bugzilla.
On Wed, Sep 12, 2018 at 9:04 AM Chris Culnane
 wrote:
>
> Bug 17432 (https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17432) is 
> still a problem when using pipes for IPC.
>
> The bug is evident when calling R from another process and trying to 
> communicate via StdIn. R will buffer the input and not read lines until the 
> buffer is exceeded or StdIn is closed by the sending process. This prevents 
> interactive communication between a calling process and a child R process.
>
> From a quick look at the source code, it looks like the bug is caused by only 
> disabling buffering when isatty() returns true for a file descriptor 
> (connections.c). This fixes the original bug when the script is run in a 
> terminal, but doesn't help for pipes, which will return false for isatty().
>
> An example R script and python script are provided to demonstrate the problem:
>
> R script (example.r):
> 
> f <- file("stdin")
> open(f)
> while(length(line <- readLines(f,n=1)) > 0) {
>   write(line, stderr())
> }
>
> Python3 script:
> 
> import sys, os, subprocess
> process = subprocess.Popen(['Rscript', 'example.r'], stdin=subprocess.PIPE, 
> stdout=subprocess.PIPE)
> for line in sys.stdin:
> process.stdin.write((line + '\n').encode('utf-8'))
> process.stdin.flush()
>
>
> Expected Behaviour:
> Run python script, each line entered is echoed back immediately by the R 
> script - which is what happens on 3.4.4
>
> Observed Behaviour on >=3.5.0 (including devel):
> The R script does not process lines as they are sent, it only receives them 
> when StdIn is closed.
>
>
> Best Regards
>
> Chris
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Bug 17432 in readLines with R >= 3.5.0 still a problem

2018-09-14 Thread Michael Lawrence
The actual bug corresponding to this thread is:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17470
On Fri, Sep 14, 2018 at 9:22 AM Jennifer Lyon  wrote:
>
> Michael:
>
> I don't see any comments on Bug 17432 
> (https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17432) later than June 
> 1, 2018. Would you please supply a link pointing to the followup to this 
> discussion on bugzilla?
>
> Thanks.
>
> Jen.
>
> > On Thu Sep 13 14:14:46 CEST 2018 Michael Lawrence wrote:
> >
> > Thanks, I responded to this on bugzilla.
> > On Wed, Sep 12, 2018 at 9:04 AM Chris Culnane
> >  wrote:
> > >
> > > Bug 17432 (https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=17432) is 
> > > still a problem when using pipes for IPC.
> > >
> > > The bug is evident when calling R from another process and trying to 
> > > communicate via StdIn. R will buffer the input and not read lines until 
> > > the buffer is exceeded or StdIn is closed by the sending process. This 
> > > prevents interactive communication between a calling process and a child 
> > > R process.
> > >
> > > From a quick look at the source code, it looks like the bug is caused by 
> > > only disabling buffering when isatty() returns true for a file descriptor 
> > > (connections.c). This fixes the original bug when the script is run in a 
> > > terminal, but doesn't help for pipes, which will return false for 
> > > isatty().
> > >
> > > An example R script and python script are provided to demonstrate the 
> > > problem:
> > >
> > > R script (example.r):
> > > 
> > > f <- file("stdin")
> > > open(f)
> > > while(length(line <- readLines(f,n=1)) > 0) {
> > >   write(line, stderr())
> > > }
> > >
> > > Python3 script:
> > > 
> > > import sys, os, subprocess
> > > process = subprocess.Popen(['Rscript', 'example.r'], 
> > > stdin=subprocess.PIPE, stdout=subprocess.PIPE)
> > > for line in sys.stdin:
> > > process.stdin.write((line + '\n').encode('utf-8'))
> > > process.stdin.flush()
> > >
> > >
> > > Expected Behaviour:
> > > Run python script, each line entered is echoed back immediately by the R 
> > > script - which is what happens on 3.4.4
> > >
> > > Observed Behaviour on >=3.5.0 (including devel):
> > > The R script does not process lines as they are sent, it only receives 
> > > them when StdIn is closed.
> > >
> > >
> > > Best Regards
> > >
> > > Chris
>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] named arguments discouraged in `[.data.frame` and `[<-.data.frame`

2018-11-28 Thread Michael Lawrence
Whenever they are calling a primitive, because primitives match
arguments positionally. Of course, then you need to introduce the
concept of a primitive.

You could also make an argument from the code clarity perspective, as
typically primitives have simple interfaces and/or are used frequently
enough that naming arguments just introduces clutter. That probably
requires experience though.

Michael
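
A quick illustration of the positional matching (the objects here are just
examples):

x <- 1:5
x[i = 3]          # same as x[3]; the primitive does not need or check the name
m <- matrix(1:9, 3)
m[1, j = 2]       # matched positionally as m[1, 2], i.e. 4
d <- as.data.frame(m)
d[i = 1, j = 2]   # `[.data.frame` warns that named arguments other than 'drop' are discouraged
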
On Wed, Nov 28, 2018 at 11:30 AM Henrik Pärn  wrote:
>
> tl;dr:
>
> Why are named arguments discouraged in `[.data.frame`, `[<-.data.frame` and 
> `[[.data.frame`?
>
> (because this question is of the kind 'why is R designed like this?', I 
> though R-devel would be more appropriate than R-help)
>
> #
>
> Background:
>
> Now and then students present their fancy functions like this:
>
> myfancyfun(d,12,0.3,0.2,500,1000,FALSE,TRUE,FALSE,TRUE,FALSE)
>
> Incomprehensible. Thus, I encourage them to use spaces and name arguments, 
> _at least_ when trying to communicate their code with others. Something like:
>
> myfancyfun(data = d, n = 12, gamma = 0.3, prob = 0.2,
>   size = 500, niter = 1000, model = FALSE,
>  scale = TRUE, drop = FALSE, plot = TRUE, save = FALSE)
>
>
> Then some overzealous students started to use named arguments everywhere. 
> E-v-e-r-y-w-h-e-r-e. Even in the most basic situation when indexing vectors 
> (as a subtle protest?), like:
>
> vec <- 1:9
>
> vec[i = 4]
> `[`(x = vec, i = 4)
>
> vec[[i = 4]]
> `[[`(x = vec, i = 4)
>
> vec[i = 4] <- 10
> `[<-`(x = vec, i = 4, value = 10)
>
> ...or when indexing matrices:
>
> m <- matrix(vec, ncol = 3)
> m[i = 2, j = 2]
> `[`(x = m, i = 2, j = 2)
> # 5
>
> m[i = 2, j = 2] <- 0
> `[<-`(x = m, i = 2, j = 2, value = 0)
>
> ##
>
> This practice indeed feels like overkill, but it didn't seem to hurt either. 
> Until they used it on data frames. Then suddenly warnings appeared that named 
> arguments are discouraged:
>
> d <- data.frame(m)
>
> d[[i = "X2"]]
> # [1] 4 5 6
> # Warning message:
> # In `[[.data.frame`(d, i = "X2") :
> #  named arguments other than 'exact' are discouraged
>
> d[i = 2, j = 2]
> # [1] 0
> # Warning message:
> # In `[.data.frame`(d, i = 2, j = 2) :
> #  named arguments other than 'drop' are discouraged
>
> d[i = 2, j = 2] <- 5
> # Warning message:
> # In `[<-.data.frame`(`*tmp*`, i = 2, j = 2, value = 5) :
> #  named arguments are discouraged
>
>
> ##
>
> Of course I could tell them "don't do it, it's overkill and not common 
> practice" or "it's just a warning, don't worry". However, I assume the 
> warnings are there for a good reason.
>
> So how do I explain to the students that named arguments are actively 
> discouraged in `[.data.frame` and `[<-.data.frame`, but not in `[` and `[<-`? 
> When will they get bitten?
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Unexpected argument-matching when some are missing

2018-11-30 Thread Michael Lawrence
Argument matching is by name first, then the still missing arguments
are filled positionally. Unnamed missing arguments are thus left
missing. Does that help?

Michael
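
A small example of that order (names first, then positions):

f <- function(x, y, type = "p") {
  cat("missing(y):", missing(y), "| type:", type, "\n")
}
f(1:10, type = "l")  # 'type' is matched by name; 1:10 then fills 'x'; 'y' stays missing
f(1:10, , "l")       # the unnamed empty slot leaves 'y' missing and "l" lands in 'type'
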
On Fri, Nov 30, 2018 at 8:18 AM Emil Bode  wrote:
>
> But the main point is where arguments are mixed together:
>
> > debugonce(plot.default)
> > plot(x=1:10, y=, 'l')
> ...
> Browse[2]> missing(y)
> [1] FALSE
> Browse[2]> y
> [1] "l"
> Browse[2]> type
> [1] "p"
>
> I think that's what I trip over mostly: that named, empty arguments behave
> entirely differently from omitting them (", ,")
>
> And I definitely agree we need a guru to explain it all to us (
>
> Cheers, Emil Bode
>
>
> On 30/11/2018, 15:35, "S Ellison"  wrote:
>
> > Yes, I think all of that is correct. But y _is_ missing in this sense:
> > > plot(1:10, y=)
> > > ...
> > Browse[2]> missing(y)
>
> Although I said what I meant by 'missing' vs 'not present', it wasn't 
> exactly what missing() means. My bad.
> missing() returns TRUE if an argument is not specified in the call 
> _whether or not_ it has a default, hence the behaviour of missing(y) in 
> debug(plot).
>
> But we can easily find out whether a default has been assigned:
> plot(1:10, y=, type=)
> Browse[2]> y
> NULL
> Browse[2]> type
> "p"
>
> ... which is consistent with silent omission of 'y=' and 'type='
>
>
> Still waiting for a guru...
>
> Steve E
>
>
>
> ***
> This email and any attachments are confidential. Any use, copying or
> disclosure other than by the intended recipient is unauthorised. If
> you have received this message in error, please notify the sender
> immediately via +44(0)20 8943 7000 or notify postmas...@lgcgroup.com
> and delete this message and any copies from your computer and network.
> LGC Limited. Registered in England 2991879.
> Registered office: Queens Road, Teddington, Middlesex, TW11 0LY, UK
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Dead link in documentation of ?timezones

2018-12-07 Thread Michael Chirico
This link is referenced in ?timezones and appears to have been
moved/removed. Is there a replacement?

http://www.twinsun.com/tz/tz-link.htm

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Dead link in documentation of ?timezones

2018-12-07 Thread Michael Chirico
Indeed! Sorry, I need more sleep, should have known better. Thanks!

On Fri, Dec 7, 2018 at 6:22 PM Martin Maechler 
wrote:

> >>>>> Michael Chirico
> >>>>> on Fri, 7 Dec 2018 10:36:37 +0800 writes:
>
> > This link is referenced in ?timezones and appears to have been
> > moved/removed. Is there a replacement?
>
> > http://www.twinsun.com/tz/tz-link.htm
>
> Yes, already in the sources (*) of R at
>
>https://svn.r-project.org/R/trunk/src/library/base/man/timezones.Rd
>
> We (Kurt \in {R-core}) do regularly (but not daily!) check all our URLs
> --- as they are also checked for all CRAN packages -- and so
> found and fixed the problems there.
>
> So, (in the future) you can look into the development sources to
> see if a URL problem has already been addressed.
>
> Still, of course "thank you!"  for noticing and caring about it!
>
> Best,
> Martin
>
>
> --
> *) the only official source, everything else is a mirror
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] strtoi output of empty string inconsistent across platforms

2019-01-10 Thread Michael Chirico
Identified as root cause of a bug in data.table:

https://github.com/Rdatatable/data.table/issues/3267

On my machine, strtoi("", base = 2L) produces NA_integer_ (which seems
consistent with ?strtoi: "Values which cannot be interpreted as integers or
would overflow are returned as NA_integer_").

But on all the other machines I've seen, 0L is returned. This seems to be
consistent with the output of a simple C program using the underlying
strtol function (see data.table link for this program, and for full
sessionInfo() of some environments with differing output).

So, what is the correct output of strtoi("", base = 2L)? Is the
cross-platform inconsistency to be expected/documentable?

Michael Chirico
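
In the meantime, a defensive wrapper sidesteps the platform difference (a
sketch only, not a proposed fix for R itself):

strtoi2 <- function(x, base = 10L) {
  out <- strtoi(x, base = base)
  out[!is.na(x) & !nzchar(x)] <- NA_integer_   # treat "" the same way everywhere
  out
}
strtoi2("", base = 2L)    # NA_integer_ on every platform
strtoi2("10", base = 2L)  # 2L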

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] strtoi output of empty string inconsistent across platforms

2019-01-12 Thread Michael Chirico
Thanks Martin.

For what it's worth, this extremely representative, highly scientific
Twitter poll suggests the Mac/Linux split is pretty stark (NA on Mac, 0 on
Linux)

https://twitter.com/michael_chirico/status/1083649190117306369?s=17

On Sat, Jan 12, 2019, 2:00 AM Martin Maechler  >>>>> Martin Maechler
> >>>>> on Fri, 11 Jan 2019 09:44:14 +0100 writes:
>
> >>>>> Michael Chirico
> >>>>> on Fri, 11 Jan 2019 14:36:17 +0800 writes:
>
> >> Identified as root cause of a bug in data.table:
> >> https://github.com/Rdatatable/data.table/issues/3267
>
> >> On my machine, strtoi("", base = 2L) produces NA_integer_
> >> (which seems consistent with ?strtoi: "Values which
> >> cannot be interpreted as integers or would overflow are
> >> returned as NA_integer_").
>
> > indeed consistent with R's documentation on strtoi().
> > What machine would that be?
>
> >> But on all the other machines I've seen, 0L is
> >> returned. This seems to be consistent with the output of
> >> a simple C program using the underlying strtol function
> >> (see data.table link for this program, and for full
> >> sessionInfo() of some environments with differing
> >> output).
>
> >> So, what is the correct output of strtoi("", base = 2L)?
>
> >> Is the cross-platform inconsistency to be
> >> expected/documentable?
>
> > The inconsistency is certainly undesirable.  The relevant
> > utility function in R's source (/src/main/character.c)
> > is
>
> > static int strtoi(SEXP s, int base) { long int res; char
> > *endp;
>
> > /* strtol might return extreme values on error */
> > errno = 0;
>
> > if(s == NA_STRING) return(NA_INTEGER); res =
> > strtol(CHAR(s), &endp, base); /* ASCII */ if(errno ||
> > *endp != '\0') res = NA_INTEGER; if(res > INT_MAX || res <
> > INT_MIN) res = NA_INTEGER; return (int) res; }
>
> > and so it clearly is a platform-inconsistency in the
> > underlying C library's strtol().
>
> (corrected typos here: )
>
> > I think we should make this cross-platform consistent ...
> > and indeed it makes much sense to ensure the result of
>
> > strtoi("", base=2L)to become   NA_integer_
>
> > but chances are that would break code that has relied on
> > the current behavior {on "all but your computer" ;-)} ?
>
> I still think that such a change should be done.
>
> 'make check all' on the R source (+ Recommended packages) seems
> not to signal any error or warning with such a change, so I plan
> to commit that change to "the trunk" / "R-devel" soon, unless
> concerns are raised highly (and quickly enough).
>
> Martin
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] \dots used improperly in ?Rprof examples

2019-05-25 Thread Michael Chirico
\dots is used in the Usage section of the Rprof manual, but it's not
rendered as ...

I'm not sure if this should be \ldots, or just written manually with ...

Also, I think the Rprof() on the first line is intended to be on the second
line? So that the flow looks like

Rprof() # start profiling
## some code to be profiled
Rprof(NULL) # shut off profiling
## some code NOT to be profiled
Rprof(append = TRUE) # turn profiling back on and append output to current file
## some code to be profiled
Rprof(NULL)
## ... et cetera
## Now post-process the output as described in Details

As it is the first line looks like it's commented out

Michael Chirico

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] rbind has confusing result for custom sub-class (possible bug?)

2019-05-25 Thread Michael Chirico
Debugging this issue:

https://github.com/Rdatatable/data.table/issues/2008

We have custom class 'IDate' which inherits from 'Date' (it just forces
integer storage for efficiency, hence, I).

The concatenation done by rbind, however, breaks this and returns a double:

library(data.table)
DF = data.frame(date = as.IDate(Sys.Date()))
storage.mode(rbind(DF, DF)$date)
# [1] "double"

This is specific to base::rbind (data.table's rbind returns an integer as
expected); in ?rbind we see:

The method dispatching is not done via UseMethod(), but by C-internal
dispatching. Therefore there is no need for, e.g., rbind.default.
The dispatch algorithm is described in the source file
(‘.../src/main/bind.c’) as
1. For each argument we get the list of possible class memberships from the
class attribute.
2. *We inspect each class in turn to see if there is an applicable method.*
3. If we find an applicable method we make sure that it is identical to any
method determined for prior arguments. If it is identical, we proceed,
otherwise we immediately drop through to the default code.

It's not clear what #2 means -- an applicable method *for what*? Glancing
at the source code would suggest it's looking for rbind.IDate:

https://github.com/wch/r-source/blob/trunk/src/main/bind.c#L1051-L1063

const char *generic = ((PRIMVAL(op) == 1) ? "cbind" : "rbind"); // should
be rbind here
const char *s = translateChar(STRING_ELT(classlist, i)); // iterating over
the classes, should get to IDate first
sprintf(buf, "%s.%s", generic, s); // should be rbind.IDate

but adding this method (or even exporting it) is no help [ simply defining
rbind.IDate = function(...) as.IDate(NextMethod()) ]

Lastly, it appears that as.Date.IDate is called, which is causing the type
conversion:

debug(data.table:::as.Date.IDate)
rbind(DF, DF) # launches debugger
x
# [1] "2019-05-26" <-- singleton, so apparently applied to DF$date, not
c(DF$date, DF$date)
undebug(data.table:::as.Date.IDate)

I can't really wrap my head around why as.Date is being called here, and
even allowing that, why the end result is still the original class [
class(rbind(DF, DF)$date) == c('IDate', 'Date') ]

So, I'm beginning to think this might be a bug. Am I missing something?

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rbind has confusing result for custom sub-class (possible bug?)

2019-05-26 Thread Michael Chirico
Have finally managed to come up with a fix after checking out sys.calls()
from within the as.Date.IDate debugger, which shows something like:

[[1]] rbind(DF, DF)
[[2]] rbind(deparse.level, ...)
[[3]] `[<-`(`*tmp*`, ri, value = 18042L)
[[4]] `[<-.Date`(`*tmp*`, ri, value = 18042L)
[[5]] as.Date(value)
[[6]] as.Date.IDate(value)

I'm not sure why [<- is called, I guess the implementation is to assign to
the output block by block? Anyway, we didn't have a [<- method. And
[<-.Date looks like:

value <- unclass(as.Date(value)) # <- converts to double
.Date(NextMethod(.Generic), oldClass(x)) # <- restores 'IDate' class

So we can fix our bug by defining a [<- method; the question that I still
don't see answered in the documentation or source code is: why/where is [<-
called, exactly?

Mike C
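
For the record, a minimal sketch of such a method (data.table's eventual fix
may differ in detail; this assumes data.table is loaded so that as.IDate()
is available):

`[<-.IDate` <- function(x, i, value) {
  value <- as.integer(unclass(as.IDate(value)))  # force integer days, not double
  cl <- oldClass(x)
  x <- unclass(x)
  x[i] <- value
  class(x) <- cl
  x
}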

On Sun, May 26, 2019 at 1:16 PM Michael Chirico 
wrote:

> Debugging this issue:
>
> https://github.com/Rdatatable/data.table/issues/2008
>
> We have custom class 'IDate' which inherits from 'Date' (it just forces
> integer storage for efficiency, hence, I).
>
> The concatenation done by rbind, however, breaks this and returns a double:
>
> library(data.table)
> DF = data.frame(date = as.IDate(Sys.Date()))
> storage.mode(rbind(DF, DF)$date)
> # [1] "double"
>
> This is specific to base::rbind (data.table's rbind returns an integer as
> expected); in ?rbind we see:
>
> The method dispatching is not done via UseMethod(), but by C-internal
> dispatching. Therefore there is no need for, e.g., rbind.default.
> The dispatch algorithm is described in the source file
> (‘.../src/main/bind.c’) as
> 1. For each argument we get the list of possible class memberships from
> the class attribute.
> 2. *We inspect each class in turn to see if there is an applicable
> method.*
> 3. If we find an applicable method we make sure that it is identical to
> any method determined for prior arguments. If it is identical, we proceed,
> otherwise we immediately drop through to the default code.
>
> It's not clear what #2 means -- an applicable method *for what*? Glancing
> at the source code would suggest it's looking for rbind.IDate:
>
> https://github.com/wch/r-source/blob/trunk/src/main/bind.c#L1051-L1063
>
> const char *generic = ((PRIMVAL(op) == 1) ? "cbind" : "rbind"); // should
> be rbind here
> const char *s = translateChar(STRING_ELT(classlist, i)); // iterating over
> the classes, should get to IDate first
> sprintf(buf, "%s.%s", generic, s); // should be rbind.IDate
>
> but adding this method (or even exporting it) is no help [ simply defining
> rbind.IDate = function(...) as.IDate(NextMethod()) ]
>
> Lastly, it appears that as.Date.IDate is called, which is causing the type
> conversion:
>
> debug(data.table:::as.Date.IDate)
> rbind(DF, DF) # launches debugger
> x
> # [1] "2019-05-26" <-- singleton, so apparently applied to DF$date, not
> c(DF$date, DF$date)
> undebug(data.table:::as.Date.IDate)
>
> I can't really wrap my head around why as.Date is being called here, and
> even allowing that, why the end result is still the original class [
> class(rbind(DF, DF)$date) == c('IDate', 'Date') ]
>
> So, I'm beginning to think this might be a bug. Am I missing something?
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rbind has confusing result for custom sub-class (possible bug?)

2019-05-27 Thread Michael Chirico
Yes, thanks for following up on thread here. And thanks again for clearing
things up, your email was a finger snap of clarity on the whole issue.

I'll add that actually it was data.table's code at fault on the storage
conversion -- note that if you use an arbitrary sub-class 'foo' with no
methods defined, it'll stay integer.

That's because [<- calls as.Date and then as.Date.IDate, and that method
(ours) has as.numeric(); earlier I had recognized that if we commented that
line, the issue was "fixed" but I still wasn't understanding the root cause.

My last curiosity on this issue will be in my follow-up thread.

Mike C

On Mon, May 27, 2019, 10:25 PM Joshua Ulrich 
wrote:

> On Sun, May 26, 2019 at 6:47 AM Joshua Ulrich 
> wrote:
> >
> > On Sun, May 26, 2019 at 4:06 AM Michael Chirico
> >  wrote:
> > >
> > > Have finally managed to come up with a fix after checking out
> sys.calls()
> > > from within the as.Date.IDate debugger, which shows something like:
> > >
> > > [[1]] rbind(DF, DF)
> > > [[2]] rbind(deparse.level, ...)
> > > [[3]] `[<-`(`*tmp*`, ri, value = 18042L)
> > > [[4]] `[<-.Date`(`*tmp*`, ri, value = 18042L)
> > > [[5]] as.Date(value)
> > > [[6]] as.Date.IDate(value)
> > >
> > > I'm not sure why [<- is called, I guess the implementation is to
> assign to
> > > the output block by block? Anyway, we didn't have a [<- method. And
> > > [<-.Date looks like:
> > >
> > > value <- unclass(as.Date(value)) # <- converts to double
> > > .Date(NextMethod(.Generic), oldClass(x)) # <- restores 'IDate' class
> > >
> > > So we can fix our bug by defining a [<- class; the question that I
> still
> > > don't see answered in documentation or source code is, why/where is [<-
> > > called, exactly?
> > >
> > Your rbind(DF, DF) call dispatches to base::rbind.data.frame().  The
> > `[<-` call is this line:
> > value[[jj]][ri] <- if (is.factor(xij)) as.vector(xij) else xij
> >
> > That's where the storage.mode changes from integer to double.
> >
> > debug: value[[jj]][ri] <- if (is.factor(xij)) as.vector(xij) else xij
> > Browse[2]>
> > debug: xij
> > Browse[2]> storage.mode(xij)
> > [1] "integer"
> > Browse[2]> value[[jj]][ri]
> > [1] "2019-05-26"
> > Browse[2]> storage.mode(value[[jj]][ri])
> > [1] "integer"
> > Browse[2]>
> > debug: if (!is.null(nm <- names(xij))) names(value[[jj]])[ri] <- nm
> > Browse[2]> storage.mode(value[[jj]][ri])
> > [1] "double"
> >
> To be clear, I don't think this is a bug in rbind() or
> rbind.data.frame().  The confusion is that rbind.data.frame() calls
> `[<-` for each column of the data.frame, and there is no `[<-.IDate`
> method.  So the parent class method is dispatched, which converts the
> storage mode to double.
>
> Someone may argue that this is an issue with `[<-.Date`, and that it
> shouldn't convert the storage.mode from integer to double.
> >
> > > Mike C
> > >
> > > On Sun, May 26, 2019 at 1:16 PM Michael Chirico <
> michaelchiri...@gmail.com>
> > > wrote:
> > >
> > > > Debugging this issue:
> > > >
> > > > https://github.com/Rdatatable/data.table/issues/2008
> > > >
> > > > We have custom class 'IDate' which inherits from 'Date' (it just
> forces
> > > > integer storage for efficiency, hence, I).
> > > >
> > > > The concatenation done by rbind, however, breaks this and returns a
> double:
> > > >
> > > > library(data.table)
> > > > DF = data.frame(date = as.IDate(Sys.Date()))
> > > > storage.mode(rbind(DF, DF)$date)
> > > > # [1] "double"
> > > >
> > > > This is specific to base::rbind (data.table's rbind returns an
> integer as
> > > > expected); in ?rbind we see:
> > > >
> > > > The method dispatching is not done via UseMethod(), but by C-internal
> > > > dispatching. Therefore there is no need for, e.g., rbind.default.
> > > > The dispatch algorithm is described in the source file
> > > > (‘.../src/main/bind.c’) as
> > > > 1. For each argument we get the list of possible class memberships
> from
> > > > the class attribute.
> > > > 2. *We inspect each class in turn to see if there is an applic

[Rd] Why is R in Japanese (only in Mac terminal)?

2019-05-29 Thread Michael Chirico
For a while now, R on my Mac terminal has been starting up in Japanese:

R version 3.5.2 (2018-12-20) -- "Eggshell Igloo"

Copyright (C) 2018 The R Foundation for Statistical Computing

Platform: x86_64-apple-darwin15.6.0 (64-bit)


R は、自由なソフトウェアであり、「完全に無保証」です。

一定の条件に従えば、自由にこれを再配布することができます。

配布条件の詳細に関しては、'license()' あるいは 'licence()' と入力してください。


  Natural language support but running in an English locale


R is a collaborative project with many contributors.

Type 'contributors()' for more information and

'citation()' on how to cite R or R packages in publications.


'demo()' と入力すればデモをみることができます。

'help()' とすればオンラインヘルプが出ます。

'help.start()' で HTML ブラウザによるヘルプがみられます。

'q()' と入力すれば R を終了します。

I never gave it too much mind since I understand Japanese and am mostly
working in RStudio anyway (RStudio is in English). But I found a "bug" in
testthat's is_english (which tests whether the current session is reporting
base messages in English) and reported here:

https://github.com/r-lib/testthat/issues/879

I say "bug" because as near as I can tell is_english is built assuming the
logic laid out in ?gettext, ?locales. So even though my machine appears to
have none of the "symptoms" of a non-English locale, nevertheless I get
Japanese. My session info:

R version 3.5.2 (2018-12-20)

Platform: x86_64-apple-darwin15.6.0 (64-bit)

Running under: macOS High Sierra 10.13.6


Matrix products: default

BLAS:
/Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRblas.0.dylib

LAPACK:
/Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib


locale:

[1] C/UTF-8/C/C/C/C


attached base packages:

[1] stats graphics  grDevices utils datasets  methods   base


loaded via a namespace (and not attached):

[1] compiler_3.5.2

My Sys.getenv() and "Languages & Region" settings are in the issue link.

Where else should I be looking in my R session or terminal to figure out
why it's in Japanese?

Mike C
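
A few things that can be checked or tried from within the session (a guess
at where to look, not a diagnosis):

Sys.getenv(c("LANGUAGE", "LANG", "LC_ALL", "LC_MESSAGES"))
Sys.getlocale("LC_MESSAGES")
Sys.setenv(LANGUAGE = "en")   # gettext honours LANGUAGE, so this forces English messages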

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] rbind has confusing result for custom sub-class (possible bug?)

2019-06-02 Thread Michael Chirico
Thanks for following up! In fact that's exactly what was done here:

https://github.com/Rdatatable/data.table/pull/3602/files

On Sun, Jun 2, 2019 at 8:00 PM Joshua Ulrich 
wrote:

> I thought it would be good to summarize my thoughts, since I made a
> few hypotheses that turned out to be false.
>
> This isn't a bug in base R, in either rbind() or `[<-.Date`.
>
> To summarize the root cause:
> base::rbind.data.frame() calls `[<-` for each column of the
> data.frame, and there is no `[<-.IDate` method to ensure the
> replacement value is converted to integer.  And, in fact, `[<-.Date`
> calls as.Date() and data.table::as.Date.IDate() calls as.numeric() on
> the IDate object.  So the problem exists, and can be fixed, in
> data.table.
>
> Best,
> Josh
>
> On Mon, May 27, 2019 at 9:34 AM Joshua Ulrich 
> wrote:
> >
> > Follow-up (inline) on my comment about a potential issue in `[<-.Date`.
> >
> > On Mon, May 27, 2019 at 9:31 AM Michael Chirico
> >  wrote:
> > >
> > > Yes, thanks for following up on thread here. And thanks again for
> clearing things up, your email was a finger snap of clarity on the whole
> issue.
> > >
> > > I'll add that actually it was data.table's code at fault on the
> storage conversion -- note that if you use an arbitrary sub-class 'foo'
> with no methods defined, it'll stay integer.
> > >
> > > That's because [<- calls as.Date and then as.Date.IDate, and that
> method (ours) has as.numeric(); earlier I had recognized that if we
> commented that line, the issue was "fixed" but I still wasn't understanding
> the root cause.
> > >
> > > My last curiosity on this issue will be in my follow-up thread.
> > >
> > > Mike C
> > >
> > > On Mon, May 27, 2019, 10:25 PM Joshua Ulrich 
> wrote:
> > >>
> > >> On Sun, May 26, 2019 at 6:47 AM Joshua Ulrich <
> josh.m.ulr...@gmail.com> wrote:
> > >> >
> > >> > On Sun, May 26, 2019 at 4:06 AM Michael Chirico
> > >> >  wrote:
> > >> > >
> > >> > > Have finally managed to come up with a fix after checking out
> sys.calls()
> > >> > > from within the as.Date.IDate debugger, which shows something
> like:
> > >> > >
> > >> > > [[1]] rbind(DF, DF)
> > >> > > [[2]] rbind(deparse.level, ...)
> > >> > > [[3]] `[<-`(`*tmp*`, ri, value = 18042L)
> > >> > > [[4]] `[<-.Date`(`*tmp*`, ri, value = 18042L)
> > >> > > [[5]] as.Date(value)
> > >> > > [[6]] as.Date.IDate(value)
> > >> > >
> > >> > > I'm not sure why [<- is called, I guess the implementation is to
> assign to
> > >> > > the output block by block? Anyway, we didn't have a [<- method.
> And
> > >> > > [<-.Date looks like:
> > >> > >
> > >> > > value <- unclass(as.Date(value)) # <- converts to double
> > >> > > .Date(NextMethod(.Generic), oldClass(x)) # <- restores 'IDate'
> class
> > >> > >
> > >> > > So we can fix our bug by defining a [<- class; the question that
> I still
> > >> > > don't see answered in documentation or source code is, why/where
> is [<-
> > >> > > called, exactly?
> > >> > >
> > >> > Your rbind(DF, DF) call dispatches to base::rbind.data.frame().  The
> > >> > `[<-` call is this line:
> > >> > value[[jj]][ri] <- if (is.factor(xij)) as.vector(xij) else xij
> > >> >
> > >> > That's where the storage.mode changes from integer to double.
> > >> >
> > >> > debug: value[[jj]][ri] <- if (is.factor(xij)) as.vector(xij) else
> xij
> > >> > Browse[2]>
> > >> > debug: xij
> > >> > Browse[2]> storage.mode(xij)
> > >> > [1] "integer"
> > >> > Browse[2]> value[[jj]][ri]
> > >> > [1] "2019-05-26"
> > >> > Browse[2]> storage.mode(value[[jj]][ri])
> > >> > [1] "integer"
> > >> > Browse[2]>
> > >> > debug: if (!is.null(nm <- names(xij))) names(value[[jj]])[ri] <- nm
> > >> > Browse[2]> storage.mode(value[[jj]][ri])
> > >> > [1] "double"
> > >> >
> > >> To be clear, I don't think this is a 

[Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?

2019-07-08 Thread Michael Chirico
I think of head() as a standard helper for "glancing" at objects, so I'm
sometimes surprised that head() produces massive output:

M = matrix(nrow = 10L, ncol = 10L)
print(head(M)) # <- beware, could be a huge print

I assume there are lots of backwards-compatibility issues as well as valid
use cases for this behavior, so I guess defaulting to M[1:6, 1:6] is out of
the question.

Is there any scope for adding a new argument to head.matrix that would
allow this flexibility? IINM it should essentially be as simple to do
head.array as:

do.call(`[`, c(list(x, drop = FALSE), lapply(pmin(dim(x), n), seq_len)))

(with extra decoration to handle -n, etc)
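
A self-contained version of that idea, without the handling of negative n
(the function name is just for illustration):

head_square <- function(x, n = 6L) {
  stopifnot(length(dim(x)) >= 2L)
  do.call(`[`, c(list(x, drop = FALSE), lapply(pmin(dim(x), n), seq_len)))
}
M <- matrix(0, nrow = 100L, ncol = 100L)
dim(head_square(M))   # 6 6, instead of head()'s 6 x 100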

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] bug: write.dcf converts hyphen in field name to period

2019-08-02 Thread Michael Chirico
write.dcf(list('my-field' = 1L), tmp <- tempfile())

cat(readLines(tmp))
# my.field: 1

However there's nothing wrong with hyphenated fields per the Debian
standard:

https://www.debian.org/doc/debian-policy/ch-controlfields.html

And in fact we see them using hyphenated fields there, and indeed read.dcf
handles this just fine:

writeLines(gsub('.', '-', readLines(tmp), fixed = TRUE), tmp)
read.dcf(tmp)
#  my-field
# [1,] "1"

The guilty line is as.data.frame:

if(!is.data.frame(x)) x <- as.data.frame(x, stringsAsFactors = FALSE)

Simply adding check.names=FALSE to this call would solve the issue in my
case, but I think not in general. Here's what I see in the
standard:

> The field name is composed of US-ASCII characters excluding control
characters, space, and colon (i.e., characters in the ranges U+0021 (!)
through U+0039 (9), and U+003B (;) through U+007E (~), inclusive). Field
names must not begin with the comment character (U+0023 #), nor with the
hyphen character (U+002D -).

This could be handled by an adjustment to the next line:

nmx <- names(x)

becomes

nmx <- gsub('^[#-]', '', gsub('[^\U{0021}-\U{0039}\U{003B}-\U{007E}]', '.',
names(x)))

(Or maybe errors for having invalid names)

Michael Chirico

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] iconv: embedded nulls when converting to UTF-16

2019-08-05 Thread Braun, Michael
R-devel community:

I have encountered some unexpected behavior using iconv, which may be the 
source of errors I am getting when connecting to a UTF-16-encoded SQL Server 
database.  A simple example is below. 

When researching this problem, I found r-devel reports of the same problem in 
threads from June 2010 and February, 2016, and that bug #16738 was posted to 
Bugzilla as a result.  However, I have not been able to determine if the error 
is mine, if there is a known workaround, or it truly is a bug in R’s iconv 
implementation.  Any additional help is appreciated.

Thanks,

Michael

——

sessionInfo()
#> R version 3.6.1 (2019-07-05).   ## and replicated on R 3.4.1 on a cluster 
running CentOS Linux 7.
#> Platform: x86_64-apple-darwin15.6.0 (64-bit)
#> Running under: macOS Mojave 10.14.6
# 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

#> attached base packages:
#> [1] stats graphics  grDevices utils datasets  methods   base 

#> loaded via a namespace (and not attached):
#> [1] compiler_3.6.1 

s <- "test"
iconv(s, to="UTF-8")
#> [1] "test"

iconv(s, to="UTF-16")
#> Error in iconv(s, to = "UTF-16"): embedded nul in string: '\xfe\xff\0t\0e\0s\0t'

iconv(s, to="UTF-16BE")
#> Error in iconv(s, to = "UTF-16BE"): embedded nul in string: '\0t\0e\0s\0t'

iconv(s, to="UTF-16LE")
#> Error in iconv(s, to = "UTF-16LE"): embedded nul in string: 't\0e\0s\0t\0'




--
Michael Braun, Ph.D.
Associate Professor of Marketing, and
  Corrigan Research Professor
Cox School of Business
Southern Methodist University
Dallas, TX 75275





__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Why no tz argument for format.POSIXlt?

2019-08-13 Thread Michael Chirico
Was a bit surprised to see

oldtz = Sys.getenv('TZ')
Sys.setenv(TZ = 'Asia/Jakarta')
format(Sys.time())
# [1] "2019-08-13 16:05:03"
format(Sys.time(), tz = 'UTC') # all is well
# [1] "2019-08-13 09:05:03"
format(trunc(Sys.time(), 'hours')) # correctly truncated in local time
# [1] "2019-08-13 16:00:00"
format(trunc(Sys.time(), 'hours'), tz = 'UTC') # no effect!
[1] "2019-08-13 16:00:00"
Sys.setenv(TZ = oldtz)

The reason for the discrepancy is that trunc.POSIXt returns a POSIXlt
object (not POSIXct), whereas Sys.time() is POSIXct. And while
format.POSIXct has a tz argument, format.POSIXlt does not:

names(formals(format.POSIXct))
# [1] "x"  "format" "tz" "usetz"  "..."
names(formals(format.POSIXlt))
# [1] "x"  "format" "usetz"  "..."

Is there any reason not to accept a tz argument for format.POSIXlt? It's
quite convenient to be able to specify an output timezone format on the fly
with format.POSIXct; in the case at hand, I'm trying to force UTC time on
input. format(as.POSIXct(x), tz = 'UTC') seems to work just fine, is there
a reason why this wouldn't be done internally?

Michael Chirico

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] --disable-long-double or --enable-long-double=no?

2019-08-21 Thread Michael Chirico
There's a bit of confusion about how to disable long double support in an R
build.

I see --disable-long-double scattered about, e.g.

   - R-exts:
   
https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Writing-portable-packages
   - R-admin:
   https://cran.r-project.org/doc/manuals/r-release/R-admin.html#Solaris
   - CRAN noLD check description:
   https://www.stats.ox.ac.uk/pub/bdr/noLD/README.txt
   - ?capabilities:
   https://stat.ethz.ch/R-manual/R-devel/library/base/html/capabilities.html

However, it's *missing* from ./config (cd r-source && grep
"disable-long-double" configure). Instead there appears to be some code
built around enable-long-double:

./configure:1808:  --enable-long-doubleuse long double type [yes]

./configure:24723:# Check whether --enable-long-double was given.

I see the option apparently introduced here in 2012 & the ambiguity is
immediate -- the commit mentions disable-long-double but builds
enable-long-double.

https://github.com/wch/r-source/commit/fb8e36f8be0aaf47a9c54c9effb219dae34f0e41

Could someone please help to clear the confusion?

Thanks
Michael Chirico

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?

2019-09-15 Thread Michael Chirico
Finally read in detail your response Gabe. Looks great, and I agree it's
quite intuitive, as well as agree against non-recycling.

Once the length(n) == length(dim(x)) behavior is enabled, I don't think
there's any need/desire to have head() do x[1:6,1:6] anymore. head(x, c(6,
6)) is quite clear for those familiar with head(x, 6), it would seem to me.

Mike C

On Sat, Jul 13, 2019 at 8:35 AM Gabriel Becker 
wrote:

> Hi Michael and Abby,
>
> So one thing that could happen that would be backwards compatible (with
> the exception of something that was an error no longer being an error) is
> head and tail could take vectors of length (dim(x)) rather than integers of
> length for n, with the default being n=6 being equivalent to n = c(6,
> dim(x)[2], <...>, dim(x)[k]), at least for the deprecation cycle, if not
> permanently. It not recycling would be unexpected based on the behavior of
> many R functions but would preserve the current behavior while granting
> more fine-grained control to users that feel they need it.
>
> A rapidly thrown-together prototype of such a method for the head of a
> matrix case is as follows:
>
> head2 = function(x, n = 6L, ...) {
>     indvecs = lapply(seq_along(dim(x)), function(i) {
>         if(length(n) >= i) {
>             ni = n[i]
>         } else {
>             ni = dim(x)[i]
>         }
>         if(ni < 0L)
>             ni = max(nrow(x) + ni, 0L)
>         else
>             ni = min(ni, dim(x)[i])
>         seq_len(ni)
>     })
>     lstargs = c(list(x), indvecs, drop = FALSE)
>     do.call("[", lstargs)
> }
>
>
> > mat = matrix(1:100, 10, 10)
>
> > *head(mat)*
>
>  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>
> [1,]1   11   21   31   41   51   61   71   8191
>
> [2,]2   12   22   32   42   52   62   72   8292
>
> [3,]3   13   23   33   43   53   63   73   8393
>
> [4,]4   14   24   34   44   54   64   74   8494
>
> [5,]5   15   25   35   45   55   65   75   8595
>
> [6,]6   16   26   36   46   56   66   76   8696
>
> > *head2(mat)*
>
>  [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
>
> [1,]1   11   21   31   41   51   61   71   8191
>
> [2,]2   12   22   32   42   52   62   72   8292
>
> [3,]3   13   23   33   43   53   63   73   8393
>
> [4,]4   14   24   34   44   54   64   74   8494
>
> [5,]5   15   25   35   45   55   65   75   8595
>
> [6,]6   16   26   36   46   56   66   76   8696
>
> > *head2(mat, c(2, 3))*
>
>  [,1] [,2] [,3]
>
> [1,]1   11   21
>
> [2,]2   12   22
>
> > *head2(mat, c(2, -9))*
>
>  [,1]
>
> [1,]1
>
> [2,]2
>
>
> Now one thing to keep in mind here, is that I think we'd  either a) have
> to make the non-recycling  behavior permanent, or b) have head treat
> data.frames and matrices different with respect to the subsets they grab
> (which strikes me as a  *Bad Plan *(tm)).
>
> So I don't think the default behavior would ever be mat[1:6, 1:6],  not
> because of backwards compatibility, but because at least in my intuition
> that is just not what head on a data.frame should do by default, and I
> think the behaviors for the basic rectangular datatypes should "stick
> together". I mean, also because of backwards compatibility, but that could  
> *in
> theory* change across a long enough deprecation cycle, but  the
> conceptually right thing to do with a data.frame probably won't.
>
> All of that said, is head(mat, c(6, 6)) really that much  easier to
> type/better than just mat[1:6, 1:6, drop=FALSE] (I know this will behave
> differently if any of the dims of mat are less than 6, but if so why are
> you heading it in the first place ;) )? I don't really have a strong
> feeling on the answer to that.
>
> I'm happy to put a patch for head.matrix, head.data.frame, tail.matrix and
> tail.data.frame, plus documentation, if people on R-core are interested in
> this.
>
> Note, as most here probably know, and as alluded to above,  length(n) > 1
> for head or tail currently give an error, so  this would  be an extension
> of the existing functionality in the mathematical extension sense, where
> all existing behavior would remain identical, but the support/valid
> parameter space would grow.
>
> Best,
> ~G
>
>
> On Fri, Jul 12, 2019 at 4:03 PM Abby Spurdle  wrote:
>
>> > I assume there are lots of backwards-compatibility issues as well as
>> valid
>> > use cases for this behavior, so I guess defaulting to M[1:6, 1:6] is out
>> of
>> > the 

Re: [Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?

2019-09-16 Thread Michael Chirico
Awesome. Gabe, since you already have a workshopped version, would you like
to proceed? Feel free to ping me to review the patch once it's posted.

On Mon, Sep 16, 2019 at 3:26 PM Martin Maechler 
wrote:

> >>>>> Michael Chirico
> >>>>> on Sun, 15 Sep 2019 20:52:34 +0800 writes:
>
> > Finally read in detail your response Gabe. Looks great,
> > and I agree it's quite intuitive, as well as agree against
> > non-recycling.
>
> > Once the length(n) == length(dim(x)) behavior is enabled,
> > I don't think there's any need/desire to have head() do
> > x[1:6,1:6] anymore. head(x, c(6, 6)) is quite clear for
> > those familiar with head(x, 6), it would seem to me.
>
> > Mike C
>
> Thank you, Gabe, and Michael.
> I did like Gabe's proposal already back in July but was
> busy and/or vacationing then ...
>
> If you submit this with a patch (that includes changes to both
> *.R and *.Rd , including some example) as "wishlist" item to R's
> bugzilla, I'm willing/happy to check and commit this to R-devel.
>
> Martin
>
>
> > On Sat, Jul 13, 2019 at 8:35 AM Gabriel Becker
> >  wrote:
>
> >> Hi Michael and Abby,
> >>
> >> So one thing that could happen that would be backwards
> >> compatible (with the exception of something that was an
> >> error no longer being an error) is head and tail could
> >> take vectors of length (dim(x)) rather than integers of
> >> length for n, with the default being n=6 being equivalent
> >> to n = c(6, dim(x)[2], <...>, dim(x)[k]), at least for
> >> the deprecation cycle, if not permanently. It not
> >> recycling would be unexpected based on the behavior of
> >> many R functions but would preserve the current behavior
> >> while granting more fine-grained control to users that
> >> feel they need it.
> >>
> >> A rapidly thrown-together prototype of such a method for
> >> the head of a matrix case is as follows:
> >>
> >> head2 = function(x, n = 6L, ...) { indvecs =
> >> lapply(seq_along(dim(x)), function(i) { if(length(n) >=
> >> i) { ni = n[i] } else { ni = dim(x)[i] } if(ni < 0L) ni =
> >> max(nrow(x) + ni, 0L) else ni = min(ni, dim(x)[i])
> >> seq_len(ni) }) lstargs = c(list(x),indvecs, drop = FALSE)
> >> do.call("[", lstargs) }
> >>
> >>
> >> > mat = matrix(1:100, 10, 10)
> >>
> >> > *head(mat)*
> >>
> >> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> >>
> >> [1,] 1 11 21 31 41 51 61 71 81 91
> >>
> >> [2,] 2 12 22 32 42 52 62 72 82 92
> >>
> >> [3,] 3 13 23 33 43 53 63 73 83 93
> >>
> >> [4,] 4 14 24 34 44 54 64 74 84 94
> >>
> >> [5,] 5 15 25 35 45 55 65 75 85 95
> >>
> >> [6,] 6 16 26 36 46 56 66 76 86 96
> >>
> >> > *head2(mat)*
> >>
> >> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> >>
> >> [1,] 1 11 21 31 41 51 61 71 81 91
> >>
> >> [2,] 2 12 22 32 42 52 62 72 82 92
> >>
> >> [3,] 3 13 23 33 43 53 63 73 83 93
> >>
> >> [4,] 4 14 24 34 44 54 64 74 84 94
> >>
> >> [5,] 5 15 25 35 45 55 65 75 85 95
> >>
> >> [6,] 6 16 26 36 46 56 66 76 86 96
> >>
> >> > *head2(mat, c(2, 3))*
> >>
> >> [,1] [,2] [,3]
> >>
> >> [1,] 1 11 21
> >>
> >> [2,] 2 12 22
> >>
> >> > *head2(mat, c(2, -9))*
> >>
> >> [,1]
> >>
> >> [1,] 1
> >>
> >> [2,] 2
> >>
> >>
> >> Now one thing to keep in mind here, is that I think we'd
> >> either a) have to make the non-recycling behavior
> >> permanent, or b) have head treat data.frames and matrices
> >> different with respect to the subsets they grab (which
> >> strikes me as a *Bad Plan *(tm)).
> >>
> >> So I don't think the default behavior would ever be
> >> mat[1:6, 1:6], not because of backwards compatibility,
> >> but because at least in my intuition that is just not
> >> what he

[Rd] passing extra arguments to devtools::build

2019-09-27 Thread Michael Friendly
This question was posed on SO : 
https://stackoverflow.com/questions/58118495/passing-extra-argumenets-to-devtoolsbuild
 
but there has been no useful reply.

Something seems to have changed in the devtools package, so that the
following commands, which used to run, now give an error I can't decipher:

> Sys.setenv(R_GSCMD = "C:/Program Files/gs/gs9.21/bin/gswin64c.exe")
> devtools::build(args = c('--resave-data', '--compact-vignettes="gs+qpdf"'))
The filename, directory name, or volume label syntax is incorrect.
Error in (function(command = NULL, args = character(), error_on_status = TRUE, :
  System command error

I've tried other alternatives with other devtools commands, like just
passing a single argument, but still get the same error:

args = '--compact-vignettes="gs+qpdf"'
devtools::check_win_devel(args = args)

I'm using devtools 2.2.0, under R 3.5.2

-- 
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, ASA Statistical Graphics Section
York University  Voice: 416 736-2100 x66249
4700 Keele StreetWeb: http://www.datavis.ca | @datavisFriendly
Toronto, ONT  M3J 1P3 CANADA


[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] re-submission of package after CRAN-pretest notes

2020-01-08 Thread Michael Friendly
It used to be the case that when I submitted a package and it gave notes 
or warnings in the CRAN checks, I was required to bump the package 
version before re-submission.


I hope this is no longer the case.  I recently submitted a package that 
gave one fairly trivial NOTE, fixed that, and would like to re-submit.


-Michael


--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept. & Chair, ASA Statistical Graphics Section
York University  Voice: 416 736-2100 x66249
4700 Keele StreetWeb: http://www.datavis.ca | @datavisFriendly
Toronto, ONT  M3J 1P3 CANADA

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Default for checkBuilt in update.packages() should be TRUE

2020-02-24 Thread Michael Dewey
I suppose it is too late to change the name but checkBuilt does not 
immediately clarify to me what it does. It does not check whether I have 
built a package for instance. Having read Duncan's post at least I now 
know that I should set it as TRUE until the default is changed.
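
For anyone else who, like me, had to look it up, the call Duncan is
recommending amounts to (checkBuilt and ask are both documented arguments
of update.packages()):

update.packages(ask = FALSE, checkBuilt = TRUE)

which also re-installs packages whose installed copies were built under an
earlier version of R.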


Michael

On 24/02/2020 10:48, Duncan Murdoch wrote:
The checkBuilt argument to update.packages() currently defaults to 
FALSE.  This means that packages built and installed under R 3.6.2 will 
not be updated by R 4.0.0, and leads to confusion (e.g. 
https://stackoverflow.com/q/60356442/2554330, where tidyverse can't be 
updated because some of its many dependencies haven't been updated).


The default should be TRUE, even though this will lead to some packages 
being updated unnecessarily, because the cost of an unnecessary update 
is so much less than the cost of missing a necessary update.


Duncan Murdoch

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Michael
http://www.dewey.myzen.co.uk/home.html

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [SPAM] Hard memory limit of 16GB under Windows?

2020-04-07 Thread Michael Dewey

Dear Samuel

Does the FAQ for Windows section 2.9 help you here?
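
If it is the same limit that section deals with, then -- going from memory
here, so do check ?memory.limit before relying on it -- the Windows-only
call below should report the limit and let you raise it, as should
starting R with --max-mem-size:

memory.limit()               # report the current limit, in Mb
memory.limit(size = 32000)   # request a higher limit (Windows only)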

Michael

On 07/04/2020 12:35, Samuel Granjeaud IR/Inserm wrote:

Hi,
I am not sure whether this topic belongs on this mailing list, but I
feel the subscribers here should be the right audience.
I noticed that the memory limit reported under Windows is 16 GB. I am
wondering how to increase it. I didn't find anything in Rprofile.site
or .Rprofile. Is this limit hard-coded at compilation?
Best,
Samuel
[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



--
Michael
http://www.dewey.myzen.co.uk/home.html

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] suggestion: "." in [lsv]apply()

2020-04-16 Thread Michael Mahoney
This syntax is already implemented in the {purrr} package, more or
less -- you need to add a tilde before your function call for it to
work exactly as written:

purrr::map_dbl(split(mtcars, mtcars$cyl), ~ summary(lm(wt ~ mpg, .))$r.squared)

is equivalent to

sapply(split(mtcars, mtcars$cyl), function(d) summary(lm(mpg ~ wt,
d))$r.squared)

Seems like using this package is probably an easier solution for this
wish than adding a reserved variable and adding additional syntax to
the apply family as a whole.

Thanks,

-Mike

> From: Sokol Serguei 
> Date: Thu, Apr 16, 2020 at 12:03 PM
> Subject: Re: [Rd] suggestion: "." in [lsv]apply()
> To: William Dunlap 
> Cc: r-devel 
>
>
> Thanks Bill,
>
> Clearly, my first proposition for wsapply() is quick and dirty one.
> However, if "." becomes a reserved variable with this new syntax,
> wsapply() can be fixed (at least for your example and alike) as:
>
> wsapply = function(l, fun, ...) {
>     . = substitute(fun)
>     if (is.name(.) || is.call(.) && .[[1]] == as.name("function")) {
>         sapply(l, fun, ...)
>     } else {
>         sapply(l, function(d) eval(., list(. = d)), ...)
>     }
> }
>
> Will it do the job?
>
> Best,
> Serguei.
>
> Le 16/04/2020 à 17:07, William Dunlap a écrit :
> > Passing in a function passes not only an argument list but also an
> > environment from which to get free variables. Since your function
> > doesn't pay attention to the environment you get things like the
> > following.
> >
> > > wsapply(list(1,2:3), paste(., ":", deparse(s)))
> > [[1]]
> > [1] "1 : paste(., \":\", deparse(s))"
> >
> > [[2]]
> > [1] "2 : paste(., \":\", deparse(s))" "3 : paste(., \":\", deparse(s))"
> >
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com 
> >
> >
> > On Thu, Apr 16, 2020 at 7:25 AM Sokol Serguei wrote:
> >
> > Hi,
> >
> > I would like to make a suggestion for a small syntactic
> > modification of
> > FUN argument in the family of functions [lsv]apply(). The idea is to
> > allow one-liner expressions without typing "function(item) {...}" to
> > surround them. The argument to the anonymous function is simply
> > referred
> > as ".". Let take an example. With this new feature, the following call
> >
> > sapply(split(mtcars, mtcars$cyl), function(d) summary(lm(mpg ~ wt,
> > d))$r.squared)
> > #4 6 8
> > #0.5086326 0.4645102 0.4229655
> >
> >
> > could be rewritten as
> >
> > sapply(split(mtcars, mtcars$cyl), summary(lm(mpg ~ wt, .))$r.squared)
> >
> > "Not a big saving in typing" you can say but multiplied by the
> > number of
> > [lsv]apply usage and a neater look, I think, the idea merits to be
> > considered.
> > To illustrate a possible implementation, I propose a wrapper
> > example for
> > sapply():
> >
> > wsapply = function(l, fun, ...) {
> >     s = substitute(fun)
> >     if (is.name(s) || is.call(s) && s[[1]] == as.name("function")) {
> >         sapply(l, fun, ...) # legacy call
> >     } else {
> >         sapply(l, function(d) eval(s, list(. = d)), ...)
> >     }
> > }
> >
> > Now, we can do:
> >
> > wsapply(split(mtcars, mtcars$cyl), summary(lm(mpg ~ wt, .))$r.squared)
> >
> > or, traditional way:
> >
> > wsapply(split(mtcars, mtcars$cyl), function(d) summary(lm(mpg ~ wt,
> > d))$r.squared)
> >
> > the both work.
> >
> > How do you feel about that?
> >
> > Best,
> > Serguei.
> >
> > __
> > R-devel@r-project.org  mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
>
> [[alternative HTML version deleted]]
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Translations and snprintf on Windows

2020-04-30 Thread Michael Chirico
[a bit unsure on if this is maybe better for r-package-devel]

We recently added translations to messages at the R and C level to
data.table.

At the C level, we did _() wrapping for char arrays supplied to the
following functions: error, warning, Rprintf, Error, and snprintf.

This seemed OK but the use of snprintf specifically appears to have caused
a crash on Windows:

https://github.com/Rdatatable/data.table/issues/4402

Is there any guidance against using gettext with snprintf, or perhaps
guidance on which "outputters" *are* OK for translation?

Michael Chirico

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Should 0L * NA_integer_ be 0L?

2020-05-23 Thread Michael Chirico
I don't see this specific case documented anywhere (I also tried to search
the r-devel archives, as well as I could); the only close reference
mentions NA & FALSE = FALSE, NA | TRUE = TRUE. And there's also this
snippet from R-lang:

In cases where the result of the operation would be the same for all
> possible values the NA could take, the operation may return this value.
>

This begs the question -- shouldn't 0L * NA_integer_ be 0L?

Because this is an integer operation, and according to this definition of
NA:

Missing values in the statistical sense, that is, variables whose value
> is not known, have the value @code{NA}
>

NA_integer_ should be an unknown integer value between -2^31+1 and 2^31-1.
Multiplying any of these values by 0 results in 0 -- that is, the result of
the operation would be 0 for all possible values the NA could take.

This came up from what seems like an inconsistency to me:

all(NA, FALSE)
# [1] FALSE
NA * FALSE
# [1] NA

I agree with all(NA, FALSE) being FALSE because we know for sure that all
cannot be true. The same can be said of the multiplication -- whether NA
represents TRUE or FALSE, the resulting value is 0 (FALSE).

I also agree with the numeric case, FWIW: NA_real_ * 0 has to be NA_real_,
because NA_real_ could be Inf or NaN, for both of which multiplication by 0
gives NaN, hence 0 * NA_real_ is either 0 or NaN, hence it must be NA_real_.
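
For concreteness, the current behaviour side by side (these are the values
R returns today):

Inf * 0            # NaN
NaN * 0            # NaN
NA_real_ * 0       # NA  -- could be 0 or NaN, so NA is the only safe answer
0L * NA_integer_   # NA  -- yet every admissible integer value would give 0L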

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Should 0L * NA_integer_ be 0L?

2020-05-23 Thread Michael Chirico
OK, so maybe one way to paraphrase:

For R, the boundedness of integer vectors is an implementation detail,
rather than a deeper mathematical fact that can be exploited for this
case.

One might also expect then that overflow wouldn't result in NA, but
rather automatically cast up to numeric? But that this doesn't happen
for efficiency reasons?
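
For example, the overflow behaviour I mean:

.Machine$integer.max + 1L
# [1] NA, with the warning "NAs produced by integer overflow"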

Would it make any sense to have a different carveout for the logical
case? For logical, storage as integer might be seen as a similar type
of implementation detail (though if we're being this strict, the
question arises of what multiplication of logical values even means).

FALSE * NA = 0L


On Sat, May 23, 2020 at 6:49 PM Martin Maechler
 wrote:
>
> >>>>> Michael Chirico
> >>>>> on Sat, 23 May 2020 18:08:22 +0800 writes:
>
> > I don't see this specific case documented anywhere (I also tried to 
> search
> > the r-devel archives, as well as I could); the only close reference
> > mentions NA & FALSE = FALSE, NA | TRUE = TRUE. And there's also this
> > snippet from R-lang:
>
> > In cases where the result of the operation would be the same for all
> >> possible values the NA could take, the operation may return this value.
> >>
>
> > This begs the question -- shouldn't 0L * NA_integer_ be 0L?
>
> > Because this is an integer operation, and according to this definition 
> of
> > NA:
>
> > Missing values in the statistical sense, that is, variables whose value
> >> is not known, have the value @code{NA}
> >>
>
> > NA_integer_ should be an unknown integer value between -2^31+1 and 
> 2^31-1.
> > Multiplying any of these values by 0 results in 0 -- that is, the 
> result of
> > the operation would be 0 for all possible values the NA could take.
>
>
> > This came up from what seems like an inconsistency to me:
>
> > all(NA, FALSE)
> > # [1] FALSE
> > NA * FALSE
> > # [1] NA
>
> > I agree with all(NA, FALSE) being FALSE because we know for sure that 
> all
> > cannot be true. The same can be said of the multiplication -- whether NA
> > represents TRUE or FALSE, the resulting value is 0 (FALSE).
>
> > I also agree with the numeric case, FWIW: NA_real_ * 0 has to be 
> NA_real_,
> > because NA_real_ could be Inf or NaN, for both of which multiplication 
> by 0
> > gives NaN, hence 0 * NA_real_ is either 0 or NaN, hence it must be 
> NA_real_.
>
> I agree about almost everything you say above. ...
> but possibly the main conclusion.
>
> The problem with your proposed change would be that  integer
> arithmetic gives a different result than the corresponding
> "numeric" computation.
> (I don't remember other such cases in R, at least as long as the
>  integer arithmetic does not overflow.)
>
> One principle to decided such problems in S and R has been that
> the user should typically *not* have to know if their data is
> stored in float/double or in integer, and the results should be the same
> (possibly apart from staying integer for some operations).
>
>
> {{Note that there are also situations where it's really
>   undesirable that 0 * NA does *not* give 0 (but NA);
>   notably in sparse matrix operations where you'd very often know
>   that NA was not Inf (or NaN) and you really would like to
>   preserve sparseness ...}}
>
>
> > [[alternative HTML version deleted]]
>
> (as you did not use plain text ..)

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] C Interface

2010-06-18 Thread michael meyer
Greetings,

I am trying to call simple C-code from R.
I am on Windows XP with RTools installed.

The C-function is

#include <R.h>
#include <Rinternals.h>
#include <Rdefines.h>
#include <math.h>

// prevent name mangling
extern "C" {

SEXP __cdecl test(SEXP s){

  SEXP result;
  PROTECT(result = NEW_NUMERIC(1));
  double* ptr=NUMERIC_POINTER(result);
  double t = *REAL(s);
  double u = t-floor(t)-0.5;
  if(u>0) *ptr=-1+4*u; else *ptr=-1-4*u;
  Rprintf("The value is %f", *ptr);
  UNPROTECT(1);
  return result;
}

};

It is compiled with

R CMD SHLIB OrthoFunctions.c

with flag

MAKEFLAGS="CC=g++"


However when I call this code from R with

test <- function(t){
  .Call("test",t)
}
dyn.load("./OrthoFunctions.dll")
test(0)
dyn.unload("./OrthoFunctions.dll")

then R crashes.

If I compile with the default flags (no extern "C", no __cdecl) I get an
error message about an undefined reference to "__gxx_personality_v0":

C:\...>R CMD SHLIB OrthoFunctions.c
C:/Programme/R/R-2.10.1/etc/Makeconf:151: warning: overriding commands for
target `.c.o'
C:/Programme/R/R-2.10.1/etc/Makeconf:142: warning: ignoring old commands for
target `.c.o'
C:/Programme/R/R-2.10.1/etc/Makeconf:159: warning: overriding commands for
target `.c.d'
C:/Programme/R/R-2.10.1/etc/Makeconf:144: warning: ignoring old commands for
target `.c.d'
C:/Programme/R/R-2.10.1/etc/Makeconf:169: warning: overriding commands for
target `.m.o'
C:/Programme/R/R-2.10.1/etc/Makeconf:162: warning: ignoring old commands for
target `.m.o'
g++ -I"C:/Programme/R/R-2.10.1/include"-O2 -Wall  -c
OrthoFunctions.c -o OrthoFunctions.o
gcc -shared -s -o OrthoFunctions.dll tmp.def OrthoFunctions.o
-LC:/Programme/R/R-2.10.1/bin -lR
OrthoFunctions.o:OrthoFunctions.c:(.eh_frame+0x11): undefined reference to
`__gxx_personality_v0'
collect2: ld returned 1 exit status



I have a vague idea of the issue of calling conventions and was hoping that
the __cdecl
specifier would force the appropriate convention.
I also have Cygwin installed as part of the Python(x,y) distribution but I
am assuming that
R CMD SHLIB source.c
calls the right compiler.

What could the problem be?

Many thanks,


Michael

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] C Interface

2010-06-20 Thread michael meyer
Thanks for all replies.
I'll use inlining until I have figured out how to build a proper package.

Michael

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Call for suggestions

2010-07-04 Thread michael meyer
Greetings,

If this is not the appropriate place to post this question please let me
know where
to post it.

I have a package under development which fits models of the form
$$
f(t)=\sum_i B_iG_i(t,\omega)
$$
depending on a parameter vector $\omega$ of arbitrary dimension to
data (one dimensional time series) in the general framework of the

data = deterministic signal + Gaussian noise

in the spirit of
Bretthorst, G. Larry, 1988, "Bayesian Spectrum Analysis and Parameter
Estimation,"
Lecture Notes in Statistics, vol. 48, Springer-Verlag, New York.
The basic parametric model
$$
G_i(t,\omega)=cos(\omega_i t), sin(\omega_i t)
$$
corresponds to classical spectral analysis, however the model can (at least
in principle)
be completely general. The problem is that the models cannot be defined by
the user but
have to be hard coded (in C++ since the computations are substantial).

I plan to include the ability to modify each model by the action of further
parameters as:

time changes: t -> t+omega, t -> omega*t, t -> t^omega
model function change: G(t) -> sign(G(t))*|G(t)|^omega

I plan to include models that can be generated by these actions from trig
functions,
some piecewise linear functions, monomials, and exponential function.
My question is: what further parametric models are of sufficiently general
interest to be
included?


Many thanks,

Michael Meyer

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Using 'dimname names' in aperm() and apply()

2010-07-29 Thread Michael Lachmann
I think that the "dimname names" of tables and arrays could make
aperm() and apply() (and probably some other functions) easier to use.
(dimname names are, for example, created by table() )

The use would be something like:
--
x <-table( from=sample(3,100,rep=T), to=sample(5,100,rep=T))
trans <- x / apply(x,"from",sum)

y <- aperm( trans, c("from","to") )
z <- aperm(y, c("to","from") )

res <-apply( y, "to", sum)
--

This makes the array much easier to handle than having to keep track
which dimension currently means what.

For aperm and apply, the change seems very simple - one new function,
and an additional line in each.
--
dimnum.from.dimnamename <- function(A, dimensions)
{
    if (is.character(dimensions)) {
        n <- names(dimnames(A))
        if (!is.null(n)) {
            dimnum <- seq(along = n)
            names(dimnum) <- n
            dimensions <- dimnum[dimensions]
        }
    }
    dimensions
}



aperm <- function (a, perm, resize = TRUE)
{
if (missing(perm))
perm <- integer(0L)
perm <- dimnum.from.dimnamename( a, perm) # this line was added to aperm
.Internal(aperm(a, perm, resize))
}

apply <-  function (X, MARGIN, FUN, ...)
{
FUN <- match.fun(FUN)
d <- dim(X)
dl <- length(d)
if (dl == 0L)
stop("dim(X) must have a positive length")
ds <- 1L:dl
if (length(oldClass(X)))
X <- if (dl == 2)
as.matrix(X)
else as.array(X)
d <- dim(X)
dn <- dimnames(X)


MARGIN <- dimnum.from.dimnamename( X,MARGIN ) # this line was added to apply

s.call <- ds[-MARGIN]
s.ans <- ds[MARGIN]
d.call <- d[-MARGIN]
d.ans <- d[MARGIN]
dn.call <- dn[-MARGIN]
dn.ans <- dn[MARGIN]
d2 <- prod(d.ans)
if (d2 == 0L) {
newX <- array(vector(typeof(X), 1L), dim = c(prod(d.call),
1L))
ans <- FUN(if (length(d.call) < 2L)
newX[, 1]
else array(newX[, 1L], d.call, dn.call), ...)
return(if (is.null(ans)) ans else if (length(d.ans) <
2L) ans[1L][-1L] else array(ans, d.ans, dn.ans))
}
newX <- aperm(X, c(s.call, s.ans))
dim(newX) <- c(prod(d.call), d2)
ans <- vector("list", d2)
if (length(d.call) < 2L) {
if (length(dn.call))
dimnames(newX) <- c(dn.call, list(NULL))
for (i in 1L:d2) {
tmp <- FUN(newX[, i], ...)
if (!is.null(tmp))
ans[[i]] <- tmp
}
}
else for (i in 1L:d2) {
tmp <- FUN(array(newX[, i], d.call, dn.call), ...)
if (!is.null(tmp))
ans[[i]] <- tmp
}
ans.list <- is.recursive(ans[[1L]])
l.ans <- length(ans[[1L]])
ans.names <- names(ans[[1L]])
if (!ans.list)
ans.list <- any(unlist(lapply(ans, length)) != l.ans)
if (!ans.list && length(ans.names)) {
all.same <- sapply(ans, function(x) identical(names(x),
ans.names))
if (!all(all.same))
ans.names <- NULL
}
len.a <- if (ans.list)
d2
else length(ans <- unlist(ans, recursive = FALSE))
if (length(MARGIN) == 1L && len.a == d2) {
names(ans) <- if (length(dn.ans[[1L]]))
dn.ans[[1L]]
return(ans)
}
if (len.a == d2)
return(array(ans, d.ans, dn.ans))
if (len.a && len.a%%d2 == 0L) {
if (is.null(dn.ans))
    dn.ans <- vector(mode = "list", length(d.ans))
dn.ans <- c(list(ans.names), dn.ans)
return(array(ans, c(len.a%/%d2, d.ans), if (!all(sapply(dn.ans,
is.null))) dn.ans))
}
return(ans)
}
--

Thanks,

Michael


--
Michael Lachmann, Max Planck institute of evolutionary anthropology
Deutscher Platz. 6, 04103 Leipzig, Germany
Tel: +49-341-3550521, Fax: +49-341-3550555

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] problem with dl tag in tools::Rd2HTML

2010-08-05 Thread Michael Lachmann

I think tools::Rd2HTML has a problem with the dl tag.
Under some conditions,  and , and  and  are not nested
correctly.
Here is an example from the "options" doc file:
--


save.defaults, save.image.defaults:
see save.

--
You can see that the  starts, then  starts, then  end the
paragraph, but the  has not ended yet.
I don't really understand in html, but I think the correct way would be
--


save.defaults, save.image.defaults:
see save.

--

Michael
-- 
View this message in context: 
http://r.789695.n4.nabble.com/problem-with-dl-tag-in-tools-Rd2HTML-tp2315499p2315499.html
Sent from the R devel mailing list archive at Nabble.com.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] problem with dl tag in tools::Rd2HTML

2010-08-05 Thread Michael Lachmann


Duncan Murdoch-2 wrote:
> 
> 
> What version are you using?  I don't see that in current R patched.
> 
> 

I see this in version 2.11.1.
This is the code I ran to generate it:
page <- utils:::.getHelpFile(?options)
tools::Rd2HTML(page,out="t.html")

in the generated file, t.html, the first  tag is such a case.

Michael
-- 
View this message in context: 
http://r.789695.n4.nabble.com/problem-with-dl-tag-in-tools-Rd2HTML-tp2315499p2315620.html
Sent from the R devel mailing list archive at Nabble.com.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Assignment of individual values to data frame columns: intentional or unintentional behavior?

2010-08-05 Thread Michael Lachmann


Ulrike Grömping wrote:
> 
> 
> However, given the documentation that partial matching is not used on 
> the left-hand side, I would have expected even more that the assignment
> 
> sw$Fert[1] <- 10
> 
> works differently, because I am using it on the left-hand side. 
> Probably, extraction ([1]) is done first here, so that the right-hand 
> side won. At least, this is very confusing.
> 
> 

I totally agree! I think that 
sw <- data.frame(Fertility=1:5)
sw$Fert[1] <- 10

should work either like

sw$Fert2[1] <- 10

i.e. create new column, containing just 10.
or like

sw$Fertility[1] <- 10

i.e. replace the 1st item in sw$Fertility by 10.
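
As far as I can tell, what actually happens is a mixture of the two: the
partial match succeeds when the right-hand side value is fetched, but not
in the assignment itself, so you get a new column that is a modified copy
of Fertility:

sw <- data.frame(Fertility = 1:5)
sw$Fert[1] <- 10
names(sw)      # "Fertility" "Fert"
sw$Fert        # 10  2  3  4  5
sw$Fertility   # unchanged: 1 2 3 4 5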

-- 
View this message in context: 
http://r.789695.n4.nabble.com/Assignment-of-individual-values-to-data-frame-columns-intentional-or-unintentional-behavior-tp2315105p2315641.html
Sent from the R devel mailing list archive at Nabble.com.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] rbind on data.frame that contains a column that is also a data.frame

2010-08-05 Thread Michael Lachmann

Hi,

The following was already a topic on r-help, but after understanding what is
going on, I think it fits better in r-devel.

The problem is this:
When a data.frame has another data.frame in it, rbind doesn't work well.
Here is an example:
--
> a=data.frame(x=1:10,y=1:10)
> b=data.frame(z=1:10)
> b$a=a
> b
z a.x a.y
1   1   1   1
2   2   2   2
3   3   3   3
4   4   4   4
5   5   5   5
6   6   6   6
7   7   7   7
8   8   8   8
9   9   9   9
10 10  10  10
> rbind(b,b)
Error in `row.names<-.data.frame`(`*tmp*`, value = c("1", "2", "3", "4",  : 
  duplicate 'row.names' are not allowed
In addition: Warning message:
non-unique values when setting 'row.names': ‘1’, ‘10’, ‘2’, ‘3’, ‘4’, ‘5’,
‘6’, ‘7’, ‘8’, ‘9’
--


Looking at the code of rbind.data.frame, the error comes from the   
lines: 
-- 
xij <- xi[[j]] 
if (has.dim[jj]) { 
  value[[jj]][ri, ] <- xij 
  rownames(value[[jj]])[ri] <- rownames(xij)   # <--  problem is here 
} 
-- 
if the rownames() line is dropped, all works well. What this line   
tries to do is to join the rownames of internal elements of the   
data.frames I try to rbind. So the result, in my case should have a   
column 'a', whose rownames are the rownames of the original column 'a'. It   
isn't totally clear to me why this is needed. When would a data.frame   
have different rownames on the inside vs. the outside? 

Notice also that rbind takes into account whether the rownames of the   
data.frames to be joined are simply 1:n, or they are something else.   
If they are 1:n, then the result will have rownames 1:(n+m). If not,   
then the rownames might be kept. 

I think, more consistent would be to replace the lines above with   
something like: 
    if (has.dim[jj]) {
        value[[jj]][ri, ] <- xij
        rnj = rownames(value[[jj]])
        rnj[ri] = rownames(xij)
        rnj = make.unique(as.character(unlist(rnj)), sep = "")
        rownames(value[[jj]]) <- rnj
    }

In this case, the rownames of inside elements will also be joined, but   
in case they overlap, they will be made unique - just as they are for   
the overall result of rbind. A side effect here would be that the   
rownames of matrices will also be made unique, which till now didn't   
happen, and which also doesn't happen when one rbinds matrices that   
have rownames. So it would be better to test above if we are dealing   
with a matrix or a data.frame. 

But most people don't have different rownames inside and outside.   
Maybe it would be best to add a flag as to whether you care or don't   
care about the rownames of internal data.frames... 

But maybe data.frames aren't meant to contain other data.frames?

If instead I do
b=data.frame( z=1:10, a=a) 
then rbind(b,b) works well. In this case the data.frame was converted to its
columns. Maybe
b$a = a 
should do the same?

Michael 
-- 
View this message in context: 
http://r.789695.n4.nabble.com/rbind-on-data-frame-that-contains-a-column-that-is-also-a-data-frame-tp2315682p2315682.html
Sent from the R devel mailing list archive at Nabble.com.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] problem with dl tag in tools::Rd2HTML

2010-08-05 Thread Michael Lachmann


Michael Lachmann wrote:
> 
> 
> Duncan Murdoch-2 wrote:
>> 
>> 
>> What version are you using?  I don't see that in current R patched.
>> 
>> 
> 
> I see this in version 2.11.1.
> This is the code I ran to generate it:
> page <- utils:::.getHelpFile(?options)
> tools::Rd2HTML(page,out="t.html")
> 
> in the generated file, t.html, the first  tag is such a case.
> 
> 

I also see the same issue in 2.12.0 for OSX that I just downloaded.

Michael
-- 
View this message in context: 
http://r.789695.n4.nabble.com/problem-with-dl-tag-in-tools-Rd2HTML-tp2315499p2315709.html
Sent from the R devel mailing list archive at Nabble.com.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] No RTFM?

2010-08-24 Thread Michael Dewey

At 01:08 20/08/2010, Spencer Graves wrote:
 What do you think about adding a "No RTFM" 
policy to the R mailing lists? Per, "http://en.wikipedia.org/wiki/RTFM":


Spencer,
You raise an interesting point but the responses 
to your post remind us that people (and indeed 
whole cultures) are not all situated at the same 
point on the continuum of directness between 
"It's a cow, stupid" and "From this side it looks 
not unlike a cow". The issue of what is offensive 
is even more complex, I remember being taken to 
task on another list for referring to a "rule of thumb".


The thing I find most rude on the list is not the 
occasional abrupt postings by people who are 
obviously having a bad day but the number of 
fairly long exchanges which end unresolved as the 
OP never bothers to post a conclusion and we 
never know whether we solved his/her problem.
I am not asking for thanks but we would all 
benefit from knowing how it all turned out.



The Ubuntu Forums and LinuxQuestions.org, for 
instance, have instituted "no RTFM" policies to 
promote a welcoming atmosphere.[8][9].


RTFM [and] "Go look on google" are two 
inappropriate responses to a question. If you 
don't know the answer or don't wish to help, 
please say nothing instead of brushing off 
someone's question. Politely showing someone how 
you searched or obtained the answer to a 
question is acceptable, even encouraged.

...

If you wish to remind a user to use search tools 
or other resources when they have asked a 
question you feel is basic or common, please be 
very polite. Any replies for help that contain 
language disrespectful towards the user asking 
the question, i.e. "STFU" or "RTFM" are 
unacceptable and will not be tolerated. —Ubuntu Forums



Gavin Simpson and I recently provided examples 
answering a question from "r.ookie" that had 
previously elicited responses, ""You want us to 
read the help page to you?" and "It yet again 
appears that you are asking us to read the help pages for you."



I can appreciate the sentiment in 
fortunes('rtfm'). In this case, however, 
"r.ookie" had RTFM (and said so), but evidently 
the manual was not sufficiently clear.



Best Wishes,
Spencer Graves




Michael Dewey
http://www.aghmed.fsnet.co.uk

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Feature request: put NewEnvironment and R_NewhashedEnv into API

2010-09-03 Thread Michael Lawrence
What about allocSExp(ENVSXP)? Then SET_ENCLOS() to set the parent? Seems to
work for me.

Michael

On Sun, Aug 29, 2010 at 11:02 AM, Oliver Flasch
wrote:

> Hi,
>
> as Seth Falcon in 2006, I also need to create new environments from package
> C code. Unfortunately, both NewEnvironment and R_NewHashedEnv are not yet
> part of the public API, if I understand correctly.
>
> Is it planned to add at least one of these functions to the public API in
> the near future? May I submit a patch? Otherwise I would need to
> re-implement much of the functionality of R environments in my own code.
>
>
> Many thanks and best regards
>
> Oliver
>
> --
> Dipl. Inform. Oliver Flasch,
> Institute of Computer Science,
> Faculty of Computer Science and Engineering Science,
> Cologne University of Applied Sciences,
> Steinmüllerallee 1, 51643 Gummersbach
> phone: +49 171 6447868
> eMail: oliver.fla...@fh-koeln.de
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] what is the best way for an external interface to interact with graphics, libraries

2010-09-08 Thread Michael Lawrence
On Tue, Sep 7, 2010 at 11:34 AM, Simon Urbanek
wrote:

>
> On Sep 7, 2010, at 2:21 PM, ghostwheel wrote:
>
> >
> > Another message about the R to TeXmacs interface.
> >
> > 1. Graphics
> > The TeXmacs interface allows the user to directly insert graphics into
> the
> > session.
> >
> > Since I am not very familiar with programming for R, I implemented the
> > interaction with graphics in a very primitive way. It was two modes of
> > working: with X11, and without (for example when working remotely through
> > ssh without forwarding X11).
> >
> > In both cases the user has to invoke a command, v(), in order to insert
> the
> > current graph into the buffer at the current place.
> >
> > With X11, the way it works is that when v() is invoked I call
> recordPlot(),
> > then open a postscript file, then replayPlot(), and then close the
> > postscript file and insert it into the session.
> >
> > Without X11, I open a postscript file ahead of time, then when v() is
> > called, I close it, and insert it into the session, and then open a new
> > postscript file.
> >
> > Obviously quite primitive.I think ideally would be if everything was
> > transparent to the user - the user does a plot, and the plot is inserted
> > into the buffer right away, and later, updates to the same plot update
> the
> > original plot where it is. But to be able to do that I need to be able to
> > generate the postscript file of the current plot, and be notified somehow
> > whenever the plot changes.
> >
> > Is all that possible? Is there a better way to implement this all?
> >
>
> I don't know the mechanics of the actual "inserting" in TeXmac but it would
> be trivial to simply create a copy of the plot as EPS (or whatever is
> needed) at the time of insertion. See dev.copy2eps() for a function that
> does exactly that.
>
>
>
> > 2. Libraries
> >
> > A remotely related question is this: the interface with TeXmacs generates
> > menus that depend on the currently loaded libraries.
>
> Libraries are not "loaded" (see .libPaths() for handling libraries) - but
> chances are that you meant packages...
>
>
> > I'd like to be able to
> > update the menus whenever a new library is loaded. Is there a possibility
> to
> > have a function called whenever this happens? Or would it be advisable to
> > change the global 'library' function?
> >
>
> I would strongly advise against the latter. A reasonably simple way would
> be to check the search path - if it changed a package has been loaded. A
> natural place to do such a check would be in a top-level task handler, for
> example.
>
>
Another possibility would be the setHook() function.
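
Roughly, the two ideas look like this (an untested sketch -- the callback
name and the .tm_search variable are made up):

## per-package load hook:
setHook(packageEvent("RGtk2", "attach"),
        function(...) message("RGtk2 attached -- rebuild menus here"))

## or Simon's top-level task handler idea, watching search() for changes:
.tm_search <- search()
addTaskCallback(function(expr, value, ok, visible) {
    if (!identical(search(), .tm_search)) {
        .tm_search <<- search()
        ## rebuild the TeXmacs menus here
    }
    TRUE  # returning TRUE keeps the callback registered
}, name = "texmacs-menus")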

Michael

Cheers,
> Simon
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] lapply version with [ subseting - a suggestion

2010-09-21 Thread Michael Lawrence
On Tue, Sep 21, 2010 at 3:55 AM, Vitaly S.  wrote:

>
> Dear R developers,
>
> Reviewing my code, I have realized that about 80% of the time in the lapply
> I
> need to access the names of the objects inside the loop.
>
> In such cases I iterate over indexes or names:
> lapply(names(x), ... [i]),
> lapply(seq_along(x),  ... x[[i]] ... names(x)[i] ), or
> for(i in seq_along(x)) ...
>
> which is rather inconvenient.
>
> How about an argument to lapply which would specify the [ or [[ subsetting
> to use in the splitting of the vector?
> Or maybe a different set of functions, lapply1 and
> sapply1?
>
>
I'm not sure what you want exactly, but  what about just using mapply over
the names and vector elements?
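
Something along these lines (a made-up example) gives you each element
together with its name, without any explicit indexing:

x <- list(a = 1:3, b = 4:6)
mapply(function(nm, el) paste(nm, "has sum", sum(el)), names(x), x)
#              a               b
#   "a has sum 6"  "b has sum 15"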


> I believe this pattern is rather common for other users as well.
>
> Thanks.
> VS.
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [R] R-2.12.0 hangs while loading RGtk2 on FreeBSD

2010-10-22 Thread Michael Lawrence
On Thu, Oct 21, 2010 at 9:42 AM, Rainer Hurling  wrote:

> [moved from R-help]
>
> On 21.10.2010 18:09 (UTC+1), Prof Brian Ripley wrote:
>
>> If you do R CMD INSTALL --no-test-load this will skip the part that is
>> hanging and you can try loading in stages (e.g. dyn.load on the RGtk2.so).
>>
>
> With '--no-test-load' it installs and ends normal. Loading per
> dyn.load("RGtk2.so") works, just as dyn.load("RGtk2.so",F) and
> dyn.load("RGtk2.so",,F). Unloading works, too.
>
> Normal loading via library(RGtk2) within R does not work; R then
> hangs.
>
> It seems the problem is not with the library itself?
>
>
It looks like something is happening when initializing GTK+ and the event
loop. This happens in the function R_gtkInit in Rgtk.c. If you could run R
-d gdb and break on that function, perhaps you could step through until it
hangs.

Thanks,
Michael


>  I think this is rather technical for R-help, so maybe move to R-devel?
>>
>
> I moved to R-devel.
>
>  And can you check the RGtk2 version? A recent but not current version
>> (2.12.17?) did hang initializing Gtk+ on some platforms and Michael
>> Lawrence had to be involved.
>>
>
> I am using RGtk2_2.12.18.tar.gz for month now.
>
>
>  On Thu, 21 Oct 2010, Rainer Hurling wrote:
>>
>>  Am 21.10.2010 16:12 (UTC+1) schrieb Prof Brian Ripley:
>>>
>>>> On Thu, 21 Oct 2010, Rainer Hurling wrote:
>>>>
>>>>  I am working with R-2.12.0 on FreeBSD 9.0-CURRENT for a while now. I
>>>>> successfully installed more than 300 packages (most as dependencies of
>>>>> others).
>>>>>
>>>>> There are two packages I am not able to install: RGtk2 and rggobi.
>>>>>
>>>>> For example rggobi builds fine and after that it wants to load:
>>>>>
>>>>> --
>>>>> # R CMD INSTALL rggobi_2.1.16.tar.gz
>>>>> [..SNIP..]
>>>>> gcc -std=gnu99 -shared -L/usr/local/lib -o rggobi.so RSEval.o brush.o
>>>>> colorSchemes.o conversion.o data.o dataset.o display.o displays.o
>>>>> edges.o ggobi.o identify.o init.o io.o keyHandlers.o longitudinal.o
>>>>> modes.o plot.o plots.o plugins.o print.o session.o smooth.o ui.o
>>>>> utils.o -pthread -L/usr/local/lib -lggobi -lgtk-x11-2.0 -lxml2
>>>>> -lgdk-x11-2.0 -latk-1.0 -lgdk_pixbuf-2.0 -lpangocairo-1.0 -lXext
>>>>> -lXrender -lXinerama -lXi -lXrandr -lXcursor -lXcomposite -lXdamage
>>>>> -lpangoft2-1.0 -lgio-2.0 -lXfixes -lcairo -lX11 -lpango-1.0 -lm
>>>>> -lfreetype -lfontconfig -lgobject-2.0 -lgmodule-2.0 -lgthread-2.0
>>>>> -lglib-2.0
>>>>> installiert nach /usr/local/lib/R/library/rggobi/libs
>>>>> ** R
>>>>> ** data
>>>>> ** moving datasets to lazyload DB
>>>>> ** demo
>>>>> ** preparing package for lazy loading
>>>>> --
>>>>>
>>>>> At this point the install process is hanging, R utilises no more CPU
>>>>> time. Same with package RGtk2.
>>>>>
>>>>> Is this a known error? Please let me know if I can give more
>>>>> information or try something different.
>>>>>
>>>>
>>>> Well, those are exactly the two packages using Gtk+.
>>>>
>>>> There is no known general problem, and as you could have checked from
>>>> the CRAN check pages, those packages install without problems on several
>>>> platforms. (Not Solaris, where ggobi does not install and RGtk2 requires
>>>> gcc, and not x64 Windows where both need to be patched.)
>>>>
>>>> So it does look very like there is a problem with loading against the
>>>> Gtk+ system libraries on your system.
>>>>
>>>
>>> I think you are right. With previous versions of R (until R-2.10.x) I
>>> did not have this hanging when loading RGtk2 ... And I am pretty sure
>>> that I have no problems with gtk2 outside of R on my FreeBSD system.
>>>
>>> In the meantime I found out that the reported loading error of rggobi
>>> is a loading error of RGtk2, which fails (hangs). So there remains
>>> only a loading error with RGtk2. (Because of that I changed the subject.)
>>>
>>> After building/installing RGtk2, there are the following messages:
>>>
>>> ---

[Rd] windows 64-bit package build on 32-bit machine

2010-10-26 Thread Michael Spiegel
Hello,

Is it possible to build a 64-bit package on a 32-bit machine on
windows? I can cross-compile for x86, x86_64, and ppc on a 32-bit OS X
machine.  And it looks like I could build a 32-bit library on a 64-bit
windows machine.  But it doesn't look possible to build a 64-bit
library on a 32-bit windows machine?

Thanks,
--Michael

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] windows 64-bit package build on 32-bit machine

2010-10-26 Thread Michael Spiegel
Hmm.  So our package has no src/Makefile.win and only an
empty configure.win.  We usually build a binary version with R CMD
INSTALL --build.

R --arch x64 CMD INSTALL --build yields the message "The system cannot
find the path specified."

On Tue, Oct 26, 2010 at 10:41 PM, Simon Urbanek
 wrote:
>
> On Oct 26, 2010, at 9:04 PM, Michael Spiegel wrote:
>
>> Hello,
>>
>> Is it possible to build a 64-bit package on a 32-bit machine on
>> windows? I can cross-compile for x86, x86_64, and ppc on a 32-bit OS X
>> machine.  And it looks like I could build a 32-bit library on a 64-bit
>> windows machine.  But it doesn't look possible to build a 64-bit
>> library on a 32-bit windows machine?
>>
>
> Why not? It works for me without problems ...
>
> Cheers,
> Simon
>
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-2.12.0 hangs while loading RGtk2 on FreeBSD

2010-10-27 Thread Michael Lawrence
On Sat, Oct 23, 2010 at 2:49 AM, Rainer Hurling  wrote:

> On 22.10.2010 22:10 (UTC+1), Rainer Hurling wrote:
>
>> On 22.10.2010 16:18 (UTC+2), Rainer Hurling wrote:
>>
>>> On 22.10.2010 14:57 (UTC+1), Michael Lawrence wrote:
>>>
>>>>
>>>>
>>>> On Thu, Oct 21, 2010 at 9:42 AM, Rainer Hurling >>> <mailto:rhur...@gwdg.de>> wrote:
>>>>
>>>> [moved from R-help]
>>>>
>>>> On 21.10.2010 18:09 (UTC+1), Prof Brian Ripley wrote:
>>>>
>>>> If you do R CMD INSTALL --no-test-load this will skip the part
>>>> that is
>>>> hanging and you can try loading in stages (e.g. dyn.load on the
>>>> RGtk2.so).
>>>>
>>>>
>>>> With '--no-test-load' it installs and ends normal. Loading per
>>>> dyn.load("RGtk2.so") works, just as dyn.load("RGtk2.so",F) and
>>>> dyn.load("RGtk2.so",,F). Unloading works, too.
>>>>
>>>> Normal loading over library(RGtk2) within R does not work. R than is
>>>> hanging.
>>>>
>>>> It seems the problem is not with the library itself?
>>>>
>>>>
>>>> It looks like something is happening when initializing GTK+ and the
>>>> event loop. This happens in the function R_gtkInit in Rgtk.c. If you
>>>> could run R -d gdb and break on that function, perhaps you could step
>>>> through until it hangs.
>>>>
>>>
>>> Michael, thank you for answering. As I wrote earlier (on R-help@),
>>> unfortunately I have no experience with debugging (I am not a
>>> programmer). So I would need some more assistence.
>>>
>>> Is there a difference between 'library(RGtk2)' and 'dyn.load(RGtk2)' in
>>> initializing GTK+? I am able to dyn.load, but library does not work.
>>>
>>> After starting with 'R -d gdb' is the following right?
>>>
>>> (gdb) break R_gtkInit
>>> Function "R_gtkInit" not defined.
>>> Make breakpoint pending on future shared library load? (y or [n]) y
>>> Breakpoint 1 (R_gtkInit) pending.
>>>
>>> When I try to proceed, it gives me the following message
>>>
>>> (gdb) run
>>> Starting program: /usr/local/lib/R/bin/exec/R
>>> /libexec/ld-elf.so.1: Shared object "libRblas.so" not found,
>>> required by "R"
>>> Program exited with code 01.
>>>
>>
>> Ok, I am one step further now:
>>
>>
>> (gdb) run
>> Starting program: /usr/local/lib/R/bin/exec/R
>> [..SNIP..]
>>  > library(RGtk2)
>> [New LWP 100174]
>> Breakpoint 2 at 0x318bd490: file Rgtk.c, line 104.
>> Pending breakpoint "R_gtkInit" resolved
>> [New Thread 2f408b00 (LWP 100174)]
>> [Switching to Thread 2f408b00 (LWP 100174)]
>>
>> Breakpoint 2, R_gtkInit (rargc=0x30b11d10, rargv=0x30a98458,
>> success=0x30afbad0) at Rgtk.c:104
>> 104 Rgtk.c: No such file or directory.
>> in Rgtk.c
>> (gdb)
>>
>>
>> What do you suggest I should do next?
>>
>
> Rgtk.c was not found from gdb because RGtk2 was not build with DEBUG=T and
> R_KEEP_PKG_SOURCE=yes.
>
> So this is the next try. I can trace the code until it hangs:
>
>
> library(RGtk2)
> [New LWP 100250]
> Breakpoint 2 at 0x458bd490: file Rgtk.c, line 104.
>
> Pending breakpoint "R_gtkInit" resolved
> [New Thread 4322ef00 (LWP 100250)]
> [Switching to Thread 4322ef00 (LWP 100250)]
>
> Breakpoint 2, R_gtkInit (rargc=0x446d0980, rargv=0x44698618,
> success=0x446d09e0) at Rgtk.c:104
> 104 {
> (gdb) n
> 107   argc = (int) *rargc;
> (gdb) n
> 104 {
> (gdb) n
> 107   argc = (int) *rargc;
> (gdb) n
> 109   if (!gdk_display_get_default()) {
> (gdb) n
> 110 gtk_disable_setlocale();
> (gdb) n
> 111 if (!gtk_init_check(&argc, &rargv)) {
> (gdb) n
> 121 if (!GDK_DISPLAY()) {
> (gdb) n
> 127 addInputHandler(R_InputHandlers,
> ConnectionNumber(GDK_DISPLAY()),
> (gdb) n
> 132 if (!pipe(fds)) {
> (gdb) n
> 133   ifd = fds[0];
> (gdb) n
> 134   ofd = fds[1];
> (gdb) n
> 135   addInputHandler(R_InputHandlers, ifd,
> R_gtk_timerInputHandler, 32);
> (gdb) n
> 133   ifd = fds[0];
> (gdb) n
> 134   ofd = fds[1];
> (gdb) n
> 135   addInputHandler(R_InputHandlers, ifd,
> R_gtk_timerInputHandler, 32);
> (gdb) n
> 136   

Re: [Rd] R-2.12.0 hangs while loading RGtk2 on FreeBSD

2010-10-28 Thread Michael Lawrence
On Wed, Oct 27, 2010 at 12:09 PM, Rainer Hurling  wrote:

> On 27.10.2010 15:07 (UTC+1), Michael Lawrence wrote:
>
>> On Sat, Oct 23, 2010 at 2:49 AM, Rainer Hurling > <mailto:rhur...@gwdg.de>> wrote:
>>On 22.10.2010 22:10 (UTC+1), Rainer Hurling wrote:
>>On 22.10.2010 16:18 (UTC+2), Rainer Hurling wrote:
>>    On 22.10.2010 14:57 (UTC+1), Michael Lawrence wrote:
>>
>>On Thu, Oct 21, 2010 at 9:42 AM, Rainer Hurling
>>mailto:rhur...@gwdg.de>
>><mailto:rhur...@gwdg.de <mailto:rhur...@gwdg.de>>> wrote:
>>
>>[moved from R-help]
>>
>>On 21.10.2010 18:09 (UTC+1), Prof Brian Ripley wrote:
>>
>>If you do R CMD INSTALL --no-test-load this will skip
>>the part
>>that is
>>hanging and you can try loading in stages (e.g. dyn.load
>>on the
>>RGtk2.so).
>>
>>With '--no-test-load' it installs and ends normal.
>>Loading per
>>dyn.load("RGtk2.so") works, just as
>>dyn.load("RGtk2.so",F) and
>>dyn.load("RGtk2.so",,F). Unloading works, too.
>>
>>Normal loading over library(RGtk2) within R does not
>>work. R than is
>>hanging.
>>
>>It seems the problem is not with the library itself?
>>
>>It looks like something is happening when initializing
>>GTK+ and the
>>event loop. This happens in the function R_gtkInit in
>>Rgtk.c. If you
>>could run R -d gdb and break on that function, perhaps
>>you could step
>>through until it hangs.
>>
>>Michael, thank you for answering. As I wrote earlier (on
>>R-help@),
>>unfortunately I have no experience with debugging (I am not a
>>programmer). So I would need some more assistence.
>>
>>Is there a difference between 'library(RGtk2)' and
>>'dyn.load(RGtk2)' in
>>initializing GTK+? I am able to dyn.load, but library does
>>not work.
>>
>>After starting with 'R -d gdb' is the following right?
>>
>>(gdb) break R_gtkInit
>>Function "R_gtkInit" not defined.
>>Make breakpoint pending on future shared library load? (y or
>>[n]) y
>>Breakpoint 1 (R_gtkInit) pending.
>>
>>When I try to proceed, it gives me the following message
>>
>>(gdb) run
>>Starting program: /usr/local/lib/R/bin/exec/R
>>/libexec/ld-elf.so.1: Shared object "libRblas.so" not found,
>>required by "R"
>>Program exited with code 01.
>>
>>Ok, I am one step further now:
>>
>>(gdb) run
>>Starting program: /usr/local/lib/R/bin/exec/R
>>[..SNIP..]
>> > library(RGtk2)
>>[New LWP 100174]
>>Breakpoint 2 at 0x318bd490: file Rgtk.c, line 104.
>>Pending breakpoint "R_gtkInit" resolved
>>[New Thread 2f408b00 (LWP 100174)]
>>[Switching to Thread 2f408b00 (LWP 100174)]
>>
>>Breakpoint 2, R_gtkInit (rargc=0x30b11d10, rargv=0x30a98458,
>>success=0x30afbad0) at Rgtk.c:104
>>104 Rgtk.c: No such file or directory.
>>in Rgtk.c
>>(gdb)
>>
>>What do you suggest I should do next?
>>
>>Rgtk.c was not found from gdb because RGtk2 was not build with
>>DEBUG=T and R_KEEP_PKG_SOURCE=yes.
>>
>>So this is the next try. I can trace the code until it hangs:
>>
>>library(RGtk2)
>>[New LWP 100250]
>>Breakpoint 2 at 0x458bd490: file Rgtk.c, line 104.
>>
>>Pending breakpoint "R_gtkInit" resolved
>>[New Thread 4322ef00 (LWP 100250)]
>>[Switching to Thread 4322ef00 (LWP 100250)]
>>
>>Breakpoint 2, R_gtkInit (rargc=0x446d0980, rargv=0x44698618,
>>success=0x446d09e0) at Rgtk.c:104
>>104 {
>>(gdb) n
>>107   argc = (int) *rargc;
>>(gdb) n
>>104 {
>>(gdb) n
>>107   argc = (int) *rargc;
>>  

Re: [Rd] Scripting SVG with R

2010-10-29 Thread Michael Lawrence
Lots of interesting responses to this, but I would add that the qtbase
package allows for interesting hybrid applications between the
web/javascript and R. Qt includes a WebKit port, which is integrated with
the QtScript module,  a javascript implementation.  With qtbase, one could
hypothetically embed WebKit within R and expose R objects (extending
QObject) to the Javascript context. Thus, one can call R through Javascript,
in a running R session. You can also modify the Javascript code in the page,
making it possible to integrate R with any page off the web.

See:
http://qt.nokia.com/qt-in-use/files/pdf/qt-features-for-hybrid-web-native-application-development

I haven't actually tested all of that with qtbase, but it should work in
theory. Embedding widgets (with R callbacks) into web pages definitely
works.

There is also the QtSvg module, which parses and outputs SVG.

Michael

On Wed, Oct 13, 2010 at 8:30 AM, Wolfgang Huber  wrote:

>
> Since now many browsers support (ECMA/Java-)scripted SVG, I am wondering
> whether there are already any examples of inserting R code into SVG
> documents (or a Javascript canvas?) either directly, or perhaps more likely
> through a JavaScript layer, to dynamically generate graphics or make them
> interactive?
>
> I am aware of the excellent packages gridSVG and SVGAnnotation, which
> facilitate making R-generated SVG plots more interesting either at
> construction time or by postprocessing; the above question is about
> employing R at viewing time.
>
> Best wishes
>
> Wolfgang Huber
> EMBL
> http://www.embl.de/research/units/genome_biology/huber
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] BLAS benchmarks on R 2.12.0

2010-10-31 Thread Michael Spiegel
Hi,

I saw on the mailing list and in the NEWS file that some unsafe math
transformations were disabled for the reference BLAS implementation
that is used in R.  We have a set of performance tests for the OpenMx
library, and some of the tests have a x3-10 slowdown in R 2.12.0
versus 2.11.1.  When I copy the shared library libRblas.0.dylib from
the 2.11.1 installation into the 2.12.0 installation, the slowdown
goes away.  It seems reasonable that BLAS should conform to IEEE
requirements.  For the purposes of our library, we are considering two
options but I need some advice on both choices:

1) Compile the reference BLAS implementation with unsafe optimizations
and include it as a part of the OpenMx library.  I can't seem to
reproduce the speed of the 2.11.1 reference BLAS library.  What
compiler, which version of the compiler, and what flags are used when
an R binary is distributed? My test machine is a Mac Pro, that may
change the answer.

2) Is there any support for adding a libRblas.unsafe.dylib shared
library in the R installation, much like libRblas.veclib.dylib is
currently included in OS X binaries? Then we could just change the
OpenMx shared library to use the unsafe library when we give it to
users.  We currently change the OpenMx shared library to use the
reference blas implementation, because it is faster than the veclib
implementation for small matrices.

Thank you!
--Michael
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] BLAS benchmarks on R 2.12.0

2010-11-01 Thread Michael Spiegel
Hi Andrew,

In the majority of use cases of our package, we end up doing lots and
lots of matrix operations on small matrices, as opposed to matrix
operations on large matrices.  The optimized BLAS libraries are
usually optimized for large matrices.  The reference implementation is
faster than any of the veclib, Atlas, or Goto BLAS implementations;
I've tested all of them on our performance test suite.

--Michael

On Mon, Nov 1, 2010 at 6:07 PM, Andrew Piskorski  wrote:
> On Sun, Oct 31, 2010 at 12:41:24PM -0400, Michael Spiegel wrote:
>
>> 1) Compile the reference BLAS implementation with unsafe optimizations
>> and include it as a part of the OpenMx library.
>
> If BLAS speed is important to you, why are you even trying to use the
> slow reference BLAS library at all, rather than one of the faster
> optimized BLAS libraries (Atlas, Goto, AMD, Intel, etc.)?
>
> --
> Andrew Piskorski 
> http://www.piskorski.com/
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] R package BibTex entries: looking for a more general solution

2010-11-03 Thread Michael Friendly

== Summary ==
* Problem: BibTeX entries extracted from R packages via citation()
require too much manual editing to be of general use.
* Proposal: Date: fields should be made mandatory in package DESCRIPTION
files, perhaps beginning with warnings from R CMD check.
* Proposal: Package authors should be encouraged to use a (new)
Contributors: field in the DESCRIPTION file rather than packing all
information into the Author: field, which at present cannot often be
parsed by BibTeX.

* Files: All test files referred to here can be found at

http://euclid.psych.yorku.ca/SCS/Private/Rbibs/

== Details ==
Around 16 Dec. 2009, I queried R-help about automating the extraction of 
citation()s from R packages. The stimulus
was that some journals, notably JSS, now require a reference and 
citation of every R package mentioned,
and it is a pain to create these manually, no less maintain them for 
current versions.


The result of that query was a function, Rpackage.bibs() by Achim 
Zeileis that I have been using ever since.

Code in: http://euclid.psych.yorku.ca/SCS/Private/Rbibs/Rpackages.bib.R
On one current system I get the following:

> Rpackage.bibs(file="Rpackages-R.2.11.1.bib")
Converted 230 of 230 package citations to BibTex
Results written to file Rpackages-R.2.11.1.bib
Warning messages:
1: In citation(x) :
no date field in DESCRIPTION file of package 'codetools'
2: In citation(x) :
no date field in DESCRIPTION file of package 'gridBase'
3: In citation(x) : no date field in DESCRIPTION file of package 'iplots'
>
See:
http://euclid.psych.yorku.ca/SCS/Private/Rbibs/Rpkg-test.pdf
for the result of processing this .bib file with latex/bibtex using the 
jss.bst bibliography style


I'm writing to R-Devel because the DESCRIPTION and inst/CITATION files 
in R packages provide the
basic data used in citation() and any methods based on this, and yet the 
information in these files is
often insufficient to generate well-formed BibTeX entries for use in 
vignettes and publications.


One easy case is illustrated above, where 3 packages have no Date: field,
so the BibTeX entry gets an empty

year = {},

and references get printed as Murrell, P () for gridBase. (In my
original test under R 2.9.1, there were ~20 such warnings.) Thus, I
propose that Date: be a required field in DESCRIPTION files, and that
R CMD check complain if this is not found.

The more difficult case has to do with the Author: field in the
DESCRIPTION file (when no CITATION file is present). People can write
whatever they want here, and the result looks sort of OK when printed
by citation(), but confuses BibTeX mightily. One example:

> citation("akima")
To cite package ‘akima’ in publications use:

Fortran code by H. Akima R port by Albrecht Gebhardt aspline function
by Thomas Petzoldt  enhancements and
corrections by Martin Maechler (2009). akima: Interpolation of
irregularly spaced data. R package version 0.5-4.
http://CRAN.R-project.org/package=akima

A BibTeX entry for LaTeX users is

@Manual{,
title = {akima: Interpolation of irregularly spaced data},
author = {Fortran code by H. Akima R port by Albrecht Gebhardt aspline 
function by Thomas Petzoldt  
enhancements and corrections by Martin Maechler},

year = {2009},
note = {R package version 0.5-4},
url = {http://CRAN.R-project.org/package=akima},
}

ATTENTION: This citation information has been auto-generated from the
package DESCRIPTION file and may need manual editing, see
‘help("citation")’ .
>

Yes, the ATTENTION note does say that manual editing may be necessary,
but I think a worthy goal would be to try to reduce the need for this.

One simple way to do that would be to support an extra Contributors:
field in the DESCRIPTION file, so that Author: can be more cleanly
separated for the purpose of creating well-structured BibTeX.
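
For the akima example above, I imagine something along these lines
(purely illustrative -- the exact field name and syntax are of course up
for discussion):

Author: Albrecht Gebhardt
Contributors: H. Akima (Fortran code), Thomas Petzoldt (aspline function),
    Martin Maechler (enhancements and corrections)

so that citation() could emit author = {Albrecht Gebhardt} and move the
rest into the note field.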

Perhaps others have better ideas.

-Michael

--
Michael Friendly Email: friendly AT yorku DOT ca
Professor, Psychology Dept.
York University  Voice: 416 736-5115 x66249 Fax: 416 736-5814
4700 Keele StreetWeb:http://www.datavis.ca
Toronto, ONT  M3J 1P3 CANADA

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

