Re: [Rd] Building R under Linux - library dependencies

2016-09-08 Thread Paweł Piątkowski
> You are not enumerating your trade-offs very well. There are natural
> conflicts. What is it you really want?
> 
> - Being able to pre-build and distribute?  We have done that since the late
> 1990s with .deb packages.
> 
> - Being able to install with minimal size?  Have you queried your users?  I
> note that among the Docker containers for R (in the "Rocker" project Carl and
> I run) the _larger_ ones containing RStudio plus optionally "lots from
> hadley" plus optionally lots of rOpenSci tend to be _more_ popular (for ease
> of installation of the aggregate).
> 
> And while I share the overall sentiment a little bit, you have to realize that
> it is 2016 with the corresponding bandwidth and storage:
> 
>   edd@max:~$ du -csh /usr/local/lib/R/site-library/
>   1.5G    /usr/local/lib/R/site-library/
>   1.5G    total
>   edd@max:~$
> 
> And that is _outside_ of R itself, or the (numerous) other shared libraries.

OK, to be honest, it was more a proof of concept than a specific idea. Other 
interpreted and VM-based languages have robust app deployment systems with a 
smaller footprint, so I thought that it would be nice to have something similar 
in R.
Maybe you are right and neither R developers nor users actually need it.

Thanks for the discussion,
-p-

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Fwd: Re: RSiteSearch, sos, rdocumentation.org, ...?

2016-09-08 Thread Jonathan Baron

OK.  It is sort of fixed and sort of works.

We'll keep it for now, but this is not going to work forever. When
namazu fails completely I will not have the time to install a new
search engine.

One option is to use Google. For a site like this, I think they will
want some money, but I'm not sure, and I do not have the time to deal
with it.

We have over 10,000 packages now. I wonder if searching all help files
is really helpful anymore.

Jon

On 09/07/16 22:06, Jonathan Baron wrote:

Don't do anything yet. I may have found the problem by accident.

I tried to use the computer for something else, and it was being
drastically slowed down by some leftover processes, which turned out
to be xlhtml. That is something that converts Excel files. Apparently,
some Excel files got into the libraries, and they were causing the
indexing to hang completely.

I am now running everything again, starting from scratch, and it might
work. (I'm doing it wrong, but it is 3/4 done. I will do it right
tomorrow, if it works overnight.)

Jon

On 09/07/16 16:53, Jonathan Baron wrote:

Spencer,

Thanks for the quick reply.

I am open to someone who knows Perl getting an account on my site and
trying to get it working. It will probably involve fixing more than
one thing, as mknmz depends on some perl modules that also generate
errors.

My main contribution is figuring out how to extract the html help
files and vignettes only, with some help from R developers and Fedora
maintainers. Here is the trick, for someone who wants to do it:

m0 <- rownames(installed.packages())
m1 <- m0[which(m0 %in% needed.packages)]
source("http://bioconductor.org/biocLite.R")
update.packages(oldPkgs=m1,repos=biocinstallRepos())
update.packages(dependencies=FALSE,INSTALL_opts=c("--no-configure","--no-test-load","--no-R","--no-clean-on-error","--no-libs","--no-data","--no-demo","--no-exec","--html"),repos=biocinstallRepos(),ask=F)
m3 <- new.packages()
install.packages(m3,dependencies=FALSE,INSTALL_opts=c("--no-configure","--no-test-load","--no-R","--no-clean-on-error","--no-libs","--no-data","--no-demo","--no-exec","--html"),repos=biocinstallRepos())

Note 1: The first 4 lines are designed to deal with a list of the
packages that you actually use. These can be eliminated if you don't
use R on the same machine. The last 3 lines are all you need.

Note 2: This works on Fedora, but I think that the Fedora maintainers
of R have set some defaults that are helpful.
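
The option vector is repeated verbatim in both calls above, so it could be kept in one place. A minimal sketch, assuming the same flags as the recipe; `html_only_opts` and `install_help_only()` are hypothetical names that are not part of the original recipe, and the helper is only defined here, not run:

```r
# Flags copied from the recipe above: skip everything except the HTML help.
html_only_opts <- c("--no-configure", "--no-test-load", "--no-R",
                    "--no-clean-on-error", "--no-libs", "--no-data",
                    "--no-demo", "--no-exec", "--html")

# Hypothetical helper (illustration only): install the given packages
# with the help-only flags, without pulling in their dependencies.
install_help_only <- function(pkgs, repos = getOption("repos")) {
  install.packages(pkgs, dependencies = FALSE,
                   INSTALL_opts = html_only_opts, repos = repos)
}
```

Whether an install with these flags succeeds for every package may depend on platform defaults, as Note 2 suggests.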

Jon

On 09/07/16 15:41, Spencer Graves wrote:

Hello, All:


  Jonathan Baron is "giving up" maintaining the RSiteSearch database.


  This breaks three things:  (1) The R Site Search web service that 
Baron has maintained.  (2) The RSiteSearch function in the utils 
package.  (3) The sos package, for which I'm the maintainer and lead 
author.



  Might someone else be willing to take these over?


  For me, the "findFn" capability with "writeFindFn2xls" is the 
fastest literature search for anything statistical.  However, I don't 
have the resources to take over the management of Baron's R Site Search 
database.



  He's provided a great service for the R community for many 
years.  I hope we can find a way to keep the system maintained. Failing 
that, I could use help in adapting the sos package to another database.



  Thanks,
  Spencer Graves


-------- Forwarded Message --------
Subject: Re: RSiteSearch, sos, rdocumentation.org, ...?
Date: Wed, 7 Sep 2016 16:15:22 -0400
From: Jonathan Baron
To: Spencer Graves
CC: Jonathan Baron, chris.is@gmail.com, i...@datacamp.com, Sundar Dorai-Raj, webmaster@www.r-project-org




R site search has stopped working. The indexing script, mknmz, failed
to complete. It has been producing more and more errors and warnings,
since it has not been updated for 5 years.

I am giving up on this site. I have too many other things to do aside
from finding bugs in programs written in languages I don't know (Perl),
or setting up an alternative search engine.

Please inform anyone else who needs to be informed.

I cannot find the email of the www.r-project.org webmaster, so I'm
taking a stab. There are several links to this site in those pages.

Jon
--
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
Editor: Judgment and Decision Making (http://journal.sjdm.org)




Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays

2016-09-08 Thread Martin Maechler
> robin hankin 
> on Thu, 8 Sep 2016 10:05:21 +1200 writes:

> Martin I'd like to make a comment; I think that R's
> behaviour on 'edge' cases like this is an important thing
> and it's great that you are working on it.

> I make heavy use of zero-extent arrays, chiefly because
> the dimnames are an efficient and logical way to keep
> track of certain types of information.

> If I have, for example,

> a <- array(0,c(2,0,2))
> dimnames(a) <- list(name=c('Mike','Kevin'),NULL,item=c("hat","scarf"))


> Then in R-3.3.1, 70800 I get

> > a> 0
> logical(0)

> But in 71219 I get

> > a> 0
> , , item = hat


> name
> Mike
> Kevin

> , , item = scarf


> name
> Mike
> Kevin

> (which is an empty logical array that holds the names of the people and
> their clothes). I find the behaviour of 71219 very much preferable because
> there is no reason to discard the information in the dimnames.

Thanks a lot, Robin, (and Oliver) !

Yes, the above is such a case where the new behavior makes much sense.
And this behavior remains identical after the 71222 amendment.
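
For anyone who wants to check this on their own build, here is a small sketch using Robin's array. The only assumption is the version test: the dim-preserving result is taken to hold from R 3.4.0 on, as the comments in the quoted script state.

```r
a <- array(0, c(2, 0, 2))
dimnames(a) <- list(name = c("Mike", "Kevin"), NULL, item = c("hat", "scarf"))

res <- a > 0

# True in both old and new R: the result is an empty logical.
stopifnot(is.logical(res), length(res) == 0)

if (getRversion() >= "3.4.0") {
  # New behaviour: the comparison keeps dim and dimnames of 'a'.
  stopifnot(identical(dim(res), dim(a)),
            identical(dimnames(res), dimnames(a)))
} else {
  # Old behaviour: a plain logical(0) with the array attributes dropped.
  stopifnot(is.null(dim(res)))
}
```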

Martin

> Best wishes
> Robin




> On Wed, Sep 7, 2016 at 9:49 PM, Martin Maechler 

> wrote:

>> > Martin Maechler 
>> > on Tue, 6 Sep 2016 22:26:31 +0200 writes:
>> 
>> > Yesterday, changes to R's development version were committed, relating
>> > to arithmetic, logic ('&' and '|') and
>> > comparison/relational ('<', '==') binary operators
>> > which in NEWS are described as
>> 
>> > SIGNIFICANT USER-VISIBLE CHANGES:
>> 
>> > [.]
>> 
>> > • Arithmetic, logic (‘&’, ‘|’) and comparison (aka
>> > ‘relational’, e.g., ‘<’, ‘==’) operations with arrays now
>> > behave consistently, notably for arrays of length zero.
>> 
>> > Arithmetic between length-1 arrays and longer non-arrays had
>> > silently dropped the array attributes and recycled.  This
>> > now gives a warning and will signal an error in the future,
>> > as it has always for logic and comparison operations in
>> > these cases (e.g., compare ‘matrix(1,1) + 2:3’ and
>> > ‘matrix(1,1) < 2:3’).
>> 
>> > As the above "visually suggests" one could think of the changes
>> > falling mainly into two groups,
>> > 1) <0-extent array>  (op) 
>> > 2) <1-extent array>  (arith)  
>> 
>> > These changes are partly non-back compatible and may break
>> > existing code.  We believe that the internal consistency gained
>> > from the changes is worth the few places with problems.
>> 
>> > We expect some package maintainers (10-20, or even more?) need
>> > to adapt their code.
>> 
>> > Case '2)' above mainly results in a new warning, e.g.,
>> 
>> >> matrix(1,1) + 1:2
>> > [1] 2 3
>> > Warning message:
>> > In matrix(1, 1) + 1:2 :
>> > dropping dim() of array of length one.  Will become ERROR
>> >>
>> 
>> > whereas '1)' gives errors in cases the result silently was a
>> > vector of length zero, or also keeps array (dim & dimnames) in
>> > cases these were silently dropped.
>> 
>> > The following is a "heavily" commented  R script showing (all ?)
>> > the important cases with changes :
>> 
>> > 
>> 
>> 
>> > (m <- cbind(a=1[0], b=2[0]))
>> > Lm <- m; storage.mode(Lm) <- "logical"
>> > Im <- m; storage.mode(Im) <- "integer"
>> 
>> > ## 1. -
>> > try( m & NULL ) # in R <= 3.3.x :
>> > ## Error in m & NULL :
>> > ##  operations are possible only for numeric, logical or complex types
>> > ##
>> > ## gives 'Lm' in R >= 3.4.0
>> 
>> > ## 2. -
>> > m + 2:3 ## gave numeric(0), now remains matrix identical to  m
>> > Im + 2:3 ## gave integer(0), now remains matrix identical to Im (integer)
>> 
>> > m > 1  ## gave logical(0), now remains matrix identical to Lm (logical)
>> > m > 0.1[0] ##  ditto
>> > m > NULL   ##  ditto
>> 
>> > ## 3. -
>> > mm <- m[,c(1:2,2:1,2)]
>> > try( m == mm ) ## now gives error   "non-conformable arrays",
>> > ## but gave logical(0) in R <= 3.3.x
>> 
>> > ## 4. -
>> > str( Im + NULL)  ## gave "num", now gives "int"
>> 
>> > ## 5. -
>> > ## special case for arithmetic w/ length-1 array
>> > (m1 <- matrix(1,1,1, dimnames=list("Ro","col")))
>> > (m2 <- matrix(1,2,1, dimnames=list(c("A","B"),"col")))
>> 
>> > m1 + 1:2  # ->  2:3  but now with warning to  "become ERROR"
>> > tools::assertError(m1 & 1:2)# ERR: dims [product 1] do not match the length of object [2]
>> > tools::assertError(m1

Re: [Rd] Fwd: Re: RSiteSearch, sos, rdocumentation.org, ...?

2016-09-08 Thread Michael Dewey
I have mixed feelings about this. I used to find the sos package very 
useful when I first started using it but as the number of packages has 
grown I now find it gives me a huge list which takes a lot of time to 
digest. This may of course reflect my rudimentary search term selection 
skills.


Michael

On 08/09/2016 11:01, Jonathan Baron wrote:

OK.  It is sort of fixed and sort of works.

We'll keep it for now, but this is not going to work forever. When
namazu fails completely I will not have the time to install a new
search engine.

One option is to use Google. For a site like this, I think they will
want some money, but I'm not sure, and I do not have the time to deal
with it.

We have over 10,000 packages now. I wonder if searching all help files
is really helpful anymore.

Jon


Re: [Rd] Fwd: Re: RSiteSearch, sos, rdocumentation.org, ...?

2016-09-08 Thread Dirk Eddelbuettel

On 8 September 2016 at 06:01, Jonathan Baron wrote:
| We have over 10,000 packages now. I wonder if searching all help files
| is really helpful anymore.

Yes it is. I go to http://rdocumentation.org a lot for quick look-ups.

So thanks to Datacamp for running that.  

Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



Re: [Rd] Fwd: Re: RSiteSearch, sos, rdocumentation.org, ...?

2016-09-08 Thread Spencer Graves


On 9/8/2016 3:30 AM, Joris Meys wrote:
>
> Hi Jonathan,
>
> I have neither the resources nor the skills to take over, but whatever 
> happens I want to thank you for all the work. Too often people forget 
> that all these nice tools keep working due to the devotion of people 
> like you.
>
> So thank you!
>


   I concur.  People all over the world live better today, because R 
made it easier for others to solve problems -- and Jon made a 
substantive contribution to that.  Spencer


> Cheers
> Joris
>
>
> On 8 Sep 2016 04:08, "Jonathan Baron" wrote:
>
> Don't do anything yet. I may have found the problem by accident.
>
> I tried to use the computer for something else, and it was being
> drastically slowed down by some leftover processes, which turned out
> to be xlhtml. That is something that converts Excel files. Apparently,
> some Excel files got into the libraries, and they were causing the
> indexing to hang completely.
>
> I am now running everything again, starting from scratch, and it might
> work. (I'm doing it wrong, but it is 3/4 done. I will do it right
> tomorrow, if it works overnight.)
>
> Jon
>

Re: [Rd] Fwd: Re: RSiteSearch, sos, rdocumentation.org, ...?

2016-09-08 Thread Spencer Graves



On 9/8/2016 5:01 AM, Jonathan Baron wrote:

OK.  It is sort of fixed and sort of works.

We'll keep it for now, but this is not going to work forever. When
namazu fails completely I will not have the time to install a new
search engine.

One option is to use Google. For a site like this, I think they will
want some money, but I'm not sure, and I do not have the time to deal
with it.

We have over 10,000 packages now. I wonder if searching all help files
is really helpful anymore.



  The fastest way I know to do a literature search for anything 
statistical uses the sos package as follows:



      1.  docPages <- findFn('search string') or findFn('{search string}')

      2.  installPackages(docPages) # this installs packages to enable a
          more complete package summary

      3.  writeFindFn2xls(docPages) # this creates an Excel file with
          3 sheets:  a package summary, the findFn table, and the call.

      4.  Then I open the Excel file and review the package summary
          sheet.  I prioritize my search from there based on the number
          and strength of matches, how close it sounds to what I want,
          the date of the last update, whether it has a vignette, and
          the authors and maintainers.



  There may be a better way to do this using Google or something 
else.  I'd be pleased if someone else could enlighten me.  I admit to 
being biased:  I'm the lead author and maintainer of "sos". However, I 
don't want to perpetuate a tool that has outlived its usefulness, and 
I'm too blind to see that!



  Spencer



Jon


Re: [Rd] Fwd: Re: RSiteSearch, sos, rdocumentation.org, ...?

2016-09-08 Thread Jonathan Baron

I looked at rdocumentation.org. At first I thought it was a superior
replacement for namazu, but after I tried a few things I decided that
it wasn't. I could not find any documentation about how to search, and
the various things I tried seemed to yield very strange responses,
e.g., a search for "Hayes mediation bootstrap" gave me mostly
functions that had nothing to do with the search except for the word
"bootstrap".

So I managed to fix the major Perl module errors (one of which was
quite bothersome although not fatal ... yet). And I figured out a new
way to create the indices that namazu uses; the new way is more
selective. And things seem to work now. Aside from the problems I just
fixed, this is not hard to maintain, so I will continue.

It also seems that someone IS sort of maintaining namazu,
sporadically. There is a Fedora rpm for it. That was how I found out
how to fix the Perl module.

But I did end up spending a few hours on this on a day when I am
behind writing action letters, etc. etc. And ultimately I cannot do
this forever and would love it if someone else took it over, or at
least helped, with an account on my server.

Jon

On 09/08/16 06:36, Dirk Eddelbuettel wrote:


On 8 September 2016 at 06:01, Jonathan Baron wrote:
| We have over 10,000 packages now. I wonder if searching all help files
| is really helpful anymore.

Yes it is. I go to http://rdocumentation.org a lot for quick look-ups.

So thanks to Datacamp for running that.  


Dirk

--
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org


--
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
Editor: Judgment and Decision Making (http://journal.sjdm.org)



Re: [Rd] Fwd: Re: RSiteSearch, sos, rdocumentation.org, ...?

2016-09-08 Thread Jonathan Baron

On 09/08/16 07:09, John Merrill wrote:

Given Google's commitment to R, I don't think that they'd be at all averse
to supporting a custom search box on the package page. It might well be a
good thing for "someone" to examine the API for setting up such a page and
to investigate how to mark the main CRAN page as searchable.


The main CRAN page is not ideal. We need to be able to search the help
files. My site has only the html help files for each package (except
the ones I use, which are fully installed), so someone should
re-create that. The CRAN page has a "Reference manual" in pdf for
every package, but the individual functions are not separated.

But, yes, Google would work, even for my page. And the sos package
would have to be modified for that. As I said, I'm not going to do
this. But I would welcome it.

Jon
--
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
Editor: Judgment and Decision Making (http://journal.sjdm.org)



Re: [Rd] Fwd: Re: RSiteSearch, sos, rdocumentation.org, ...?

2016-09-08 Thread Dirk Eddelbuettel

Jonathan,

FWIW I mentored a Google Summer of Code student (who was more than highly
self-sufficient and needed next to no help, apart from some small R packaging
tricks) as part of the Xapian project in order to write RXapian:

   https://github.com/amandaJayanetti/RXapian

which is an R interface to the Xapian index engine.

I don't know much about these index generators, but Xapian [1] appears to be
free, open-source, current, maintained, powerful, and used.  From what I
gather you are still betting on an older (and as I seem to recall,
deprecated) technology. There may be more tears ahead.

The other tip would be to get in touch with Gabor who as part of r-hub has
indices for just about anything, and (as he is a generation younger than
Spencer, you or me) also provides current (ie JSON over REST) interfaces.

Dirk

[1] https://xapian.org/

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



Re: [Rd] Fwd: Re: RSiteSearch, sos, rdocumentation.org, ...?

2016-09-08 Thread Kevin Coombes
Would it make sense to recreate the "searchable R help pages" by feeding 
them all into elasticsearch, which will automatically index them and 
also provides an extensive (HTTP+JSON-based) API to perform complex 
searches?


On 9/8/2016 10:31 AM, Jonathan Baron wrote:

On 09/08/16 07:09, John Merrill wrote:

Given Google's commitment to R, I don't think that they'd be at all averse
to supporting a custom search box on the package page. It might well be a
good thing for "someone" to examine the API for setting up such a page and
to investigate how to mark the main CRAN page as searchable.


The main CRAN page is not ideal. We need to be able to search the help
files. My site has only the html help files for each package (except
the ones I use, which are fully installed), so someone should
re-create that. The CRAN page has a "Reference manual" in pdf for
every package, but the individual functions are not separated.

But, yes, Google would work, even for my page. And the sos package
would have to be modified for that. As I said, I'm not going to do
this. But I would welcome it.

Jon





Re: [Rd] Fwd: Re: RSiteSearch, sos, rdocumentation.org, ...?

2016-09-08 Thread John Merrill
That would work, although it would entail standing up a server to front the
elasticsearch module.  That strikes me as a huge investment of time which
would, in addition, recreate the current key man risk.

On Thu, Sep 8, 2016 at 7:51 AM, Kevin Coombes wrote:

> Would it make sense to recreate the "searchable R help pages" by feeding
> them all into elasticsearch, which will automatically index them and also
> provides an extensive (HTTP+JSON-based) API to perform complex searches?
>
>
>



Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays

2016-09-08 Thread Gabriel Becker
Martin,

Like Robin and Oliver I think this type of edge-case consistency is
important and that it's fantastic that R-core - and you personally - are
willing to tackle some of these "gotcha" behaviors. "Little" stuff like
this really does combine to go a long way to making R better and better.

I do wonder a bit about the

x = 1:2
y = NULL
x < y

case.

Returning a logical of length 0 is more backwards compatible, but is it
ever what the author actually intended? I have trouble thinking of a case
where that less-than didn't carry an implicit assumption that y was
non-NULL.  I can say that in my own code, I've never hit that behavior in a
case that wasn't an error.

My vote (unless someone else points out a compelling use for the behavior)
is for this to throw an error. As a developer, I'd rather things like this
break so the bug in my logic is visible, rather than propagating as the
0-length logical is &'ed or |'ed with other logical vectors, or used to
subset, or (in the case it should be length 1) passed to if() (if throws an
error now, but the rest would silently "work").
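
A minimal sketch of the silent propagation described above. The variable
names (cmp, other, both, picked) are illustrative only and not from the
thread; the comparison with NULL behaves this way in released versions of R.

```r
x <- 1:2
y <- NULL                      # stands in for a value that was never set

cmp <- x < y                   # logical(0): no error, no warning
stopifnot(identical(cmp, logical(0)))

# The empty result combines silently with other logical vectors ...
other <- c(TRUE, FALSE)
both <- cmp & other            # still logical(0)

# ... and silently selects nothing when used to subset.
picked <- x[cmp]               # integer(0)

# Only if() objects, because it needs exactly one TRUE/FALSE value.
msg <- tryCatch(if (cmp) "yes" else "no",
                error = function(e) conditionMessage(e))
print(msg)
```

Everything except the if() call runs to completion, which is why the
original mistake can surface far from where it was made.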

Best,
~G

On Thu, Sep 8, 2016 at 3:49 AM, Martin Maechler 
wrote:

> > robin hankin 
> > on Thu, 8 Sep 2016 10:05:21 +1200 writes:
>
> > Martin I'd like to make a comment; I think that R's
> > behaviour on 'edge' cases like this is an important thing
> > and it's great that you are working on it.
>
> > I make heavy use of zero-extent arrays, chiefly because
> > the dimnames are an efficient and logical way to keep
> > track of certain types of information.
>
> > If I have, for example,
>
> > a <- array(0,c(2,0,2))
> > dimnames(a) <- list(name=c('Mike','Kevin'),
> NULL,item=c("hat","scarf"))
>
>
> > Then in R-3.3.1, 70800 I get
>
> a> 0
> > logical(0)
> >>
>
> > But in 71219 I get
>
> a> 0
> > , , item = hat
>
>
> > name
> > Mike
> > Kevin
>
> > , , item = scarf
>
>
> > name
> > Mike
> > Kevin
>
> > (which is an empty logical array that holds the names of the people
> and
> > their clothes). I find the behaviour of 71219 very much preferable
> because
> > there is no reason to discard the information in the dimnames.
>
> Thanks a lot, Robin, (and Oliver) !
>
> Yes, the above is such a case where the new behavior makes much sense.
> And this behavior remains identical after the 71222 amendment.
>
> Martin
>
> > Best wishes
> > Robin
>
>
>
>
> > On Wed, Sep 7, 2016 at 9:49 PM, Martin Maechler <
> maech...@stat.math.ethz.ch>
> > wrote:
>
> >> > Martin Maechler 
> >> > on Tue, 6 Sep 2016 22:26:31 +0200 writes:
> >>
> >> > Yesterday, changes to R's development version were committed,
> >> relating
> >> > to arithmetic, logic ('&' and '|') and
> >> > comparison/relational ('<', '==') binary operators
> >> > which in NEWS are described as
> >>
> >> > SIGNIFICANT USER-VISIBLE CHANGES:
> >>
> >> > [.]
> >>
> >> > • Arithmetic, logic (‘&’, ‘|’) and comparison (aka
> >> > ‘relational’, e.g., ‘<’, ‘==’) operations with arrays now
> >> > behave consistently, notably for arrays of length zero.
> >>
> >> > Arithmetic between length-1 arrays and longer non-arrays had
> >> > silently dropped the array attributes and recycled.  This
> >> > now gives a warning and will signal an error in the future,
> >> > as it has always for logic and comparison operations in
> >> > these cases (e.g., compare ‘matrix(1,1) + 2:3’ and
> >> > ‘matrix(1,1) < 2:3’).
> >>
> >> > As the above "visually suggests" one could think of the changes
> >> > falling mainly two groups,
> >> > 1) <0-extent array>  (op) 
> >> > 2) <1-extent array>  (arith)  
> >>
> >> > These changes are partly non-back compatible and may break
> >> > existing code.  We believe that the internal consistency gained
> >> > from the changes is worth the few places with problems.
> >>
> >> > We expect some package maintainers (10-20, or even more?) need
> >> > to adapt their code.
> >>
> >> > Case '2)' above mainly results in a new warning, e.g.,
> >>
> >> >> matrix(1,1) + 1:2
> >> > [1] 2 3
> >> > Warning message:
> >> > In matrix(1, 1) + 1:2 :
> >> > dropping dim() of array of length one.  Will become ERROR
> >> >>
> >>
> >> > whereas '1)' gives errors in cases the result silently was a
> >> > vector of length zero, or also keeps array (dim & dimnames) in
> >> > cases these were silently dropped.
> >>
> >> > The following is a "heavily" commented  R script showing (all ?)
> >> > the important cases with changes :
> >>
> >> > 
> >> 
> >>
> >> > (m <- cbind(a=1[0], b=2[0]))
> >> > Lm <- m; storage.mode(Lm) <- "logical"
> >> > Im <- m; storage.mod

Re: [Rd] mget call can trigger C stack usage error

2016-09-08 Thread Gabriel Becker
Alexandre,

AFAICS, this code actually causes infinite recursion, and here's why:


   1. formals() returns the formals of the function identified by
   sys.function(sys.parent()), which ends up being print.new, whose first
   argument is x
   2. mget looks for the symbol x in envir = as.environment(-1L), which ends
   up being the evaluation frame for print.new [1]
   3. x in that environment resolves to the object you are trying to print
   4. print() is called on that object, and dispatches to print.new() ...
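
The four steps above can be sketched without actually triggering the
recursion. This is an illustration under assumptions, not the poster's code:
the lookup is written as mget("x", envir = environment()) rather than via
formals(), and unclass() is just one possible way to break the cycle.

```r
foo <- structure(list(x = 1), class = "new")

print.new <- function(x, ...) {
  # mget() looks up "x" in this call's frame and returns list(x = <object>).
  found <- mget("x", envir = environment())
  # The element is the original object, class attribute included, so
  # print(found) would dispatch to print.new() again and never terminate.
  stopifnot(inherits(found$x, "new"))
  # Illustrative way out: strip the class before printing.
  print(lapply(found, unclass))
  invisible(x)
}

print(foo)
```

With the class removed, the inner print() falls through to the default list
printing and the call returns normally.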



[1]

> debug(mget)

> foo

**

[1] "envir"  "ifnotfound" "inherits"   "mode"   "x"

Browse[2]> *envir*

**

Browse[2]> sys.frames()

[[1]]




*[[2]]*

**


**


Browse[2]> sys.calls()

[[1]]

function (x, ...)

UseMethod("print")(x)


*[[2]]*

*print.new(x)*


**


Browse[2]> class(envir$x)

[1] *"new"*

Hope that helps.
~G

On Mon, Sep 5, 2016 at 4:48 PM, Alexandre Courtiol <
alexandre.court...@gmail.com> wrote:

> Hi all, not sure if you will call this a bug or something else, but the
> following silly call triggers a low-level error:
>
> foo <- list(x=1)
> class(foo) <- "new"
> print.new <- function(x, ...) print(mget(names(formals())))
> foo
>
> > Error: C stack usage  7969412 is too close to the limit
>
>
>
> --
> Alexandre Courtiol
>
> http://sites.google.com/site/alexandrecourtiol/home
>
> *"Science is the belief in the ignorance of experts"*, R. Feynman
>
>



-- 
Gabriel Becker, PhD
Associate Scientist (Bioinformatics)
Genentech Research



Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays

2016-09-08 Thread William Dunlap via R-devel
Shouldn't binary operators (arithmetic and logical) throw an error
when one operand is NULL (or another type that doesn't make sense)?  This is
a different case than a zero-length operand of a legitimate type.  E.g.,
 any(x < 0)
should return FALSE if x is number-like and length(x)==0 but give an error
if x is NULL.

I.e., I think the type check should be done before the length check.
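
A rough sketch of "type check before length check" at user level. The helper
any_negative() is hypothetical (it is not part of R), and is.numeric() is
only an approximation of "number-like".

```r
# Hypothetical helper (not part of R): reject operands of a type that makes
# no sense for a comparison before looking at their length.
any_negative <- function(x) {
  if (!is.numeric(x)) {
    stop("'x' must be numeric, got ", class(x)[1L])
  }
  any(x < 0)   # a zero-length numeric gives FALSE here
}

any_negative(numeric(0))            # FALSE: legitimate type, length zero
res <- tryCatch(any_negative(NULL), # NULL is rejected by the type check
                error = function(e) "error")
res
```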


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Sep 8, 2016 at 8:43 AM, Gabriel Becker  wrote:

> Martin,
>
> Like Robin and Oliver I think this type of edge-case consistency is
> important and that it's fantastic that R-core - and you personally - are
> willing to tackle some of these "gotcha" behaviors. "Little" stuff like
> this really does combine to go a long way to making R better and better.
>
> I do wonder a  bit about the
>
> x = 1:2
>
> y = NULL
>
> x < y
>
> case.
>
> Returning a logical of length 0 is more backwards compatible, but is it
> ever what the author actually intended? I have trouble thinking of a case
> where that less-than didn't carry an implicit assumption that y was
> non-NULL.  I can say that in my own code, I've never hit that behavior in a
> case that wasn't an error.
>
> My vote (unless someone else points out a compelling use for the behavior)
> is for the to throw an error. As a developer, I'd rather things like this
> break so the bug in my logic is visible, rather than  propagating as the
> 0-length logical is &'ed or |'ed with other logical vectors, or used to
> subset, or (in the case it should be length 1) passed to if() (if throws an
> error now, but the rest would silently "work").
>
> Best,
> ~G
>

Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays

2016-09-08 Thread Gabriel Becker
On Thu, Sep 8, 2016 at 10:05 AM, William Dunlap  wrote:

> Shouldn't binary operators (arithmetic and logical) should throw an error
> when one operand is NULL (or other type that doesn't make sense)?  This is
> a different case than a zero-length operand of a legitimate type.  E.g.,
>  any(x < 0)
> should return FALSE if x is number-like and length(x)==0 but give an error
> if x is NULL.
>
Bill,

That is a good point. I can see the argument for this in the case that the
non-zero length is 1. I'm not sure which is better though. If we switch
any() to all(), things get murky.

Mathematically, all(x<0) is TRUE if x is length 0 (as are all(x==0), and
all(x>0)), but the likelihood of this being a thought-bug on the author's
part is exceedingly high, imho. So the desirable behavior seems to depend
on the angle we look at it from.

My personal opinion is that x < y with length(x)==0 should fail if
length(y) > 1, at least, and I'd be for it being an error even if y is
length 1, though I do acknowledge this is more likely (though still quite
unlikely imho) to be the intended behavior.
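
To make the proposal concrete, here is a sketch of what such a rule could
look like as a user-level wrapper. lt_strict() is a hypothetical name, and
this is not how R's `<` behaves; the length-1 case is left at the current
behaviour.

```r
# Hypothetical wrapper around `<` (not part of R): a zero-length operand
# compared with an operand of length > 1 is an error.
lt_strict <- function(x, y) {
  n <- c(length(x), length(y))
  if (min(n) == 0L && max(n) > 1L) {
    stop("zero-length operand compared with an operand of length ", max(n))
  }
  x < y
}

lt_strict(1:3, 2)                   # ordinary recycling still works
lt_strict(numeric(0), 1)            # logical(0), as in current R
res <- tryCatch(lt_strict(numeric(0), 1:3),
                error = function(e) "error")
res
```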

~G

>
> I.e., I think the type check should be done before the length check.
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>

Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays

2016-09-08 Thread William Dunlap via R-devel
Prior to the mid-1990s, S did "length-0 OP length-n -> rep(NA, n)" and it
was changed to "length-0 OP length-n -> length-0" to avoid lots of problems
like any(x<0) being NA when length(x)==0.  Yes, people could code defensively
by putting lots of if(length(x)==0)... in their code, but that is tedious and
error-prone and creates really ugly code.
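
A small illustration of the difference between the two rules. The line
marked "old_style" only imitates the result the older rule would have given
for a length-1 operand; it is not a reconstruction of S, and the function
count_negative() is made up for the example.

```r
x <- numeric(0)

# Current rule: length-0 OP length-n gives length 0, so any() sees nothing.
now <- any(x < 0)                 # FALSE

# The older rule would have produced NAs instead; imitated here by hand.
old_style <- any(rep(NA, 1))      # NA

# With the current rule no defensive length check is needed:
count_negative <- function(v) sum(v < 0)
count_negative(x)                 # 0
count_negative(c(-1, 2, -3))      # 2
```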

Is your suggestion to leave the length-0 OP length-1 case as it is but make
length-0 OP length-two-or-higher an error or warning (akin to the length-2
OP length-3 case)?

By the way, all(numeric(0)<0) is TRUE, as is all(numeric()>0), by De
Morgan's rule, but that is not really relevant here.
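
For reference, the identity can be checked directly. The helper name
de_morgan_holds() is made up for this sketch.

```r
empty <- numeric(0)

# all() over nothing is TRUE, any() over nothing is FALSE ...
stopifnot(all(empty < 0), all(empty > 0), !any(empty < 0))

# ... which is what De Morgan's rule requires: all(p) equals !any(!p).
de_morgan_holds <- function(p) identical(all(p), !any(!p))
de_morgan_holds(empty < 0)
de_morgan_holds(c(TRUE, FALSE))
```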


Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Sep 8, 2016 at 10:22 AM, Gabriel Becker 
wrote:

>
>
> On Thu, Sep 8, 2016 at 10:05 AM, William Dunlap  wrote:
>
>> Shouldn't binary operators (arithmetic and logical) should throw an error
>> when one operand is NULL (or other type that doesn't make sense)?  This is
>> a different case than a zero-length operand of a legitimate type.  E.g.,
>>  any(x < 0)
>> should return FALSE if x is number-like and length(x)==0 but give an
>> error if x is NULL.
>>
> Bill,
>
> That is a good point. I can see the argument for this in the case that the
> non-zero length is 1. I'm not sure which is better though. If we switch
> any() to all(), things get murky.
>
> Mathematically, all(x<0) is TRUE if x is length 0 (as are all(x==0), and
> all(x>0)), but the likelihood of this being a thought-bug on the author's
> part is exceedingly high, imho. So the desirable behavior seems to depend
> on the angle we look at it from.
>
> My personal opinion is that x < y with length(x)==0 should fail if
> length(y) > 1, at least, and I'd be for it being an error even if y is
> length 1, though I do acknowledge this is more likely (though still quite
> unlikely imho) to be the intended behavior.
>
> ~G
>
>>
Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays

2016-09-08 Thread Paul Gilbert



On 09/08/2016 01:22 PM, Gabriel Becker wrote:
> On Thu, Sep 8, 2016 at 10:05 AM, William Dunlap  wrote:
>
>> Shouldn't binary operators (arithmetic and logical) should throw an error
>> when one operand is NULL (or other type that doesn't make sense)?  This is
>> a different case than a zero-length operand of a legitimate type.  E.g.,
>>  any(x < 0)
>> should return FALSE if x is number-like and length(x)==0 but give an error
>> if x is NULL.
>
> Bill,
>
> That is a good point. I can see the argument for this in the case that the
> non-zero length is 1. I'm not sure which is better though. If we switch
> any() to all(), things get murky.
>
> Mathematically, all(x<0) is TRUE if x is length 0 (as are all(x==0), and
> all(x>0)), but the likelihood of this being a thought-bug on the author's
> part is exceedingly high, imho.

I suspect there may be more R users than you think that understand and
use vacuously true in code. I don't really like the idea of turning a
perfectly good and properly documented mathematical test into an error
in order to protect against a possible "thought-bug".

Paul

> So the desirable behavior seems to depend
> on the angle we look at it from.
>
> My personal opinion is that x < y with length(x)==0 should fail if
> length(y) > 1, at least, and I'd be for it being an error even if y is
> length 1, though I do acknowledge this is more likely (though still quite
> unlikely imho) to be the intended behavior.
>
> ~G




Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays

2016-09-08 Thread robin hankin
Could we take a cue from min() and max()?

> x <- 1:10
> min(x[x>7])
[1] 8
> min(x[x>11])
[1] Inf
Warning message:
In min(x[x > 11]) : no non-missing arguments to min; returning Inf
>

As ?min says, this is implemented to preserve transitivity, and this
makes a lot of sense.
I think the issuing of a warning here is a good compromise; I can
always turn off warnings if I want.

I find this behaviour of min() and max() to be annoying in the *right*
way: it annoys me precisely when I need to be
annoyed, that is, when I haven't thought through the consequences of
sending zero-length arguments.
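
A short sketch of why the warning is easy to live with: it is an ordinary
condition that the caller can intercept or suppress. The handler below is
illustrative only; the last line checks the identity that I understand ?min
to be referring to.

```r
x <- 1:10

# Intercept the warning instead of letting it print.
res <- withCallingHandlers(
  min(x[x > 11]),
  warning = function(w) {
    message("caught: ", conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
res                              # Inf

# Inf is neutral for min(), so nesting an empty min() changes nothing:
same <- min(x, suppressWarnings(min(integer(0)))) == min(x, integer(0))
same                             # TRUE
```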


On Fri, Sep 9, 2016 at 6:00 AM, Paul Gilbert  wrote:
>
>
> On 09/08/2016 01:22 PM, Gabriel Becker wrote:
>>
>> On Thu, Sep 8, 2016 at 10:05 AM, William Dunlap  wrote:
>>
>>> Shouldn't binary operators (arithmetic and logical) should throw an error
>>> when one operand is NULL (or other type that doesn't make sense)?  This
>>> is
>>> a different case than a zero-length operand of a legitimate type.  E.g.,
>>>  any(x < 0)
>>> should return FALSE if x is number-like and length(x)==0 but give an
>>> error
>>> if x is NULL.
>>>
>> Bill,
>>
>> That is a good point. I can see the argument for this in the case that the
>> non-zero length is 1. I'm not sure which is better though. If we switch
>> any() to all(), things get murky.
>>
>> Mathematically, all(x<0) is TRUE if x is length 0 (as are all(x==0), and
>> all(x>0)), but the likelihood of this being a thought-bug on the author's
>> part is exceedingly high, imho.
>
>
> I suspect there may be more R users than you think that understand and use
> vacuously true in code. I don't really like the idea of turning a perfectly
> good and properly documented mathematical test into an error in order to
> protect against a possible "thought-bug".
>
> Paul
>
>
>> So the desirable behavior seems to depend
>> on the angle we look at it from.
>>
>> My personal opinion is that x < y with length(x)==0 should fail if
>> length(y) > 1, at least, and I'd be for it being an error even if y is
>> length 1, though I do acknowledge this is more likely (though still quite
>> unlikely imho) to be the intended behavior.
>>
>> ~G

Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays

2016-09-08 Thread Radford Neal
Regarding Martin Maechler's proposal:

   Arithmetic between length-1 arrays and longer non-arrays had
   silently dropped the array attributes and recycled.  This now gives
   a warning and will signal an error in the future, as it has always
   for logic and comparison operations

For example, matrix(1,1,1) + (1:2) would give a warning/error.

I think this might be a mistake.

The potential benefits of this would be detection of some programming
errors, and increased consistency.  The downsides are breaking
existing working programs, and decreased consistency.

Regarding consistency, the overall R philosophy is that attaching an
attribute to a vector doesn't change what you can do with it, or what
the result is, except that the result (often) gets the attributes
carried forward.  By this logic, adding a 'dim' attribute shouldn't
stop you from doing arithmetic (or comparisons) that you otherwise
could.

But maybe 'dim' attributes are special?  Well, they are in some
circumstances, and have to be when they are intended to change the
behaviour, such as when a matrix is used as an index with [.

But in many cases at present, 'dim' attributes DON'T stop you from
treating the object as a plain vector - for example, one is allowed 
to do matrix(1:4,2,2)[3], and a<-numeric(10); a[2:5]<-matrix(1,2,2).
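
A minimal sketch of the two expressions just mentioned, using base R only (the names m, third and a are chosen here for illustration). In both cases the 'dim' attribute is simply ignored and the matrix is handled as a plain vector in column-major order:

```r
# A 2x2 matrix holds its elements column by column: 1, 2, 3, 4.
m <- matrix(1:4, 2, 2)

# Single-index subsetting ignores 'dim' and picks the third element.
third <- m[3]

# Assigning a matrix into a slice of a plain vector also ignores 'dim':
# the four ones are copied into positions 2 to 5, and 'a' stays a vector.
a <- numeric(10)
a[2:5] <- matrix(1, 2, 2)
```

Here third is the integer 3, and a remains a dimensionless vector of length 10.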

So it may make more sense to move towards consistency in the
permissive direction, rather than the restrictive direction.  That
would mean allowing matrix(1,1,1)<(1:2), and maybe also things
like matrix(1,2,2)+(1:8).

Obviously, a change that removes error conditions is much less likely
to produce backwards-compatibility problems than a change that gives
errors for previously-allowed operations.

And I think there would be some significant problems. In addition to
the 10-20+ packages that Martin expects to break, there could be quite
a bit of user code that would no longer work - scripts for analysing
data sets that used to work, but now don't with the latest version.

There are reasons to expect such problems.  Some operations such as
vector dot products using %*% produce results that are always scalar,
but are formed as 1x1 matrices.  One can anticipate that many people
have not been getting rid of the 'dim' attribute in such cases, when
doing so hasn't been necessary in the past.
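
To illustrate the kind of code I have in mind (a sketch only; x, d and s are names made up for this example), a dot product computed with %*% comes back as a 1x1 matrix, and the 'dim' attribute stays on unless one removes it explicitly, for instance with drop():

```r
x <- c(1, 2, 3)

# The dot product is mathematically a scalar, but %*% returns a 1x1 matrix.
d <- x %*% x
d_dim <- dim(d)

# drop() removes extents of length one, leaving a plain length-1 vector.
s <- drop(d)
```

Code that keeps using d rather than s in later arithmetic with longer vectors is what the proposed change would affect.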

Regarding the 0-length vector issue, I agree with other posters that
after a<-numeric(0), it has to be allowable to write a<1.  To not
allow this would be highly destructive of code reliability.  And for
a similar reason, after a<-c(), a<1 needs to be allowed, which means
NULL<1 should be allowed (giving logical(0)), since c() is NULL.
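
For concreteness, a small sketch of these zero-length comparisons in base R (the names a, b, r1 and r2 are only for illustration), together with how any() and all() treat the empty result:

```r
a <- numeric(0)
r1 <- a < 1          # zero-length comparison gives logical(0)

b <- c()             # c() with no arguments is NULL
r2 <- b < 1          # NULL is treated like a zero-length vector here

# Reductions over an empty logical vector:
any_r1 <- any(r1)    # FALSE: no element is TRUE
all_r1 <- all(r1)    # TRUE: vacuously, no element is FALSE
```

This is why code such as any(x < 0) works without a separate length(x) == 0 check.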

   Radford Neal



Re: [Rd] R (development) changes in arith, logic, relop with (0-extent) arrays

2016-09-08 Thread Martin Maechler
Thank you, Gabe and Bill,

for taking up the discussion.

> William Dunlap 
> on Thu, 8 Sep 2016 10:45:07 -0700 writes:

> Prior to the mid-1990s, S did "length-0 OP length-n -> rep(NA, n)" and it
> was changed to "length-0 OP length-n -> length-0" to avoid lots of problems
> like any(x<0) being NA when length(x)==0.  Yes, people could code
> defensively by putting lots of if(length(x)==0)... in their code, but that
> is tedious and error-prone and creates really ugly code.

Yes, so actually, basically

 length-0 OP   -> length-0

Now the case of NULL that Bill mentioned.
I agree that NULL is not at all the same thing as double(0) or logical(0),
*but* there have been quite a few cases where NULL is the
result of operations where "for consistency" double(0) / logical(0) should have
been returned, and there are users who use NULL as the equivalent
of those, e.g., by initializing a (to be grown, yes, very inefficient!)
vector with NULL instead of with, say, double(0).

For these reasons, many operations that expect a "number-like"
(includes logical) atomic vector have treated NULL as such...
*and* parts of the {arith/logic/relop} OPs have done so already
in R "forever".
I still would argue that for these OPs, treating NULL as logical(0) {which
then may be promoted by the usual rules} is a good thing.
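
A rough sketch of the usage pattern meant here (v, w and z are illustrative names only): a vector grown from NULL ends up identical to one grown from double(0), and arithmetic on NULL gives a zero-length numeric result:

```r
# Growing from NULL (inefficient, but common in user code).
v <- NULL
for (i in 1:3) v <- c(v, i * 2)

# Growing from double(0) gives the same result.
w <- double(0)
for (i in 1:3) w <- c(w, i * 2)

same <- identical(v, w)

# Arithmetic treats NULL like a zero-length "number-like" vector.
z <- NULL + 1
```

Both loops produce the numeric vector 2 4 6, and z is numeric with length 0.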


> Is your suggestion to leave the length-0 OP length-1 case as it is but make
> length-0 OP length-two-or-higher an error or warning (akin to the length-2
> OP length-3 case)?

That's exactly one thing the current changes eliminated:
arithmetic (only; not logic, or relop) did treat the length-1
case (for arrays!) differently from the length-GE-2 case.  And I strongly
believe that this is very wrong and counter to the predominant
recycling rules in (S and) R.


> By the way, the all(numeric(0)<0) is TRUE, as is all(numeric()>0), by de
> Morgan's rule, but that is not really relevant here.



> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com

> On Thu, Sep 8, 2016 at 10:22 AM, Gabriel Becker 
> wrote:

>> 
>> 
>> On Thu, Sep 8, 2016 at 10:05 AM, William Dunlap wrote:
>> 
>>> Shouldn't binary operators (arithmetic and logical) throw an error
>>> when one operand is NULL (or other type that doesn't make sense)?  This is
>>> a different case than a zero-length operand of a legitimate type.  E.g.,
>>> any(x < 0)
>>> should return FALSE if x is number-like and length(x)==0 but give an
>>> error if x is NULL.
>>> 
>> Bill,
>> 
>> That is a good point. I can see the argument for this in the case that
>> the non-zero length is 1. I'm not sure which is better though. If we switch
>> any() to all(), things get murky.
>> 
>> Mathematically, all(x<0) is TRUE if x is length 0 (as are all(x==0), and
>> all(x>0)), but the likelihood of this being a thought-bug on the author's
>> part is exceedingly high, imho. So the desirable behavior seems to depend
>> on the angle we look at it from.
>> 
>> My personal opinion is that x < y with length(x)==0 should fail if
>> length(y) > 1, at least, and I'd be for it being an error even if y is
>> length 1, though I do acknowledge this is more likely (though still quite
>> unlikely imho) to be the intended behavior.
>> 
>> ~G
>> 
>>> 
>>> I.e., I think the type check should be done before the length check.
>>> 
>>> 
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>> 
>>> On Thu, Sep 8, 2016 at 8:43 AM, Gabriel Becker 
>>> wrote:
>>> 
 Martin,
 
 Like Robin and Oliver I think this type of edge-case consistency is
 important and that it's fantastic that R-core - and you personally - are
 willing to tackle some of these "gotcha" behaviors. "Little" stuff like
 this really does combine to go a long way to making R better and better.
 
 I do wonder a bit about the
 
 x = 1:2
 
 y = NULL
 
 x < y
 
 case.
 
 Returning a logical of length 0 is more backwards compatible, but is it
 ever what the author actually intended? I have trouble thinking of a case
 where that less-than didn't carry an implicit assumption that y was
 non-NULL.  I can say that in my own code, I've never hit that behavior in a
 case that wasn't an error.
 
 My vote (unless someone else points out a compelling use for the behavior)
 is for this to throw an error. As a developer, I'd rather things like this
 break so the bug in my logic is visible, rather than propagating as the
 0-length logical is &'ed or |'ed with other logical vectors, or used to
 subset, or (in the case it should be length 1) passed to if(