Re: [Rd] agrep bug

2018-06-19 Thread Kurt Hornik
> Kolter, Andreas writes:

> Sorry, I don't understand how to file a bug properly. Nonetheless I
> want to report this one because it is still in the code after so many
> years.

Thanks.  This is now fixed in the trunk with c74916.

Best
-k

> This bug still exists:

> https://stackoverflow.com/questions/15871702/difficulties-with-agrep-fixed-f

> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



Re: [Rd] aic() component in GLM-family objects

2018-06-19 Thread Suharto Anggono Suharto Anggono via R-devel
In R, family objects have had an aic component since version 0.62. There is no
aic component in family in R 0.61.3.

Looking at blame,
https://github.com/wch/r-source/blame/tags/R-0-62/src/library/base/R/family.R ,
the aic component in family was introduced in svn revision 640
(https://github.com/wch/r-source/commit/ac666741679b50bb1dfb5ce631717b375119f6ab):
using aic(.) [Jim Lindsey]; use switch() rather than many if else else.. (MM)

The components of family have been documented since R 2.3.0.
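
As a quick check in a current R session, the sketch below looks for the aic
component in a family object and calls it directly. The data values are made
up for illustration, and the comparison against dpois() reflects how
poisson()$aic behaves in present-day R, which is not necessarily how it was
written in R 0.62.

```r
# Inspect a family object and call its aic component (illustrative values).
fam <- poisson()
has_aic <- "aic" %in% names(fam)

# The aic component is called as aic(y, n, mu, wt, dev) and returns minus
# twice the log-likelihood; glm.fit() adds 2 * rank to get the reported AIC.
y  <- c(1, 3, 2)
mu <- c(1.5, 2.5, 2)
wt <- rep(1, 3)
aic_val <- fam$aic(y, n = rep(1, 3), mu = mu, wt = wt, dev = NULL)

# Same quantity computed by hand from the Poisson density.
by_hand <- -2 * sum(dpois(y, mu, log = TRUE) * wt)
```

Comparing aic_val with by_hand (for example via all.equal()) shows that the
component is a plain function of the fitted means and weights.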


> Ben Bolker 
> on Sun, 17 Jun 2018 11:40:38 -0400 writes:

> FWIW p. 206 of the White Book gives the following for
> names(binomial()): family, names, link, inverse, deriv,
> initialize, variance, deviance, weight.

>   So $aic wasn't there In The Beginning.  I haven't done
> any more archaeology to try to figure out when/by whom it
> was first introduced ...

Thank you Ben.

I think I was already suggesting that it was by Simon and Ross
and we cannot know who of the two.

>  Section 6.3.3, on extending families, doesn't give any
> other relevant info.

> A patch for src/library/stats/man/family.Rd below: please
> check what I've said about $aic and $mu.eta to make sure
> they're correct!  I can submit this to the r bug list if
> preferred.

I've spent quite some time checking this - to some extent.

Thank you for the patch. I will use an even slightly extended
version ((and using the correct '\eqn{\eta}{eta}' )).

Thank you indeed.
Martin



Re: [Rd] readLines function with R >= 3.5.0

2018-06-19 Thread Jennifer Lyon
Hi Michael:

I can confirm Martin's comment. I tested my software with r-devel (r74914)
and it works, while with r-patched (r74914) it does not work (it hangs, as
it did in R 3.5.0). I apologize for it taking so long for me to test this,
but is there any chance this fix could make it into R 3.5.1?

Thanks.

Jen.

On Wed, Jun 13, 2018 at 6:24 AM, Michael Lawrence  wrote:

> Are you sure it's not available in patched? It's definitely in the
> source since 6/1.
>
> Michael
>
>
> On Wed, Jun 13, 2018 at 2:19 AM, Martin Maechler
>  wrote:
> >> Michael Lawrence
> >> on Tue, 12 Jun 2018 19:27:49 -0700 writes:
> >
> > > Hi Jen, This was already resolved for R 3.5.1 by just
> > > disabling buffering on terminal file connections like stdin.
> >
> > and before R 3.5.1 exists, *and*
> > as the change is also not yet available in R patched (!)
> > this means using a version of
> > "R-devel", e.g. for Windows available from
> >https://cloud.r-project.org/bin/windows/base/rdevel.html
> >
> > Martin
> >
> > > Sounds like you might want to be running a web service or
> > > something instead though.
> >
> > > Michael
> >
> > > On Tue, Jun 12, 2018 at 4:46 PM, Jennifer Lyon
> > >  wrote:
> > >> Hi:
> > >>
> > >> I have also just stumbled into this bug. Unfortunately, I
> > >> can not change the data my program receives from
> > >> stdin. My code runs in a larger system and stdin is sent
> > >> to a Docker container running my R code. The protocol is
> > >> I read a line, readLines("stdin", n=1), do some actions,
> > >> send output on stdout, and wait for the next set of data.
> > >> I don't have control over this protocol, so I can't use
> > >> the ^D workaround.
> > >>
> > >> I am open for other workaround suggestions. The single
> > >> line is actually JSON and can be quite large. If there
> > >> isn't something else cleaner, I am going to try
> > >> readChar() in a while loop looking for \n but I'm
> > >> guessing that would likely be too slow.  I am open to
> > >> other workaround solutions. For the moment I have
> > >> reverted back to R 3.4.4.
> > >>
> > >> Thanks for any suggestions.
> > >>
> > >> Jen.
> > >>
> > >>
> > >> Martin Maechler on Mon, 28 May 2018 10:28:01 +0200 writes:
> > >> Ralf Stubner on Fri, 25 May 2018 19:18:58 +0200 writes:
> > >>
> > >> >> Dear all, I would like to draw your attention to this
> > >> >> question on SO:
> > >> >>
> > >> >> https://stackoverflow.com/questions/50372043/readlines-function-with-new-version-of-r
> > >> >>
> > >> >> Based on the OP's code I used the script
> > >> >>
> > >> >> ###
> > >> >> create_matrix <- function() {
> > >> >>   cat("Write the numbers of vertices: ")
> > >> >>   user_input <- readLines("stdin", n=1)
> > >> >>   user_input <- as.numeric(user_input)
> > >> >>   print(user_input)
> > >> >> }
> > >> >> create_matrix()
> > >> >> ###
> > >> >>
> > >> >> and called it with "R -f " from the command line.
> > >> >>
> > >> >> With 'R version 3.4.4 (2018-03-15) -- "Someone to Lean On"' the
> > >> >> script prints the inputted number as expected. With both
> > >> >> 'R version 3.5.0 (2018-04-23) -- "Joy in Playing"' and
> > >> >> 'R Under development (unstable) (2018-05-19 r74746) --
> > >> >> "Unsuffered Consequences"' the script does not continue after
> > >> >> inputting a number.
> > >>
> > >> > I can confirm.
> > >> > It "works" if, in addition to [Enter] (i.e., EOL), you also
> > >> > "send" an EOF -- on Unix-alikes via Ctrl-D.
> > >>
> > >> > The same happens if you use 'Rscript '
> > >>
> > >> > I'm not the expert here, but am close to sure that we (R core)
> > >> > did not intend this change when fixing other somewhat subtle
> > >> > bugs in Rscript / 'R -f'.
> > >>
> > >> > Martin Maechler
> > >>
> > >> The same behavior occurs in regular R; no need for a script etc.
> > >>
> > >> > str(readLines("stdin", n=1))
> > >>
> > >> then in addition to the input you need to "give" an EOF (Ctrl-D)
> > >> in R >= 3.5.0.
> > >>
> > >> Interestingly, everything works fine if you use stdin() instead
> > >> of "stdin":
> > >>
> > >> > rr <- readLines(stdin(), n=1)
> > >> foo
> > >> > rr
> > >> [1] "foo"
> > >> >
> > >>
> > >> So, for now use stdin(), which is much clearer than the string
> > >> "stdin" anyway.
> > >>
> > >> Martin Maechler
> > >>
> >
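
For the read-a-line, respond, wait-for-more protocol described in the quoted
message, one possible sketch is to open a single connection once and read one
line per iteration. Whether a given connection blocks correctly on R 3.5.0 is
exactly what this thread is about, so this illustrates the pattern only and is
not a confirmed workaround. The names serve_lines and handler are hypothetical.

```r
# Read one line at a time from an already-open connection, hand each line
# to a handler, and write the handler's result to stdout.
serve_lines <- function(con, handler) {
  n_handled <- 0L
  repeat {
    line <- readLines(con, n = 1L)
    if (length(line) == 0L) break        # zero lines returned: end of input
    cat(handler(line), "\n", sep = "")
    n_handled <- n_handled + 1L
  }
  invisible(n_handled)
}

# Demonstration on an in-memory connection so the sketch runs anywhere.
demo_con <- textConnection(c('{"id": 1}', '{"id": 2}'))
handled <- serve_lines(demo_con, function(line) nchar(line))
close(demo_con)
```

In a script, the connection could instead be created with con <- file("stdin")
followed by open(con). Passing the string "stdin" to readLines() makes it open
and close a fresh connection on every call, which an explicitly opened
connection avoids.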

Re: [Rd] Bug 16719: kruskal.test documentation for formula

2018-06-19 Thread Thomas Levine
Thomas Levine writes:
> I submit a couple options for addressing bug 16719: kruskal.test
> documentation for formula.
> https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=16719
>
> disallow-character.diff changes the documentation and error message
> to indicate that factors are accepted.
>
> allow-character.diff changes the kruskal.test functions to convert
> character vectors to factors; documentation is updated accordingly.
>
> I tested the updated functions with the examples in example.R. It is
> based on the examples in the bug report.
>
> If there is interest in applying either patch, especially the latter,
> I want first to test the change on lots of existing programs that call
> kruskal.test, to see if it causes any regressions. Is there a good place
> to look for programs that use particular R functions?
>
> I am having trouble building R, so I have so far tested these changes
> only by patching revision 74631 (SVN head) and sourcing the resulting
> kruskal.test.R in R 3.4.1 on OpenBSD 6.2. I thus have not tested the
> R documentation files.
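
To make the issue concrete: is.finite() is FALSE for every element of a
character vector, which is why a character g trips the "all group levels must
be finite" check in the revision being patched. Below is a minimal sketch with
made-up data; it converts the groups to a factor first, which should work
whichever patch, if any, is applied.

```r
# Made-up measurements in three groups labelled by character strings.
x <- c(2.9, 3.0, 2.5, 2.6, 3.2, 3.8, 2.7, 4.0, 2.4, 2.8, 3.4, 3.7)
g <- rep(c("a", "b", "c"), each = 4)

# is.finite() returns FALSE for all character elements.
finite_flags <- is.finite(g)

# Converting to a factor before the call sidesteps the finiteness check.
res <- kruskal.test(x, factor(g))
```

The returned object is of class "htest", with degrees of freedom equal to the
number of groups minus one.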

I thought it was important to test the changes on lots of existing
programs that call kruskal.test, to see if it causes any regressions.

CRAN testing

I downloaded all CRAN packages and checked whether they contained the
fixed expression "kruskal.test". (See "Makefile" and "all-kruskal.r".)

I subsequently tested on all packages in CRAN that mentioned
"kruskal.test".
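
The actual procedure is in the Makefile and all-kruskal.r mentioned above,
which are not reproduced here. As a rough sketch of the same idea, the
hypothetical helper mentions_string() below reports whether any file under a
package directory contains a fixed string. Using fixed = TRUE keeps the "." in
"kruskal.test" from acting as a regular-expression wildcard.

```r
# TRUE if any file under pkg_dir contains the literal string `needle`.
mentions_string <- function(pkg_dir, needle = "kruskal.test") {
  files <- list.files(pkg_dir, recursive = TRUE, full.names = TRUE)
  hits <- vapply(files, function(f) {
    lines <- readLines(f, warn = FALSE)
    any(grepl(needle, lines, fixed = TRUE, useBytes = TRUE))
  }, logical(1))
  any(hits)
}

# Tiny demonstration with two fake "packages" in a temporary directory.
root <- tempfile("pkgs")
dir.create(root)
dir.create(file.path(root, "pkgA"))
dir.create(file.path(root, "pkgB"))
writeLines("res <- kruskal.test(y ~ g, data = d)",
           file.path(root, "pkgA", "a.R"))
writeLines("res <- t.test(y ~ g, data = d)",
           file.path(root, "pkgB", "b.R"))

pkgs <- list.dirs(root, recursive = FALSE)
found <- vapply(pkgs, mentions_string, logical(1))
```

Like the crude string matching described in this message, this also counts
packages that mention the function only in documentation.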

I patched the development version of R and built like this.

  ./configure --without-recommended-packages
  gmake
  cd src/library
  gmake all docs Rdfiles

This command was helpful for cleaning the repository tree.

  svn status | sed -n 's/^\?  *//p' | xargs rm -r

I tested three versions of kruskal.test:

* SVN checkout 74844 with no modifications
* SVN checkout 74844 with disallow-character patch
* SVN checkout 74844 with allow-character patch

The test is to run all of the examples from all of the packages that
mention kruskal.test; with each example I ran, I recorded whether an
error was raised.  I ran all examples, regardless of whether the example
mentioned kruskal.test.  I compared the raising of an error among the
three builds of R/kruskal.test.
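
The recording step can be sketched as follows. This is an illustration, not
the code from the linked repository: raises_error() is a hypothetical helper,
and the second outcome vector merely stands in for results that would come
from a different build of R.

```r
# TRUE if evaluating the code text raises an error, FALSE otherwise.
raises_error <- function(code) {
  tryCatch({
    eval(parse(text = code), envir = new.env())
    FALSE
  }, error = function(e) TRUE)
}

# Two toy "examples": one that succeeds and one that fails.
examples <- c(ok = "sqrt(4)", bad = "stop('boom')")
outcome_build_a <- vapply(examples, raises_error, logical(1))

# In practice each build would write its outcomes to a file; here a copy
# stands in for the outcomes of another build.
outcome_build_b <- outcome_build_a

# The comparison of interest: do the builds fail on the same examples?
same <- identical(outcome_build_a, outcome_build_b)
```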

I ran these commands for each R version to build R, install the packages
referencing kruskal.test, and run the tests in parallel. The procedure
is available here; see the Makefile for more detail.
https://thomaslevine.com/scm/r-patches/dir?ci=6ea0db4fde&name=kruskal.test-numeric/testing

Run it like this if you are so inclined.

  make -j 3 install
  make -j 3 test

I found 100 packages that referenced kruskal.test. (This was based on a
very crude string matching; some of these packages mentioned
kruskal.test only in the documentation.) Of these 100 packages, I was
able to install 39. I ran all of the examples in all of these packages,
a total of 2361 examples.

The successes and failures matched exactly among the three builds.
341 examples succeeded, and 2020 failed.
https://thomaslevine.com/scm/r-patches/artifact/5df57add4414970a

This is of course a lot of failures and a small proportion of the
packages. I only installed the packages whose dependencies were easy
for me to install (on OpenBSD 6.2), and some of those implicitly
depended on other things that were not available; this explains
all of the examples that raised errors.

Review of r-help

I also began to collect all kruskal.test calls that I could find in the
r-help archives. Formatting them to be appropriate for evaluation is
quite tedious, so I doubt I will follow through with this, but all of
the calls appear to use ordinary character, numeric, or factor types,
and none performed error catching, so no obvious problems with my
proposed changes stand out.

Furthermore, in looking through the r-help archives, I noted these
messages on r-help where people were having trouble using kruskal.test
and where I think either of my proposed changes would have helped them
perform their desired Kruskal-Wallis rank sum tests.

  <1280836078385-2311712.p...@n4.nabble.com>
  <1280849183252-2312063.p...@n4.nabble.com>

Conclusions
---
I have yet to find any example of my proposed changes causing a
regression. I believe that the most reasonable thing that it might
break is something that depends on either kruskal.test raising an
error or that depends on the specific text in the error message.

If the limited testing is a concern, I could find a way to install
all of the packages and run all of their examples.



Re: [Rd] Bug 16719: kruskal.test documentation for formula

2018-06-19 Thread Thomas Levine
Thomas Levine writes:
> I have yet to find any example of my proposed changes causing a
> regression. I believe that the most reasonable thing that it might
> break is something that depends on either kruskal.test raising an
> error or that depends on the specific text in the error message.
>
> If the limited testing is a concern, I could find a way to install
> all of the packages and run all of their examples.

In case my April message is hard to find, I have attached the patches
redundantly to this email.
Index: src/library/stats/R/kruskal.test.R
===
--- src/library/stats/R/kruskal.test.R  (revision 74631)
+++ src/library/stats/R/kruskal.test.R  (working copy)
@@ -46,7 +46,10 @@
         x <- x[OK]
         g <- g[OK]
         if (!all(is.finite(g)))
-            stop("all group levels must be finite")
+            if (is.character(g))
+                stop("all group levels must be finite; convert group to a factor")
+            else
+                stop("all group levels must be finite")
         g <- factor(g)
         k <- nlevels(g)
         if (k < 2L)
Index: src/library/stats/man/kruskal.test.Rd
===
--- src/library/stats/man/kruskal.test.Rd   (revision 74631)
+++ src/library/stats/man/kruskal.test.Rd   (working copy)
@@ -22,11 +22,12 @@
   \item{x}{a numeric vector of data values, or a list of numeric data
 vectors.  Non-numeric elements of a list will be coerced, with a
 warning.}
-  \item{g}{a vector or factor object giving the group for the
+  \item{g}{a numeric vector or factor object giving the group for the
 corresponding elements of \code{x}.  Ignored with a warning if
 \code{x} is a list.}
   \item{formula}{a formula of the form \code{response ~ group} where
-\code{response} gives the data values and \code{group} a vector or
+\code{response} gives the data values and \code{group}
+a numeric vector or
 factor of the corresponding groups.} 
   \item{data}{an optional matrix or data frame (or similar: see
 \code{\link{model.frame}}) containing the variables in the
@@ -52,7 +53,8 @@
   list, use \code{kruskal.test(list(x, ...))}.
 
   Otherwise, \code{x} must be a numeric data vector, and \code{g} must
-  be a vector or factor object of the same length as \code{x} giving
+  be a numeric vector or factor object of the same length as \code{x}
+  giving
   the group for the corresponding elements of \code{x}.
 }
 \value{
Index: src/library/stats/R/kruskal.test.R
===
--- src/library/stats/R/kruskal.test.R  (revision 74631)
+++ src/library/stats/R/kruskal.test.R  (working copy)
@@ -45,7 +45,7 @@
         OK <- complete.cases(x, g)
         x <- x[OK]
         g <- g[OK]
-        if (!all(is.finite(g)))
+        if (!is.character(g) & !all(is.finite(g)))
             stop("all group levels must be finite")
         g <- factor(g)
         k <- nlevels(g)
Index: src/library/stats/man/kruskal.test.Rd
===
--- src/library/stats/man/kruskal.test.Rd   (revision 74631)
+++ src/library/stats/man/kruskal.test.Rd   (working copy)
@@ -22,11 +22,13 @@
   \item{x}{a numeric vector of data values, or a list of numeric data
 vectors.  Non-numeric elements of a list will be coerced, with a
 warning.}
-  \item{g}{a vector or factor object giving the group for the
+  \item{g}{a character vector, numeric vector, or factor
+giving the group for the
 corresponding elements of \code{x}.  Ignored with a warning if
 \code{x} is a list.}
   \item{formula}{a formula of the form \code{response ~ group} where
-\code{response} gives the data values and \code{group} a vector or
+\code{response} gives the data values and \code{group} a
+character vector, numeric vector, or
 factor of the corresponding groups.} 
   \item{data}{an optional matrix or data frame (or similar: see
 \code{\link{model.frame}}) containing the variables in the
@@ -52,7 +54,8 @@
   list, use \code{kruskal.test(list(x, ...))}.
 
   Otherwise, \code{x} must be a numeric data vector, and \code{g} must
-  be a vector or factor object of the same length as \code{x} giving
+  be a numeric vector, character vector, or factor of the same length
+  as \code{x} giving
   the group for the corresponding elements of \code{x}.
 }
 \value{
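
To compare the two patches side by side, the group check from each can be
pulled out into a small standalone function. This is only a sketch: the
function names are hypothetical, the surrounding kruskal.test code is omitted,
and the second function uses && where the patch uses &, which should be
equivalent here because both operands have length one.

```r
# Group check as in the first patch above: character input is still an
# error, but the message suggests converting to a factor.
check_group_disallow <- function(g) {
  if (!all(is.finite(g))) {
    if (is.character(g))
      stop("all group levels must be finite; convert group to a factor")
    else
      stop("all group levels must be finite")
  }
  factor(g)
}

# Group check as in the second patch above: character input skips the
# finiteness test and is converted by factor().
check_group_allow <- function(g) {
  if (!is.character(g) && !all(is.finite(g)))
    stop("all group levels must be finite")
  factor(g)
}

g_chr <- c("a", "b", "a")
msg <- tryCatch(check_group_disallow(g_chr), error = conditionMessage)
lev <- levels(check_group_allow(g_chr))
```

With character input, the first variant stops with the extended message and
the second returns a factor with levels "a" and "b".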


Re: [Rd] readLines function with R >= 3.5.0

2018-06-19 Thread Michael Lawrence
Hi Jen,

Please provide a reproducible example, since the original stack
overflow example works in both trunk and patched.

Thanks,
Michael
