[Rd] Mention the case of logical(0) in ?stopifnot

2018-03-31 Thread Scott Kostyshak

I wonder if it would be helpful to mention in ?stopifnot that
stopifnot(logical(0)) does not give an error (for background on why this
is the case, see [1]). For example, ?all explicitly mentions the
following:

  That all(logical(0)) is true is a useful convention

and includes an example:

  all(logical(0))  # true, as all zero of the elements are true.

I think it would be nice to give examples in ?stopifnot of calls that
are not ideal uses of the function, such as the poorly written
stopifnot() call that I recently wrote:

  x <- 1:5
  # does not give an error
  stopifnot(ncol(x) == 2)
  # gives an error
  stopifnot(identical(ncol(x), 2L))
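
The mechanism behind the first call can be traced step by step. This is
only an illustrative sketch; the names `cond` and `strict` are made up for
the example and appear nowhere in ?stopifnot.

```r
x <- 1:5
ncol(x)                # NULL, because a plain vector has no dim attribute
cond <- ncol(x) == 2   # comparing NULL with a number gives logical(0)
length(cond)           # 0
all(cond)              # TRUE, consistent with stopifnot(cond) passing silently

# isTRUE() demands a single TRUE, so the zero-length case is rejected.
strict <- tryCatch(
    { stopifnot(isTRUE(cond)); "passed" },
    error = function(e) "error"
)
strict                 # "error"
```

Wrapping the comparison in isTRUE() is one alternative to identical() when
the two sides may differ in type (for example 2 versus 2L).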

Or this code from [2]:

  li <- list()
  li$item <- 1
  # Does not give an error, because
  # "item" is misspelled and "NULL == 0" returns logical(0)
  stopifnot(li$tem == 0)
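
A guard against this could look roughly like the sketch below. The helper
name `stopifnot_nonempty` is hypothetical (it is not in base R), and the
sketch assumes that a zero-length condition should always count as a
failure.

```r
# Hypothetical helper: like stopifnot(), but a zero-length condition is an error.
stopifnot_nonempty <- function(cond) {
    if (length(cond) == 0L)
        stop("condition has length zero")
    stopifnot(cond)
    invisible(NULL)
}

li <- list()
li$item <- 1

# The misspelled name now triggers an error instead of passing silently.
res <- tryCatch(
    stopifnot_nonempty(li$tem == 0),
    error = function(e) conditionMessage(e)
)
res   # "condition has length zero"

# A correctly spelled, true condition still passes.
ok <- stopifnot_nonempty(li$item == 1)
```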

I think that a useful way to teach users how to use a function is to
teach them how not to use it.

Would a patch for the documentation along these lines be considered?

By the way, there are some regression tests in base R that rely on the
behavior of stopifnot(logical(0)), where the logical(0) results from
`==`. I can make a list of these tests if someone thinks it would be a
good idea to double-check them and possibly improve them (e.g., convert
them to use identical() instead of `==`). I'm guessing it's not worth
the time.

Scott


[1] https://stat.ethz.ch/pipermail/r-help/2015-December/434610.html
[2] https://stackoverflow.com/questions/33670060/how-to-have-stopifnot-return-an-error-when-called-on-a-missing-null-element


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

[Rd] source(echo = TRUE) with a iso-8859-1 encoded file gives an error

2018-05-01 Thread Scott Kostyshak

I have very little knowledge about file encodings and would like to
learn more.

I've read the following pages to learn more:

  
  http://stat.ethz.ch/R-manual/R-devel/library/base/html/Encoding.html
  https://stackoverflow.com/questions/4806823/how-to-detect-the-right-encoding-for-read-csv
  https://developer.r-project.org/Encodings_and_R.html

The last one, in particular, has been very helpful. I would be
interested in any further references that you suggest.

I attach a file that reproduces the issue I would like to learn more
about. I do not know if the file encoding will be correctly preserved
through email, so I also provide the file (temporarily) on Dropbox here:

  
  https://www.dropbox.com/s/3lbgebk7b5uaia7/encoding_export_issue.R?dl=0

The file gives an error when using "source()" with the
argument echo = TRUE:

  > source("encoding_export_issue.R", echo = TRUE)
  Error in nchar(dep, "c") : invalid multibyte string, element 1
  In addition: Warning message:
  In grepl("^[[:blank:]]*$", dep[1L]) :
    input string 1 is invalid in this locale

The problem comes from the "á" character in the .R file. The file
appears to be encoded as "iso-8859-1":

  $ file --mime-encoding encoding_export_issue.R 
  encoding_export_issue.R: iso-8859-1

Note that for me:

  > getOption("encoding")
  [1] "native.enc"

so "native.enc" is used for the "encoding" argument of source().

The following two calls succeed:

  > source("encoding_export_issue.R", echo = TRUE, encoding = "unknown")
  > source("encoding_export_issue.R", echo = TRUE, encoding = "iso-8859-1")

Is this file a valid "iso-8859-1" encoded file? Why does source() fail
when the encoding is set to "native.enc"? Is it because my locale is set
to UTF-8 (see the info on my system at the bottom of this email)?
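
The byte-level situation can be sketched without the attached file. This
is an illustration rather than a confirmed diagnosis of source(): it
assumes the offending line is a comment reading "# Chávez", and the
variable names are made up for the example.

```r
# Bytes of the line "# Chávez" as stored in an ISO-8859-1 file:
# "á" is the single byte 0xE1.
latin1_bytes <- as.raw(c(0x23, 0x20, 0x43, 0x68, 0xe1, 0x76, 0x65, 0x7a))
latin1_line <- rawToChar(latin1_bytes)

validUTF8(latin1_line)             # FALSE: a lone 0xE1 is not valid UTF-8

# Declare the real source encoding and convert explicitly.
utf8_line <- iconv(latin1_line, from = "ISO-8859-1", to = "UTF-8")
validUTF8(utf8_line)               # TRUE
nchar(utf8_line, type = "chars")   # 8
nchar(utf8_line, type = "bytes")   # 9
```

If the bytes are taken as they are in a UTF-8 locale, functions that count
characters have no valid string to work with, which would be consistent
with the nchar() error shown above.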

I'm guessing it would be a bad idea to put

  options(encoding = "unknown")

in my .Rprofile, because it is difficult to always correctly guess the
encoding of files? Is there a reason why setting it to "unknown" would
lead to more problems than leaving it set to "native.enc"?

I've reproduced the above behavior on R-devel (r74677) and 3.4.3. Below
is my session info and locale info for my system with the 3.4.3 version:

> sessionInfo()
R version 3.4.3 (2017-11-30)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.3 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

loaded via a namespace (and not attached):
[1] compiler_3.4.3

> Sys.getlocale()
[1] "LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;LC_COLLATE=en_US.UTF-8;LC_MONETARY=en_US.UTF-8;LC_MESSAGES=en_US.UTF-8;LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C"

Thanks for your time,

Scott


# Chávez
quantile_type <- 4


Re: [Rd] source(echo = TRUE) with a iso-8859-1 encoded file gives an error

2018-05-04 Thread Scott Kostyshak

Thanks for your reply, Ista, and your advice. I will re-post to r-help.

Best,

Scott



On Tue, May 01, 2018 at 07:15:30PM +, Ista Zahn wrote:
> Hi Scott,
> 
> This question is appropriate for the r-help mailing list, but probably
> off-topic here on r-devel.
> 
> Best,
> Ista
> 

[Rd] [patch] add sanity checks to quantile()

2019-05-30 Thread Scott Kostyshak

The attached patch adds some sanity checks to the "type" argument of
quantile(). Output from the following commands shows the change in
behavior with the current patch:

  vec <- 1:10
  quantile(vec, type = c(1, 2))
  quantile(vec, type = 10)
  quantile(vec, type = "aaa")
  quantile(vec, type = NA_real_)
  quantile(vec, type = 4.3)
  quantile(vec, type = -1)

Current behavior (i.e., without the patch):

  > vec <- 1:10
  > quantile(vec, type = c(1, 2))
  Error in switch(type, (nppm > j), ((nppm > j) + 1)/2, (nppm != j) | ((j%%2L) ==  : 
    EXPR must be a length 1 vector
  In addition: Warning messages:
  1: In if (type == 7) { :
    the condition has length > 1 and only the first element will be used
  2: In if (type <= 3) { :
    the condition has length > 1 and only the first element will be used
  3: In if (type == 3) n * probs - 0.5 else n * probs :
    the condition has length > 1 and only the first element will be used
  > quantile(vec, type = 10)
  Error in quantile.default(vec, type = 10) : object 'a' not found
  > quantile(vec, type = "aaa")
  Error in type - 3 : non-numeric argument to binary operator
  > quantile(vec, type = NA_real_)
  Error in if (type == 7) { : missing value where TRUE/FALSE needed
  > quantile(vec, type = 4.3)
    0%  25%  50%  75% 100% 
   1.0  2.5  5.0  7.5 10.0 
  > quantile(vec, type = -1)
    0%  25%  50%  75% 100% 
     1    2    5    7   10 


Behavior with the patch:

  > vec <- 1:10
  > quantile(vec, type = c(1, 2))
  Error in quantile.default(vec, type = c(1, 2)) : 
    'type' must be of length 1
  > quantile(vec, type = 10)
  Error in quantile.default(vec, type = 10) : 
    'type' must be an integer between 1 and 9
  > quantile(vec, type = "aaa")
  Error in quantile.default(vec, type = "aaa") : 
    'type' must be an integer between 1 and 9
  > quantile(vec, type = NA_real_)
  Error in quantile.default(vec, type = NA_real_) : 
    'type' must be an integer between 1 and 9
  > quantile(vec, type = 4.3)
  Error in quantile.default(vec, type = 4.3) : 
    'type' must be an integer between 1 and 9
  > quantile(vec, type = -1)
  Error in quantile.default(vec, type = -1) : 
    'type' must be an integer between 1 and 9


Note that with the patch, quantile() gives an error in some cases where
the current code does not. Specifically, the following two calls to
quantile() do not give an error without the patch:

  quantile(vec, type = 4.3)
  quantile(vec, type = -1)

Thus, this patch could cause current code to give an error. If it is
desired, I could change the patch such that it only gives an error when
current R gives an error (i.e., the only benefit of the patch would be
better error messages), or I can change the patch to give a warning in
these cases.
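
The warning variant could be sketched as a standalone check like the one
below. The helper names `check_type_soft` and `outcome` are hypothetical
and not part of the attached patch, and the `type >= 10` cutoff is an
assumption chosen only to match the six calls shown above, not a full
audit of which values current R rejects.

```r
# Hypothetical check: stop where current R already stops (per the six calls
# above), warn where current R silently returns a result.
check_type_soft <- function(type) {
    if (length(type) != 1L)
        stop("'type' must be of length 1")
    if (is.na(type) || !is.numeric(type) || type >= 10)
        stop("'type' must be an integer between 1 and 9")
    if (!any(type == 1:9))
        warning("'type' should be an integer between 1 and 9")
    invisible(type)
}

# Classify what the check does for a given value of 'type'.
outcome <- function(type) {
    tryCatch(
        { check_type_soft(type); "ok" },
        warning = function(w) "warning",
        error = function(e) "error"
    )
}

sapply(list(c(1, 2), 10, "aaa", NA_real_, 4.3, -1, 7), outcome)
```

Under these assumptions the first four inputs give "error", 4.3 and -1
give "warning", and 7 gives "ok".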

Scott



Index: src/library/stats/R/quantile.R
===
--- src/library/stats/R/quantile.R	(revision 76528)
+++ src/library/stats/R/quantile.R	(working copy)
@@ -25,6 +25,12 @@
 function(x, probs = seq(0, 1, 0.25), na.rm = FALSE, names = TRUE,
  type = 7, ...)
 {
+if (length(type) != 1L) {
+stop("'type' must be of length 1")
+}
+if (is.na(type) || !is.numeric(type) || !any(type == 1:9)) {
+stop("'type' must be an integer between 1 and 9")
+}
 if(is.factor(x)) {
 	if(is.ordered(x)) {
 	   if(!any(type == c(1L, 3L)))

Re: [Rd] [patch] add sanity checks to quantile()

2020-01-04 Thread Scott Kostyshak

On Fri, May 31, 2019 at 01:28:55AM -0400, Scott Kostyshak wrote:
> The attached patch adds some sanity checks to the "type" argument of
> ...

Bump. For this type of patch proposal, is it better to use the
bug tracker?

Thanks,

Scott



Re: [Rd] [patch] add sanity checks to quantile()

2020-01-04 Thread Scott Kostyshak

On Sat, Jan 04, 2020 at 06:32:15PM -0500, Duncan Murdoch wrote:
> 
> On 04/01/2020 4:35 p.m., Scott Kostyshak wrote:
> > On Fri, May 31, 2019 at 01:28:55AM -0400, Scott Kostyshak wrote:
> > > The attached patch adds some sanity checks to the "type" argument of
> ...
> > Bump. For this type of patch proposal, is it better to use the
> > bug tracker?
> 
> For almost any patch proposal it is.  Certainly if you don't get action
> (or at least discussion) within a few days, any other proposal will be
> forgotten.
> 
> Duncan Murdoch

That makes sense. Thanks for the quick reply and advice. Here is the
ticket:

  https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17683

Scott



[Rd] Patch for R-exts.texi

2017-07-03 Thread Scott Kostyshak

Attached is a patch for R-exts.texi against r72880.

Here are some of the changes I made:

- Fix a broken link:

  https://developer.apple.com/library/mac/#documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/Introduction/Introduction.html
  ->
  https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/index.html

- Changed a few http to https (and checked that the connections are
  indeed secure, as judged by Chromium and Firefox).

- A couple of grammar fixes and "sounds more natural to me" changes.

- "x84_64" -> x86_64

- One change of "which" -> "that"

- The link to Luke's uiowa.edu page involves two changes, removing the
  duplicate URL and changing the protocol to https.

Thanks for your time,

Scott



Index: doc/manual/R-exts.texi
===
--- doc/manual/R-exts.texi  (revision 72880)
+++ doc/manual/R-exts.texi  (working copy)
@@ -1457,7 +1457,7 @@
 
 @noindent
 then download the sources from
-@uref{http://sourceforge.net/@/projects/@/tcllib/@/files/@/BWidget/} and
+@uref{https://sourceforge.net/@/projects/@/tcllib/@/files/@/BWidget/} and
 at the command line run something like
 
 @example
@@ -1494,7 +1494,7 @@
 
 @noindent
 and not a version starting
-@samp{http://cran.r-project.org/web/packages/@var{pkgname}}.
+@samp{https://cran.r-project.org/web/packages/@var{pkgname}}.
 
 @node Configure and cleanup, Checking and building packages, Package structure, Creating R packages
 @section Configure and cleanup
@@ -2117,7 +2117,7 @@
 word, so computations done on OpenMP threads will not make use of
 extended-precision arithmetic which is the default for the main process.
 @c mingw64-public, 2015-02-02.
-@c http://stackoverflow.com/questions/2553725/is-the-fpu-control-word-setting-per-thread-or-per-process
+@c https://stackoverflow.com/questions/2553725/is-the-fpu-control-word-setting-per-thread-or-per-process
 
 Calling any of the @R{} API from threaded code is `for experts only':
 they will need to read the source code to determine if it is
@@ -7645,7 +7645,7 @@
 which is a GUI version), @command{Shark} (in version of @code{Xcode}
 up to those for Snow Leopard), and @command{Instruments} (part of
 @code{Xcode}, see
-@uref{https://developer.apple.com/library/mac/#documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/Introduction/Introduction.html}).
+@uref{https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/index.html}).
 
 
 @node Debugging, System and foreign language interfaces, Tidying and profiling R code, Top
@@ -8295,8 +8295,8 @@
 to be installed separately, and for checking C++ you may also need
 @pkg{libubsan}.} of @command{gcc} and @command{clang} on common Linux
 and macOS platforms.  See
-@uref{http://clang.llvm.org/@/docs/@/UsersManual.html#controlling-code-generation},
-@uref{http://clang.llvm.org/@/docs/@/AddressSanitizer.html} and
+@uref{https://clang.llvm.org/@/docs/@/UsersManual.html#controlling-code-generation},
+@uref{https://clang.llvm.org/@/docs/@/AddressSanitizer.html} and
 @uref{https://code.google.com/@/p/@/address-sanitizer/}.
 
 More thorough checks of C++ code are done if the C++ library has been
@@ -8455,7 +8455,7 @@
 
 Finer control of what is checked can be achieved by other options: for
 @command{clang} see
-@uref{http://clang.llvm.org/@/docs/@/UsersManual.html#controlling-code-generation}.@footnote{or
+@uref{https://clang.llvm.org/@/docs/@/UsersManual.html#controlling-code-generation}.@footnote{or
 the user manual for your version of @command{clang}, e.g.@: (the paths
 have differed for some versions)
 @uref{http://llvm.org/@/releases/@/4.0.0/@/tools/@/clang/@/docs/@/UsersManual.html}.}
@@ -8560,13 +8560,13 @@
 Recent versions of @command{clang} on @cputype{x86_64} Linux have
 `ThreadSanitizer' (@uref{https://code.google.com/@/p/@/thread-sanitizer/}),
 a `data race detector for C/C++ programs', and `MemorySanitizer'
-(@uref{http://clang.llvm.org/@/docs/@/MemorySanitizer.html},
+(@uref{https://clang.llvm.org/@/docs/@/MemorySanitizer.html},
 @uref{https://code.google.com/@/p/@/memory-sanitizer/@/wiki/@/MemorySanitizer})
 for the detection of uninitialized memory.  Both are based on and
 provide similar functionality to tools in @command{valgrind}.
 
 @command{clang} has a `Static Analyser' which can be run on the source
-files during compilation: see @uref{http://clang-analyzer.llvm.org/}.
+files during compilation: see @uref{https://clang-analyzer.llvm.org/}.
 
 @node Using `Dr. Memory', Fortran array bounds checking, Other analyses with `clang', Checking memory access
 @subsection Using `Dr. Memory'
@@ -9429,7 +9429,7 @@
 @uref{https://www.r-project.org/@/doc/@/Rnews/R

Re: [Rd] Patch for R-exts.texi

2017-07-08 Thread Scott Kostyshak

On Sat, Jul 08, 2017 at 06:18:25PM +0200, Martin Maechler wrote:
> >>>>> Scott Kostyshak 
> >>>>> on Mon, 3 Jul 2017 02:09:47 -0400 writes:
> 
> > Attached is a patch for R-exts.texi against r72880.  Here
> > are some of the changes I made:
> 
> > - Fix a broken link:
> >   https://developer.apple.com/library/mac/#documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/Introduction/Introduction.html
> >   ->
> >   https://developer.apple.com/library/content/documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/index.html
> 
> > - Changed a few http to https (and checked that the
> > connections are indeed secure, as judged by Chromium and
> > Firefox).
> 
> > - A couple of grammar fixes and "sounds more natural to
> > me" changes.
> 
> > - "x84_64" -> x86_64
> 
> > - One change of "which" -> "that"
> 
> > - The link to Luke's uiowa.edu page involves two changes,
> > removing the duplicate URL and changing the protocol to
> > https.
> 
> > Thanks for your time,
> > Scott
> 
> Thank you very much, Scott!
> 
> This is a clear improvement
>  ((even though some of the style changes may be debatable - but only by native
>    English/American (;-) speakers, not me. ...))
> 
> Hence I've committed it (R-devel, svn rev 72900).

Thanks for putting it in, Martin! I do my best to not impose American
English, but sometimes I just don't realize. I actually have adopted
several non-American rules because I find them more logical. For
example, I like to put punctuation outside of quotes, such as "this is a
quote", where for some reason in American English it is preferred to put
it as "this is a quote."

Thanks for taking the time to review the patch and commit it.

Scott



[Rd] [patch] ?confint: "assumes asymptotic normality"

2017-07-20 Thread Scott Kostyshak

From ?confint:

"Computes confidence intervals" and "The default method assumes
asymptotic normality"

For me, a "confidence interval" implies an exact confidence interval in
formal statistics (I concede that when speaking, the term is often used
more loosely). And of course, even if a test statistic is asymptotically
normal (so the assumption is satisfied), its finite-sample distribution
might not be normal, and thus an exact confidence interval would not be
computed.

Attached is a patch that simply changes "asymptotic normality" to
"normality" in confint.Rd. This encourages the user of the function to
think about whether their asymptotically normal statistic is "normal
enough" in a finite sample to get something reliable from confint().
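
The practical difference can be illustrated with a small linear model.
This is a sketch only: it uses the built-in `cars` data, and the variable
names are chosen for the example. It compares the default method, which
uses normal quantiles, with the lm method, which uses t quantiles.

```r
fit <- lm(dist ~ speed, data = cars)

ci_t    <- confint(fit)           # lm method: t quantiles
ci_norm <- confint.default(fit)   # default method: normal quantiles

# Rebuild the default intervals by hand from coef() and vcov().
est <- coef(fit)
se  <- sqrt(diag(vcov(fit)))
by_hand <- cbind(est + qnorm(0.025) * se, est + qnorm(0.975) * se)
all.equal(unname(ci_norm), unname(by_hand))

# Interval widths: the t-based intervals are wider in this finite sample.
ci_t[, 2] - ci_t[, 1]
ci_norm[, 2] - ci_norm[, 1]
```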

Alternatively, we could instead change "Computes confidence intervals"
to "Computes asymptotic confidence intervals".

I hope I'm not being too pedantic here.

Scott



Index: src/library/stats/man/confint.Rd
===
--- src/library/stats/man/confint.Rd(revision 72930)
+++ src/library/stats/man/confint.Rd(working copy)
@@ -31,7 +31,7 @@
 }
 \details{
   \code{confint} is a generic function.  The default method assumes
-  asymptotic normality, and needs suitable \code{\link{coef}} and
+  normality, and needs suitable \code{\link{coef}} and
   \code{\link{vcov}} methods to be available.  The default method can be
   called directly for comparison with other methods.
 

Re: [Rd] [patch] ?confint: "assumes asymptotic normality"

2017-07-20 Thread Scott Kostyshak

On Thu, Jul 20, 2017 at 04:21:04PM +0200, Martin Maechler wrote:
> >>>>> Scott Kostyshak 
> >>>>> on Thu, 20 Jul 2017 03:28:37 -0400 writes:
> 
> > From ?confint:
> > "Computes confidence intervals" and "The default method assumes
> > asymptotic normality"
> 
> > For me, a "confidence interval" implies an exact confidence interval in
> > formal statistics (I concede that when speaking, the term is often used
> > more loosely). And of course, even if a test statistic is asymptotically
> > normal (so the assumption is satisfied), the finite distribution might
> > not be normal and thus an exact confidence interval would not be
> > computed.
> 
> > Attached is a patch that simply changes "asymptotic normality" to
> > "normality" in confint.Rd. This encourages the user of the function to
> > think about whether their asymptotically normal statistic is "normal
> > enough" in a finite sample to get something reliable from confint().
> 
> > Alternatively, we could instead change "Computes confidence intervals"
> > to "Computes asymptotic confidence intervals".
> 
> > I hope I'm not being too pedantic here.
> 
> well, it's just at the 97.5% border line of "too pedantic"  ...

:)

> ;-)
> 
> I think you are right with your first proposal to drop
> "asymptotic" here.  After all, there's the explicit 'fac <- qnorm(a)'.

Note that I received a private email that my message was indeed too
pedantic and expressed disagreement with the proposal. I'm not sure if
they intended it to be private so I will respond in private and see if
they feel like bringing the discussion on the list. Or perhaps this
minor (and perhaps controversial?) issue is not worth any additional
time.

> One could consider to make  'qnorm' an argument of the
> default method to allow more general distributional assumptions,
> but it may be wiser to have useRs write their own
> confint.() method, notably for cases where
> diag(vcov(object)) is an efficiency waste...

Thanks for your comments,

Scott

> Martin

[Rd] specifying name in the error message "promise already under evaluation"

2017-09-17 Thread Scott Kostyshak

Consider the following R code:

abc <- function(x, y = y) {
  x + y
}

abc(x = 3)

which gives the following error:

promise already under evaluation: recursive default argument
reference or earlier problems?

If you google that error, you will find that it usually refers to the
situation given in the example above, although I'm sure the error is
more general and could be triggered in other situations.
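
For reference, the common case and one usual way around it can be sketched
as follows. The names `y_default` and `abc_fixed` are made up for the
example; the fix simply avoids a default that refers to the argument
itself.

```r
abc <- function(x, y = y) {
  x + y
}

# Capture the error message instead of stopping the script.
msg <- tryCatch(abc(x = 3), error = function(e) conditionMessage(e))
msg

# One way around it: let the default refer to a differently named object.
y_default <- 10
abc_fixed <- function(x, y = y_default) {
  x + y
}

abc_fixed(3)          # 13
abc_fixed(3, y = 1)   # 4
```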

I'm trying to think about how to improve the error for the most common
situation that triggers it. One simple way would be to give the name of
the promise. For example, I think that the following would already be an
improvement:

promise "y" already under evaluation: recursive default argument
reference or earlier problems?

Any thoughts?

Scott



Re: [Rd] Using response variable in interaction as explanatory variable in glm crashes R

2017-10-10 Thread Scott Kostyshak

On Mon, Oct 09, 2017 at 03:52:43PM +, Martin Maechler wrote:
> >>>>> Jan van der Laan 
> >>>>> on Fri, 6 Oct 2017 12:13:39 +0200 writes:
> 
> > It is actually model.matrix that crashes, not glm. Same
> > crash occurs with e.g. lm.
> 
> > model.matrix(dob_mon ~ dob_day*dob_mon, data = tab)
> 
> > also crashes R.
> 
> Yes, segmentation fault.
> 
> It only happens when these are *logical* variables, not, e.g., when
> transformed to integer.
> 
> The C code in src/library/stats/src/model.c  tries to eliminate
> occurrences of the LHS of the formula from the RHS when building
> the model matrix and it does work fine in the integer case.
> 
> Part of the culprit code may be this (from line 717),
> with the  isLogical(.) which in our case, shifts the pointer by
> 1  in the call to firstfactor() :
> 
>   int adj = isLogical(var_i)?1:0;
>   // avoid overflow of jstart * nn PR#15578
>   firstfactor(&rx[jstart * nn], n, jnext - jstart,
>               REAL(contrast), nrows(contrast),
>               ncols(contrast), INTEGER(var_i)+adj);
> 
> then in firstfactor(), we see the segfault (when running R with
> '-d gdb') :
> 
> > model.matrix(dob_mon ~ dob_day*dob_mon, data = tab)
> 
>   Program received signal SIGSEGV, Segmentation fault.
>   0x7fffeafa76b5 in firstfactor (ncx=0, v=0x5c3b37c, ncc=1, nrc=2, 
> c=0x5c90008, 
>nrx=8, x=0x5cbf150) at ../../../../../R/src/library/stats/src/model.c:252
> 252   else xj[i] = cj[v[i]-1];
> Missing separate debuginfos, .
> (gdb) list
> 247       for (int j = 0; j < ncc; j++) {
> 248           xj = &x[j * (R_xlen_t)nrx];
> 249           cj = &c[j * (R_xlen_t)nrc];
> 250           for (int i = 0; i < nrx; i++)
> 251               if(v[i] == NA_INTEGER) xj[i] = NA_REAL;
> 252               else xj[i] = cj[v[i]-1];
> 253       }
> 254   }
> 255   
> 
> and indeed in the debugger,  i=7  and  v[i] is "outside", v[]
> being of length 7, hence indexed 0:6.

Dear Martin,

I just wanted to thank you for providing details on your approach to
debugging. Often I see bug fixes and I wonder "how the heck did they
figure that out?" so I am very excited when I see details like these on
the process (and not just the end result), so that I can learn.

Best,

Scott


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/


[Rd] Marking a ticket as a (potential) regression in bug tracker?

2020-06-12 Thread Scott Kostyshak

Is there a way to mark a ticket as a potential regression in the bug
tracker? I think the following issue is a regression:

  https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17684

I've just tested (2020-06-12 r78687) and what I believe to be a
regression is still there. I don't think the bug has bitten many people,
so I don't think it is critical, but often it is helpful to mark bugs as
regressions in trackers.

Thanks,

Scott


-- 
Scott Kostyshak
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/


Re: [Rd] Marking a ticket as a (potential) regression in bug tracker?

2020-11-26 Thread Scott Kostyshak

On Fri, Jun 12, 2020 at 10:17:11AM -0400, Scott Kostyshak wrote:
> 
> Is there a way to mark a ticket as a potential regression in the bug
> tracker? I think the following issue is a regression:
> 
>   https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17684
> 
> I've just tested (2020-06-12 r78687) and what I believe to be a
> regression is still there. I don't think the bug has bitten many people,
> so I don't think it is critical, but often it is helpful to mark bugs as
> regressions in trackers.

If there's no current way to mark something as a regression, would there
be support for adding a way?

Best,

Scott


-- 
Scott Kostyshak (he/him)
Assistant Professor of Economics
University of Florida
https://people.clas.ufl.edu/skostyshak/


Re: [Rd] R CMD check --outdir=path gives unknown option '--outdir'

2013-07-18 Thread Scott Kostyshak

On Thu, Apr 4, 2013 at 2:06 PM, Henrik Bengtsson  wrote:
> For 'R CMD check', it appears that option '--outdir' is not recognized
> and generates warning "unknown option '--outdir'". R CMD check --help
> says:
>
> Usage: R CMD check [options] pkgs
> [...]
> Options:
> [...]
>   -o, --outdir=DIR  directory used for logfiles, R output, etc.
>                     (default is 'pkg.Rcheck' in current directory,
>                     where 'pkg' is the name of the package checked)
>
> Example:
>
> mkdir foo
>
> # Check output is written to foo/
> R CMD check -o foo pkg_0.1.tar.gz
>
> # Option is ignored and check output is written to bar.Rcheck/
> R CMD check --outdir=foo pkg_0.1.tar.gz
> Warning: unknown option '--outdir=foo'
>
> # Also tried with:
> R CMD check --outdir foo pkg_0.1.tar.gz
> Warning: unknown option '--outdir'
>
> R CMD check -outdir=foo pkg_0.1.tar.gz
> Warning: unknown option '-outdir=foo'
>
> R CMD check -outdir foo pkg_0.1.tar.gz
> Warning: unknown option '-outdir'
>
> I get this with:
>
>> sessionInfo()
> R version 3.0.0 (2013-04-03)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
>> sessionInfo()
> R Under development (unstable) (2013-04-02 r62479)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
>
>> sessionInfo()
> R version 2.15.3 (2013-03-01)
> Platform: x86_64-unknown-linux-gnu (64-bit)

I see the same behavior on 3.0.1 (pre-compiled binaries on Ubuntu
12.04 and 13.04).

> sessionInfo()
R version 3.0.1 (2013-05-16)
Platform: x86_64-pc-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=C                 LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

> Should I report this to http://bugs.r-project.org/?

Did you? If not, please do (or tell me to if you don't have time). I
see nothing in News.Rd on trunk.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


Re: [Rd] R CMD check --outdir=path gives unknown option '--outdir'

2013-07-19 Thread Scott Kostyshak

On Fri, Jul 19, 2013 at 3:04 AM, Prof Brian Ripley
 wrote:
> So please follow the posting guide at
> http://www.r-project.org/posting-guide.html, to wit
>
> 'If you are using an old version of R and think it does not work properly,
> upgrade to the latest version and try that, before posting. If possible, try
> the current R-patched or R-devel version of R (see the FAQ for details), to
> see if the problem has already been addressed.'

OK.

> It has been.

Thanks,

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


[Rd] Posting Guide: changed link and other comment

2013-07-19 Thread Scott Kostyshak

I have two comments regarding the Posting Guide:

(1) The link in the following sentence did not work for me:

Take care when you quote other people's comments to respect their
rights, e.g., as summarized here[a].

[a] http://www.jiscmail.ac.uk/help/policy/copyright.htm

Has it been changed to the following?
  http://www.jiscmail.ac.uk/policyandsecurity/copyrightissues.html

(2) Regarding the following extract

  `If you feel insulted by some response to a post of yours, don't
make any hasty response in return - you're as likely as not to regret
it.'

wouldn't someone who is `as likely as not to regret it' be indifferent
between sending a hasty response and not sending a hasty response? The
intent is perfectly clear but perhaps `you're _more_ likely than not'
is a more probabilistically correct expression?

Thanks for the helpful document -- it is useful reading for this list
as well as more generally.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


[Rd] tk + browser() can leave R unresponsive

2013-08-03 Thread Scott Kostyshak

I don't know if this is a bug. I can reproduce the following on Ubuntu
12.04.2 and 13.04 64-bit with R version 3.0.1 and with r63479. There
is no difference if R is patched with the fix for PR#15407 or not,
although without the fix there are more ways to trigger this.

I can reproduce with the following:

1. Open R in gnome-terminal or xterm

2. Run 'library(tcltk)'

3. Run 'trace(tk_select.list, edit = TRUE)'
and put "browser()" at the beginning of the onOK body (e.g. in Vim run
<<:g/onOK/put ='browser()'>>). That is, transform

onOK <- function() {
res <- 1L + as.integer(tkcurselection(box))
cat("res is: ", res)
ans.select_list <<- choices[res]
tkgrab.release(dlg)
tkdestroy(dlg)
}

to:

onOK <- function() {
browser()
res <- 1L + as.integer(tkcurselection(box))
cat("res is: ", res)
ans.select_list <<- choices[res]
tkgrab.release(dlg)
tkdestroy(dlg)
}

4. Run 'install.packages()'

5. Double-click on a package

R becomes unresponsive and I have to kill it.

> sessionInfo()
R Under development (unstable) (2013-08-02 r63479)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


Re: [Rd] tk + browser() can leave R unresponsive

2013-08-03 Thread Scott Kostyshak

On Sat, Aug 3, 2013 at 5:56 AM, Scott Kostyshak  wrote:
> I don't know if this is a bug. I can reproduce the following on Ubuntu
> 12.04.2 and 13.04 64-bit with R version 3.0.1 and with r63479. There
> is no difference if R is patched with the fix for PR#15407 or not,
> although without the fix there are more ways to trigger this.
>
> I can reproduce with the following:
>
> 1. Open R in gnome-terminal or xterm
>
> 2. Run 'library(tcltk)'
>
> 3. Run 'trace(tk_select.list, edit = TRUE)'
> and put "browser()" at the beginning of the onOK body (e.g. in Vim run
> <<:g/onOK/put ='browser()'>>). That is, transform
>
> onOK <- function() {
> res <- 1L + as.integer(tkcurselection(box))
> cat("res is: ", res)
> ans.select_list <<- choices[res]
> tkgrab.release(dlg)
> tkdestroy(dlg)
> }
>
> to:
>
> onOK <- function() {
> browser()
> res <- 1L + as.integer(tkcurselection(box))
> cat("res is: ", res)
> ans.select_list <<- choices[res]
> tkgrab.release(dlg)
> tkdestroy(dlg)
> }
>
> 4. Run 'install.packages()'
>
> 5. Double-click on a package
>
> R becomes unresponsive and I have to kill it.
>
>> sessionInfo()
> R Under development (unstable) (2013-08-02 r63479)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> Scott
>
>
> --
> Scott Kostyshak
> Economics PhD Candidate
> Princeton University

This might be related to PR#14730. I will add this info there unless
someone suggests otherwise.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


[Rd] [PATCH] remove a duplicate tk function definition (and alphabetize)

2013-08-31 Thread Scott Kostyshak

'tkcoords' is defined twice (in the same way) in src/library/tcltk/R/Tk.R.

Attached is a patch against r63780 that removes the duplicate
definition and alphabetizes the functions.
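
A quick way to catch this kind of duplicate is to parse the source file and
look for names that are assigned more than once at top level. The sketch below
is only an illustration: find_duplicate_defs is a made-up helper, not part of R
or of the patch, and it only considers simple top-level assignments of the form
`name <- value`.

```r
# Hypothetical helper: report names assigned more than once at top level.
find_duplicate_defs <- function(exprs) {
  is_assign <- function(e) {
    is.call(e) &&
      (identical(e[[1L]], as.name("<-")) || identical(e[[1L]], as.name("="))) &&
      is.name(e[[2L]])
  }
  nms <- vapply(
    Filter(is_assign, as.list(exprs)),
    function(e) as.character(e[[2L]]),
    character(1)
  )
  unique(nms[duplicated(nms)])
}

# A small stand-in for the relevant part of Tk.R.
src <- '
tkcoords <- function(widget, ...) tcl(widget, "coords", ...)
tkcreate <- function(widget, ...) tcl(widget, "create", ...)
tkcget   <- function(widget, ...) tcl(widget, "cget", ...)
tkcoords <- function(widget, ...) tcl(widget, "coords", ...)
'
dups <- find_duplicate_defs(parse(text = src))
print(dups)
```

For a real file one could pass the result of parse(file = ...) instead of
parse(text = src).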

I've read that minor patches such as this should be sent to r-devel [1].

Scott

[1] http://permalink.gmane.org/gmane.comp.lang.r.devel/33987


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

Index: src/library/tcltk/R/Tk.R
===================================================================
--- src/library/tcltk/R/Tk.R (revision 63780)
+++ src/library/tcltk/R/Tk.R (working copy)
@@ -493,12 +493,11 @@
 tkbbox  <- function(widget, ...) tcl(widget, "bbox", ...)
 tkcanvasx   <- function(widget, ...) tcl(widget, "canvasx", ...)
 tkcanvasy   <- function(widget, ...) tcl(widget, "canvasy", ...)
+tkcget  <- function(widget, ...) tcl(widget, "cget", ...)
 tkcompare   <- function(widget, ...) tcl(widget, "compare", ...)
 tkconfigure <- function(widget, ...) tcl(widget, "configure", ...)
 tkcoords<- function(widget, ...) tcl(widget, "coords", ...)
 tkcreate<- function(widget, ...) tcl(widget, "create", ...)
-tkcget  <- function(widget, ...) tcl(widget, "cget", ...)
-tkcoords<- function(widget, ...) tcl(widget, "coords", ...)
 tkcurselection  <- function(widget, ...) tcl(widget, "curselection", ...)
 tkdchars<- function(widget, ...) tcl(widget, "dchars", ...)
 tkdebug <- function(widget, ...) tcl(widget, "debug", ...)
@@ -508,8 +507,8 @@
 tkdlineinfo <- function(widget, ...) tcl(widget, "dlineinfo", ...)
 tkdtag  <- function(widget, ...) tcl(widget, "dtag", ...)
 tkdump  <- function(widget, ...) tcl(widget, "dump", ...)
+tkentrycget <- function(widget, ...) tcl(widget, "entrycget", ...)
 tkentryconfigure <- function(widget, ...) tcl(widget, "entryconfigure", ...)
-tkentrycget <- function(widget, ...) tcl(widget, "entrycget", ...)
 tkfind  <- function(widget, ...) tcl(widget, "find", ...)
 tkflash <- function(widget, ...) tcl(widget, "flash", ...)
 tkfraction  <- function(widget, ...) tcl(widget, "fraction", ...)
@@ -535,11 +534,11 @@
 tkmark.unset<- function(widget, ...) tcl(widget, "mark", "unset", ...)
 tkmove  <- function(widget, ...) tcl(widget, "move", ...)
 tknearest   <- function(widget, ...) tcl(widget, "nearest", ...)
+tkpostcascade   <- function(widget, ...) tcl(widget, "postcascade", ...)
 tkpost  <- function(widget, ...) tcl(widget, "post", ...)
-tkpostcascade   <- function(widget, ...) tcl(widget, "postcascade", ...)
 tkpostscript<- function(widget, ...) tcl(widget, "postscript", ...)
+tkscan.dragto   <- function(widget, ...) tcl(widget, "scan", "dragto", ...)
 tkscan.mark <- function(widget, ...) tcl(widget, "scan", "mark", ...)
-tkscan.dragto   <- function(widget, ...) tcl(widget, "scan", "dragto", ...)
 tksearch<- function(widget, ...) tcl(widget, "search", ...)
 tksee   <- function(widget, ...) tcl(widget, "see", ...)
 tkselect<- function(widget, ...) tcl(widget, "select", ...)
@@ -563,7 +562,6 @@
 tcl(widget, "selection", "to", ...)
 tkset   <- function(widget, ...) tcl(widget, "set", ...)
 tksize  <- function(widget, ...) tcl(widget, "size", ...)
-tktoggle<- function(widget, ...) tcl(widget, "toggle", ...)
 tktag.add   <- function(widget, ...) tcl(widget, "tag", "add", ...)
 tktag.bind  <- function(widget, ...) tcl(widget, "tag", "bind", ...)
 tktag.cget  <- function(widget, ...) tcl(widget, "tag", "cget", ...)
@@ -576,6 +574,7 @@
 tktag.raise <- function(widget, ...) tcl(widget, "tag", "raise", ...)
 tktag.ranges<- function(widget, ...) tcl(widget, "tag", "ranges", ...)
 tktag.remove<- function(widget, ...) tcl(widget, "tag", "remove", ...)
+tktoggle<- function(widget, ...) tcl(widget, "toggle", ...)
 tktype  <- function(widget, ...) tcl(widget, "type", ...)
 tkunpost<- function(widget, ...) tcl(widget, "unpost", ...)
 tkwindow.cget   <- function(widget, ...) tcl(widget, "window", "cget", ...)

Re: [Rd] [PATCH] remove a duplicate tk function definition (and alphabetize)

2013-09-02 Thread Scott Kostyshak

On Mon, Sep 2, 2013 at 1:37 AM, peter dalgaard  wrote:
>
> On Sep 2, 2013, at 00:42 , Duncan Murdoch wrote:
>
>> On 13-09-01 3:53 PM, peter dalgaard wrote:
>>>
>>> On Sep 1, 2013, at 20:08 , Duncan Murdoch wrote:
>>>
>>>> On 13-09-01 2:45 AM, Scott Kostyshak wrote:
>>>>> 'tkcoords' is defined twice (in the same way) in src/library/tcltk/R/Tk.R.
>>>>>
>>>>> Attached is a patch against r63780 that removes the duplicate
>>>>> definition and alphabetizes the functions.
>>>>>
>>>>> I've read that minor patches such as this should be sent to r-devel [1].
>>>>>
>>>>> Scott
>>>>>
>>>>> [1] http://permalink.gmane.org/gmane.comp.lang.r.devel/33987
>>>>
>>>> Thanks.  I would not do the alphabetization; that makes it much harder to 
>>>> track changes.  But if tkcoords is unnecessarily duplicated, that seems 
>>>> like a reasonable change.
>>>>
>>>
>>> That's what I thought at first sight, but actually, the current breach of 
>>> alphabetization is quite slight, so might be worth fixing after all, since 
>>> it helps against putting the same function in twice... (the patch has 
>>> tkpostcascade before tkpost, which is wrong, though).
>>
>> In that case, go ahead.  I didn't look at the patch due to the description.  
>> (I have committed the removal of the dupe.)

Thanks.

>>
>
> OK, done. R-devel only, like your change.

Thanks,

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


Re: [Rd] Comments requested on "changedFiles" function

2013-09-04 Thread Scott Kostyshak

drop=FALSE]
> changes <- changes[, colSums(changes, na.rm = TRUE) > 0, drop=FALSE]
> if (nrow(changes)) {
> cat("Files changed:\n")
> print(changes)
> }
> x
> }
> --
>
> --- changedFiles.Rd:
> \name{changedFiles}
> \alias{changedFiles}
> \alias{print.changedFiles}
> \alias{print.changedFilesSnapshot}
> \title{
> Detect which files have changed
> }
> \description{
> On the first call, \code{changedFiles} takes a snapshot of a selection of
> files.  In subsequent
> calls, it takes another snapshot, and returns an object containing data on
> the
> differences between the two snapshots.  The snapshots need not be the same
> directory;
> this could be used to compare two directories.
> }
> \usage{
> changedFiles(snapshot, timestamp = tempfile("timestamp"), file.info = NULL,
>  md5sum = FALSE, full.names = FALSE, ...)
> }
> \arguments{
>   \item{snapshot}{
> The path to record, or a previous snapshot.  See the Details.
> }
>   \item{timestamp}{
> The name of a file to write at the time the initial snapshot
> is taken.  In subsequent calls, modification times of files will be compared
> to
> this file, and newer files will be reported as changed.  Set to \code{NULL}
> to skip this test.
> }
>   \item{file.info}{
> A vector of columns from the result of the \code{file.info} function, or a
> logical value.  If
> \code{TRUE}, columns \code{c("size", "isdir", "mode", "mtime")} will be
> used.  Set to
> \code{FALSE} or \code{NULL} to skip this test.  See the Details.
> }
>   \item{md5sum}{
> A logical value indicating whether MD5 summaries should be taken as part of
> the snapshot.
> }
>   \item{full.names}{
> A logical value indicating whether full names (as in
> \code{\link{list.files}}) should be
> recorded.
> }
>   \item{\dots}{
> Additional parameters to pass to \code{\link{list.files}} to control the set
> of files
> in the snapshots.
> }
> }
> \details{
> This function works in two modes.  If the \code{snapshot} argument is
> missing or is
> not of S3 class \code{"changedFilesSnapshot"}, it is used as the \code{path}
> argument
> to \code{\link{list.files}} to obtain a list of files.  If it is of class
> \code{"changedFilesSnapshot"}, then it is taken to be the baseline file
> and a new snapshot is taken and compared with it.  In the latter case,
> missing
> arguments default to match those from the initial snapshot.
>
> If the \code{timestamp} argument is length 1, a file with that name is
> created
> in the current directory during the initial snapshot, and
> \code{\link{file_test}}
> is used to compare the age of all files to it during subsequent calls.
>
> If the \code{file.info} argument is \code{TRUE} or it contains a non-empty
> character vector, the indicated columns from the result of a call to
> \code{\link{file.info}} will be recorded and compared.
>
> If \code{md5sum} is \code{TRUE}, the \code{tools::\link{md5sum}} function
> will be called to record the 32-character hexadecimal MD5 checksum for each
> file, and these values will be compared.
> }
> \value{
> In the initial snapshot phase, an object of class
> \code{"changedFilesSnapshot"} is returned.  This
> is a list containing the fields
> \item{pre}{a dataframe whose rownames are the filenames, and whose columns
> contain the
> requested snapshot data}
> \item{timestamp, file.info, md5sum, full.names}{a record of the arguments in
> the initial call}
> \item{args}{other arguments passed via \code{...} to
> \code{\link{list.files}}.}
>
> In the comparison phase, an object of class \code{"changedFiles"}. This is a
> list containing
> \item{added, deleted, changed, unchanged}{character vectors of filenames
> from the before
> and after snapshots, with obvious meanings}
> \item{changes}{a logical matrix with a row for each common file, and a
> column for each
> comparison test.  \code{TRUE} indicates a change in that test.}
>
> \code{\link{print}} methods are defined for each of these types. The
> \code{\link{print}} method for \code{"changedFilesSnapshot"} objects
> displays the arguments used to produce it, while the one for
> \code{"changedFiles"} displays the \code{added}, \code{deleted}
> and \code{changed} fields if non-empty, and a submatrix of the
> \code{changes}
> matrix containing all of the \code{TRUE} values.
> }
> \author{
> Duncan Murdoch
> }
> \seealso{
> \code{\link{file.info}}, \code{\link{file_test}}, \code{\link{md5sum}}.
> }
> \examples{
> # Create some files in a temporary directory
> dir <- tempfile()
> dir.create(dir)

Should a different name than 'dir' be used since 'dir' is a base function?

Further, if someone is not very familiar with R (or just not in "R
mode" at the time of reading), they might think that 'dir.create' is
calling the create member of the object named 'dir' that you just
made.
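
As a hedged illustration of this point: R looks for a function when a name is
used in call position, so shadowing 'dir' with a character string does not
break dir(), and 'dir.create' is simply a function whose name contains a dot.
The variable name tmp_dir below is just a suggestion and is not taken from the
proposed documentation.

```r
# `dir` is rebound to a character string, as in the draft example.
dir <- tempfile()
dir.create(dir)          # dir.create() is an ordinary function, not a member of `dir`
is.character(dir)        # TRUE: `dir` is now a path
is.function(base::dir)   # TRUE: the base function is still there

# In call position R skips non-function bindings, so this still calls base::dir().
writeBin(1, file.path(dir, "file1"))
listed <- dir(dir)
print(listed)

# A less ambiguous spelling of the same setup (illustrative name).
tmp_dir <- tempfile()
dir.create(tmp_dir)
writeBin(1, file.path(tmp_dir, "file1"))
listed2 <- list.files(tmp_dir)

unlink(c(dir, tmp_dir), recursive = TRUE)
rm(dir)  # remove the shadowing variable again
```

So the draft example works as written; the suggestion is only about
readability.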

Scott

> writeBin(1, file.path(dir, "file1"))
> writeBin(2, file.path(dir, "file2"))
> dir.create(file.path(dir, "dir"))
>
> # Take a snapshot
> snapshot <- changedFiles(dir, file.info=TRUE, md5sum=TRUE)
>
> # Change one of the files
> writeBin(3, file.path(dir, "file2"))
>
> # Display the detected changes
> changedFiles(snapshot)
> changedFiles(snapshot)$changes
> }
> \keyword{utilities}
> \keyword{file}
>


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


Re: [Rd] Comments requested on "changedFiles" function

2013-09-05 Thread Scott Kostyshak

On Thu, Sep 5, 2013 at 6:48 AM, Duncan Murdoch  wrote:
> On 13-09-04 11:36 PM, Scott Kostyshak wrote:
>>
>> On Wed, Sep 4, 2013 at 1:53 PM, Duncan Murdoch 
>> wrote:
>>>
>>> In a number of places internal to R, we need to know which files have
>>> changed (e.g. after building a vignette).  I've just written a general
>>> purpose function "changedFiles" that I'll probably commit to R-devel.
>>> Comments on the design (or bug reports) would be appreciated.
>>>
>>> The source for the function and the Rd page for it are inline below.
>>
>>
>> This looks like a useful function. Thanks for writing it. I have only
>> one (picky) comment below.
>>
>>> - changedFiles.R:
>>> changedFiles <- function(snapshot, timestamp = tempfile("timestamp"),
>>> file.info = NULL,
>>>   md5sum = FALSE, full.names = FALSE, ...) {
>>>  dosnapshot <- function(args) {
>>>  fullnames <- do.call(list.files, c(full.names = TRUE, args))
>>>  names <- do.call(list.files, c(full.names = full.names, args))
>>>  if (isTRUE(file.info) || (is.character(file.info) &&
>>> length(file.info))) {
>>>  info <- file.info(fullnames)
>>>  rownames(info) <- names
>>>  if (isTRUE(file.info))
>>>  file.info <- c("size", "isdir", "mode", "mtime")
>>>  } else
>>>  info <- data.frame(row.names=names)
>>>  if (md5sum)
>>>  info <- data.frame(info, md5sum = tools::md5sum(fullnames))
>>>  list(info = info, timestamp = timestamp, file.info = file.info,
>>>   md5sum = md5sum, full.names = full.names, args = args)
>>>  }
>>>  if (missing(snapshot) || !inherits(snapshot,
>>> "changedFilesSnapshot")) {
>>>  if (length(timestamp) == 1)
>>>  file.create(timestamp)
>>>  if (missing(snapshot)) snapshot <- "."
>>>  pre <- dosnapshot(list(path = snapshot, ...))
>>>  pre$pre <- pre$info
>>>  pre$info <- NULL
>>>  pre$wd <- getwd()
>>>  class(pre) <- "changedFilesSnapshot"
>>>  return(pre)
>>>  }
>>>
>>>  if (missing(timestamp)) timestamp <- snapshot$timestamp
>>>  if (missing(file.info) || isTRUE(file.info)) file.info <-
>>> snapshot$file.info
>>>  if (identical(file.info, FALSE)) file.info <- NULL
>>>  if (missing(md5sum))md5sum <- snapshot$md5sum
>>>  if (missing(full.names)) full.names <- snapshot$full.names
>>>
>>>  pre <- snapshot$pre
>>>  savewd <- getwd()
>>>  on.exit(setwd(savewd))
>>>  setwd(snapshot$wd)
>>>
>>>  args <- snapshot$args
>>>  newargs <- list(...)
>>>  args[names(newargs)] <- newargs
>>>  post <- dosnapshot(args)$info
>>>  prenames <- rownames(pre)
>>>  postnames <- rownames(post)
>>>
>>>  added <- setdiff(postnames, prenames)
>>>  deleted <- setdiff(prenames, postnames)
>>>  common <- intersect(prenames, postnames)
>>>
>>>  if (length(file.info)) {
>>>  preinfo <- pre[common, file.info]
>>>  postinfo <- post[common, file.info]
>>>  changes <- preinfo != postinfo
>>>  }
>>>  else changes <- matrix(logical(0), nrow = length(common), ncol = 0,
>>> dimnames = list(common, character(0)))
>>>  if (length(timestamp))
>>>  changes <- cbind(changes, Newer = file_test("-nt", common,
>>> timestamp))
>>>  if (md5sum) {
>>>  premd5 <- pre[common, "md5sum"]
>>>  postmd5 <- post[common, "md5sum"]
>>>  changes <- cbind(changes, md5sum = premd5 != postmd5)
>>>  }
>>>  changes1 <- changes[rowSums(changes, na.rm = TRUE) > 0, , drop =
>>> FALSE]
>>>  changed <- rownames(changes1)
>>>  structure(list(added = added, deleted = deleted, changed = changed,
>>>  unchanged = setdiff(common, changed), changes = changes), class
>>> =
>>> "changedFiles")
>>> }
>

Re: [Rd] Comments requested on "changedFiles" function

2013-09-06 Thread Scott Kostyshak

On Fri, Sep 6, 2013 at 3:46 PM, Duncan Murdoch  wrote:
> On 06/09/2013 2:20 PM, Duncan Murdoch wrote:
>>
>> I have now put the code into a temporary package for testing; if anyone
>> is interested, for a few days it will be downloadable from
>>
>> fisher.stats.uwo.ca/faculty/murdoch/temp/testpkg_1.0.tar.gz
>
>
> Sorry, error in the URL.  It should be
>
> http://www.stats.uwo.ca/faculty/murdoch/temp/testpkg_1.0.tar.gz

Works well. A couple of things I noticed:

(1)
md5sum is being called on directories, which causes warnings. (If this
is not viewed as undesirable, please ignore the rest of this comment.)
Should this be the responsibility of the user (by passing arguments to
list.files)? In the example, changing
fileSnapshot(dir, file.info=TRUE, md5sum=TRUE)
to
fileSnapshot(dir, file.info=TRUE, md5sum=TRUE, include.dirs=FALSE,
recursive=TRUE)

gets rid of the warnings. But perhaps the user just wants to exclude
directories for the md5sum calculations. This can't be controlled from
fileSnapshot.

Or, should the "if (md5sum)" chunk subset "fullnames" using file_test
or file.info to exclude directories (and then fill in the directories
with NA)?
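
The second option could look roughly like the following. This is only a sketch
of the idea, not the code in the test package: md5_files_only is a made-up
helper that computes checksums for regular files and fills in NA for
directories.

```r
# Hypothetical helper: MD5 checksums for files, NA for directories.
md5_files_only <- function(fullnames) {
  isdir <- file.info(fullnames)$isdir
  isfile <- !is.na(isdir) & !isdir
  sums <- rep(NA_character_, length(fullnames))
  names(sums) <- fullnames
  if (any(isfile))
    sums[isfile] <- unname(tools::md5sum(fullnames[isfile]))
  sums
}

# Same layout as the example in the draft help page.
d <- tempfile()
dir.create(d)
writeBin(1, file.path(d, "file1"))
writeBin(2, file.path(d, "file2"))
dir.create(file.path(d, "dir"))

fullnames <- list.files(d, full.names = TRUE, include.dirs = TRUE)
sums <- md5_files_only(fullnames)
print(unname(sums))

unlink(d, recursive = TRUE)
```

With this approach the directory still appears in the snapshot, but its md5sum
entry is NA and md5sum is never called on it.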

(2)
If I run example(changedFiles) several times, sometimes I get:

chngdF> changedFiles(snapshot)
File changes:
  mtime md5sum
file2  TRUE   TRUE

and other times I get:

chngdF> changedFiles(snapshot)
File changes:
  md5sum
file2   TRUE

I wonder why.

Scott

> sessionInfo()
R Under development (unstable) (2013-08-31 r63780)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] testpkg_1.0

loaded via a namespace (and not attached):
[1] tools_3.1.0
>

--
Scott Kostyshak
Economics PhD Candidate
Princeton University


Re: [Rd] Comments requested on "changedFiles" function

2013-09-06 Thread Scott Kostyshak

On Fri, Sep 6, 2013 at 7:40 PM, Scott Kostyshak  wrote:
> On Fri, Sep 6, 2013 at 3:46 PM, Duncan Murdoch  
> wrote:
>> On 06/09/2013 2:20 PM, Duncan Murdoch wrote:
>>>
>>> I have now put the code into a temporary package for testing; if anyone
>>> is interested, for a few days it will be downloadable from
>>>
>>> fisher.stats.uwo.ca/faculty/murdoch/temp/testpkg_1.0.tar.gz
>>
>>
>> Sorry, error in the URL.  It should be
>>
>> http://www.stats.uwo.ca/faculty/murdoch/temp/testpkg_1.0.tar.gz
>
> Works well. A couple of things I noticed:
>
> (1)
> md5sum is being called on directories, which causes warnings. (If this
> is not viewed as undesirable, please ignore the rest of this comment.)
> Should this be the responsibility of the user (by passing arguments to
> list.files)? In the example, changing
> fileSnapshot(dir, file.info=TRUE, md5sum=TRUE)
> to
> fileSnapshot(dir, file.info=TRUE, md5sum=TRUE, include.dirs=FALSE,
> recursive=TRUE)
>
> gets rid of the warnings. But perhaps the user just wants to exclude
> directories for the md5sum calculations. This can't be controlled from
> fileSnapshot.
>
> Or, should the "if (md5sum)" chunk subset "fullnames" using file_test
> or file.info to exclude directories (and then fill in the directories
> with NA)?
>
> (2)
> If I run example(changedFiles) several times, sometimes I get:
>
> chngdF> changedFiles(snapshot)
> File changes:
>   mtime md5sum
> file2  TRUE   TRUE
>
> and other times I get:
>
> chngdF> changedFiles(snapshot)
> File changes:
>   md5sum
> file2   TRUE
>
> I wonder why.

Putting the following in-between snapshot and writeBin in the example
leads to consistent output:

# allow for mtime to change
Sys.sleep(.1)
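
The reason the pause helps is presumably that the two writes can land within
the same modification-time tick of the file system, in which case mtime does
not change even though the contents do. A small sketch of the effect, assuming
a time stamp resolution of one second or finer (the 1.1 second pause is chosen
for that assumption, not taken from the example):

```r
f <- tempfile()
writeBin(1, f)
mtime_before <- file.info(f)$mtime
md5_before <- unname(tools::md5sum(f))

# Wait longer than the assumed time stamp resolution before rewriting.
Sys.sleep(1.1)
writeBin(3, f)
mtime_after <- file.info(f)$mtime
md5_after <- unname(tools::md5sum(f))

mtime_changed <- mtime_after > mtime_before  # expected TRUE after the pause
md5_changed <- md5_after != md5_before       # TRUE regardless of timing
print(c(mtime = mtime_changed, md5sum = md5_changed))

unlink(f)
```

Without the pause, mtime_changed can come out FALSE on some runs, which would
match the inconsistent output described above.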

Scott

>
> Scott
>
>> sessionInfo()
> R Under development (unstable) (2013-08-31 r63780)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] testpkg_1.0
>
> loaded via a namespace (and not attached):
> [1] tools_3.1.0
>>
>
>
> --
> Scott Kostyshak
> Economics PhD Candidate
> Princeton University


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


Re: [Rd] Comments requested on "changedFiles" function

2013-09-08 Thread Scott Kostyshak

On Sun, Sep 8, 2013 at 10:55 AM, Duncan Murdoch
 wrote:
> On 13-09-06 11:07 PM, Karl Millar wrote:
>>
>> On Fri, Sep 6, 2013 at 7:03 PM, Duncan Murdoch 
>> wrote:
>>>
>>> On 13-09-06 9:21 PM, Karl Millar wrote:
>>>>
>>>>
>>>> Hi Duncan,
>>>>
>>>> I like the interface of this version a lot better, but there's still a
>>>> bunch of implementation details that need fixing:
>>>>
>>>> * As previously mentioned, there are important cases where the mtime
>>>> values change in ways that this code doesn't detect.
>>>> * If the timestamp file (which is usually in the temp directory) gets
>>>> deleted (which can happen after a moderate amount of time of
>>>> inactivity on some systems), then the file_test('-nt', ...) will
>>>> always return false, even if the file has changed.
>>>
>>>
>>>
>>> If that happened without user intervention, I think it would break other
>>> things in R -- the temp directory is supposed to last for the whole
>>> session.
>>> But I should be checking anyway.
>>
>>
>> Yes, it does break other things in R -- my experience has been that
>> the help system seems to be the one that is impacted the most by this.
>>   FWIW, I've never seen the entire R temp directory deleted, just
>> individual files and subdirectories in it, but even that probably
>> depends on how the machine is configured.  I suspect only a few users
>> ever notice this, but my R use is probably somewhat anomalous and I
>> think it only happens to R sessions that I haven't used for a few
>> days.
>
>
> I use Windows and never see this; deleting temp files is up to me, not to
> the system.  But my understanding was the *nix systems should only clean up
> /tmp on restart, and I don't think an R session will survive a restart.
>
> However, you have convinced me that the use of the timestamp file is not
> beneficial enough to be the default.  I'll leave it as an option, but add
> warnings that it might be unreliable.
>
>
>>
>>>> * If files get added or deleted between the two calls to list.files in
>>>> fileSnapshot, it will fail with an error.
>>>
>>>
>>>
>>> Yours won't work if path contains more than one directory.  This is
>>> probably
>>> a reasonable restriction, but it's inconsistent with list.files, so I'd
>>> like
>>> to avoid it if I can find a way.
>>
>>
>> I'm currently unsure what the behaviour when comparing snapshots with
>> multiple directories should be.
>>
>> Presumably we should have the property that (horribly abusing notation
>> for succinctness):
>>    compareSnapshots(c(a1, a2), c(a1, a2))
>> is the same as concatenating (in some form)
>>    compareSnapshots(a1, a1) and compareSnapshots(a2, a2)
>> and there's a bunch of ways we could concatenate -- we could return a
>> list of results, or a single result where each of the 'added, deleted,
>> modified' fields are a list, or where we concatenate the 'added,
>> deleted, modified' fields together into three simple vectors.
>> Concatenating the vectors together like this is appealing, but unless
>> you're using the full names, it doesn't include the information of
>> which directory the changes are in, and using the full names doesn't
>> work in the case where you're comparing different sets of directories,
>> e.g. compareSnapshots(c(a1, a2), c(b1, b2)), where there is no
>> sensible choice for a full name.  The list options don't have this
>> problem, but are harder to work with, particularly for the common case
>> where there's only a single directory.  You'd also have to be somewhat
>> careful with filenames that occur in both directories.
>>
>> Maybe I'm just being dense, but I don't see a way to do this that's
>> clear, easy to use and wouldn't confuse users at the moment.
>
>
> The way I've done this is to require full.names when multiple dirs are on
> the path.  I've reduced it to one list.files() call per dir, by iterating
> over the path variable and using your approach of calling it with full.names
> = FALSE, then adding the dir if necessary.
>
> I haven't adopted your change that forces comparison of only size and mtime
> from file.info.  I don't see a big cost in storing whatever file.info
> returns (which is system dependent; on Windows I don't see the user an

[Rd] tools::md5sum(directory) behavior different on Windows vs. Unix

2013-09-09 Thread Scott Kostyshak

tools::md5sum gives a warning if it receives a directory as an
argument on Unix but not on Windows.

From what I understand, this happens because in Windows a directory is
not treated as a file so fopen returns NULL. Then, NA is returned
without a warning. On Unix, a directory is treated as a file so fopen
does not return NULL so md5 is run and fails, leading to a warning.
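
A minimal sketch for checking this on a given platform follows. It only
uses tools::md5sum() and standard condition handling; the variable names
are mine, and whether 'warned' ends up TRUE is exactly the platform
difference described above.

```r
d <- tempdir()  # a directory that certainly exists in this session

# Record whether a warning is signalled instead of printing it, so the
# platform difference becomes visible as a value.
warned <- FALSE
res <- withCallingHandlers(
  tools::md5sum(d),
  warning = function(w) {
    warned <<- TRUE
    invokeRestart("muffleWarning")
  }
)

unname(res)  # expected to be NA for a directory
warned       # platform dependent, per the discussion above
```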

This is a good opportunity for me to understand further (in addition
to [1] and the many places where OS special cases are mentioned) in
which cases R tries to behave the same on Windows as on Unix and in
which cases it allows for differences (in this case, a warning vs. no
warning). For example, it would be straightforward to create a patch
that would lead to the same behavior in this case. tools::md5sum could
either issue a warning for each argument that is a directory or it
could issue no warning (consistent with file.info). Would either patch
be considered?

Or is this difference encouraged because the concept of a file is
different on Unix than on Windows?

Scott

[1] 
http://cran.r-project.org/bin/windows/base/rw-FAQ.html#What-should-I-expect-to-behave-differently-from-the-Unix-version


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


Re: [Rd] tools::md5sum(directory) behavior different on Windows vs. Unix

2013-09-29 Thread Scott Kostyshak

On Mon, Sep 9, 2013 at 3:00 AM, Scott Kostyshak wrote:
> tools::md5sum gives a warning if it receives a directory as an
> argument on Unix but not on Windows.
>
> From what I understand, this happens because in Windows a directory is
> not treated as a file so fopen returns NULL. Then, NA is returned
> without a warning. On Unix, a directory is treated as a file so fopen
> does not return NULL so md5 is run and fails, leading to a warning.
>
> This is a good opportunity for me to understand further (in addition
> to [1] and the many places where OS special cases are mentioned) in
> which cases R tries to behave the same on Windows as on Unix and in
> which cases it allows for differences (in this case, a warning vs. no
> warning). For example, it would be straightforward to create a patch
> that would lead to the same behavior in this case. tools::md5sum could
> either issue a warning for each argument that is a directory or it
> could issue no warning (consistent with file.info). Would either patch
> be considered?

Attached is a patch that gives a warning if an element in the file
argument is not a regular file (e.g. is a directory or does not
exist). In my opinion the advantages of this patch are:

(1) the same warnings are generated on all platforms in the case where
one of the elements is a folder.
(2) a warning is also given if a file does not exist.
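
For those who want to try the proposed behavior without rebuilding R,
here is a rough user-level equivalent of the attached patch. The wrapper
name md5sum_checked() is hypothetical; it calls the existing
tools::md5sum() only on regular files and fills in NA for the rest.

```r
# Hypothetical wrapper; mirrors the logic of the patch at R level.
md5sum_checked <- function(files) {
  is_regular <- utils::file_test("-f", files)
  if (!all(is_regular))
    warning("The following are not regular files: ",
            paste(shQuote(files[!is_regular]), collapse = " "))
  out <- rep(NA_character_, length(files))
  names(out) <- files
  if (any(is_regular))
    out[is_regular] <- tools::md5sum(files[is_regular])
  out
}

# Usage: one regular file, one directory, one missing file.
f <- tempfile()
writeLines("abc", f)
out <- suppressWarnings(
  md5sum_checked(c(f, tempdir(), file.path(tempdir(), "no-such-file")))
)
out
```

With this, the same single warning is produced on every platform, and
only the regular file gets a checksum.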

Comments?

Scott

>
> Or is this difference encouraged because the concept of a file is
> different on Unix than on Windows?
>
> Scott
>
> [1] 
> http://cran.r-project.org/bin/windows/base/rw-FAQ.html#What-should-I-expect-to-behave-differently-from-the-Unix-version
>
>
> --
> Scott Kostyshak
> Economics PhD Candidate
> Princeton University
Index: trunk/src/library/tools/R/md5.R
===================================================================
--- trunk/src/library/tools/R/md5.R (revision 64011)
+++ trunk/src/library/tools/R/md5.R (working copy)
@@ -17,7 +17,18 @@
 #  http://www.r-project.org/Licenses/
 
 md5sum <- function(files)
-    structure(.Call(Rmd5, files), names=files)
+{
+    reg_ <- file_test("-f", files)
+    regFiles <- files[reg_]
+    notReg <- files[!reg_]
+    if(!all(reg_))
+        warning("The following are not regular files: ",
+                paste(shQuote(notReg), collapse = " "))
+    names(files) <- files
+    files[!reg_] <- NA
+    files[reg_] <- .Call(Rmd5, regFiles)
+    files
+}
 
 .installMD5sums <- function(pkgDir, outDir = pkgDir)
 {
Index: trunk/src/library/tools/man/md5sum.Rd
===================================================================
--- trunk/src/library/tools/man/md5sum.Rd   (revision 64011)
+++ trunk/src/library/tools/man/md5sum.Rd   (working copy)
@@ -18,7 +18,8 @@
 \value{
   A character vector of the same length as \code{files}, with names
   equal to \code{files}. The elements
-  will be \code{NA} for non-existent or unreadable files, otherwise
+  will be \code{NA} for non-existent or unreadable files (in which case
+  a warning will be generated), otherwise
   a 32-character string of hexadecimal digits.
 
   On Windows all files are read in binary mode (as the \code{md5sum}

[Rd] [PATCH] file.access returns success for NA

2013-10-03 Thread Scott Kostyshak

Currently on R I get the following:

> file.access(c("doesNotExist", NA))
doesNotExist         <NA> 
          -1            0 

where 0 means success. Is the 0 correct? I was expecting either NA or -1.

?file.access does not mention how NA values should be handled. The
subsection "3.3.4 NA handling" from the R Language Definition manual
suggests to me that file.access should return NA if given NA. I
interpret it in this way because if an element in the input vector is
NA, that means that there is a filename that exists but is not known.
Thus, I thought that file.access should return NA because it is not
known whether the file corresponding to the missing filename exists.

Perhaps file.access acts in this way to maintain compatibility with
the S-PLUS function ‘access’ (which I currently do not have a way of
testing to see how it handles NAs)? If this is the case, would a
patch for ?file.access be considered?

Below is a patch that changes the return of an NA to NA.

Index: trunk/src/main/platform.c
===================================================================
--- trunk/src/main/platform.c (revision 64011)
+++ trunk/src/main/platform.c (working copy)
@@ -1299,7 +1299,7 @@
  access(R_ExpandFileName(translateChar(STRING_ELT(fn, i))),
modemask);
 #endif
- } else INTEGER(ans)[i] = FALSE;
+ } else INTEGER(ans)[i] = NA_INTEGER;
 UNPROTECT(1);
 return ans;
 }
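
The behavior the patch aims for can also be sketched at R level, without
touching platform.c. The wrapper name file_access_na() is hypothetical;
it simply refuses to pass NA names on to file.access() and reports NA
for them instead.

```r
# Hypothetical wrapper: NA in, NA out; everything else is delegated.
file_access_na <- function(names, mode = 0) {
  res <- rep(NA_integer_, length(names))
  known <- !is.na(names)
  res[known] <- unname(file.access(names[known], mode))
  res
}

# A path that should not exist, plus a missing file name.
missing_path <- file.path(tempdir(), "doesNotExist")
file_access_na(c(missing_path, NA))
```

The first element is -1 (the file does not exist) and the second is NA
(we do not know which file was meant).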

Comments?

Scott

> sessionInfo()
R Under development (unstable) (2013-09-27 r64011)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base
>


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


Re: [Rd] Possible POSIXlt / wday glitch & bugs.r-project.org status

2013-10-04 Thread Scott Kostyshak

On Fri, Oct 4, 2013 at 6:11 AM, Imanuel Costigan wrote:
> Wanted to raise two questions:
>
> 1. Is bugs.r-project.org down? I haven't been able to reach it for two or 
> three days:

Yes. Quote from Duncan:

... the server is currently down. The volunteer who runs the server is
currently away from his office, so I expect it won't get fixed until he
gets back in a few days.

https://stat.ethz.ch/pipermail/r-help/2013-October/360958.html

Scott

>
> ```
> ping bugs.r-project.org
> PING rbugs.research.att.com (207.140.168.137): 56 data bytes
> Request timeout for icmp_seq 0
> Request timeout for icmp_seq 1
> Request timeout for icmp_seq 2
> Request timeout for icmp_seq 3
> Request timeout for icmp_seq 4
> Request timeout for icmp_seq 5
> Request timeout for icmp_seq 6
> ```
>
> 2. Is wday element of POSIXlt meant to be timezone invariant? You would 
> expect the wday element to be invariant to the timezone of a date. That is, 
> the same date/time instant of 5th October 2013 in both Australia/Sydney and 
> UTC should be a Saturday (i.e. wday = 6). And indeed that is the case with 1 
> min past midnight on 5 October 2013:
>
> ```
> library(lubridate)
> d_utc <- ymd_hms(2013100501, tz='UTC')
> d_local <- ymd_hms(2013100501, tz='Australia/Sydney')
> as.POSIXlt(x=d_utc, tz=tz(d_utc))$wday # 6
> as.POSIXlt(x=d_local, tz=tz(d_local))$wday # 6
> ```
>
> But this isn't always the case. For example,
>
> ```
> d_utc <- ymd_hms(2038100201, tz='UTC')
> d_local <- ymd_hms(2038100201, tz='Australia/Sydney')
> as.POSIXlt(x=d_utc, tz=tz(d_utc))$wday # 6
> as.POSIXlt(x=d_local, tz=tz(d_local))$wday # 5
> ```
>
> Is this expected behaviour? I would have expected a properly encoded 
> date/time of 2 Oct 2038 to be a Saturday irrespective of its time zone.
>
> Obligatory system dump:
>
> ```
>> sessionInfo()
> R version 3.0.1 (2013-05-16)
> Platform: x86_64-apple-darwin12.4.0 (64-bit)
>
> locale:
> [1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
>
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
>
> other attached packages:
> [1] lubridate_1.3.0 testthat_0.7.1  devtools_1.3
>
> loaded via a namespace (and not attached):
>  [1] colorspace_1.2-4   dichromat_2.0-0    digest_0.6.3       evaluate_0.5.1
>  [5] ggplot2_0.9.3.1    grid_3.0.1         gtable_0.1.2       httr_0.2
>  [9] labeling_0.2       MASS_7.3-29        memoise_0.1        munsell_0.4.2
> [13] parallel_3.0.1     plyr_1.8           proto_0.3-10       RColorBrewer_1.0-5
> [17] RCurl_1.95-4.1     reshape2_1.2.2     scales_0.2.3       stringr_0.6.2
> [21] tools_3.0.1        whisker_0.3-2
>
> ```
>
> Using R compiled by homebrew [1]. But also experiencing the same bug using R 
> installed on Windows 7 from the CRAN binaries.
>
> For those interested, I've also noted this on the `lubridate` Github issues 
> page [2], even though this doesn't appear to be a lubridate issue.
>
> Thanks for any help.
>
> [1] http://brew.sh
> [2] https://github.com/hadley/lubridate/issues/209
>


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


[Rd] [PATCH] minor suggestions for R-ints manual

2013-10-12 Thread Scott Kostyshak

Attached is a patch with minor suggestions for the R-ints manual at
r64048. The most substantial change is the following:

 The top layer comprises the graphics subsystems. Although there is
-provision for 24 subsystems, after 6 years only two exist, `base' and
+provision for 24 subsystems, since 2001 only two exist, `base' and
 `grid'.

Is the year 2001 correct? I base it on the date of the commit that
introduced the "6 years" string and on the date of grid 0.1.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University
Index: trunk/doc/manual/R-ints.texi
===================================================================
--- trunk/doc/manual/R-ints.texi(revision 64048)
+++ trunk/doc/manual/R-ints.texi(working copy)
@@ -462,7 +462,7 @@
 (which are 32 bits on all @R{} platforms).
 
 @item REALSXP
-@code{length}, @code{truelength} followed by a block of C @code{double}s
+@code{length}, @code{truelength} followed by a block of C @code{double}s.
 
 @item CPLXSXP
 @code{length}, @code{truelength} followed by a block of C99 @code{double
@@ -1330,7 +1330,7 @@
 The relationship between the pairs is similar: @code{warning} tries to
 fathom out a suitable call, and then calls @code{warningcall} with that
 call as the first argument if it succeeds, and with @code{call =
-R_NilValue} it is does not.  When @code{warningcall} is called, it
+R_NilValue} if it does not.  When @code{warningcall} is called, it
 includes the deparsed call in its printout unless @code{call =
 R_NilValue}.
 
@@ -2289,12 +2289,12 @@
 @file{src/main/names.c}: primitives have @samp{Y = 0} in the @samp{eval}
 field.
 
-There needs to an a @samp{\alias} entry in a help file in the @pkg{base}
+There needs to be a @samp{\alias} entry in a help file in the @pkg{base}
 package, and the primitive needs to be added to one of the lists at the
 start of this section.
 
 Some primitives are regarded as language elements (the current ones are
-listed above).  These need to be in added to two lists of exceptions,
+listed above).  These need to be added to two lists of exceptions,
 @code{langElts} in @code{undoc()} (in file
 @file{src/library/tools/R/QC.R}) and @code{lang_elements} in
 @file{tests/primitives.R}.
@@ -2778,7 +2778,7 @@
 
 
 The top layer comprises the graphics subsystems. Although there is
-provision for 24 subsystems, after 6 years only two exist, `base' and
+provision for 24 subsystems, since 2001 only two exist, `base' and
 `grid'.  The base subsystem is registered with the engine when @R{} is
 initialized, and unregistered (via @code{KillAllDevices}) when an @R{}
 session is shut down.  The grid subsystem is registered in its
@@ -3797,7 +3797,7 @@
 interactively.
 Default: true.
 @item _R_CHECK_VIGNETTES_NLINES_
-Maximum number of lines to show of the bottom of the output when reporting
+Maximum number of lines to show at the bottom of the output when reporting
 errors in running vignettes.
 Default: 10.
 @item _R_CHECK_CODOC_S4_METHODS_
@@ -4258,7 +4258,7 @@
 @file{Renviron} file.  This used to record @samp{false} if no command
 was found, but it nowadays records the name for looking up on the path
 at run time.  The latter can be important for binary distributions: one
-does not want to be tied to, for example, TeXLive 2007.
+does not want to be tied to, for example, TeX Live 2007.
 
 
 @node Current and future directions, Function and variable index, Use of TeX dialects, Top
@@ -4408,7 +4408,7 @@
 are supported provided that each of the dimensions is no more than
 2^31-1.  However, not all applications can be supported.
 
-The main problem is linear algebra, on done by FORTRAN code compiled
+The main problem is linear algebra done by FORTRAN code compiled
 with 32-bit @code{INTEGER}.  Although not guaranteed, it seems that all
 the compilers currently used with @R{} on a 64-bit platform allow
 matrices each of whose dimensions is less than 2^31 but with more than
@@ -4416,7 +4416,7 @@
 support software (such as @acronym{BLAS} and @acronym{LAPACK}) also
 work.
 
-There are exceptions: for example some complex @acronym{LAPACK})
+There are exceptions: for example some complex @acronym{LAPACK}
 auxiliary routines do use a single @code{INTEGER} index and hence
 overflow silently and segfault or give incorrect results.  One example
 is @code{svd()} on a complex matrix.

Re: [Rd] [PATCH] minor suggestions for R-ints manual

2013-11-06 Thread Scott Kostyshak

On Tue, Nov 5, 2013 at 11:43 AM, Martin Maechler wrote:
>>>>>> Scott Kostyshak 
>>>>>> on Sat, 12 Oct 2013 17:50:52 -0400 writes:
>
> > Attached is a patch with minor suggestions for the R-ints
> > manual at r64048. The most substantial change is the
> > following:
>
> >  The top layer comprises the graphics subsystems. Although
> > there is -provision for 24 subsystems, after 6 years only
> > two exist, `base' and +provision for 24 subsystems, since
> > 2001 only two exist, `base' and `grid'.
>
> > Is the year 2001 correct? I base it on the date of the
> > commit that introduced the "6 years" string and on the
> > date of grid 0.1.
>
> > Scott
>
> I've used  "about 2001" and otherwise basically applied your patch
> (after checking it).
>
> Thank you very much for your contribution!

Thank *you* Martin!

Scott

>
> Martin Maechler
>
>
> > --
> > Scott Kostyshak Economics PhD Candidate Princeton
> > University


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


[Rd] [PATCH] suggestions for R-lang manual

2013-11-20 Thread Scott Kostyshak

Attached is a patch with suggestions for the R-lang manual at r64277.

Below are a few comments (some are implemented in the patch):

In the section "Objects", there is a table introduced by "The
following table describes the possible values returned by typeof". One
of the results is "any". Can "any" be returned by "typeof()"?

Regarding the "Recycling rules" section,

-One exception is that when adding vectors to matrices, a warning is not
-given if the lengths are incompatible.
-@c Is that a bug?
-

was this a bug that was fixed? I see the following behavior:

> myvec <- 1:3
> mymat <- matrix(1:12, ncol=2)
> myvec <- 1:5
> myvec + mymat
     [,1] [,2]
[1,]    2    9
[2,]    4   11
[3,]    6   13
[4,]    8   15
[5,]   10   12
[6,]    7   14
Warning message:
In myvec + mymat :
  longer object length is not a multiple of shorter object length
>
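
The same check can be scripted so that it does not rely on reading the
console output. This is only a sketch of one way to test for the
warning; the object names follow the session above.

```r
myvec <- 1:5
mymat <- matrix(1:12, ncol = 2)

# Turn the warning (if any) into a value we can inspect.
res <- tryCatch(myvec + mymat, warning = function(w) w)

inherits(res, "warning")   # TRUE if the vector + matrix sum warned
if (inherits(res, "warning")) conditionMessage(res)
```

If the exception described in the manual still held, res would be the
6 x 2 matrix rather than a warning condition.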

Regarding

-The arguments in the call to the generic are rematched with the
-arguments for the method using the standard argument matching mechanism.
-The first argument, i.e.@: the object, will have been evaluated.
-

this information is duplicated. See a few paragraphs up "When the
method is invoked it is called..."

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University
Index: trunk/doc/manual/R-lang.texi
===================================================================
--- trunk/doc/manual/R-lang.texi(revision 64277)
+++ trunk/doc/manual/R-lang.texi(working copy)
@@ -1064,7 +1064,7 @@
 @cindex function
 @cindex function arguments
 Function calls can have @emph{tagged} (or @emph{named}) arguments, as in
-@code{plot(x, y, pch = 3)} arguments without tags are known as
+@code{plot(x, y, pch = 3)}.  Arguments without tags are known as
 @emph{positional} since the function must distinguish their meaning from
 their sequential positions among the arguments of the call, e.g., that
 @code{x} denotes the abscissa variable and @code{y} the ordinate.  The
@@ -1308,10 +1308,10 @@
 ignored.  If @var{value1} has any type other than a logical or a numeric
 vector an error is signalled.
 
-If/else statements can be used to avoid numeric problems such as taking
-the logarithm of a negative number.  Because if/else statements are the
-same as other statements you can assign the value of them.  The two
-examples below are equivalent.
+@code{if}/@code{else} statements can be used to avoid numeric problems
+such as taking the logarithm of a negative number.  Because
+@code{if}/@code{else} statements are the same as other statements you
+can assign the value of them.  The two examples below are equivalent.
 
 @example
 > if( any(x <= 0) ) y <- log(1+x) else y <- log(x)
@@ -1327,7 +1327,7 @@
 compound statement wrapped in braces, putting the @code{else} on the
 same line as the closing brace that marks the end of the statement.
 
-If/else statements can be nested.
+@code{if}/@code{else} statements can be nested.
 
 @example
 if ( @var{statement1} ) @{
@@ -1342,7 +1342,7 @@
 
 One of the even numbered statements will be evaluated and the resulting
 value returned.  If the optional @code{else} clause is omitted and all
-the odd numbered @var{statement}'s evaluate to @code{FALSE} no statement
+the odd numbered @var{statement}s evaluate to @code{FALSE} no statement
 will be evaluated and @code{NULL} is returned.
 
 The odd numbered @var{statement}s are evaluated, in order, until one
@@ -1378,7 +1378,7 @@
 of the loop (if there is one) is then executed.  No statement below
 @code{next} in the current loop is evaluated.
 
-The value returned by a loop statement statement is always @code{NULL}
+The value returned by a loop statement is always @code{NULL}
 and is returned invisibly.
 
 @node repeat, while, Looping, Control structures
@@ -1451,7 +1451,7 @@
 where the elements of @var{list} may be named.  First, @var{statement}
 is evaluated and the result, @var{value}, obtained.  If @var{value} is a
 number between 1 and the length of @var{list} then the corresponding
-element @var{list} is evaluated and the result returned.  If @var{value}
+element of @var{list} is evaluated and the result returned.  If @var{value}
 is too large or too small @code{NULL} is returned.
 
 @example
@@ -1530,10 +1530,6 @@
 As from @R{} 1.4.0, any arithmetic operation involving a zero-length
 vector has a zero-length result.
 
-One exception is that when adding vectors to matrices, a warning is not
-given if the lengths are incompatible.
-@c Is that a bug?
-
 @node Propagation of names, Dimensional attributes, Recycling rules, Elementary arithmetic operations
 @subsection Propagation of names
 @cindex name
@@ -1842,7 +1838,7 @@
 matching.
 
 The most important example of a class method for @code{[} is that used
-for data frames.  It is not be described in detail here (see the help
+for data frames.  It is not described in detail here (see the help
 page for @cod

Re: [Rd] help page of warnings()

2013-12-28 Thread Scott Kostyshak

On Sat, Dec 28, 2013 at 6:06 PM, Elad Zippory wrote:
> Hi,
>
> I raised this issue at stackoverflow and it was suggested to raise it here:
>
> From the current help page, it is unclear that "warnings()" does not clear
> after rm(list=ls()). Currently the page states that:
>
> "Warning: It is undocumented where last.warning is stored nor that it is
> visible, and this is subject to change. Prior to R 2.4.0 it was stored in
> the workspace, but no longer."
>
> Yet, I suggest that, if to keep the current behavior or until the behavior
> is changed, at least write explicitly in the help file something like
> "clearing the global environment will not clear the warning list. To do so
> use assign("last.warning", NULL, envir = baseenv())"
>
> Thank you,
> Elad Zippory

Hi Elad,

I'm not a decision maker around here but I'm curious about your
suggestion. I always find it helpful to try to understand how people
use R and how they expect R to work.

From what I understand, you agree that there's no contradiction of
behavior in terms of how R is documented to work and you agree that
rm(list=ls()) should indeed not clear the warnings list. First, let me
give my observation that I think the policy of writing R documentation
is to give sufficient information for what a function does. When there
is something surprising or there are performance issues to keep in
mind, occasionally the R documentation appropriately mentions what a
function does not do.

I think you are interested in making more of a "let's make it easier
on the user" argument so let me try to address that. I think it's easy
to learn how to find the last.warning object. This would only require
a user to read the first line of ?warnings and then to know about the
getAnywhere function. That's it.

In fact, I think that's too easy. I would personally be in favor of
making it _more_ difficult for a beginning user to modify
last.warning. I've never had to do such a thing and I would be
suspicious of beginning/intermediate users who claim there's a need
to. If you want a fresh R session, use a fresh R session. Clearing the
global environment will not give a fresh R session. Clearing the
global environment and clearing warnings will not do so either. In my
opinion, it is tricks like these that can lead to unfortunate
situations where results are not reproducible.

Also, you mention a Stack Overflow question. If you are going to refer
to something, please provide a link (perhaps in a footnote like this
[1] if you do not want to put a long distracting URL in your message).
Maybe there is no useful discussion there, but maybe there is and the
discussion has already raised the points I raise in this email. The
reader of your message is thus left wondering.

Let me note that I'm just an ordinary R user. I hope I don't scare you
off from giving more suggestions and wouldn't be surprised if others
disagree. I hope you send more messages like the one you just sent
because I'm interested in understanding what R users find confusing.

Best regards,

Scott

[1] an old but related Stack Overflow question:
http://stackoverflow.com/questions/5725106/r-how-to-clear-all-warnings

--
Scott Kostyshak
Economics PhD Candidate
Princeton University


Re: [Rd] help page of warnings()

2013-12-29 Thread Scott Kostyshak

On Sat, Dec 28, 2013 at 11:19 PM, Elad Zippory wrote:
> Hi Scott,
>
> Thank you for your detailed response. (btw, the reason why I didn't link the
> Stack Overflow question is because I deleted it after I sent the e-mail).

Hi Elad,

Please keep the conversation on the list unless there is a reason for
it to be private, in which case please say so. This way everyone can
participate (and more importantly can correct my errors).

> The rationale behind my proposal was because I was surprised to learn that
> rm(list=ls()) does not clear the warning list. The reason why I was
> surprised is because it is not clear from the help page (if you are at a
> level that requires you to read the help page of such a base function, the
> warning that I quoted does not fully warn the 'user', who is not a
> 'developer', what is going on. Environments in R are not trivial knowledge
> that can be raised too concisely).

In some cases environments can be thought of like lists. As for how
name look-up goes, yes it takes some studying to learn about that.
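
As a small illustration of both points (a sketch only, with made-up
object names): an environment can be filled and read with list-style
syntax, but unlike a list it also takes part in name look-up.

```r
e <- new.env()
e$x <- 1                     # list-style assignment
assign("y", 2, envir = e)    # equivalent, more explicit form
sort(ls(e))                  # names stored in the environment
e[["y"]]                     # list-style extraction
as.list(e)                   # copy the contents into an ordinary list

# Name look-up: a function searches its own frame first, then its
# enclosing environment, which we set to 'e' here.
f <- function() x
environment(f) <- e
f()                          # finds the 'x' stored in 'e'
```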

> The reason why it mattered is because I am writing a program to be run on
> our HPC, and I want it to abort when there is a warning so I can attend to
> it right away. No point to discover after expensive usage that some warning
> should be investigated, casting doubt on several days of computation. It is
> also useful when writing recursive code, to abort immediately when the
> warning list is populated as it is very hard to understand what went wrong,
> and especially, where...

This is a great programming strategy. You might be interested in one
of my favorite recommendations: treat warnings like errors.

options(warn = 2) # asks R to treat warnings as errors. See ?options

As for knowing more precisely where something went wrong (not which
line of code, but which function), consider using the traceback()
function. Or, in addition to the above options command, you might like:

options(error = recover) # asks R to enter the debugger when there is an error

and because warnings are now errors, it also enters the debugger for
warnings. This way you can poke around where the warning occurred.
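
For a non-interactive job (where recover is of little use), the effect
of warn = 2 can be seen in a few lines. This is a minimal sketch; the
call that triggers the warning is just an arbitrary example.

```r
old <- options(warn = 2)          # from here on, warnings become errors

res <- tryCatch(
  as.integer("not a number"),     # would normally only warn and give NA
  error = function(e) conditionMessage(e)
)

options(old)                      # restore the previous setting
res                               # the error message, not NA
```

In a batch script one would simply set the option at the top and let the
job stop at the first warning.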

> So, those were my motivations. Again, if I would know that I need a fresh R
> session, I would get it. I don't like 'touching' what I don't understand. I
> just wish I knew I needed to do so without wasting a day trying to debug a
> warning, where all my actions to debug it were 'virtual'.

I still don't see a need to manually access last.warning for the
situation you described.

> Again, thank you for your detailed response, I hope that the case I am
> making is clearer now.

Thank you for giving more details on what you're trying to accomplish.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

> Best regards,
> Elad Zippory
> Ph.D student
> Politics, NYU

> On Sat, Dec 28, 2013 at 9:19 PM, Scott Kostyshak wrote:
>>
>> On Sat, Dec 28, 2013 at 6:06 PM, Elad Zippory wrote:
>> > Hi,
>> >
>> > I raised this issue at stackoverflow and it was suggested to raise it
>> > here:
>> >
>> > From the current help page, it is unclear that "warnings()" does not
>> > clear
>> > after rm(list=ls()). Currently the page states that:
>> >
>> > "Warning: It is undocumented where last.warning is stored nor that it is
>> > visible, and this is subject to change. Prior to R 2.4.0 it was stored
>> > in
>> > the workspace, but no longer."
>> >
>> > Yet, I suggest that, if to keep the current behavior or until the
>> > behavior
>> > is changed, at least write explicitly in the help file something like
>> > "clearing the global environment will not clear the warning list. To do
>> > so
>> > use assign("last.warning", NULL, envir = baseenv())"
>> >
>> > Thank you,
>> > Elad Zippory
>>
>> Hi Elad,
>>
>> I'm not a decision maker around here but I'm curious about your
>> suggestion. I always find it helpful to try to understand how people
>> use R and how they expect R to work.
>>
>> From what I understand, you agree that there's no contradiction of
>> behavior in terms of how R is documented to work and you agree that
>> rm(list=ls()) should indeed not clear the warnings list. First, let me
>> give my observation that I think the policy of writing R documentation
>> is to give sufficient information for what a function does. When there
>> is something surprising or there are performance issues to keep in
>> mind, occasionally the R docum

Re: [Rd] [PATCH] suggestions for R-lang manual

2014-02-27 Thread Scott Kostyshak

On Thu, Nov 21, 2013 at 1:17 AM, Scott Kostyshak  wrote:
> Attached is a patch with suggestions for the R-lang manual at r64277.
>
> Below are a few comments (some are implemented in the patch):
>
> In the section "Objects", there is a table introduced by "The
> following table describes the possible values returned by typeof". One
> of the results is "any". Can "any" be returned by "typeof()" ?
>
> Regarding the "Recycling rules" section,
>
> -One exception is that when adding vectors to matrices, a warning is not
> -given if the lengths are incompatible.
> -@c Is that a bug?
> -
>
> was this a bug that was fixed? I see the following behavior:
>
>> myvec <- 1:3
>> mymat <- matrix(1:12, ncol=2)
>> myvec <- 1:5
>> myvec + mymat
>      [,1] [,2]
> [1,]    2    9
> [2,]    4   11
> [3,]    6   13
> [4,]    8   15
> [5,]   10   12
> [6,]    7   14
> Warning message:
> In myvec + mymat :
>   longer object length is not a multiple of shorter object length
>>
>
> Regarding
>
> -The arguments in the call to the generic are rematched with the
> -arguments for the method using the standard argument matching mechanism.
> -The first argument, i.e.@: the object, will have been evaluated.
> -
>
> this information is duplicated. See a few paragraphs up "When the
> method is invoked it is called..."
>
> Scott
>
>
> --
> Scott Kostyshak
> Economics PhD Candidate
> Princeton University

The patch still applies cleanly (one offset) on r65090.

Best,

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [PATCH] suggestions for R-lang manual

2014-03-04 Thread Scott Kostyshak

On Mon, Mar 3, 2014 at 7:48 AM, Martin Maechler
 wrote:
>>>>>> Scott Kostyshak 
>>>>>> on Thu, 27 Feb 2014 16:43:02 -0500 writes:
>
>     > On Thu, Nov 21, 2013 at 1:17 AM, Scott Kostyshak 
>  wrote:
> >> Attached is a patch with suggestions for the R-lang manual at r64277.
> >>
> >> Below are a few comments (some are implemented in the patch):
> >>
> >> In the section "Objects", there is a table introduced by "The
> >> following table describes the possible values returned by typeof". One
> >> of the results is "any". Can "any" be returned by "typeof()" ?
>
> ANYSXP  is a valid internal type on the C level, and
> src/main/util.c  will make  typeof(ob) return "any"
> if you can get your hands at an R level object of that type.
> I'd guess you can only get it currently by using .Call() and
> using your own C code, .. but at least that way it must be possible.

Interesting to know.

> >> Regarding the "Recycling rules" section,
> >>
> >> -One exception is that when adding vectors to matrices, a warning is not
> >> -given if the lengths are incompatible.
> >> -@c Is that a bug?
> >> -
> >>
> >> was this a bug that was fixed?
>
> I did not investigate in details, but yes, I vaguely remember we
> had fixed that.  So indeed, it's fine you omitted the para in
> your patch.
>
> >> I see the following behavior:
> >>
> >>> myvec <- 1:3
> >>> mymat <- matrix(1:12, ncol=2)
> >>> myvec <- 1:5
> >>> myvec + mymat
> >>      [,1] [,2]
> >> [1,]    2    9
> >> [2,]    4   11
> >> [3,]    6   13
> >> [4,]    8   15
> >> [5,]   10   12
> >> [6,]    7   14
> >> Warning message:
> >> In myvec + mymat :
> >> longer object length is not a multiple of shorter object length
> >>>
> >>
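
For anyone who wants to check the recycling behavior on their own build, here is a minimal sketch (the variable names msg and res are mine, not from the manual). It captures the warning rather than printing it, and then confirms that the sum is still computed with myvec recycled down the columns:

```r
myvec <- 1:5
mymat <- matrix(1:12, ncol = 2)

# Capture the warning message instead of letting it print.
msg <- tryCatch({
  myvec + mymat
  NA_character_          # reached only if no warning was signalled
}, warning = function(w) conditionMessage(w))

# The sum itself is still computed; myvec is recycled in column-major order.
res <- suppressWarnings(myvec + mymat)
print(msg)
print(res)
```

If msg comes back as NA, no warning was signalled, which would correspond to the behavior the removed paragraph described.
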
> >> Regarding
> >>
> >> -The arguments in the call to the generic are rematched with the
> >> -arguments for the method using the standard argument matching mechanism.
> >> -The first argument, i.e.@: the object, will have been evaluated.
> >> -
> >>
> >> this information is duplicated. See a few paragraphs up "When the
> >> method is invoked it is called..."
>
> >> Scott
>
> Thank you, Scott.
> Indeed, I've finally carefully looked at the patch, and applied
> it - for R-devel, to become R 3.1.0 in April.

Thanks, Martin!

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

> Martin
>
>
> >> --
> >> Scott Kostyshak
> >> Economics PhD Candidate
> >> Princeton University
>
> > The patch still applies cleanly (one offset) on r65090.
>
> > Best,
> > Scott

[Rd] duplication regression (?)

2014-04-14 Thread Scott Kostyshak

Below is an example of output that changed as a result of r64970. I
did not see any NEWS item suggesting this change is expected.

Note that the example is contrived and I don't have a use case for it.
I stumbled across it when playing with recent changes in R relating to
duplication. Does the example use undefined syntax?

-
fn1 <- function(mylist) {
fn1a <- function() mylist[[c(1,1)]][[1]] <<- 9
fn1a()
return(NULL)
}

fn2 <- function(myarg) fn1(myarg)

test_list <- list(list(list(1)))
print(test_list[[c(1,1,1)]])
fn2(test_list)
print(test_list[[c(1,1,1)]])
-

Before r64970 the output is
[1] 1
[1] 1

After r64970 the output is
[1] 1
[1] 9
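
A small sketch of how the same example could be turned into a self-checking test. The name `expected` is mine; it is built separately so that it shares no memory with test_list, since comparing against a plain copy of test_list would be meaningless if the object were modified in place. Under R's usual copy semantics I would expect the check to print TRUE:

```r
fn1 <- function(mylist) {
  fn1a <- function() mylist[[c(1, 1)]][[1]] <<- 9
  fn1a()
  NULL
}
fn2 <- function(myarg) fn1(myarg)

test_list <- list(list(list(1)))
expected  <- list(list(list(1)))   # independent object, not a copy of test_list

fn2(test_list)

# TRUE means the caller's object was left untouched by the inner `<<-`.
unchanged <- identical(test_list, expected)
print(unchanged)
```
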

> sessionInfo()
R Under development (unstable) (2014-04-10 r65396)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
>

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

Re: [Rd] "Name partially matched in data frame"

2014-04-30 Thread Scott Kostyshak

On Wed, Apr 30, 2014 at 3:33 PM, Scott Kostyshak  wrote:
> Hi Dennis,
>
> On Wed, Apr 30, 2014 at 3:03 PM, Fisher Dennis  wrote:
>> R 3.1.0
>> OS X
>>
>> Colleagues,
>>
>> I recently updated to 3.1.0 and I have encountered
>> Warning messages: ...  Name partially matched in data frame
>> when I do something like:
>> DATAFRAME$colname
>> where colname is actually something longer than that (but unambiguous).
>>
>> I have much appreciated the partial matching capabilities because it fits 
>> with my workflow.  I often receive updated data months after the initial 
>> code is written.  In order to keep track of what I did in the past, I 
>> provide lengthy (unambiguous) names for columns, then abbreviate the names 
>> as I call them.  This behavior has been termed “lazy” in various 
>> correspondence on this mailing list but it works for me and probably works 
>> for others.
>
> Why not store that information elsewhere? e.g. in an attribute?
>
>> I realize that the new message is only a warning but it is a minor nuisance. 
>>  Would it be possible to add an
>> option(partialMatch=TRUE)   ## default is FALSE
>> or something similar to suppress that behavior?  That should keep both camps 
>> happy.
>
> There is currently no option to control that behavior and (although I
> do understand your use case) I personally hope one is not implemented.
> The reason is that you might put that option in your .Rprofile and
> when you share your code with me I get errors that columns aren't
> found.

Let me change this to "I would get warnings, which would make me worried."

> You can of course redefine the `$`:
>
>> dataf <- data.frame(longColumn = 5)
>> dataf$long
> [1] 5
> Warning message:
> In `$.data.frame`(dataf, long) : Name partially matched in data frame
>>
>> `$.data.frame` <-
> + function (x, name)
> + {
> + a <- x[[name]]
> + if (!is.null(a))
> + return(a)
> + a <- x[[name, exact = FALSE]]
> + return(a)
> + }
>>
>> dataf$long
> [1] 5
>>
>
> I hope you don't do that though.
>
> Another option is to use the more verbose dataf[["long", exact = FALSE]].
>
> Scott
>
>
> --
> Scott Kostyshak
> Economics PhD Candidate
> Princeton University

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

Re: [Rd] "Name partially matched in data frame"

2014-04-30 Thread Scott Kostyshak

Hi Dennis,

On Wed, Apr 30, 2014 at 3:03 PM, Fisher Dennis  wrote:
> R 3.1.0
> OS X
>
> Colleagues,
>
> I recently updated to 3.1.0 and I have encountered
> Warning messages: ...  Name partially matched in data frame
> when I do something like:
> DATAFRAME$colname
> where colname is actually something longer than that (but unambiguous).
>
> I have much appreciated the partial matching capabilities because it fits 
> with my workflow.  I often receive updated data months after the initial code 
> is written.  In order to keep track of what I did in the past, I provide 
> lengthy (unambiguous) names for columns, then abbreviate the names as I call 
> them.  This behavior has been termed “lazy” in various correspondence on this 
> mailing list but it works for me and probably works for others.

Why not store that information elsewhere? e.g. in an attribute?
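
To make the attribute idea concrete, here is a rough sketch. The column name, the attribute name "description", and the label text are all made up for illustration:

```r
# Keep the column name short and put the long description in an attribute.
dataf <- data.frame(dose = c(10, 20))
attr(dataf$dose, "description") <-
  "dose in mg at the baseline visit (hypothetical label)"

# The short name is used in code; the description can be looked up later.
desc <- attr(dataf$dose, "description")
print(desc)
```

One caveat: attributes on a column are not guaranteed to survive every subsetting or transformation step, so this works best for documentation that is looked up on the original data frame.
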

> I realize that the new message is only a warning but it is a minor nuisance.  
> Would it be possible to add an
> option(partialMatch=TRUE)   ## default is FALSE
> or something similar to suppress that behavior?  That should keep both camps 
> happy.

There is currently no option to control that behavior and (although I
do understand your use case) I personally hope one is not implemented.
The reason is that you might put that option in your .Rprofile and
when you share your code with me I get errors that columns aren't
found.

You can of course redefine the `$`:

> dataf <- data.frame(longColumn = 5)
> dataf$long
[1] 5
Warning message:
In `$.data.frame`(dataf, long) : Name partially matched in data frame
>
> `$.data.frame` <-
+ function (x, name)
+ {
+ a <- x[[name]]
+ if (!is.null(a))
+ return(a)
+ a <- x[[name, exact = FALSE]]
+ return(a)
+ }
>
> dataf$long
[1] 5
>

I hope you don't do that though.

Another option is to use the more verbose dataf[["long", exact = FALSE]].
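
A short sketch of the difference between the two forms of `[[` (the names exact_hit and partial_hit are mine):

```r
dataf <- data.frame(longColumn = 5)

# Default exact matching: an abbreviated name finds nothing.
exact_hit <- dataf[["long"]]

# Partial matching has to be requested explicitly.
partial_hit <- dataf[["long", exact = FALSE]]

print(is.null(exact_hit))
print(partial_hit)
```

The advantage over redefining `$` is that the partial match is visible at the call site, so whoever reads the code knows an abbreviation is being used.
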

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

[Rd] [patch] Add support for editor function in edit.default

2014-05-20 Thread Scott Kostyshak

Regarding the following extract of ?options:
 ‘editor’: a non-empty string, or a function that is called with a
  file path as argument.

edit.default currently calls the function with three arguments: name,
file, and title. For example, running the following

vimCmd <- 'vim -c "set ft=r"'
vimEdit <- function(file_) system(paste(vimCmd, file_))
options(editor = vimEdit)
myls <- edit(ls)

gives "Error in editor(name, file, title) : unused arguments (file, title)".

The attached patch changes edit.default to call the editor function
with just the file path. There is at least one inconsistent behavior
that this patch causes in its current form. It does not obey the
following (from ?edit):
 Calling ‘edit()’, with no arguments, will result in the temporary
file being reopened for further editing.

I see two ways to address this: (1) add a getEdFile() function to
utils/edit.R that calls a function getEd() defined in edit.c that
returns DefaultFileName; or (2) this patch could be rewritten in C in
a new function in edit.c.

Is there any interest in this patch?
If not, would there be interest in an update of the docs, either
?options (stating the possibility that if 'editor' is a function, it
might be called with 'name', 'file', and 'title' arguments) or ?edit
 ?

Scott


> sessionInfo()
R Under development (unstable) (2014-05-20 r65677)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


--
Scott Kostyshak
Economics PhD Candidate
Princeton University
Index: src/library/utils/R/edit.R
===
--- src/library/utils/R/edit.R	(revision 65677)
+++ src/library/utils/R/edit.R	(working copy)
@@ -53,7 +53,13 @@
   editor = getOption("editor"), ...)
 {
 if (is.null(title)) title <- deparse(substitute(name))
-if (is.function(editor)) invisible(editor(name, file, title))
+if (is.function(editor)) {
+if (file == "") file <- tempfile()
+objDep <- if (is.null(name)) "" else deparse(name)
+writeLines(objDep, con = file)
+editor(file)
+eval(parse(file))
+}
 else .External2(C_edit, name, file, title, editor)
 }
 

[Rd] Encourage exit with nonzero error status in ?last.dump

2014-06-12 Thread Scott Kostyshak

The following example in ?dump.frames

options(error = quote({dump.frames(to.file = TRUE); q()}))

is useful for teaching the user how to save a frame dump when R
encounters an error during non-interactive sessions. However, this command
has a side effect: on encountering an error, R exits with status 0.
Although it's just an example, it's an important one because it's
referenced in the 'Details' section of the help file. I think it would be
better to encourage exiting with a nonzero status:

options(error = quote({dump.frames(to.file = TRUE); q(status = 1)}))
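
To see the difference in exit status without writing a dump file, here is a rough sketch. It assumes an Rscript binary sits next to the running R, leaves out dump.frames() on purpose, and uses a helper name (exit_status) that is mine:

```r
# Sketch: compare the exit status of a failing non-interactive R session
# under the two error handlers. dump.frames() is omitted so that no dump
# file is written to the working directory.
rscript <- file.path(R.home("bin"), "Rscript")

exit_status <- function(handler) {
  code <- sprintf("options(error = quote(%s)); stop(\"boom\")", handler)
  # With stdout/stderr discarded, system2() returns the exit status.
  system2(rscript, c("-e", shQuote(code)), stdout = FALSE, stderr = FALSE)
}

status_plain   <- exit_status("q()")
status_nonzero <- exit_status("q(status = 1)")
print(c(plain = status_plain, nonzero = status_nonzero))
```

A calling shell script or Makefile can only detect the failure in the second case, which is the motivation for the suggested change.
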

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

Re: [Rd] Encourage exit with nonzero error status in ?last.dump

2014-06-13 Thread Scott Kostyshak

On Fri, Jun 13, 2014 at 5:32 AM, Martin Maechler
 wrote:
>>>>>> Scott Kostyshak 
>>>>>> on Fri, 13 Jun 2014 02:04:36 -0400 writes:
>
> > The following example in ?dump.frames options(error =
> > quote({dump.frames(to.file = TRUE); q()}))
>
> > is useful for teaching the user how to save a frame dump
> > when R encounters an error during non-interactive
> > sessions. This command however causes an additional change
> > that on encountering an error R exits with a 0 error
> > status. Although it's just an example, it's an important
> > one as it's referenced in the 'Details' section of the
> > help file. I think it would be better to encourage exiting
> > with a nonzero error status:
>
> > options(error = quote({dump.frames(to.file = TRUE); q(status = 1)}))
>
> You are right.
> Thank you for the suggestion: it will be in next
> release.
>
> Martin Maechler,
> ETH Zurich

Thanks, Martin.


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

[Rd] r65998 build error. share/Rd/macros/*: No such file or directory

2014-06-22 Thread Scott Kostyshak

As of r65998 I'm getting
/usr/bin/install: cannot stat
‘/home/scott/rbuilds/r-devel/repo/share/Rd/macros/*’: No such file or
directory

Commenting out the newly added

@for f in $(srcdir)/Rd/macros/*; do \
  $(INSTALL_DATA) $${f} "$(DESTDIR)$(rsharedir)/Rd/macros"; \
done

in share/Makefile.in
fixes compilation for me.

I'm on Ubuntu 13.10. My configure output is here:
https://www.dropbox.com/s/srwa1mbzesvvq5v/configure
my make output is here:
https://www.dropbox.com/s/q7ylkw00re7riaf/make
and my config.log is here:
https://www.dropbox.com/s/0w09zhds9q6253n/config.log

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

Re: [Rd] r65998 build error. share/Rd/macros/*: No such file or directory

2014-06-22 Thread Scott Kostyshak

On Sun, Jun 22, 2014 at 11:16 AM, Duncan Murdoch
 wrote:
> On 22/06/2014, 5:07 PM, Scott Kostyshak wrote:
>> As of r65998 I'm getting
>> /usr/bin/install: cannot stat
>> ‘/home/scott/rbuilds/r-devel/repo/share/Rd/macros/*’: No such file or
>> directory
>>
>> Commenting out the newly added
>>
>> @for f in $(srcdir)/Rd/macros/*; do \
>>   $(INSTALL_DATA) $${f} "$(DESTDIR)$(rsharedir)/Rd/macros"; \
>> done
>>
>> in share/Makefile.in
>> fixes compilation for me.
>>
>> I'm on Ubuntu 13.10. My configure output is here:
>> https://www.dropbox.com/s/srwa1mbzesvvq5v/configure
>> my make output is here:
>> https://www.dropbox.com/s/q7ylkw00re7riaf/make
>> and my config.log is here:
>> https://www.dropbox.com/s/0w09zhds9q6253n/config.log
>
> Just a missed commit, now fixed.

Thanks Duncan,

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

[Rd] [patch] Fix n arg in mclapply call to ngettext

2014-06-29 Thread Scott Kostyshak

Regarding the following code,

warning(sprintf(ngettext(has.errors,
                         "scheduled core %s encountered error in user code, all values of the job will be affected",
                         "scheduled cores %s encountered errors in user code, all values of the jobs will be affected"),
                paste(has.errors, collapse = ", ")),
        domain = NA)

has.errors is a vector whose elements are the cores that have encountered
errors. The plural message thus appears if the first element of has.errors is
greater than one and is singular otherwise. What we want is for the plural
message to be given if more than one core encountered errors. Changing the n
arg of ngettext from has.errors to length(has.errors) leads to the correct
messages.
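
A stand-alone sketch of the intended behavior, outside of mclapply. The helper name describe_cores and the shortened message strings are mine; the point is only that the count passed to ngettext should be the number of failing cores, not their ids:

```r
# Hypothetical helper mirroring the patched call: pluralize on the number
# of failing cores, and list their ids in the message.
describe_cores <- function(has.errors) {
  sprintf(ngettext(length(has.errors),
                   "scheduled core %s encountered error",
                   "scheduled cores %s encountered errors"),
          paste(has.errors, collapse = ", "))
}

one <- describe_cores(4L)          # a single failing core with id 4
two <- describe_cores(c(1L, 2L))   # two failing cores
print(one)
print(two)
```

With the unpatched form, the first call would be pluralized on the value 4 rather than on the count 1, which is the incorrect message shown in the second example below.
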

Attached is a patch.

More details for completeness:

I've reproduced this on 3.1.0 and r66050.

Below is an example that leads to bad output sometimes (depending on
the order in which the cores finish).
library(parallel)
options(mc.cores = 4)
abc <- mclapply(2:5, FUN = function(x) stopifnot(x >= 4))
# Warning message:
# In mclapply(2:5, FUN = function(x) { :
#   scheduled core 1, 2 encountered error in user code, all values of the job will be affected

# If the only core with an error has a number greater than 1, an
# incorrect message is shown:
library(parallel)
options(mc.cores = 4)
abc <- mclapply(2:5, FUN = function(x) stopifnot(x <= 4))
# Warning message:
# In mclapply(2:5, FUN = function(x) { :
#   scheduled cores 4 encountered errors in user code, all values of the jobs will be affected

> sessionInfo()
R Under development (unstable) (2014-06-29 r66050)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University
Index: src/library/parallel/R/unix/mclapply.R
===
--- src/library/parallel/R/unix/mclapply.R  (revision 66050)
+++ src/library/parallel/R/unix/mclapply.R  (working copy)
@@ -172,7 +172,7 @@
 if (length(has.errors) == cores)
 warning("all scheduled cores encountered errors in user code")
 else
-warning(sprintf(ngettext(has.errors,
+warning(sprintf(ngettext(length(has.errors),
  "scheduled core %s encountered error in user code, all values of the job will be affected",
  "scheduled cores %s encountered errors in user code, all values of the jobs will be affected"),
 paste(has.errors, collapse = ", ")),

[Rd] [patch] Rscript off-by-one error in output

2014-07-09 Thread Scott Kostyshak

Rscript eats up the last argument when reporting the command it runs:

$ Rscript --verbose "/tmp/test.R" one two three
running
  '/usr/local/lib/R-devel/lib/R/bin/R --slave --no-restore
--file=/tmp/test.R --args one two'

With the patch below, I get the following:

$ Rscript --verbose "/tmp/test.R" one two three
running
  '/usr/local/lib/R-devel/lib/R/bin/R --slave --no-restore
--file=/tmp/test.R --args one two three'


Index: src/unix/Rscript.c
===
--- src/unix/Rscript.c  (revision 66100)
+++ src/unix/Rscript.c  (working copy)
@@ -249,7 +249,7 @@
 #endif
 if(verbose) {
  fprintf(stderr, "running\n  '%s", cmd);
- for(i = 1; i < ac-1; i++) fprintf(stderr, " %s", av[i]);
+ for(i = 1; i < ac; i++) fprintf(stderr, " %s", av[i]);
  fprintf(stderr, "'\n\n");
 }
 #ifndef _WIN32


Scott


> sessionInfo()
R Under development (unstable) (2014-07-08 r66100)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

Re: [Rd] [patch] Add support for editor function in edit.default

2014-08-23 Thread Scott Kostyshak

On Tue, May 20, 2014 at 5:55 AM, Scott Kostyshak  wrote:
> Regarding the following extract of ?options:
>  ‘editor’: a non-empty string, or a function that is called with a
>   file path as argument.
>
> edit.default currently calls the function with three arguments: name,
> file, and title. For example, running the following

To be clear with what I view as problematic, note in the above that
the documentation says the function is called with a file path as an
argument, suggesting one argument; but in practice it is called with
three arguments.

> vimCmd <- 'vim -c "set ft=r"'
> vimEdit <- function(file_) system(paste(vimCmd, file_))
> options(editor = vimEdit)
> myls <- edit(ls)
>
> gives "Error in editor(name, file, title) : unused arguments (file, title)".
>
> The attached patch changes edit.default to call the editor function
> with just the file path. There is at least one inconsistent behavior
> that this patch causes in its current form. It does not obey the
> following (from ?edit):
>  Calling ‘edit()’, with no arguments, will result in the temporary
> file being reopened for further editing.
>
> I see two ways to address this: (1) add a getEdFile() function to
> utils/edit.R that calls a function getEd() defined in edit.c that
> returns DefaultFileName; or (2) this patch could be rewritten in C in
> a new function in edit.c.
>
> Is there any interest in this patch?
> If not, would there be interest in an update of the docs, either
> ?options (stating the possibility that if 'editor' is a function, it
> might be called with 'name', 'file', and 'title' arguments) or ?edit
>  ?

Any interest in this patch? If not, would a patch for the
documentation be considered?

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

Re: [Rd] Looking for new maintainer of orphans R2HTML SemiPar cghseg hexbin lgtdl monreg muhaz operators pamr

2014-09-07 Thread Scott Kostyshak

On Sun, Sep 7, 2014 at 7:03 PM, Uwe Ligges
 wrote:
>
>
> On 08.09.2014 01:01, Gregory R. Warnes wrote:
>>
>> And I’ll pick up hexbin.
>
>
> Err, that one has been adopted a month ago already.
>
> open are:
>
> SemiPar cghseg monreg

I will take monreg. Coincidentally my recent research is related.

Best,

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

>
>
> Best,
> Uwe Ligges
>
>
>
>>
>> -Greg
>>
>> On Sep 7, 2014, at 12:17 PM, Romain Francois 
>> wrote:
>>
>>> I'll pick up operators.
>>>
>>> Le 7 sept. 2014 à 18:03, Uwe Ligges  a
>>> écrit :
>>>
>>>>
>>>>
>>>> On 05.09.2014 20:25, Greg Snow wrote:
>>>>>
>>>>> Uwe,
>>>>>
>>>>> Have all of these packages found new maintainers? If not, which ones
>>>>> are still looking to be adopted?
>>>>
>>>>
>>>> Thanks for asking, the ones still looking to be adopted are:
>>>> SemiPar cghseg monreg operators
>>>>
>>>> Best,
>>>> Uwe Ligges
>>>>
>>>>
>>>>>
>>>>> thanks,
>>>>>
>>>>> On Fri, Aug 8, 2014 at 10:41 AM, Uwe Ligges 
>>>>> wrote:
>>>>>>
>>>>>> Dear maintainers and R-devel,
>>>>>>
>>>>>> Several orphaned CRAN packages are about to be archived due to
>>>>>> outstanding
>>>>>> QC problems, but have CRAN and BioC packages depending on them which
>>>>>> would
>>>>>> be broken by the archival (and hence need archiving alongside).
>>>>>> Therefore we are looking for new maintainers taking over
>>>>>> maintainership for
>>>>>> one or more of the following packages:
>>>>>>
>>>>>> R2HTML SemiPar cghseg hexbin lgtdl monreg muhaz operators pamr
>>>>>>
>>>>>> Package maintainers whose packages depend on one of these may be
>>>>>> natural
>>>>>> candidates to become new maintainers.
>>>>>> Hence this message is addressed to all these maintainers via BCC and
>>>>>> to
>>>>>> R-devel.
>>>>>>
>>>>>> See
>>>>>>
>>>>>>   <http://CRAN.R-project.org/package=R2HTML>
>>>>>>   <http://CRAN.R-project.org/package=SemiPar>
>>>>>>   <http://CRAN.R-project.org/package=cghseg>
>>>>>>   <http://CRAN.R-project.org/package=hexbin>
>>>>>>   <http://CRAN.R-project.org/package=lgtdl>
>>>>>>   <http://CRAN.R-project.org/package=monreg>
>>>>>>   <http://CRAN.R-project.org/package=muhaz>
>>>>>>   <http://CRAN.R-project.org/package=operators>
>>>>>>   <http://CRAN.R-project.org/package=pamr>
>>>>>>
>>>>>> for information on the QC issues and the reverse dependencies.
>>>>>>
>>>>>> Best wishes,
>>>>>> Uwe Ligges
>>>>>> (for the CRAN team)
>>>>>>
>>>>>> __
>>>>>> R-devel@r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>> __
>>>> R-devel@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>>
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] [patch] Add support for editor function in edit.default

2014-09-09 Thread Scott Kostyshak

On Tue, Sep 9, 2014 at 2:24 AM, Deepayan Sarkar
 wrote:
> On Sun, Aug 24, 2014 at 9:14 AM, Scott Kostyshak  
> wrote:
>> On Tue, May 20, 2014 at 5:55 AM, Scott Kostyshak  
>> wrote:
>>> Regarding the following extract of ?options:
>>>  ‘editor’: a non-empty string, or a function that is called with a
>>>   file path as argument.
>>>
>>> edit.default currently calls the function with three arguments: name,
>>> file, and title. For example, running the following
>>
>> To be clear with what I view as problematic, note in the above that
>> the documentation says the function is called with a file path as an
>> argument, suggesting one argument; but in practice it is called with
>> three arguments.
>>
>>> vimCmd <- 'vim -c "set ft=r"'
>>> vimEdit <- function(file_) system(paste(vimCmd, file_))
>>> options(editor = vimEdit)
>>> myls <- edit(ls)
>>>
>>> gives "Error in editor(name, file, title) : unused arguments (file, title)".
>>>
>>> The attached patch changes edit.default to call the editor function
>>> with just the file path. There is at least one inconsistent behavior
>>> that this patch causes in its current form. It does not obey the
>>> following (from ?edit):
>>>  Calling ‘edit()’, with no arguments, will result in the temporary
>>> file being reopened for further editing.
>>>
>>> I see two ways to address this: (1) add a getEdFile() function to
>>> utils/edit.R that calls a function getEd() defined in edit.c that
>>> returns DefaultFileName; or (2) this patch could be rewritten in C in
>>> a new function in edit.c.
>>>
>>> Is there any interest in this patch?
>>> If not, would there be interest in an update of the docs, either
>>> ?options (stating the possibility that if 'editor' is a function, it
>>> might be called with 'name', 'file', and 'title' arguments) or ?edit
>>>  ?
>>
>> Any interest in this patch? If not, would a patch for the
>> documentation be considered?
>
> Given that edit() itself is called with the three arguments, it seems
> more general to pass them to the editor function, and I don't see the
> need for a special case. You can always write your function as
>
> vimEdit <- function(file_, ...) system(paste(vimCmd, file_))

Indeed, makes sense.
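
For the archives, a sketch of an editor function that accepts everything edit() hands over. recordingEditor is a made-up stand-in that opens no editor at all; it only records what it was called with, so it can be run non-interactively. I am assuming the argument names name, file, and title, as in the error message quoted above:

```r
# Hypothetical stand-in for a real editor: it only reports what edit()
# passed along, instead of opening anything.
recordingEditor <- function(name, file, title) {
  list(got_function = is.function(name), file = file, title = title)
}

res <- utils::edit(ls, editor = recordingEditor)
str(res)
```

A real editor function would write the object to a file, launch the editor on that file, and read the result back, which is roughly what my patch did inside edit.default.
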

> I will clarify the documentation.

Great. Thanks a lot for taking the time to understand the issue.

Best,

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

Re: [Rd] last user argument missing from Rscript --verbose

2014-09-19 Thread Scott Kostyshak

On Fri, Sep 19, 2014 at 8:12 AM, Martin Maechler
 wrote:
>>>>>> Harris A Jaffee 
>>>>>> on Thu, 18 Sep 2014 19:32:29 +0200 writes:
>
> (using  HTML, please don't )
>
>> The loop that echoes the arguments almost always stops too soon.  It
>> apparently does that to avoid
>> echoing the "--args" (that had been inserted) when there are no user
>> arguments.  However, when there
>> are user arguments, the next element of the 'av' array is the last
>> argument and usually not "--args",
>> although it can be.
>> ?Rscript is a little sketchy:
>>  `--verbose' gives details of what `Rscript' is doing.  Also passed
>>   on to R.
>> What is passed to R is correct, but the diagnostic is not:
>>  $ Rscript --verbose /dev/null 1 2
>>   running
>>   '/path_to_R --slave --no-restore --file=/dev/null --args 1'
>> Fixed (only tested on Mac):
>>  $ Rscript --verbose /dev/null 1 2
>>   running
>>   '/Library/Frameworks/R.framework/Versions/3.1/Resources/bin/R --slave
>> --no-restore --file=/dev/null --args 1 2'
>
> You are right about the problem, also reproducible on Linux.
> You mention a 'fix'.
> It looks to me that is just
>
> -   for(i = 1; i < ac-1; i++) fprintf(stderr, " %s", av[i]);
> +   for(i = 1; i < ac; i++) fprintf(stderr, " %s", av[i]);
>
> in unix/Rscript.c, right ?

Yes, I suggested the same patch here:
http://r.789695.n4.nabble.com/patch-Rscript-off-by-one-error-in-output-td4693780.html

Scott

> BTW: If one uses -e 'commandArgs()' instead of /dev/null, one
> sees that Rscript's "lying" about the last argument is not
> helpful anyway :
>
>   Rscript --verbose -e 'commandArgs()'
>
>   running
> '/usr/local64.sfs/app/R/R-3.1.1-inst/bin/R --slave --no-restore -e 
> commandArgs()'
>
>   [1] "/usr/local64.sfs/app/R/R-3.1.1-inst/bin/exec/R"
>   [2] "--slave"
>   [3] "--no-restore"
>   [4] "-e"
>   [5] "commandArgs()"
>   [6] "--args"
>
> because the '--args' appears anyway and indeed *is* passed to 'R'...
>
> A better fix would rather suppress that; but I will commit the
> above change.


--
Scott Kostyshak
Economics PhD Candidate
Princeton University

Re: [Rd] [patch] Rscript off-by-one error in output

2014-09-20 Thread Scott Kostyshak

On Wed, Jul 9, 2014 at 7:26 PM, Scott Kostyshak  wrote:
> Rscript eats up the last argument when reporting the command it runs:
>
> $ Rscript --verbose "/tmp/test.R" one two three
> running
>   '/usr/local/lib/R-devel/lib/R/bin/R --slave --no-restore
> --file=/tmp/test.R --args one two'
>
> With the patch below, I get the following:
>
> $ Rscript --verbose "/tmp/test.R" one two three
> running
>   '/usr/local/lib/R-devel/lib/R/bin/R --slave --no-restore
> --file=/tmp/test.R --args one two three'
>
>
> Index: src/unix/Rscript.c
> ===
> --- src/unix/Rscript.c  (revision 66100)
> +++ src/unix/Rscript.c  (working copy)
> @@ -249,7 +249,7 @@
>  #endif
>  if(verbose) {
>   fprintf(stderr, "running\n  '%s", cmd);
> - for(i = 1; i < ac-1; i++) fprintf(stderr, " %s", av[i]);
> + for(i = 1; i < ac; i++) fprintf(stderr, " %s", av[i]);
>   fprintf(stderr, "'\n\n");
>  }
>  #ifndef _WIN32
>
>
> Scott
>
>
>> sessionInfo()
> R Under development (unstable) (2014-07-08 r66100)
> Platform: x86_64-unknown-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
>  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
>  [9] LC_ADDRESS=C               LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
>
> --
> Scott Kostyshak
> Economics PhD Candidate
> Princeton University

For archival purposes, this was fixed at r66644.
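
To illustrate the off-by-one in the patch quoted above, here is a minimal,
self-contained C sketch. It is not the code from Rscript.c: format_command()
and its parameters are hypothetical names chosen for this example, and it
writes into a buffer instead of stderr so that the result is easy to inspect.
Passing ac - 1 as the bound mimics the old loop, which drops the last
argument; passing ac mimics the patched loop.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not part of Rscript.c): writes cmd followed by
 * av[1] .. av[bound - 1], separated by single spaces, into buf.
 * Output is silently truncated if buf is too small. */
void format_command(char *buf, size_t size, const char *cmd,
                    int bound, char *const av[])
{
    int n = snprintf(buf, size, "%s", cmd);
    size_t used = (n < 0) ? size : (size_t) n;

    /* bound == ac - 1 skips the last argument (the reported bug);
     * bound == ac includes every argument (the patched behaviour). */
    for (int i = 1; i < bound && used < size; i++) {
        n = snprintf(buf + used, size - used, " %s", av[i]);
        if (n < 0)
            break;
        used += (size_t) n;
    }
}
```

With av = {"R", "--slave", "--args", "one", "two", "three"} and ac = 6, a
bound of ac - 1 yields "R --slave --args one two", while a bound of ac
yields "R --slave --args one two three".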

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


[Rd] Turn warnings or notes into errors on CMD check ?

2014-10-11 Thread Scott Kostyshak

Hi,

I am using a local patch to have CMD check exit with error if there is
a note or warning. Am I missing an already existing way to do this?

If not, is there any interest in having an option or environment
variable for this upstream? I would be interested in making a patch.
If so, should it be an option or an environment variable? Any
suggestions for the name? Should this be two options, or one option
where "1" turns only warnings into errors and "2" turns both warnings
and notes into errors?
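
As a rough sketch of the single-option variant, the mapping from a
strictness level to an exit status could look like the C fragment below.
Everything in it is an assumption made for illustration: the enum, the
function name check_exit_status(), and the idea of passing counts of errors,
warnings and notes are hypothetical and do not correspond to anything in R's
actual check code.

```c
/* Hypothetical strictness levels; neither the names nor the numbering
 * come from R itself. */
enum strictness {
    STRICT_OFF = 0,                /* only errors fail the check          */
    STRICT_WARNINGS = 1,           /* warnings are treated as errors      */
    STRICT_WARNINGS_AND_NOTES = 2  /* warnings and notes are both errors  */
};

/* Returns 1 if the check should exit with an error status, 0 otherwise. */
int check_exit_status(int level, int n_errors, int n_warnings, int n_notes)
{
    if (n_errors > 0)
        return 1;
    if (level >= STRICT_WARNINGS && n_warnings > 0)
        return 1;
    if (level >= STRICT_WARNINGS_AND_NOTES && n_notes > 0)
        return 1;
    return 0;
}
```

One ordered level keeps the two behaviours nested (level 2 implies level 1),
whereas two independent options would also allow the combination "notes are
errors but warnings are not".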

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


Re: [Rd] Cursor not behaving properly

2014-11-18 Thread Scott Kostyshak

On Mon, Nov 10, 2014 at 10:52 AM, Kaiyin Zhong (Victor Chung)
 wrote:
> I found a strange bug in R recently (version 3.1.2):
>
> As you can see from the screenshots attached, when the cursor passes the
> right edge of the console, instead of starting on a new line, it goes back
> to the beginning of the same line and overwrites everything after it.
>
> This happens every time the size of the terminal is changed, for example,
> if you fit the terminal to the right half of the screen, start an R
> session, exec some commands, maximize the terminal, and type a long command
> into the session, then you will find the bug reproduced.
>
> I am on Ubuntu 14.04, and I have tested this in konsole, guake and
> gnome-terminal.

I can reproduce this, also on Ubuntu 14.04, with gnome-terminal and
xterm. If you don't get any response here, please file a bug report at
bugs.r-project.org.

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


Re: [Rd] Cursor not behaving properly

2014-11-19 Thread Scott Kostyshak

On Tue, Nov 18, 2014 at 9:50 PM, Scott Kostyshak wrote:
> On Mon, Nov 10, 2014 at 10:52 AM, Kaiyin Zhong (Victor Chung)
>  wrote:
>> I found a strange bug in R recently (version 3.1.2):
>>
>> As you can see from the screenshots attached, when the cursor passes the
>> right edge of the console, instead of starting on a new line, it goes back
>> to the beginning of the same line and overwrites everything after it.
>>
>> This happens every time the size of the terminal is changed, for example,
>> if you fit the terminal to the right half of the screen, start an R
>> session, exec some commands, maximize the terminal, and type a long command
>> into the session, then you will find the bug reproduced.
>>
>> I am on Ubuntu 14.04, and I have tested this in konsole, guake and
>> gnome-terminal.
>
> I can reproduce this, also on Ubuntu 14.04, with gnome-terminal and
> xterm. If you don't get any response here, please file a bug report at
> bugs.r-project.org.

For archival purposes, the OP reported the bug here:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=16077

Scott


--
Scott Kostyshak
Economics PhD Candidate
Princeton University


[Rd] two typos in NEWS.Rd

2013-02-23 Thread Scott Kostyshak

Regarding:
"\pkg{parallle} (as in e.g. \code{mclapply()}."

Two typos:
"parallle" -> "parallel"
"\code{mclapply()}." -> "\code{mclapply()})"

Patch is attached.

Scott
diff --git a/doc/NEWS.Rd b/doc/NEWS.Rd
index c642432..012fd8f 100644
--- a/doc/NEWS.Rd
+++ b/doc/NEWS.Rd
@@ -928,7 +928,7 @@
   unloading.
 
   The Tcl/Tk event loop is inhibited in a forked child from package
-  \pkg{parallle} (as in e.g. \code{mclapply()}.
+  \pkg{parallel} (as in e.g. \code{mclapply()}).
 
   \item \code{parallel::makeCluster()} recognizes the value
   \samp{random} for the environment variable \env{R_PARALLEL_PORT}: