[Rd] Should Position() use match.fun()?

2021-09-08 Thread Steve Martin
Hello,

All of the funprog functions except Position() use match.fun() early
in the body of the function. (Filter() seems to rely on lapply() for
this, but the effect is the same.) In most cases this isn't a problem,
but I can't see why Position() shouldn't look something like

Position2 <- function(f, x, right = FALSE, nomatch = NA_integer_) {
  f <- match.fun(f) # the only difference from Position()
  ind <- if (right) rev(seq_along(x)) else seq_along(x)
  for (i in ind) {
    if (f(x[[i]])) return(i)
  }
  nomatch
}

This would make it consistent with the other funprog functions, and
would mean that Find() and Position() both accept a function name, as
one would expect:

> equals3 <- function(x) x == 3
> Position("equals3", 1:5)
Error in f(x[[i]]) : could not find function "f"
> Position2("equals3", 1:5)
[1] 3
> Find("equals3", 1:5)
[1] 3
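
A minimal sketch of the proposed behaviour, reusing Position2() from above (a name introduced in this post, not a base function). It only checks that a function name and the function itself give the same position, and that Find() already accepts the name.

```r
# Position2() as proposed above: identical to Position() except for match.fun()
Position2 <- function(f, x, right = FALSE, nomatch = NA_integer_) {
  f <- match.fun(f)  # resolve a name such as "equals3" to the function
  ind <- if (right) rev(seq_along(x)) else seq_along(x)
  for (i in ind) {
    if (f(x[[i]])) return(i)
  }
  nomatch
}

equals3 <- function(x) x == 3

by_name  <- Position2("equals3", 1:5)                       # function given by name
by_value <- Position2(equals3, 1:5)                         # function given directly
from_end <- Position2("equals3", c(3, 1, 3), right = TRUE)  # search from the right
no_hit   <- Position2("equals3", 1:2)                       # falls through to nomatch
found    <- Find("equals3", 1:5)                            # Find() calls match.fun() itself
```

With match.fun() at the top, the name and the function are interchangeable, which is what the other funprog functions already allow.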

Thanks,
Steve

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] partial matching of row names in [-indexing

2022-01-14 Thread Steve Martin
I don't think this is a bug in the documentation. The help page for
`?[.data.frame` has the following in the last paragraph of the
details:

Both [ and [[ extraction methods partially match row names. By default
neither partially match column names, but [[ will if exact = FALSE
(and with a warning if exact = NA). If you want to exact matching on
row names use match, as in the examples.

The example it refers to is

sw <- swiss[1:5, 1:4]  # select a manageable subset
sw["C", ] # partially matches
sw[match("C", row.names(sw)), ] # no exact match

Whether this is good behaviour or not is a different question, but the
documentation seems clear enough (to me, at least).
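
A small sketch of the same point on a made-up data frame (df and its row names are assumptions for illustration, not taken from the help page). It contrasts the partial match done by [ with two ways of asking for an exact match.

```r
# A toy data frame whose row name "A10" can be partially matched by "A1"
df <- data.frame(x = 1:2, y = c("a", "b"), row.names = c("A10", "B"))

partial <- df["A1", ]                        # "A1" partially matches row "A10"
exact   <- df[match("A1", row.names(df)), ]  # match() is exact, so this is a row of NAs
none    <- df[row.names(df) %in% "A1", ]     # logical indexing returns zero rows
```

Which of the last two is preferable depends on whether a missing row should show up as NAs or be dropped.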

Best,
Steve

On Fri, 14 Jan 2022 at 20:40, Ben Bolker wrote:
>
>
> People are often surprised that row-indexing a data frame by [ +
> character does partial matching (and annoyed that there is no way to
> turn it off):
>
> https://stackoverflow.com/questions/18033501/warning-when-partial-matching-rownames
>
> https://stackoverflow.com/questions/34233235/r-returning-partial-matching-of-row-names
>
> https://stackoverflow.com/questions/70716905/why-does-r-have-inconsistent-behaviors-when-a-non-existent-rowname-is-retrieved
>
>
> ?"[" says:
>
> Character indices can in some circumstances be partially matched
>   (see ‘pmatch’) to the names or dimnames of the object being
>   subsetted (but never for subassignment).  UNLIKE S (Becker et al.
>   p. 358), R NEVER USES PARTIAL MATCHING WHEN EXTRACTING BY ‘[’, and
>   partial matching is not by default used by ‘[[’ (see argument
>   ‘exact’).
>
> (EMPHASIS ADDED).
>
> Looking through the rest of that page, I don't see any other text that
> modifies or supersedes that statement.
>
> Is this a documentation bug?
>
> The example given in one of the links above:
>
> b <- as.data.frame(matrix(4:5, ncol = 1, nrow = 2, dimnames =
> list(c("A10", "B"), "V1")))
>
> b["A1",]  ## 4 (partial matching)
> b[rownames(b) == "A1",]  ## logical(0)
> b["A1", , exact=TRUE]## unused argument error
> b$V1[["A1"]] ## subscript out of bounds error
> b$V1["A1"]   ## NA
>


[Rd] unsplit() mangles attributes

2022-11-01 Thread Steve Martin
Hello,

Unsplitting a named vector that's been split sets all the names as missing.

x <- 1:12
names(x) <- letters[x]
f <- gl(2, 6)

unsplit(split(x, f), f)
<NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
   1    2    3    4    5    6    7    8    9   10   11   12

The unsplit() function correctly deals with row names when unsplitting
a split data frame, and the same approach preserves regular names as
well. Here's a stripped-down version of unsplit() that keeps names:

unsplit_with_names <- function(value, f) {
  len <- length(f)
  x <- value[[1L]][rep(NA_integer_, len)] # names get lost here...
  split(x, f) <- value
  has_names <- !is.null(names(value[[1L]]))
  if (has_names) {
    split(names(x), f) <- lapply(value, names) # so add them back here
  }
  x
}

unsplit_with_names(split(x, f), f)
 a  b  c  d  e  f  g  h  i  j  k  l
 1  2  3  4  5  6  7  8  9 10 11 12

I plan on reporting this on bugzilla, with a more general fix, but
would first like to see if I'm missing anything, and check that my
reasoning is clear.

It seems that names are the only attribute for unclassed vectors that
survive the default method of split(), and so I think the above
version of unsplit() replaces all the attributes it can for unclassed
vectors.

I'm less confident about classed vectors, as unsplit() isn't generic
and potentially needs to deal with objects. Dates and factors work
fine, as it seems they can only lose names; this is addressed with the
above version of unsplit(). But are there other attributes for classed
objects that may get lost with unsplit? Can my fix above cause
problems for certain classes? (Note that I didn't use the recursion
that unsplit() uses for data frames, as that relies on names not
themselves having names.)
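
A quick round-trip check for the two classed cases mentioned above, as a sketch only: it uses the existing split() and unsplit(), with made-up unnamed inputs, and just confirms that class, levels and values come back.

```r
f <- gl(2, 3)

# Dates: the class should survive the round trip
d <- as.Date("2022-11-01") + 0:5
d_back <- unsplit(split(d, f), f)

# Factors: class and levels should survive as well
fac <- factor(c("a", "b", "a", "c", "b", "c"))
fac_back <- unsplit(split(fac, f), f)
```

This says nothing about other classes, which is exactly the open question.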

The real challenge is that unsplit need not have all the information
about the original object it's trying to put back together. Take the
case of a vector with a dim attribute.

y <- matrix(x, 3, 4, dimnames = list(letters[1:3], letters[1:4]))

unsplit(split(y, f), f)
[1]  1  2  3  4  5  6  7  8  9 10 11 12

A possible solution is for split() to record the attributes of its
argument for later use by unsplit(). Again, consider some
stripped-down alternatives:

split_with_attr <- function(x, f) {
  res <- split(x, f)
  structure(res, original.attr = attributes(x))
}

unsplit_with_attr <- function(value, f) {
  len <- length(f)
  x <- value[[1L]][rep(NA_integer_, len)]
  split(x, f) <- value
  attributes(x) <- attr(value, "original.attr")
  x
}

unsplit_with_attr(split_with_attr(y, f), f)
  a b c  d
a 1 4 7 10
b 2 5 8 11
c 3 6 9 12

But this seems complicated, and may muck up existing code. It would be
much easier if I can just restrict attention to restoring lost names
for unclassed vectors :)

Any thoughts are much appreciated.

Thanks,
Steve



Re: [Rd] Augment base::replace(x, list, value) to allow list= to be a predicate?

2023-03-07 Thread Steve Martin
That's an interesting example, as it's conceptually similar to what
Pavel is proposing, but structurally different. gsubfn() is more
complicated than a simple switch in the body of the function, and
wouldn't work well as an anonymous function.

Multiple dispatch can nicely encompass both of these cases. For replace(),

library(S7)

replace <- new_generic("replace", c("x", "list"), function(x, list,
values, ...) {
  S7_dispatch()
})

method(replace, list(class_any, class_any)) <- base::replace

method(replace, list(class_any, class_function)) <- function(x, list,
values, ...) {
  replace(x, list(x, ...), values)
}

x <- c(1, 2, NA, 3)
replace(x, is.na(x), 0)
[1] 1 2 0 3

replace(x, is.na, 0)
[1] 1 2 0 3

And for gsub(),

gsub <- new_generic("gsub", c("pattern", "replacement"),
function(pattern, replacement, x, ...) {
  S7_dispatch()
})

method(gsub, list(class_character, class_character)) <- base::gsub

# My quick-and-dirty implementation as an example
method(gsub, list(class_character, class_function)) <-
function(pattern, replacement, x) {
  m <- regexpr(pattern, x)
  res <- replacement(regmatches(x, m))
  mapply(gsub, pattern, as.character(res), x, USE.NAMES = FALSE)
}

gsub("^..", toupper, c("abc", "xyz"))
[1] "ABc" "XYz"

But this isn't a simple change to replace() anymore, and I may just be
spending too much time tinkering with Julia.
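
For comparison, a base-R sketch of the simple switch Pavel describes, written as a separate wrapper so that base::replace() is not masked. The name replace_when() is hypothetical, and the pipe needs R >= 4.1.

```r
# Hypothetical wrapper: 'list' may be an index vector or a function of x
replace_when <- function(x, list, values, ...) {
  # Here list() refers to the argument (a function), not the built-in
  if (is.function(list)) list <- list(x, ...)
  replace(x, list, values)
}

x <- c(1, 2, NA, 3)
by_index     <- replace_when(x, is.na(x), 0)  # ordinary index vector
by_predicate <- x |> replace_when(is.na, 0)   # predicate applied to x
```

This covers the replace() case with no dispatch at all, but it does not generalize to gsub() the way the multiple-dispatch version does.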

Steve

On Tue, 7 Mar 2023 at 07:34, Gabor Grothendieck wrote:
>
> This could be extended to sub and gsub as well, which gsubfn in the
> gsubfn package already does:
>
>   library(gsubfn)
>   gsubfn("^..", toupper, c("abc", "xyz"))
>   ## [1] "ABc" "XYz"
>
> On Fri, Mar 3, 2023 at 7:22 PM Pavel Krivitsky wrote:
> >
> > Dear All,
> >
> > Currently, list= in base::replace(x, list, value) has to be an index
> > vector. For me, at least, the most common use case is for list= to be
> > some simple property of elements of x, e.g.,
> >
> > x <- c(1,2,NA,3)
> > replace(x, is.na(x), 0)
> >
> > Particularly when using R pipes, which don't allow multiple
> > substitutions, it would simplify many of such cases if list= could be a
> > function that returns an index, e.g.,
> >
> > replace <- function (x, list, values, ...) {
> >   # Here, list() refers to the argument, not the built-in.
> >   if(is.function(list)) list <- list(x, ...)
> >   x[list] <- values
> >   x
> > }
> >
> > Then, the following is possible:
> >
> > c(1,2,NA,3) |> replace(is.na, 0)
> >
> > Any thoughts?
> > Pavel
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>
>
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>


Re: [Rd] Use of `[` with array and resulting class

2023-09-29 Thread Steve Martin
This is due to `[` dropping dimensions by default. In your first
example, think of a[1, , ] as having dimension c(1, 3, 2), but,
because drop = TRUE, all dimensions of extent 1 (the first dimension)
are dropped and the result has dimension c(3, 2). In your second
example, b[1, , ] would have dimension c(1, 6, 1), but now both the
first and third dimensions are dropped, resulting in a vector with no
dimensions.

We can use the drop function to explicitly see why b[1, ,] doesn't
result in a matrix.

> drop(matrix(1:6, 6, 1))
[1] 1 2 3 4 5 6
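
If the matrix result is what's wanted, one option is to turn dropping off and remove only the first dimension by hand. A minimal sketch using the b from your example:

```r
b <- array(1:6, dim = c(1, 6, 1))

dropped <- b[1, , ]                # both extent-1 dimensions are dropped
kept    <- b[1, , , drop = FALSE]  # nothing is dropped: dim is c(1, 6, 1)

# Remove only the first dimension, keeping a 6 x 1 matrix
m <- kept
dim(m) <- dim(kept)[2:3]
```

Here m has dimension c(6, 1), while dropped is a plain vector with no dim attribute.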

Steve

On Fri, 29 Sept 2023 at 23:28, Joseph Wood wrote:
>
> Hello,
>
> I recently discovered a possible inconsistency with usage of an object of
> class array.
>
> Consider the following example:
>
> ## Setup
>
> a <- array(1:6, dim = c(1, 3, 2))
> a
> , , 1
>
>      [,1] [,2] [,3]
> [1,]    1    2    3
>
> , , 2
>
>      [,1] [,2] [,3]
> [1,]    4    5    6
>
> class(a)
> [1] "array"
>
> dim(a)
> [1] 1 3 2
>
> ## Now use `[`
> a[1,,]
>      [,1] [,2]
> [1,]    1    4
> [2,]    2    5
> [3,]    3    6
>
> class(a[1,,])
> [1] "matrix" "array"
>
> dim(a[1,,])
> [1] 3 2
>
> Up until this point, it makes sense to me. Now, let's consider when dim =
> c(1, 6, 1). This is where I have a little trouble understanding the
> behavior.
>
> ## Array with dim = c(1, any_number_here, 1)
>
> b <- array(1:6, dim = c(1, 6, 1))
> b
> , , 1
>
>      [,1] [,2] [,3] [,4] [,5] [,6]
> [1,]    1    2    3    4    5    6
>
> class(b)
> [1] "array"
>
> dim(b)
> [1] 1 6 1
>
> ## The problem
>
> b[1,,]
> [1] 1 2 3 4 5 6
>
> dim(b[1,,])
> NULL
>
> class(b[1,,])
> [1] "integer"
>
> I would have expected:
>
> b[1,,] ## produced the output with matrix(1:6, ncol = 1)
>      [,1]
> [1,]    1
> [2,]    2
> [3,]    3
> [4,]    4
> [5,]    5
> [6,]    6
>
> class(b[1,,])
> [1] "matrix" "array"
>
> dim(b[1,,])
> [1] 6 1
>
> Is this a bug? If not, any help understanding this behaviour would be much
> appreciated.
>
> Thanks,
> Joseph
>


Re: [Rd] zapsmall(x) for scalar x

2023-12-16 Thread Steve Martin
Zapping a vector of small numbers to zero would cause problems when
printing the results of summary(). For example, if
zapsmall(c(2.220446e-16, ..., 2.220446e-16)) == c(0, ..., 0) then
print(summary(2.220446e-16), digits = 7) would print
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      0       0       0       0       0       0

The same problem can also appear when printing the results of
summary.glm() with show.residuals = TRUE if there's little dispersion
in the residuals.

Steve

On Sat, 16 Dec 2023 at 17:34, Gregory Warnes wrote:
>
> I was quite surprised to discover that applying `zapsmall` to a scalar value
> has no apparent effect.  For example:
>
> > y <- 2.220446e-16
> > zapsmall(y,)
> [1] 2.2204e-16
>
> I was expecting `zapsmall(x)` to act like
>
> > round(y, digits=getOption('digits'))
> [1] 0
>
> Looking at the current source code indicates that `zapsmall` is expecting a
> vector:
>
> zapsmall <-
> function (x, digits = getOption("digits"))
> {
>     if (length(digits) == 0L)
>         stop("invalid 'digits'")
>     if (all(ina <- is.na(x)))
>         return(x)
>     mx <- max(abs(x[!ina]))
>     round(x, digits = if (mx > 0) max(0L, digits - as.numeric(log10(mx)))
>                       else digits)
> }
>
> If `x` is a non-zero scalar, zapsmall will never perform rounding.
>
> The man page simply states:
> zapsmall determines a digits argument dr for calling round(x, digits = dr) 
> such that values close to zero (compared with the maximal absolute value) are 
> ‘zapped’, i.e., replaced by 0.
>
> and doesn’t provide any details about how ‘close to zero’ is defined.
>
> Perhaps handling the special case when `x` is a scalar (or only contains a single
> non-NA value) would make sense:
>
> zapsmall <-
> function (x, digits = getOption("digits"))
> {
>     if (length(digits) == 0L)
>         stop("invalid 'digits'")
>     if (all(ina <- is.na(x)))
>         return(x)
>     mx <- max(abs(x[!ina]))
>     round(x, digits = if (mx > 0 && (length(x) - sum(ina)) > 1)
>                           max(0L, digits - as.numeric(log10(mx)))
>                       else digits)
> }
>
> Yielding:
>
> > y <- 2.220446e-16
> > zapsmall(y)
> [1] 0
>
> Another edge case would be when all of the non-NA values are the same:
>
> > y <- 2.220446e-16
> > zapsmall(c(y,y))
> [1] 2.220446e-16 2.220446e-16
>
> Thoughts?
>
>
> Gregory R. Warnes, Ph.D.
> g...@warnes.net
> Eternity is a long time, take a friend!
>
>
>


Re: [Rd] [External] Re: zapsmall(x) for scalar x

2023-12-17 Thread Steve Martin
Sorry for being unclear. I was commenting on the edge case that
Gregory brought up when calling zapsmall() with a vector of small
values. I thought Gregory was asking for thoughts on that as well, but
maybe I misunderstood. IMO it would be weird for zapsmall() to make a
small scalar zero but not a vector of the identical values.

The example with summary() was meant to show that zapping a vector of
small values to 0 could change the current printing behavior for
certain objects. Duncan is right that zapping only a scalar to zero
wouldn't do anything.

>>> Isn’t that the correct outcome?  The user can change the number of digits 
>>> if they want to see small values…

I'm not sure a user would be able to change the digits without
updating other functions. If xx[finite] <- zapsmall(x[finite]) in
print.summaryDefault() makes a vector of 0s (e.g., zapsmall(x) works
like round(x, digits = getOption("digits")) and getOption("digits")
is 7) then calling print(summary(2.220446e-16), digits = 16) would
still print a vector of 0s. The digits argument to print() wouldn't do
anything.
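
A sketch of why the digits argument couldn't help, assuming a hypothetical zapsmall() that rounds absolutely (stood in for here by round() with 7 digits): once the values are exactly 0, no number of printed digits brings them back.

```r
y <- 2.220446e-16

# Stand-in for an "absolute" zapsmall() with getOption("digits") equal to 7
zapped <- round(rep(y, 6), digits = 7)

# Asking for more digits when formatting changes nothing: the information is gone
shown <- format(zapped, digits = 16)
```

By contrast, format(y, digits = 16) on the original value still shows it.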

In any case, I just wanted to point out that changes to zapsmall() in
the corner case Gregory brought up could affect the way certain
objects are printed, both changing the current behavior and perhaps
requiring changes to some other functions.

Steve

On Sun, 17 Dec 2023 at 12:26, Barry Rowlingson wrote:
>
> I think what's been missed is that zapsmall works relative to the absolute
> largest value in the vector. Hence if there's only one item in the vector,
> it is the largest, so it's not zapped. The function's raison d'etre isn't to
> replace absolutely small values, but small values relative to the largest.
> Hence a vector of similar tiny values doesn't get zapped.
>
> Maybe the line in the docs:
>
> " (compared with the maximal absolute value)"
>
> needs to read:
>
> " (compared with the maximal absolute value in the vector)"
>
> Barry
>
>
>
>
>
> On Sun, Dec 17, 2023 at 2:17 PM Duncan Murdoch wrote:
>>
>> This email originated outside the University. Check before clicking links or 
>> attachments.
>>
>> I'm really confused.  Steve's example wasn't a scalar x, it was a
>> vector.  Your zapsmall() proposal wouldn't zap it to zero, and I don't
>> see why summary() would if it was using your proposal.
>>
>> Duncan Murdoch
>>
>> On 17/12/2023 8:43 a.m., Gregory R. Warnes wrote:
>> > Isn’t that the correct outcome?  The user can change the number of digits 
>> > if they want to see small values…
>> >
>> >
>> > --
>> > Change your thoughts and you change the world.
>> > --Dr. Norman Vincent Peale
>> >
>> >> On Dec 17, 2023, at 12:11 AM, Steve Martin  
>> >> wrote:
>> >>
>> >> Zapping a vector of small numbers to zero would cause problems when
>> >> printing the results of summary(). For example, if
>> >> zapsmall(c(2.220446e-16, ..., 2.220446e-16)) == c(0, ..., 0) then
>> >> print(summary(2.220446e-16), digits = 7) would print
>> >>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
>> >>       0       0       0       0       0       0
>> >>
>> >> The same problem can also appear when printing the results of
>> >> summary.glm() with show.residuals = TRUE if there's little dispersion
>> >> in the residuals.
>> >>
>> >> Steve
>> >>
>> >>> On Sat, 16 Dec 2023 at 17:34, Gregory Warnes  wrote:
>> >>>
>> >>> I was quite surprised to discover that applying `zapsmall` to a scalar
>> >>> value has no apparent effect.  For example:
>> >>>
>> >>>> y <- 2.220446e-16
>> >>>> zapsmall(y,)
>> >>> [1] 2.2204e-16
>> >>>
>> >>> I was expecting `zapsmall(x)` to act like
>> >>>
>> >>>> round(y, digits=getOption('digits'))
>> >>> [1] 0
>> >>>
>> >>> Looking at the current source code indicates that `zapsmall` is
>> >>> expecting a vector:
>> >>>
>> >>> zapsmall <-
>> >>> function (x, digits = getOption("digits"))
>> >>> {
>> >>> if (length(digits) == 0L)
>> >>> stop("invalid 'digits'")
>> >>> if (all(ina <- is.na(x)))
>> >>> return(x)
>> >>> mx &

Re: [Rd] [External] Re: zapsmall(x) for scalar x

2023-12-18 Thread Steve Martin
 if (all(ina <- is.na(x)))
> >  return(x)
> >  mx <- mFUN(x, ina)
> >  round(x, digits = if(mx > 0) max(min.d, digits -
> as.numeric(log10(mx))) else digits)
> > }
> >
> > with optional 'min.d' as I had (vaguely remember to have) found
> > at the time that the '0' is also not always "the only correct" choice.
> Do you have a case or two where min.d could be useful?
>
> Serguei.
>
> >
> > Somehow I never got to propose/discuss the above,
> > but it seems a good time to do so now.
> >
> > Martin
> >
> >
> >
> >  >> barry
> >  >>
> >  >>
> >  >> On Sun, Dec 17, 2023 at 2:17 PM Duncan Murdoch <
> murdoch.dun...@gmail.com>
> >  >> wrote:
> >  >>
> >  >>> This email originated outside the University. Check before
> clicking links
> >  >>> or attachments.
> >  >>>
> >  >>> I'm really confused.  Steve's example wasn't a scalar x, it was
> a
> >  >>> vector.  Your zapsmall() proposal wouldn't zap it to zero, and
> I don't
> >  >>> see why summary() would if it was using your proposal.
> >  >>>
> >  >>> Duncan Murdoch
> >  >>>
> >  >>> On 17/12/2023 8:43 a.m., Gregory R. Warnes wrote:
> >  >>>> Isn’t that the correct outcome?  The user can change the
> number of
> >  >>> digits if they want to see small values…
> >  >>>>
> >  >>>> --
> >  >>>> Change your thoughts and you change the world.
> >  >>>> --Dr. Norman Vincent Peale
> >  >>>>
> >  >>>>> On Dec 17, 2023, at 12:11 AM, Steve Martin <
> stevemartin...@gmail.com>
> >  >>> wrote:
> >  >>>>> Zapping a vector of small numbers to zero would cause
> problems when
> >  >>>>> printing the results of summary(). For example, if
> >  >>>>> zapsmall(c(2.220446e-16, ..., 2.220446e-16)) == c(0, ..., 0)
> then
> >  >>>>> print(summary(2.220446e-16), digits = 7) would print
> >  >>>>>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
> >  >>>>>       0       0       0       0       0       0
> >  >>>>>
> >  >>>>> The same problem can also appear when printing the results of
> >  >>>>> summary.glm() with show.residuals = TRUE if there's little
> dispersion
> >  >>>>> in the residuals.
> >  >>>>>
> >  >>>>> Steve
> >  >>>>>
> >>>>>> On Sat, 16 Dec 2023 at 17:34, Gregory Warnes 
> wrote:
> >  >>>>>>
> >>>>>> I was quite surprised to discover that applying `zapsmall` to a
> scalar
> >  >>> value has no apparent effect.  For example:
> >  >>>>>>> y <- 2.220446e-16
> >  >>>>>>> zapsmall(y,)
> >>>>>> [1] 2.2204e-16
> >  >>>>>>
> >>>>>> I was expecting `zapsmall(x)` to act like
> >  >>>>>>
> >  >>>>>>> round(y, digits=getOption('digits'))
> >>>>>> [1] 0
> >  >>>>>>
> >>>>>> Looking at the current source code indicates that `zapsmall` is
> >  >>> expecting a vector:
> >>>>>> zapsmall <-
> >>>>>> function (x, digits = getOption("digits"))
> >>>>>> {
> >>>>>>   if (length(digits) == 0L)
> >>>>>>   stop("invalid 'digits'")
> >>>>>>   if (all(ina <- is.na(x)))
> >>>>>>   return(x)
> >>>>>>   mx <- max(abs(x[!ina]))
> >>>>>>   round(x, digits = if (mx > 0) max(0L, digits -
> >  >>> as.numeric(log10(mx))) else digits)
> >>>>>> }
> >  >>>>>>
> >>>>>> If `x` is a non-zero scalar, zapsmall will never perform rounding.
> >  >>>>>>
> >>>>>> The man page simply states:
> >>>>>> zapsmall determ

Re: [Rd] [External] Re: zapsmall(x) for scalar x

2023-12-19 Thread Steve Martin
Thanks for sharing, Martin. You're right that the interface for mFUN
should be more general than I initially thought.*

Perhaps you have other cases/examples where the ina argument is
useful, in which case ignore me, but your example with the robust mFUN
doesn't use the ina argument. What about having mFUN be only an
argument of x (NAs and all), with a default of \(x) max(abs(x), na.rm
= TRUE)? It's a minor difference, but it might make the mFUN argument
a bit simpler to use (no need to carry a dummy argument when NAs in x
can be handled directly).
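
To make the suggestion concrete, here is a sketch of the one-argument variant. The name zapsmall2() is hypothetical, the body follows the versions quoted below, and the lambda syntax needs R >= 4.1.

```r
# Hypothetical variant: mFUN sees all of x and handles NAs itself
zapsmall2 <- function(x, digits = getOption("digits"),
                      mFUN = \(x) max(abs(x), na.rm = TRUE), min.d = 0L) {
  if (length(digits) == 0L)
    stop("invalid 'digits'")
  if (all(is.na(x)))
    return(x)
  mx <- mFUN(x)
  round(x, digits = if (mx > 0) max(min.d, digits - as.numeric(log10(mx)))
                    else digits)
}

res <- zapsmall2(c(1, 1e-20, NA), digits = 7)  # the NA is passed through untouched

# A robust mFUN no longer needs a dummy 'ina' argument
mF_rob <- \(x) boxplot.stats(x, do.conf = FALSE)$stats[5]
rob <- zapsmall2(c(pi * 100^(-2:2) / 10, Inf), digits = 7, mFUN = mF_rob)
```

The cost is that a user-supplied mFUN has to remember na.rm = TRUE (or equivalent) itself.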

Steve

* Tangent: Does boxplot.stats() use the number of NA values? The
documentation says NAs are omitted, and a quick scan of the code and
some tests suggests boxplot.stats(x) should give the same result as
boxplot.stats(x[!is.na(x)]), although I may be missing something. But
your point is well taken, and the interface should be more general
than I initially thought.

On Tue, 19 Dec 2023 at 11:25, Martin Maechler wrote:
>
> >>>>> Steve Martin
> >>>>> on Mon, 18 Dec 2023 07:56:46 -0500 writes:
>
> > Does mFUN() really need to be a function of x and the NA values of x? I
> > can't think of a case where it would be used on anything but the non-NA
> > values of x.
>
> > I think it would be easier to specify a different mFUN() (and document this
> > new argument) if the function has one argument and is applied to the non-NA
> > values of x.
>
> > zapsmall <- function(x,
> >                      digits = getOption("digits"),
> >                      mFUN = function(x) max(abs(x)),
> >                      min.d = 0L) {
> >   if (length(digits) == 0L)
> >     stop("invalid 'digits'")
> >   if (all(ina <- is.na(x)))
> >     return(x)
> >   mx <- mFUN(x[!ina])
> >   round(x, digits = if(mx > 0) max(min.d, digits - as.numeric(log10(mx)))
> >                     else digits)
> > }
>
> > Steve
>
> Thank you, Steve,
> you are right that it would look simpler to do it that way.
>
> On the other hand, in your case, mFUN() no longer sees the
> original n observations, and would not know whether there were NAs
> and, if so, how many NAs there were in the original data.
>
> The examples I have on my version of zapsmall's help page (see below)
> use a robust mFUN, "the upper hinge of a box plot":
>
>mF_rob <- function(x, ina) boxplot.stats(x, do.conf=FALSE)$stats[5]
>
> and if you inspect boxplot.stats() you may know that indeed it
> also wants to use the full data 'x' to compute its statistics and
> then deal with NAs directly.  Your simplified mFUN interface
> would not be fully consistent with boxplot(), and I think could
> not be made so,  hence my more flexible 2-argument "design" for  mFUN().
>
>  and BTW, these examples also exemplify the use of  `min.d`
> about which  Serguei Sokol asked for an example or two.
>
> Here I repeat my definition of zapsmall, and then my current set
> of examples:
>
> zapsmall <- function(x, digits = getOption("digits"),
>                      mFUN = function(x, ina) max(abs(x[!ina])), min.d = 0L)
> {
>     if (length(digits) == 0L)
>         stop("invalid 'digits'")
>     if (all(ina <- is.na(x)))
>         return(x)
>     mx <- mFUN(x, ina)
>     round(x, digits = if(mx > 0) max(min.d, digits - as.numeric(log10(mx)))
>                       else digits)
> }
>
>
> ##--- \examples{
> x2 <- pi * 100^(-2:2)/10
>print(  x2, digits = 4)
> zapsmall(  x2) # automatic digits
> zapsmall(  x2, digits = 4)
> zapsmall(c(x2, Inf)) # round()s to integer ..
> zapsmall(c(x2, Inf), min.d=-Inf) # everything  is small wrt  Inf
>
> (z <- exp(1i*0:4*pi/2))
> zapsmall(z)
>
> zapShow <- function(x, ...) rbind(orig = x, zapped = zapsmall(x, ...))
> zapShow(x2)
>
> ## using a *robust* mFUN
> mF_rob <- function(x, ina) boxplot.stats(x, do.conf=FALSE)$stats[5]
> ## with robust mFUN(), 'Inf' is no longer distorting the picture:
> zapShow(c(x2, Inf), mFUN = mF_rob)
> zapShow(c(x2, Inf), mFUN = mF_rob, min.d = -5) # the same
> zapShow(c(x2, 999), mFUN = mF_rob) # same *rounding* as w/ Inf
> zapShow(c(x2, 999), mFUN = mF_rob, min.d =  3) # the same
> zapShow(c(x2, 999), mFUN = mF_rob, min.d =  8) # small diff
> ##--- }
>
>
>
> > On Mon, Dec 18, 2023, 05:47 Serguei Sokol via R-devel wrote:
>
> > Le 18/12/2023 à 11:24, Martin Maechler a écrit :
> > >>>>>> Serguei Sokol via R-devel
> > >>>>>>  on Mon, 18 Dec 2023 10

Re: [Rd] tools::startDynamicHelp(): Randomly prevents R from exiting (on MS Windows)

2024-01-07 Thread Steve Martin via R-devel
Henrik,

I was able to reproduce this both with Rscript and interactively using the same 
version of R you're using (fresh install) and Windows 10.0.22621.2715. It took 
about a dozen tries.

Steve

 Original Message 
On Jan 6, 2024, 12:38, Henrik Bengtsson wrote:

> ISSUE: On MS Windows, running cmd.exe, calling Rscript --vanilla -e "port R 
> --version R version 4.3.2 (2023-10-31 ucrt) -- "Eye Holes" Copyright (C) 2023 
> The R Foundation for Statistical Computing Platform: x86_64-w64-mingw32/x64 
> (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are 
> welcome to redistribute it under the terms of the GNU General Public License 
> versions 2 or 3. For more information about these matters see 
> https://www.gnu.org/licenses/. C:\Users\hb> Rscript --vanilla -e "port 
> Rscript --vanilla -e "port


Re: [Rd] Should subsetting named vector return named vector including named unmatched elements?

2024-01-18 Thread Steve Martin via R-devel
Jiří,

For your first question, the NA names make sense if you think of indexing with
a character vector as the same as menu[match(select, names(menu))]. You're not
indexing with "spam"; rather, "spam" becomes NA because it's not in the names
of menu. (This is how it's documented in ?`[`: "Character vectors will be
matched to the names of the object...")
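
A short sketch of that equivalence, using the menu and select from your example. It shows the intermediate index vector and where the NA name comes from, alongside the setNames() workaround you mention.

```r
menu <- list(bacon = "foo", eggs = "bar", beans = "baz")
select <- c("bacon", "eggs", "spam")

idx <- match(select, names(menu))  # "spam" is not a name of menu, so it maps to NA
by_index <- menu[idx]              # indexing by position, with an NA position
by_name  <- menu[select]           # indexing by name

# Names are looked up through the same positions, so the unmatched one is NA
names(by_index)

# The workaround: put the requested names back afterwards
renamed <- setNames(menu[select], select)
```

In both cases the third element is NULL with an NA name; only setNames() carries "spam" through.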

Steve


On Thursday, January 18th, 2024 at 2:51 PM, Jiří Moravec wrote:


> Subsetting a vector (including a list) returns the same number of elements
> as the subsetting vector, including unmatched elements, which are
> reported as `NA` or `NULL` (in the case of lists).
> 
> Consider:
> 
> ```
> menu = list(
> "bacon" = "foo",
> "eggs" = "bar",
> "beans" = "baz"
> )
> 
> select = c("bacon", "eggs", "spam")
> 
> menu[select]
> # $bacon
> # [1] "foo"
> #
> # $eggs
> # [1] "bar"
> #
> # $<NA>
> # NULL
> ```
>
> Wouldn't it be more logical to return a named vector/list including the names of
> unmatched elements when subsetting using names? After all, the unmatched
> elements are already returned. I.e., the output would look like this:
>
> ```
> menu[select]
> # $bacon
> # [1] "foo"
> #
> # $eggs
> # [1] "bar"
> #
> # $spam
> # NULL
> 
> ```
> 
> The simple fix `menu[select] |> setNames(select)` solves this, but it feels
> to me like something that could be the default behaviour.
> 
> On a slightly unrelated note, when I was asking if there is a better
> solution, `menu[select]` seemed to allocate more memory than
> `menu_env = list2env(menu); mget(select, envir = menu_env, ifnotfound =
> list(NULL))`, or than the sapply solution. Is this a benchmarking artifact?
> 
> https://stackoverflow.com/q/77828678/4868692
> 