Re: [Rd] [External] Re: zapsmall(x) for scalar x

Serguei Sokol via R-devel Mon, 18 Dec 2023 01:29:33 -0800

On 17/12/2023 at 18:26, Barry Rowlingson wrote:

I think what's been missed is that zapsmall works relative to the largest
absolute value in the vector. Hence if there's only one
item in the vector, it is the largest, so it's not zapped. The function's
raison d'etre isn't to replace absolutely small values,
but small values relative to the largest. Hence a vector of similar tiny
values doesn't get zapped.
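
To make the point above concrete, here is a minimal sketch (not part of the original message). It passes digits = 7 explicitly so the result does not depend on the session's options; the variable names are only for illustration.

```r
y <- 2.220446e-16

# y is tiny relative to the largest absolute value (1), so it is zapped
mixed <- zapsmall(c(1, y), digits = 7)

# here y is itself the largest absolute value, so nothing is "small"
alone <- zapsmall(y, digits = 7)
twins <- zapsmall(c(y, y), digits = 7)

print(mixed)
print(alone)
print(twins)
```

Only the first call contains a value that is small relative to the maximum, so only there is anything replaced by 0.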


Maybe the line in the docs:

" (compared with the maximal absolute value)"

needs to read:

" (compared with the maximal absolute value in the vector)"

I agree that this change in the doc would clarify the situation but would not resolve the proposed corner cases. I think that an additional argument 'mx' (absolute max value of reference) would do. Consider:


zapsmall2 <-
function (x, digits = getOption("digits"), mx = max(abs(x), na.rm = TRUE))
{
    if (length(digits) == 0L)
        stop("invalid 'digits'")
    if (all(ina <- is.na(x)))
        return(x)
    # 'mx' is the reference scale; by default, the largest absolute value in x
    round(x, digits = if (mx > 0) max(0L, digits - as.numeric(log10(mx))) else digits)
}

Then zapsmall2() without an explicit 'mx' behaves identically to the current zapsmall(), and for a scalar or a vector of identical values, the user can manually fix the scale of what should be considered small:


> zapsmall2(y)
[1] 2.220446e-16
> zapsmall2(y, mx=1)
[1] 0
> zapsmall2(c(y, y), mx=1)
[1] 0 0
> zapsmall2(c(y, NA))
[1] 2.220446e-16           NA
> zapsmall2(c(y, NA), mx=1)
[1]  0 NA

Obviously, the name 'zapsmall2' was chosen just for this explanation. The original name 'zapsmall' could be reused, as full backward compatibility is preserved.
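
The backward-compatibility claim can be checked with a short sketch like the one below (added for illustration, not part of the original message). It compares zapsmall2() against a local copy of the zapsmall() definition quoted further down in this thread, named 'zapsmall_quoted' here; that name and the test vectors in 'cases' are assumptions made for this example.

```r
# Local copy of the zapsmall() source quoted in this thread
zapsmall_quoted <- function(x, digits = getOption("digits")) {
    if (length(digits) == 0L)
        stop("invalid 'digits'")
    if (all(ina <- is.na(x)))
        return(x)
    mx <- max(abs(x[!ina]))
    round(x, digits = if (mx > 0) max(0L, digits - as.numeric(log10(mx))) else digits)
}

# The proposal, with 'mx' defaulting to the largest absolute non-NA value
zapsmall2 <- function(x, digits = getOption("digits"),
                      mx = max(abs(x), na.rm = TRUE)) {
    if (length(digits) == 0L)
        stop("invalid 'digits'")
    if (all(ina <- is.na(x)))
        return(x)
    round(x, digits = if (mx > 0) max(0L, digits - as.numeric(log10(mx))) else digits)
}

y <- 2.220446e-16
cases <- list(y, c(y, y), c(y, NA), c(1, y), c(1234.5678, 1e-10, NA),
              c(0, 0), NA_real_)

# Without an explicit 'mx', both functions should agree on every case
same <- vapply(cases,
               function(x) identical(zapsmall2(x, 7), zapsmall_quoted(x, 7)),
               logical(1))
print(all(same))

# With an explicit reference scale, the tiny value is zapped and NA is kept
zapped <- zapsmall2(c(y, NA), digits = 7, mx = 1)
print(zapped)
```

Because 'mx' is a lazily evaluated default argument, the all-NA early return happens before max() is ever called, so no warning is raised in that case.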


Best,
Serguei.


Barry





On Sun, Dec 17, 2023 at 2:17 PM Duncan Murdoch <[email protected]>
wrote:

I'm really confused.  Steve's example wasn't a scalar x, it was a
vector.  Your zapsmall() proposal wouldn't zap it to zero, and I don't
see why summary() would if it was using your proposal.

Duncan Murdoch

On 17/12/2023 8:43 a.m., Gregory R. Warnes wrote:

Isn’t that the correct outcome?  The user can change the number of
digits if they want to see small values…


--
Change your thoughts and you change the world.
--Dr. Norman Vincent Peale

On Dec 17, 2023, at 12:11 AM, Steve Martin <[email protected]> wrote:

Zapping a vector of small numbers to zero would cause problems when
printing the results of summary(). For example, if
zapsmall(c(2.220446e-16, ..., 2.220446e-16)) == c(0, ..., 0) then
print(summary(2.220446e-16), digits = 7) would print
       Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
          0       0       0       0       0       0

The same problem can also appear when printing the results of
summary.glm() with show.residuals = TRUE if there's little dispersion
in the residuals.

Steve

On Sat, 16 Dec 2023 at 17:34, Gregory Warnes <[email protected]> wrote:

I was quite surprised to discover that applying `zapsmall` to a scalar
value has no apparent effect.  For example:

y <- 2.220446e-16
zapsmall(y)

[1] 2.2204e-16

I was expecting `zapsmall(x)` to act like

round(y, digits=getOption('digits'))

[1] 0

Looking at the current source code indicates that `zapsmall` is
expecting a vector:

zapsmall <-
function (x, digits = getOption("digits"))
{
     if (length(digits) == 0L)
         stop("invalid 'digits'")
     if (all(ina <- is.na(x)))
         return(x)
     mx <- max(abs(x[!ina]))
     round(x, digits = if (mx > 0) max(0L, digits -
         as.numeric(log10(mx))) else digits)
}

If `x` is a non-zero scalar, zapsmall will never perform rounding.

The man page simply states:
zapsmall determines a digits argument dr for calling round(x, digits =
dr) such that values close to zero (compared with the maximal absolute
value) are ‘zapped’, i.e., replaced by 0.

and doesn’t provide any details about how ‘close to zero’ is defined.
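
As a rough illustration of what 'close to zero' means in practice (a sketch added here, not part of the original message), the 'dr' from the man page can be computed by hand using the formula in the source quoted above. The vector 'x' and the choice digits = 7 are assumptions picked so that the maximum is an exact power of ten.

```r
digits <- 7
x <- c(1000, 1e-3, 1e-5)

# 'dr' as computed in the quoted source: digits minus log10 of the maximum
mx <- max(abs(x))
dr <- max(0L, digits - as.numeric(log10(mx)))
zapped <- round(x, digits = dr)

# Dividing everything by 1000 shifts 'dr' by 3, so the same element is zapped
xs <- x / 1000
drs <- max(0L, digits - as.numeric(log10(max(abs(xs)))))
zapped_scaled <- round(xs, digits = drs)

print(dr)
print(zapped)
print(drs)
print(zapped_scaled)
```

In both cases only the last element, which is eight orders of magnitude below the maximum, is replaced by 0; the cutoff moves with the scale of the data rather than being a fixed absolute value.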

Perhaps handling the special case when `x` is a scalar (or only contains a
single non-NA value) would make sense:

zapsmall <-
function (x, digits = getOption("digits"))
{
     if (length(digits) == 0L)
         stop("invalid 'digits'")
     if (all(ina <- is.na(x)))
         return(x)
     mx <- max(abs(x[!ina]))
     round(x, digits = if (mx > 0 && (length(x) - sum(ina)) > 1) max(0L,
         digits - as.numeric(log10(mx))) else digits)
}

Yielding:

y <- 2.220446e-16
zapsmall(y)

[1] 0

Another edge case would be when all of the non-NA values are the same:

y <- 2.220446e-16
zapsmall(c(y,y))

[1] 2.220446e-16 2.220446e-16

Thoughts?


Gregory R. Warnes, Ph.D.
[email protected]
Eternity is a long time, take a friend!



         [[alternative HTML version deleted]]

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel




--
Serguei Sokol
Research Engineer, INRAE

Cellule Mathématiques
TBI, INSA/INRAE UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 61 55 98 49
email: [email protected]
https://www.toulouse-biotechnology-institute.fr/en/plateformes-plateaux/cellule-mathematiques/
