[Rd] Subsetting "dspMatrix" without coercion to "matrix"

2021-11-17 Thread Mikael Jagan
Currently, the class for dense symmetric matrices in packed storage, 
"dspMatrix", inherits its subset (i.e., `[`) methods from the "Matrix" 
class. As a result, subsetting "dspMatrix" requires coercion to 
"matrix". If memory cannot be allocated for this "matrix", then an error 
results.


n <- 3L
x <- as.double(seq_len(choose(n + 1L, 2L)))
S <- new("dspMatrix", uplo = "L", x = x, Dim = c(n, n))

S[, 1L]
# Error: vector memory exhausted (limit reached?)

traceback()
# 10: .dsy2mat(dsp2dsy(from))
# 9: asMethod(object)
# 8: as(x, "matrix")
# 7: x[, j = j, drop = TRUE]
# 6: x[, j = j, drop = TRUE]
# 5: eval(call, parent.frame())
# 4: eval(call, parent.frame())
# 3: callGeneric(x, , j = j, drop = TRUE)
# 2: S[, 1L]
# 1: S[, 1L]

This seems entirely avoidable, given that there is a relatively simple 
formula for converting 2-ary indices [i,j] of S to 1-ary indices k of 
S[lower.tri(S, TRUE)]:


k <- i + round(0.5 * (2L * n - j) * (j - 1L)) # for i >= j

Has the implementation of `[` for class "dspMatrix" been discussed 
already elsewhere? If not, shall I report it to Bugzilla as a wishlist item?


FWIW, I encountered this problem while trying to coerce a large "dist" 
object to "dspMatrix", expecting that I would be able to safely subset 
the result:


setAs(from = "dist", to = "dspMatrix", function(from) {
  p <- length(from)
  if (p > 0L) {
n <- as.integer(round(0.5 * (1 + sqrt(1 + 8 * p # p = n*(n-1)/2
x <- double(p + n)
i <- 1L + c(0L, cumsum(n:2L))
x[-i] <- from
  } else {
n <- 1L
x <- 0
  }
  new("dspMatrix", uplo = "L", x = x, Dim = c(n, n))
})

But that attempt failed for a entirely different reason. For large 
enough n, I would hit a memory limit at the line:


x[-i] <- from

So I guess my second question is: what is problematic about this 
benign-seeming subset-assignment?


Mikael

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] LOGNAME env var in the check code

2021-11-17 Thread Kurt Hornik
> Gábor Csárdi writes:

> While trying to reproduce a NOTE for
> * checking for new files in some other directories ... NOTE

> I noticed that the check code uses

> Sys.getenv("LOGNAME")

> to query the name of the current user. However on many systems this is
> not set, so this is the empty string, and then no NOTE is shown. (Only
> files owned by the current user generate a NOTE.)

> An alternative would be to call `id -un` to query the username, or
> create a file and then use `file.info()` to query its owner. Using one
> of these alternatives would make this check more reproducible.

Thanks, will try to get this improved (I like the "or" idea) ...

Best
-k

> Thanks,
> Gabor

> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Inconsistent is.list results on 'by' objects

2021-11-17 Thread Ofek Shilon
Thanks. 'by' is implemented by tapply, and it seems to behave rather
erratically for empty inputs:

> FUN = function(x) x[1,]
> FUNx <- function(x) FUN(df[x, , drop = FALSE])
> tapply(seq_len(10), df$b, FUNx) %>% storage.mode # array
> tapply(seq_len(0), c(), FUNx)   %>% storage.mode # logical
> tapply(seq_len(0), c(), NULL)   %>% storage.mode # integer

I think this warrants a fix, strictly in the tapply R code. WDYT?


On Wed, 17 Nov 2021 at 01:31, Kevin Ushey  wrote:

> You can see this a bit more clearly with e.g.
>
> > storage.mode(byy)
> [1] "list"
> > storage.mode(byy.empty)
> [1] "logical"
>
> So even though both objects have S3 class "by", they have a different
> underlying internal storage mode (as simplifying the result of 'by'
> has given you a 0-length logical, instead of a 0-length list).
>
> Best,
> Kevin
>
> On Tue, Nov 16, 2021 at 10:21 AM Bill Dunlap 
> wrote:
> >
> > Try adding simplify=FALSE to the call to by().
> >
> > -Bill
> >
> > On Tue, Nov 16, 2021 at 4:04 AM Ofek Shilon 
> wrote:
> >
> > > Take this toy code:
> > > df   <- data.frame(a=seq(10), b=rep(1:2, 5))
> > > df.empty <- subset(df, a>10)
> > > byy   <- by(data=df,   INDICES=df$b, FUN=function(x) x[1,])
> > > byy.empty <- by(data=df.empty, INDICES=df.empty$b, FUN=function(x)
> x[1,])
> > >
> > > class(byy)   # "by"
> > > class(byy.empty) # "by"
> > >
> > > is.list(byy)# TRUE
> > > is.list(byy.empty)  # FALSE!
> > >
> > >
> > > This behavior already messed up stuff elsewhere:
> > > https://github.com/Rdatatable/data.table/issues/5258
> > >
> > > I'd say any questions about the class of an object (whether 'class' or
> > > indirectly by 'is.**')
> > > should not have the answers depend on the specific contents of the
> object.
> > >
> > > Does this qualify as an R bug? Am I missing something?
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > __
> > > R-devel@r-project.org mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-devel
> > >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
>

[[alternative HTML version deleted]]

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] R-patched tarball at https://stat.ethz.ch/R/daily/ outdated

2021-11-17 Thread Gábor Csárdi
Hi all,

AFAICT https://stat.ethz.ch/R/daily/R-patched.tar.gz is still R 4.0.5 patched.

Probably needs a branch bump. FYI,
Gabor

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel (r81196) hanging at dchisq(large) (PR#13309)

2021-11-17 Thread Avraham Adler
Hello, Martin et. al.

I apologize for top posting, but I believe I have tracked down the
difference why last time my build worked and now it hangs on
`dchisq(c(Inf, 1e80, 1e50, 1e40), df=10, ncp=1)`. and it's NOT the
BLAS. I built against both 3.15 AND R's vanilla and it hung both
times. The issue was passing "march=skylake". I own an i7-8700K which
gcc considers a skylake. When I pass mtune=skylake, it does not hang
and the make check-devel after the build completes.

Below is a list of the different flags passed when using mtune vs.
march. It stands to reason that at least one of them contributed to
the hanging issue which Martin fixed in
https://bugs.r-project.org/show_bug.cgi?id=13309. While I recognize
the obvious ones, I'm not an expert and do not understand which if any
may be the culprit. For reference, most of these flags are described
here: https://gcc.gnu.org/onlinedocs/gcc-8.3.0/gcc/x86-Options.html#x86-Options.

All the following flags are DISABLED for mtune=skylake (so
march=x86-64) and ENABLED when passing march-skylake. For the record,
I usually passed march in the past without problem:

madx, maes, mavx, mavx2, mbmi, mbmi2, mclflushopt, mcx16, mf16c, mfma,
mfsgsbase, mhle, mlzcnt, mmovbe, mpclmul, mpopcnt, mprfchw, mrdrnd,
mrdseed, msahf, msgx, msse3, msse4, msse4.1, msse4.2, mssse3, mxsave,
mxsavec, mxsaveopt, and mxsaves.

Inversely, mno-sse4 is enabled when using mtune and disabled when
using arch, of course.

For completeness, the following two are disabled on both mtune and
march but enabled when passing march=native, otherwise the latter is
the same as march=skylake: mabm and mrtm. Obviously these cannot
contribute to the hanging issue.

Any ideas, especially from the experts who understand how the flags
would address the code in dchisq, would be greatly appreciated.

Thank you,

Avi

On Wed, Nov 17, 2021 at 7:15 AM Avraham Adler  wrote:
>
> On Tue, Nov 16, 2021 at 10:42 PM Kevin Ushey  wrote:
> >
> > Do you see this same hang in a build of R with debug symbols? Can you
> > try running R with GDB, or even WinDbg or another debugger, to see
> > what the call stack looks like when the hang occurs? Does the hang
> > depend on the number of threads used by OpenBLAS?
> >
> > On the off chance it's relevant, I've seen hangs / crashes when using
> > a multithreaded OpenBLAS with R on some Linux systems before, but
> > never found the time to isolate a root cause.
> >
>
>
> This last was a good thought, Kevin, as I had just compiled OpenBLAS
> 3.18 multi-threaded, but I recompiled it single threaded and it still
> crashes. The version of R I built from source last time, (2021-05-20
> r80347), does not hang when calling `dchisq(c(Inf, 1e80, 1e50, 1e40),
> df=10, ncp=1)`. I think I built that with OpenBLAS 3.15. I can try
> doing that here. As for building with debug symbols, I have never done
> that before, so if you could provide some guidance (off-list if you
> think it is inappropriate to keep it here) or point me in the
> direction of some already posted advice, I would appreciate it!
>
> Avi
>
>
>
> > Best,
> > Kevin
> >
> > On Tue, Nov 16, 2021 at 5:12 AM Avraham Adler  
> > wrote:
> > >
> > > On Tue, Nov 16, 2021 at 8:43 AM Martin Maechler
> > >  wrote:
> > > >
> > > > > Avraham Adler
> > > > > on Tue, 16 Nov 2021 02:35:56 + writes:
> > > >
> > > > > I am building r-devel on Windows 10 64bit using Jeroen's mingw 
> > > > system,
> > > > > and I am finding that my make check-devel hangs on the above 
> > > > issue.
> > > > > Everything is vanila except that I am using OpenBLAS 0.3.18. I 
> > > > have
> > > > > been using OpenBLAS for over a decade and have not had this issue
> > > > > before. Is there anything I can do to dig deeper into this issue 
> > > > from
> > > > > my end? Could there be anything that changed in R-devel that may 
> > > > have
> > > > > triggered this? The bugzilla report doesn't have any code 
> > > > attached to
> > > > > it.
> > > >
> > > > > Thank you,
> > > > > Avi
> > > >
> > > > Hmm.. it would've be nice to tell a bit more, instead of having all
> > > > your readers to search links, etc.
> > > >
> > > > In the bugzilla bug report PR#13309
> > > > https://bugs.r-project.org/show_bug.cgi?id=13309 ,the example was
> > > >
> > > >  dchisq(x=Inf, df=10, ncp=1)
> > > >
> > > > I had fixed the bug 13 years ago, in svn rev 47005
> > > > with regression test in /tests/d-p-q-r-tests.R :
> > > >
> > > >
> > > > ## Non-central Chi^2 density for large x
> > > > stopifnot(0 == dchisq(c(Inf, 1e80, 1e50, 1e40), df=10, ncp=1))
> > > > ## did hang in 2.8.0 and earlier (PR#13309).
> > > >
> > > >
> > > > and you are seeing your version of R hanging at exactly this
> > > > location?
> > > >
> > > >
> > > > I'd bet quite a bit that the underlying code in these
> > > > non-central chi square computations *never* calls BLAS and hence
> > > > I cannot imagine how openBLAS could play a role.
> > > >
> > > > However, there mus

Re: [Rd] Subsetting "dspMatrix" without coercion to "matrix"

2021-11-17 Thread Mikael Jagan
This seems entirely avoidable, given that there is a relatively simple 
formula for converting 2-ary indices [i,j] of S to 1-ary indices k of 
S[lower.tri(S, TRUE)]:


k <- i + round(0.5 * (2L * n - j) * (j - 1L)) # for i >= j


I ought to be slightly more precise here: _coercion_ is avoidable, 
because we can always map [i,j] to [k], but memory limits are not. 
Certainly S@x[k] cannot be arbitrarily long...


At the very least, it would be convenient if the subset were performed 
efficiently whenever dimensions would be dropped anyway:


* S[i, ] and S[, j] where i and j are vectors indexing exactly zero or 
one rows/columns

* S[i] where i is a matrix of the form cbind(i, j)

This would support, e.g., a memory-efficient 'apply' analogue without 
any need for MARGIN...


applySymmetric <- function(X, FUN, ..., simplify = TRUE, check = TRUE) {
  if (check && !isSymmetric(X)) {
stop("'X' is not a symmetric matrix.")
  }
  ## preprocessing
  ans <- vector("list", n)
  for (i in seq_len(n)) {
ans[[i]] <- forceAndCall(1L, FUN, S[i, ], ...)
  }
  ## postprocessing
  ans
}

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] R-devel (r81196) hanging at dchisq(large) (PR#13309)

2021-11-17 Thread Avraham Adler
Hello.

I have isolated the issue: it is the fused-multiply-add instruction
set (FMA on Intel processors). Running -march=skylake -mno-fma not
only does not hang, but passes make check-all (using R's native BLAS).
My intuition remains that something in the new more precise ebd0 code
used in dpois_raw—called by dgamma, called by dchsq, called by
dnchisq—is hanging when the assembler uses FMA. Unfortunately, I have
come across other cases online where the extra precision and the
different assembler code of FMA vs. non-FMA has caused bugs, such as
[1]. Page 5 of this paper by Dr. William Kahan sheds some light on why
this may be happening [2] (PDF).

Martin & Morton, having written (PR#15628 [3]) and/or implemented the
ebd0 code that is now being used, can either of you think of any
reason why it would hang if compiled using FMA? Again, I'm not a
professional, but line 325 of the ebd0 function in bd0.c [4] has
"ADD1(-x * log1pmx ((M * fg - x) / x))" which looks like a
Multiply-Add to me, at least in the inner parenthesis. Is there
anything that can be, or should be, done with the code to prevent the
hang, or must we forbid the use of FMA instructions (and I guess FMA4
on AMD processors) when compiling R?

Also, What happens in the case where M/x neither over- nor
under-flowed, M_LN2 * ((double) -e) <= 1. + DBL_MAX / x, fg != 1, and
after 4 loops of lines 329 & 330, *yh is still finite? How does ebd0
exit in that case? There is no "return" after line 331. Am I missing
something? Could that be related to this issue?

As an aside, could ebd0 be enhanced by using FMA instructions on
processors which support them?

Thank you very much,

Avi

[1] 
https://flameeyes.blog/2014/10/27/the-subtlety-of-modern-cpus-or-the-search-for-the-phantom-bug/
[2] https://people.eecs.berkeley.edu/~wkahan/ieee754status/IEEE754.PDF
[3] https://bugs.r-project.org/show_bug.cgi?id=15628
[4] https://github.com/wch/r-source/blob/trunk/src/nmath/bd0.c

On Wed, Nov 17, 2021 at 3:55 PM Avraham Adler  wrote:
>
> Hello, Martin et. al.
>
> I apologize for top posting, but I believe I have tracked down the
> difference why last time my build worked and now it hangs on
> `dchisq(c(Inf, 1e80, 1e50, 1e40), df=10, ncp=1)`. and it's NOT the
> BLAS. I built against both 3.15 AND R's vanilla and it hung both
> times. The issue was passing "march=skylake". I own an i7-8700K which
> gcc considers a skylake. When I pass mtune=skylake, it does not hang
> and the make check-devel after the build completes.
>
> Below is a list of the different flags passed when using mtune vs.
> march. It stands to reason that at least one of them contributed to
> the hanging issue which Martin fixed in
> https://bugs.r-project.org/show_bug.cgi?id=13309. While I recognize
> the obvious ones, I'm not an expert and do not understand which if any
> may be the culprit. For reference, most of these flags are described
> here: 
> https://gcc.gnu.org/onlinedocs/gcc-8.3.0/gcc/x86-Options.html#x86-Options.
>
> All the following flags are DISABLED for mtune=skylake (so
> march=x86-64) and ENABLED when passing march-skylake. For the record,
> I usually passed march in the past without problem:
>
> madx, maes, mavx, mavx2, mbmi, mbmi2, mclflushopt, mcx16, mf16c, mfma,
> mfsgsbase, mhle, mlzcnt, mmovbe, mpclmul, mpopcnt, mprfchw, mrdrnd,
> mrdseed, msahf, msgx, msse3, msse4, msse4.1, msse4.2, mssse3, mxsave,
> mxsavec, mxsaveopt, and mxsaves.
>
> Inversely, mno-sse4 is enabled when using mtune and disabled when
> using arch, of course.
>
> For completeness, the following two are disabled on both mtune and
> march but enabled when passing march=native, otherwise the latter is
> the same as march=skylake: mabm and mrtm. Obviously these cannot
> contribute to the hanging issue.
>
> Any ideas, especially from the experts who understand how the flags
> would address the code in dchisq, would be greatly appreciated.
>
> Thank you,
>
> Avi
>
> On Wed, Nov 17, 2021 at 7:15 AM Avraham Adler  wrote:
> >
> > On Tue, Nov 16, 2021 at 10:42 PM Kevin Ushey  wrote:
> > >
> > > Do you see this same hang in a build of R with debug symbols? Can you
> > > try running R with GDB, or even WinDbg or another debugger, to see
> > > what the call stack looks like when the hang occurs? Does the hang
> > > depend on the number of threads used by OpenBLAS?
> > >
> > > On the off chance it's relevant, I've seen hangs / crashes when using
> > > a multithreaded OpenBLAS with R on some Linux systems before, but
> > > never found the time to isolate a root cause.
> > >
> >
> >
> > This last was a good thought, Kevin, as I had just compiled OpenBLAS
> > 3.18 multi-threaded, but I recompiled it single threaded and it still
> > crashes. The version of R I built from source last time, (2021-05-20
> > r80347), does not hang when calling `dchisq(c(Inf, 1e80, 1e50, 1e40),
> > df=10, ncp=1)`. I think I built that with OpenBLAS 3.15. I can try
> > doing that here. As for building with debug symbols, I have never done
> > that befo