Re: [R-pkg-devel] Advice for addressing CRAN rejection

2025-05-14 Thread Ben Bolker
On Wed, May 14, 2025 at 12:46 PM Dirk Eddelbuettel  wrote:
>
>
> [ If you could, please set your email software up such that posts to a
>   mailing list are not signed; it makes reading them more cumbersome. It
>   also means I can't reply (easily) inline now. ]
>
> In most cases environment variables need to be set before the process
> starts. Setting them inside the running R process likely has no effect.

  That's what I thought, although it means then (I assume) that you
have to rely on whatever settings CRAN has determined for its testing
platforms.  The RhpcBLASctl package
<https://CRAN.R-project.org/package=RhpcBLASctl> can do some of these
settings within a running R process; I don't know if it does what you
need or not.
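
A minimal sketch of what the RhpcBLASctl suggestion might look like in a test setup. It assumes the package is installed (hence the requireNamespace() guard), and the variable name blas_threads is only illustrative. Whether a single thread is enough to make results bitwise reproducible depends on the BLAS/LAPACK build, so this is a starting point rather than a guarantee.

```r
# Ask the BLAS and OpenMP runtimes for a single thread, if RhpcBLASctl is available.
if (requireNamespace("RhpcBLASctl", quietly = TRUE)) {
  RhpcBLASctl::blas_set_num_threads(1)
  RhpcBLASctl::omp_set_num_threads(1)
  # Report what the BLAS library says after the request.
  blas_threads <- RhpcBLASctl::blas_get_num_procs()
} else {
  # Package not installed: nothing was changed.
  blas_threads <- NA_integer_
}
```

Calling this at the top of a test file keeps the setting local to the test process and does not touch the user's environment.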


>
> Dirk
>
> --
> dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] Advice for addressing CRAN rejection

2025-05-14 Thread smallepsilon
Ben,

No need to apologize. I hope the following example helps clarify what I mean. 
Suppose that modify_matrix(mat, other_args) is a function that, among other 
things, applies eigen() to mat. For good reasons, other_args has no default 
value. It is sometimes convenient, though, to supply the user with default 
values. Therefore, there is another function, convenient_modify_matrix():

convenient_modify_matrix <- function(mat)
  modify_matrix(mat, other_args = default)

To help verify that the code is correct, I want to check the result from 
identical() below:

set.seed(20250514)
A <- modify_matrix(mat, other_args = default)
set.seed(20250514)
B <- convenient_modify_matrix(mat)
identical(A, B)

In an ideal world, the result would be TRUE. The results can differ, though, 
because of multithreading (synonymous with parallel computation, yes?) used by 
the code underlying eigen(). As I understand it, this occurs when calling 
LAPACK/BLAS routines on some systems (e.g., MKL). Can that multithreading be 
turned off? Searching online shows that there is a lot of interest in turning 
it on; not so much in turning it off.

Thanks,
Jesse
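
The comparison described above can be sketched with stand-in functions. Everything here is hypothetical: modify_matrix() is assumed to compute a symmetric matrix power via eigen(), and the argument name power plays the role of other_args. The sketch records both the strict identical() check and a tolerance-based all.equal() check, since only the latter can be expected to hold on every BLAS/LAPACK configuration.

```r
# Hypothetical stand-in: symmetric matrix power computed through eigen().
modify_matrix <- function(mat, power) {
  e <- eigen(mat, symmetric = TRUE)
  e$vectors %*% diag(e$values^power, nrow = length(e$values)) %*% t(e$vectors)
}

# Hypothetical convenience wrapper that supplies a default for `power`.
convenient_modify_matrix <- function(mat) modify_matrix(mat, power = 0.5)

set.seed(20250514)
mat <- crossprod(matrix(rnorm(50), 10, 5))  # 5 x 5 symmetric positive definite

A <- modify_matrix(mat, power = 0.5)
B <- convenient_modify_matrix(mat)

# Strict check: may be FALSE where the linear algebra library is non-deterministic.
bitwise_same <- identical(A, B)

# Tolerance-based check: robust to last-bit differences.
numerically_same <- isTRUE(all.equal(A, B, tolerance = 1e-10))
```

A tight tolerance keeps the check close in spirit to identical() while tolerating rounding-level noise.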





On Tuesday, May 13th, 2025 at 8:07 PM, Ben Bolker  wrote:

> 
> 
> Hmm. The thread you linked to is specifically an issue with
> non-deterministic linear algebra, the solution to which is to disable
> threaded computations. I don't think CRAN multithreads by default (and I
> don't know if they test on MKL at all ...?)
> 
> Can you provide more specific/concrete examples of the tests? (Again,
> I apologize if there were examples posted up-thread -- I'm too lazy to
> search for them.) I'm not quite sure I understand your comment about
> 
> > Suppose, for example, that X is a symmetric, positive definite
> > matrix. Then identical() will usually distinguish between (X^1/2)^-1 and
> > (X^-1)^1/2 (the kind of thing I want to be able to check) while
> > all.equal() will generally not
> 
> 
> What is X^1/2? (There are infinitely many ways to take a matrix
> square root ...) Interpreting X^(1/2) as chol(X) and X^(-1) as solve(X),
> these are not even close:
> 
> > set.seed(101); m <- crossprod(matrix(rnorm(9), 3, 3))
> 
> > all.equal(solve(chol(m)), chol(solve(m)))
> 
> [1] "Mean relative difference: 0.6655765"
> 
> 
> In general "convenience shortcuts" that do any kind of rearranging of
> a floating point computation cannot be guaranteed to be identical;
> this is a corollary of
> 
> https://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-doesn_0027t-R-think-these-numbers-are-equal_003f
> 
> See also
> https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal/9508558#9508558
> 
> (e.g., floating-point addition is not associative)
> 
> I apologize if this sounds basic/is telling you something you already
> know, but from what I can understand of your questions so far, you are
> asking for something that is not possible in general.
> 
> Can you clarify further please?
> 
> cheers
> Ben Bolker
> 
> 
> 
> On 5/13/25 15:08, smallepsilon wrote:
> 
> > Ben,
> > 
> > The thread to which I alluded is here: 
> > https://stat.ethz.ch/pipermail/r-help/2025-May/480866.html
> > 
> > Further clarification: The package provides some convenience shortcuts for 
> > the user which should run the same calculations as their longer 
> > counterparts. I want to use identical() to provide strong evidence that 
> > this is happening. Suppose, for example, that X is a symmetric, positive 
> > definite matrix. Then identical() will usually distinguish between 
> > (X^1/2)^-1 and (X^-1)^1/2 (the kind of thing I want to be able to check) 
> > while all.equal() will generally not (unless I set the tolerance 
> > sufficiently low, but that is just making all.equal() behave more like 
> > identical()). Using all.equal() helps detect catastrophic errors, but those 
> > would be detected in other tests already.
> > 
> > Thanks,
> > 
> > Jesse
> > 
> > On Tuesday, May 13th, 2025 at 1:41 PM, Ben Bolker bbol...@gmail.com wrote:
> > 
> > > Can you please clarify (maybe by linking back to an earlier thread, don't 
> > > remember if you discussed this previously) what you mean by "I realized 
> > > that because all.equal() does not test (even as a proxy) that the same 
> > > calculations were done"?
> > > 
> > > On Tue, May 13, 2025, 1:05 PM smallepsilon smallepsi...@proton.me wrote:
> > > 
> > > > I have been trying to fix some issues with my package's testing on 
> > > > CRAN, which culminated in a rejection email from a CRAN administrator 
> > >

Re: [R-pkg-devel] Advice for addressing CRAN rejection

2025-05-14 Thread smallepsilon
On Wednesday, May 14th, 2025 at 10:51 AM, Tim Taylor wrote:

> I'd just use all.equal but I think you could just check the call is 
> constructed correctly, e.g.
> 
> convenient_modify_matrix <- function(mat) modify_matrix(mat, other_args = 
> default)
> 
> identical(
> body(convenient_modify_matrix),
> call("modify_matrix", quote(mat), other_args = quote(default))
> )

I wondered whether something such as that is possible, but unfortunately, in 
the actual package, the calls might be different. For example, a function might 
use an intermediate result saved in one object so the earlier calculations need 
not be repeated. The calculations in both cases should be exactly the same, 
though.



Re: [R-pkg-devel] Advice for addressing CRAN rejection

2025-05-14 Thread smallepsilon






On Wednesday, May 14th, 2025 at 10:33 AM, Dirk Eddelbuettel wrote:

> 
> 
> Section 'A.3.1.3 Intel MKL' of the R Installation and Administration manual
> covers that for the MKL case (and general OpenMP cases)
> 
> The default number of threads will be chosen by the OpenMP software, but
> can be controlled by setting ‘OMP_NUM_THREADS’ or ‘MKL_NUM_THREADS’, and
> in recent versions seems to default to a sensible value for sole use of
> the machine.

Are these options to be set using Sys.setenv()? I tried that on a Linux 
machine, and it did not prevent the problem. I am not sure that is actually how 
these settings are controlled, though. Specifically,

Sys.setenv(MKL_NUM_THREADS = 1)
n <- 50
set.seed(20250504)
Sigma <- rWishart(1, df = n, Sigma = diag(n))[,,1]
e1 <- eigen(Sigma)
e2 <- eigen(Sigma)
identical(e1, e2)

results in FALSE. I am not sure whether the age of the hardware is relevant, 
but it is about 14 years old. If anyone tests this and generally gets TRUE, 
what happens when n is increased?
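
To see how large the discrepancy is when identical() returns FALSE, the experiment above can be extended as in the following sketch. The names bitwise_equal, max_abs_diff and values_close are illustrative. Only the eigenvalues are compared numerically, on the assumption that any run-to-run differences are at rounding level.

```r
n <- 50
set.seed(20250504)
Sigma <- rWishart(1, df = n, Sigma = diag(n))[, , 1]

# Two decompositions of exactly the same input.
e1 <- eigen(Sigma)
e2 <- eigen(Sigma)

# Strict comparison of the full results (values and vectors).
bitwise_equal <- identical(e1, e2)

# Size of the largest eigenvalue discrepancy, and a tolerance-based check.
max_abs_diff <- max(abs(e1$values - e2$values))
values_close <- isTRUE(all.equal(e1$values, e2$values))
```

If bitwise_equal is FALSE while max_abs_diff is tiny, the library is non-deterministic at the level of rounding error rather than producing a genuinely different decomposition.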

> 
> The entire section makes for good reading; it mixes 'how to install MKL' with
> 'how to use MKL' and touches upon the parallelism issue you have here.
> 

A minor point, which might be relevant: the discussion I see in A.3 is about 
results differing across implementations, not within a given implementation.

> Dirk
> 
> --
> dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



Re: [R-pkg-devel] Advice for addressing CRAN rejection

2025-05-14 Thread Dirk Eddelbuettel


[ If you could, please set your email software up such that posts to a
  mailing list are not signed; it makes reading them more cumbersome. It
  also means I can't reply (easily) inline now. ]

In most cases environment variables need to be set before the process
starts. Setting them inside the running R process likely has no effect.

Dirk

-- 
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
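
One way to act on the advice above from inside R is to start a fresh child process with the variables already set. The following is a sketch only: it uses base R's system2() with its env argument, and the child merely reports what it sees. The names rscript, child_code and child_output are illustrative.

```r
# Rscript binary belonging to the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Code run by the child: print the thread settings it inherited.
child_code <- "cat(Sys.getenv(\"OMP_NUM_THREADS\"), Sys.getenv(\"MKL_NUM_THREADS\"))"

# The variables are set for the child before it starts, not in this session.
child_output <- system2(
  rscript,
  args = c("--vanilla", "-e", shQuote(child_code)),
  env = c("OMP_NUM_THREADS=1", "MKL_NUM_THREADS=1"),
  stdout = TRUE
)
```

In a real test, child_code would be replaced by the computation whose reproducibility is in question. This does not help on CRAN's machines, where the check process is launched by CRAN.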



Re: [R-pkg-devel] Advice for addressing CRAN rejection

2025-05-14 Thread Tim Taylor
On Wed, 14 May 2025, at 4:18 PM, smallepsilon wrote:

> No need to apologize. I hope the following example helps clarify what I 
> mean. Suppose that modify_matrix(mat, other_args) is a function that, 
> among other things, applies eigen() to mat. For good reasons, 
> other_args has no default value. It is sometimes convenient, though, to 
> supply the user with default values. Therefore, there is another 
> function, convenient_modify_matrix():
>

I'd just use all.equal but I *think* you could just check the call is 
constructed correctly, e.g.

convenient_modify_matrix <- function(mat) modify_matrix(mat, other_args = 
default)

identical(
  body(convenient_modify_matrix),
  call("modify_matrix", quote(mat), other_args = quote(default))
)



Re: [R-pkg-devel] Advice for addressing CRAN rejection

2025-05-14 Thread Dirk Eddelbuettel


On 14 May 2025 at 15:18, smallepsilon wrote:
| Ben,
| 
| No need to apologize. I hope the following example helps clarify what I mean.
| Suppose that modify_matrix(mat, other_args) is a function that, among other
| things, applies eigen() to mat. For good reasons, other_args has no default
| value. It is sometimes convenient, though, to supply the user with default
| values. Therefore, there is another function, convenient_modify_matrix():
|
| convenient_modify_matrix <- function(mat)
|   modify_matrix(mat, other_args = default)
|
| To help verify that the code is correct, I want to check the result from
| identical() below:
|
| set.seed(20250514)
| A <- modify_matrix(mat, other_args = default)
| set.seed(20250514)
| B <- convenient_modify_matrix(mat)
| identical(A, B)
|
| In an ideal world, the result would be TRUE. The results can differ, though,
| because of multithreading (synonymous with parallel computation, yes?) used by
| the code underlying eigen(). As I understand it, this occurs when calling
| LAPACK/BLAS routines on some systems (e.g., MKL). Can that multithreading be
| turned off? Searching online shows that there is a lot of interest in turning
| it on; not so much in turning it off.

Section 'A.3.1.3 Intel MKL' of the R Installation and Administration manual
covers that for the MKL case (and general OpenMP cases)

   The default number of threads will be chosen by the OpenMP software, but
   can be controlled by setting ‘OMP_NUM_THREADS’ or ‘MKL_NUM_THREADS’, and
   in recent versions seems to default to a sensible value for sole use of
   the machine.

The entire section makes for good reading; it mixes 'how to install MKL' with
'how to use MKL' and touches upon the parallelism issue you have here.

Dirk

-- 
dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



Re: [R-pkg-devel] Advice for addressing CRAN rejection

2025-05-14 Thread smallepsilon
On Wednesday, May 14th, 2025 at 12:59 PM, Ben Bolker  wrote:

> 
> 
> On Wed, May 14, 2025 at 12:46 PM Dirk Eddelbuettel e...@debian.org wrote:
> 
> > [ If you could, please set your email software up such that posts to a
> > mailing list are not signed; it makes reading them more cumbersome. It
> > also means I can't reply (easily) inline now. ]
> > 
> > In most cases environment variables need to be set before the process
> > starts. Setting them inside the running R process likely has no effect.
> 
> 
> That's what I thought, although it means then (I assume) that you
> have to rely on whatever settings CRAN has determined for its testing
> platforms. The RhpcBLASctl package
> https://CRAN.R-project.org/package=RhpcBLASctl can do some of these
> settings within a running R process, I don't know if it does what you
> need or not.

Thanks for the package reference. The description is promising, but using it 
still resulted in non-identical objects.

All of this leads me to think that the lack of control over the CRAN platforms 
and the difficulty (if not impossibility) of testing what I want to test mean 
that it is reasonable to remove my attempts to test it from the package that I 
upload to CRAN. Thoughts?



Re: [R-pkg-devel] Advice for addressing CRAN rejection

2025-05-14 Thread smallepsilon
On Wednesday, May 14th, 2025 at 2:51 PM, Ivan Krylov wrote:

> > My submission was rejected, not because of test failures, but because
> > I had "removed the failing tests which is not the idea of tests." No
> > errors/warnings/notes were reported to me.
> 
> 
> Try measuring the test coverage of your package before and after
> removing the extraneous tests (the 'covr' package should do the job).
> Perhaps you may see a cheap way to increase the coverage. Offer the
> increased coverage as an additional argument when resubmitting. After
> all, tests are code and thus may also contain bugs, and sometimes it's
> the right choice to remove the buggy code.
> 
> --
> Best regards,
> Ivan

Thanks - I have not used that package before now. It is currently running.



Re: [R-pkg-devel] Advice for addressing CRAN rejection

2025-05-14 Thread Ivan Krylov via R-package-devel
On Tue, 13 May 2025 17:05:14 +
smallepsilon wrote:

> In many package tests, I want to verify that two ways of specifying
> something lead to the execution of exactly the same calculations.

Would you like to try tracing the linear algebra calculations
themselves (together with their arguments) instead of trying to trace
their results? I think it should be possible using the 'mockery'
package or even by hand (after a lot of effort) using base::trace().
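
A by-hand sketch of the idea of comparing the inputs to the linear algebra call rather than its outputs. Note the swap: instead of 'mockery' or base::trace(), it simply shadows eigen() with a recording wrapper in the environment where the (hypothetical) functions under test are defined. Inside a package, the binding would have to be replaced in the package namespace, which is what the mocking tools are for. All function and variable names are illustrative.

```r
# Hypothetical functions under test.
modify_matrix <- function(mat, power) {
  e <- eigen(mat, symmetric = TRUE)
  e$vectors %*% diag(e$values^power, nrow = length(e$values)) %*% t(e$vectors)
}
convenient_modify_matrix <- function(mat) modify_matrix(mat, power = 0.5)

# Recording wrapper: logs its arguments, then defers to the real eigen().
eigen_inputs <- list()
eigen <- function(x, ...) {
  eigen_inputs[[length(eigen_inputs) + 1L]] <<- list(x = x, args = list(...))
  base::eigen(x, ...)
}

set.seed(20250514)
mat <- crossprod(matrix(rnorm(50), 10, 5))
A <- modify_matrix(mat, power = 0.5)
B <- convenient_modify_matrix(mat)

# Remove the wrapper so later code sees base::eigen() again.
rm(eigen)

# The inputs are plain copies of the same data, so identical() is safe here.
same_inputs <- identical(eigen_inputs[[1]], eigen_inputs[[2]])
```

Because the recorded inputs never pass through the BLAS/LAPACK routines, this check is unaffected by threading, while still showing that both code paths ask for the same decomposition.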

> My submission was rejected, not because of test failures, but because
> I had "removed the failing tests which is not the idea of tests." No
> errors/warnings/notes were reported to me.

Try measuring the test coverage of your package before and after
removing the extraneous tests (the 'covr' package should do the job).
Perhaps you may see a cheap way to increase the coverage. Offer the
increased coverage as an additional argument when resubmitting. After
all, tests are code and thus may also contain bugs, and sometimes it's
the right choice to remove the buggy code.

-- 
Best regards,
Ivan
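
A minimal sketch of the coverage measurement suggested above, assuming the 'covr' package is installed and the working directory is the package source root (both are checked before anything runs). The variable name coverage_percent is illustrative. Running it once before and once after removing the tests gives the two figures to compare.

```r
# Stays NA unless covr is available and we are in a package directory.
coverage_percent <- NA_real_

if (requireNamespace("covr", quietly = TRUE) && file.exists("DESCRIPTION")) {
  # Run the package's tests and record which lines they execute.
  cov <- covr::package_coverage(path = ".", type = "tests")
  # Overall percentage of covered lines.
  coverage_percent <- covr::percent_coverage(cov)
}
```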



Re: [R-pkg-devel] Advice for addressing CRAN rejection

2025-05-14 Thread Tim Taylor



> On 14 May 2025, at 02:21, Dirk Eddelbuettel  wrote:
> 
…
> There is a higher-level README linked from CRAN package pages but I can't
> find it now :-/

https://cran.r-project.org/web/checks/check_issue_kinds.html I believe 

Tim