[R-pkg-devel] R package submission - too many threads error

2025-02-14 Thread Stephen Abrams
Hi - my submission was rejected with the following error in one of my
vignettes.

On Debian GNU/Linux trixie/sid:

Error: processing vignette 'modeling_with_binary_classifiers.Rmd'
failed with diagnostics:
24 simultaneous processes spawned

On Windows:

Error: processing vignette 'modeling_with_binary_classifiers.Rmd'
failed with diagnostics:
72 simultaneous processes spawned

I encountered a similar error while using R CMD check --as-cran on my local
(Windows) machine and solved it by using a suggestion from this thread:
https://stackoverflow.com/questions/50571325/r-cran-check-fail-when-using-parallel-functions

Specifically, checking for Sys.getenv("_R_CHECK_LIMIT_CORES_", "") and
forcibly bypassing the parallel processing capability in the package. Is
there a better way to do this? Should I just skip the vignette altogether
and try to resolve this after the package is accepted?
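
For reference, a minimal sketch of the environment-variable check described
above. The helper name choose_workers and its requested argument are
hypothetical and are not taken from spect; the sketch assumes that any
non-empty value other than "false" means the check is limiting cores, and it
caps the worker count at two in that case instead of disabling parallelism.

```r
# Hypothetical helper (not part of spect): cap the number of workers at 2
# when R CMD check signals a core limit via _R_CHECK_LIMIT_CORES_.
choose_workers <- function(requested = 2L) {
  chk <- Sys.getenv("_R_CHECK_LIMIT_CORES_", "")
  # Assumption: any non-empty value except "false" means cores are limited.
  limited <- nzchar(chk) && tolower(chk) != "false"
  if (limited) min(2L, requested) else requested
}

# Example call: ask for four workers, get at most two under a limited check.
n_workers <- choose_workers(requested = 4L)
```

Capping at two keeps the parallel code path exercised during the check, which
a full bypass would not do.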

For a little more detail, the package (called spect) uses the caret package
under the hood and takes advantage of parallel processing if the user
specifies it. The package is located at the github repo below. The bypass
occurs at line 378 of spect.R:
https://github.com/dawdawdo/spect

A secondary worry is that even if I resolve this, there might be something
else causing threads to spin up. How can I test for that when the error
doesn't trigger when I run R CMD check? I don't want to waste shared
resources if I can check it myself first.

Any help would be greatly appreciated. Thanks!


-- 
Stephen Abrams
Divergent Blue <http://www.divergentblue.com/>

[[alternative HTML version deleted]]

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] R package submission - too many threads error

2025-02-14 Thread Stephen Abrams
I appreciate the welcome! Also - I believe that replying to an email is the
way to respond here, but please let me know if that's not the case.

In any event - passing in a cluster context is an interesting idea. I will
think about that. Also, it seems that despite me telling myself to write
bug-free code, you have correctly identified that I don't actually make use
of the passed cores parameter - oops! This is where it would have really
helped me to have a peer reviewer. Thanks!

On Fri, Feb 14, 2025 at 3:48 PM Ivan Krylov wrote:

> Dear Stephen Abrams,
>
> Welcome to R-package-devel!
>
> On Thu, 13 Feb 2025 22:20:50 -0500,
> Stephen Abrams wrote:
>
> > A secondary worry is that even if I resolve this, there might be
> > something else causing threads to spin up.
>
> Instead of using detectCores() [*] and creating cluster objects
> yourself, how about letting the user provide a cluster object for you
> as a function argument? Yes, it takes slightly more typing for the user,
> but on the other hand it lets the user:
>
>  - choose the number of cores for themselves (currently the code seems
>    to be ignoring the 'cores' argument)
>  - distribute the computation over the network by connecting to the
>    machines they know about
>  - provide a completely custom, non-PSOCK cluster object that
>    'parallel' will nevertheless work with
>
> Since you're already using doParallel, maybe the right choice is to let
> the user call registerDoParallel()?
>
> Determining the right amount of parallelism in your code is a
> surprisingly hard problem. Especially on shared computers, a program
> naively deciding to use all (or 3/4 of all, or 1/2 of all) processors
> may end up working much worse than a purely sequential one [**].
>
> While rendering the vignette in a CRAN package, create a two-process
> cluster or set use_parallel = FALSE: CRAN needs the rest of the
> processors to check other packages in parallel with yours [***].
>
> Good luck!
>
> --
> Best regards,
> Ivan
>
> [*]
>
> https://github.com/dawdawdo/spect/blob/d48b002332f1a1c2d302afb28e08e7998f416200/R/spect.R#L388
>
> [**]
> https://mastodon.social/@henrikbengtsson/113835651303559942
>
> [***]
>
> http://contributor.r-project.org/cran-cookbook/code_issues.html#using-more-than-2-cores
>
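
A minimal sketch of the suggestion quoted above, assuming only the base
'parallel' package. The names square and square_all are hypothetical
stand-ins for the package's own work and are not taken from spect. The
function accepts a cluster object from the caller and falls back to
sequential execution when none is given; the caller (for example, a vignette)
creates a two-process cluster and is responsible for stopping it.

```r
library(parallel)

# Hypothetical worker, defined at top level so that only the function itself
# is shipped to the workers.
square <- function(v) v^2

# Hypothetical package function: use the caller's cluster if one is supplied,
# otherwise run sequentially.
square_all <- function(x, cl = NULL) {
  if (is.null(cl)) {
    lapply(x, square)
  } else {
    parLapply(cl, x, square)
  }
}

# In a vignette: a two-process cluster stays within the CRAN limit.
cl <- makeCluster(2)
res <- square_all(1:4, cl = cl)
stopCluster(cl)
```

With this design the package never calls detectCores() itself, so the amount
of parallelism is always the caller's decision.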


-- 
Stephen Abrams
Divergent Blue <http://www.divergentblue.com/>



[R-pkg-devel] How to resolve missing packages without Notes

2025-03-09 Thread Stephen Abrams
Hi again - I'm working through the final details of a submission. My
current problem is that I make use of a dependency of the caret package
(kernlab) in one of my vignettes. If I don't include kernlab in my
DESCRIPTION file, I get the following ERROR:

* checking re-building of vignette outputs ... [16s] ERROR
Error(s) in re-building vignettes:
--- re-building 'create_synthetic_data.Rmd' using rmarkdown
--- finished re-building 'create_synthetic_data.Rmd'

--- re-building 'modeling_with_binary_classifiers.Rmd' using rmarkdown

Quitting from lines 56-63 [unnamed-chunk-4]
(modeling_with_binary_classifiers.Rmd)
Error: processing vignette 'modeling_with_binary_classifiers.Rmd' failed
with diagnostics:
Required packages are missing: kernlab
--- failed re-building 'modeling_with_binary_classifiers.Rmd'

If I do include kernlab as a dependency in my DESCRIPTION file, I get the
following NOTE:

* checking dependencies in R code ... NOTE
Packages in Depends field not imported from:
  'kernlab' 'randomForest'
  These packages need to be imported from (in the NAMESPACE file)
  for when this namespace is loaded but not attached.

My NAMESPACE file is auto-generated by roxygen and actually says not to
edit it. I think the problem is the implicit dependency inside caret, but
I'm not sure how to solve it so the automated checks work. When I run this
on my own machine, it passes the --as-cran check. Any advice would be
greatly appreciated.

Thanks!


-- 
Stephen Abrams
Divergent Blue <http://www.divergentblue.com/>



Re: [R-pkg-devel] How to resolve missing packages without Notes

2025-03-09 Thread Stephen Abrams
Awesome - much appreciated. I think I'm missing the forest for the trees at
this point - thanks!

On Sun, Mar 9, 2025 at 3:12 PM Dirk Eddelbuettel wrote:

>
> On 9 March 2025 at 15:03, Stephen Abrams wrote:
> | Hi again - I'm working through the final details of a submission. My
> | current problem is that I make use of a dependency of the caret package
> | (kernlab) in one of my vignettes. If I don't include kernlab in my
> | DESCRIPTION file, I get the following ERROR:
> [...]
> | My NAMESPACE file is auto-generated by roxygen and actually says not to
>
> You need to edit DESCRIPTION (by hand). That is all that there is, and
> _Writing R Extensions_ is fairly clear about this. Here is a quote from
> Section 1.1.3
>
>   The ‘Suggests’ field uses the same syntax as ‘Depends’ and lists
>   packages that are not necessarily needed.  This includes packages used
>   only in examples, tests or vignettes (*note Writing package
>   vignettes::), and packages loaded in the body of functions.
>
> Dirk
>
> --
> dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
>
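
A minimal sketch of how the advice above might be applied, assuming kernlab
and randomForest are needed only by the vignette. The Suggests line is shown
as a comment because it belongs in DESCRIPTION, not in R code. The variable
has_suggested is a hypothetical name, and the guard is an optional addition
that skips chunk evaluation on machines where a suggested package is not
installed.

```r
# In DESCRIPTION (edited by hand), list vignette-only packages under Suggests
# instead of Depends, for example:
#   Suggests: kernlab, randomForest

# In the vignette's setup chunk: check that every suggested package is
# available before evaluating the remaining chunks.
has_suggested <- all(vapply(
  c("kernlab", "randomForest"),
  requireNamespace,
  logical(1),
  quietly = TRUE
))

# Only touch knitr options when knitr is available (it is when the vignette
# is being built with rmarkdown).
if (requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(eval = has_suggested)
}
```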


-- 
Stephen Abrams
Divergent Blue <http://www.divergentblue.com/>
