Re: [Rd] importing namespaces from base packages
On Mon, Mar 12, 2018 at 2:18 PM, Martin Maechler wrote:

> [...]
> Is that so? Not according to my reading of the 'Writing R
> Extensions' manual, nor according to what I have been doing in
> all of my packages for ca. 2 years:
>
> The rule I have in my mind is
>
>   1) NAMESPACE  Import(s|From)  \
>                                  <==>  DESCRIPTION -> 'Imports:'
>   2) .. using "::" in R code    /
>
> If you really found that you did not have to import from, say,
> 'utils', I think this was an *un*lucky coincidence.

Of course, the importFrom() is mandatory in NAMESPACE; otherwise the package does not pass the checks. The question is about the relation between the packages mentioned in the NAMESPACE and the packages mentioned in the Imports: field of DESCRIPTION.

For instance, the current version 3.1 of package QCA on CRAN mentions in the DESCRIPTION:

Imports: venn (≥ 1.2), shiny, methods, fastdigest

while the NAMESPACE file has:

import(shiny)
import(venn)
import(fastdigest)
importFrom("utils", "packageDescription", "remove.packages", "capture.output")
importFrom("stats", "glm", "predict", "quasibinomial", "binom.test", "cutree", "dist", "hclust", "na.omit", "dbinom", "setNames")
importFrom("grDevices", "dev.cur", "dev.new", "dev.list")
importFrom("graphics", "abline", "axis", "box", "mtext", "par", "title", "text")
importFrom("methods", "is")

There are functions from packages utils, stats, grDevices and graphics for which the R checks do not require a specific entry in the Imports: field. I suspect this is because all of these packages are part of base R, but so is package methods. The question is why it is not mandatory for those packages to be mentioned in the Imports: field of DESCRIPTION, while removing package methods from that field produces an error, even though the package is kept in the NAMESPACE's importFrom().

> [...]
> There are places in the R source where it is treated specially,
> indeed; part of 'methods' may be needed when it is neither
> loaded nor attached (e.g., when R runs with only base, say, and
> suddenly encounters an S4 object), and there still are
> situations where 'methods' needs to be in the search() path and
> not just loaded, but these cases should be unrelated to the
> above DESCRIPTION-Imports vs NAMESPACE-Imports correspondence.

This is what I had expected myself, so the behavior above must have another explanation. It is just a curiosity; there is naturally nothing wrong with keeping package methods in the Imports: field. It is only odd that some base R packages are treated differently from other base R packages at the package-check stage.

Thank you,
Adrian

--
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
Soseaua Panduri nr. 90-92
050663 Bucharest sector 5
Romania
https://adriandusa.eu

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] [parallel] fixes load balancing of parLapplyLB
Chunk size support has been added in R-devel 74353. Please let me know if you find any problems.

Thanks,
Tomas

On 03/01/2018 09:19 AM, Christian Krause wrote:

Dear Tomas,

Thanks for your commitment to fix this issue and also to add the chunk size as an argument. If you want our input, let us know ;)

Best Regards

On 02/26/2018 04:01 PM, Tomas Kalibera wrote:

Dear Christian and Henrik,

Thank you for spotting the problem and for your suggestions for a fix. We will probably add a chunk.size argument to parLapplyLB and parLapply to follow OpenMP terminology, which has already been an inspiration for the present code: parLapply already implements static scheduling via the internal function staticClusterApply, yet with a fixed chunk size; parLapplyLB already implements dynamic scheduling via the internal function dynamicClusterApply, but with a fixed chunk size set to an unlucky value, so that it behaves like static scheduling. The default chunk size for parLapplyLB will be set so that there is some dynamism in the schedule even by default. I am now testing a patch with these changes.

Best
Tomas

On 02/20/2018 11:45 AM, Christian Krause wrote:

Dear Henrik,

The rationale is just that it lies between these extremes and that it is really simple to calculate, without making any assumptions and knowing that it won't be perfect.

The extremes A and B you mention are special cases based on assumptions. Case A is based on the assumption that the function has a long or varying runtime; then you are likely to get the best load balancing with really small chunks. Case B is based on the assumption that the function runtime is the same for each list element, i.e. you don't actually need load balancing and can just use `parLapply` without it.

This new default is **not the best one**; it is just better than the one we had before. There is no best default we can use, because **we don't know the function runtime and how it varies**.
The user needs to decide, because only he/she knows the function. As mentioned before, I will write a patch that makes the chunk size an optional argument, so that the user, who alone has all the information needed to choose the best chunk size, can decide, just like you did with the `future.scheduling` parameter.

Best Regards

On February 19, 2018 10:11:04 PM GMT+01:00, Henrik Bengtsson wrote:

Hi,

I'm trying to understand the rationale for your proposed amount of splitting, and more precisely why that one is THE one. If I put labels on the example numbers from one of your previous posts:

nbrOfElements <- 97
nbrOfWorkers <- 5

With these, there are two extremes in how you can split up the processing into chunks such that all workers are utilized:

(A) Each worker, called multiple times, processes one element each time:

nbrOfElements <- 97
nbrOfWorkers <- 5
nbrOfChunks <- nbrOfElements
sapply(parallel:::splitList(1:nbrOfElements, nbrOfChunks), length)
 [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[30] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[59] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[88] 1 1 1 1 1 1 1 1 1 1

(B) Each worker, called once, processes multiple elements:

nbrOfElements <- 97
nbrOfWorkers <- 5
nbrOfChunks <- nbrOfWorkers
sapply(parallel:::splitList(1:nbrOfElements, nbrOfChunks), length)
[1] 20 19 19 19 20

I understand that neither of these two extremes may be the best when it comes to orchestration overhead and load balancing. Instead, the best might be somewhere in between, e.g.

(C) Each worker, called multiple times, processes multiple elements:

nbrOfElements <- 97
nbrOfWorkers <- 5
nbrOfChunks <- nbrOfElements / nbrOfWorkers
sapply(parallel:::splitList(1:nbrOfElements, nbrOfChunks), length)
[1] 5 5 5 5 4 5 5 5 5 5 4 5 5 5 5 4 5 5 5 5

However, there are multiple alternatives between the two extremes, e.g.
nbrOfChunks <- scale * nbrOfElements / nbrOfWorkers

So, is there a reason why you argue for scale = 1.0 being the optimum?

FYI, in future.apply::future_lapply(X, FUN, ...) there is a 'future.scheduling' scale factor (*) argument, where the default future.scheduling = 1 corresponds to (B) and future.scheduling = +Inf corresponds to (A). Using future.scheduling = 4 achieves the amount of load balancing you propose in (C).

(*) Different definition from the above 'scale'. (Disclaimer: I'm the author)

/Henrik

On Mon, Feb 19, 2018 at 10:21 AM, Christian Krause wrote:

Dear R-Devel List,

I have installed R 3.4.3 with the patch applied on our cluster and ran a *real-world* job of one of our users to confirm that the patch works to my satisfaction. Here are the results.

The original was a series of jobs, all essentially doing the same thing with bootstrapped data, so for the original there is more data and I show the arithmetic mean with standard deviation. The confirmation with the patched R was only a single instance of that series of jobs.

## Job Efficiency

The job efficiency is
Re: [Rd] importing namespaces from base packages
> Adrian Dușa
>     on Tue, 13 Mar 2018 09:17:08 +0200 writes:

> On Mon, Mar 12, 2018 at 2:18 PM, Martin Maechler wrote:
>> [...]
>> Is that so? Not according to my reading of the 'Writing R
>> Extensions' manual, nor according to what I have been doing in
>> all of my packages for ca. 2 years:
>>
>> The rule I have in my mind is
>>
>>   1) NAMESPACE  Import(s|From)  \
>>                                  <==>  DESCRIPTION -> 'Imports:'
>>   2) .. using "::" in R code    /
>>
>> If you really found that you did not have to import from, say,
>> 'utils', I think this was an *un*lucky coincidence.

> Of course, the importFrom() is mandatory in NAMESPACE otherwise the package
> does not pass the checks.
> The question was related to the relation between the packages mentioned in
> the NAMESPACE and the packages mentioned in the Imports: field of
> DESCRIPTION.
> For instance, the current version 3.1 of package QCA on CRAN mentions in
> the DESCRIPTION:
> Imports: venn (≥ 1.2), shiny, methods, fastdigest
> while the NAMESPACE file has:
> import(shiny)
> import(venn)
> import(fastdigest)
> importFrom("utils", "packageDescription", "remove.packages",
>   "capture.output")
> importFrom("stats", "glm", "predict", "quasibinomial", "binom.test",
>   "cutree", "dist", "hclust", "na.omit", "dbinom", "setNames")
> importFrom("grDevices", "dev.cur", "dev.new", "dev.list")
> importFrom("graphics", "abline", "axis", "box", "mtext", "par", "title",
>   "text")
> importFrom("methods", "is")
> There are functions from packages utils, stats, grDevices and graphics for
> which the R checks do not require a specific entry in the Imports: field.
> I suspect because all of these packages are part of base R, but so is
> package methods. The question is why it is not mandatory for those packages
> to be mentioned in the Imports: field of DESCRIPTION, while removing
> package methods from that field produces an error, despite maintaining the
> package in the NAMESPACE's importFrom().
Thank you, Adrian, for the clarification of your question.

As a matter of fact, I was not aware of what you showed above, and personally I think I do add every package/namespace mentioned in NAMESPACE to the DESCRIPTION's "Imports:" field.

AFAIK the above phenomenon is not documented, and rather the docs would imply that it might go away. I for one would vote for more consistency here ..

Martin

>> [...]
>> There are places in the R source where it is treated specially,
>> indeed; part of 'methods' may be needed when it is neither
>> loaded nor attached (e.g., when R runs with only base, say, and
>> suddenly encounters an S4 object), and there still are
>> situations where 'methods' needs to be in the search() path and
>> not just loaded, but these cases should be unrelated to the
>> above DESCRIPTION-Imports vs NAMESPACE-Imports correspondence.

> This is what I had expected myself, then the above behavior has to have
> another explanation.
> It is just a curiosity, there is naturally nothing wrong with maintaining
> package methods in the Imports: field. Only odd why some base R packages
> are treated differently than other base R packages, at the package checks
> stage.
> Thank you,
> Adrian
Re: [Rd] importing namespaces from base packages
It seems that they are defined in tools/R/check.R. For instance, lines 363-364 say:

## The default set of packages here are as they are because
## .get_S3_generics_as_seen_from_package needs utils,graphics,stats

and then on line 368 (Windows) and line 377 (other OSes) it has:

"R_DEFAULT_PACKAGES=utils,grDevices,graphics,stats"

So these pass R CMD check and are an "industrial standard". Changing this would break half of the CRAN packages.

Cheers, Jari Oksanen

On 13/03/18 13:47, Martin Maechler wrote:

> Adrian Dușa
>     on Tue, 13 Mar 2018 09:17:08 +0200 writes:
> [...]

Thank you, Adrian, for the clarification of your question. As a matter of fact, I was not aware of what you showed above, and personally I think I do add every package/namespace mentioned in NAMESPACE to the DESCRIPTION's "Imports:" field. AFAIK the above phenomenon is not documented, and rather the docs would imply that it might go away. I for one would vote for more consistency here ..

Martin

> [...]
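To make the correspondence discussed in this thread concrete, here is a minimal, hypothetical DESCRIPTION/NAMESPACE pair (the field contents are invented for illustration) showing the asymmetry: methods must appear in Imports:, while the default packages currently slip through the check.

```
## DESCRIPTION (excerpt)
Imports: methods

## NAMESPACE
importFrom("methods", "is")       # must have a matching Imports: entry,
                                  # or R CMD check errors
importFrom("stats", "setNames")   # currently passes the check without a
                                  # matching Imports: entry, because stats
                                  # is listed in R_DEFAULT_PACKAGES
```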
[Rd] 64-bit integer type warning on windows
Dear list,

During the last two months, I have spent a lot of time trying to remove the following warning from my package randtoolbox:

congruRand.c:180:3: warning: ISO C does not support the 'I64' ms_scanf length modifier [-Wformat=]
  sscanf(params[0], "%" PRIu64 "\n", &inp_mod);

Please see https://www.r-project.org/nosvn/R.check/r-devel-windows-ix86+x86_64/randtoolbox-00install.html

The warning is raised by the -pedantic option added when the package is checked by R-devel, and is due to the use of the macro PRIu64. Note that the warning does not show up when I use PRIu64 with printf via Rprintf in other parts of congruRand.c.

My first instinct, to look at https://msdn.microsoft.com/fr-fr/library/f9ezhzhs(v=vs.110).aspx, was useless, because R on Windows uses the MinGW-w64 integer types and not the MS versions. I then looked for guidance in sources such as Write Portable Code by Brian Hook and the C11 standard (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf). They recommend using the macro SCNu64 (reserved for the scanf family of functions) rather than PRIu64, which is reserved for the printf family. Unfortunately, when I check the package with R-devel, I still get the warning.

On GitHub, there is a version of the mingw-w64 sources: https://github.com/Alexpux/mingw-w64/ . It is explicitly written that SCNu64 is I64u; see line 210 of https://github.com/Alexpux/mingw-w64/blob/master/mingw-w64-headers/crt/inttypes.h

So, I was wondering whether this warning can be removed at all. Has anyone else encountered this issue? Any help is appreciated.

Thanks in advance,
Christophe

PS: the latest version of the randtoolbox source code is here: https://r-forge.r-project.org/scm/viewvc.php/pkg/randtoolbox/?root=rmetrics

Christophe Dutang
CEREMADE, Univ. Paris Dauphine, France
Web: http://dutangc.free.fr