Re: [Rd] importing namespaces from base packages

2018-03-13 Thread Adrian Dușa
On Mon, Mar 12, 2018 at 2:18 PM, Martin Maechler wrote:
> [...]
> Is that so?   Not according to my reading of the 'Writing R
> Extensions' manual, nor according to what I have been doing in
> all of my packages for ca. 2 years:
>
> The rule I have in my mind is
>
>  1) NAMESPACE Import(s|From) \
>   <==>  DESCRIPTION -> 'Imports:'
>  2) .. using "::" in  R code /
>
>
> If you really found that you did not have to import from say
> 'utils', I think this was an *un*lucky coincidence.

Of course, importFrom() is mandatory in the NAMESPACE; otherwise the package
does not pass the checks.
The question was about the relationship between the packages mentioned in the
NAMESPACE and those mentioned in the Imports: field of the DESCRIPTION.

For instance, the current version 3.1 of package QCA on CRAN mentions in
the DESCRIPTION:

Imports: venn (>= 1.2), shiny, methods, fastdigest

while the NAMESPACE file has:

import(shiny)
import(venn)
import(fastdigest)
importFrom("utils", "packageDescription", "remove.packages",
"capture.output")
importFrom("stats", "glm", "predict", "quasibinomial", "binom.test",
"cutree", "dist", "hclust", "na.omit", "dbinom", "setNames")
importFrom("grDevices", "dev.cur", "dev.new", "dev.list")
importFrom("graphics", "abline", "axis", "box", "mtext", "par", "title",
"text")
importFrom("methods", "is")

There are functions from the packages utils, stats, grDevices and graphics
for which the R checks do not require a specific entry in the Imports: field.
I suspect this is because all of these packages are part of base R, but so is
the methods package. The question is why it is not mandatory for those
packages to be mentioned in the Imports: field of the DESCRIPTION, while
removing the methods package from that field produces an error, despite
keeping the package in the NAMESPACE's importFrom().
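
To make the comparison concrete, here is a small illustrative sketch (the
object names are arbitrary and the version-constraint stripping is
deliberately simplistic; it only assumes the package is installed):

## compare the declared Imports: field in DESCRIPTION with the namespaces
## actually imported through the NAMESPACE
pkg <- "QCA"

declared <- utils::packageDescription(pkg)$Imports
declared <- trimws(gsub("\\([^)]*\\)", "", strsplit(declared, ",")[[1]]))

actual <- unique(names(getNamespaceImports(pkg)))

## namespaces imported in NAMESPACE but not declared in DESCRIPTION
setdiff(actual, c(declared, "base"))
## for QCA 3.1 this should list utils, stats, grDevices and graphics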



> [...]
> There are places in the R source where it is treated specially,
> indeed, part of 'methods' may be needed when it is neither
> loaded nor attached (e.g., when R runs with only base, say, and
> suddenly encounters an S4 object), and there still are
> situations where 'methods' needs to be in the search() path and
> not just loaded, but these cases should be unrelated to the
> above DESCRIPTION-Imports vs NAMESPACE-Imports correspondence.

This is what I had expected myself, so the above behavior must have another
explanation.
It is just a curiosity; there is naturally nothing wrong with keeping the
methods package in the Imports: field. It is only odd that some base R
packages are treated differently from other base R packages at the
package-check stage.

Thank you,
Adrian

--
Adrian Dusa
University of Bucharest
Romanian Social Data Archive
Soseaua Panduri nr. 90-92
050663 Bucharest sector 5
Romania
https://adriandusa.eu



Re: [Rd] [parallel] fixes load balancing of parLapplyLB

2018-03-13 Thread Tomas Kalibera


Chunk size support has been added in R-devel 74353. Please let me know 
if you find any problem.
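
For reference, a minimal usage sketch (worker count, toy function and chunk
size are arbitrary; see ?parLapplyLB in a current R-devel build for the exact
signature and default of chunk.size):

library(parallel)

cl <- makeCluster(4)
x  <- 1:97

## explicitly request small chunks so that a few slow elements cannot
## pin an entire large chunk to a single worker
res <- parLapplyLB(cl, x, function(i) { Sys.sleep(runif(1, 0, 0.01)); i^2 },
                   chunk.size = 5)

stopCluster(cl)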


Thanks,
Tomas

On 03/01/2018 09:19 AM, Christian Krause wrote:

Dear Tomas,

Thanks for your commitment to fix this issue and also to add the chunk size as 
an argument. If you want our input, let us know ;)

Best Regards

On 02/26/2018 04:01 PM, Tomas Kalibera wrote:

Dear Christian and Henrik,

thank you for spotting the problem and for your suggestions for a fix. We'll probably 
add a chunk.size argument to parLapplyLB and parLapply to follow OpenMP 
terminology, which has already been an inspiration for the present code 
(parLapply already implements static scheduling via internal function 
staticClusterApply, yet with a fixed chunk size; parLapplyLB already implements 
dynamic scheduling via internal function dynamicClusterApply, but with a fixed 
chunk size set to an unlucky value so that it behaves like static scheduling). 
The default chunk size for parLapplyLB will be set so that there is some 
dynamism in the schedule even by default. I am now testing a patch with these 
changes.

Best
Tomas


On 02/20/2018 11:45 AM, Christian Krause wrote:

Dear Henrik,

The rationale is just that it is within these extremes and that it is really 
simple to calculate, without making any assumptions and knowing that it won't 
be perfect.

The extremes A and B you are mentioning are special cases based on
assumptions. Case A is based on the assumption that the function has a long or
varying runtime; in that case you are likely to get the best load balancing
with really small chunks. Case B is based on the assumption that the function
runtime is the same for each list element, i.e. the case where you don't
actually need load balancing and could just use `parLapply` instead.

This new default is **not the best one**. It's just a better one than we had 
before. There is no best one we can use as default because **we don't know the 
function runtime and how it varies**. The user needs to decide that because 
he/she knows the function. As mentioned before, I will write a patch that makes 
the chunk size an optional argument, so the user can decide because only he/she 
has all the information to choose the best chunk size, just like you did with 
the `future.scheduling` parameter.

Best Regards

On February 19, 2018 10:11:04 PM GMT+01:00, Henrik Bengtsson wrote:

Hi, I'm trying to understand the rationale for your proposed amount of
splitting and more precisely why that one is THE one.

If I put labels on your example numbers in one of your previous post:

nbrOfElements <- 97
nbrOfWorkers <- 5

With these, there are two extremes in how you can split up the
processing in chunks such that all workers are utilized:

(A) Each worker, called multiple times, processes one element each
time:


nbrOfElements <- 97
nbrOfWorkers <- 5
nbrOfChunks <- nbrOfElements
sapply(parallel:::splitList(1:nbrOfElements, nbrOfChunks), length)

[1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[30] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[59] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
[88] 1 1 1 1 1 1 1 1 1 1


(B) Each worker, called once, processes multiple elements:


nbrOfElements <- 97
nbrOfWorkers <- 5
nbrOfChunks <- nbrOfWorkers
sapply(parallel:::splitList(1:nbrOfElements, nbrOfChunks), length)

[1] 20 19 19 19 20

I understand that neither of these two extremes may be the best when
it comes to orchestration overhead and load balancing. Instead, the
best might be somewhere in-between, e.g.

(C) Each worker, called multiple times, processing multiple elements:


nbrOfElements <- 97
nbrOfWorkers <- 5
nbrOfChunks <- nbrOfElements / nbrOfWorkers
sapply(parallel:::splitList(1:nbrOfElements, nbrOfChunks), length)

[1] 5 5 5 5 4 5 5 5 5 5 4 5 5 5 5 4 5 5 5 5

However, there are multiple alternatives between the two extremes, e.g.


nbrOfChunks <- scale * nbrOfElements / nbrOfWorkers

So, is there a reason why you argue that scale = 1.0 is the optimal choice?

FYI, in future.apply::future_lapply(X, FUN, ...) there is a
'future.scheduling' scale factor(*) argument where default
future.scheduling = 1 corresponds to (B) and future.scheduling = +Inf
to (A).  Using future.scheduling = 4 achieves the amount of
load-balancing you propose in (C).   (*) Different definition from the
above 'scale'. (Disclaimer: I'm the author)
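
For completeness, a rough sketch of the three cases in future.apply terms (a
multisession plan with five workers is assumed; x, f and the y_* names are
just placeholders):

library(future)        ## for plan()
library(future.apply)
plan(multisession, workers = 5)

x <- 1:97
f <- function(i) i^2

## (B): one chunk per worker
y_B <- future_lapply(x, f, future.scheduling = 1)
## (C): about four chunks per worker, i.e. chunks of roughly 5 elements
y_C <- future_lapply(x, f, future.scheduling = 4)
## (A): every element becomes its own chunk
y_A <- future_lapply(x, f, future.scheduling = +Inf)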

/Henrik

On Mon, Feb 19, 2018 at 10:21 AM, Christian Krause wrote:

Dear R-Devel List,

I have installed R 3.4.3 with the patch applied on our cluster and ran a
*real-world* job of one of our users to confirm that the patch works to my
satisfaction. Here are the results.

The original was a series of jobs, all essentially doing the same stuff using
bootstrapped data, so for the original there is more data and I show the
arithmetic mean with standard deviation. The confirmation with the patched R
was only a single instance of that series of jobs.

## Job Efficiency

The job efficiency is 

Re: [Rd] importing namespaces from base packages

2018-03-13 Thread Martin Maechler
> Adrian Dușa on Tue, 13 Mar 2018 09:17:08 +0200 writes:

> On Mon, Mar 12, 2018 at 2:18 PM, Martin Maechler wrote:
>> [...]
>> Is that so?   Not according to my reading of the 'Writing R
>> Extensions' manual, nor according to what I have been doing in
>> all of my packages for ca. 2 years:
>> 
>> The rule I have in my mind is
>> 
>> 1) NAMESPACE Import(s|From) \
>>  <==>  DESCRIPTION -> 'Imports:'
>> 2) .. using "::" in  R code /
>> 
>> 
>> If you really found that you did not have to import from say
>> 'utils', I think this was an *un*lucky coincidence.

> Of course, importFrom() is mandatory in the NAMESPACE; otherwise the
> package does not pass the checks.
> The question was about the relationship between the packages mentioned
> in the NAMESPACE and those mentioned in the Imports: field of the
> DESCRIPTION.

> For instance, the current version 3.1 of package QCA on CRAN mentions in
> the DESCRIPTION:

> Imports: venn (>= 1.2), shiny, methods, fastdigest

> while the NAMESPACE file has:

> import(shiny)
> import(venn)
> import(fastdigest)
> importFrom("utils", "packageDescription", "remove.packages",
> "capture.output")
> importFrom("stats", "glm", "predict", "quasibinomial", "binom.test",
> "cutree", "dist", "hclust", "na.omit", "dbinom", "setNames")
> importFrom("grDevices", "dev.cur", "dev.new", "dev.list")
> importFrom("graphics", "abline", "axis", "box", "mtext", "par", "title",
> "text")
> importFrom("methods", "is")

> There are functions from the packages utils, stats, grDevices and
> graphics for which the R checks do not require a specific entry in the
> Imports: field. I suspect this is because all of these packages are part
> of base R, but so is the methods package. The question is why it is not
> mandatory for those packages to be mentioned in the Imports: field of
> the DESCRIPTION, while removing the methods package from that field
> produces an error, despite keeping the package in the NAMESPACE's
> importFrom().


Thank you, Adrian, for clarifying your question.
As a matter of fact, I was not aware of what you showed above, and
personally I think I do add every package/namespace mentioned in the
NAMESPACE to the DESCRIPTION's "Imports:" field.

AFAIK the above phenomenon is not documented, and rather the
docs would imply that this phenomenon might go away -- I for one
would vote for more consistency here ..

Martin

>> [...]
>> There are places in the R source where it is treated specially,
>> indeed, part of 'methods' may be needed when it is neither
>> loaded nor attached (e.g., when R runs with only base, say, and
>> suddenly encounters an S4 object), and there still are
>> situations where 'methods' needs to be in the search() path and
>> not just loaded, but these cases should be unrelated to the
>> above DESCRIPTION-Imports vs NAMESPACE-Imports correspondence.

> This is what I had expected myself, so the above behavior must have
> another explanation.
> It is just a curiosity; there is naturally nothing wrong with keeping the
> methods package in the Imports: field. It is only odd that some base R
> packages are treated differently from other base R packages at the
> package-check stage.

> Thank you,
> Adrian

> --
> Adrian Dusa
> University of Bucharest
> Romanian Social Data Archive
> Soseaua Panduri nr. 90-92
> 050663 Bucharest sector 5
> Romania
> https://adriandusa.eu



Re: [Rd] importing namespaces from base packages

2018-03-13 Thread Jari Oksanen
It seems that they are defined in tools/R/check.R. For instance, lines
363-364 say:


## The default set of packages here are as they are because
## .get_S3_generics_as_seen_from_package needs utils,graphics,stats

and then on lines 368 (Windows) and 377 (other OS) it has:
"R_DEFAULT_PACKAGES=utils,grDevices,graphics,stats"

So these pass R CMD check and are an "industry standard". Changing this
will break half of the CRAN packages.
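
You can see the difference from within R itself; a quick sketch (the printed
vector is what a stock interactive session gives; the shell line is only
illustrative):

## in an interactive session 'methods' is among the default packages:
getOption("defaultPackages")
## [1] "datasets"  "utils"     "grDevices" "graphics"  "stats"     "methods"

## R CMD check instead runs the package code with the reduced set quoted
## above, roughly like starting R with
##   R_DEFAULT_PACKAGES=utils,grDevices,graphics,stats
## in the environment, so 'methods' is not attached there, unlike in a
## normal session.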


Cheers, Jari Oksanen

On 13/03/18 13:47, Martin Maechler wrote:

Adrian Dușa on Tue, 13 Mar 2018 09:17:08 +0200 writes:


 > On Mon, Mar 12, 2018 at 2:18 PM, Martin Maechler wrote:
 >> [...]
 >> Is that so?   Not according to my reading of the 'Writing R
 >> Extensions' manual, nor according to what I have been doing in
 >> all of my packages for ca. 2 years:
 >>
 >> The rule I have in my mind is
 >>
 >> 1) NAMESPACE Import(s|From) \
 >>  <==>  DESCRIPTION -> 'Imports:'
 >> 2) .. using "::" in  R code /
 >>
 >>
 >> If you really found that you did not have to import from say
 >> 'utils', I think this was an *un*lucky coincidence.

 > Of course, importFrom() is mandatory in the NAMESPACE; otherwise the
 > package does not pass the checks.
 > The question was about the relationship between the packages mentioned
 > in the NAMESPACE and those mentioned in the Imports: field of the
 > DESCRIPTION.

 > For instance, the current version 3.1 of package QCA on CRAN mentions in
 > the DESCRIPTION:

 > Imports: venn (>= 1.2), shiny, methods, fastdigest

 > while the NAMESPACE file has:

 > import(shiny)
 > import(venn)
 > import(fastdigest)
 > importFrom("utils", "packageDescription", "remove.packages",
 > "capture.output")
 > importFrom("stats", "glm", "predict", "quasibinomial", "binom.test",
 > "cutree", "dist", "hclust", "na.omit", "dbinom", "setNames")
 > importFrom("grDevices", "dev.cur", "dev.new", "dev.list")
 > importFrom("graphics", "abline", "axis", "box", "mtext", "par", "title",
 > "text")
 > importFrom("methods", "is")

 > There are functions from the packages utils, stats, grDevices and
 > graphics for which the R checks do not require a specific entry in the
 > Imports: field. I suspect this is because all of these packages are part
 > of base R, but so is the methods package. The question is why it is not
 > mandatory for those packages to be mentioned in the Imports: field of
 > the DESCRIPTION, while removing the methods package from that field
 > produces an error, despite keeping the package in the NAMESPACE's
 > importFrom().


Thank you, Adrian, for clarifying your question.
As a matter of fact, I was not aware of what you showed above, and
personally I think I do add every package/namespace mentioned in the
NAMESPACE to the DESCRIPTION's "Imports:" field.

AFAIK the above phenomenon is not documented, and rather the
docs would imply that this phenomenon might go away -- I for one
would vote for more consistency here ..

Martin

 >> [...]
 >> There are places in the R source where it is treated specially,
 >> indeed, part of 'methods' may be needed when it is neither
 >> loaded nor attached (e.g., when R runs with only base, say, and
 >> suddenly encounters an S4 object), and there still are
 >> situations where 'methods' needs to be in the search() path and
 >> not just loaded, but these cases should be unrelated to the
 >> above DESCRIPTION-Imports vs NAMESPACE-Imports correspondence.

 > This is what I had expected myself, so the above behavior must have
 > another explanation.
 > It is just a curiosity; there is naturally nothing wrong with keeping the
 > methods package in the Imports: field. It is only odd that some base R
 > packages are treated differently from other base R packages at the
 > package-check stage.

 > Thank you,
 > Adrian

 > --
 > Adrian Dusa
 > University of Bucharest
 > Romanian Social Data Archive
 > Soseaua Panduri nr. 90-92
 > 050663 Bucharest sector 5
 > Romania
 > https://adriandusa.eu



[Rd] 64-bit integer type warning on windows

2018-03-13 Thread Christophe DUTANG
Dear list,

During the last two months, I spent a lot of time trying to remove the
following warning from my package randtoolbox:

congruRand.c:180:3: warning: ISO C does not support the 'I64' ms_scanf length modifier [-Wformat=]
   sscanf(params[0], "%" PRIu64 "\n", &inp_mod);

Please see 
https://www.r-project.org/nosvn/R.check/r-devel-windows-ix86+x86_64/randtoolbox-00install.html
 

The warning is raised by the -pedantic option added when the package is
checked by R-devel. It is due to the use of the macro PRIu64. Note that the
warning does not show up when I use PRIu64 with printf via Rprintf in other
parts of the file congruRand.c.

My first reaction was to look at
https://msdn.microsoft.com/fr-fr/library/f9ezhzhs(v=vs.110).aspx, but that is
useless, because R on Windows uses MinGW-w64 integer types and not the MS
version.

Then I tried to find information in references such as Write Portable Code by
Brian Hook and the C11 standard
(http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf).

They do recommend using the macro SCNu64 (reserved for the scanf family)
rather than PRIu64, which is reserved for the printf family of functions.

Unfortunately, when I check the package with R-devel, I still get the warning.

On GitHub, there is a version of the mingw-w64 sources:
https://github.com/Alexpux/mingw-w64/ . It is explicitly written there that
SCNu64 is I64u; see line 210 of
https://github.com/Alexpux/mingw-w64/blob/master/mingw-w64-headers/crt/inttypes.h

So I was wondering whether this warning can be removed at all. Has anyone else
encountered this issue?

Any help is appreciated.

Thanks in advance, Christophe

PS: the latest version of the randtoolbox source code is here:
https://r-forge.r-project.org/scm/viewvc.php/pkg/randtoolbox/?root=rmetrics

Christophe Dutang
CEREMADE, Univ. Paris Dauphine, France
Web : http://dutangc.free.fr
