Re: [Rd] Get Logical processor count correctly whether NUMA is enabled or disabled

2018-08-29 Thread Srinivasan, Arunkumar
Dear Tomas, thank you very much. I installed r-devel r75201 and tested.

The machine with 88 cores has NUMA disabled. It therefore has 2 processor 
groups with 64 and 24 processors each.

require(parallel)
detectCores()
# [1] 88

This is great!

Then I went on to test with a simple 'foreach()' loop. I started with 64 
processors (max limit of 1 processor group). I ran with a simple function of 
0.5s sleep.

require(snow)
require(doSNOW)
require(foreach)

cl <- makeCluster(64L, "SOCK")
registerDoSNOW(cl)
system.time(foreach(i=1:64) %dopar% Sys.sleep(0.5))
#    user  system elapsed 
#    0.06    0.00    0.64 
system.time(foreach(i=1:65) %dopar% Sys.sleep(0.5))
#    user  system elapsed 
#    0.03    0.01    1.04 
stopCluster(cl)

With a cluster of 64 processors and a loop of 64 iterations, it completed in 
~0.5s (0.64), and with 65 iterations it took ~1s, as expected.
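
The expected timings follow from simple arithmetic: if each free worker takes 
one task at a time, the loop needs ceiling(tasks / workers) rounds of 0.5s 
each. A minimal sketch of that reasoning (expected_elapsed is a hypothetical 
helper, and communication overhead is ignored):

```r
# Idealised elapsed time: tasks run in rounds of at most `n_workers` at once.
# `expected_elapsed` is an illustrative name, not part of snow or foreach.
expected_elapsed <- function(n_tasks, n_workers, task_secs = 0.5) {
  ceiling(n_tasks / n_workers) * task_secs
}

expected_elapsed(64, 64)  # 0.5: one round
expected_elapsed(65, 64)  # 1:   the 65th task needs a second round
expected_elapsed(65, 65)  # 0.5: what a working 65-worker cluster should give
```
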
 
cl <- makeCluster(65L, "SOCK")
registerDoSNOW(cl)
system.time(foreach(i=1:64) %dopar% Sys.sleep(0.5))
#    user  system elapsed 
#    0.03    0.02    0.61 
system.time(foreach(i=1:65) %dopar% Sys.sleep(0.5))
# Timing stopped at: 0.08 0 293
stopCluster(cl)

However, when I increased the cluster to 65 processors, a loop with 64 
iterations seems to complete as expected, but a loop over 65 iterations, which 
would use all 65 processors, did not complete. I stopped it after ~5 mins. The 
same happens with a cluster of any size between 65 and 88. It seems to me that 
we are still not able to use >64 processors at the same time, even though 
detectCores() now returns the right count.

I'd appreciate your thoughts on this.

Best,
Arun.

-----Original Message-----
From: Tomas Kalibera  
Sent: 27 August 2018 19:43
To: Srinivasan, Arunkumar ; 
r-devel@r-project.org
Subject: Re: [Rd] Get Logical processor count correctly whether NUMA is enabled 
or disabled

Dear Arun,

thank you for checking the workaround scripts.

I've modified detectCores() to use GetLogicalProcessorInformationEx. It is in 
revision 75198 of R-devel, could you please test it on your machines? For a 
binary, you can wait until the R-devel snapshot build gets to at least this svn 
revision.

Thanks for the link to the processor groups documentation. I don't have a 
machine to test this on, but I would hope that snow clusters (e.g. 
PSOCK) should work fine on systems with >64 logical processors as they spawn 
new processes (not just threads). Note that FORK clusters are not supported on 
Windows.
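
The point that PSOCK workers are separate processes rather than threads can be 
checked directly. The sketch below uses only the parallel package; worker_pids 
is a hypothetical helper name, and it says nothing about which processor group 
Windows assigns each worker to.

```r
library(parallel)

# Start `n` PSOCK workers, ask each one for its process ID, then shut down.
# `worker_pids` is an illustrative helper, not part of the parallel package.
worker_pids <- function(n) {
  cl <- makePSOCKcluster(n)
  on.exit(stopCluster(cl), add = TRUE)
  unlist(clusterCall(cl, Sys.getpid))
}

pids <- worker_pids(2L)
length(unique(pids)) == length(pids)  # each worker is its own process
Sys.getpid() %in% pids                # the master session is not a worker
```
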

Thanks
Tomas

On 08/21/2018 02:53 PM, Srinivasan, Arunkumar wrote:
> Dear Tomas, thank you for looking into this. Here's the output:
>
> # number of logical processors - what detectCores() should return
> out <- system("wmic cpu get numberoflogicalprocessors", intern=TRUE)
> [1] "NumberOfLogicalProcessors  \r" "22 \r" "22 \r"
> [4] "20 \r" "22 \r" "\r"
> sum(as.numeric(gsub("([0-9]+).*", "\\1", grep("[0-9]+[ \t]*", out, value=TRUE))))
> # [1] 86
>
> [I've asked the IT team to understand why one of the values is 20 instead of 
> 22].
>
> # number of cores - what detectCores(FALSE) should return
> out <- system("wmic cpu get numberofcores", intern=TRUE)
> [1] "NumberOfCores  \r" "22 \r" "22 \r" "20 \r" "22 \r"
> [6] "\r"
> sum(as.numeric(gsub("([0-9]+).*", "\\1", grep("[0-9]+[ \t]*", out, value=TRUE))))
> # [1] 86
>
> [Currently hyperthreading is disabled. So this output being identical to the 
> previous output makes sense].
>
> system("wmic computersystem get numberofprocessors") 
> NumberOfProcessors
> 4
>
> In addition, I'd also bring to your attention this documentation on 
> processor groups: 
> https://docs.microsoft.com/en-us/windows/desktop/ProcThread/processor-groups 
> It explains how to set up a process to run on multiple groups (which seems 
> to be different from NUMA). TBH, all this seems overly complicated just to 
> allow a process to use all cores by default.
>
> Here's a project on Github 'fio' where the issue of running a process on more 
> than 1 processor group has come up -  https://github.com/axboe/fio/issues/527 
> and is addressed - 
> https://github.com/axboe/fio/blob/c479640d6208236744f0562b1e79535eec290e2b/os/os-windows-7.h
>  . I am not sure though if this is entirely relevant since we would be 
> forking new processes in R instead of allowing a single process to use all 
> cores. Apologies if this is utterly irrelevant.
>
> Thank you,
> Arun.
>
> From: Tomas Kalibera 
> Sent: 21 August 2018 11:50
> To: Srinivasan, Arunkumar ; 
> r-devel@r-project.org
> Subject: Re: [Rd] Get Logical processor count correctly whether NUMA 
> is enabled or disabled
>
> Dear Arun,
>
> thank you for the report. I agree with the analysis, detectCores() will only 
> report logical processors in the NUMA group in which R is running. I don't 
> have a system to test on, could you please check these workarounds for 

Re: [Rd] conflicted: an alternative conflict resolution strategy

2018-08-29 Thread Hadley Wickham
>> conflicted applies a few heuristics to minimise false positives (at the
>> cost of introducing a few false negatives). The overarching goal is to
>> ensure that code behaves identically regardless of the order in which
>> packages are attached.
>>
>> -   A number of packages provide a function that appears to conflict
>> with a function in a base package, but they follow the superset
>> principle (i.e. they only extend the API, as explained to me by
>> Hervé Pagès).
>>
>> conflicted assumes that packages adhere to the superset principle,
>> which appears to be true in most of the cases that I’ve seen.
>
>
> It seems that you may be able to strengthen this heuristic from a blanket 
> assumption to something more narrowly targeted by looking for one or more of 
> the following to confirm likely-superset adherence:
>
> - matching or purely extending formals (i.e. all the named arguments of 
>   base::fun match, including order, and there are new arguments in pkg::fun 
>   only if base::fun takes ...)
> - explicit call to base::fun in the body of pkg::fun
> - UseMethod(funname) and at least one provided S3 method calls base::fun
> - S4 generic creation using fun or base::fun as the seeding/default method 
>   body or called from at least one method

Oooh, nice idea, I'll definitely try it out.
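
The first suggested check (matching or purely extending formals) could be 
sketched roughly as follows. This is only an illustration of the idea, not 
code from conflicted: extends_formals and the three example functions are 
hypothetical, and primitives (which have no formals) are not handled.

```r
# Hypothetical helper: TRUE when `new` keeps every named argument of `old`
# in the same relative order, and adds arguments only if `old` accepts `...`.
extends_formals <- function(old, new) {
  old_args <- as.character(names(formals(old)))
  new_args <- as.character(names(formals(new)))
  old_named <- setdiff(old_args, "...")
  new_named <- setdiff(new_args, "...")

  # Arguments of `new` that also exist in `old`, in the order `new` lists them
  kept <- new_named[new_named %in% old_named]
  if (!identical(kept, old_named)) return(FALSE)

  # Extra arguments are acceptable only if `old` could have absorbed them
  extra <- setdiff(new_named, old_named)
  length(extra) == 0 || "..." %in% old_args
}

# Illustrative functions standing in for base::fun and pkg::fun
base_like <- function(x, ...) NULL
superset  <- function(x, extra = TRUE, ...) NULL
unrelated <- function(data, n) NULL

extends_formals(base_like, superset)   # extends the interface
extends_formals(base_like, unrelated)  # does not
```
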

>> For
>> example, the lubridate package provides `as.difftime()` and `date()`
>> which extend the behaviour of base functions, and provides S4
>> generics for the set operators.
>>
>> conflict_scout(c("lubridate", "base"))
>> #> 5 conflicts:
>> #> * `as.difftime`: [lubridate]
>> #> * `date`   : [lubridate]
>> #> * `intersect`  : [lubridate]
>> #> * `setdiff`: [lubridate]
>> #> * `union`  : [lubridate]
>>
>> There are two popular functions that don’t adhere to this principle:
>> `dplyr::filter()` and `dplyr::lag()` :(. conflicted handles these
>> special cases so they correctly generate conflicts. (I sure wish I’d
>> known about the superset principle when creating dplyr!)
>>
>> conflict_scout(c("dplyr", "stats"))
>> #> 2 conflicts:
>> #> * `filter`: dplyr, stats
>> #> * `lag`   : dplyr, stats
>>
>> -   Deprecated functions should never win a conflict, so conflicted
>> checks for use of `.Deprecated()`. This rule is very useful when
>> moving functions from one package to another. For example, many
>> devtools functions were moved to usethis, and conflicted ensures
>> that you always get the non-deprecated version, regardless of package
>> attach order:
>
>
> I would completely believe this rule is useful for refactoring as you 
> describe, but that is the "same function" case. For an end-user in the 
> "different function same symbol" case it's not at all clear to me that the 
> deprecated function should always lose.
>
> People sometimes use deprecated functions. It's not great, and eventually 
> they'll need to fix that for any given case, but imagine if you deprecated 
> the filter verb in dplyr (I know this will never happen, but I think it's 
> illustrative nonetheless).
>
> Consider a piece of code someone wrote before this hypothetical deprecation 
> of filter. The fact that it's now deprecated certainly doesn't mean that they 
> secretly wanted stats::filter all along, right? Conflicted acting as if it 
> does will lead to them getting the exact kind of error you're looking to 
> protect them from, and with even less ability to understand why because they 
> are already doing "The right thing" to protect themselves by using conflicted 
> in the first place...

Ah yes, good point. I'll add some heuristic to check that the function
name appears in the first argument of the .Deprecated call (assuming
that the call looks something like `.Deprecated("pkg::foo")`)
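
A rough sketch of such a check, assuming the replacement is given as a string 
in the first positional argument of `.Deprecated()`. The helper 
calls_deprecated_for and the example functions are hypothetical and are not 
taken from conflicted.

```r
# Hypothetical helper: does `fun` call .Deprecated() with a first argument
# naming a replacement called `name` (e.g. "pkg::name" or "name()")?
calls_deprecated_for <- function(fun, name) {
  walk <- function(expr) {
    if (!is.call(expr)) return(FALSE)
    if (identical(expr[[1]], quote(.Deprecated)) && length(expr) >= 2 &&
        is.character(expr[[2]])) {
      # Drop an optional "pkg::" prefix and trailing "()" before comparing
      target <- sub("\\(\\)$", "", sub("^.*::", "", expr[[2]]))
      if (identical(target, name)) return(TRUE)
    }
    for (i in seq_along(expr)) {
      # Skip empty arguments such as the one in x[, 1]
      if (identical(expr[[i]], quote(expr = ))) next
      if (walk(expr[[i]])) return(TRUE)
    }
    FALSE
  }
  walk(body(fun))
}

# Same function moved elsewhere: the deprecated copy should lose the conflict
moved <- function() {
  .Deprecated("newpkg::use_test")
  invisible()
}
# Different function that happens to be deprecated: still a real conflict
renamed <- function(x) {
  .Deprecated("keep_rows")
  x[, 1]
}

calls_deprecated_for(moved, "use_test")
calls_deprecated_for(renamed, "filter")
```
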

>> Finally, as mentioned above, the user can declare preferences:
>>
>> conflict_prefer("select", "MASS")
>> #> [conflicted] Will prefer MASS::select over any other package
>> conflict_scout(c("dplyr", "MASS"))
>> #> 1 conflict:
>> #> * `select`: [MASS]
>>
>
> I deeply worry about people putting this kind of thing, or even just 
> library(conflicted), in their .Rprofile and thus making their scripts 
> substantially less reproducible. Is that a consequence of this kind of 
> functionality that you have thought about?

Yes, and I've already recommended against it in two places :)  I'm not
sure if there's any more I can do - people already put (e.g.)
`library(ggplot2)` in their .Rprofile, which is just as bad from a
reproducibility standpoint.

Thanks for the thoughtful feedback!

Hadley

-- 
http://hadley.nz

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Where does L come from?

2018-08-29 Thread Hadley Wickham
Thanks for the great discussion everyone!
Hadley
On Sat, Aug 25, 2018 at 8:26 AM Hadley Wickham  wrote:
>
> Hi all,
>
> Would someone mind pointing me to the inspiration for the use of
> the L suffix to mean "integer"?  This is obviously hard to google for,
> and the R language definition
> (https://cran.r-project.org/doc/manuals/r-release/R-lang.html#Constants)
> is silent.
>
> Hadley
>
> --
> http://hadley.nz



-- 
http://hadley.nz



Re: [Rd] "utils::file.edit" does not understand "editor" with additional arguments

2018-08-29 Thread Prof Brian Ripley
We do not have the 'at a minimum' information requested by the posting 
guide, and I cannot reproduce anything like this on a Unix-alike.  Both 
file.edit and edit.default call the same underlying C code, and that 
single-quotes the 'editor' argument to allow for spaces in its path/name 
so I would not expect this to work.


Two workarounds:

1) Set an alias in your shell (e.g. in .bashrc) for 'subl -n'.  This is 
something widely needed on macOS where many editors are invoked by 'open 
-a', and I also use it for 'emacsclient -n'.


2) Make use of the ability to specify editor as an R function, invoking 
the external program by system2() etc.
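
A minimal sketch of workaround 2, assuming `subl` is on the PATH. The names 
make_editor and subl_new_window are hypothetical; the only R behaviour relied 
on is that file.edit() calls a function-valued editor with `file` and `title` 
arguments.

```r
# Hypothetical helper: build an editor function that runs `command` with
# fixed extra arguments, followed by the file(s) to edit.
make_editor <- function(command, args = character()) {
  function(file, title = file, ...) {
    # system2() takes the program and its arguments separately, so "-n" is
    # never treated as part of the program name; shQuote() protects file
    # paths that contain spaces.
    system2(command, c(args, shQuote(file)))
  }
}

subl_new_window <- make_editor("subl", "-n")

# Usage (requires Sublime Text's `subl`, so it is not run here):
# file.edit(".Rprofile", editor = subl_new_window)
# options(editor = subl_new_window)  # or make it the session default
```

The extra `...` lets the same function tolerate the additional `name` argument 
that edit() passes to function-valued editors.
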



On 28/08/2018 20:07, Randy Lai wrote:

I am using Sublime Text as my editor. If I run `subl -n .Rprofile` in bash, a 
file would be opened in a new window.

Back in R, if I run this


file.edit(".Rprofile", editor="'subl -n'")

sh: 'subl -n': command not found
Warning message:
error in running command

However, the interesting bit happens when I run

edit(1:10, editor="'subl -n'")

It does open Sublime Text. It seems that `file.edit` and `edit` are behaving 
differently when “editor” has additional arguments.

Randy


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford



[Rd] build package with unicode (farsi) strings

2018-08-29 Thread Faridedin Cheraghi
Hi,

I have an R script file with Persian letters in it defined as a variable:

#' @export
letters_fa <- c('الف','ب','پ','ت','ث','ج','چ','ح','خ','ر','ز','د')

I have specified the encoding field in my DESCRIPTION file of my package.

...
Encoding: UTF-8
...

I also included Sys.setlocale(locale="Persian") in my .Rprofile, so it is
executed when R CMD is called. However, after a BUILD and INSTALL, when I
access the variable from the package, the characters are not printed
correctly:
> futils::letters_fa
 [1] "<84><81>" "" ""
   "" ""
 [6] "" "<86>" ""
   "" ""
[11] "" ""


thanks
Farid
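
One commonly recommended way to sidestep locale problems at build time is to 
keep the package source ASCII-only and write non-ASCII strings with \uxxxx 
escapes, which R parses as UTF-8 regardless of the locale. Whether this 
resolves the output above is an assumption, since the platform and R version 
were not reported. The sketch below uses a hypothetical helper, 
to_unicode_escape, to generate the escape text to paste into the source.

```r
# Hypothetical helper: turn each string into the \uXXXX escape text that can
# be pasted into package source, so the file itself stays pure ASCII.
to_unicode_escape <- function(x) {
  vapply(x, function(s) {
    cp <- utf8ToInt(enc2utf8(s))
    # \uXXXX covers the Basic Multilingual Plane; higher code points need \U
    esc <- ifelse(cp > 0xFFFF, sprintf("\\U%08x", cp), sprintf("\\u%04x", cp))
    paste(esc, collapse = "")
  }, character(1), USE.NAMES = FALSE)
}

# "\u0628" is the letter be, written as an escape so this example is ASCII too
cat(to_unicode_escape("\u0628"), "\n")

# In the package source the variable would then start like this
# (alef-lam-fe, then be); the remaining letters follow the same pattern.
letters_fa_head <- c("\u0627\u0644\u0641", "\u0628")
```
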
