A summary for reference: the new detectCores() for Windows in R-devel seems to work for both logical and physical cores on systems with >64 logical processors (thanks to Arun for testing!). If this feature matters to you, particularly if you use an older version of Windows and/or a system with >64 logical processors, it would be nice if you could test it and report any problems.
As I mentioned earlier, in older versions of R one can as a workaround use "wmic" to detect the number of processors on systems with >64 logical processors (with appropriate error handling added as needed):

# detectCores()
out <- system("wmic cpu get numberoflogicalprocessors", intern=TRUE)
sum(as.numeric(gsub("([0-9]+).*", "\\1", grep("[0-9]+[ \t]*", out, value=TRUE))))

# detectCores(logical=FALSE)
out <- system("wmic cpu get numberofcores", intern=TRUE)
sum(as.numeric(gsub("([0-9]+).*", "\\1", grep("[0-9]+[ \t]*", out, value=TRUE))))

The remaining problem with running using >64 processors on Windows turned out to be due to a bug in sockets communication, debugged and fixed in R-devel by Luke Tierney.

Tomas

On 08/29/2018 12:42 PM, Srinivasan, Arunkumar wrote:
> Dear Tomas, thank you very much. I installed r-devel r75201 and tested.
>
> The machine with 88 cores has NUMA disabled. It therefore has 2 processor
> groups with 64 and 24 processors each.
>
> require(parallel)
> detectCores()
> # [1] 88
>
> This is great!
>
> Then I went on to test with a simple 'foreach()' loop. I started with 64
> processors (max limit of 1 processor group). I ran with a simple function of
> 0.5s sleep.
>
> require(snow)
> require(doSNOW)
> require(foreach)
>
> cl <- makeCluster(64L, "SOCK")
> registerDoSNOW(cl)
> system.time(foreach(i=1:64) %dopar% Sys.sleep(0.5))
> # user system elapsed
> # 0.06 0.00 0.64
> system.time(foreach(i=1:65) %dopar% Sys.sleep(0.5))
> # user system elapsed
> # 0.03 0.01 1.04
> stopCluster(cl)
>
> With a cluster of 64 processors and loop running with 64 iterations, it
> completed in ~.5s (0.64), and with 65 iterations, it took ~1s as expected.
>
> cl <- makeCluster(65L, "SOCK")
> registerDoSNOW(cl)
> system.time(foreach(i=1:64) %dopar% Sys.sleep(0.5))
> # user system elapsed
> # 0.03 0.02 0.61
> system.time(foreach(i=1:65) %dopar% Sys.sleep(0.5))
> # Timing stopped at: 0.08 0 293
> stopCluster(cl)
>
> However, when I increased the cluster to have 65 processors, a loop with 64
> iterations seems to complete as expected, but using all 65 processors to loop
> over 65 iterations didn't seem to complete. I stopped it after ~5mins. The
> same happens with the cluster started with any number between 65 and 88. It
> seems to me like we are still not able to use >64 processors all at the
> same time even if detectCores() returns the right count now.
>
> I'd appreciate your thoughts on this.
>
> Best,
> Arun.
>
> -----Original Message-----
> From: Tomas Kalibera <tomas.kalib...@gmail.com>
> Sent: 27 August 2018 19:43
> To: Srinivasan, Arunkumar <arunkumar.sriniva...@uk.mlp.com>;
> r-devel@r-project.org
> Subject: Re: [Rd] Get Logical processor count correctly whether NUMA is
> enabled or disabled
>
> Dear Arun,
>
> thank you for checking the workaround scripts.
>
> I've modified detectCores() to use GetLogicalProcessorInformationEx. It is in
> revision 75198 of R-devel, could you please test it on your machines? For a
> binary, you can wait until the R-devel snapshot build gets to at least this
> svn revision.
>
> Thanks for the link to the processor groups documentation. I don't have a
> machine to test this on, but I would hope that snow clusters (e.g.
> PSOCK) should work fine on systems with >64 logical processors as they spawn
> new processes (not just threads). Note that FORK clusters are not supported
> on Windows.
>
> Thanks
> Tomas
>
> On 08/21/2018 02:53 PM, Srinivasan, Arunkumar wrote:
>> Dear Tomas, thank you for looking into this.
>> Here's the output:
>>
>> # number of logical processors - what detectCores() should return
>> out <- system("wmic cpu get numberoflogicalprocessors", intern=TRUE)
>> [1] "NumberOfLogicalProcessors \r" "22 \r" "22 \r"
>> [4] "20 \r" "22 \r" "\r"
>> sum(as.numeric(gsub("([0-9]+).*", "\\1", grep("[0-9]+[ \t]*", out, value=TRUE))))
>> # [1] 86
>>
>> [I've asked the IT team to understand why one of the values is 20 instead of
>> 22].
>>
>> # number of cores - what detectCores(FALSE) should return
>> out <- system("wmic cpu get numberofcores", intern=TRUE)
>> [1] "NumberOfCores \r" "22 \r" "22 \r" "20 \r" "22 \r"
>> [6] "\r"
>> sum(as.numeric(gsub("([0-9]+).*", "\\1", grep("[0-9]+[ \t]*", out, value=TRUE))))
>> # [1] 86
>>
>> [Currently hyperthreading is disabled. So this output being identical to the
>> previous output makes sense].
>>
>> system("wmic computersystem get numberofprocessors")
>> NumberOfProcessors
>> 4
>>
>> In addition, I'd also bring to your attention this documentation:
>> https://docs.microsoft.com/en-us/windows/desktop/ProcThread/processor-groups
>> on processor groups, which explains how to go about running a process
>> on multiple groups (which seems to be different to NUMA). All this
>> seems overly complicated to allow a process to use all cores by default TBH.
>>
>> Here's a project on Github 'fio' where the issue of running a process on
>> more than 1 processor group has come up -
>> https://github.com/axboe/fio/issues/527 and is addressed -
>> https://github.com/axboe/fio/blob/c479640d6208236744f0562b1e79535eec290e2b/os/os-windows-7.h
>> . I am not sure though if this is entirely relevant since we would be
>> forking new processes in R instead of allowing a single process to use all
>> cores. Apologies if this is utterly irrelevant.
>>
>> Thank you,
>> Arun.
>>
>> From: Tomas Kalibera <tomas.kalib...@gmail.com>
>> Sent: 21 August 2018 11:50
>> To: Srinivasan, Arunkumar <arunkumar.sriniva...@uk.mlp.com>;
>> r-devel@r-project.org
>> Subject: Re: [Rd] Get Logical processor count correctly whether NUMA
>> is enabled or disabled
>>
>> Dear Arun,
>>
>> thank you for the report. I agree with the analysis, detectCores() will only
>> report logical processors in the NUMA group in which R is running. I don't
>> have a system to test on, could you please check these workarounds for me on
>> your systems?
>>
>> # number of logical processors - what detectCores() should return
>> out <- system("wmic cpu get numberoflogicalprocessors", intern=TRUE)
>> sum(as.numeric(gsub("([0-9]+).*", "\\1", grep("[0-9]+[ \t]*", out, value=TRUE))))
>>
>> # number of cores - what detectCores(FALSE) should return
>> out <- system("wmic cpu get numberofcores", intern=TRUE)
>> sum(as.numeric(gsub("([0-9]+).*", "\\1", grep("[0-9]+[ \t]*", out, value=TRUE))))
>>
>> # number of physical processors - as a sanity check
>>
>> system("wmic computersystem get numberofprocessors")
>>
>> Thanks,
>> Tomas
>>
>> On 08/17/2018 05:11 PM, Srinivasan, Arunkumar wrote:
>> Dear R-devel list,
>>
>> R's detectCores() function internally calls the "ncpus" function to get the
>> total number of logical processors. However, this does not seem to take NUMA
>> into account on Windows machines.
>>
>> On a machine having 48 processors (24 cores) in total and Windows Server
>> 2012 installed, if NUMA is enabled and has 2 nodes (node 0 and node 1 each
>> having 24 CPUs), then R's detectCores() only detects 24 instead of the total
>> 48. If NUMA is disabled, detectCores() returns 48.
>>
>> Similarly, on a machine with 88 cores (176 processors) and Windows Server
>> 2012, detectCores() with NUMA disabled only returns the maximum value of 64.
>> If NUMA is enabled with 4 nodes (44 processors each), then detectCores()
>> will only return 44.
>> This is particularly limiting since we cannot get to
>> use all processors by enabling/disabling NUMA in this case.
>>
>> We think this is because R's ncpus.c file uses
>> "PSYSTEM_LOGICAL_PROCESSOR_INFORMATION"
>> (https://msdn.microsoft.com/en-us/library/windows/desktop/ms683194(v=vs.85).aspx)
>> instead of "PSYSTEM_LOGICAL_PROCESSOR_INFORMATION_EX"
>> (https://msdn.microsoft.com/en-us/library/windows/desktop/dd405488(v=vs.85).aspx).
>> Specifically, quoting from the first link:
>>
>> "On systems with more than 64 logical processors, the
>> GetLogicalProcessorInformation function retrieves logical processor
>> information about processors in the processor group
>> (https://msdn.microsoft.com/en-us/library/windows/desktop/dd405503(v=vs.85).aspx)
>> to which the calling thread is currently assigned. Use the
>> GetLogicalProcessorInformationEx
>> (https://msdn.microsoft.com/en-us/library/windows/desktop/dd405488(v=vs.85).aspx)
>> function to retrieve information about processors in all processor groups
>> on the system."
>>
>> Therefore, it might be possible to get the right count of total processors
>> even with NUMA enabled by using "GetLogicalProcessorInformationEX". It'd be
>> nice to know what you think.
>>
>> Thank you very much,
>> Arun.
>>
>> --
>> Arun Srinivasan
>> Analyst, Millennium Management LLC
>> 50 Berkeley Street | London, W1J 8HD
>> [[alternative HTML version deleted]]

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
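
Editorial sketch (not part of the original thread): the "wmic" workaround above is described as needing "appropriate error handling added as needed". One possible way to add it is to separate the parsing from the system call, so that a missing "wmic" or unexpected output yields NA instead of an error. The function names parse_wmic_counts() and wmic_count() are hypothetical helpers invented for this illustration; they are not part of R or the parallel package, and returning NA on failure is an assumption about what a caller would want.

```r
# Hypothetical helper: sums the per-socket counts found in the output of a
# "wmic cpu get <property>" call. Returns NA if no numeric line is found.
parse_wmic_counts <- function(out) {
  # wmic pads lines with spaces and ends them in "\r"; strip both ends
  lines <- gsub("^[ \t\r\n]+|[ \t\r\n]+$", "", out)
  # keep only lines made up of digits (drops the header and blank lines)
  counts <- lines[grepl("^[0-9]+$", lines)]
  if (length(counts) == 0L) return(NA_integer_)
  sum(as.integer(counts))
}

# Hypothetical wrapper: runs wmic and falls back to NA if the call fails
wmic_count <- function(property = "numberoflogicalprocessors") {
  out <- tryCatch(
    suppressWarnings(system(paste("wmic cpu get", property), intern = TRUE)),
    error = function(e) character(0)
  )
  parse_wmic_counts(out)
}

# The output reported earlier in this thread for the 4-socket machine
sample_out <- c("NumberOfLogicalProcessors \r", "22 \r", "22 \r",
                "20 \r", "22 \r", "\r")
parse_wmic_counts(sample_out)  # 86
```

On a Windows machine, wmic_count() and wmic_count("numberofcores") would then stand in for detectCores() and detectCores(logical=FALSE) respectively, with NA signalling that the count could not be determined.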