I'm not sure the user would know ;). This is a very system-specific issue, 
simply because the Linux network stack behaves so differently from other OSes 
(for purely historical reasons). That makes it hard to abstract as a "feature" 
of R sockets, which are supposed to be platform-independent. At least 
TCP_NODELAY is actually part of POSIX, so it is on better footing. Disabling 
delayed ACK is practically only useful as a workaround for the other side 
having Nagle on, so I would expect it to be rarely used.
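
For illustration only, a minimal C sketch of what requesting quick ACKs looks 
like at the socket level. The helper name sock_quickack() is made up here and 
is not part of R. TCP_QUICKACK is Linux-only, hence the #ifdef, and according 
to tcp(7) the flag is not permanent, so real code would likely have to set it 
again after reads.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (not an R API): ask the kernel to send ACKs
 * immediately instead of delaying them.
 * Returns 0 on success, -1 on error (errno set by setsockopt),
 * or 1 if this platform does not define TCP_QUICKACK at all. */
int sock_quickack(int fd)
{
#ifdef TCP_QUICKACK
    int on = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &on, sizeof on);
#else
    (void) fd;   /* nothing to do on non-Linux systems */
    return 1;
#endif
}
```

The "not available" return value is one way to show why this option is hard 
to offer as a portable feature: on most systems the call simply has no 
equivalent.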

This is essentially an RFC, since we don't have a mechanism for socket options 
(well, almost: there is timeout and blocking already...) and I don't think we 
want to expose low-level details. One idea would be to add something like 
delay=NA to socketConnection() in order to not touch (NA), enable (TRUE) or 
disable (FALSE) TCP_NODELAY. I wonder if there is any other way we could 
infer the intention of the user to try to choose the right approach...
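
To make the tri-state idea concrete, here is a rough C sketch of how such a 
setting might be applied to a socket descriptor. Both helpers are hypothetical 
(sock_set_nodelay() and sock_get_nodelay() do not exist in R), and the integer 
encoding of NA/TRUE/FALSE is an assumption made for the example, not how 
R_SockConnect() does it.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (not an R API): apply a tri-state setting to
 * TCP_NODELAY.
 *   nodelay <  0 : "NA", leave the socket untouched
 *   nodelay == 0 : disable TCP_NODELAY (Nagle's algorithm stays on)
 *   nodelay >  0 : enable TCP_NODELAY (Nagle's algorithm off)
 * Returns 0 on success, -1 on error (errno set by setsockopt). */
int sock_set_nodelay(int fd, int nodelay)
{
    if (nodelay < 0)
        return 0;                 /* NA: keep the system default */
    int flag = (nodelay > 0);
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, sizeof flag);
}

/* Hypothetical helper: read the option back.
 * Returns 1 if TCP_NODELAY is on, 0 if off, -1 on error. */
int sock_get_nodelay(int fd)
{
    int flag = 0;
    socklen_t len = sizeof flag;
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &flag, &len) < 0)
        return -1;
    return flag != 0;
}
```

Treating NA as "do nothing" keeps the current behavior as the default, so 
existing code would see no change unless it opts in.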

Cheers,
Simon


> On Nov 3, 2020, at 02:28, Jeff <j...@vtkellers.com> wrote:
> 
> Could TCP_NODELAY and TCP_QUICKACK be exposed to the R user so that they 
> might determine what is best for their potentially latency- or 
> throughput-sensitive application?
> 
> Best,
> Jeff
> 
> On Mon, Nov 2, 2020 at 14:05, Iñaki Ucar <iu...@fedoraproject.org> wrote:
>> On Mon, 2 Nov 2020 at 02:22, Simon Urbanek <simon.urba...@r-project.org> 
>> wrote:
>>> It looks like R sockets on Linux could do with TCP_NODELAY -- without 
>>> (status quo):
>> How many network packets are generated with and without it? If there
>> are many small writes and thus setting TCP_NODELAY causes many small
>> packets to be sent, it might make more sense to set TCP_QUICKACK
>> instead.
>> Iñaki
>>> Unit: microseconds
>>>                    expr      min       lq     mean  median       uq      max neval
>>>  clusterEvalQ(cl, iris) 1449.997 43991.99 43975.21 43997.1 44001.91 48027.83  1000
>>> exactly the same machine + R but with TCP_NODELAY enabled in 
>>> R_SockConnect():
>>> Unit: microseconds
>>>                    expr     min     lq     mean  median      uq      max neval
>>>  clusterEvalQ(cl, iris) 156.125 166.41 180.8806 170.247 174.298 5322.234  1000
>>> Cheers,
>>> Simon
>>> > On 2/11/2020, at 3:39 AM, Jeff <j...@vtkellers.com> wrote:
>>> >
>>> > I'm exploring latency overhead of parallel PSOCK workers and noticed that 
>>> > serializing/unserializing data back to the main R session is 
>>> > significantly slower on Linux than it is on Windows/MacOS with similar 
>>> > hardware. Is there a reason for this difference and is there a way to 
>>> > avoid the apparent additional Linux overhead?
>>> >
>>> > I attempted to isolate the behavior with a test that simply returns an 
>>> > existing object from the worker back to the main R session.
>>> >
>>> > library(parallel)
>>> > library(microbenchmark)
>>> > gcinfo(TRUE)
>>> > cl <- makeCluster(1)
>>> > (x <- microbenchmark(clusterEvalQ(cl, iris), times = 1000, unit = "us"))
>>> > plot(x$time, ylab = "microseconds")
>>> > head(x$time, n = 10)
>>> >
>>> > On Windows/MacOS, the test runs in 300-500 microseconds depending on 
>>> > hardware. A few of the 1000 runs are an order of magnitude slower but 
>>> > this can probably be attributed to garbage collection on the worker.
>>> >
>>> > On Linux, the first 5 or so executions run at comparable speeds but all 
>>> > subsequent executions are two orders of magnitude slower (~40 
>>> > milliseconds).
>>> >
>>> > I see this behavior across various platforms and hardware combinations:
>>> >
>>> > Ubuntu 18.04 (Intel Xeon Platinum 8259CL)
>>> > Linux Mint 19.3 (AMD Ryzen 7 1800X)
>>> > Linux Mint 20 (AMD Ryzen 7 3700X)
>>> > Windows 10 (AMD Ryzen 7 4800H)
>>> > MacOS 10.15.7 (Intel Core i7-8850H)
>>> >
>>> > ______________________________________________
>>> > R-devel@r-project.org mailing list
>>> > https://stat.ethz.ch/mailman/listinfo/r-devel
>>> >
>> --
>> Iñaki Úcar
> 
