I'm not sure the user would know ;). This is a very system-specific issue, simply because the Linux network stack behaves so differently from other OSes (for purely historical reasons). That makes it hard to abstract as a "feature" for R sockets, which are supposed to be platform-independent. At least TCP_NODELAY is actually part of POSIX, so it is on better footing, and disabling delayed ACK is practically only useful to work around the other side having Nagle on, so I would expect it to be rarely used.
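For reference, here is a minimal sketch of what the two options look like at the C level. The helper names (set_tcp_nodelay, get_tcp_nodelay, set_tcp_quickack) are made up for illustration and are not taken from the R sources; the sketch only assumes a POSIX system with an already created TCP socket descriptor.

```c
#include <errno.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Hypothetical helper: set (on = 1) or clear (on = 0) TCP_NODELAY.
   Setting it turns Nagle's algorithm off for this socket.
   Returns 0 on success, -1 on error (errno is set by setsockopt). */
int set_tcp_nodelay(int fd, int on)
{
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &on, sizeof(on));
}

/* Hypothetical helper: read the option back.
   Returns 1 if TCP_NODELAY is set, 0 if not, -1 on error. */
int get_tcp_nodelay(int fd)
{
    int val = 0;
    socklen_t len = sizeof(val);
    if (getsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &val, &len) != 0)
        return -1;
    return val != 0;
}

/* Hypothetical helper: TCP_QUICKACK is a Linux extension, not POSIX,
   so it is guarded. On Linux the flag is not permanent: the stack may
   leave quickack mode again later, so callers typically re-apply it
   after receiving data. */
int set_tcp_quickack(int fd, int on)
{
#ifdef TCP_QUICKACK
    return setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &on, sizeof(on));
#else
    (void) fd;
    (void) on;
    errno = ENOPROTOOPT;   /* option not available on this platform */
    return -1;
#endif
}
```

The #ifdef is the part that makes the second option awkward to expose in a platform-independent way: the TCP_NODELAY call compiles everywhere POSIX sockets exist, while the TCP_QUICKACK call can only report failure outside Linux.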
This is essentially an RFC, since we don't have a mechanism for socket options (well, almost, there is timeout and blocking already...) and I don't think we want to expose low-level details. So perhaps one idea would be to add something like delay=NA to socketConnection() in order to not touch (NA), enable (TRUE) or disable (FALSE) TCP_NODELAY. I wonder if there is any other way we could infer the intention of the user to try to choose the right approach...

Cheers,
Simon

> On Nov 3, 2020, at 02:28, Jeff <j...@vtkellers.com> wrote:
>
> Could TCP_NODELAY and TCP_QUICKACK be exposed to the R user so that they
> might determine what is best for their potentially latency- or
> throughput-sensitive application?
>
> Best,
> Jeff
>
> On Mon, Nov 2, 2020 at 14:05, Iñaki Ucar <iu...@fedoraproject.org> wrote:
>> On Mon, 2 Nov 2020 at 02:22, Simon Urbanek <simon.urba...@r-project.org>
>> wrote:
>>> It looks like R sockets on Linux could do with TCP_NODELAY -- without
>>> (status quo):
>>
>> How many network packets are generated with and without it? If there
>> are many small writes and thus setting TCP_NODELAY causes many small
>> packets to be sent, it might make more sense to set TCP_QUICKACK
>> instead.
>>
>> Iñaki
>>
>>> Unit: microseconds
>>>                    expr      min       lq     mean  median       uq      max neval
>>>  clusterEvalQ(cl, iris) 1449.997 43991.99 43975.21 43997.1 44001.91 48027.83  1000
>>>
>>> exactly the same machine + R but with TCP_NODELAY enabled in
>>> R_SockConnect():
>>>
>>> Unit: microseconds
>>>                    expr     min     lq     mean  median      uq      max neval
>>>  clusterEvalQ(cl, iris) 156.125 166.41 180.8806 170.247 174.298 5322.234  1000
>>>
>>> Cheers,
>>> Simon
>>>
>>> > On 2/11/2020, at 3:39 AM, Jeff <j...@vtkellers.com> wrote:
>>> >
>>> > I'm exploring latency overhead of parallel PSOCK workers and noticed that
>>> > serializing/unserializing data back to the main R session is
>>> > significantly slower on Linux than it is on Windows/MacOS with similar
>>> > hardware. Is there a reason for this difference and is there a way to
>>> > avoid the apparent additional Linux overhead?
>>> >
>>> > I attempted to isolate the behavior with a test that simply returns an
>>> > existing object from the worker back to the main R session.
>>> >
>>> > library(parallel)
>>> > library(microbenchmark)
>>> > gcinfo(TRUE)
>>> > cl <- makeCluster(1)
>>> > (x <- microbenchmark(clusterEvalQ(cl, iris), times = 1000, unit = "us"))
>>> > plot(x$time, ylab = "microseconds")
>>> > head(x$time, n = 10)
>>> >
>>> > On Windows/MacOS, the test runs in 300-500 microseconds depending on
>>> > hardware. A few of the 1000 runs are an order of magnitude slower but
>>> > this can probably be attributed to garbage collection on the worker.
>>> >
>>> > On Linux, the first 5 or so executions run at comparable speeds but all
>>> > subsequent executions are two orders of magnitude slower (~40
>>> > milliseconds).
>>> >
>>> > I see this behavior across various platforms and hardware combinations:
>>> >
>>> > Ubuntu 18.04 (Intel Xeon Platinum 8259CL)
>>> > Linux Mint 19.3 (AMD Ryzen 7 1800X)
>>> > Linux Mint 20 (AMD Ryzen 7 3700X)
>>> > Windows 10 (AMD Ryzen 7 4800H)
>>> > MacOS 10.15.7 (Intel Core i7-8850H)
>>
>> --
>> Iñaki Úcar

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel