[Rd] parallel PSOCK connection latency is greater on Linux?
Jeff
jeff using vtkellers.com
Tue Nov 10 01:38:51 CET 2020
I do enjoy free lunch solutions if they exist.
That said, I think the abstraction proposed by Simon is reasonable.
Whether it should be applied to TCP_NODELAY or TCP_QUICKACK is
unfortunately beyond my Linux/networking knowledge.
Jeff Keller
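
For readers less familiar with the two options, here is a minimal Python sketch of what setting them looks like at the socket level, using only the standard `socket` module. It is in Python rather than R because, as Simon notes below, R has no mechanism for socket options yet. The helper `tune_socket` and its `nodelay`/`quickack` arguments are hypothetical names; `None` stands in for the "do not touch" (NA) case of Simon's proposal. TCP_QUICKACK is not defined on every platform, so the sketch skips it where the constant is missing.

```python
import socket


def tune_socket(sock, nodelay=None, quickack=None):
    """Apply optional TCP tuning; None leaves the OS default untouched.

    Hypothetical helper for illustration, not part of R or any library.
    """
    if nodelay is not None:
        # TCP_NODELAY = 1 disables Nagle's algorithm on the sending side.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY,
                        int(bool(nodelay)))

    # TCP_QUICKACK is not defined on every platform, so look it up defensively.
    quickack_opt = getattr(socket, "TCP_QUICKACK", None)
    if quickack is not None and quickack_opt is not None:
        try:
            # TCP_QUICKACK = 1 asks the receiving side to ACK immediately.
            sock.setsockopt(socket.IPPROTO_TCP, quickack_opt,
                            int(bool(quickack)))
        except OSError:
            pass  # constant is defined but the platform rejects it here
    return sock


def nodelay_enabled(sock):
    """Read the TCP_NODELAY flag back from the kernel."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


def demo():
    """Return the TCP_NODELAY state before and after tuning a fresh socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        before = nodelay_enabled(sock)
        tune_socket(sock, nodelay=True, quickack=True)
        after = nodelay_enabled(sock)
    return before, after


if __name__ == "__main__":
    print(demo())
```

The Linux tcp(7) manual page describes TCP_QUICKACK as not permanent, so a real implementation would probably need to set it again after reads. The sketch does not attempt that.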
On Wed, Nov 4, 2020 at 11:41, Iñaki Ucar <iucar using fedoraproject.org>
wrote:
> Please, check a tcpdump session on localhost while running the
> following script:
>
> library(parallel)
> library(tictoc)
> cl <- makeCluster(1)
> Sys.sleep(1)
>
> for (i in 1:10) {
> tic()
> x <- clusterEvalQ(cl, iris)
> toc()
> }
>
> The initialization phase comprises 7 packets. Then, the 1-second sleep
> will help you see where the evaluation starts. Each clusterEvalQ
> generates 6 packets:
>
> 1. main -> worker PSH, ACK 1026 bytes
> 2. worker -> main ACK 66 bytes
> 3. worker -> main PSH, ACK 3758 bytes
> 4. main -> worker ACK 66 bytes
> 5. worker -> main PSH, ACK 2484 bytes
> 6. main -> worker ACK 66 bytes
>
> The first two are the command and its ACK, the following are the data
> back and their ACKs. In the first 4-5 iterations, I see no delay at
> all. Then, in the following iterations, a 40 ms delay starts to happen
> between packets 3 and 4, that is: the main process delays the ACK to
> the first packet of the incoming result.
>
> So I'd say Nagle is hardly to blame for this. It would be interesting
> to see how many packets are generated with TCP_NODELAY on. If there
> are still 6 packets, then we are fine. If we suddenly see a gazillion
> packets, then TCP_NODELAY does more harm than good. On the other hand,
> TCP_QUICKACK would surely solve the issue without any drawback. As
> Nagle himself put it once, "set TCP_QUICKACK. If you find a case where
> that makes things worse, let me know."
>
> Iñaki
>
> On Wed, 4 Nov 2020 at 04:34, Simon Urbanek
> <simon.urbanek using r-project.org>
> wrote:
>>
>> I'm not sure the user would know ;). This is a very system-specific
>> issue just because the Linux network stack behaves so differently
>> from other OSes (for purely historical reasons). That makes it hard
>> to abstract as a "feature" for the R sockets that are supposed to be
>> platform-independent. At least TCP_NODELAY is actually part of POSIX
>> so it is on better footing, and disabling delayed ACK is practically
>> only useful to work around the other side having Nagle on, so I
>> would expect it to be rarely used.
>>
>> This is essentially an RFC since we don't have a mechanism for socket
>> options (well, almost, there is timeout and blocking already...) and
>> I don't think we want to expose low-level details so perhaps one
>> idea would be to add something like delay=NA to socketConnection()
>> in order to not touch (NA), enable (TRUE) or disable (FALSE)
>> TCP_NODELAY. I wonder if there is any other way we could infer the
>> intention of the user to try to choose the right approach...
>>
>> Cheers,
>> Simon
>>
>>
>> > On Nov 3, 2020, at 02:28, Jeff <jeff using vtkellers.com> wrote:
>> >
>> > Could TCP_NODELAY and TCP_QUICKACK be exposed to the R user so
>> > that they might determine what is best for their potentially
>> > latency- or throughput-sensitive application?
>> >
>> > Best,
>> > Jeff
>> >
>> > On Mon, Nov 2, 2020 at 14:05, Iñaki Ucar
>> > <iucar using fedoraproject.org> wrote:
>> >> On Mon, 2 Nov 2020 at 02:22, Simon Urbanek
>> >> <simon.urbanek using r-project.org> wrote:
>> >>> It looks like R sockets on Linux could do with TCP_NODELAY --
>> >>> without (status quo):
>> >> How many network packets are generated with and without it? If there
>> >> are many small writes and thus setting TCP_NODELAY causes many small
>> >> packets to be sent, it might make more sense to set TCP_QUICKACK
>> >> instead.
>> >> Iñaki
>> >>> Unit: microseconds
>> >>>                    expr      min       lq     mean  median       uq      max neval
>> >>>  clusterEvalQ(cl, iris) 1449.997 43991.99 43975.21 43997.1 44001.91 48027.83  1000
>> >>> exactly the same machine + R but with TCP_NODELAY enabled in
>> >>> R_SockConnect():
>> >>> Unit: microseconds
>> >>>                    expr     min     lq     mean  median      uq      max neval
>> >>>  clusterEvalQ(cl, iris) 156.125 166.41 180.8806 170.247 174.298 5322.234  1000
>> >>> Cheers,
>> >>> Simon
>> >>> > On 2/11/2020, at 3:39 AM, Jeff <jeff using vtkellers.com> wrote:
>> >>> >
>> >>> > I'm exploring latency overhead of parallel PSOCK workers and
>> >>> > noticed that serializing/unserializing data back to the main R
>> >>> > session is significantly slower on Linux than it is on Windows/MacOS
>> >>> > with similar hardware. Is there a reason for this difference and is
>> >>> > there a way to avoid the apparent additional Linux overhead?
>> >>> >
>> >>> > I attempted to isolate the behavior with a test that simply
>> >>> > returns an existing object from the worker back to the main R
>> >>> > session.
>> >>> >
>> >>> > library(parallel)
>> >>> > library(microbenchmark)
>> >>> > gcinfo(TRUE)
>> >>> > cl <- makeCluster(1)
>> >>> > (x <- microbenchmark(clusterEvalQ(cl, iris), times = 1000,
>> >>> >                      unit = "us"))
>> >>> > # x$time is stored in nanoseconds, so convert before plotting
>> >>> > plot(x$time / 1000, ylab = "microseconds")
>> >>> > head(x$time, n = 10)
>> >>> >
>> >>> > On Windows/MacOS, the test runs in 300-500 microseconds
>> >>> > depending on hardware. A few of the 1000 runs are an order of
>> >>> > magnitude slower but this can probably be attributed to garbage
>> >>> > collection on the worker.
>> >>> >
>> >>> > On Linux, the first 5 or so executions run at comparable
>> >>> > speeds but all subsequent executions are two orders of magnitude
>> >>> > slower (~40 milliseconds).
>> >>> >
>> >>> > I see this behavior across various platforms and hardware
>> >>> > combinations:
>> >>> >
>> >>> > Ubuntu 18.04 (Intel Xeon Platinum 8259CL)
>> >>> > Linux Mint 19.3 (AMD Ryzen 7 1800X)
>> >>> > Linux Mint 20 (AMD Ryzen 7 3700X)
>> >>> > Windows 10 (AMD Ryzen 7 4800H)
>> >>> > MacOS 10.15.7 (Intel Core i7-8850H)
>> >>> >
>> >>> > ______________________________________________
>> >>> > R-devel using r-project.org mailing list
>> >>> > <https://stat.ethz.ch/mailman/listinfo/r-devel>
>> >>> >
>> >> --
>> >> Iñaki Úcar
>> >
>>
>
>
> --
> Iñaki Úcar
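
As a rough cross-check of the figures quoted in this thread, the short Python sketch below compares the two median timings from Simon's benchmark summaries. The constant and function names are made up for illustration; only the two medians come from the thread. The difference works out to roughly 44 ms per call, a slowdown factor of about 258, which is the same order as the 40 ms ACK delay seen in the tcpdump trace. The trace and the benchmark were not necessarily run on the same machine, so the match is only suggestive.

```python
# Medians reported in the two microbenchmark summaries, in microseconds.
MEDIAN_DEFAULT_US = 43997.1   # status quo
MEDIAN_NODELAY_US = 170.247   # TCP_NODELAY enabled in R_SockConnect()


def extra_latency_ms(default_us, nodelay_us):
    """Extra time per call without TCP_NODELAY, in milliseconds."""
    return (default_us - nodelay_us) / 1000.0


def slowdown_factor(default_us, nodelay_us):
    """How many times slower the default configuration is."""
    return default_us / nodelay_us


if __name__ == "__main__":
    extra = extra_latency_ms(MEDIAN_DEFAULT_US, MEDIAN_NODELAY_US)
    factor = slowdown_factor(MEDIAN_DEFAULT_US, MEDIAN_NODELAY_US)
    print(f"extra latency per call: {extra:.1f} ms")
    print(f"slowdown factor: {factor:.0f}x")
```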