[R] socket problems (maybe bugs?)

Christian Lederer rsubscriber at slcmsr.net
Thu Feb 17 04:08:14 CET 2005


Dear R Gurus,

For some purpose I have to use a socket connection over which I have to read
and write both text and binary data (each binary data package will be
preceded by a header line).
While experimenting, I encountered some problems (with R-2.0.1 under
different Linux distributions (SuSE and Gentoo)).
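
A minimal sketch of the message layout, for illustration only: it assumes the header line holds nothing but the payload size in bytes, and it uses an in-memory rawConnection() instead of a socket so that it runs without a peer. The variable names are made up for the example.

```r
# Build one message: a header line carrying the payload size in bytes,
# followed by the payload as 4-byte big-endian integers.
payload <- writeBin(1:10, raw(), size = 4, endian = "big")
msg <- c(charToRaw(paste0(length(payload), "\n")), payload)

# Read it back the way a client would: header line first, then the payload.
con <- rawConnection(msg, open = "rb")
nbytes <- as.integer(readLines(con, n = 1))
values <- readBin(con, "integer", n = nbytes %/% 4, size = 4, endian = "big")
close(con)
```

With a real socket the same readLines()/readBin() pair would be used, but the data may arrive in pieces, which is where the problems below start.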

Since the default mode for socket connections is non-blocking,
I first tried socketSelect() in order to see whether the socket is ready
for reading:

# Server:
s <- socketConnection(port=2222, server=TRUE, open="w+b")
writeLines("test", s)
writeBin(1:10, s, size=4, endian="big")

# Client, variation 1:
s <- socketConnection(port=2222, server=FALSE, open="w+b")
socketSelect(list(s))
readLines(s, n=1)     # works, "test" is read
socketSelect(list(s)) # never returns, although the server wrote 1:10

(This seems to happen only when I mix text and binary reads.)
However, without socketSelect(), R may crash if I try to read from an
empty socket:

# Server:
s <- socketConnection(port=2222, server=TRUE, open="w+b")
writeLines("test", s)
writeBin(1:10, s, size=4, endian="big")

# Client, variation 2:
s <- socketConnection(port=2222, server=FALSE, open="w+b")
readLines(s, n=1)                              # works, "test" is read
readBin(s, "int", size=4, n=10, endian="big")  # works, 1:10 is read
readBin(s, "int", size=4, n=10, endian="big")  # second read leads to
                                               # segmentation fault

If I omit the endian="big" option, the second read does not crash, but
just returns 10 random numbers.

At first glance, this does not seem to be a problem, since the
data will be preceded by a header which contains the number of
bytes in the binary block.
However, due to race conditions, I cannot exclude this situation:

time    server             client
t0      sends header
t1                         reads header
t2                         tries to read binary, crashes
t3      sends binary
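
One possible way to tolerate the gap between t1 and t3 is to collect raw bytes in a loop until the announced number has arrived, and to decode them only afterwards. The sketch below is an assumption, not a tested fix: read_exact() and read_int_block() are hypothetical helpers, and whether reading raw bytes sidesteps the crash in R-2.0.1 has not been verified.

```r
# Hypothetical helper: read exactly `nbytes` raw bytes from `con`, polling
# until they have all arrived or `timeout` seconds have passed.
read_exact <- function(con, nbytes, timeout = 5, pause = 0.05) {
  buf <- raw(0)
  deadline <- Sys.time() + timeout
  while (length(buf) < nbytes) {
    chunk <- readBin(con, "raw", n = nbytes - length(buf))
    if (length(chunk) > 0) {
      buf <- c(buf, chunk)
    } else if (Sys.time() > deadline) {
      stop("timed out after ", length(buf), " of ", nbytes, " bytes")
    } else {
      Sys.sleep(pause)  # nothing available yet; wait briefly and retry
    }
  }
  buf
}

# Decode only once the whole block is in memory.
read_int_block <- function(con, nbytes) {
  readBin(read_exact(con, nbytes), "integer", n = nbytes %/% 4,
          size = 4, endian = "big")
}
```

On the client this would be called as read_int_block(s, nbytes) right after the header has been read, with nbytes taken from the header.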


If I open the client socket in blocking mode, the second variation seems
to work (the second read just blocks as desired).
When using only one socket, I can do without socketSelect(), but
I have the following questions:

1. Can I be sure that the blocking variation will also work for larger
data sets, e.g. when the server starts writing before the client is
reading?

2. How could I proceed if I needed several sockets?
Then I cannot use socketSelect() due to the problem described in
variation 1.
I also cannot use blocking sockets, since reading from an empty socket
would block the others.
Without blocking and socketSelect(), I might run into the race condition
described above.
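
For several sockets, one conceivable workaround is to keep a raw buffer per socket, append whatever bytes each non-blocking read returns, and parse a message only once its header and payload are both complete. This is a sketch under assumptions: extract_frame() and poll_all() are hypothetical names, the header is assumed to contain only the payload size in bytes, and it is not known whether raw reads avoid the crash reported above.

```r
# Hypothetical helper: try to cut one complete message (header line with
# the payload size in bytes, then the payload) from the front of `buf`.
# Returns NULL while the message is still incomplete.
extract_frame <- function(buf) {
  nl <- which(buf == as.raw(10L))            # positions of "\n" bytes
  if (length(nl) == 0) return(NULL)          # header not complete yet
  hdr_end <- nl[1]
  nbytes <- as.integer(rawToChar(buf[seq_len(hdr_end - 1)]))
  if (length(buf) < hdr_end + nbytes) return(NULL)  # payload incomplete
  payload <- buf[hdr_end + seq_len(nbytes)]
  list(values = readBin(payload, "integer", n = nbytes %/% 4,
                        size = 4, endian = "big"),
       rest = buf[-seq_len(hdr_end + nbytes)])
}

# One polling pass over several connections: append the bytes that are
# available to each connection's buffer, then try to parse a message.
poll_all <- function(cons, bufs, chunk = 4096L) {
  frames <- vector("list", length(cons))
  for (i in seq_along(cons)) {
    bufs[[i]] <- c(bufs[[i]], readBin(cons[[i]], "raw", n = chunk))
    fr <- extract_frame(bufs[[i]])
    if (!is.null(fr)) {
      frames[[i]] <- fr$values
      bufs[[i]] <- fr$rest
    }
  }
  list(frames = frames, bufs = bufs)
}
```

The caller would repeat poll_all() with the returned buffers, sleeping briefly between passes, so that a socket with no data delays the others only by the length of that pause.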

In any case, the readBin() crash with endian="big" is a bug in
my eyes. For non-blocking sockets, readBin() should just return numeric(0)
if no data have been written to the socket.
I also strongly suspect that the socketSelect() behaviour described in
variation 1 is a bug.


Christian :-(



