sendData {parallel}    R Documentation

Cluster Back-end Interface

Description

These are the communication primitives that the parallel package uses to manage node state and to communicate with the nodes of a cluster.

Usage

  sendData(node, data)
  recvData(node)
  recvOneData(cl)
  closeNode(node)

Arguments

cl

The user-visible cluster object. It should be a list inheriting from class cluster whose elements are the node objects.

node

The node object corresponding to one execution unit inside the cluster.

data

The data structure containing a message to the node.

Details

A `[.cluster` method is provided that preserves the classes of the cluster when it is subsetted. A cluster back-end should either rely on this method or supply its own method, which in turn either invokes `[.cluster` through NextMethod or calls .subset directly.
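As a minimal sketch of the second option, the code below defines a `[` method for a hypothetical back-end class mycluster that delegates to `[.cluster` through NextMethod and then restores an extra attribute. The class name, the backend attribute, and the character strings standing in for node objects are assumptions made for illustration only.

```r
library(parallel)  # registers the `[.cluster` method

# Hypothetical back-end class that keeps extra metadata in an attribute.
`[.mycluster` <- function(cl, ...) {
  sub <- NextMethod()                          # reaches `[.cluster`, which restores the classes
  attr(sub, "backend") <- attr(cl, "backend")  # plain list subsetting drops this attribute
  sub
}

cl <- structure(
  list("node A", "node B", "node C"),          # stand-ins for real node objects
  class = c("mycluster", "cluster"),
  backend = "demo"
)

sub <- cl[2:3]
class(sub)            # still c("mycluster", "cluster")
attr(sub, "backend")  # re-attached by the method above
```

A back-end with no extra state to preserve can omit the method entirely and rely on `[.cluster`.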

The data messages sent to the nodes are lists containing the following elements:

type

A short string describing the type of packet:

DONE

Sent by the default stopCluster implementation before calling closeNode.

EXEC

The packet contains a job to execute.

value

For messages of type “EXEC”, a list with the following elements:

fun

The function to execute.

args

The arguments for fun above as a list.

return

Defaults to TRUE. Not currently used by parallel.

tag

The worker must return the same tag in its response. It is used to identify individual elements of a larger job when using dynamic load balancing.

If “DONE” messages are used (for example, by stopCluster.default), the node can close its connection upon receiving one.

recvData should return the response to an “EXEC” message as a list with the following elements:

type

A string, "VALUE".

value

The value of do.call(fun, args, quote = TRUE). If the evaluation raised an error, the value of the error.

success

A logical scalar indicating whether the evaluation completed without raising an error.

time

The time it took to complete the job, an object of class proc_time. Can be obtained using system.time or by subtracting outputs of proc.time.

tag

The original tag from the “EXEC” message.
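The response layout above can be sketched on the worker side as follows. The helper name make_response is hypothetical and not part of parallel. On error, it stores the condition object itself, which is only one possible reading of "the value of the error".

```r
# Hypothetical worker-side helper: evaluate one job and build the response list.
make_response <- function(fun, args, tag = NULL) {
  success <- TRUE
  started <- proc.time()
  value <- tryCatch(
    do.call(fun, args, quote = TRUE),
    error = function(e) {
      success <<- FALSE  # record the failure in make_response's own frame
      e                  # assumption: return the condition object as the value
    }
  )
  list(
    type = "VALUE",
    value = value,
    success = success,
    time = proc.time() - started,  # difference of two proc.time() results
    tag = tag                      # echoed back unchanged
  )
}

ok  <- make_response(sum, list(1, 2, 3), tag = 7)
bad <- make_response(stop, list("boom"), tag = 8)
```

Here ok$success is TRUE and ok$value is the sum, while bad$success is FALSE and bad$value holds the error condition; both responses carry the tag they were given.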

recvData can block if the job is not yet complete, and recvOneData should block until at least one node is able to return a complete job result.

The default closeNode method does nothing. It is envisaged that stopCluster is used to shut down the entire cluster, although other back-ends may use closeNode to implement node-specific cleanup logic.

Value

sendData

Ignored. Called for the side effect of sending the data to the node.

recvData

The result of the job previously submitted to the node.

recvOneData

A list with the following items:

node

The index of the node returning the data.

value

The result of recvData(cluster[[node]]).
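To illustrate the shape of the recvOneData result, here is a sketch of a toy in-memory back-end. The classes memnode and memcluster and the helpers new_memnode and finish_job are hypothetical. The methods are called directly so that the sketch does not depend on which generics the installed version of parallel exports. Where this toy version raises an error, a real back-end would block until a result arrives.

```r
# Hypothetical in-memory node: an environment holding finished job results.
new_memnode <- function() {
  node <- new.env()
  node$results <- list()
  structure(node, class = "memnode")
}

# Hypothetical helper standing in for a worker that has finished a job.
finish_job <- function(node, response) {
  node$results <- c(node$results, list(response))
  invisible(node)
}

# Return (and remove) the oldest finished result on this node.
recvData.memnode <- function(node) {
  if (length(node$results) == 0L) stop("no finished job on this node")
  response <- node$results[[1L]]
  node$results <- node$results[-1L]
  response
}

# Pick the first node that has a result and report its index with the result.
recvOneData.memcluster <- function(cl) {
  ready <- vapply(unclass(cl),
                  function(node) length(node$results) > 0L,
                  logical(1))
  if (!any(ready)) stop("no node has a finished job")  # a real back-end would block
  n <- which(ready)[1L]
  list(node = n, value = recvData.memnode(cl[[n]]))
}

cl <- structure(list(new_memnode(), new_memnode()),
                class = c("memcluster", "cluster"))
finish_job(cl[[2L]], list(type = "VALUE", value = 42, success = TRUE, tag = 1L))
out <- recvOneData.memcluster(cl)
```

Only the second node has a finished job, so out$node is its index and out$value is the response list that recvData returns for that node.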

closeNode

Ignored. Called for the side effect of cleaning up the connection to the node.

See Also

A back-end should also implement stopCluster, which is part of the user interface and is documented separately. The default method posts termination messages to the individual nodes and then calls closeNode on each of them.

Examples

## Not run: 
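  # A toy cluster consisting of one connection.
  # Each node is a binary connection: messages are serialized onto it
  # and responses are unserialized from it.
  sendData.mynode <- function(node, data) serialize(data, node)
  recvData.mynode <- function(node) unserialize(node)
  # With a single node, any result can only come from node 1.
  recvOneData.mycluster <- function(cl) list(
    node = 1, value = recvData(cl[[1]])
  )
  # Release the connection when the node is shut down.
  closeNode.mynode <- function(node) close(node)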

  # Not shown: R starting a serverSocket on the other end, ready to
  # accept connections and evaluate jobs
  cl <- structure(list(
    structure(
      socketConnection(..., blocking = TRUE, open = 'a+b'),
      class = 'mynode'
    )
  ), class = c('mycluster', 'cluster'))
  clusterEvalQ(cl, Sys.getpid())
  stopCluster(cl)
  rm(cl)

## End(Not run)

[Package parallel version 4.5.0 Index]