R: Cluster Back-end Interface

sendData {parallel}

R Documentation

Cluster Back-end Interface

Description

The communication primitives used by the parallel package to manage state and communicate with the nodes of a cluster.

Usage

  sendData(node, data)
  recvData(node)
  recvOneData(cl)
  closeNode(node)

Arguments

cl

The cluster object, visible to the user. Should be a list inheriting from class cluster, containing the node objects.

node

The node object corresponding to one execution unit inside the cluster.

data

The data structure containing a message to the node.

Details

A `[.cluster` method is provided, which retains the classes of the cluster when it is subset. A cluster back-end should either rely on this method or supply its own method that either invokes this one through NextMethod or calls .subset directly.
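A back-end that supplies its own subsetting method might look like the sketch below. The class name mycluster and the attribute backend_info are hypothetical placeholders, not part of parallel; the sketch only assumes that the provided `[.cluster` method is reached through NextMethod.

```r
library(parallel)  # registers the `[.cluster` method

# Hypothetical back-end class 'mycluster' that keeps an extra attribute
# ('backend_info', an assumption of this sketch) on the cluster object.
`[.mycluster` <- function(x, i) {
  # Delegate to `[.cluster`, which restores the class vector of the cluster.
  ret <- NextMethod()
  # Subsetting drops other attributes, so copy the back-end-specific one over.
  attr(ret, "backend_info") <- attr(x, "backend_info")
  ret
}

# Usage: a stand-in cluster whose "nodes" are plain strings.
cl <- structure(list("a", "b", "c"),
                class = c("mycluster", "cluster"),
                backend_info = "demo")
sub <- cl[2:3]
class(sub)
```

Relying on NextMethod rather than re-implementing the subsetting keeps the back-end in step with whatever the provided method does to preserve the cluster classes.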

The data messages sent to the nodes are lists containing the following elements:

type

A short string describing the type of packet:

DONE: Sent by the default stopCluster implementation before calling closeNode.
EXEC: The packet contains a job to execute.

value

For messages of type “EXEC”, a list with the following elements:

fun: The function to execute.
args: The arguments for fun above as a list.
return: Defaults to TRUE. Not currently used by parallel.
tag: A tag that the worker must return unchanged in its response. Used to identify individual elements of a larger job when using dynamic load balancing.

If “DONE” messages are used (for example, when calling stopCluster.default), the node can close the connection upon receipt.

The response to an “EXEC” message that should be returned by recvData is a list with the following elements:

type: A string, "VALUE".
value: The value of do.call(fun, args, quote = TRUE). If the evaluation raised an error, the value of the error.
success: A logical scalar indicating whether the evaluation completed without raising an error.
time: The time it took to complete the job, an object of class proc_time. Can be obtained using system.time or by subtracting outputs of proc.time.
tag: The original tag from the “EXEC” message.
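As an illustration of the response format, a worker could build its reply along the lines of the sketch below. The helper name handle_exec is hypothetical, and returning the condition object as "the value of the error" is an assumption of this sketch; a back-end may represent errors differently.

```r
# Hypothetical worker-side helper: evaluates the payload of one "EXEC"
# message (a list with fun, args, return and tag) and builds the reply.
handle_exec <- function(job) {
  success <- TRUE
  # system.time() measures the evaluation; the assignment to 'value'
  # takes place in this function's frame.
  elapsed <- system.time(
    value <- tryCatch(
      do.call(job$fun, job$args, quote = TRUE),
      error = function(e) {
        success <<- FALSE
        e  # assumption: the condition object stands in for the error value
      }
    )
  )
  list(type = "VALUE", value = value, success = success,
       time = elapsed, tag = job$tag)
}

# Usage: one job that succeeds and one that raises an error.
ok  <- handle_exec(list(fun = sum,  args = list(1, 2, 3), return = TRUE, tag = 7L))
bad <- handle_exec(list(fun = stop, args = list("boom"),  return = TRUE, tag = 8L))
```

The tag is copied through untouched so that the master can match each reply to the job it submitted.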

recvData can block if the job is not yet complete, and recvOneData should block until at least one node is able to return a complete job result.
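For a back-end with several socket-based nodes, the blocking behaviour of recvOneData could be sketched with socketSelect as below. This assumes a hypothetical class mycluster whose nodes are socket connections that carry an extra class while keeping their connection classes, and that a recvData method exists for them; it is not the implementation used by parallel.

```r
library(parallel)

# Hypothetical method: waits until any node has data, then reads from it.
recvOneData.mycluster <- function(cl) {
  socks <- unclass(cl)  # plain list of socket connections
  repeat {
    # With the default timeout = NULL, socketSelect() blocks until at
    # least one socket is ready for reading.
    ready <- socketSelect(socks, write = FALSE)
    if (any(ready)) break
  }
  n <- which(ready)[1L]  # first ready node; no attempt at fairness
  list(node = n, value = recvData(cl[[n]]))
}
```

Always picking the first ready node is the simplest choice; a back-end with many busy nodes might prefer to rotate among them.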

The default closeNode method does nothing. It is envisaged that stopCluster is used to shut down the entire cluster, although other back-ends may use this to implement node-specific logic.

Value

sendData

Ignored. Called for the side effect of sending the data to the node.

recvData

The result of the job previously submitted to the node.

recvOneData

A list with the following items:

node: The index of the node returning the data.
value: The result of recvData(cl[[node]]).

closeNode

Ignored. Called for the side effect of cleaning up the connection to the node.

Examples

## Not run: 
  # A toy cluster consisting of one connection.
  sendData.mynode <- function(node, data) serialize(data, node)
  recvData.mynode <- function(node) unserialize(node)
  recvOneData.mycluster <- function(cl) list(
    node = 1, value = recvData(cl[[1]])
  )
  closeNode.mynode <- function(node) close(node)

  # Not shown: R starting a serverSocket on the other end, ready to
  # accept connections and evaluate jobs
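  # Keep the connection classes so that serialize(), unserialize() and
  # close() still recognise the node as a connection.
  con <- socketConnection(..., blocking = TRUE, open = 'a+b')
  cl <- structure(list(
    structure(con, class = c('mynode', class(con)))
  ), class = c('mycluster', 'cluster'))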
  clusterEvalQ(cl, Sys.getpid())
  stopCluster(cl)
  rm(cl)

## End(Not run)

[Package parallel version 4.6.0 Index]