[R-sig-hpc] ANN: RHIPE 0.1

Saptarshi Guha saptarshi.guha at gmail.com
Mon Apr 27 16:14:57 CEST 2009

I'd like to announce the release of version 0.1 of RHIPE, the R and
Hadoop Integrated Processing Environment. Using RHIPE, it is possible
to write map-reduce algorithms in the R language and start them
from within R.
RHIPE is built on Hadoop and so benefits from Hadoop's fault
tolerance, distributed file system and job scheduling features.
For the R user, there is rhlapply, which runs an lapply across the cluster.
For the Hadoop user, there is rhmr, which runs a general map-reduce program.
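
To make the lapply case concrete, here is a small sketch in plain R of the
kind of job that suits this style: every task depends only on its own input,
so the tasks could run in any order or on any machine. It uses base lapply
only; the arguments rhlapply takes are not shown in this message, so no
RHIPE call is made. boot_mean is a hypothetical name chosen for the
illustration.

```r
# A task that depends only on its own argument (hypothetical example).
boot_mean <- function(seed) {
  set.seed(seed)  # per-task seed, so the result does not depend on execution order
  mean(sample(1:100, 50, replace = TRUE))
}

# Serial version in base R; each element is computed independently.
results <- lapply(1:8, boot_mean)
```

Because each task seeds itself, running the tasks in a different order
gives the same answers, which is the property a cluster-wide lapply
relies on.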

The tired example of counting words:

m <- function(key,val){
  words <- strsplit(val," +")[[1]]
  wc <- table(words)
  cln <- names(wc)
  # ...
}
r <- function(key,value){
  value <- do.call("rbind",value)
  # ...
}
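
For anyone who wants to follow the map, group-by-key and reduce steps
without a cluster, below is a minimal local emulation in base R. It is a
sketch under assumptions, not RHIPE code: it makes no RHIPE or Hadoop calls,
and the list(key=, value=) pair format and the names map_fn, reduce_fn and
run_local are made up for the illustration.

```r
# Map: split one line into words and emit one (word, count) pair per distinct word.
map_fn <- function(key, val) {
  words <- strsplit(val, " +")[[1]]
  words <- words[nzchar(words)]  # drop the empty token a leading space produces
  wc <- table(words)
  lapply(seq_along(wc), function(i) {
    list(key = names(wc)[i], value = as.integer(wc[i]))
  })
}

# Reduce: combine all partial counts collected for one word.
reduce_fn <- function(key, values) {
  values <- do.call("rbind", values)  # stack the partial counts into one column
  list(key = key, value = sum(values))
}

# Driver: run the map over every line, group values by key, then reduce each group.
run_local <- function(lines) {
  pairs <- unlist(lapply(seq_along(lines), function(i) map_fn(i, lines[[i]])),
                  recursive = FALSE)
  keys <- vapply(pairs, function(p) p$key, character(1))
  grouped <- split(lapply(pairs, function(p) p$value), keys)
  out <- lapply(names(grouped), function(k) reduce_fn(k, grouped[[k]]))
  setNames(vapply(out, function(o) o$value, numeric(1)),
           vapply(out, function(o) o$key, character(1)))
}
```

For example, run_local(c("a b a", "b c")) counts "a" twice, "b" twice and
"c" once. The grouping step in run_local stands in for the shuffle that
Hadoop performs between the map and reduce phases.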

URL: http://ml.stat.purdue.edu/rhipe

There are some downsides to RHIPE which are described at

Saptarshi Guha
