[R] Hadoop R integration

Lalitha Kristipati Lalitha.Kristipati at techmahindra.com
Tue May 26 12:51:14 CEST 2015


Hi,

I need to showcase how R and Hadoop can work together using ORCH (Oracle R Connector for Hadoop). I found the following sample code on the Oracle website:
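dfs <- hdfs.attach("ontime_DB")

res <- hadoop.run(
  dfs,
  # Mapper: pass through only SFO records with a non-missing arrival delay.
  mapper = function(key, value) {
    if (key == 'SFO' && !is.na(value$ARRDELAY)) {
      keyval(key, value)
    } else {
      NULL
    }
  },
  # Reducer: average the arrival delays collected for each key.
  reducer = function(key, values) {
    sumAD <- 0
    count <- 0
    for (x in values) {
      sumAD <- sumAD + x$ARRDELAY
      count <- count + 1
    }
    res <- sumAD / count
    keyval(key, res)
  })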

OUTPUT:

> hdfs.get(res)
   key     val1
1  SFO   17.44828
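
As far as I can tell, the job filters the records for SFO and averages their arrival delays. A minimal sketch of that logic in plain R, with no Hadoop or ORCH involved, is below. The sample rows are made up, and I am assuming the key is the destination airport code (the DEST column name is my own guess, not something from the Oracle sample):

```r
# Hypothetical sample rows standing in for the ontime data (values are made up).
ontime <- data.frame(
  DEST     = c("SFO", "SFO", "LAX", "SFO", "JFK"),
  ARRDELAY = c(10, NA, 5, 20, -3),
  stringsAsFactors = FALSE
)

# Map step: keep only SFO rows that have a non-missing arrival delay.
mapped <- ontime[ontime$DEST == "SFO" & !is.na(ontime$ARRDELAY), ]

# Shuffle step: group the surviving delay values by key.
grouped <- split(mapped$ARRDELAY, mapped$DEST)

# Reduce step: one mean per key, mirroring sumAD / count in the reducer.
reduced <- vapply(grouped, function(values) sum(values) / length(values), numeric(1))

result <- data.frame(key = names(reduced), val1 = unname(reduced),
                     stringsAsFactors = FALSE)
print(result)
```

This only mimics the map, group, and reduce stages locally on a data frame; it does not show how the work is shipped to the cluster, which is the part I am asking about.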

I do not understand where in this code ORCH acts as the connector. An explanation of how ORCH connects R to Hadoop, ideally with another example, would also be helpful.


Regards,
Lalitha Kristipati
Associate Software Engineer


