[R] SAS and R on multiple operating systems

Barry Rowlingson b.rowlingson at lancaster.ac.uk
Tue Apr 6 00:12:13 CEST 2010


On Mon, Apr 5, 2010 at 9:13 PM, Roger DeAngelis(xlr82sas)
<rdeangel at amgen.com> wrote:
>
> Hi,
>
> This is not meant to be critical of R, but is intended as
> a possible source for improvements to R.
> SAS needs the competition.
>
>
> I am reasonably knowledgeable about
>
> R
> SAS-(all products including IML)

> SAS has native low-level functions like dcreate (create directory), fopen,
> fread, ... that can be used on all operating systems, eliminating
> operating-system-specific commands like dir (Windows), ls (Unix) and
> ispf (3.4 on mainframes).

 ...so your reasonable knowledge of R doesn't extend to
help.search("file") and then typing "?files"? :)
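
 As a minimal sketch of what those help pages lead to, the shell
snippet below writes a short R script that uses only base R file
functions and runs it if R is installed. The file name
portable_files.R and the "demo" directory are made up for
illustration, and the SAS names in the comments are rough analogues,
not exact equivalents.

```shell
# Write a small R script; every file operation in it is base R.
cat > portable_files.R <<'EOF'
d <- file.path(tempdir(), "demo")   # build the path without hard-coding a separator
dir.create(d, showWarnings = FALSE) # roughly SAS dcreate
f <- file.path(d, "bytes.bin")
con <- file(f, "wb")                # roughly SAS fopen
writeBin(1:3, con)                  # write three integers as binary
close(con)
con <- file(f, "rb")
x <- readBin(con, integer(), n = 3) # roughly SAS fread
close(con)
stopifnot(identical(x, 1:3))        # what was read matches what was written
print(list.files(d))                # instead of dir or ls
EOF

# Run it only if R is installed on this machine.
if command -v Rscript >/dev/null 2>&1; then
    Rscript portable_files.R
fi
```

The same script should run unchanged on Windows, Linux and Mac, since
none of the calls shell out to an operating system command.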

> SAS provides one IDE to multiple operating systems simultaneously.

 Seems to be two things going on here - file system access and SAS execution:

> data "c:\tmp\class.sas7bdat"; /* create sas dataset class in windows */
>  set unx.class; /* dataset class is in the UNIX work directory - not
> mounted in windows */
> run;

 - if this is some kind of abstraction of a network file system then
it's best done at the OS level. On my Linux box, for example, I can
use smbfs to connect to my Windows network file system. Then I can
read, write, edit those files with anything on my Linux box. Having an
application punch through a network to a custom server using some
unknown (possibly insecure) protocol for a specific purpose seems a
bit pointless when you can do it at the OS level using a secure
protocol (smbfs secure? Ummm transport it over ssh of course...)


> libname xls "c:\temp\test.xls"; /* does not have to exist */
>
> data xls.class;  /* create excel file under windows */
>  set unx.class; /* remote unix system - file system not mounted in windows
> */
> run;

 - Not sure I understand what these do. Is 'unx' a special word here?
Does the .class mean something? What is 'set' setting?

> Other functions I use all the time when coding SAS.
>
> 1. Highlight a block of code and hit F1 and the code is run
>   interactively under windows.

> 3. Highlight and hit F3 and the code is run
>   interactively in unix.

Okay, what's going on here? You have a Windows box (presumably in
front of you) and a Unix box somewhere on the network. And hitting F1
runs it on the Windows box and hitting F3 magically runs it on the
Unix box? I'm guessing this isn't SAS straight out of the box; someone
has set this all up carefully (for example, how do you authenticate to
the Unix box?).

 This is actually a nice paradigm. Users are developing code on their
desktops with N=10 and then can launch it on the mainframe with
N=100000000 with a single button press.

Again, you could implement this at the operating system level with
ssh. I can do:

 ssh fnordbox R CMD BATCH analysis.R analysis.Rout

and it would run the job on my machine fnordbox. Obviously it would
need access to the .R and any data files but thanks to the miracle of
shared network file systems it can do that. It's not one-button
execution, but it's likely that any one-button execution you have is
hiding a lot of setup behind it - such as authorisation and
authentication to the remote system.
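
 To get closer to one-button execution, the ssh line above could be
wrapped in a small shell function and bound to an editor key. This is
only a sketch: run_remote and the DRY_RUN switch are hypothetical
names, and it assumes key-based ssh login and that the script sits on
a file system the remote machine can see from its login directory.

```shell
# run_remote HOST SCRIPT: run SCRIPT with R CMD BATCH on HOST over ssh.
# Hypothetical helper; assumes key-based ssh login and a shared file system.
run_remote() {
    host=$1
    script=$2
    out="${script%.R}.Rout"   # analysis.R -> analysis.Rout
    if [ -n "${DRY_RUN:-}" ]; then
        # Show the command without contacting the remote machine.
        echo "ssh $host R CMD BATCH $script $out"
    else
        ssh "$host" R CMD BATCH "$script" "$out"
    fi
}
```

With DRY_RUN set, "run_remote fnordbox analysis.R" just prints the ssh
command it would run, which is handy for checking the setup before
launching the big job.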

 Another way of doing this would be to use Rserve, a general R
execution server.

 Of course things in R, like many open-source projects, get done
either by people to scratch an itch they have themselves, or by people
paid to scratch other people's itches. I'm not sure duplicating some
of SAS's enterprisey functionality is likely to itch enough for people
to do it, when the cheaper option of learning the R (or Unixy) way of
doing it is much more flexible...

Barry

-- 
blog: http://geospaced.blogspot.com/
web: http://www.maths.lancs.ac.uk/~rowlings
web: http://www.rowlingson.com/
twitter: http://twitter.com/geospacedman
pics: http://www.flickr.com/photos/spacedman


