[R] A comment about R:

Frank E Harrell Jr f.harrell at vanderbilt.edu
Fri Jan 6 05:23:35 CET 2006

Leif Kirschenbaum wrote:
> A few thoughts about R vs SAS:
> I started learning SAS 8 years ago at IBM, I believe it was version 6.10.
> I started with R 7 months ago.
> Learning curve:
>   I think I can do everything in R after 7 months that I could do in SAS after about 4 years.
> Bugs:
>   I suffered through several SAS version changes, 7.0, 7.1, 7.2, 8.0, 9.0 (I may have misquoted some version numbers). Every version change gave me headaches, as every version release (of an expensive commercially produced software set) had bugs which upset or crashed previously working code. I had code which ran fine under Windows 2000 and terribly under Windows XP. Most bugs I found were noted by SAS, but never fixed.
>   With R I have encountered very few bugs, except for an occasional crash of R, which I usually ascribe to some bug in Windows XP.
> Help:
>   SAS help was OK. As others have mentioned, there is too much. I even had the set of printed manuals on my desk (stretching 4 feet or so), which were quite impenetrable. I had almost no support from colleagues: even within IBM the number of advanced SAS users was small.
>   With R this mailing list has been of great help: from almost every issue I copy some program and save it as an "R hint xxxx" file.
> I would say that I would appreciate a few more program examples with the help pages for some functions. For instance, "?Control" tells me about "if(cond) cons.expr  else  alt.expr", however an example of
>    if(i==1) { print("one") 
>    } else if(i==2) { print("two")
>    } else if(i>2) { print("bigger than two") }
>  at the end of that help section would have been very helpful for me a few months ago.
> Functions:
>   Writing my own functions in SAS was by use of macros, and usually depended heavily on macro substitution. Learning SAS's macro language, especially macro substitution, was very difficult and it took me years to be able to write complicated functions. Quite different situation in R. Some functions I have written by dint of copying code from other people's packages, which has been very helpful.
>   I wanted to generate arbitrary k-values (the k-multiplier of sigma for a given alpha, beta, and N to establish confidence limits around a mean for small populations). I had a table from a years old microfiche book giving values but wanted to generate my own. I had to find the correct integrals to approximate the k-values and then write two SAS macros which iterated to the desired level of tolerance to generate values. I would guess that there is either an R base function or a package which will do this for me (when I need to start generating AQL tables). Given the utility of these numbers, I was disappointed with SAS.
> Data manipulation:
>   All SAS data is in 2-dimensional datasets, which was very frustrating after having used variables, arrays, and matrices in BASIC, APL, FORTRAN, C, Pascal, and LabVIEW. SAS allows you to access only 1 row of a dataset at a time which was terribly horribly incomprehensibly frustrating. There were so many many problems I had to solve where I had to work around this SAS paradigm.
>   In R, I can access all the elements of a matrix/dataframe at once, and I can use >2 dimensional matrices. In fact, the limitations of SAS that were ingrained in me over 7.5 years have sometimes made me forget how easily I can do something in R, like being able to tell when a value in a column of a dataframe changes:
>   DF$marker <- c(FALSE, DF[-1,icol] != DF[-nrow(DF),icol])
> This was hard to do in SAS...and even after years it was sometimes buggy, keeping variable values from previous iterations of a SAS program.
>   One very nice advantage with SAS is that after data is saved in libraries, there is a GUI showing all the libraries and the datasets inside the libraries with sizes and dates. While we can save Rdata objects in an external file, the base package doesn't seem to have the same capabilities as SAS.
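
A minimal sketch of the change-detection idiom mentioned above, using a made-up data frame (the column names "lot" and "value" are illustrative assumptions, not from the original post). Comparing each row with its predecessor yields one fewer result than there are rows, so the first row is padded with FALSE to keep the marker the same length as the data frame. rle() is shown as one alternative way to get at the same information.

```r
# Hypothetical data; the column names are illustrative assumptions.
DF <- data.frame(lot   = c("A", "A", "B", "B", "B", "C"),
                 value = c(1.2, 1.3, 0.9, 1.0, 1.1, 1.4),
                 stringsAsFactors = FALSE)
icol <- "lot"

# Compare each row with the one before it. The first row has no
# predecessor, so it is padded with FALSE to give nrow(DF) values.
x <- DF[[icol]]
DF$marker <- c(FALSE, x[-1] != x[-length(x)])

# Row numbers at which a new value starts
change_rows <- which(DF$marker)

# rle() describes the same column as runs of repeated values
runs <- rle(x)

print(DF)
print(change_rows)
print(runs$lengths)
```
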
> Graphics:
>   SAS graphics were quite mediocre, and generating customized labels was cumbersome. Porting code from one Windows platform to another produced unpredictable and sometimes unworkable results.
>   It has been easier in R: I anticipate that I will be able to port R Windows code to *NIX and generate the same graphics.
> Batch commands:
>   I am working on porting some of my R code to our *NIX server to generate reports and graphs on a scheduled basis. Although a few at IBM did this with SAS, I would have found doing this fairly daunting.
> -Leif


Those are excellent points.  I'm especially glad you mentioned data 
manipulation.  I find that R is far ahead of SAS in this respect 
although most people are shocked to hear me say that.  We are doing all 
our data manipulation (merging, recoding, etc.) in R for pharmaceutical 
research.  The ability to deal with lists of data frames also helps us a 
great deal when someone sends us a clinical trial database made of 50 
SAS datasets.
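
As a rough illustration of working with a list of data frames, here is a minimal sketch. The three small data frames, their column names, and the key "id" are all hypothetical stand-ins for datasets that would in practice be imported from SAS; this is one way to do it in base R, not necessarily the workflow described above.

```r
# Hypothetical stand-ins for datasets from a clinical trial database.
trial <- list(
  demog  = data.frame(id = c(1, 2, 3), sex = c("M", "F", "F"),
                      stringsAsFactors = FALSE),
  labs   = data.frame(id = c(1, 2, 4), chol = c(180, 220, 195)),
  vitals = data.frame(id = c(2, 3, 4), sbp = c(120, 135, 128))
)

# Apply the same operation to every data frame in the list
n_rows <- sapply(trial, nrow)

# Recode a character variable into a 0/1 indicator
trial$demog$female <- as.integer(trial$demog$sex == "F")

# Merge all data frames on the shared key, keeping subjects that are
# missing from some datasets (their values become NA)
combined <- Reduce(function(x, y) merge(x, y, by = "id", all = TRUE), trial)

print(n_rows)
print(combined)
```

Keeping the datasets in one list means that checks such as row counts or variable names can be run over all of them with a single sapply() or lapply() call, however many datasets arrive.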


> -----------------------------
>  Leif Kirschenbaum, Ph.D.
>  Senior Yield Engineer
>  Reflectivity
>  leif at reflectivity.com

Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University
