[R] Inefficiency of SAS Programming

Peter Dalgaard p.dalgaard at biostat.ku.dk
Fri Feb 27 01:13:53 CET 2009


Barry Rowlingson wrote:
> 2009/2/26 Frank E Harrell Jr <f.harrell at vanderbilt.edu>:
>> If anyone wants to see a prime example of how inefficient it is to program
>> in SAS, take a look at the SAS programs provided by the US Agency for
>> Healthcare Research and Quality for risk adjusting and reporting for
>> hospital outcomes at http://www.qualityindicators.ahrq.gov/software.htm .
>>  The PSSASP3.SAS program is a prime example.  Look at how you do a vector
>> product in the SAS macro language to evaluate predictions from a logistic
>> regression model.  I estimate that using R would easily cut the programming
>> time of this set of programs by a factor of 4.
> 
>  Plenty of examples ripe for sending to www.thedailywtf.com there. Like this:
> 
>     IF &N. =  1 THEN SUB_N = 1;
>     IF &N. =  3 THEN SUB_N = 2;
>     IF &N. =  4 THEN SUB_N = 3;
>     IF &N. =  6 THEN SUB_N = 4;
>     IF &N. =  7 THEN SUB_N = 5;
>     IF &N. =  8 THEN SUB_N = 6;
>     IF &N. =  9 THEN SUB_N = 7;
>     IF &N. = 10 THEN SUB_N = 8;
>     IF &N. = 11 THEN SUB_N = 9;
>     IF &N. = 12 THEN SUB_N = 10;
>     IF &N. = 13 THEN SUB_N = 11;
>     IF &N. = 14 THEN SUB_N = 12;
>     IF &N. = 15 THEN SUB_N = 13;
>     IF &N. = 17 THEN SUB_N = 14;
>     IF &N. = 18 THEN SUB_N = 15;
>     IF &N. = 19 THEN SUB_N = 16;
> 
> Of course it's possible to write code like that in any language, it
> just looks worse when it's in ALL CAPS and written in a style that
> looks like the 1980s and onward never happened. The question is
> whether it's possible to write this better in SAS. Most of us on this
> list could write it in R in a better way.

Presumably, something like

      IF &N. =  1 THEN SUB_N = 1;
      ELSE IF &N. < 5 THEN SUB_N = &N.-1;
      ELSE IF &N. < 16 THEN SUB_N = &N.-2;
      ELSE SUB_N = &N.-3;

would work, provided that 2, 5, 16 are impossible values. The problem is 
that it actually makes the code harder to grasp, so experienced SAS 
programmers go for the dumb but readable code like the one Barry quoted.

In R, the cleanest I can think of is

subn <- match(n, setdiff(1:19, c(2,5,16)))

or maybe just

subn <- match(n, c(1, 3:4, 6:15, 17:19))

although

subn <- factor(n, levels = c(1, 3:4, 6:15, 17:19))

might be what is really wanted, since the integer codes of that factor
are the same SUB_N values.
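
For anyone who wants to check that these variants agree, here is a rough
sketch. It assumes n only ranges over 1:19, and subn_arith() is a
hypothetical helper of mine that merely transcribes the arithmetic SAS
rewrite into R; it is not part of the AHRQ code.

```r
# Valid values of N, taken from the quoted SAS IF statements.
valid <- c(1, 3:4, 6:15, 17:19)

# Hypothetical helper: the piecewise arithmetic rule from the SAS rewrite.
subn_arith <- function(n) {
  ifelse(n == 1, 1, ifelse(n < 5, n - 1, ifelse(n < 16, n - 2, n - 3)))
}

n <- 1:19
by_match   <- match(n, valid)
by_setdiff <- match(n, setdiff(1:19, c(2, 5, 16)))
by_factor  <- as.integer(factor(n, levels = valid))
by_arith   <- subn_arith(n)

# The three lookups agree everywhere, including NA for 2, 5 and 16.
stopifnot(identical(by_match, by_setdiff),
          identical(by_match, by_factor))

# The arithmetic rule agrees on the valid values ...
ok <- n %in% valid
stopifnot(all(by_arith[ok] == by_match[ok]))

# ... but returns a number for the impossible ones instead of NA.
data.frame(n = n, match = by_match, arith = by_arith)[!ok, ]
```

The practical difference is in the last line: match() and factor() give NA
for 2, 5 and 16, whereas the arithmetic version silently maps them onto
values already used by their neighbours (2 ends up with the same SUB_N as 1).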

-- 
    O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907



