[R] Can R replicate this data manipulation in SAS?

Douglas Bates bates at stat.wisc.edu
Fri Apr 22 18:50:57 CEST 2011


On Thu, Apr 21, 2011 at 5:34 PM, peter dalgaard <pdalgd at gmail.com> wrote:
>
> On Apr 21, 2011, at 16:00 , Bert Gunter wrote:
>
>> Folks:
>>
>> It is perhaps worth noting that this is probably a Type III error: right
>> answer to the wrong question. The right question would be: what data
>> structures and analysis strategy are appropriate in R? As usual, different
>> language architectures mean that different paradigms should be used to best
>> fit a language's strengths and weaknesses. Direct translations do not
>> necessarily do this.
>
> Hum, there is a point, though: If you take the crude translation approach, you will soon realize that there is very little that SAS (or SPSS, or...) can do that you literally can't do in R.

What about reading a deck of punched cards with the cards statement in
SAS?  How do you propose to do that in R?
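
Taken semi-literally, the cards statement reads data lines embedded in the
program itself, and a rough R analogue can be sketched as below. This is
only an illustration: the data values and column names are made up, and it
assumes an R version in which read.table() accepts a text= argument.

```r
# Inline "card images": data lines embedded in the script itself.
# The values and column names are hypothetical, for illustration only.
cards <- "
1 23 4.5
2 31 5.1
3 27 3.9
"

# read.table() parses the string as if it were the contents of a file;
# blank lines (such as the leading one) are skipped by default.
deck <- read.table(text = cards, col.names = c("id", "age", "score"))
print(deck)
```

For genuinely fixed-column card images, read.fwf() with a widths= vector
would be the closer match.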

> It is often the case that there is a much neater and better-structured approach in R, but the flip side is that there are cases where the neat solution is hard to find, and maybe some cases where it doesn't really exist (e.g. not everything can be vectorized). This is the sort of thing that in some circles gives R a reputation for being poorly suited for data handling, compared to the DATA step in SAS. Do notice the circular logic that occurs when defining "typical statistical task" as "something you can do in SAS", though.
>
> (One example is "last observation carried forward", a rather dubious technique for filling in missing observations in longitudinal studies, which probably directly stems from the RETAIN directive in SAS.
>
> In R, you may find yourself doing something like
>
>  x[is.na(x)] <- x[!is.na(x)][cumsum(!is.na(x))[is.na(x)]]
>
> which isn't even completely failsafe. However, you'll get the result soon enough with
>
>  for (i in seq_along(x)) if (is.na(x[i])) x[i] <- t else t <- x[i]
>
> and this time, you can actually read the code.
>
> Of course, approx() will do the trick much more swiftly than either of the above.)
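
The three approaches mentioned above (index arithmetic, an explicit loop,
and approx()) can be sketched as small functions. The function names and
the sample vector are made up for illustration. The approx() arguments
(method = "constant", f = 0, rule = c(1, 2)) are one assumed way to get
last-observation-carried-forward behaviour, not necessarily what the post
had in mind.

```r
# Last observation carried forward (LOCF), three ways.
x <- c(1, NA, NA, 4, NA, 6, NA)

# 1. Index arithmetic. cumsum(ok) counts the non-missing values seen so
#    far, so it indexes the most recent observed value. A leading NA
#    yields index 0, which drops an element and misaligns the
#    replacement: this is the "not completely failsafe" case.
locf_index <- function(x) {
  ok <- !is.na(x)
  x[!ok] <- x[ok][cumsum(ok)[!ok]]
  x
}

# 2. Explicit loop. 'last' starts as NA (an assumption made here), so
#    leading NAs simply stay NA.
locf_loop <- function(x) {
  last <- NA
  for (i in seq_along(x)) {
    if (is.na(x[i])) x[i] <- last else last <- x[i]
  }
  x
}

# 3. approx() as a left-continuous step function. approx() ignores the
#    NA pairs; f = 0 takes the left-hand value within each interval, and
#    rule = c(1, 2) keeps leading NAs while extending the last observed
#    value to the right.
locf_approx <- function(x) {
  idx <- seq_along(x)
  approx(idx, x, xout = idx, method = "constant", f = 0, rule = c(1, 2))$y
}

# For this x, each call should return 1 1 1 4 4 6 6.
print(locf_index(x))
print(locf_loop(x))
print(locf_approx(x))
```

The three versions differ only in how they treat NAs at the start of the
vector, so that is the case worth checking before picking one.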
>
> --
> Peter Dalgaard
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
>


