Frank E Harrell Jr
f.harrell at vanderbilt.edu
Thu Jan 17 14:33:35 CET 2008
Walter Paczkowski wrote:
> Good morning,
>
> I use SAS and R/S-Plus as my primary tools, so I have a lot of experience with these programs. Far and away, SAS is superior for handling "messy" datasets as well as very large ones. I work at times with datasets of hundreds of thousands (and on occasion, millions) of records. SAS, and especially PROC SQL, is invaluable for this. But once I get the data down to a size manageable for R/S-Plus, I switch to these tools for the programming and graphics. This seems to work great.
>
> Walt Paczkowski
> Data Analytics Corp.
I used SAS for 23 years and have now used R/S-Plus for 17. SAS is
effective for large datasets (in my work > 500,000 subjects), but apart
from that, R is far superior to SAS for data management and manipulation.
Just four of the reasons are that you can
- merge data frames multiple ways and compare the results
- deal with arrays (lists) of datasets using high-level operators
- easily do complex calculations on serial data such as find the highest
blood pressure per subject that is measured before something else is
measured
- sense the type of a variable (character, factor, date, discrete
numeric, continuous numeric, etc.) while analyzing it, and tailor the
analysis to the type of variable
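
As a rough illustration of the third point (a minimal sketch, not code from the original message), the following base R snippet finds the highest blood pressure per subject among measurements taken before some other event. The data frames `bp` and `event`, their column names, and all values are hypothetical and made up for the example.

```r
# Hypothetical long-format data: one row per blood pressure measurement.
bp <- data.frame(
  id  = c(1, 1, 1, 2, 2, 3),
  day = c(1, 5, 9, 2, 4, 7),
  sbp = c(120, 150, 160, 135, 128, 142)
)

# Hypothetical event table: the day on which "something else" was measured.
event <- data.frame(
  id        = c(1, 2, 3),
  event_day = c(6, 5, 3)
)

# Attach each subject's event day to every one of that subject's measurements.
both <- merge(bp, event, by = "id")

# Keep only the measurements taken strictly before the event.
before <- both[both$day < both$event_day, ]

# Highest pressure per subject among the remaining rows.
max_before <- aggregate(sbp ~ id, data = before, FUN = max)

# Merge back with all.x = TRUE so that subjects with no measurement
# before the event are kept, with NA as their result.
result <- merge(event["id"], max_before, by = "id", all.x = TRUE)
print(result)
```

In this made-up data, subject 3 has no measurement before the event, so the final merge reports NA for that subject rather than silently dropping the row.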
http://biostat.mc.vanderbilt.edu/twiki/pub/Main/RS/sintro.pdf has a
large section on data manipulation in S.
Frank
>>
>> I wonder if those who complain about SAS as a programming environment have
>> discovered SAS/IML, which provides a programming environment akin to Matlab
>> and is more than capable (at least for problems that can be treated
>> with a matrix-like approach). As someone who uses both SAS and R, I find
>> graphical output so much easier in R, but for handling large 'messy'
>> datasets SAS wins hands down...
>> Cheers
>> Rob
>>
>>> SAS has no facilities for date arithmetic and no easy way to
>>> build it yourself. In fact, that's the biggest problem with
>>> SAS: it stinks as a programming environment, so it's always
>>> much more difficult than it should be to do something new.
>>> As soon as you get away from the canned procs and have to
>>> write something of your own, SAS falls down.
>>>
>>> I don't know enough about SPSS to comment.
>>> --
>>> Jeff
