[R] A comment about R:
Robert W. Baer, Ph.D.
rbaer at atsu.edu
Thu Jan 5 08:20:41 CET 2006
>> On Wed, 4 Jan 2006, Roger Bivand wrote:
>> > Could I ask for comments on:
>> >
>> > source(url("http://spatial.nhh.no/R/etc/capabilities.R"), echo=TRUE)
>> >
>> > as a reproduction of the Stata capabilities session? Both the t test and
>> > the chi-square from our side point up oddities. I didn't succeed on
>> > putting fit lines on a grouped xyplot, so backed out to base graphics.
>> > This could be Swoven, possibly using the RweaveHTML driver.
>> >
>>
Excellent! Although I will point out that the Stata summarize command is a
little different than the R summary command. The summarize command is a
little more like:
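summarize <- function(x) {
  # Rough equivalent of Stata's summarize for a single numeric vector
  n  <- length(x)
  mn <- mean(x)
  s  <- sd(x)     # local names chosen so they do not mask sd(), min(), max()
  lo <- min(x)
  hi <- max(x)
  cat('Obs \t Average \t Std. Dev. \t Min \t Max \n',
      n, '\t', mn, '\t', s, '\t', lo, '\t', hi, '\n')
}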
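For what it's worth, a variant that returns the numbers instead of printing them can be handier for further work. This is only a sketch: the name summarize_df and its na.rm argument are my own assumptions, not anything taken from Stata or from base R.

```r
# Hypothetical helper: a Stata-like one-row summary returned as a data frame.
# The name summarize_df and the na.rm argument are illustrative assumptions.
summarize_df <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]   # optionally drop missing values first
  data.frame(
    Obs     = length(x),
    Mean    = mean(x),
    Std.Dev = sd(x),
    Min     = min(x),
    Max     = max(x)
  )
}

x <- c(2, 4, 4, 4, 5, 5, 7, 9)
summarize_df(x)   # one row: Obs, Mean, Std.Dev, Min, Max
summary(x)        # base R's quartile-based summary, for comparison
```

Because the result is an ordinary data frame, the individual values can be picked out by name (for example summarize_df(x)$Mean) rather than read off the console.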
As a user of statistics rather than a statistician, I have to agree with the
original author, whose premise was that different statistical packages have
different strengths. Reading between the lines, I think the main basis for
his comments on R was that he knew it mostly from talking to friends.
Any statistical tool for those of us in the back rows is as easy as our
mentors make it. At my institution there is a paucity of good mentors, and I
have found the learning curve equally steep for Stata 7, for which I have
many, many volumes of documentation, and R, for which I have greatly
benefited from several of the terrific contributed documents and books
already mentioned.
The original article was about SAS, Stata, and SPSS strengths for carrying
out 'traditional statistics'. What are R's strengths? Too numerous to
mention in the hands of the right users. However, I would point to things
like the tools at the Bioconductor site as a broad illustration of the
nearly infinite flexibility and extensibility of R for specialized
statistical tasks. Does this mean that R is a poor tool to choose for the
basic and traditional procedures? Hardly! Well-written documentation like
John Fox's car, Peter Dalgaard's ISwR, and John Verzani's Simple R
contributed documentation puts introductory R statistical procedures within
easy grasp of users. I have found that non-statistics students rapidly
catch on with 'problem-specific' guidance once they overcome the lack of a
GUI (R Commander is certainly a solution there). As the number of R
mentors grows to rival that of SAS, Stata, and SPSS, the everyday tasks
might even appear easier to new initiates than the corresponding syntax and
thought processes in the other programs.
So, what are R's major weaknesses? I do not think they are statistical.
Rather, the difficulty is finding 'mentors' who have gone before to do the
type of analysis that you (the end user) wish to do, and who have graciously
left behind a paper trail of how to syntactically address a specific
statistical task. There is a huge amount out there, but it is hard to find
at the beginning. [BTW: This list is of course a tremendous resource, and
why should we not read the posting guide out of simple respect for those who
have given us such a great resource? I don't like getting flamed either,
but darn it, sometimes I deserve it ;-).]
Finally, this thread has made me think back 3-4 years to when I first
discovered R. The thing that frustrated me the most in the early weeks was
getting data into R. It took me no time to learn to generate data with all
kinds of distributions, no time to discover 'built-in' datasets from the
data() function, or to enter data a number at a time with the c() function.
BUT HOW was I to get the datasets (spreadsheet, database) from my laboratory
into R? This somehow has been much easier to figure out in the other (often
GUI) statistical environments I have used. [Of course, I finally discovered
the documentation for the foreign package and later learned about RODBC, and
I was blown away by the flexibility available.]
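For anyone at the same stage, the simplest route is usually to save the spreadsheet as a CSV file and read it with read.csv(). A minimal sketch follows; the column names and values are made up for illustration, and a temporary file stands in for the exported spreadsheet.

```r
# Stand-in for a spreadsheet exported as CSV (contents are invented).
csv_file <- tempfile(fileext = ".csv")
writeLines(c("subject,group,weight",
             "1,control,70.5",
             "2,treated,68.2",
             "3,treated,72.9"),
           csv_file)

lab <- read.csv(csv_file)   # header = TRUE is the default for read.csv()
str(lab)                    # check the column types before anything else
mean(lab$weight)

# Other sources, not run here (the file name, DSN, and table are placeholders):
#   library(foreign); d <- read.dta("mydata.dta")      # Stata data file
#   library(RODBC);   ch <- odbcConnect("my_dsn")      # ODBC data source
#   d <- sqlQuery(ch, "SELECT * FROM some_table"); odbcClose(ch)

unlink(csv_file)            # remove the temporary file
```

Running str() straight after the import is worth the habit, since a stray text entry in a numeric spreadsheet column will otherwise go unnoticed until an analysis fails.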
Well, just the thoughts of one end-user type...
Rob