[R] R vs. other software? (was: Software for Epidmiological, Longitudinal Data)
spencer.graves at pdf.com
Mon Jul 10 04:03:00 CEST 2006
There have been numerous discussions on this listserve of the
comparative advantages of R vs. SAS, for example, and various benchmark
studies have been published. If you haven't already tried Google, I
encourage you to do so.
PRIMARY STRENGTHS OF R
From my perspective, R is the platform of choice for new statistical
algorithm development for a rapidly increasing number of people. The
open source nature of R provides huge advantages for learning and for
new algorithm development. If I don't understand something, I can walk
through the code line by line until I figure it out. That's especially
easy with the 'debug' facility in R.
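As a minimal sketch of that workflow (the function 'running_mean' and its input are hypothetical, made up for illustration), flagging a function with debug() makes the next call to it stop at each line so that intermediate values can be inspected:

```r
# Hypothetical function to step through; not taken from any package.
running_mean <- function(x) {
  total <- cumsum(x)   # running total of the inputs
  n <- seq_along(x)    # 1, 2, 3, ...
  total / n            # running mean
}

debug(running_mean)                 # flag the function for stepping
flagged <- isdebugged(running_mean) # TRUE while the flag is set

if (interactive()) {
  # At the Browse prompt: 'n' runs the next line, 'c' continues to the
  # end, 'Q' quits; typing a variable name prints its current value.
  running_mean(c(2, 4, 6))
}

undebug(running_mean)               # remove the flag again
result <- running_mean(c(2, 4, 6))  # runs normally: 2 3 4
```

The interactive call is guarded so that the script also runs unattended; in a normal session one would simply call the flagged function.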
Similarly, the ease with which I can experiment with a modification
of an existing algorithm depends on the availability of source code for
something similar. With commercial software, that obstacle is often
insurmountable. With open source, this kind of experimentation is
trivial for anyone who is not intimidated by command-driven software.
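A small illustration of this kind of experiment, assuming nothing beyond base R (the variant 'pop_sd' and the data are hypothetical): typing a function's name without parentheses prints its source, which can then be used as the starting point for a modified version.

```r
# Printing a function shows its source code.
print(stats::sd)

# Hypothetical variant written after reading how sd() builds on var():
# a population standard deviation that divides by n instead of n - 1.
pop_sd <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  n <- length(x)
  sqrt(var(x) * (n - 1) / n)
}

x <- c(2, 4, 4, 4, 5, 5, 7, 9)
sample_sd <- sd(x)          # usual sample standard deviation
population_sd <- pop_sd(x)  # 2 for this data
```

With closed-source software, the first step (reading the existing implementation) is usually not possible.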
The popularity of R can be judged in part from the fact that it is
available from 58 mirrors in 24 countries; those were the numbers I
got when I counted a couple of months ago. At that time, the mirror I
checked offered free instant access to 724 contributed packages beyond
the base distribution, and this does not include much of
'Bioconductor', which specializes in microarray data. There are
occasional differences between mirrors, but those are temporary.
At one time, the market dominance of both SAS and IBM was sustained
by the perception that 'nobody ever got fired' for using them. Those
days are history now for SAS as well as IBM, even with the Food and Drug
Administration (FDA), where for many years, SAS was the de facto
standard in many pharmaceutical companies for their applications for
I have no data on the use of R in articles submitted to refereed
journals, but I'm confident it is growing.
Increasingly, when I consider reading an article or buying a book, I
look for the availability of a companion R package, because it can make
a huge difference in how easily and thoroughly I can absorb the material.
The core R distribution seems to be as solid as anything on the
market today. Contributed packages run from rock solid to highly buggy.
GRAPHICAL USER INTERFACE
R does not ship with a GUI, but several are available; see
"www.sciviews.org/_rgui". I use XEmacs (ESS), mentioned on that web
site, but it's primarily a command-line editor. More GUI-type features
are available from other open source software such as SciViews and JGR.
The level of support available from this listserve beats any
technical support I've ever gotten for commercial software. Some
people's questions never get answered, but that's true with technical
support anywhere. Other questions generate a feeding frenzy of replies.
This listserve has a posting guide
(www.R-project.org/posting-guide.html); I believe that posts that more
closely match those guidelines are more likely to be greeted with
multiple replies than silence. R-help is on-line, 24-7. If you really
want an answer and don't get one in 24 hours, review the posting guide,
and think about how you can make your question clearer, perhaps with
a different subject line and/or a simple (or simpler) self-contained
example.
The "R Site Search" is also great, and now there's an R Wiki.
For two other previous comments related to this, see the following:
Hope this helps.
Andrea Meyer wrote:
> We are a team working on a prospective psychological study. The study
> design is based on assessing data of three generations of humans over a
> long time period, wherein epidemiological as well as biological data
> will be assessed. Sample sizes will range from about 100 to several
> thousand depending on the research question.
> Currently we are looking for an appropriate statistical package. Here are
> some features that the software should have:
> - strong in the analysis of epidemiological and longitudinal data
> - platform independent (should run under different operating systems
> like Windows, Mac OS, Unix)
> - Ease of use for non-statisticians (i.e. user-friendly GUI)
> - High acceptance by scientific journals, by the FDA
> - Importance relative to other packages with respect to the number of
> users, the number of publications in which the software is used, the
> market share etc. (including the recent development of these indices!)
> As we had some problems in finding information concerning these items we
> would like to ask you where we might find it (if at all) and why R is
> presumably the best competitor?
> Thanks in advance for any suggestions concerning this!