[R] R vs. other software? (was: Software for Epidemiological, Longitudinal Data)

Spencer Graves spencer.graves at pdf.com
Mon Jul 10 04:03:00 CEST 2006

	  There have been numerous discussions on this listserve of the 
comparative advantages of R vs. SAS, for example, and various benchmark 
studies have been published.  If you haven't already tried Google, I 
encourage you to do so.


	  From my perspective, R is the platform of choice for new statistical 
algorithm development for a rapidly increasing number of people.  The 
open source nature of R provides huge advantages for learning and for 
new algorithm development.  If I don't understand something, I can walk 
through the code line by line until I figure it out.  That's especially 
easy with the 'debug' facility in R.
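
	  As a minimal sketch of that workflow, the following assumes a small 
helper, 'trimmed_mean', invented here purely for illustration;  it is 
not part of any package.  Only 'debug', 'isdebugged' and 'undebug' are 
base R.

```r
# Hypothetical helper, invented for this illustration only.
trimmed_mean <- function(x, trim = 0.1) {
  x <- sort(x)
  k <- floor(length(x) * trim)   # observations dropped from each end
  mean(x[(k + 1):(length(x) - k)])
}

# Typing a function's name without parentheses prints its source.
trimmed_mean

# Flag the function for stepping.  The next call to it opens the
# browser, where 'n' runs the next line and 'Q' quits.
debug(trimmed_mean)
isdebugged(trimmed_mean)   # reports whether the flag is set

# In an interactive session, this call would now stop at the first line:
# trimmed_mean(c(1, 2, 3, 4, 100), trim = 0.2)

undebug(trimmed_mean)      # clear the flag when finished
```

	  The same approach should work for functions in contributed 
packages, since their R code can be printed and flagged in the same way.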

	  Similarly, the ease with which I can experiment with a modification 
of an existing algorithm depends on the availability of source code for 
something similar.  With commercial software, that obstacle is often 
insurmountable.  With open source, this kind of experimentation is 
trivial for anyone who is not intimidated by command-driven software.


	  The popularity of R can be judged in part from the fact that it is 
available from 58 mirrors in 24 countries;  those were the numbers I 
got when I counted a couple of months ago.  At that time, the mirror I 
checked offered free instant access to 724 contributed packages beyond 
the base distribution, and this does not include much of 
'Bioconductor', which specializes in microarray data.  There are 
occasional differences between mirrors, but those are temporary.

	  At one time, the market dominance of both SAS and IBM was sustained 
by the perception that 'nobody ever got fired' for using them.  Those 
days are history now for SAS as well as IBM, even for submissions to the 
Food and Drug Administration (FDA), where for many years SAS was the 
de facto standard among pharmaceutical companies applying for 
regulatory approval.

	  I have no data on the use of R in articles submitted to refereed 
journals, but I'm confident it is growing.

	  Increasingly, when I consider reading an article or buying a book, I 
look for the availability of a companion R package, because it can make 
a huge difference in how easily and thoroughly I can absorb the material.

	  The core R distribution seems to be as solid as anything on the 
market today.  Contributed packages run from rock solid to highly buggy.


	  R does not ship with a GUI, but several are available;  see 
"www.sciviews.org/_rgui".  I use XEmacs (ESS), mentioned on that web 
site, but it's primarily a command-line editor.  More GUI-type features 
are available from other open source software such as SciViews and JGR.


	  The level of support available from this listserve beats any 
technical support I've ever gotten for commercial software.  Some 
people's questions never get answered, but that's true with technical 
support anywhere.  Other questions generate a feeding frenzy of replies.  
This listserve has a posting guide 
(www.R-project.org/posting-guide.html);  I believe that posts that more 
closely match those guidelines are more likely to be greeted with 
multiple replies than with silence.  R-help is on-line, 24-7.  If you 
really want an answer and don't get one in 24 hours, review the posting 
guide, and think about how you can make your question clearer, perhaps 
with a different subject line and / or a simple (or simpler) 
self-contained example.

	  The "R Site Search" is also great, and now there's an R Wiki.

	  For two other previous comments related to this, see the following:



	  Hope this helps.
	  Spencer Graves

Andrea Meyer wrote:
> Hello
> We are a team working on a prospective psychological study. The study 
> design is based on assessing data of three generations of humans over a 
> long time period, wherein epidemiological as well as biological data 
> will be assessed. Sample sizes will range from about 100 to several 
> thousand depending on the research question.
> Currently we are looking for an appropriate statistical package. Here are 
> some features that the software should have:
> - strong in the analysis of epidemiological and longitudinal data
> - platform independent (should run under different operating systems 
> like Windows, Mac OS, Unix)
> - Ease of use for non-statisticians (i.e. user-friendly GUI)
> - High acceptance by scientific journals, by the FDA
> - Importance relative to other packages with respect to the number of 
> users, the number of publications in which the software is used, the 
> market share etc.  (including the recent development of these indices!) 
> As we had some problems in finding information concerning these items we 
> would like to ask you where we might find it (if at all) and why R is 
> presumably the best competitor?
> Thanks in advance for any suggestions concerning this!
