[R] Reasons to Use R

Bi-Info (http://members.home.nl/bi-info) bi-info at home.nl
Tue Apr 10 00:23:08 CEST 2007


Licensing is a big issue in software. I prefer a simple license, one
that lets me work on another PC without paying a lot of money. R
produces quite good results and is widely used. That makes it a
statistical package that I want.
The other thing is that working with large datasets requires "some"
effort by software makers to get it working. I doubt whether R is
capable of working consistently with large datasets. That is an issue,
I think. I have done some comparisons between SPSS and R, and R seems
to perform all right, so I can do computations with it. Nonetheless,
I think its data handling is not quite as good as that of SAS.

When I started doing statistics there were about three packages: SPSS,
SAS and BMDP (at least, these were the ones available). On a PC you
were required to use SPSS.
Nowadays there are hundreds, some with excellent database facilities,
others that can compute the newest statistical tests, or an exotic one.
I haven't got a clue how to work with the new database facilities.
dBase was my only database education and everything has changed. So I
cannot say whether R is capable of working with large datasets in
relation to databases. I really don't know. The only thing I know is
that if I compute a ChiSq, it works on a relatively large dataset (not
Fisher tests, by the way). The same goes for a likelihood procedure, or
tabulations including non-parametrics, or factor analysis. But
databases are an issue, I've been told by a guy who works with R. SAS
was a better option, he told me.
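
One plausible reason a chi-square test copes with a large dataset is
that it only needs the cell counts of the contingency table, so the raw
rows can be read one at a time and never held in memory together. A
minimal sketch of that idea follows, in Python rather than R and not
R's own implementation; the function name streaming_chi_square, the
column names "group" and "outcome", and the sample data are all
hypothetical.

```python
import csv
import io
from collections import Counter


def streaming_chi_square(pairs):
    """Pearson chi-square statistic from an iterable of (row, col) labels.

    Only the cell counts are kept in memory, never the raw records.
    """
    cells = Counter()
    for r, c in pairs:
        cells[(r, c)] += 1

    # Marginal totals derived from the cell counts.
    row_totals = Counter()
    col_totals = Counter()
    for (r, c), n in cells.items():
        row_totals[r] += n
        col_totals[c] += n
    total = sum(cells.values())

    stat = 0.0
    for r in row_totals:
        for c in col_totals:
            expected = row_totals[r] * col_totals[c] / total
            observed = cells.get((r, c), 0)
            stat += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical data standing in for a large file on disk; csv.DictReader
# yields one record at a time, so the same call works on an open file.
sample = io.StringIO(
    "group,outcome\n"
    + "a,yes\n" * 30 + "a,no\n" * 10
    + "b,yes\n" * 10 + "b,no\n" * 30
)
reader = csv.DictReader(sample)
stat, dof = streaming_chi_square(
    (row["group"], row["outcome"]) for row in reader
)
print(stat, dof)
```

Procedures that need all observations at once (Fisher's exact test on
large tables, for example) do not reduce to a small table of counts in
the same way, which may be why they behave differently.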

So what's the big deal about S using files, rather than memory as R
does? I don't get the point. Isn't there enough swap space for S? (Who
cares anyway: it works, doesn't it?) Or are there any problems with S
and large datasets? I don't get it. You use them, Greg, so you might
discuss that issue.

Wilfred

The licences keep changing: some allowed home use in the past but don't
now, and for some you can get an additional licence for home at a
discounted price. For some it depends on the type of licence you have
at work (currently our SAS licence is such that the 3 people in my
group can all have it installed, but at most 1 can be using it at any
one time; how does that affect installing/using it at home?).  I may be
able to install some of the software at home also, but for most of them
I have given up trying to figure out the legality of it, and so I have
not installed them at home to be on the safe side.

Some of the doctors I work with who are also affiliated with the local
university have mentioned that they can get a discounted academic
version of SAS and could use that, but the academic licence that one
showed me (probably not the most recent) said, in my interpretation (I
am not a lawyer), that if they published the results without paying a
licence upgrade fee, they would be violating the licence (the academic
version was intended for teaching only).

The R licence on the other hand is pretty clear that I can install it
and use it pretty much anywhere I want.

You are right in correcting me: R is not the only package that can be
used on multiple computers.  I do think it is the most straightforward
of the good ones.

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow op intermountainmail.org
(801) 408-8111



> -----Original Message-----
> From: Gabor Grothendieck [mailto:ggrothendieck op gmail.com] 
> Sent: Monday, April 09, 2007 10:44 AM
> To: Greg Snow
> Cc: Lorenzo Isella; r-help op stat.math.ethz.ch
> Subject: Re: [R] Reasons to Use R
> 
> I might be wrong about this but I thought that the licenses 
> for at least some of the commercial packages do let you make 
> a copy of the one you have at work for home use.
> 
> On 4/9/07, Greg Snow <Greg.Snow op intermountainmail.org> wrote:
> > Here are a couple more thoughts to add to what you have already
> > received:
> >
> > You mentioned that price is not at issue, but there are other costs
> > than money that you may want to look at.  On my work machine I have R,
> > S-PLUS, SAS, SPSS, and a couple of other stats programs; on my laptop
> > and home computers I have R installed.  So, if a deadline is looming
> > and I am working on a project mainly in R, it is easy to work on it on
> > the bus or at home (or in a boring meeting); the same does not work
> > for a SAS or SPSS project (Hmm, thinking about this now, maybe I need
> > to do less in R :-).
> >
> > R and S-PLUS are very flexible/customizable: if you have a certain
> > plot that you make often, you can write your own function/script to do
> > it automatically; most other programs will give you their standard,
> > then you have to modify it to meet your specifications.  With Sweave
> > (and the odf and html extensions) you can automate whole reports, very
> > useful for things that you do month after month.
> >
> > And what I think is the biggest advantage of R and S-PLUS is that they
> > strongly encourage you to think about your data.  Other programs (at
> > least those that I am familiar with) tend to have 1 specific way of
> > treating your data, and expect you to modify your data to fit that
> > program's model.  These models can be overrestrictive (force you to
> > restructure your data to fit their model) or underrestrictive (allow
> > things that should really be separate data objects to be combined into
> > a single "dataset") and sometimes both.  S on the other hand allows
> > many different ways to store and work with your data, and as you
> > analyze the data, different branches of new analysis open up depending
> > on early results rather than just getting stock output for a
> > procedure.  If all you want is a black box where data goes in one end
> > and a specific answer comes out the other, then most programs will
> > work; but if you want to really understand what your data has to tell
> > you, then R/S-PLUS makes this easy and natural.
> >
> > Hope this helps,
> >
> >
> > --
> > Gregory (Greg) L. Snow Ph.D.
> > Statistical Data Center
> > Intermountain Healthcare
> > greg.snow op intermountainmail.org
> > (801) 408-8111
> >
> >
> >
> > > -----Original Message-----
> > > From: r-help-bounces op stat.math.ethz.ch 
> > > [mailto:r-help-bounces op stat.math.ethz.ch] On Behalf Of Lorenzo 
> > > Isella
> > > Sent: Thursday, April 05, 2007 9:02 AM
> > > To: r-help op stat.math.ethz.ch
> > > Subject: [R] Reasons to Use R
> > >
> > > Dear All,
> > > The institute I work for is organizing an internal workshop for High
> > > Performance Computing (HPC).
> > > I am planning to attend it and talk a bit about fluid dynamics, but
> > > there is also quite a lot of interest devoted to data
> > > post-processing and management of huge data sets.
> > > A lot of people are interested in image processing/pattern
> > > recognition and statistics applied to geography/ecology, but I would
> > > like not to post this on too many lists.
> > > The final aim of the workshop is understanding hardware
> > > requirements and drafting a list of the equipment we would like to
> > > buy. I think this could be the venue to talk about R as well.
> > > Therefore, even if it is not exactly a typical mailing list
> > > question, I would like to have suggestions about where to collect
> > > info about:
> > > (1) Institutions (not only academia) using R
> > > (2) Hardware requirements, possibly benchmarks
> > > (3) R & clusters, R & multiple CPU machines, R performance on
> > >     different hardware
> > > (4) Finally, a list of the advantages of using R over commercial
> > >     statistical packages. The money-saving in itself is not a good
> > >     enough reason and some people are scared by the lack of
> > >     professional support, though this mailing list is simply
> > >     wonderful.
> > >
> > > Kind Regards
> > >
> > > Lorenzo Isella
> > >
> > > ______________________________________________
> > > R-help op stat.math.ethz.ch mailing list 
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> > >
> >
> >
>


