[R] Using R for Production - Discussion
baptiste.auguie at googlemail.com
Tue Nov 2 07:36:23 CET 2010
Regarding your '10 commandments' in Q3, you may find useful tips in
"The R Inferno" by Pat Burns.
On 2 November 2010 05:04, Santosh Srinivas <santosh.srinivas at gmail.com> wrote:
> Hello Group,
> This is an open-ended question.
> I have been quite fascinated by the things I can do, and the control I have
> over my work, since I started using R.
> So far I have mainly used it for analytical work on my desktop.
> My experience has been quite good, and most issues that I need to
> investigate and solve are typical items related to data errors, format
> corruption, etc., not necessarily to R itself.
> Complementing this with Python gives enough firepower to do lots of
> production (analytical related activities) on the cloud (from my research I
> see that every innovative technology provider seems to support Python ...
> google, amazon, etc).
> Questions on using R for production activities:
> Q1) Does anyone have experience of using R scripts etc. for production
> related activities? E.g. serving a computational / analytical /
> simulation environment from a web portal, with the analytical processing done
> in R.
> I've seen that most useful things for normal (not rocket science) business
> (80-20 rule) can be done just as well in R in comparison with tools like
> SAS, Matlab, etc.
> Q2) I haven't tried the processing routines on much larger data sets,
> assuming "size" is not a constraint nowadays.
> I know that I should try it out ... but any forewarnings would help. Is
> something that works for my "desktop" dataset just as likely
> to work when scaled up to a "cloud dataset",
> assuming that I clear out unused objects, avoid
> infinite loops, etc.?
> i.e. is there any problem with the "fundamental architecture of R itself"?
> (like press articles often say)
> Q3) There are big fans of the SAS, Matlab, Mathworks environments out there
> ... does anyone have a comparison of how R fares?
> From my experience R is quite neat and low level ... so overheads should be
> quite low.
> Most slowness comes due to lack of knowledge (see my code ... like using the
> wrong structures, functions, loops, etc.) rather than something wrong with
> the way R itself is.
> Perhaps there is no "commercial" focus on fixing performance-related issues,
> but my guess is that it is just a matter of time until the community evolves
> the language to score higher on that too,
> and perhaps develops documentation to help challenged users with
> "performance tips" (the ten commandments type).
> Q4) You must have heard about the latest comment from James Goodnight of SAS
> ... "We haven't noticed that a lot. Most of our companies need industrial
> strength software that has been tested, put through every possible scenario
> or failure to make sure everything works correctly."
> My "gut" is that random passionate geeks (playing part-time) do better
> testing than an army of professionals ... (but I've no empirical evidence).
> I am not taking a side here (although I appreciate those who do!) .. but
> am looking for objective reasoning.