[R] Excel
Greg Snow
Greg.Snow at intermountainmail.org
Thu Aug 30 21:13:51 CEST 2007
Earlier this week I was doing some work at our house and since my wife was at the dentist office our 3 year old son was "helping" me. He really wanted to use the hammer, so I showed him where to tap and he was excited to tap (not doing much good, but also not doing any damage). He liked that so much that he started to look for other things that he could use the hammer on, some were benign, others made me jump in and stop him before he did major damage.
What does this have to do with Excel and R? Well I see Excel as being very much like my hammer and different users like the different users of the hammer.
There are some carpenters who can use a hammer along with other tools to make things of pure beauty. Simillarly, I expect there are people who can use excel/spreadsheets along with other tools to make useful and beautiful things (I expect that you (Erich) may be one of those).
My use of the hammer is far less than that of the experts, I can about 9 times out of 10 survive the use without blood and major bruising, fix or make something that works, but is far from a work of art. My use of spreadsheets is similar, I do use them for some specific purposes (holding the bus schedule on my pda, amortization tables), but I know better than to use them where I would cause more harm than good.
I fear that many users of Excel and other spreadsheets are much like my toddler, they have a tool and they want to use it, ignorant of whether they will cause more damage than good. This is part of why I put in the reference to Dr. Burns page, he does not say get rid of spreadsheet altogether, but he points out that there are several cases where other tools work better (It is possible to embed a screw in a piece of wood using a hammer, but a screwdriver generally works better).
Erich Neuwirth wrote:
> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Erich Neuwirth
> Sent: Wednesday, August 29, 2007 4:46 PM
> To: r-help
> Subject: Re: [R] Excel
>
> Greg Snow wrote:
> > >
> > > Or do you trust all of your clients to know to use
> R(D)COM as well
> > > as how to install and use it?
>
> Do you trust your clients to be fluent enough in R to use it?
I admitted that I don't trust some of my clients with their own data (I think my record is 9 different sexes), I certainly don't expect them to use R. I trust them to bring their questions and data to me, then I use R (or one of my coworkers uses their favorite stats package) to help them. That way I know that the correct things are being done (on the flip side, they trust me to leave patient care up to them and not try surgery myself, it is much better for the patients that way).
> For most of my clients, that is not true.
> For this kind of users, the following strategy works.
> They have their data in Excel, and I write some macros in
> Excel which use RExcel to compute results and put them in the sheet.
> The clients press buttons similar to what they would do with
> the Analysis toolpack in Excel, but they get the full power of R.
Who installs R and sets up R dcom on all these computers?
If the above clients want something in addition to what you prepared, do you add additional macros for them? Do they know enough R to write their own Rapply statement? Or do they go looking in the Excel help and use the wrong set of tools? Or do you have them trained enough that they don't ask any additional questions?
I have no problem with you using this approach, I just think that if you are going to advocate that others use excel with clients, that you also let them know of the extra work that they may be committing themselves to.
>
> > >
> >> >> The formula
> >> >> =RApply("var",A1:A1000) in an Excel cell for example
> will use R to
> >> >> compute the variance of the data in column A in Excel.
> >> >> If you change any of the values in the range A1:A1000 will
> >> >> automatically recompute the variance.
> > >
> > > And what happens when you enter a value in cell A1001?
>
> If you use the following formula
> =RApply("var",DownFrom(A1))
>
> where DownFrom ist he following function defined in VBA
>
> Function DownFrom(startcell As Range) As Range
> Application.Volatile
> Set DownFrom = Range(startcell, startcell.End(xlDown))
> End Function
>
> Then writing a value in A1001 will a extend the range to
> which the function "var" is applied.
So to answer a known inconsistancy in spreadsheets, you are now bringing in a 3rd program/language. At what point is it just simpler to use R directly? (for me it was back a couple of steps).
> Arguments of worksheet function can be dynamic ranges, but
> sadly this is a relatively unknown fact.
And this is one of my main complaints about Excel as well as other MS products and MS wannabees. They may have implemented tools that allow you to do the correct thing, but in practice they make doing the wrong thing so much easier. I now encourage my clients to use at least access rather than excel for data entry because it forces them to think about what the data will look like before entering any values (one of the examples on Dr. Burns spreadsheet addiction page came from my group, life would have been much easier if the client had not trusted excel, but used access or another database program).
>
> > >
> > > And what happens if you set cell B1 to =A1 and do the
> magic copy so
> > > that b2=a2, b3=a3, ... Then put
> > > =Rapply("var",B1:B1000) in a cell, does the answer match with
> > > =Rapply("var",A1:A1000)?
>
> Of course it does, what else would you expect to happen?
> and if you copy the the formula in the cell containing
> =Rapply("var",A1:A1000) into the cell to its right, this cell
> it will automatically contain the formula
> =Rapply("var",B1:B1000)
Well the naïve expectation is that they would match, but the naïve expectation is that =var(A1:A1000) would also match =var(B1:B1000) (but it doesn't). I expected yours to suffer the same fate since I thougth that excel would interpret the blanks vs. 0's before passing the info to R. If you have done better at this than excel, then great job (I wanted to try this out myself, but am having problems installing the Rexcel tools).
> > >
> > > Yes, the auto-recompute could be considered a nice
> feature, but does
> > > it really save that much work compared to running a script in R
> > > after updating the data? (a couple of clicks in Rgui, a couple of
> > > keystrokes in ESS, one line of code in an R terminal)
>
> Auto-recomputing can do more.
> My favorite demo is the display of a kernel density estimator.
> You move a slider controlling the window width. The slider
> controls the value in a cell. When the cell contents change,
> R recomputes the kernel density estimator in some Excel cells
> and Excel then updates the graph. So you have an animated
> display in Excel which allows you to instantly see the
> changes in the graph.
> This is user controlled animation. It can be done using for
> example by Tcl/Tk also, but it is much more work.
I believe that which is more work depends on the person and their experience.
Here is some R code to do similar:
library(tkrplot)
my.data <- c(rnorm(20, 10, 2), rnorm(30, 15, 3))
my.adjust <- 1
tt <- tktoplevel()
img <- tkrplot(tt, function() plot(density(my.data, adjust=my.adjust)))
f <- function(...) {
my.a <- as.numeric(tclvalue('adjust'))
if(my.a != my.adjust){
my.adjust <<- my.a
tkrreplot(img)
}
}
s <- tkscale(tt, command=f, from=0.05, to=3.00, variable='adjust',
resolution=0.05, orient="horiz")
tkpack(img,s)
The above code is only a small modification of the example for tkrplot.
One of these days I'm going to rewrite the Tk demos in the TeachingDemos package to use tkrplot, it makes these things quite easy (maybe my son can use the hammer to help me, maybe not).
I do have to say that when I read your description of having R do the computations, but then having excel create the plot, I shuddered at the thought (and not in a good way).
[snip] I think there has been enough discussion on the pivot tables and right now we should probably agree to disagree.
> Of course it is possible to design terrible spreadsheets.
> But it is also quite possible to write terrible R code.
Yes that is true, but one major difference is that I have never known an R user that saw R as the only available tool. Personally I recommend that stats students know at least 3 stats packages (and I ask about this when I interview) as well as other computer tools. Too many people that only know excel think that excel can solve all problems. A while back I was talking to my boss (non-statistician) about something and she suggested creating a spreadsheet to track some information, I asked why not use access instead. Her reply was that she knew how to make a spreadsheet, but did not know access (or other db), then asked what the difference was. As I explained some of the differences, she realized that a db was the appropriate way to do what she wanted.
> What makes spreadsheets very special is autorecalculation and
> autoupdating of charts, but also the point-and-click method
> of creating formulas (using relative and absolute references).
I agree with the above statement, but I think that I am thinking of a different definition of 'special' than what you intended.
> And of course, knowing one tool always is dangerous, but this
> is also true in the case where the single known tool is R.
Here we agree
Actually, I like the R dcom interface idea. A few years ago I considered using it for an introductory class (reasoning that the students would already be familiar with excel), I don't remember all the reasons why I ended up not going with it. I have some co-workers that I may recommend it to as well (once they have learned enough R). I am just very wary of recommending it to the general public, or recommending statisticians use it with clients without the additional warnings of probable consequences.
For me personally, I prefer typing to clicking, so many of the arguments for it don't sway me (but they may be fine for others who prefer clicking to typing).
>
> -- Erich Neuwirth, University of Vienna Faculty of Computer
> Science Computer Supported Didactics Working Group Visit our
> SunSITE at http://sunsite.univie.ac.at Phone:
> +43-1-4277-39464 Fax: +43-1-4277-39459
> --
> Erich Neuwirth, University of Vienna
> Faculty of Computer Science
> Computer Supported Didactics Working Group Visit our SunSITE
> at http://sunsite.univie.ac.at
> Phone: +43-1-4277-39464 Fax: +43-1-4277-39459
>
--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at intermountainmail.org
(801) 408-8111
More information about the R-help
mailing list