[R] A comment about R:

Wed Jan 4 10:47:55 CET 2006

Dear Bob,
The reasons you mentioned are supposedly good features in R -- not giving
lots of output you do not necessarily need. I guess the question is why do
you want R to produce what you get from SPSS?  SPSS is hardly a gold
standard in statistical software.  
But I agree that it is quite difficult for users of SPSS to unlearn SPSS (or
SAS) while using R. 

Best Marwan

----------------------------------------------
Marwan Khawaja   http://staff.aub.edu.lb/~mk36
---------------------------------------------- 

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch 
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Bob Green
> Sent: Wednesday, January 04, 2006 3:37 AM
> To: r-help at stat.math.ethz.ch
> Subject: Re: [R] A comment about R:
> 
> 
> >Hello,
> 
> 
> >Unlike most posts on the R mailing list I feel qualified to comment on
> >this one.  For about 3 months I have been trying to learn to use R, after
> >having used various versions of SPSS for about 10 years.
> 
> 
> I think it is far too simplistic to ascribe non-use of R to 
> laziness.  This may well be the case for some, however, I 
> have read 5-6 books on R, waded through on-line resources,  
> read the documentation and asked multiple questions via 
> e-mails - and still find even some of the basics very difficult.
> 
> There are several reasons for this:
> 
> 1. For some tasks R is extremely user-unfriendly.  Some 
> comparative examples:
> 
> (a) In running a chi-square analysis in SPSS the following 
> syntax is included
> 
> /STATISTIC=CHISQ
>    /CELLS= COUNT EXPECTED ROW COLUMN TOTAL RESID .
> 
> this produces expected and observed counts, row & column 
> percentages, residuals, chi-square & Fisher's exact  test + 
> other output.
> 
> In R, it is a herculean task to produce similar output. It 
> certainly can't be produced in 2 lines as far as I can tell.
> 
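For what it is worth, most of the pieces listed under (a) are stored in the object that chisq.test() returns. A minimal sketch in base R, assuming a made-up 2 x 3 table of counts (the object name tab and the numbers are invented for illustration, not taken from the data discussed here):

```r
# Hypothetical counts, invented for illustration only.
tab <- matrix(c(20, 15, 25,
                30, 10, 20), nrow = 2, byrow = TRUE,
              dimnames = list(group = c("a", "b"),
                              outcome = c("low", "mid", "high")))

res <- chisq.test(tab)   # Pearson chi-square test of independence
res                      # test statistic, df and p-value
res$observed             # observed counts
res$expected             # expected counts under independence
res$residuals            # Pearson residuals
prop.table(tab, 1)       # row proportions
prop.table(tab, 2)       # column proportions
addmargins(tab)          # counts with row and column totals
fisher.test(tab)         # Fisher's exact test
```

This is more than two lines, but each line asks for one specific piece of output.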
> (b) In SPSS, if I want to compare multiple variables by a 
> single dependent variable this is readily performed:
> 
> CROSSTABS
>    /TABLES=baserdis  baserenh  basersoc baseradd socbest 
> disbest entbest addbest worsdis worsphy by group
> 
> I used the chi-square example again, but the same applies for 
> a t-test. I started looking into how  to do something similar 
> in R, with the t-test command but gave up. R does force the 
> user to take a more considered approach to analysis.
> 
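One possible way to run the same test over several variables by a single grouping variable is to loop over the column names. A rough sketch with t.test(), assuming a made-up data frame dat with hypothetical columns y1, y2, y3 and a two-level group column (none of these names come from the data above):

```r
set.seed(4)
# Made-up data: three hypothetical outcome columns and a two-level group.
dat <- data.frame(group = rep(c("a", "b"), each = 20),
                  y1 = rnorm(40), y2 = rnorm(40), y3 = rnorm(40))
vars <- c("y1", "y2", "y3")

# One Welch t-test per outcome; keep the statistic, df and p-value.
res <- t(sapply(vars, function(v) {
  tt <- t.test(dat[[v]] ~ dat$group)
  c(t = unname(tt$statistic), df = unname(tt$parameter),
    p.value = tt$p.value)
}))
round(res, 3)
```

Swapping the body of the function for chisq.test(table(dat[[v]], dat$group)) should give the crosstab version in the same way.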
> (c) Obtaining a correlation matrix in R with both the correlation 
> & p-value is no simple task -
> 
> In SPSS this is obtained via:
> 
> GET
>    FILE='D:\a study\data\dat\key data\master data.sav'.
> NONPAR CORR
>    /VARIABLES= goodnum badnum good5 bad5 avfreq avdayamt
>    /PRINT=KENDALL TWOTAIL
>    /MISSING=PAIRWISE .
> 
> In R something like this is required -
> 
>  > by(mydat, mydat$group, function(x) {
> + nm <- names(x)
> + rho <- matrix(, 6, 2)
> + rho.nm <- matrix(, 6, 2)
> + k <- 1
> + for(i in 2:4) {
> + for(j in (i + 1):5) {
> + x.i <- x[, i]
> + x.j <- x[, j]
> + ct <- cor.test(x.i, x.j, method = "kendall",
> +                alternative = "two.sided")
> + rho[k, 1] <- ct$estimate
> + rho[k, 2] <- round(ct$p.value, 3)
> + rho.nm[k, ] <- c(nm[i], nm[j])
> + k <- k + 1
> + }
> + }
> + rho <- cbind(as.data.frame(rho.nm), as.data.frame(rho))
> + names(rho) <- c("freq.i", "freq.j", "cor", "p-value")
> + rho
> + })
> 
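The double loop above can probably be shortened by letting combn() enumerate the pairs of columns. A sketch under stated assumptions: cor_pairs is a hypothetical helper written for this message (it is not part of base R), and mydat here is random made-up data rather than the data set from the SPSS example.

```r
# Hypothetical helper: Kendall's tau and the two-sided p-value
# for every pair of columns in a data frame.
cor_pairs <- function(df, method = "kendall") {
  pairs <- combn(names(df), 2)        # each column holds one pair of names
  out <- apply(pairs, 2, function(p) {
    ct <- cor.test(df[[p[1]]], df[[p[2]]], method = method)
    c(cor = unname(ct$estimate), p.value = ct$p.value)
  })
  data.frame(var.i = pairs[1, ], var.j = pairs[2, ],
             cor = out["cor", ],
             p.value = round(out["p.value", ], 3))
}

set.seed(1)                           # made-up data for illustration
mydat <- data.frame(a = rnorm(30), b = rnorm(30),
                    c = rnorm(30), d = rnorm(30))
cor_pairs(mydat)
```

Once defined, the helper can be reused per group, for example by calling it on each subset produced by split().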
> 2) It is not always clear what the output produced by R is. 
> The Mann-Whitney U-test is a good example. In R, it seems a 
> standardised value is obtained. I was advised that it is easy 
> enough to check this as R is open-source, but at least for 
> me, I don't believe I would understand this code anyway. It 
> is confusing when comparative programs such as R and SPSS 
> produce dissimilar results. For the user it is important to 
> be able to fairly easily reconcile such differences, to 
> engender confidence in results.
> 
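One way to check what wilcox.test() reports without reading its source is to compute the Mann-Whitney counts by hand on a small sample. The sketch below uses made-up data without ties. With no ties, the W that R prints is the number of (x, y) pairs in which the x value is larger. My understanding, which is an assumption worth checking against the SPSS documentation, is that SPSS prints the smaller of the two possible U values, which would explain differing numbers for the same data.

```r
set.seed(2)
x <- rnorm(8)                  # made-up sample 1
y <- rnorm(10, mean = 0.5)     # made-up sample 2

wt <- wilcox.test(x, y)
W <- unname(wt$statistic)      # the statistic R reports as W

U_x <- sum(outer(x, y, ">"))           # pairs where x exceeds y
U_y <- length(x) * length(y) - U_x     # pairs where y exceeds x

c(W = W, U_x = U_x, U_y = U_y, smaller_U = min(U_x, U_y))
```

The two U values always add up to the product of the sample sizes, so either one can be recovered from the other.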
> 3) I find the help files in R quite difficult to understand.  
> For example, see help(t.test).  It is almost assumed by the 
> examples that you know what to do. Personally, I would find 
> some form of simple decision tree easier - e.g. If you want to 
> perform a t-test with the dependent variable in one column 
> and the grouping variable in another, use t.test(AVFREQ~GROUP). 
> If you want to perform a t-test with the dependent variable 
> in separate columns (each column representing a different 
> group), use t.test(AVFREQ1, AVFREQ2).
> 
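The two layouts described in point 3 can be tried side by side. A small sketch on made-up data: the names AVFREQ, GROUP, AVFREQ1 and AVFREQ2 follow the ones used above, while the values are random and purely illustrative.

```r
set.seed(3)
# Long layout: outcome in one column, grouping variable in another.
long <- data.frame(AVFREQ = c(rnorm(12, mean = 10), rnorm(12, mean = 11)),
                   GROUP  = rep(c("g1", "g2"), each = 12))
t_long <- t.test(AVFREQ ~ GROUP, data = long)

# Wide layout: one vector (or column) per group.
AVFREQ1 <- long$AVFREQ[long$GROUP == "g1"]
AVFREQ2 <- long$AVFREQ[long$GROUP == "g2"]
t_wide <- t.test(AVFREQ1, AVFREQ2)

# Both calls run the same Welch test, so the statistics agree.
c(long = unname(t_long$statistic), wide = unname(t_wide$statistic))
```

Adding var.equal = TRUE to either call requests the classical pooled-variance test instead of the Welch default.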
> 4) My initial approach to using R was to run commands I had 
> used commonly in SPSS and compare the results. I have only 
> got as far as basic ANOVA. 
> This has been time-consuming and at times it has been 
> difficult to obtain advice. Some people on the R list have 
> been extremely generous with their time and knowledge, and I 
> have much appreciated this assistance. At other times I see 
> responses met  with something like arrogance. With the 
> sophistication of R, there is also an elitism.  This is a 
> barrier to R being more widely accepted and used.
> 
> 5) Differences in terminology - this is just part of the 
> learning process, but I still found it took quite some time 
> to work out simple commands and what different analyses were called.
> 
> 6) System administrators may be wary of freeware.
> 
> No doubt for the sophisticated user, my comments may seem 
> trite and easily resolved; however, I believe my comments have 
> some relevance as to why R is not more readily used or accepted.
> 
> 
> Bob Green
> 