[R] By() with method = spearman
Doran, Harold
HDoran at air.org
Wed Sep 19 18:30:50 CEST 2007
I still get an error
> tmp$Grade <- factor(tmp$Grade)
> lapply(split(tmp, f = tmp$Grade),
+ function(x){cor(x[,c("mtsc07","DCBASmathscoreSPRING")], use='complete',
+ method='spearman')})
Error in cor(x[, c("mtsc07", "DCBASmathscoreSPRING")], use = "complete", :
  'x' is empty
I noticed tmp$Grade (my index variable) was numeric. So, I coerced it
into a factor. I get the same error message, however.
Notice, however, that this code works correctly:
lapply(split(tmp, f = tmp$Grade),
function(x){cor(x[,c("mtsc07","DCBASmathscoreSPRING")], use='complete',
method='pearson')})
The only difference is that method is changed to pearson.
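
One way to narrow this down is to count, for each grade, how many rows have both scores observed before calling cor(). The sketch below is not from the thread. It assumes (without confirmation) that at least one grade level has no complete pairs, and it uses an invented toy data frame in place of tmp; only the column names are taken from the code above.

```r
# Toy stand-in for 'tmp'; the values are invented for illustration only.
set.seed(1)
tmp <- data.frame(
  Grade = rep(c(3, 4, 5), each = 6),
  mtsc07 = rnorm(18),
  DCBASmathscoreSPRING = rnorm(18)
)
# Assumption: one grade has no complete pairs (grade 5 is forced to all-NA here).
tmp$DCBASmathscoreSPRING[tmp$Grade == 5] <- NA

vars <- c("mtsc07", "DCBASmathscoreSPRING")
groups <- split(tmp[, vars], tmp$Grade)

# Number of rows per grade with both scores observed.
n_complete <- sapply(groups, function(d) sum(complete.cases(d)))
print(n_complete)

# Spearman correlation per grade; return NA when fewer than 3 complete pairs remain
# instead of letting cor() fail on an empty or near-empty subset.
spearman_by_grade <- sapply(groups, function(d) {
  d <- d[complete.cases(d), , drop = FALSE]
  if (nrow(d) < 3) return(NA_real_)
  cor(d[[1]], d[[2]], method = "spearman")
})
print(spearman_by_grade)
```

If n_complete shows a zero (or a very small count) for some grade in the real data, that group is a likely candidate for the error. Running str(tmp[, vars]) is also a cheap check that both columns exist under those exact names and are numeric.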
> -----Original Message-----
> From: Chuck Cleland [mailto:ccleland at optonline.net]
> Sent: Wednesday, September 19, 2007 12:22 PM
> To: Doran, Harold
> Subject: Re: [R] By() with method = spearman
>
> Doran, Harold wrote:
> > Thanks, Chuck. Seems odd though, doesn't it? There must be something
> > with my data set. But, I don't have any clue what it might be since I
> > can compute pearson using by() and I can subset and actually compute
> > spearman using just cor()
>
> Harold:
> What happens when you approach the problem with split() and
> lapply() instead of by()? For example:
>
> lapply(split(iris, f = iris$Species),
> function(x){cor(x[,c("Sepal.Length","Sepal.Width")], use='complete',
> method='spearman')})
>
> $setosa
> Sepal.Length Sepal.Width
> Sepal.Length 1.0000000 0.7553375
> Sepal.Width 0.7553375 1.0000000
>
> $versicolor
> Sepal.Length Sepal.Width
> Sepal.Length 1.000000 0.517606
> Sepal.Width 0.517606 1.000000
>
> $virginica
> Sepal.Length Sepal.Width
> Sepal.Length 1.0000000 0.4265165
> Sepal.Width 0.4265165 1.0000000
>
> hope this helps,
>
> Chuck
>
> >> -----Original Message-----
> >> From: Chuck Cleland [mailto:ccleland at optonline.net]
> >> Sent: Wednesday, September 19, 2007 12:14 PM
> >> To: Doran, Harold
> >> Cc: r-help at r-project.org
> >> Subject: Re: [R] By() with method = spearman
> >>
> >> Doran, Harold wrote:
> >>> I have a data set where I want the correlations between 2 variables
> >>> conditional on a students grade level.
> >>>
> >>> This code works just fine.
> >>>
> >>> by(tmp[,c('mtsc07', 'DCBASmathscoreSPRING')], tmp$Grade, cor,
> >>> use='complete', method='pearson')
> >>>
> >>> However, this generates an error
> >>>
> >>> by(tmp[,c('mtsc07', 'DCBASmathscoreSPRING')], tmp$Grade, cor,
> >>> use='complete', method='spearman')
> >>> Error in FUN(data[x, ], ...) : 'x' is empty
> >>>
> >>> I can subset the data by grade and compute spearman rho as
> >>>
> >>> tmp5 <- subset(tmp, Grade == 5)
> >>> cor(tmp5[,c('mtsc07', 'DCBASmathcountSPRING')], use='complete',
> >>> method='spearman')
> >>>
> >>> But doing this iteratively is inefficient.
> >>>
> >>> I don't see anything in the help man for by() or cor() that tells me
> >>> what the problem is. I might be missing it though. Any thoughts?
> >> It works as expected using the iris data:
> >>
> >> by(iris[,c('Sepal.Length', 'Sepal.Width')], iris$Species, cor,
> >> use='complete', method='spearman')
> >>
> >> iris$Species: setosa
> >> Sepal.Length Sepal.Width
> >> Sepal.Length 1.0000000 0.7553375
> >> Sepal.Width 0.7553375 1.0000000
> >> --------------------------------------------------------------
> >>
> >> iris$Species: versicolor
> >> Sepal.Length Sepal.Width
> >> Sepal.Length 1.000000 0.517606
> >> Sepal.Width 0.517606 1.000000
> >> --------------------------------------------------------------
> >>
> >> iris$Species: virginica
> >> Sepal.Length Sepal.Width
> >> Sepal.Length 1.0000000 0.4265165
> >> Sepal.Width 0.4265165 1.0000000
> >>
> >>> sessionInfo()
> >> R version 2.5.1 Patched (2007-09-16 r42884)
> >> i386-pc-mingw32
> >>
> >> locale:
> >> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> >> States.1252;LC_MONETARY=English_United
> >> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> >>
> >> attached base packages:
> >> [1] "stats" "graphics" "grDevices" "utils" "datasets"
> >> "methods" "base"
> >>
> >> other attached packages:
> >> lattice
> >> "0.16-5"
> >>
> >>> Thanks,
> >>> Harold
> >>>
> >>>
> >>>> sessionInfo()
> >>> R version 2.5.0 (2007-04-23)
> >>> i386-pc-mingw32
> >>>
> >>> locale:
> >>> LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
> >>> States.1252;LC_MONETARY=English_United
> >>> States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252
> >>>
> >>> attached base packages:
> >>> [1] "stats" "graphics" "grDevices" "utils" "datasets"
> >>> "methods" "base"
> >>>
> >>> other attached packages:
> >>> lattice
> >>> "0.15-4"
> >>>
> >>>
> >>> ______________________________________________
> >>> R-help at r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >> --
> >> Chuck Cleland, Ph.D.
> >> NDRI, Inc.
> >> 71 West 23rd Street, 8th floor
> >> New York, NY 10010
> >> tel: (212) 845-4495 (Tu, Th)
> >> tel: (732) 512-0171 (M, W, F)
> >> fax: (917) 438-0894
> >>
> >
>
>
> --
> Chuck Cleland, Ph.D.
> NDRI, Inc.
> 71 West 23rd Street, 8th floor
> New York, NY 10010
> tel: (212) 845-4495 (Tu, Th)
> tel: (732) 512-0171 (M, W, F)
> fax: (917) 438-0894
>