[R] Help on calculating spearman rank correlation for a data frame with conditions
David Winsemius
dwinsemius at comcast.net
Wed Aug 29 06:56:21 CEST 2012
On Aug 28, 2012, at 9:20 PM, R. Michael Weylandt wrote:
> On Tue, Aug 28, 2012 at 9:01 PM, Yi <liuyi.feier at gmail.com> wrote:
>> Dear all,
>>
>> Suppose my data frame is as follows:
>>
>> id price distance
>> 1 2 4
>> 1 3 5
>> ...
>> 2 4 8
>> 2 5 9
>> ...
>> n 3 7
>> n 8 9
>>
>> I would like to calculate the rank-order correlation between price and
>> distance for each id.
>>
>> cor(price, distance, method = "spearman") calculates a single correlation
>> for all the data.
>>
>> Then I tried to use
>> apply(data,list='id',cor(price , distance , method = "spearman"))
>> to
>>
>
> You seem to have been cut off mid-thought, but I'm guessing you want
> something more like:
>
> tapply(data, data$id, function(x) cor(x$price, x$distance, method =
> "spearman"))

I am dubious. tapply() takes an atomic vector, not a data frame, as its
first argument. For group operations involving more than one vector, one
generally needs to use lapply(split()) or by().

Here's my guess:

lapply(split(data, data$id), function(dfrm) {
  cor(x = dfrm[["price"]], y = dfrm[["distance"]], method = "spearman")
})

OR:

by(data, data$id, function(dfrm) {
  cor(x = dfrm[["price"]], y = dfrm[["distance"]], method = "spearman")
})

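For anyone who wants to try the split-and-apply approach end to end, here is a minimal sketch. The toy data frame is made up for illustration (the original post elides its actual values), and sapply() is swapped in for lapply() only so that the result comes back as a named vector instead of a list.

```r
# Hypothetical toy data; these values are invented for illustration
# and are not taken from the original poster's data.
data <- data.frame(
  id       = c(1, 1, 1, 2, 2, 2),
  price    = c(2, 3, 5, 4, 5, 9),
  distance = c(4, 5, 7, 9, 8, 6)
)

# Split the data frame into one piece per id, then compute the
# Spearman correlation within each piece. sapply() simplifies the
# per-group results to a numeric vector named by id.
rho <- sapply(split(data, data$id), function(dfrm) {
  cor(dfrm[["price"]], dfrm[["distance"]], method = "spearman")
})

print(rho)
```

With this toy data, price and distance are perfectly concordant within id 1 and perfectly discordant within id 2, so the two correlations come out as 1 and -1.
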
--
David Winsemius, MD
Alameda, CA, USA