[R] a pickle with ranks and reals?
Thomas W Blackwell
tblackw at umich.edu
Fri Aug 22 15:03:47 CEST 2003
John -
Here are two equivalent solutions to your final question:
data <- data.frame(x=seq(15), y=sample(seq(15), 15),
    subj=sample(c("harry","steve","nathan","john"), 15, replace=TRUE))
result.1 <- unclass(by(data, data$subj, function(dd) cor(dd$x, dd$y)))
result.2 <- unclass(by(data, data$subj, function(dd) cor(dd[c(1,2)])[1,2]))
I guess I prefer result.1 since the code is easier to read,
even though it does hard-code the column names.
The "function(dd)" idiom is very common in calls to by(),
sapply() and lapply(). It defines a little function in-line,
without ever naming it, and passes it as the third argument
to by(). I use this all the time when I need to rearrange the
order of, or do a little bit of subscripting on (as here), the
arguments of a function (cor()) which I would otherwise
just pass directly as the third argument to by().
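To make the in-line function idea concrete, here is a small sketch
comparing a named helper with the anonymous form. The object names
dat and cor_xy are made up for illustration, and the toy data use a
fixed five rows per subject so that every correlation is defined.

```r
# Hypothetical toy data; column names x, y, subj follow the example above.
set.seed(1)
dat <- data.frame(x = seq(15), y = sample(seq(15), 15),
                  subj = rep(c("harry", "steve", "nathan"), each = 5))

# Named helper: receives one per-subject data frame, returns one number.
cor_xy <- function(dd) cor(dd$x, dd$y)

# Passing the named helper as the third argument to by() ...
named <- by(dat, dat$subj, cor_xy)

# ... should give the same numbers as defining the function in-line.
inline <- by(dat, dat$subj, function(dd) cor(dd$x, dd$y))
```

The in-line form simply saves inventing a name for a function that
is used only once.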
I'll let others comment on my use of unclass() here. The
goal was to get a numeric vector with a names attribute, so
it can be incorporated into further processing. I'm surprised
just how much tinkering it took to get this all to work.
This might actually make a useful example to add to the help
page for by().
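As one possible alternative to unclass(by(...)), and only as a sketch,
split() plus sapply() should also give a plain numeric vector with a
names attribute. This is a different route from the two solutions
above, not a correction of them. The names dat, r and out are made up
for illustration; the last step reshapes the vector into the r / subj
layout asked for in the original question.

```r
# Hypothetical toy data; column names x, y, subj follow the example above.
set.seed(1)
dat <- data.frame(x = seq(15), y = sample(seq(15), 15),
                  subj = rep(c("harry", "steve", "nathan"), each = 5))

# split() cuts the data frame into a list of per-subject data frames;
# sapply() applies the in-line function to each piece and simplifies
# the scalar results to a named numeric vector.
r <- sapply(split(dat, dat$subj), function(dd) cor(dd$x, dd$y))

# Reshape into a two-column data frame: one correlation per subject.
out <- data.frame(r = unname(r), subj = names(r))
```

Here names(r) holds the subject labels, so r can be indexed by
subject, e.g. r["harry"].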
- tom blackwell - u michigan medical school - ann arbor -
On Fri, 22 Aug 2003, John Christie wrote:
> . . . And, I also wanted to analyze correlations subject by subject and
> compare my two groups. However, there doesn't seem to be a good way to
> get this. I tried using "by" with "cor". However, this requires
> binding x and y which causes cor to return a matrix (if you could pass
> it x and y separate it would just return a number).
>
> given
>
> data frame s
> x y subj
> 4 7 harry
> 5 1 harry
> 6 9 harry
> 2 4 steve
> 3 7 steve
> ...
>
> i'd like to be able to produce
>
> r subj
> .12 harry
> .52 steve
> ...
>
> any tips?