[R] Help with one of "those" apply functions
Steve Lianoglou
mailinglist.honeypot at gmail.com
Wed Feb 2 23:34:34 CET 2011
Hi,
On Wed, Feb 2, 2011 at 4:08 PM, Robin Jeffries <rjeffries at ucla.edu> wrote:
> Hello there,
>
> I'm still struggling with the *apply commands. I have 5 people with id's
> from 10 to 14. I have varying amounts (nrep) of repeated outcome (value)
> measured on them.
>
> nrep <- 1:5
> id <- rep(c("p1", "p2", "p3", "p4", "p5"), nrep)
> value <- rnorm(length(id))
>
> I want to create a new vector that contains the sum of the values per
> person.
>
> subject.value[1] <- value[1] # 1 measurement
> subject.value[2] <- sum(value[2:3]) # the next 2 measurements
> ...
> subject.value[5] <- sum(value[11:15]) # the next 5 measurements
>
>
> I'd imagine it'll be some sort of *apply(value, nrep, sum) but I can't seem
> to land on the right format.
>
> Can someone give me a heads up as to what the correct syntax and function
> is?
In addition to tapply (as Phil pointed out), you can look at the
functions in plyr.
I sometimes find them more intuitive than their sister "base"
functions, especially since more often than not you'll have your data
in a data.frame.
For instance:
R> set.seed(123)
R> nrep <- 1:5
R> id <- rep(c("p1", "p2", "p3", "p4", "p5"), nrep)
R> value <- rnorm(length(id))
R> DF <- data.frame(id=id, value=value)
R> tapply(value, id, sum)
        p1         p2         p3         p4         p5
-0.5604756  1.3285308  1.9148611 -1.9366599  1.5395087
R> library(plyr)
R> ddply(DF, .(id), summarize, total=sum(value))
  id      total
1 p1 -0.5604756
2 p2  1.3285308
3 p3  1.9148611
4 p4 -1.9366599
5 p5  1.5395087
In this case, though, I'll grant you that tapply is simpler if you
already know how to use it.
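If you want to stay in base R, here is a rough sketch of two other routes
to the same per-person sums. These were not part of the suggestions so
far; the name `totals` is just something I made up for illustration, and
`subject.value` is borrowed from your question.

```r
set.seed(123)
nrep <- 1:5
id <- rep(c("p1", "p2", "p3", "p4", "p5"), nrep)
value <- rnorm(length(id))
DF <- data.frame(id = id, value = value)

# aggregate() with a formula returns a data.frame with one row per id
# (`totals` is a hypothetical name used only for this example)
totals <- aggregate(value ~ id, data = DF, FUN = sum)

# split() cuts value into one chunk per id; vapply() sums each chunk
# and returns a named numeric vector, one element per person
subject.value <- vapply(split(value, id), sum, numeric(1))
```

The aggregate() version is the closer match if your data already lives in
a data.frame; the split()/vapply() version gives you the plain vector you
were originally after.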
--
Steve Lianoglou
Graduate Student: Computational Systems Biology
| Memorial Sloan-Kettering Cancer Center
| Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact