[R] Help with ddply to eliminate a for..loop
Bos, Roger
roger.bos at rothschild.com
Thu Aug 26 22:33:42 CEST 2010
I created a small example to show something that I do often: "scale"
data by month and return a data.frame with the output. "id" identifies
repeated observations over "time", and I want to scale the "slope"
variable within each time period. The "out" variable shows the output
I want. My for..loop does the job but is probably much slower than other
methods. ddply seems ideal, but despite experimenting with the baseball
examples quite a bit, I can't figure out how to get it to work with my
sample dataset.
TIA for any help, Roger
Here is the sample code:
dat <- data.frame(id=rep(letters[1:5],3),
                  time=c(rep(1,5),rep(2,5),rep(3,5)), slope=1:15)
dat
# Scale slope within each time period and stack the pieces row-wise
for (i in 1:3) {
  mat <- dat[dat$time==i, ]
  outi <- data.frame(mat$time, mat$id, slope=scale(mat$slope))
  if (i==1) {
    out <- outi
  } else {
    out <- rbind(out, outi)
  }
}
out
Here is the sample output:
> dat <- data.frame(id=rep(letters[1:5],3),
time=c(rep(1,5),rep(2,5),rep(3,5)), slope=1:15)
> dat
   id time slope
1   a    1     1
2   b    1     2
3   c    1     3
4   d    1     4
5   e    1     5
6   a    2     6
7   b    2     7
8   c    2     8
9   d    2     9
10  e    2    10
11  a    3    11
12  b    3    12
13  c    3    13
14  d    3    14
15  e    3    15
> for (i in 1:3) {
+ mat <- dat[dat$time==i, ]
+ outi <- data.frame(mat$time, mat$id, slope=scale(mat$slope))
+ if (i==1) {
+ out .... [TRUNCATED]
> out
   mat.time mat.id      slope
1         1      a -1.2649111
2         1      b -0.6324555
3         1      c  0.0000000
4         1      d  0.6324555
5         1      e  1.2649111
6         2      a -1.2649111
7         2      b -0.6324555
8         2      c  0.0000000
9         2      d  0.6324555
10        2      e  1.2649111
11        3      a -1.2649111
12        3      b -0.6324555
13        3      c  0.0000000
14        3      d  0.6324555
15        3      e  1.2649111
>
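For reference, here is a minimal sketch of how the loop above might be expressed with ddply. It assumes the plyr package is installed, and the name "out2" is only illustrative. The idea is to split "dat" by "time", apply transform() to each piece, and let ddply recombine the pieces. Because scale() returns a one-column matrix, the sketch wraps it in as.numeric() so that "slope" stays an ordinary numeric column. Unlike the loop, this keeps the original column names (id, time, slope) rather than mat.time and mat.id.

```r
library(plyr)  # assumption: plyr is installed

dat <- data.frame(id = rep(letters[1:5], 3),
                  time = c(rep(1, 5), rep(2, 5), rep(3, 5)),
                  slope = 1:15)

# Split dat by time, scale slope within each piece, then recombine.
# as.numeric() flattens the one-column matrix that scale() returns.
out2 <- ddply(dat, .(time), transform, slope = as.numeric(scale(slope)))
out2
```

The scaled values in out2$slope should match the "slope" column of "out" from the loop, since both apply scale() to the same groups in the same order.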