[R] "By" function Frame Conversion (with Multiple Indices)
arun
smartpink111 at yahoo.com
Fri Jan 4 15:30:23 CET 2013
Hi,
You could try this:
dat1<-read.table(text="
id,age,weight,height,gender
1,22,180,72,m
2,13,100,67,f
3,5,40,40,f
4,6,42,,f
5,12,98,66,
6,50,255,60,m
",sep=",",header=TRUE,stringsAsFactors=FALSE,na.strings="")
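# Column means of weight and height for every age x gender combination
list1<-by(dat1[c("weight","height")],dat1[c("age","gender")],colMeans,na.rm=TRUE,simplify=FALSE)
# split() builds "age.gender" labels in the same order; borrow them as names
list2<-split(dat1,list(dat1$age,dat1$gender))
names(list1)<-names(list2)
# Stack the groups into a matrix; empty combinations are NULL and drop out
res<-do.call(rbind,list1)
# Recover age and gender by splitting the "age.gender" row names on "."
res2<-cbind(read.table(text=row.names(res),sep=".",header=FALSE,stringsAsFactors=FALSE),res)
colnames(res2)[1:2]<-c("age","gender")
row.names(res2)<-1:nrow(res2)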
res2
#  age gender weight height
#1   5      f     40     40
#2   6      f     42    NaN
#3  13      f    100     67
#4  22      m    180     72
#5  50      m    255     60
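If splitting the row names back apart feels fragile (it would break if a group label itself contained a "."), one possible variation is to take the labels from the dimnames of the by() result instead. This is only a sketch under my own assumptions: the names cells, keys, keep and res4 are made up for illustration, and it relies on empty age/gender combinations being stored as NULL in the by() result.

```r
dat1 <- read.table(text = "
id,age,weight,height,gender
1,22,180,72,m
2,13,100,67,f
3,5,40,40,f
4,6,42,,f
5,12,98,66,
6,50,255,60,m
", sep = ",", header = TRUE, stringsAsFactors = FALSE, na.strings = "")

b <- by(dat1[c("weight", "height")], dat1[c("age", "gender")],
        colMeans, na.rm = TRUE)

# Drop the "by" class so we are working with a plain list-array
cells <- unclass(b)

# One row per age x gender cell, in the same order as the cells of the array
keys <- expand.grid(dimnames(b), stringsAsFactors = FALSE)

# Positions of the combinations that actually occur in the data
keep <- which(!vapply(cells, is.null, logical(1)))

res4 <- data.frame(keys[keep, ], do.call(rbind, cells[keep]),
                   stringsAsFactors = FALSE)
res4$age <- as.numeric(res4$age)  # dimnames are character
rownames(res4) <- NULL
res4
```

Here age comes back from the dimnames as character, hence the as.numeric() step.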
library(plyr)
ddply(dat1,.(age,gender),colwise(mean,c("weight","height")),na.rm=TRUE)
#  age gender weight height
#1   5      f     40     40
#2   6      f     42    NaN
#3  12   <NA>     98     66 #prints groups which are missing
#4  13      f    100     67
#5  22      m    180     72
#6  50      m    255     60
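For comparison, base R's aggregate() also keeps the grouping columns, so no extra package or name parsing is needed. A minimal sketch, assuming the same data as above (res3 is just an illustrative name); like the by() route, it drops the row whose gender is missing:

```r
dat1 <- read.table(text = "
id,age,weight,height,gender
1,22,180,72,m
2,13,100,67,f
3,5,40,40,f
4,6,42,,f
5,12,98,66,
6,50,255,60,m
", sep = ",", header = TRUE, stringsAsFactors = FALSE, na.strings = "")

# Data-frame method: the columns to summarise first, the grouping columns in `by`.
# na.rm = TRUE is passed on to mean(); rows with NA in a grouping column are omitted.
res3 <- aggregate(dat1[c("weight", "height")],
                  by = dat1[c("age", "gender")],
                  FUN = mean, na.rm = TRUE)
res3
#  age gender weight height
#1   5      f     40     40
#2   6      f     42    NaN
#3  13      f    100     67
#4  22      m    180     72
#5  50      m    255     60
```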
A.K.
----- Original Message -----
From: "Ray DiGiacomo, Jr." <rayd at liondatasystems.com>
To: R Help <r-help at r-project.org>
Cc:
Sent: Friday, January 4, 2013 12:00 AM
Subject: [R] "By" function Frame Conversion (with Multiple Indices)
Hello,
I have the following dataset. Please note that there are missing values on
records 4 and 5:
id,age,weight,height,gender
1,22,180,72,m
2,13,100,67,f
3,5,40,40,f
4,6,42,,f
5,12,98,66,
6,50,255,60,m
I'm using the "By" function like this:
list1 <- by(dataset[c("weight", "height")],
            dataset[c("age", "gender")],
            colMeans,
            na.rm = TRUE)
I then convert the list above to a frame like this:
as.data.frame( do.call(rbind, list1) )
I get this output from the code above:
  weight height
1     40     40
2     42    NaN
3    100     67
4    180     72
5    255     60
I want to get the output above, but I also want two additional columns
named "age" and "gender" (with the age and gender values from the "By"
function output).
How would I do this?
Best Regards,
Ray DiGiacomo, Jr.
Healthcare Predictive Analytics Specialist
President, Lion Data Systems LLC
President, The Orange County R User Group
Board Member, TDWI
rayd at liondatasystems.com
(m) 408-425-7851
San Juan Capistrano, California USA
twitter.com/liondatasystems
linkedin.com/in/raydigiacomojr
youtube.com/user/liondatasystems/videos
liondatasystems.com/courses