[R] aggregate function with a dataframe for both "x" and "by"
David Winsemius
dwinsemius at comcast.net
Thu Oct 6 05:47:39 CEST 2011
On Oct 5, 2011, at 7:45 PM, Eva Powers wrote:
> I have 2 dataframes. "mydata" contains numerical data. "mybys" contains
> information on the "group" each row of the data is in. I wish to aggregate
> each column in mydata using the corresponding column in mybys.
corresponding?
>
> Please see the example below. What is a more elegant or "better" way to
> accomplish this task?
>
> mydata = data.frame(testvar1=c(1,3,5,7,8,3,5,NA,4,5,7,9),
>                     testvar2=c(11,33,55,77,88,33,55,NA,44,55,77,99))
>
> mybys = data.frame(mbn1=c('red','blue',1,2,NA,'big',1,2,'red',1,NA,12),
>                    mbn2=c('wet','dry',99,95,NA,'damp',95,99,'red',99,NA,NA),
>                    stringsAsFactors=FALSE)
>
> myaggs <- data.frame(matrix(data=NA, nrow=nrow(mydata), ncol=ncol(mydata)))
>
> for (i in 1:ncol(mydata)) {
>   temp <- aggregate(mydata[i], by=as.list(mybys[i]), FUN=sum, na.rm=TRUE)
>   rownums <- match(mybys[,i], temp[,1])
>   myaggs[,i] <- temp[rownums,2]
> }
> myaggs
>
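If "corresponding" means that column i of mybys groups column i of mydata, one possible sketch (an assumption about your intent, not the only way) replaces the loop with mapply() over the two data frames and tapply() for the group sums. Like your loop, it leaves NA wherever the grouping value is NA.

```r
mydata <- data.frame(testvar1 = c(1,3,5,7,8,3,5,NA,4,5,7,9),
                     testvar2 = c(11,33,55,77,88,33,55,NA,44,55,77,99))
mybys <- data.frame(mbn1 = c('red','blue',1,2,NA,'big',1,2,'red',1,NA,12),
                    mbn2 = c('wet','dry',99,95,NA,'damp',95,99,'red',99,NA,NA),
                    stringsAsFactors = FALSE)

# Pair column i of mydata with column i of mybys.
myaggs <- as.data.frame(mapply(function(x, g) {
  sums <- tapply(x, g, sum, na.rm = TRUE)  # one sum per non-NA group
  as.vector(sums[g])                       # look up each row's group sum by name
}, mydata, mybys))
myaggs
```

Rows whose group is NA come back as NA because tapply() drops NA groups and indexing by an NA name returns NA.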
> Finally, how do I convert "mybys" to factors so that I can tell R that the
> NA values form a group?
>
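To make NA a group of its own, factor(..., exclude = NULL) (or addNA() on an existing factor) keeps NA as a level. A minimal sketch on your first column pair only, using ave() for the per-row group sums; the names x, g, f and grpsum are just illustrative:

```r
x <- c(1, 3, 5, 7, 8, 3, 5, NA, 4, 5, 7, 9)                    # mydata$testvar1
g <- c('red','blue',1,2,NA,'big',1,2,'red',1,NA,12)           # mybys$mbn1

f <- factor(g, exclude = NULL)   # NA is kept as a level, so it forms a group
levels(f)

# ave() returns a vector as long as x, holding each row's group sum
grpsum <- ave(x, f, FUN = function(v) sum(v, na.rm = TRUE))
grpsum
```

For the whole data frame, something like mybys[] <- lapply(mybys, factor, exclude = NULL) should convert every column the same way.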
> I tried substituting this line above:
>
> temp <- aggregate(mydata[,i], by = as.list(mybys[,i]), FUN=sum, na.rm=T)
>
> ... but get the error message:
> "Error in aggregate.data.frame(as.data.frame(x), ...) :
> arguments must have same length"
>
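The error most likely comes from as.list() applied to a vector: mybys[,i] is a plain vector, so as.list() splits it into one list element per value, and aggregate() then sees many grouping variables of length 1 instead of one of full length. Wrapping the vector in list() avoids that. A small sketch with made-up toy values (x, g and grp are illustrative names, not from your data):

```r
x <- c(1, 3, 5)
g <- c("a", "b", "a")

length(as.list(g))  # 3: three separate grouping variables, each of length 1
length(list(g))     # 1: a single grouping variable of length 3

# as.list(g) reproduces the error; try() keeps the script running
bad <- try(aggregate(x, by = as.list(g), FUN = sum, na.rm = TRUE), silent = TRUE)

# list(g) gives aggregate() what it expects
res <- aggregate(x, by = list(grp = g), FUN = sum, na.rm = TRUE)
res
```

Your original as.list(mybys[i]) works because mybys[i] is a one-column data frame, so as.list() yields a single element holding the whole column.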
David Winsemius, MD
West Hartford, CT