[R] standardizing one variable by dividing each value by the mean -but within levels of a factor

Jason Smith devjason at gmail.com
Thu Jan 21 15:28:44 CET 2010


Dimitri Liakhovitski <ld7631 <at> gmail.com> writes:

> 
> One follow up question - the proposed solution was (notice - this time
> I am introducing one NA in data frame "x")
> 
> x<-data.frame(factor=c("b","b","d","d","e","e"),values=c(1,NA,10,20,100,200))
> x$std.via.ave<-ave(x$values, x$factor, FUN=function(x)x/mean(x))
> 
> I compared the result to my own clumsy solution:
> 
> factor.level.means<-as.data.frame(tapply(x$values,x$factor,mean, na.rm=T))
> factor.level.means$factor<-row.names(factor.level.means)
> names(factor.level.means)[1]<-"means"
> factor.level.means
> 
> x$std<-NA
> for(i in 1:nrow(x)){ #i<-1
>   x[i,"std"]<-factor.level.means[factor.level.means$factor==x[i,"factor"],"means"]
> }
> x$std<-x$values/x$std
> 
> If one compares x$std to x$std.via.ave - one notices that ave results
> in an NA for the very first observation - because it seems to be using
> na.rm=F when it calculates the means.
> Is there a way to fix that in the ave solution?

I think you are asking how to have the first observation in the ave solution be
calculated as 1 instead of NA.  

As you noted, mean() defaults to na.rm=FALSE (see ?mean), so any group that
contains an NA gets an NA mean. Pass na.rm=TRUE to mean() inside your custom
function and the group mean for "b" becomes 1, so the first observation is
standardized to 1/1 = 1 (the second observation stays NA because its own
value is missing):

x$std.via.ave <- ave(x$values, x$factor, FUN = function(v) v / mean(v, na.rm = TRUE))
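
For reference, here is a self-contained sketch of the whole thing using the
data frame from your message, so the result can be checked by hand. The
argument name v inside the anonymous function is just my choice, to avoid
reusing the name x for two different things:

```r
# Same data as in the question, with one NA in group "b"
x <- data.frame(factor = c("b", "b", "d", "d", "e", "e"),
                values = c(1, NA, 10, 20, 100, 200))

# Divide each value by the mean of its own group; na.rm = TRUE makes
# mean() ignore missing values when computing the group mean
x$std.via.ave <- ave(x$values, x$factor,
                     FUN = function(v) v / mean(v, na.rm = TRUE))

print(x)
```

The group means are 1, 15 and 150, so std.via.ave should come out as
1, NA, 10/15, 20/15, 100/150 and 200/150. Only the row whose value is itself
missing remains NA.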

--jason


