[R] Getting the groupmean for each person

Liaw, Andy andy_liaw at merck.com
Mon May 10 13:37:29 CEST 2004


Both of you might have missed my question from Friday:  For very long `x'
(e.g., length=50000), indexing by names can take a long time.  See that
thread for details.  (For small data, you can hardly tell the difference.)
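
A minimal sketch of one way to avoid the name lookup, assuming `f' is a
factor: as.integer(f) gives each element's position in levels(f), which is
also the order in which tapply lays out its result, so the group means can
be picked out by position.  The data below are made up for illustration,
and no timings are claimed.

```r
set.seed(1)
n <- 50000
f <- factor(sample(letters, n, replace = TRUE))  # illustrative grouping factor
x <- rnorm(n)

m <- tapply(x, f, mean)        # one mean per level, in levels(f) order

by_name <- m[as.character(f)]  # lookup by name (the slow route for long x)
by_code <- m[as.integer(f)]    # lookup by integer level code, no names involved

# Both routes pick the same group mean for every element of x.
stopifnot(isTRUE(all.equal(as.vector(by_name), as.vector(by_code))))

# To compare speed on your own machine:
# system.time(m[as.character(f)]); system.time(m[as.integer(f)])
```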

Also, I'm trying to write the function so that one can pass in more than
one grouping variable in a list, much like tapply.  The version I showed
is a simplified one to demonstrate the `problem' I had.  I obviously
missed the fact that tapply returns a 1-d array...
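
A rough sketch of what such a function might look like, not the version I
actually have.  The name center_by and its argument INDEX are made up for
this example.  It collapses the grouping variables into one factor with
interaction(), and strips the 1-d array attributes from the tapply result
with as.vector() before indexing by level code.

```r
# Hypothetical helper: subtract the group mean from each element of x.
# INDEX is a single grouping vector or a list of them, as in tapply.
center_by <- function(x, INDEX) {
  if (!is.list(INDEX)) INDEX <- list(INDEX)
  g <- interaction(INDEX, drop = TRUE)  # one factor for all grouping variables
  m <- tapply(x, g, mean)               # 1-d array, in levels(g) order
  ans <- x - as.vector(m)[as.integer(g)]  # plain vector, indexed by level code
  names(ans) <- names(x)
  ans
}

x  <- c(1, 2, 3, 4, 5, 6, 7, 8)
g1 <- c("a", "a", "b", "b", "a", "a", "b", "b")
g2 <- c("u", "u", "u", "u", "v", "v", "v", "v")

res <- center_by(x, list(g1, g2))
```

For the common case, ave(x, g1, g2) in base R returns the group means
already aligned with x, so x - ave(x, g1, g2) gives the same result.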

Best,
Andy

> From: kjetil at acelerate.com 
> 
> On 10 May 2004 at 10:09, Christophe Pallier wrote:
> 
> > 
> > 
> > Liaw, Andy wrote:
> > 
> > >Suppose I
> > >define the function:
> > >
> > >fun <- function(x, f) {
> > >    m <- tapply(x, f, mean)
> > >    ans <- x - m[match(f, unique(f))]
> > >    names(ans) <- names(x)
> > >    ans
> > >}
> > >
> > >  
> > >
> > 
> > May I ask what is the purpose of match(f,unique(f)) ?
> > 
> > To remove the group means, I have been using:
> > 
> > x-tapply(x,f,mean)[f]
> > 
> > for a while, (and I am now changing to 
> > x-tapply(x,f,mean)[as.character(f)] because of the peculiarities of
> 
> wouldn't 
>  sweep(as.array(x), 1, tapply(x,f,mean)[as.character(f)] , "-")
> 
> be more natural?
> 
> Kjetil Halvorsen
> 
> > indexing named vectors with factors )
> > 
> > The use of tapply(x,f,mean)[match(f,unique(f))] assumes a particular
> > order in the result of tapply, no? It seems a bit dangerous to me.
> > 
> > 
> > Christophe Pallier
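
A small made-up example of the ordering concern raised above.  tapply
returns the means in level order, while unique(f) lists the groups in order
of first appearance, so match(f, unique(f)) picks the wrong mean whenever
the two orders differ.  Indexing by as.integer(f) follows the level order
instead.

```r
x <- c(10, 20, 1, 3)
f <- factor(c("b", "b", "a", "a"))  # levels are a, b; first appearance is b, a

m <- tapply(x, f, mean)             # a = 2, b = 15, in level order

# match(f, unique(f)) is 1 1 2 2, so the "b" values get the mean of "a".
wrong <- as.vector(x - m[match(f, unique(f))])

# as.integer(f) is 2 2 1 1, which lines up with the order tapply uses.
right <- as.vector(x - m[as.integer(f)])

wrong  # 8 18 -14 -12
right  # -5 5 -1 1
```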
> > 