[R] aggregate and names of factors

Frank E Harrell Jr feh3k at spamcop.net
Mon Dec 8 20:51:40 CET 2003


On 08 Dec 2003 14:13:59 +0100
Peter Dalgaard <p.dalgaard at biostat.ku.dk> wrote:

> Christophe Pallier <pallier at lscp.ehess.fr> writes:
> 
> > Hello,
> > 
> > I use the function 'aggregate' a lot.
> > 
> > One small annoyance is that it is necessary to name the factors in the
> > 'by' list to get the names in the resulting data.frame (else, they
> > appear as Group.1, Group.2...etc). For example, I am forced to
> > write:
> > 
> > aggregate(y,list(f1=f1,f2=f2),mean)
> > 
> > instead of aggregate(y,list(f1,f2),mean)
> > 
> > (for two factors with short names, it is not such a big deal, but I
> > usually have about 8 factors with long names...)
> > 
> > I wrote a modified 'aggregate.data.frame' function (see the code
> > below) so that it parses the names of the factors and uses them in the
> > output data.frame. I can now type aggregate(y,list(f1,f2),mean) and the
> > resulting data.frame has variables with names 'f1' and 'f2'.
> > 
> > However, I have a few questions:
> > 
> > 1. Is it a good idea at all? When expressions rather than variables
> >    are used as factors, this will probably result in a mess. Can one
> >    test whether an argument within a list is just a variable name or a
> >    more complex expression? Is there a better way?
> 
> This issue is not just relevant for aggregate. There are a couple of
> other places where you want a named list to get names on output -
> lapply(list(foo,bar,baz), function(x) lm(x~age)), say. One option that
> I've been toying around with is to clone the code from data.frame and
> have a function namedList() or nlist() which automagically supplies
> names by deparsing the call. Now where did I put that code sketch...

llist in the Hmisc package does that.
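
For anyone who wants to see the idea without installing Hmisc, here is a
minimal base-R sketch of the "name by deparsing the call" approach. The
function name nlist is only illustrative (borrowed from the suggestion
quoted above); this is not the Hmisc implementation and does not attempt
what llist does beyond naming. It also shows one way to answer the "is
this argument just a variable name?" question, using is.symbol() on the
unevaluated argument.

```r
# Hypothetical helper, NOT the Hmisc code: build a list whose unnamed
# elements are named after the expressions that were passed in.
nlist <- function(...) {
  args  <- list(...)
  exprs <- as.list(substitute(list(...)))[-1L]  # unevaluated arguments
  nms   <- names(args)
  if (is.null(nms)) nms <- rep("", length(args))
  for (i in seq_along(args)) {
    if (!nzchar(nms[i])) {
      e <- exprs[[i]]
      # A bare variable name is a symbol; anything else is deparsed as text.
      nms[i] <- if (is.symbol(e)) as.character(e) else paste(deparse(e), collapse = " ")
    }
  }
  names(args) <- nms
  args
}

# Made-up example data, just to exercise the function.
f1 <- factor(c("a", "a", "b", "b"))
f2 <- factor(c("x", "y", "x", "y"))
y  <- c(1, 2, 3, 4)

res <- aggregate(y, nlist(f1, f2), mean)
names(res)
```

With this, the grouping columns of res should be called f1 and f2 rather
than Group.1 and Group.2. Names supplied explicitly (nlist(g = f2)) are
kept, and a more complex expression gets its deparsed text as a name,
which may or may not be what you want in a data.frame.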

Frank


> 
> -- 
>    O__  ---- Peter Dalgaard             Blegdamsvej 3  
>   c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
>  (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
> ~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
> 
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help


---
Frank E Harrell Jr    Professor and Chair            School of Medicine
                      Department of Biostatistics    Vanderbilt University



