[R] dataframes and factors

Wolfgang Koller koller2 at fgr.wu-wien.ac.at
Thu Jun 15 14:31:24 CEST 2000

At 13:57 15.06.00 +0200, you wrote:
>Wolfgang Koller <koller2 at fgr.wu-wien.ac.at> writes:
>> Dear R-List,
>> I have a dataframe X containing factor f and numeric variable x1, x2, ... I
>> want to create a new dataframe (or possibly a matrix) that gives statistics
>> (e.g. sum) for the variables x1, x2, ... in each group defined by factor f.
>> What is the simplest way to do this?
>> I tried:
>>   attach(X)
>>   Z <- data.frame(f=levels(f),x1=as.vector(lapply(split(x1,f),sum)))
>> and stumbled on:
>>   Error in data.frame(f = levels(f), x1 = as.vector(lapply(split(x1, f),
>>         arguments imply differing number of rows: 5, 1
>> Help is much appreciated,
>does aggregate(X, f, sum) do what you want?

Yes, it does. I should have remembered aggregate(), of course...
However, since X also contains factors, I have to select the variables for
which summary statistics are to be computed. So I used:

  Z <- data.frame(f=levels(f),x1=as.vector(tapply(x1,f,sum)))
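
For several variables at once, the same selection can probably be combined with aggregate(). The following is only a sketch: the data frame X below is made-up example data (the real X is not shown in this thread), and the names g, num_cols and Z1 are hypothetical. It restricts aggregate() to the numeric columns and passes the grouping factor as a named list.

```r
# Hypothetical example data; the original X is not shown in the thread.
X <- data.frame(
  f  = factor(c("a", "b", "a", "c", "b")),
  g  = factor(c("u", "v", "u", "v", "u")),  # a second factor that must not be summed
  x1 = c(1, 2, 3, 4, 5),
  x2 = c(10, 20, 30, 40, 50)
)

# Keep only the numeric columns, then sum them within each level of f.
num_cols <- sapply(X, is.numeric)
Z <- aggregate(X[, num_cols, drop = FALSE], by = list(f = X$f), FUN = sum)

# The tapply() version for a single variable, written without attach().
Z1 <- data.frame(f = levels(X$f), x1 = as.vector(tapply(X$x1, X$f, sum)))
```

Referring to X$f and X$x1 directly avoids attach(), so the code does not depend on what else is on the search path.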

I think the problem in my first attempt, based on lapply(), was the
conversion of a list to a vector: lapply() returns a list, and as.vector()
does not flatten a list. This problem arises quite often.
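
A small sketch of the list-to-vector issue, using made-up values for x1 and f (the real data are not shown here) and hypothetical names sums_list, sums_vec and sums_sap. as.vector() returns a list unchanged, because a list already counts as a vector; unlist() or sapply() gives the atomic vector that data.frame() needs as a single column.

```r
# Hypothetical example data standing in for the attached columns of X.
x1 <- c(1, 2, 3, 4, 5)
f  <- factor(c("a", "b", "a", "c", "b"))

# lapply() returns a list with one element per level of f.
sums_list <- lapply(split(x1, f), sum)
is.list(as.vector(sums_list))          # still a list after as.vector()

# unlist() flattens the list into a named numeric vector.
sums_vec <- unlist(sums_list)

# sapply() simplifies the result in one step.
sums_sap <- sapply(split(x1, f), sum)

# With an atomic vector, both columns have one entry per level.
Z <- data.frame(f = levels(f), x1 = as.vector(sums_vec))
```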

Thanks for your help,

Wolfgang Koller

Wolfgang Koller,  koller2 at fgr.wu-wien.ac.at
Forschungsinstitut für Europafragen
Wirtschaftsuniversitaet Wien
Althanstrasse 39-45, 1090 Vienna, Austria