[R] Grouping data.frames
Peter Dalgaard
p.dalgaard at biostat.ku.dk
Wed Jan 7 00:46:00 CET 2004
Olaf Mersmann <olafm at tako.de> writes:
> Hello all,
>
> I'm new to R (and the S language in general) so go easy on me if this is really simple.
>
> Given a data.frame df which looks like this:
>   f1 f2 f3 f4 c1 c2
> 1  y  y  a  b 10 20
> 2  n  y  b  a 20 20
> 3  n  n  b  b  8 10
> 4  y  n  a  a 30  5
>
> I'd like to aggregate it by the factors f1 and f2 (or f2 and f3, or any other combination of the factors) and compute the sum of c1 and c2 (as separate values). I can do this just fine as long as there is only one column with counts, using tapply or mApply from Hmisc, but I've been unable to come up with a solution that works with two or more columns.
>
> In SQL a query to achieve this would look something like this:
> SELECT f1, f2, sum(c1), sum(c2) FROM df GROUP BY f1, f2
>
> Any hints on how this is done efficiently in R would be greatly appreciated.
I think aggregate() will do what you want. If not, notice that
whatever you can do with a single factor, you can also do with
interaction(f1,f2) or maybe interaction(f1,f2, drop=TRUE).
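
As a rough sketch of both suggestions (not tested against the poster's real data): the data frame below is retyped from the question, and the names sums, g, sums_c1 and sums_c2 are made up for illustration.

```r
# Example data retyped from the question (assumed to match the poster's df)
df <- data.frame(
  f1 = factor(c("y", "n", "n", "y")),
  f2 = factor(c("y", "y", "n", "n")),
  f3 = factor(c("a", "b", "b", "a")),
  f4 = factor(c("b", "a", "b", "a")),
  c1 = c(10, 20, 8, 30),
  c2 = c(20, 20, 10, 5)
)

# aggregate(): sum c1 and c2 separately within each f1/f2 combination,
# roughly SELECT f1, f2, sum(c1), sum(c2) ... GROUP BY f1, f2
sums <- aggregate(df[c("c1", "c2")],
                  by = list(f1 = df$f1, f2 = df$f2),
                  FUN = sum)
print(sums)

# interaction(): collapse f1 and f2 into a single factor, then reuse
# whatever already works for one factor, e.g. tapply() per count column
g <- interaction(df$f1, df$f2, drop = TRUE)
sums_c1 <- tapply(df$c1, g, sum)
sums_c2 <- tapply(df$c2, g, sum)
print(cbind(c1 = sums_c1, c2 = sums_c2))
```

Swapping the entries of the by list (say f2 and f3) should give the other groupings. drop=TRUE only matters when some factor combinations do not occur in the data.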
--
   O__  ---- Peter Dalgaard             Blegdamsvej 3
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907