[R] aggregate function

Gabor Grothendieck ggrothendieck at myway.com
Mon Jul 26 21:12:30 CEST 2004


[Sorry if this gets posted twice.  I have been having some
problems with gmane posting.]

We can use rowsum like this:

	> rowsum(frame$Total * (frame[,3:5]>0), frame$Year)

	     Tus  Whi Norw
	1994   1 0.00    1
	1995   1 1.00    1
	1997   2 4.00    5
	1998   0 0.00    1
	1999   1 2.04    2

Note that only years that are actually present in the data appear in
the resulting matrix. 1996 is not in the sample data in your post,
so there is no row for 1996. If that's not a problem, or if your
real data covers all the years anyway, we are done.
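
For anyone who wants to reproduce the rowsum call, here is a minimal
sketch. It assumes the frame holds exactly the 20 rows and five columns
shown in the quoted post; the intermediate names caught, weighted and
result are only illustrative and do not appear in the original.

```r
# Sample data transcribed from the quoted question (five columns shown there).
frame <- data.frame(
  Year  = c(1994, 1995, 1995, 1995, 1997, 1997, 1997, 1994, 1997, 1998,
            1999, 1999, 1999, 1999, 1999, 1999, 1999, 1997, 1997, 1997),
  Total = c(1, 1, 1, 1, 1, 0.58, 1, 1, 1, 1,
            1.04, 1, 1.02, 1, 1, 1, 1, 1, 1, 1),
  Tus   = c(1830, 0, 0, 4910, 0, 0, 0, 0, 0, 0,
            0, 0, 0, 0, 0, 1794, 0, 1335, 4925, 0),
  Whi   = c(0, 0, 0, 4280, 0, 0, 0, 0, 40, 0,
            74, 0, 0, 0, 0, 0, 3525, 1185, 1057, 6275),
  Norw  = c(355, 0, 0, 695, 110, 0, 0, 0, 70, 1252,
            0, 0, 0, 0, 171, 229, 0, 147, 4801, 1773)
)

# Logical matrix: TRUE where the column value in that row is positive.
caught <- frame[, 3:5] > 0

# Total is recycled down each column, so a cell holds Total where the
# column value was positive and 0 otherwise.
weighted <- frame$Total * caught

# Sum the rows of that matrix within each Year.
result <- rowsum(weighted, frame$Year)
print(result)
```

Splitting the one-liner into steps makes it easier to see that the
TRUE/FALSE matrix acts as a 0/1 mask on Total before the per-year sums
are taken.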

If missing years are a problem, first merge in a row for every year
and set the resulting NAs to zero. The first two lines below do this;
the third line is the same rowsum call as above:

	> frame <- merge(frame, 1994:1999, by = 1, all = TRUE)
	> frame[is.na(frame)] <- 0

	> rowsum(frame$Total * (frame[,3:5]>0), frame$Year)

	     Tus  Whi Norw
	1994   1 0.00    1
	1995   1 1.00    1
	1996   0 0.00    0   <-- now we have a row for 1996
	1997   2 4.00    5
	1998   0 0.00    1
	1999   1 2.04    2
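
As for the error from the loop in the quoted post: it most likely comes
from frame$"i", which looks for a column literally named "i" rather
than using the value of the loop variable. That returns NULL, so the
comparison has length zero and aggregate complains about unequal
lengths. A minimal sketch of the difference, using a small hypothetical
frame (not the poster's data) and a grouping name, Positive, that is
only illustrative:

```r
# Hypothetical three-row frame, just large enough to show the behaviour.
frame <- data.frame(Year  = c(1994, 1994, 1995),
                    Total = c(1, 1, 1),
                    Tus   = c(1830, 0, 0),
                    Whi   = c(0, 0, 4280))

i <- 3
print(is.null(frame$"i"))     # TRUE: there is no column named "i"
print(length(frame$"i" > 0))  # 0, so the grouping lists differ in length

# frame[[i]] uses the value of i, so the comparison has one entry per row.
a <- list()
for (i in 3:4) {
  nm <- names(frame)[i]
  a[[nm]] <- aggregate(frame["Total"],
                       list(Year = frame$Year, Positive = frame[[i]] > 0),
                       sum)
}
print(a$Tus)
```

This keeps the aggregate approach from the question; the rowsum version
above gives the same sums in a single matrix without the loop.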


Luis Rideau Cruz <Luisr at frs.fo> :

I have the following frame (there are more columns than shown):
   1     2    3    4    5
Year Total  Tus  Whi Norw
1994  1.00 1830    0  355
1995  1.00    0    0    0
1995  1.00    0    0    0
1995  1.00 4910 4280  695
1997  1.00    0    0  110
1997  0.58    0    0    0
1997  1.00    0    0    0
1994  1.00    0    0    0
1997  1.00    0   40   70
1998  1.00    0    0 1252
1999  1.04    0   74    0
1999  1.00    0    0    0
1999  1.02    0    0    0
1999  1.00    0    0    0
1999  1.00    0    0  171
1999  1.00 1794    0  229
1999  1.00    0 3525    0
1997  1.00 1335 1185  147
1997  1.00 4925 1057 4801
1997  1.00    0 6275 1773

I am trying to get sum("Total") by "Year" for the rows in which Tus>0, sum("Total") by "Year" for the rows in which Whi>0, and so on.

I have done something like this:

a<-as.list(numeric(3))
for (i in 3:5)
{
a[[i]]<-aggregate(frame[,"Total"],list(Year=frame$"Year",
Tus=frame$"i">0),sum)
}

But I get

"Error in FUN(X[[as.integer(1)]], ...) : arguments must have same length"

Also, doing the columns one by one:

aggregate(frame[,"Total"],list(Year=frame$"Year",
Tus=frame$"Tus">0),sum)


The result is something like:

Year   Tus     x
1994 FALSE 49.69
1995 FALSE 49.35
1996 FALSE 56.95
1997 FALSE 57.00
1998 FALSE 57.00
1999 FALSE 58.09
2000 FALSE 56.97
2001 FALSE 57.95
2002 FALSE 57.10
2003 FALSE 56.16
2000  TRUE  1.00
2002  TRUE  1.00
2003  TRUE  2.01



