[R] Aggregate behaviour inconsistent (?) when FUN=table
Alain Guillet
alain.guillet at uclouvain.be
Tue Feb 6 09:20:03 CET 2018
Dear R users,
When I use aggregate with table as FUN, I get what I would call
strange behaviour when it involves numerical vectors and one of their
values ("levels") does not occur in every level of the "by" variable:
---------------------------
> df <-
data.frame(A=c(1,1,1,1,0,0,0,0),B=c(1,0,1,0,0,0,1,0),C=c(1,0,1,0,0,1,1,1))
> aggregate(df[1:2],list(df$C),table,simplify = TRUE,drop=TRUE)
  Group.1 A.0 A.1    B
1       0   1   2    3
2       1   3   2 2, 3
> table(df$C,df$B)
    0 1
  0 3 0
  1 2 3
---------------
As you can see, a comma appears in the B column of the aggregate
result, whereas when I call table directly I obtain the same result as
if B were defined as a factor (I suppose this comes from the fact that
"non-factor arguments a are coerced via factor", according to the
details of the table help). I find this completely normal once I
remember that aggregate first splits the data into subsets and then
computes the table on each subset. But then I don't understand why it
works differently with character vectors. Indeed, if I use character
vectors, I get the same result as with factors:
------------------------
> df <-
data.frame(A=factor(c("1","1","1","1","0","0","0","0")),B=factor(c("1","0","1","0","0","0","1","0")),C=factor(c("1","0","1","0","0","1","1","1")))
> aggregate(df[1:2],list(df$C),table,simplify = TRUE,drop=TRUE)
  Group.1 A.0 A.1 B.0 B.1
1       0   1   2   3   0
2       1   3   2   2   3
> df <-
data.frame(A=factor(c(1,1,1,1,0,0,0,0)),B=factor(c(1,0,1,0,0,0,1,0)),C=factor(c(1,0,1,0,0,1,1,1)))
> aggregate(df[1:2],list(df$C),table,simplify = TRUE,drop=TRUE)
  Group.1 A.0 A.1 B.0 B.1
1       0   1   2   3   0
2       1   3   2   2   3
---------------------
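The difference between the numeric and factor cases can be reproduced without aggregate. The following is a minimal sketch of the split-then-tabulate step described above; it only approximates what aggregate does, and the names parts, num_tabs and fac_tabs are illustrative, not taken from the aggregate source.

```r
df <- data.frame(A = c(1, 1, 1, 1, 0, 0, 0, 0),
                 B = c(1, 0, 1, 0, 0, 0, 1, 0),
                 C = c(1, 0, 1, 0, 0, 1, 1, 1))

# Split the numeric column B by the grouping variable C,
# then tabulate each subset independently.
parts <- split(df$B, df$C)
num_tabs <- lapply(parts, table)

# Each subset is coerced to a factor on its own, so a value that is
# absent from a subset yields no cell: the tables differ in length.
lengths(num_tabs)

# When B is a factor before splitting, every subset keeps the full set
# of levels, so all tables have the same length and can be simplified.
fac_tabs <- lapply(split(factor(df$B), df$C), table)
lengths(fac_tabs)
```

With tables of unequal length, the per-group results cannot be combined into a matrix, which would be consistent with the "2, 3" entry shown in the first example.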
Would it be possible to document this behaviour in the aggregate help,
since the result does not fully match what one would expect from the
table help? Or would it be possible to have the same results
independently of the vector type? This post was rejected on the
R-devel mailing list, so I ask my question here as suggested.
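One possible way to get the same result for numeric input, sketched here as an assumption rather than as documented aggregate behaviour, is to fix the levels inside FUN so that every subset is tabulated over the same set of values. The helper count_levels and its default levels c(0, 1) are hypothetical and chosen for this example only.

```r
df <- data.frame(A = c(1, 1, 1, 1, 0, 0, 0, 0),
                 B = c(1, 0, 1, 0, 0, 0, 1, 0),
                 C = c(1, 0, 1, 0, 0, 1, 1, 1))

# Hypothetical helper: tabulate x over a fixed set of levels, so that
# values absent from a subset are reported with a count of zero.
count_levels <- function(x, levels = c(0, 1)) {
  table(factor(x, levels = levels))
}

# Every group now returns a table of the same length, so aggregate can
# simplify the B column to a matrix instead of a list.
res <- aggregate(df[1:2], list(df$C), count_levels)
res
```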
Best regards,
Alain Guillet
--
Alain Guillet
Statistician and Computer Scientist
SMCS - IMMAQ - Université catholique de Louvain
http://www.uclouvain.be/smcs
Bureau c.316
Voie du Roman Pays, 20 (bte L1.04.01)
B-1348 Louvain-la-Neuve
Belgium
Tel: +32 10 47 30 50
Accès: http://www.uclouvain.be/323631.html