[R] expand.grid
Berwin A Turlach
berwin at maths.uwa.edu.au
Wed Jan 19 11:04:09 CET 2011
G'day Nick,
On Wed, 19 Jan 2011 09:43:56 +0100
"Nick Sabbe" <nick.sabbe at ugent.be> wrote:
> Given a dataframe
>
> dfr<-data.frame(c1=c("a", "b", NA, "a", "a"), c2=c("d", NA, "d", "e",
> "e"), c3=c("g", "h", "i", "j", "k"))
>
> I would like to have a dataframe with all (unique) combinations of
> all the factors present.
Easy:
R> expand.grid(lapply(dfr, levels))
   c1 c2 c3
1   a  d  g
2   b  d  g
3   a  e  g
4   b  e  g
5   a  d  h
6   b  d  h
7   a  e  h
8   b  e  h
9   a  d  i
10  b  d  i
11  a  e  i
12  b  e  i
13  a  d  j
14  b  d  j
15  a  e  j
16  b  e  j
17  a  d  k
18  b  d  k
19  a  e  k
20  b  e  k
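A note for anyone trying this on R 4.0.0 or later: data.frame() no longer
converts strings to factors by default, so levels() returns NULL for every
column of dfr as constructed in the question. A minimal sketch, assuming that
setup, which converts the columns explicitly first (dfr_f and all_combos are
just illustrative names):

```r
# Example data from the question; on R >= 4.0.0 these columns are character.
dfr <- data.frame(c1 = c("a", "b", NA, "a", "a"),
                  c2 = c("d", NA, "d", "e", "e"),
                  c3 = c("g", "h", "i", "j", "k"))

# Convert every column to a factor explicitly (harmless if already factors).
dfr_f <- as.data.frame(lapply(dfr, factor))

# All possible combinations of the observed levels; NA is not a level,
# so it does not appear in the grid.
all_combos <- expand.grid(lapply(dfr_f, levels))
nrow(all_combos)  # 2 * 2 * 5 rows
```

Passing stringsAsFactors = TRUE to data.frame() should have the same effect
as the explicit conversion.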
> In fact, I would like a simple solution for these two cases: given
> the three factor columns above, I would like both all _possible_
> combinations of the factor levels, and all _present_ combinations of
> the factor levels (e.g. if I would do this for the first 4 rows of
> dfr, it would contain no combinations with c3="k").
R> dfrpart <- lapply(dfr[1:4,], factor)
R> expand.grid(lapply(dfrpart, levels))
   c1 c2 c3
1   a  d  g
2   b  d  g
3   a  e  g
4   b  e  g
5   a  d  h
6   b  d  h
7   a  e  h
8   b  e  h
9   a  d  i
10  b  d  i
11  a  e  i
12  b  e  i
13  a  d  j
14  b  d  j
15  a  e  j
16  b  e  j
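The grid above crosses all levels that occur in the first four rows. If
"present combinations" is instead meant literally, i.e. only the rows that
actually occur in the data, then unique() on the data frame may be closer to
what is wanted. A sketch under that reading of the question (part, present
and present_complete are illustrative names):

```r
dfr <- data.frame(c1 = c("a", "b", NA, "a", "a"),
                  c2 = c("d", NA, "d", "e", "e"),
                  c3 = c("g", "h", "i", "j", "k"),
                  stringsAsFactors = TRUE)

part <- dfr[1:4, ]

# Distinct rows that actually occur, rows containing NA included;
# droplevels() removes levels (such as "k") that no longer appear.
present <- droplevels(unique(part))

# The same, but keeping only rows without any NA.
present_complete <- droplevels(unique(na.omit(part)))
```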
> It would also be nice to be able to choose whether or not NA's are
> included.
R> expand.grid(lapply(dfrpart, function(x) c(levels(x),
+ if(any(is.na(x))) NA else NULL)))
     c1   c2 c3
1     a    d  g
2     b    d  g
3  <NA>    d  g
4     a    e  g
5     b    e  g
6  <NA>    e  g
7     a <NA>  g
8     b <NA>  g
9  <NA> <NA>  g
10    a    d  h
11    b    d  h
....
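To make the NA handling a switch, the two calls above could be wrapped in a
small helper. This is only a sketch: level_grid() and its include_na argument
are hypothetical names, not part of R.

```r
# Hypothetical helper: grid of all combinations of the levels present in df,
# optionally adding NA for columns that contain missing values.
level_grid <- function(df, include_na = FALSE) {
  lev <- lapply(df, function(x) {
    x <- factor(x)  # re-factor so unused levels are dropped
    if (include_na && any(is.na(x))) c(levels(x), NA) else levels(x)
  })
  expand.grid(lev)
}

dfr <- data.frame(c1 = c("a", "b", NA, "a", "a"),
                  c2 = c("d", NA, "d", "e", "e"),
                  c3 = c("g", "h", "i", "j", "k"),
                  stringsAsFactors = TRUE)

grid_no_na   <- level_grid(dfr[1:4, ])                     # 2 * 2 * 4 rows
grid_with_na <- level_grid(dfr[1:4, ], include_na = TRUE)  # 3 * 3 * 4 rows
```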
HTH.
Cheers,
Berwin
========================== Full address ============================
Berwin A Turlach Tel.: +61 (8) 6488 3338 (secr)
School of Maths and Stats (M019) +61 (8) 6488 3383 (self)
The University of Western Australia FAX : +61 (8) 6488 1028
35 Stirling Highway
Crawley WA 6009 e-mail: berwin at maths.uwa.edu.au
Australia http://www.maths.uwa.edu.au/~berwin