[R] expand.grid
Nick Sabbe
nick.sabbe at ugent.be
Wed Jan 19 11:38:29 CET 2011
<slaps self in forehead/>
I appear to have misinterpreted the help: considering that it explicitly
makes note of factors, I wrongly assumed that it would use the levels of a
factor automatically. My bad.
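A small illustration of the difference (not from the original exchange; the factor `f` and the names `byValue` and `byLevel` are made up for the example). As far as I can tell, expand.grid works on the values it is handed, so repeated values are kept and unused levels contribute nothing:

```r
# A factor with a repeated value "a" and an unused level "c"
f <- factor(c("a", "b", "a", "a"), levels = c("a", "b", "c"))

# expand.grid() combines the supplied values, not the levels:
# 4 values of f times 2 values of g
byValue <- expand.grid(f = f, g = 1:2)
nrow(byValue)

# Passing the levels explicitly gives one row per level combination:
# 3 levels of f times 2 values of g
byLevel <- expand.grid(f = levels(f), g = 1:2)
nrow(byLevel)
```

So the levels have to be extracted by hand before the call.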
For completeness' sake, my final solution:
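# Return the levels of one column: either all declared levels or only
# those that actually occur, optionally followed by NA.
getLevels <- function(vec, includeNA = FALSE, onlyOccurring = FALSE)
{
  if (onlyOccurring)
  {
    # Re-factoring drops levels that do not occur in vec
    rv <- levels(factor(vec))
  }
  else
  {
    rv <- levels(vec)
  }
  # Append NA only when asked for and when vec actually contains one
  if (includeNA && any(is.na(vec)))
  {
    rv <- c(rv, NA)
  }
  return(rv)
}

# All combinations of the (selected) levels of every column of dfr
expand.combs <- function(dfr, includeNA = FALSE, onlyOccurring = FALSE)
{
  expand.grid(lapply(dfr, getLevels, includeNA = includeNA,
                     onlyOccurring = onlyOccurring))
}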
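A usage sketch on the example data from the quoted message below. The functions are repeated in compact form so the block runs on its own. One assumption is spelled out: `stringsAsFactors = TRUE` is passed explicitly, because newer R versions no longer turn strings into factors by default, and `levels()` of a plain character column is NULL. The result names (`allCombs`, `withNA`, `partCombs`) are only for illustration.

```r
getLevels <- function(vec, includeNA = FALSE, onlyOccurring = FALSE) {
  # Re-factoring drops unused levels; otherwise keep all declared levels
  rv <- if (onlyOccurring) levels(factor(vec)) else levels(vec)
  if (includeNA && any(is.na(vec))) rv <- c(rv, NA)
  rv
}

expand.combs <- function(dfr, includeNA = FALSE, onlyOccurring = FALSE) {
  expand.grid(lapply(dfr, getLevels, includeNA = includeNA,
                     onlyOccurring = onlyOccurring))
}

# Example data; the columns must be factors for levels() to work
dfr <- data.frame(c1 = c("a", "b", NA, "a", "a"),
                  c2 = c("d", NA, "d", "e", "e"),
                  c3 = c("g", "h", "i", "j", "k"),
                  stringsAsFactors = TRUE)

# All possible level combinations: 2 * 2 * 5 rows
allCombs <- expand.combs(dfr)
nrow(allCombs)

# The same with NA added for the columns that contain one: 3 * 3 * 5 rows
withNA <- expand.combs(dfr, includeNA = TRUE)
nrow(withNA)

# Only levels occurring in the first four rows (no "k"): 2 * 2 * 4 rows
partCombs <- expand.combs(dfr[1:4, ], onlyOccurring = TRUE)
nrow(partCombs)
```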
Thx.
Nick Sabbe
--
ping: nick.sabbe at ugent.be
link: http://biomath.ugent.be
wink: A1.056, Coupure Links 653, 9000 Gent
ring: 09/264.59.36
-- Do Not Disapprove
-----Original Message-----
From: Berwin A Turlach [mailto:berwin at maths.uwa.edu.au]
Sent: woensdag 19 januari 2011 11:04
To: Nick Sabbe
Cc: r-help at r-project.org
Subject: Re: [R] expand.grid
G'day Nick,
On Wed, 19 Jan 2011 09:43:56 +0100
"Nick Sabbe" <nick.sabbe at ugent.be> wrote:
> Given a dataframe
>
> dfr<-data.frame(c1=c("a", "b", NA, "a", "a"), c2=c("d", NA, "d", "e",
> "e"), c3=c("g", "h", "i", "j", "k"))
>
> I would like to have a dataframe with all (unique) combinations of
> all the factors present.
Easy:
R> expand.grid(lapply(dfr, levels))
   c1 c2 c3
1   a  d  g
2   b  d  g
3   a  e  g
4   b  e  g
5   a  d  h
6   b  d  h
7   a  e  h
8   b  e  h
9   a  d  i
10  b  d  i
11  a  e  i
12  b  e  i
13  a  d  j
14  b  d  j
15  a  e  j
16  b  e  j
17  a  d  k
18  b  d  k
19  a  e  k
20  b  e  k
> In fact, I would like a simple solution for these two cases: given
> the three factor columns above, I would like both all _possible_
> combinations of the factor levels, and all _present_ combinations of
> the factor levels (e.g. if I would do this for the first 4 rows of
> dfr, it would contain no combinations with c3="k").
R> dfrpart <- lapply(dfr[1:4,], factor)
R> expand.grid(lapply(dfrpart, levels))
   c1 c2 c3
1   a  d  g
2   b  d  g
3   a  e  g
4   b  e  g
5   a  d  h
6   b  d  h
7   a  e  h
8   b  e  h
9   a  d  i
10  b  d  i
11  a  e  i
12  b  e  i
13  a  d  j
14  b  d  j
15  a  e  j
16  b  e  j
> It would also be nice to be able to choose whether or not NA's are
> included.
R> expand.grid(lapply(dfrpart, function(x) c(levels(x),
+ if(any(is.na(x))) NA else NULL)))
     c1   c2 c3
1     a    d  g
2     b    d  g
3  <NA>    d  g
4     a    e  g
5     b    e  g
6  <NA>    e  g
7     a <NA>  g
8     b <NA>  g
9  <NA> <NA>  g
10    a    d  h
11    b    d  h
....
HTH.
Cheers,
Berwin
========================== Full address ============================
Berwin A Turlach Tel.: +61 (8) 6488 3338 (secr)
School of Maths and Stats (M019) +61 (8) 6488 3383 (self)
The University of Western Australia FAX : +61 (8) 6488 1028
35 Stirling Highway
Crawley WA 6009 e-mail: berwin at maths.uwa.edu.au
Australia http://www.maths.uwa.edu.au/~berwin