How to create unique factor from two factors? + Bootstrap Q
Prof Brian Ripley
ripley at stats.ox.ac.uk
Sun Nov 9 16:14:06 CET 2003
Well, it is one of those things
-- it works in R but not in S
-- it appears in the examples for help(":") but is not otherwise mentioned
on the help page (why?)
-- it does not give a numerical list of combinations, as asked for
-- it does give unused levels, which in this application is disastrous.
so I at least do not find it `easier'.
> a <- factor(letters)[1:6]
> b <- factor(rep(letters[1:3], each=2))
> a:b
[1] a:a b:a c:b d:b e:c f:c
78 Levels: a:a a:b a:c b:a b:b b:c c:a c:b c:c d:a d:b d:c e:a e:b e:c ...
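A minimal sketch of one way round the last two points, using the same a and b as above: re-factoring the result drops the unused levels, and as.integer() then gives integer codes for the combinations that actually occur. The name ab is only for illustration.

```r
a <- factor(letters)[1:6]
b <- factor(rep(letters[1:3], each = 2))

# a:b carries all 26 * 3 = 78 level combinations;
# calling factor() on it again keeps only those that occur.
ab <- factor(a:b)
nlevels(ab)

# Integer codes for the combinations present in the data.
as.integer(ab)
```

interaction(a, b, drop = TRUE) should give the same grouping, with "." rather than ":" as the separator in the labels.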
On Sun, 9 Nov 2003 kjetil at entelnet.bo wrote:
> On 9 Nov 2003 at 13:29, Prof Brian Ripley wrote:
>
> > Factor3 <- factor(unclass(Factor1) + nlevels(Factor1)*(unclass(Factor2)-1))
> >
>
> Cannot this be done even easier by calculating the interaction?
>
> > a <- factor(rep(1:3,rep(3,3)))
a <- factor(rep(1:3, each=3)) is definitely easier!
> > b <- factor(rep(1:3,3))
> > ab <- a:b
> > ab
> [1] 1:1 1:2 1:3 2:1 2:2 2:3 3:1 3:2 3:3
> Levels: 1:1 1:2 1:3 2:1 2:2 2:3 3:1 3:2 3:3
>
> Kjetil Halvorsen
>
> > will give you the unique combinations, not labelled as you do but then I
> > don't think you need that.
> >
> > On Sun, 9 Nov 2003, Scott Norton wrote:
> >
> > > This might be easy but I'm very new to R and this question doesn't seem to
> > > have any nice keywords that bring up relevant search results when I search
> > > the CRAN search engine. Therefore, I'll plead (as I have in the recent
> > > past) Newbie status.
> > >
> > >
> > >
> > > I have a data frame with two factors (Factor 1 and 2) which together specify
> > > another unique level. I want to create a third factor in the data frame
> > > that captures this uniqueness.
> > >
> > > For example, say I had dataframe, Df, with Factors, 1 and 2. I want to
> > > create Factor 3 and add it to my Df dataframe.
> > >
> > > i.e.
> > >
> > > Df dataframe (Factor3 is the column I WANT TO CREATE):
> > >
> > > Row#  Factor1  Factor2  Factor3  Data
> > >    1        1        1        1    23
> > >    2        1        2        2    43
> > >    3        1        2        2    19
> > >    4        1        2        2    11
> > >    5        1        4        3     3
> > >    6        1        4        3    13
> > >    7        3        1        4    52
> > >    8        3        1        4    12
> > >    9        3        1        4     9
> > >   10        3        3        5    21
> > >   11        3        3        5    43
> > >   12        4        1        6    32
> > >   13        4        1        6    18
> > >   14        4        2        7    52
> > >   15        4        2        7    21
> > >
> > > and of course, I'm trying to create Factor 3 without loops.
> > >
> > >
> > >
> > > My end goal here (which I add because maybe I don't need to create Factor 3,
> > > although I'm still curious) is to bootstrap "sample" Factor 3. I want to
> > > repeatedly grab, say, 3 levels of Factor 3, and take the mean of those
> > > levels (e.g. say in my first bootstrap sample I grab levels 2, 4, and 7 from
> > > Factor 3; then I want to take the mean of rows 2, 3, 4, 7, 8, 9, 14, 15). Of
> > > course, each sample from Factor 3 for my bootstrap will most likely have a
> > > differing number of rows, since my experiment is not balanced. I'm not sure
> > > yet whether this is an issue when I try to implement the "boot" function in R
> > > (I haven't gotten to that point yet).
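A sketch of the example just described, with the table retyped by hand as a data frame. It assumes Factor1 and Factor2 are stored as factors, and it swaps the roles of the two factors in the coding formula so that Factor1 varies slowest, which reproduces the 1 to 7 labelling in the table. The names codes and picked are illustrative only.

```r
Df <- data.frame(
  Factor1 = factor(c(1, 1, 1, 1, 1, 1, 3, 3, 3, 3, 3, 4, 4, 4, 4)),
  Factor2 = factor(c(1, 2, 2, 2, 4, 4, 1, 1, 1, 3, 3, 1, 1, 2, 2)),
  Data    = c(23, 43, 19, 11, 3, 13, 52, 12, 9, 21, 43, 32, 18, 52, 21)
)

# One distinct integer per (Factor1, Factor2) pair, Factor1 varying slowest.
codes <- as.integer(Df$Factor2) +
  nlevels(Df$Factor2) * (as.integer(Df$Factor1) - 1L)

# Renumber the codes that occur as 1, 2, ..., without a loop.
Df$Factor3 <- as.integer(factor(codes))

# The example from the question: levels 2, 4 and 7.
picked <- c(2, 4, 7)
which(Df$Factor3 %in% picked)           # rows 2 3 4 7 8 9 14 15
mean(Df$Data[Df$Factor3 %in% picked])   # 27.375

# A random draw of 3 levels would replace 'picked' with:
# sample(unique(Df$Factor3), 3)
```

The unequal group sizes cause no difficulty here, since the mean is taken over whatever rows the chosen levels select.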
> >
> > The boot package will easily do this for you.
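A rough sketch of how that might look, not a definitive recipe. It assumes the Factor3 column from the table already exists, and it resamples whole levels of Factor3 with replacement (all 7 per replicate, which is the ordinary bootstrap, rather than 3 at a time as in the question). The names groups, level_mean and res are illustrative, and R = 200 is an arbitrary choice.

```r
library(boot)  # recommended package, normally installed with R

Df <- data.frame(
  Factor3 = c(1, 2, 2, 2, 3, 3, 4, 4, 4, 5, 5, 6, 6, 7, 7),
  Data    = c(23, 43, 19, 11, 3, 13, 52, 12, 9, 21, 43, 32, 18, 52, 21)
)

# One list element per Factor3 level, holding that level's Data values.
groups <- split(Df$Data, Df$Factor3)

# boot() supplies resampled indices i into the vector of level numbers;
# the statistic pools the rows of the chosen levels and takes their mean.
level_mean <- function(lev, i) mean(unlist(groups[lev[i]]))

set.seed(1)  # arbitrary seed, for reproducibility only
res <- boot(seq_along(groups), level_mean, R = 200)

res$t0       # statistic on the original levels: the overall mean of Data
head(res$t)  # first few bootstrap replicates
```

Because the levels rather than the rows are resampled, each replicate pools a different number of rows, so the lack of balance is handled inside the statistic function.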
> >
