[R] Re: coding factor replicates
Bill.Venables@CMIS.CSIRO.AU
Bill.Venables at CMIS.CSIRO.AU
Thu Jan 24 04:34:07 CET 2002
> -----Original Message-----
> From: Douglas Bates [mailto:bates at stat.wisc.edu]
> Sent: Thursday, January 24, 2002 8:55 AM
> To: Uwe Ligges
> Cc: Brad Buchsbaum; r-help at stat.math.ethz.ch
> Subject: Re: [R] Re: coding factor replicates
>
> Douglas Bates <bates at cs.wisc.edu> writes:
>
> > Uwe Ligges <ligges at statistik.uni-dortmund.de> writes:
> >
> > > Brad Buchsbaum wrote:
> > > >
> > > > Hi All,
> > > >
> > > > If I have a factor f:
> > > >
> > > > A B C B C A C B A A B ....
> > > >
> > > > and I would like to generate a factor to indicate the trial number
> > > > as a function of condition: e.g.
> > > >
> > > > 1 1 1 2 2 2 3 3 3 4 4 ...
> > > >
> > > > how might I attack this in R?
> > >
> > > What about something like
> > > as.factor(outer(rep(1, 3), 1:4))
> >
> > I think the point is that the 1's are at the first occurrence of the
> > level, the 2's at the second occurrence, etc. This seems like the sort
> > of problem that Bill Venables would come up with a devilishly clever
> > way of solving.
> >
> > I would do it as
> >
> > > result <- seq(along = f) # create a vector to hold the response
> > > sp <- split(seq(along = f), f) # split the factor on levels
> > > result[unlist(sp)] <- unlist(lapply(sp, function(x) seq(along = x)))
> > > result
> > [1] 1 1 1 2 2 2 3 3 3 4 4
> >
> > but I'm sure Bill would do it much more elegantly than that.
>
> Before others point out the obvious simplification (I did it in stages
> and assembled the "swish" result, as Bill would term it - apparently
> swish has a different connotation in Australia than it does in North
> America), the second line could be
>
> > sp <- split(result, f) # split the index vector on factor levels
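
For anyone who wants to run Doug's recipe end to end, here is a minimal sketch. It assumes f holds the eleven values from Brad's example, and it uses seq_along(), the current spelling of seq(along = ); the name idx is only for illustration.

```r
# Example data: the eleven values from the original question (an assumption;
# the question trails off with "....").
f <- factor(c("A", "B", "C", "B", "C", "A", "C", "B", "A", "A", "B"))

idx <- seq_along(f)     # positions 1..length(f)
sp <- split(idx, f)     # positions grouped by level of f
result <- idx           # placeholder of the right length, overwritten below

# Within each level, number the occurrences 1, 2, 3, ... and write those
# numbers back to the positions they came from.
result[unlist(sp)] <- unlist(lapply(sp, seq_along))
result
```

On this input it should give 1 1 1 2 2 2 3 3 3 4 4, matching the result quoted above. Nothing here relies on the trials being contiguous; it simply counts occurrences per level.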

Doug is much too kind (I think). The tricks with match() I have learned
from him are just amazing.

With this problem you can cheat a bit if you assume that the trials are
contiguous (as I think they must be). All you need to know then are (1) the
run length of a trial and (2) the number of trials.

> run.length <- which(duplicated(f))[1] - 1
> no.trials <- ceiling(length(f)/run.length)
> trials <- factor(rep(1:no.trials, rep(run.length, no.trials), length.out = length(f)))
> trials
[1] 1 1 1 2 2 2 3 3 3 4 4
Levels: 1 2 3 4
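
A self-contained sketch of the same contiguous-trials shortcut, again assuming f holds the eleven values from Brad's example. One deliberate change: the current documentation for rep() says that when both times and length.out are given, length.out takes priority and times is ignored, so this version uses each = to repeat every trial number run.length times before truncating.

```r
# Example data: the eleven values from the original question (an assumption).
f <- factor(c("A", "B", "C", "B", "C", "A", "C", "B", "A", "A", "B"))

# (1) Run length of a trial: the first repeated level marks the start of
#     trial 2, so everything before it is one complete trial.
run.length <- which(duplicated(f))[1] - 1

# (2) Number of trials, counting a final incomplete one.
no.trials <- ceiling(length(f) / run.length)

# Repeat each trial number run.length times, then cut to the length of f.
trials <- factor(rep(seq_len(no.trials), each = run.length,
                     length.out = length(f)))
trials
```

This only works under the stated assumption that every trial presents each level exactly once before the next trial begins.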
No more elegant than Doug's, I contend!
Bill Venables.