[R] the first and last observation for each subject
William Dunlap
wdunlap at tibco.com
Mon Jan 5 18:18:31 CET 2009
Arg, the 'sapply(...)' in the function was part of the initial
comment; the line was meant to read

gm <- function(x, group){ # medians by group: sapply(split(x,group),median)

but someone's mailer wrapped it, putting a newline before the sapply:

gm <- function(x, group){ # medians by group:
sapply(split(x,group),median)

so the sapply() call got executed, wasting all that time. (Of course
the same mailer will mess up this message in the same way -
just delete the sapply() call from gm().)
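For reference, here is a sketch of the corrected gm() with the stray
sapply() line removed, plus a quick consistency check against the naive
split/sapply version (the check data here is made up for illustration):

```r
# Bill Dunlap's technique: sort by (group, x), locate the boundaries of
# each group in the sorted vector, and read each group's median directly
# by index, avoiding a per-group call to median().
gm <- function(x, group){ # medians by group
  o <- order(group, x)
  group <- group[o]
  x <- x[o]
  changes <- group[-1] != group[-length(group)]  # TRUE where a new group starts
  first <- which(c(TRUE, changes))               # index of each group's first element
  last  <- which(c(changes, TRUE))               # index of each group's last element
  lowerMedian <- x[floor((first + last)/2)]
  upperMedian <- x[ceiling((first + last)/2)]
  median <- (lowerMedian + upperMedian)/2        # averages the two middle values when
  names(median) <- group[first]                  # a group has an even number of elements
  median
}

# Quick check against the split/sapply version on small simulated data:
set.seed(1)
group <- sample(1:5, size = 100, replace = TRUE)
x <- rnorm(length(group))
stopifnot(identical(gm(x, group), sapply(split(x, group), median)))
```

The speedup comes from replacing ~30000 calls to median() with a single
sort and a handful of vectorized operations over the whole data set.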
Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com
> -----Original Message-----
> From: hadley wickham [mailto:h.wickham at gmail.com]
> Sent: Monday, January 05, 2009 9:10 AM
> To: William Dunlap
> Cc: gallon.li at gmail.com; R help
> Subject: Re: [R] the first and last observation for each subject
>
> > Another application of that technique can be used to quickly compute
> > medians by groups:
> >
> > gm <- function(x, group){ # medians by group:
> > sapply(split(x,group),median)
> > o<-order(group, x)
> > group <- group[o]
> > x <- x[o]
> > changes <- group[-1] != group[-length(group)]
> > first <- which(c(TRUE, changes))
> > last <- which(c(changes, TRUE))
> > lowerMedian <- x[floor((first+last)/2)]
> > upperMedian <- x[ceiling((first+last)/2)]
> > median <- (lowerMedian+upperMedian)/2
> > names(median) <- group[first]
> > median
> > }
> >
> > For a 10^5-long x and somewhat fewer than 3*10^4 distinct groups
> > (in random order) the times are:
> >
> >> group<-sample(1:30000, size=100000, replace=TRUE)
> >> x<-rnorm(length(group))*10 + group
> >> unix.time(z0<-sapply(split(x,group), median))
> > user system elapsed
> > 2.72 0.00 3.20
> >> unix.time(z1<-gm(x,group))
> > user system elapsed
> > 0.12 0.00 0.16
> >> identical(z1,z0)
> > [1] TRUE
>
> I get:
>
> > unix.time(z0<-sapply(split(x,group), median))
> user system elapsed
> 2.733 0.017 2.766
> > unix.time(z1<-gm(x,group))
> user system elapsed
> 2.897 0.032 2.946
>
>
> Hadley
>
>
> --
> http://had.co.nz/
>