[R] long format - find age when another variable is first 'high'

Gabor Grothendieck ggrothendieck at gmail.com
Mon May 25 16:14:28 CEST 2009


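Depending on what you want (I haven't checked the speed), you could try
the following. Here the ldlc value in the first row has been changed
(122 instead of 132) so that id=1 has no value > 130, just to illustrate
that case as well. Note that the queries test ldlc > 130; use >= 130 to
match the threshold in your question: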

> d <- data.frame(id = c(rep(1, 3),  rep(2, 2), 3),  age=c(5, 10, 15, 4, 7, 12),
+  ldlc=c(122, 120, 125, 105, 142, 160))

> library(sqldf)
> sqldf("select * from d left join (select id, min(age) min_age from d where ldlc > 130 group by id) using(id)")
  id age ldlc min_age
1  1   5  122    <NA>
2  1  10  120    <NA>
3  1  15  125    <NA>
4  2   4  105     7.0
5  2   7  142     7.0
6  3  12  160    12.0

> # or this (which just gives the data frame of id and min_age):

> sqldf("select id, min_age from d left join (select id, min(age) min_age from d where ldlc > 130 group by id) using(id) group by id")
  id min_age
1  1    <NA>
2  2     7.0
3  3    12.0

> # or this (which is similar but omits the NAs)

> sqldf("select id, min(age) from d where ldlc > 130 group by id")
  id min(age)
1  2        7
2  3       12
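
For comparison, here is a minimal base R sketch of the same idea without
sqldf. It is an illustration only and not part of the queries above: it
assumes the modified d from this message, the names first_high and d2 are
made up for the example, and its speed has not been checked either.

```r
# Same example data as above (first ldlc changed to 122 so id 1 has no high value)
d <- data.frame(id = c(rep(1, 3), rep(2, 2), 3),
                age = c(5, 10, 15, 4, 7, 12),
                ldlc = c(122, 120, 125, 105, 142, 160))

# Youngest age per id among rows with ldlc > 130;
# ids with no such row do not appear in the result
first_high <- aggregate(age ~ id, data = d[d$ldlc > 130, ], FUN = min)
names(first_high)[2] <- "min_age"

# Left join back onto the full data; ids without a high reading get NA
d2 <- merge(d, first_high, by = "id", all.x = TRUE)
```

Here first_high corresponds to the last query (NAs omitted) and d2 to the
first one (every row kept, with NA where ldlc never exceeds 130).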

See the sqldf home page at:
http://sqldf.googlecode.com

On Mon, May 25, 2009 at 8:45 AM, David Freedman <3.14david at gmail.com> wrote:
>
> Dear R,
>
> I've got a data frame with children examined multiple times and at various
> ages.  I'm trying to find the first age at which another variable
> (LDL-Cholesterol) is >= 130 mg/dL; for some children, this may never happen.
> I can do this with transformBy and ddply, but with 10,000 different
> children, these functions take some time on my PCs - is there a faster way
> to do this in R?  My code on a small dataset follows.
>
> Thanks very much, David Freedman
>
> d <- data.frame(id = c(rep(1, 3), rep(2, 2), 3), age = c(5, 10, 15, 4, 7, 12),
>                 ldlc = c(132, 120, 125, 105, 142, 160))
> d$high.ldlc <- ifelse(d$ldlc >= 130, 1, 0)
> d
> library(plyr)
> d2 <- ddply(d, ~id, transform, plyr.minage = min(age[high.ldlc == 1]))
> library(doBy)
> d2 <- transformBy(~id, data = d2, doby.minage = min(age[high.ldlc == 1]))
> d2
> --
> View this message in context: http://www.nabble.com/long-format---find-age-when-another-variable-is-first-%27high%27-tp23706393p23706393.html
> Sent from the R help mailing list archive at Nabble.com.