[R] filter data set unique, duplicate..

Dimitris Rizopoulos dimitris.rizopoulos at med.kuleuven.be
Wed Aug 3 11:21:32 CEST 2005


Maybe you could consider something like this, which takes the maximum of
each y column within each value of x:

dat <- data.frame(x = c(1, 2, 2, 3, 3, 4),
                  y1 = c(1, 1, 2, 1, 7, 8),
                  y2 = c(NA, NA, NA, 5, 5, 4),
                  y3 = c(3, 11, NA, 16, 2, 1))
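# For each y column, take the maximum within each value of x, ignoring NAs.
out <- as.data.frame(lapply(dat[-1], function(y, x)
    tapply(y, x, max, na.rm = TRUE), x = dat$x))
# max() of an all-NA group gives -Inf (with a warning); turn it back into NA.
out[out == -Inf] <- NA
# tapply() orders the groups by sorted x, so sort the unique keys to match.
out$x <- sort(unique(dat$x))
out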
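
As an alternative, here is a minimal sketch of the same per-group maximum using aggregate(). The helper name max_or_na is made up for this example (it is not part of base R); it returns NA for a group that contains only NAs, which avoids the -Inf values and the warnings from max(). The result is stored in out2, also an illustrative name.

```r
dat <- data.frame(x  = c(1, 2, 2, 3, 3, 4),
                  y1 = c(1, 1, 2, 1, 7, 8),
                  y2 = c(NA, NA, NA, 5, 5, 4),
                  y3 = c(3, 11, NA, 16, 2, 1))

# Hypothetical helper: group maximum that yields NA (not -Inf)
# when every value in the group is missing.
max_or_na <- function(v) {
    if (all(is.na(v))) NA_real_ else max(v, na.rm = TRUE)
}

# One row per distinct x, one column per y variable.
out2 <- aggregate(dat[-1], by = list(x = dat$x), FUN = max_or_na)
out2
```

With aggregate() the grouping column comes back as part of the result, so there is no separate step to attach x afterwards.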


I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/336899
Fax: +32/16/337015
Web: http://www.med.kuleuven.be/biostat/
     http://www.student.kuleuven.be/~m0390867/dimitris.htm


----- Original Message ----- 
From: "Anders Bjørgesæter" <anders.bjorgesater at bio.uio.no>
To: <r-help at stat.math.ethz.ch>
Sent: Wednesday, August 03, 2005 10:40 AM
Subject: [R] filter data set unique, duplicate..


> Hello
>
> First, thanks for the help with my earlier question about error
> handling!
>
> I have a problem filtering a data set.
> I'm trying to filter the data in the y columns based on the values
> in the x column, e.g.:
>
> x     y1    y2    yn
> 1.0   1     NA    3
> 2.0   1     NA    11
> 2.0   2     NA    NA
> 3.0   1     5     16
> 3.0   7     5     2
> 4.0   8     4     1
>
> and want to keep the highest y if x is identical, like this:
>
> x     y1    y2    yn
> 1.0   1     NA    3
> 2.0   2     NA    11
> 3.0   7     5     16
> 4.0   8     4     1
>
> or just as good:
>
> x     y1    y2    yn
> 1.0   1     NA    3
> 2.0   NA*   NA    NA
> 2.0   2     NA    11
> 3.0   NA*   5     16
> 3.0   7     NA*   NA*
> 4.0   8     4     1
>
> If anyone has any suggestions or pointers on how to do this, I would
> really appreciate it.
>
> /Anders
>
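
The second layout in the question (all rows kept, non-maximal values set to NA) can be approximated with ave(). This is only a sketch of one reading of that layout: the helper name keep_max and the result name masked are made up here, tied maxima are all kept rather than reduced to one, and values are never moved between rows, so the output will not match the quoted table cell for cell.

```r
dat <- data.frame(x  = c(1, 2, 2, 3, 3, 4),
                  y1 = c(1, 1, 2, 1, 7, 8),
                  y2 = c(NA, NA, NA, 5, 5, 4),
                  y3 = c(3, 11, NA, 16, 2, 1))

# Hypothetical helper: within one group, keep the maximum and
# set every smaller value to NA. All-NA groups are returned as-is.
keep_max <- function(v) {
    if (all(is.na(v))) return(v)
    v[which(v != max(v, na.rm = TRUE))] <- NA
    v
}

# Apply the helper to each y column, grouped by x; row count is unchanged.
masked <- dat
masked[-1] <- lapply(dat[-1], function(y) ave(y, dat$x, FUN = keep_max))
masked
```

Because ave() returns a vector of the same length as its input, the data frame keeps its six rows and only the cell values change.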



