[R] filter data set unique, duplicate..
Sundar Dorai-Raj
sundar.dorai-raj at pdf.com
Wed Aug 3 20:58:24 CEST 2005
Hi, Anders/Dimitris,
Dimitris Rizopoulos wrote:
> maybe you could consider something like this:
>
> dat <- data.frame(x = c(1, 2, 2, 3, 3, 4),
>                   y1 = c(1, 1, 2, 1, 7, 8),
>                   y2 = c(NA, NA, NA, 5, 5, 4),
>                   y3 = c(3, 11, NA, 16, 2, 1))
> #############
> out <- as.data.frame(lapply(dat[-1], function(y, x) tapply(y, x, max,
>     na.rm = TRUE), x = dat["x"]))
> out[out == -Inf] <- NA
> out$x <- unique(dat["x"])
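Beware of this last line. tapply() returns the groups in sorted order,
whereas unique() returns values in order of first appearance. If "x" is
not already sorted in "dat", your rows will be misaligned.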
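To make the misalignment concrete, here is a small sketch. The data frame "dat2" is a made-up, unsorted variant of "dat" (it is not from the original post), and taking "x" from the names of the tapply() result is just one possible way to keep the rows aligned.

```r
# Hypothetical unsorted variant of "dat" (an assumption for illustration)
dat2 <- data.frame(x  = c(3, 1, 3, 2),
                   y1 = c(7, 1, 1, 2))

# tapply() orders its result by the sorted group levels ...
grp.max <- tapply(dat2$y1, dat2$x, max, na.rm = TRUE)
names(grp.max)    # "1" "2" "3"
unique(dat2$x)    # 3 1 2  (order of first appearance)

# ... so pairing grp.max with unique(dat2$x) would mislabel the rows.
# Taking x from the names of the tapply() result keeps them aligned.
safe <- data.frame(x  = as.numeric(names(grp.max)),
                   y1 = as.vector(grp.max))
safe
```

With this input, "safe" pairs x = 1, 2, 3 with the maxima 1, 2, 7, whereas unique(dat2$x) would have labelled those same maxima 3, 1, 2.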
Here's another solution using "by", though it's no more efficient than
what Dimitris has given.
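out <- by(dat[-1], dat[1], function(y) {
  # max that returns NA (not -Inf) when every value is missing
  max.na <- function(x)
    if(all(is.na(x))) NA else max(x, na.rm = TRUE)
  apply(y, 2, max.na)  # column-wise maximum within each "x" group
})
# stack the per-group vectors; the row names are the "x" values
out <- as.data.frame(do.call("rbind", out))
out <- cbind(x = as.numeric(row.names(out)), out)
out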
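As a cross-check, the same table can probably also be obtained with aggregate(). This is only a sketch under the assumption that the data look like Dimitris's "dat"; the helper "max.na" is the one from the by() call above, pulled out to top level, and "agg" is a name introduced here for illustration.

```r
# Example data as given by Dimitris
dat <- data.frame(x  = c(1, 2, 2, 3, 3, 4),
                  y1 = c(1, 1, 2, 1, 7, 8),
                  y2 = c(NA, NA, NA, 5, 5, 4),
                  y3 = c(3, 11, NA, 16, 2, 1))

# max that returns NA (not -Inf) when every value is missing
max.na <- function(x)
  if (all(is.na(x))) NA else max(x, na.rm = TRUE)

# One row per distinct x, with the group maximum in every y column;
# the grouping column is taken from dat["x"], so it stays aligned
agg <- aggregate(dat[-1], by = dat["x"], FUN = max.na)
agg
#   x y1 y2 y3
# 1 1  1 NA  3
# 2 2  2 NA 11
# 3 3  7  5 16
# 4 4  8  4  1
```

Because aggregate() builds the "x" column from the grouping values itself, there is no separate unique() step that could get out of order.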
HTH,
--sundar
> out
>
>
> I hope it helps.
>
> Best,
> Dimitris
>
> ----
> Dimitris Rizopoulos
> Ph.D. Student
> Biostatistical Centre
> School of Public Health
> Catholic University of Leuven
>
> Address: Kapucijnenvoer 35, Leuven, Belgium
> Tel: +32/16/336899
> Fax: +32/16/337015
> Web: http://www.med.kuleuven.be/biostat/
> http://www.student.kuleuven.be/~m0390867/dimitris.htm
>
>
> ----- Original Message -----
> From: "Anders Bjørgesæter" <anders.bjorgesater at bio.uio.no>
> To: <r-help at stat.math.ethz.ch>
> Sent: Wednesday, August 03, 2005 10:40 AM
> Subject: [R] filter data set unique, duplicate..
>
>
>
>>Hello
>>
>>First, thanks for the help for an earlier question about error
>>handling!
>>
>>I have a problem filtering a dataset.
>>I'm trying to filter the data in the y columns based on the values
>>in the x column, e.g.:
>>
>>  x  y1  y2  yn
>>1.0   1  NA   3
>>2.0   1  NA  11
>>2.0   2  NA  NA
>>3.0   1   5  16
>>3.0   7   5   2
>>4.0   8   4   1
>>
>>and want to keep the highest y if x is identical, like this:
>>
>>  x  y1  y2  yn
>>1.0   1  NA   3
>>2.0   2  NA  11
>>3.0   7   5  16
>>4.0   8   4   1
>>
>>or just as good:
>>
>>  x   y1   y2   yn
>>1.0    1   NA    3
>>2.0  NA*   NA   NA
>>2.0    2   NA   11
>>3.0  NA*    5   16
>>3.0    7  NA*  NA*
>>4.0    8    4    1
>>
>>If anyone has any suggestions or pointers on how to do this, I would
>>really appreciate it.
>>
>>/Anders
>>
>>______________________________________________
>>R-help at stat.math.ethz.ch mailing list
>>https://stat.ethz.ch/mailman/listinfo/r-help
>>PLEASE do read the posting guide!
>>http://www.R-project.org/posting-guide.html
>>
>
>