[R] select duplicate identifier with higher mean across sample columns
Adrian Johnson
oriolebaltimore at gmail.com
Mon Nov 5 16:47:48 CET 2012
Thanks a lot for the help.
-Adrian
On Sun, Nov 4, 2012 at 2:39 PM, jim holtman <jholtman at gmail.com> wrote:
> Is this what you want:
>
>> mdf <- read.table(text = " id samp1 samp2 samp2a
> + 1 A 100 110 110
> + 2 A 120 130 150
> + 3 C 101 131 151
> + 4 D 110 150 130
> + 5 E 132 122 122
> + 6 F 123 143 143", header = TRUE)
>> result <- do.call(rbind, lapply(split(mdf, mdf$id), function(.id){
> +     maxIndx <- which.max(rowMeans(.id[, -1L]))
> +     .id[maxIndx, ]
> + }))
>>
>> result
>   id samp1 samp2 samp2a
> A  A   120   130    150
> C  C   101   131    151
> D  D   110   150    130
> E  E   132   122    122
> F  F   123   143    143
>
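A vectorised alternative to the split/lapply approach above, offered only as a minimal sketch: compute the row means once, sort so that the highest-mean row of each id comes first, then drop the later duplicates. The names row_mean, ord, sorted and result2 are illustrative and do not appear in the thread, and the sketch assumes every column except id is a numeric sample column.

```r
# Same data as in the thread.
mdf <- data.frame(
  id     = c("A", "A", "C", "D", "E", "F"),
  samp1  = c(100, 120, 101, 110, 132, 123),
  samp2  = c(110, 130, 131, 150, 122, 143),
  samp2a = c(110, 150, 151, 130, 122, 143)
)

# Mean of the sample columns for each row (every column except id).
row_mean <- rowMeans(mdf[, -1L])

# Sort by id, then by descending row mean, so that the
# highest-mean row of each id comes first within its group.
ord <- order(mdf$id, -row_mean)
sorted <- mdf[ord, ]

# Keep only the first row encountered for each id.
result2 <- sorted[!duplicated(sorted$id), ]
result2
```

Unlike the do.call(rbind, ...) version, this keeps the original row numbers as row names rather than the id values. If two rows of the same id tie on the mean, the one that appears first in the data is kept.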
>
> On Sun, Nov 4, 2012 at 2:25 PM, Adrian Johnson
> <oriolebaltimore at gmail.com> wrote:
>> Hi Group:
>> I searched the R groups before posting this question, but I could not find
>> an appropriate answer, and I do not have a clear understanding of how to do
>> this in R.
>>
>> I have a data frame with duplicated row identifiers but with different
>> values across columns. For each duplicated identifier, I want to keep the
>> row with the higher inter-quartile range or mean.
>>
>>
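For the inter-quartile range criterion mentioned above, a minimal sketch using the same split/lapply idiom with IQR() in place of the mean. The names by_iqr, grp and row_iqr are illustrative only, and the sketch assumes every column except id is a numeric sample column. With only three sample columns the IQR is a rough measure of spread.

```r
# Same data as in the thread.
mdf <- data.frame(
  id     = c("A", "A", "C", "D", "E", "F"),
  samp1  = c(100, 120, 101, 110, 132, 123),
  samp2  = c(110, 130, 131, 150, 122, 143),
  samp2a = c(110, 150, 151, 130, 122, 143)
)

# For each id, compute the IQR of every row across the sample
# columns and keep the row with the largest value.
by_iqr <- do.call(rbind, lapply(split(mdf, mdf$id), function(grp) {
  row_iqr <- apply(as.matrix(grp[, -1L]), 1, IQR)
  grp[which.max(row_iqr), ]   # which.max() returns the first row on ties
}))
by_iqr
```

For the two A rows the IQRs are 5 and 15 under R's default quantile type, so the second A row is kept, the same row the mean criterion selects.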
>> id <- c("A", "A", "C", "D", "E", "F")
>> samp1 <- c(100, 120, 101, 110, 132, 123)
>> samp2 <- c(110, 130, 131, 150, 122, 143)
>> samp2a <- c(110, 150, 151, 130, 122, 143)
>> mdf <- data.frame(id, samp1, samp2, samp2a)
>>
>>
>>> mdf
>>   id samp1 samp2 samp2a
>> 1  A   100   110    110
>> 2  A   120   130    150
>> 3  C   101   131    151
>> 4  D   110   150    130
>> 5  E   132   122    122
>> 6  F   123   143    143
>>
>>
>> There are two A ids in this df. I want to select the row with the higher mean.
>>
>> How can I do this?
>> Thanks
>> Adrian
>>
>
>
>
> --
> Jim Holtman
> Data Munger Guru
>
> What is the problem that you are trying to solve?
> Tell me what you want to do, not how you want to do it.