[R] nested subset for a dataframe

David Winsemius dwinsemius at comcast.net
Wed Mar 10 16:58:04 CET 2010


On Mar 10, 2010, at 10:30 AM, arnaud chozo wrote:

> Hi,
>
> I've a beginner question. I'm trying to extract data in my dataframe
> according to some nested rules.
>
> I have something like the dataframe test.df:
>
> test.df = data.frame(V1=c(rep("A",10), rep("B",10), rep("C",5)),
> V2=c(rep(1,5), rep(2,5), rep(1,5), rep(2,5), rep(1,5)))
>
>   V1 V2
> 1   A  1
> 2   A  1
> 3   A  1
> 4   A  1
> 5   A  1
> 6   A  2
> 7   A  2
> 8   A  2
> 9   A  2
> 10  A  2
> 11  B  1
> 12  B  1
> 13  B  1
> 14  B  1
> 15  B  1
> 16  B  2
> 17  B  2
> 18  B  2
> 19  B  2
> 20  B  2
> 21  C  1
> 22  C  1
> 23  C  1
> 24  C  1
> 25  C  1
>
> For each value of the variable V1 (group A, B or C), I want to extract
> rows for which V2 is the max for the group in V1, in order to get:
>
>   V1 V2
> 1   A  2
> 2   A  2
> 3   A  2
> 4   A  2
> 5   A  2
> 6   B  2
> 7   B  2
> 8   B  2
> 9   B  2
> 10  B  2
> 11  C  1
> 12  C  1
> 13  C  1
> 14  C  1
> 15  C  1
>

 > test.df[test.df$V2 == ave(test.df$V2, test.df$V1, FUN=max), ]
    V1 V2
6   A  2
7   A  2
8   A  2
9   A  2
10  A  2
16  B  2
17  B  2
18  B  2
19  B  2
20  B  2
21  C  1
22  C  1
23  C  1
24  C  1
25  C  1

You get a bit of extra information: the row names of the result are
the original row numbers of the rows that were extracted. If you do
not want that information, assigning NULL to the rownames of the
result renumbers the rows from 1.
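
In case the one-liner is opaque, here is a sketch of the same steps taken
one at a time, followed by one way to drop the original row numbers. The
names grp.max and res are only illustrative.

```r
test.df <- data.frame(V1 = c(rep("A", 10), rep("B", 10), rep("C", 5)),
                      V2 = c(rep(1, 5), rep(2, 5), rep(1, 5), rep(2, 5), rep(1, 5)))

# ave() returns a vector as long as V2 holding, for every row,
# the maximum of V2 within that row's V1 group
grp.max <- ave(test.df$V2, test.df$V1, FUN = max)

# keep the rows whose V2 equals the maximum of their group
res <- test.df[test.df$V2 == grp.max, ]

# renumber the rows 1..n if the original positions are not wanted
rownames(res) <- NULL
```

Note that FUN has to be passed by name, since any unnamed argument after
the first is treated by ave() as a grouping variable.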

-- 
David.
> I wrote this function:
>
> mytest = function(df) {
>  myS = unique(df$V1)
>  df.tmp = subset(df, df$V1==myS[[1]])
>  df.sub = subset(df.tmp, df.tmp$V2==max(df.tmp$V2))
>  for (i in 2:length(myS)) {
>    df.tmp = subset(df, df$V1==myS[[i]])
>    df.sub = merge(df.sub, subset(df.tmp, df.tmp$V2==max(df.tmp$V2)),
> all=TRUE)
>  }
>  df.sub
> }
>
> but I need something more efficient and more general. Any ideas?
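
On the "more general" part, the ave() approach could be wrapped in a small
function that takes the column names as arguments. This is only a sketch:
max.rows is a made-up name, and it assumes the value column has no missing
values. Ties are all kept, which is what your example asks for.

```r
test.df <- data.frame(V1 = c(rep("A", 10), rep("B", 10), rep("C", 5)),
                      V2 = c(rep(1, 5), rep(2, 5), rep(1, 5), rep(2, 5), rep(1, 5)))

# hypothetical helper: rows of 'df' in which column 'value' equals the
# maximum of 'value' within each level of column 'group'
# (assumes 'value' contains no NA)
max.rows <- function(df, group, value) {
  keep <- df[[value]] == ave(df[[value]], df[[group]], FUN = max)
  df[keep, , drop = FALSE]
}

res2 <- max.rows(test.df, "V1", "V2")
```

No loop or merge() is needed, because ave() does the per-group work in a
single call.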
>
> Thanks in advance,
> Arnaud

David Winsemius, MD
West Hartford, CT


