[R] Extracting Unique Rows based on 2 Columns
Boris Steipe
boris.steipe at utoronto.ca
Mon Nov 30 05:36:19 CET 2015
A logical expression applied to a vector (such as a data frame column) returns a logical vector that you can use to select rows. You can combine several such expressions with the & (AND) and | (OR) operators. In your case, you apparently want to match a set of possible values; for that, use the %in% operator.
Consider, e.g.:
orderlist$i == 2
orderlist$i == 2 & orderlist$j < 3
orderlist$i %in% c(5, 7)
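
To make this concrete, here is a minimal sketch of how such logical vectors can be used to subset rows. It rebuilds only the first ten rows shown in the question as a plain data frame, which is an assumption made to keep the example short. The names sel, picked, first_of_pair and distinct_rows are illustrative only. The duplicated() step is one possible reading of "distinct rows based on two columns" (keep the first row of each Measure_id/i pair), not necessarily what was intended.

```r
# First ten rows of the posted data, rebuilt as a plain data frame
# (assumption: the grouped_df attributes are not needed here)
orderlist <- data.frame(
  Measure_id = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2),
  i          = c(2, 5, 2, 5, 7, 2, 7, 5, 2, 2),
  j          = c(3, 1, 1, 2, 3, 4, 5, 2, 1, 4),
  value      = c(2, 2, 1.5, 1.5, 1.5, 1, 1, 2.5, 2, 2),
  rank       = c(1, 1, 0.75, 0.75, 1, 0.5, 2/3, 1, 1, 1)
)

# A logical vector in the row position of [ , ] keeps the rows where it is TRUE
sel <- orderlist$i %in% c(5, 7)
picked <- orderlist[sel, ]

# duplicated() flags every repeat of a (Measure_id, i) pair;
# negating it keeps the first row of each pair
first_of_pair <- !duplicated(orderlist[, c("Measure_id", "i")])

# Combine with & to restrict the result to Measure_id 1 and 2
distinct_rows <- orderlist[orderlist$Measure_id %in% c(1, 2) & first_of_pair, ]
print(distinct_rows)
```

On these ten rows, distinct_rows contains rows 1, 2, 5, 8 and 9, which matches the result listed in the question.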
Cheers,
B.
On Nov 29, 2015, at 10:55 PM, Ragia Ibrahim <ragia11 at hotmail.com> wrote:
> Dear group,
> kindly, I have a data frame, as follows:
>
>
>    Measure_id i j value      rank
> 1           1 2 3   2.0 1.0000000
> 2           1 5 1   2.0 1.0000000
> 3           1 2 1   1.5 0.7500000
> 4           1 5 2   1.5 0.7500000
> 5           1 7 3   1.5 1.0000000
> 6           1 2 4   1.0 0.5000000
> 7           1 7 5   1.0 0.6666667
> 8           2 5 2   2.5 1.0000000
> 9           2 2 1   2.0 1.0000000
> 10          2 2 4   2.0 1.0000000
> ..        ... . .   ... ...
>
> I want to select distinct rows based on two columns (Measure_id and i)
>
> for example for Measure_id = 1,2 the result would be....
> 1           1 2 3   2.0 1.0000000
> 2           1 5 1   2.0 1.0000000
> 5           1 7 3   1.5 1.0000000
> 8           2 5 2   2.5 1.0000000
> 9           2 2 1   2.0 1.0000000
>
>
> Kindly, how could I do this?
>
> An example of the data frame follows, produced with dput.
>
> dput(orderlist)
>
> structure(list(Measure_id = c(1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2,
> 2, 2, 2, 3, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5,
> 5, 5, 5), i = c(2, 5, 2, 5, 7, 2, 7, 5, 2, 2, 7, 2, 5, 7, 2,
> 2, 2, 5, 5, 7, 7, 2, 5, 2, 2, 5, 7, 7, 2, 2, 5, 2, 5, 7, 7),
> j = c(3, 1, 1, 2, 3, 4, 5, 2, 1, 4, 5, 3, 1, 3, 1, 3, 4,
> 1, 2, 3, 5, 4, 2, 1, 3, 1, 3, 5, 1, 4, 2, 3, 1, 3, 5), value = c(2,
> 2, 1.5, 1.5, 1.5, 1, 1, 2.5, 2, 2, 2, 1.5, 1.5, 1, 1, 0,
> 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 2, 2, 2, 1, 1, 1, 1),
> rank = c(1, 1, 0.75, 0.75, 1, 0.5, 0.666666666666667, 1,
> 1, 1, 1, 0.75, 0.6, 0.5, 1, 0, 0, NaN, NaN, NaN, NaN, 1,
> 1, 0, 0, 0, NaN, NaN, 1, 1, 1, 0.5, 0.5, 1, 1)), class = c("grouped_df",
> "tbl_df", "tbl", "data.frame"), row.names = c(NA, -35L), .Names = c("Measure_id",
> "i", "j", "value", "rank"), vars = list(Measure_id), indices = list(
> 0:6, 7:13, 14:20, 21:27, 28:34), group_sizes = c(7L, 7L,
> 7L, 7L, 7L), biggest_group_size = 7L, labels = structure(list(
> Measure_id = c(1, 2, 3, 4, 5)), class = "data.frame", row.names = c(NA,
> -5L), .Names = "Measure_id", vars = list(Measure_id)))
>
>
>
>
> thanks in advance
> Ragia