[R] Filtering using multiple rows in dplyr
Sumitrajit Dhar
@-dh@r @ending from northwe@tern@edu
Wed May 30 18:18:36 CEST 2018
Hi Folks,
I have just started using dplyr and could use some help getting unstuck. dplyr may well not be the right package for this, but let me pose the question and seek your advice.
Here is my basic data frame.
> head(h)
   subject ageGrp ear hearingGrp sex freq L2       Ldp     Phidp        NF       SNR
1 HALAF032      A   L          A   F    2  0 -23.54459  55.56005 -43.08282 19.538232
2 HALAF032      A   L          A   F    2  2 -32.64881  86.22040 -23.31558 -9.333224
3 HALAF032      A   L          A   F    2  4 -18.91058  42.12168 -35.60250 16.691919
4 HALAF032      A   L          A   F    2  6 -23.85937 297.94499 -20.70452 -3.154846
5 HALAF032      A   L          A   F    2  8 -14.45381 181.75329 -24.17094  9.717128
6 HALAF032      A   L          A   F    2 10 -20.42384  67.12998 -35.77357 15.349728
Each combination of ‘subject’ and ‘freq’ defines one set of data, and I am interested in how the last four columns (Ldp, Phidp, NF, SNR) vary as a function of L2. So I grouped by ‘subject’ and ‘freq’, which lets me look at basic summaries.
> h_byFunc <- h %>% group_by(subject, freq)
> h_byFunc %>% summarize(l = mean(Ldp), s = sd(Ldp))
# A tibble: 1,175 x 4
# Groups:   subject [?]
   subject   freq       l     s
   <fct>    <int>   <dbl> <dbl>
 1 HALAF032     2 -13.8    8.39
 2 HALAF032     4 -15.8   11.0
 3 HALAF032     8 -23.4    6.51
 4 HALAF033     2 -14.2    9.64
 5 HALAF033     4 -12.3    8.92
 6 HALAF033     8  -6.55  12.3
 7 HALAF036     2 -14.9   12.6
 8 HALAF036     4 -16.7   11.2
 9 HALAF036     8 -21.7    6.56
10 HALAF039     2   0.242 12.4
# ... with 1,165 more rows
What I would like to do is filter out some groups based on various criteria. For example, if SNR > 3 at three consecutive L2 values within a group, that group qualifies, and I would add a column, say “clean”, and assign it the value “Y”. Is there a way to do this in dplyr, or should I be looking at a different approach?
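To make the criterion concrete, here is a minimal sketch of one way it might be expressed. It is not a definitive solution. It runs on made-up toy data rather than the real measurements, the helper has_run() is a hypothetical name introduced only for this illustration, and it assumes that rows sorted by L2 within a group are at consecutive L2 levels (no missing steps). The value “N” for non-qualifying groups is also an assumption.

```r
library(dplyr)

# Toy data, made up for illustration only (not the real measurements)
toy <- data.frame(
  subject = rep(c("S1", "S2"), each = 5),
  freq    = 2L,
  L2      = rep(seq(0, 8, by = 2), times = 2),
  SNR     = c(1, 5, 6, 7, 2,   # S1: three consecutive values above 3
              5, 1, 6, 1, 7)   # S2: never three in a row
)

# Hypothetical helper: TRUE if the logical vector x contains a run of
# at least n consecutive TRUE values (uses base R run-length encoding)
has_run <- function(x, n = 3) {
  r <- rle(x)
  any(r$values & r$lengths >= n, na.rm = TRUE)
}

flagged <- toy %>%
  group_by(subject, freq) %>%
  arrange(L2, .by_group = TRUE) %>%   # make sure rows follow L2 order
  mutate(clean = if_else(has_run(SNR > 3, 3), "Y", "N")) %>%
  ungroup()
```

If the goal is to drop non-qualifying groups rather than label them, the mutate() step could presumably be swapped for filter(has_run(SNR > 3, 3)) on the grouped data, which keeps or discards each group as a whole.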
Thanks in advance for your help.
Regards,
Sumit