[R] Is there a simple way to analyse all the data using dplyr?
Chris Evans
chr|@ho|d @end|ng |rom p@yctc@org
Mon Sep 21 14:12:53 CEST 2020
I am sure the answer is "yes" and I'm also sure the question may sound mad. Here's a reprex that I think captures what I'm doing
n <- 500
gender <- sample(c("Man","Woman","Other"), n, replace = TRUE)
GPC_score <- rnorm(n)
scaleMeasures <- runif(n)
bind_cols(gender = gender,
GPC_score = GPC_score,
scaleMeasures = scaleMeasures) -> tibUse
### let's have the correlation between the two variables broken down by gender
tibUse %>%
filter(gender != "Other") %>%
select(gender, GPC_score, scaleMeasures) %>%
na.omit() %>%
group_by(gender) %>%
summarise(cor = cor(cur_data())[1,2]) -> tmp1
### but I'd also like the correlation for the whole dataset, not by gender
### this is a kludge to achieve that which I am using partly because I cant'
### find the equivalent of cur_data() for an ungrouped tibble/df
tibUse %>%
mutate(gender = "All") %>% # nasty kludge to get all the data!
select(gender, GPC_score, scaleMeasures) %>%
na.omit() %>%
group_by(gender) %>% # ditto!
summarise(cor = cor(cur_data())[1,2]) -> tmp2
bind_rows(tmp1,
tmp2)
### gets me what I want:
# A tibble: 3 x 2
gender cor
<chr> <dbl>
1 Man 0.0225
2 Woman 0.0685
3 All 0.0444
In reality I have some functions that are more complex than cor()[2,1] (sorry about that particular kludge) that digest dataframes and I'd love to have a simpler way of doing this.
So two questions:
1) I am sure there a term/function that works on an ungrouped tibble in dplyr as cur_data() does for a grouped tibble ... but I can't find it.
2) I suspect someone has automated a way to get the analysis of the complete data after the analyses of the groups within a single dplyr run ... it seems an obvious and common use case, but I can't find that either.
Sorry, I'm over 99% sure I'm being stupid and missing the obvious here ... but that's the recurrent problem I have with my wetware and searchware doesn't seem to being fixing this!
TIA,
Chris
--
Small contribution in our coronavirus rigours:
https://www.coresystemtrust.org.uk/home/free-options-to-replace-paper-core-forms-during-the-coronavirus-pandemic/
Chris Evans <chris using psyctc.org> Visiting Professor, University of Sheffield <chris.evans using sheffield.ac.uk>
I do some consultation work for the University of Roehampton <chris.evans using roehampton.ac.uk> and other places
but <chris using psyctc.org> remains my main Email address. I have a work web site at:
https://www.psyctc.org/psyctc/
and a site I manage for CORE and CORE system trust at:
http://www.coresystemtrust.org.uk/
I have "semigrated" to France, see:
https://www.psyctc.org/pelerinage2016/semigrating-to-france/
https://www.psyctc.org/pelerinage2016/register-to-get-updates-from-pelerinage2016/
If you want an Emeeting, I am trying to keep them to Thursdays and my diary is at:
https://www.psyctc.org/pelerinage2016/ceworkdiary/
Beware: French time, generally an hour ahead of UK.
More information about the R-help
mailing list