[R] Reply: Re: dplyr : row total for all groups in dplyr summarise
G.Maubach at weinwolf.de
Tue Jul 5 11:27:52 CEST 2016
Hi guys,
I checked out your example, but I can't follow the results:
> mtcars %>%
+ group_by (am, gear) %>%
+ summarise (n=n()) %>%
+ mutate(rel.freq = paste0(round(100 * n/sum(n), 0), "%")) %>%
+ ungroup() %>%
+ mutate(row.tot = sum(n))
Source: local data frame [4 x 5]
am gear n rel.freq row.tot
(dbl) (dbl) (int) (chr) (int)
1 0 3 15 79% 32
2 0 4 4 21% 32
3 1 4 8 62% 32
4 1 5 5 38% 32
We have a total of 32 cases, and 15 * 100 / 32 = 46.9 %, not 79 %.
The same holds for the other rows. How is the 79 % calculated?
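
A possible explanation, as a minimal sketch: summarise() removes only the last grouping variable (gear), so the summary is still grouped by am, and sum(n) in the following mutate() is taken within each am group (19 cars for am == 0, 13 for am == 1). The names counts, within_am and overall are chosen here for illustration only.

```r
library(dplyr)

# One row per (am, gear) combination; the result is still grouped by am
counts <- mtcars %>%
  group_by(am, gear) %>%
  summarise(n = n())

group_vars(counts)   # "am"

# Denominator is the group total within each am level: 15 / (15 + 4) = 0.789
within_am <- counts %>%
  mutate(rel.freq = n / sum(n))

# After ungroup() the denominator is the overall total: 15 / 32 = 0.469
overall <- counts %>%
  ungroup() %>%
  mutate(rel.freq = n / sum(n))
```

Under this reading, 79 % is the share of gear == 3 among the cars with am == 0, not among all 32 cars.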
When searching the web I saw this example:
-- cut --
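#-- not run --
library(httr)
library(dplyr)
url <- "http://www.lock5stat.com/datasets/HollywoodMovies2011.csv"
response <- GET(url)
# content() has no data.frame mode: fetch the body as text, then parse the CSV
Hollywoodmovies2011 <- read.csv(text = content(response, as = "text", encoding = "UTF-8"))
#-- end not run
Hollywoodmovies2011 %>%
group_by(Genre) %>%   # column name as it appears in the output below
summarize(count = n()) %>%
mutate(rf = count / sum(count))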
-- cut --
which gives
Source: local data frame [9 x 3]
Genre count %
(fctr) (int) (dbl)
1 Action 32 0.235294118
2 Adventure 1 0.007352941
3 Animation 12 0.088235294
4 Comedy 27 0.198529412
5 Drama 21 0.154411765
6 Fantasy 2 0.014705882
7 Horror 17 0.125000000
8 Romance 11 0.080882353
9 Thriller 13 0.095588235
Here each percentage is the count divided by the sum of all counts, e.g.
sum = 136 and 32 / 136 = 0.2352941.
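
The same pattern can be tried without downloading anything. The following is a minimal sketch that uses mtcars in place of the movie data (an assumption made only so the example is self-contained); by_gear is an illustrative name. With a single grouping variable, summarise() returns an ungrouped result, so sum(count) covers all rows.

```r
library(dplyr)

by_gear <- mtcars %>%
  group_by(gear) %>%
  summarise(count = n()) %>%        # only one grouping level, so no grouping is left
  mutate(rf = count / sum(count))   # denominator is the overall total of 32

group_vars(by_gear)   # character(0): the summary is no longer grouped
sum(by_gear$rf)       # the relative frequencies add up to 1
```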
What is the difference when counting? What do the relative counts in the
first example mean?
Kind regards
Georg
From: Ulrik Stervbo <ulrik.stervbo at gmail.com>
To: David Winsemius <dwinsemius at comcast.net>,
Cc: r-help at r-project.org, maicel at infomed.sld.cu
Date: 05.07.2016 06:06
Subject: Re: [R] dplyr : row total for all groups in dplyr summarise
Sent by: "R-help" <r-help-bounces at r-project.org>
That will give you the wrong result when used on summarised data.
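
A minimal sketch of why, assuming the point is about calling nrow() on the summary: the summarised data has one row per group, so nrow() counts groups, not the original observations. The name counts is illustrative.

```r
library(dplyr)

counts <- mtcars %>%
  group_by(am, gear) %>%
  summarise(n = n()) %>%
  ungroup()

nrow(mtcars)    # 32: rows in the original data
nrow(counts)    # 4: one row per (am, gear) combination
sum(counts$n)   # 32: the group sizes add back up to the original row count
```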
David Winsemius <dwinsemius at comcast.net> wrote on Tue, 5 July 2016,
02:10:
> I thought there was an nrow() function?
>
> Sent from my iPhone
>
> On Jul 4, 2016, at 9:59 AM, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
>
> If you want the total number of rows in the original data.frame after
> counting the rows in each group, you can ungroup and sum the row counts,
> like:
>
> library("dplyr")
>
>
> mtcars %>%
> group_by (am, gear) %>%
> summarise (n=n()) %>%
> mutate(rel.freq = paste0(round(100 * n/sum(n), 0), "%")) %>%
> ungroup() %>%
> mutate(row.tot = sum(n))
>
> HTH
> Ulrik
>
> On Mon, 4 Jul 2016 at 18:23 David Winsemius <dwinsemius at comcast.net>
> wrote:
>
>>
>> > On Jul 4, 2016, at 6:56 AM, maicel at infomed.sld.cu wrote:
>> >
>> > Hello,
>> > How can I aggregate row total for all groups in dplyr summarise ?
>>
>> Row total … of what? Aggregate … how? What is the desired answer?
>>
>>
>>
>> > library(dplyr)
>> > mtcars %>%
>> > group_by (am, gear) %>%
>> > summarise (n=n()) %>%
>> > mutate(rel.freq = paste0(round(100 * n/sum(n), 0), "%"))
>> >
>> > best regard
>> > Maicel Monzon
>> > ______________________________________________
>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>
>