[R] Reply: Re: dplyr : row total for all groups in dplyr summarise

G.Maubach at weinwolf.de G.Maubach at weinwolf.de
Tue Jul 5 11:27:52 CEST 2016


Hi guys,

I checked out your example but I can't follow the results:

> mtcars %>%
+   group_by (am, gear) %>%
+   summarise (n=n()) %>%
+   mutate(rel.freq = paste0(round(100 * n/sum(n), 0), "%")) %>%
+   ungroup() %>%
+   mutate(row.tot = sum(n))
Source: local data frame [4 x 5]

     am  gear     n rel.freq row.tot
  (dbl) (dbl) (int)    (chr)   (int)
1     0     3    15      79%      32
2     0     4     4      21%      32
3     1     4     8      62%      32
4     1     5     5      38%      32

We have a total of 32 cases, and 15 * 100 / 32 = 46.9 %, not 79 %. 
The same applies to the other rows. How is 79 % calculated?
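
A minimal sketch of where the two figures could come from, assuming dplyr's default behaviour that summarise() drops only the last grouping variable (here `gear`), so the result is still grouped by `am`. The names `counts`, `within_am` and `of_total` are only illustrative and do not appear in the thread:

```r
library(dplyr)

# Count rows per (am, gear); the result is still grouped by `am`
counts <- mtcars %>%
  group_by(am, gear) %>%
  summarise(n = n())

# sum(n) is evaluated inside each `am` group: 15 / (15 + 4), 4 / (15 + 4), ...
within_am <- counts %>%
  mutate(rel.freq = round(100 * n / sum(n), 1))

# Dropping the grouping first makes sum(n) the grand total of 32
of_total <- counts %>%
  ungroup() %>%
  mutate(rel.freq = round(100 * n / sum(n), 1))

within_am$rel.freq  # 78.9 21.1 61.5 38.5
of_total$rel.freq   # 46.9 12.5 25.0 15.6
```

If that assumption holds, 79 % is the share of 3-gear cars among the 19 cars with am = 0, not among all 32 cars.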

When searching the web I saw this example:

-- cut --

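#-- not run --
library(httr)   # provides GET() and content()
library(dplyr)

url <- "http://www.lock5stat.com/datasets/HollywoodMovies2011.csv"
response <- GET(url)
# Parse the CSV body of the response into a data frame
Hollywoodmovies2011 <- read.csv(text = content(response, as = "text"))
#-- end not run

Hollywoodmovies2011 %>% 
  group_by(Genre) %>%                # column is spelled `Genre` in the output
  summarize(count = n()) %>%
  mutate(rf = count / sum(count))    # relative frequency of each genre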

-- cut --

which gives

Source: local data frame [9 x 3]

      Genre count           %
     (fctr) (int)       (dbl)
1    Action    32 0.235294118
2 Adventure     1 0.007352941
3 Animation    12 0.088235294
4    Comedy    27 0.198529412
5     Drama    21 0.154411765
6   Fantasy     2 0.014705882
7    Horror    17 0.125000000
8   Romance    11 0.080882353
9  Thriller    13 0.095588235

Here the % column (rf in the code) is the count divided by the sum of 
count, e.g. sum = 136 and 32 / 136 = 0.2352941.

What is the difference when counting? What do the relative counts in the 
first example mean?
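
As a cross-check in base R, the two ways of counting can be put side by side with prop.table(). This is only a sketch on the built-in mtcars data; `tab`, `within_am` and `of_total` are illustrative names:

```r
# Cross-tabulate transmission type (am) against number of gears
tab <- table(am = mtcars$am, gear = mtcars$gear)

# Proportions within each `am` row: every row sums to 1
within_am <- prop.table(tab, margin = 1)

# Proportions of the grand total: the whole table sums to 1,
# the analogue of count / sum(count) with a single grouping variable
of_total <- prop.table(tab)

round(100 * within_am, 1)
round(100 * of_total, 1)
```

Here `within_am["0", "3"]` is 15 / 19 and `of_total["0", "3"]` is 15 / 32, so the first kind of figure is a share within a transmission type and the second is a share of all cars.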

Kind regards

Georg





From:    Ulrik Stervbo <ulrik.stervbo at gmail.com>
To:      David Winsemius <dwinsemius at comcast.net>, 
Cc:      r-help at r-project.org, maicel at infomed.sld.cu
Date:    05.07.2016 06:06
Subject: Re: [R] dplyr : row total for all groups in dplyr summarise
Sent by: "R-help" <r-help-bounces at r-project.org>



That will give you the wrong result when used on summarised data
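
A small sketch of the nrow() point, assuming dplyr is installed; `counts` is an illustrative name. On the summarised table, nrow() counts the group combinations rather than the original observations:

```r
library(dplyr)

# One row per (am, gear) combination that occurs in mtcars
counts <- mtcars %>%
  group_by(am, gear) %>%
  summarise(n = n()) %>%
  ungroup()

nrow(counts)    # 4: number of (am, gear) combinations
sum(counts$n)   # 32: number of rows in the original data
nrow(mtcars)    # 32
```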

David Winsemius <dwinsemius at comcast.net> wrote on Tue, 5 July 2016 02:10:

> I thought there was an nrow() function?
>
> Sent from my iPhone
>
> On Jul 4, 2016, at 9:59 AM, Ulrik Stervbo <ulrik.stervbo at gmail.com> wrote:
>
> If you want the total number of rows in the original data.frame after
> counting the rows in each group, you can ungroup and sum the row counts,
> like:
>
> library("dplyr")
>
>
> mtcars %>%
>    group_by (am, gear) %>%
>    summarise (n=n()) %>%
>    mutate(rel.freq = paste0(round(100 * n/sum(n), 0), "%")) %>%
>    ungroup() %>%
>    mutate(row.tot = sum(n))
>
> HTH
> Ulrik
>
> On Mon, 4 Jul 2016 at 18:23 David Winsemius <dwinsemius at comcast.net>
> wrote:
>
>>
>> > On Jul 4, 2016, at 6:56 AM, maicel at infomed.sld.cu wrote:
>> >
>> > Hello,
>> > How can I aggregate row total for all groups in dplyr summarise ?
>>
>> Row total … of what? Aggregate … how? What is the desired answer?
>>
>>
>>
>> > library(dplyr)
>> > mtcars %>%
>> >  group_by (am, gear) %>%
>> >  summarise (n=n()) %>%
>> >  mutate(rel.freq = paste0(round(100 * n/sum(n), 0), "%"))
>> >
>> > best regard
>> > Maicel Monzon
>> >
>> > ______________________________________________
>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>
>
