[R] Sum by group
Jeff Newmiller
jdnewm|| @end|ng |rom dcn@d@v|@@c@@u@
Sat Dec 7 08:48:46 CET 2024
That is what the summarise() function is designed to do (instead of mutate()). Every calculation in a summarise() call has to aggregate all of a group's rows down to one value, but that is what you want in this case.
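A minimal sketch of that approach, using a small hypothetical stand-in for the posted data (a plain tibble whose values are copied from a few of the rows shown below, with the geometry column left out). Because each tract appears on several rows, summing inside summarise() alone would still count a tract once per row, so the sketch first keeps one row per precinct/tract pair with distinct(). The object name precinct_pop is my own choice, not something from the original post.

```r
library(dplyr)

# Hypothetical toy data modelled on the posted rows (no geometry column)
inters <- tibble(
  Precinct = factor(c("20", "20", "79", "79", "19", "19", "19")),
  GEOID    = c("36061015700", "36061015700",
               "36047123700", "36047123700",
               "36061013400", "36061013800", "36061013800"),
  totpopE  = c(10352, 10352, 9448, 9448, 11387, 12826, 12826)
)

precinct_pop <- inters %>%
  # keep one row per precinct/tract combination so a tract is counted once
  distinct(Precinct, GEOID, totpopE) %>%
  group_by(Precinct) %>%
  # collapse each precinct to a single row holding the summed population
  summarise(TotalPop = sum(totpopE), .groups = "drop")

print(precinct_pop)
# Precinct 19: 11387 + 12826 = 24213; Precinct 20: 10352; Precinct 79: 9448
```

Two caveats under these assumptions: a tract that falls in more than one precinct (as 36061004400 appears to in the posted output) is counted in full in each of them, and for an sf object it may be necessary to drop the geometry first, for example with sf::st_drop_geometry(), so that rows differing only in their point coordinates are treated as duplicates.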
Please note that you are supposed to make your examples reproducible by including your library statements when posting on this mailing list. From your code sequence I can guess that you are using the dplyr package, but on this mailing list it is bad form to assume your readers know which user-contributed packages you are using. Perhaps use the "reprex" package to check that other people will have all the info they need to run your example on their computers, so they can see what you are dealing with without guessing.
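As an illustration of what a self-contained post might look like, here is a short sketch. The three-row data frame is a hypothetical stand-in for the real data, and the name result is mine; the point is only that the library() call and the data travel with the code.

```r
library(dplyr)  # name every contributed package the example needs

# Small hypothetical stand-in for the real data
inters <- data.frame(
  Precinct = c("20", "20", "79"),
  GEOID    = c("36061015700", "36061015700", "36047123700"),
  totpopE  = c(10352, 10352, 9448)
)

# dput() prints R code that rebuilds the object; paste that output
# into the post so readers can recreate the data exactly
dput(inters)

# The code in question, now runnable by anyone who copies the script
result <- inters %>%
  group_by(Precinct) %>%
  mutate(TotalPop = sum(totpopE))
print(result)
```

With the reprex package installed, copying a script like this to the clipboard and calling reprex::reprex() runs it in a fresh R session, which is a quick way to find out whether anything it depends on is missing.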
On December 6, 2024 12:31:15 PM PST, D <dykim7411 using gmail.com> wrote:
>I have population data (“totpopE”) at the census tract level (“GEOID”),
>which are nested within Precincts (“Precinct”). Please see below my data
>structure.
>
>I used the code to sum population data per precinct:
>
>inters <- inters %>%
>  group_by(Precinct) %>%
>  mutate(TotalPop = sum(totpopE))
>
>However, this code produced sums that are too large because each census tract
>(“GEOID”) has multiple observations/rows, each carrying the same
>value. Is there any way I can use one value per census tract to
>estimate total population at the precinct level? I would appreciate it
>if anyone could provide code to sum population data per precinct.
>
>> tail(df, n=20)
>Simple feature collection with 20 features and 3 fields
>Geometry type: POINT
>Dimension:     XY
>Bounding box:  xmin: 989211 ymin: 193205 xmax: 997877 ymax: 222689
>Projected CRS: NAD83 / New York Long Island (ftUS)
># A tibble: 20 × 4
># Groups:   GEOID [5]
>   Precinct GEOID       totpopE  geometry
>   <fct>    <chr>         <dbl>  <POINT [US_survey_foot]>
> 1 20       36061015700   10352  (989211 222689)
> 2 20       36061015700   10352  (989211 222689)
> 3 20       36061015700   10352  (989211 222689)
> 4 20       36061015700   10352  (989211 222689)
> 5 79       36047123700    9448  (996168 193934)
> 6 79       36047123700    9448  (996598 193205)
> 7 79       36047123700    9448  (996598 193205)
> 8 79       36047123700    9448  (996598 193205)
> 9 19       36061013400   11387  (996703 219599)
>10 19       36061013400   11387  (996475 220183)
>11 19       36061013800   12826  (996871 222201)
>12 19       36061013800   12826  (997576 221811)
>13 19       36061013800   12826  (997877 221608)
>14 19       36061013800   12826  (997877 221608)
>15 19       36061013800   12826  (996216 221621)
>16 13       36061004400   15945  (990562 205039)
>17 13       36061004400   15945  (989574 206434)
>18 13       36061004400   15945  (989574 206434)
>19 13       36061004400   15945  (989574 206434)
>20 9        36061004400   15945  (989699 205397)
--
Sent from my phone. Please excuse my brevity.