[R-sig-Geo] Calculating median age for a group of US census blocks?

Kevin Zembower kev|n @end|ng |rom zembower@org
Mon Aug 7 20:52:33 CEST 2023


Yes, I see what you mean:

 > median(c(60, 50, 40, 20, 20, 20))
[1] 30
 > median(c(50, 50, 50, 20, 20, 20))
[1] 35

Thanks so much for that clear example.
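
For anyone reading this in the archive, here is a minimal sketch of the same comparison that works for blocks of any size. The variable names are illustrative and the numbers are the ones from Sean's example; rep() repeats each block's median once per resident, which is what "assign the median age to each member of a block" amounts to.

```r
# Illustrative values taken from the example in this thread.
block_median <- c(50, 20)   # median age reported for each block
block_pop    <- c(3, 3)     # population of each block

# Assign each block's median to every resident, then take the median.
approx_median <- median(rep(block_median, times = block_pop))

# Median of the actual individual ages, which is the quantity wanted.
true_median <- median(c(60, 50, 40, 20, 20, 20))

c(approximate = approx_median, true = true_median)
```

The two values differ (35 versus 30) because a block's median says nothing about how the ages within that block are spread around it.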

-Kevin

On 8/7/23 14:45, Sean Trende wrote:
> This is correct on the second question, at least for more recent censuses.  On the first question, imagine a block where the ages of three individuals are 60, 50, and 40, and another one where the ages are 20, 20, and 20.  Using your approach you would have 50 * 3 = 150 for the first block, and 20*3 = 60 for the second block.  The median of 60 and 150 is 105.  Even dividing that by three you get 35, which is not the correct median age (30).
> 
> -----Original Message-----
> From: R-sig-Geo <r-sig-geo-bounces using r-project.org> On Behalf Of Josiah Parry
> Sent: Monday, August 7, 2023 2:38 PM
> To: Kevin Zembower <kevin using zembower.org>
> Cc: r-sig-geo using r-project.org
> Subject: Re: [R-sig-Geo] Calculating median age for a group of US census blocks?
> 
> Hey Kevin, I don't think you're going to be able to get individual-level data from the US Census Bureau. The closest you may be able to get is the Current Population Survey (CPS), which I believe is also available via tidycensus. Regarding your first question, I'm not sure I follow what your objective is. I would request the data at the census block group geography, which gives a median for each block group directly. Otherwise it is unclear how you are defining a "group of blocks".
> 
> On Mon, Aug 7, 2023 at 2:34 PM Kevin Zembower via R-sig-Geo < r-sig-geo using r-project.org> wrote:
> 
>> Hello, all,
>>
>> I'd like to obtain the median age for a population in a specific group
>> of US Decennial census blocks. Here's an example of the problem:
>>
>> ## Example of calculating median age of population in census blocks.
>> library(tidyverse)
>> library(tidycensus)
>>
>> counts <- get_decennial(
>>       geography = "block",
>>       state = "MD",
>>       county = "Baltimore city",
>>       table = "P1",
>>       year = 2020,
>>       sumfile = "dhc") %>%
>>       mutate(NAME = NULL) %>%
>>       filter(substr(GEOID, 6, 11) == "271101" &
>>              substr(GEOID, 12, 15) %in% c("3000", "3001", "3002")
>>              )
>>
>> ages <- get_decennial(
>>       geography = "block",
>>       state = "MD",
>>       county = "Baltimore city",
>>       table = "P13",
>>       year = 2020,
>>       sumfile = "dhc") %>%
>>       mutate(NAME = NULL) %>%
>>       filter(substr(GEOID, 6, 11) == "271101" &
>>              substr(GEOID, 12, 15) %in% c("3000", "3001", "3002")
>>              )
>>
>> I have two questions:
>>
>> 1. Is it mathematically valid to multiply the population of a block by
>> the median age of that block (in other words, assign the median age to
>> each member of a block), then calculate the median of those numbers
>> for a group of blocks?
>>
>> 2. Is raw data on the ages of individuals available anywhere else in
>> the census data? I can find tables such as P12, which breaks down the
>> population by age ranges or bins, but can't find specific data of
>> counts per age in years.
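
If only binned counts such as those in P12 are available, one common workaround is to estimate the median by linear interpolation within the bin that contains the middle of the distribution. The sketch below is only an approximation under the assumption that ages are spread evenly inside each bin; binned_median() is a hypothetical helper, not a tidycensus function, and the counts are made up for illustration.

```r
# Hypothetical helper: estimate a median from binned counts by linear
# interpolation within the bin that contains the halfway point.
# Assumes ages are spread evenly inside each bin.
binned_median <- function(lower, upper, count) {
  stopifnot(length(lower) == length(upper),
            length(lower) == length(count),
            sum(count) > 0)
  total <- sum(count)
  cum   <- cumsum(count)
  i     <- which(cum >= total / 2)[1]        # first bin reaching half the total
  below <- if (i > 1) cum[i - 1] else 0      # people in all lower bins
  lower[i] + (total / 2 - below) / count[i] * (upper[i] - lower[i])
}

# Made-up counts for illustration only (not census data).
lower <- c(0, 20, 40, 60)     # lower edge of each age bin
upper <- c(20, 40, 60, 80)    # upper edge of each age bin
count <- c(10, 30, 40, 20)    # people in each bin

est <- binned_median(lower, upper, count)
est
```

For a group of blocks, the counts in each bin would be summed across the blocks first and the helper applied once to the totals. An open-ended top bin has no natural upper edge, so one would have to be assumed if the median happened to fall in it.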
>>
>> Thanks for your advice and help.
>>
>> -Kevin
>>
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo using r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>