[R-sig-Geo] Calculating median age for a group of US census blocks?

Kevin Zembower kev|n @end|ng |rom zembower@org
Thu Aug 31 21:47:43 CEST 2023


Sorry to resurrect a long-dead thread, but I'm still trying to assign a
median age to the population of a group of US census blocks. I'm using
the data from the US Census table P12, which bins the ages into ranges.

I'm convinced (thank you!) that I can't compute the exact median age.
Can I compute the lower and upper bounds of the median age? Can I
assign all the people in a binned age range (say "20 to 29 years") to
the lower limit of the range, then compute the median of those ages,
and say that the true median age is between this lower limit and the
upper one, computed similarly?
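
A minimal sketch in R of the bounding idea in the paragraph above,
assuming the binned counts are already in a data frame. The bin edges
and counts are made up for illustration and are not the real P12
values; binned_median() is a hypothetical helper name. Because every
person's age in completed years lies between the lower and upper limit
of their bin, the median of the lower limits cannot exceed the true
median, and the median of the upper limits cannot fall below it.

```r
# Hypothetical binned counts; these are NOT the real P12 values.
bins <- data.frame(
  lower = c(0, 5, 10, 15, 20, 30, 40, 50, 60, 70),
  upper = c(4, 9, 14, 19, 29, 39, 49, 59, 69, 84),
  count = c(30, 28, 25, 27, 70, 75, 80, 72, 68, 53)
)

# Median of a vector in which each age is repeated `count` times.
binned_median <- function(ages, counts) {
  median(rep(ages, times = counts))
}

# Everyone placed at the bottom of their bin, then at the top.
median_lower <- binned_median(bins$lower, bins$count)
median_upper <- binned_median(bins$upper, bins$count)

c(lower = median_lower, upper = median_upper)
```

With these made-up counts the middle of the distribution falls in the
40-to-49 bin, so the bounds are 40 and 49. Narrower bins around the
middle of the distribution give tighter bounds.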

If this is valid, how do I deal with the "85 years and older" bin? I
have 9 people 85 years and older, out of a total population of 537
people in my group of census blocks. For the lower bound of the median,
I assign all 9 the age of 85. What can I do for the upper bound?
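
One possible answer, sketched in R under the assumption that fewer than
half of the people are in the open-ended bin: the upper bound of the
median does not depend on the age assigned to that bin, so any value at
or above 85 (even Inf) gives the same result. The mean is different,
since its upper bound needs an explicit cap, and the cap of 110 used
below is an arbitrary assumption. Only the totals of 537 and 9 come
from the question above; the other counts and the bin edges are
invented.

```r
# Hypothetical bins; only the total (537) and the 9 people aged 85+ come
# from the message above. The other counts are invented.
bins <- data.frame(
  lower = c(0, 20, 40, 60, 85),
  upper = c(19, 39, 59, 84, Inf),     # Inf marks the open-ended bin
  count = c(120, 140, 150, 118, 9)
)

# Everyone placed at the top of their bin.
upper_ages <- rep(bins$upper, times = bins$count)

# The median depends only on the middle of the sorted vector, so the
# Inf values at the top do not affect it.
median_upper <- median(upper_ages)

# The mean is unbounded unless the open-ended bin is capped.
mean_upper_uncapped <- mean(upper_ages)

# Same calculation with an assumed (arbitrary) maximum age of 110.
capped <- ifelse(is.infinite(upper_ages), 110, upper_ages)
median_upper_capped <- median(capped)
mean_upper_capped <- mean(capped)
```

The two median values agree, while the mean goes from infinite to a
finite number that depends entirely on the chosen cap.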

I've done this and found that the true median age is between 40 and 44
years if I drop the entire "85 years and older" population as NA.
Similarly, the true mean is between 39.96 and 43.46.

One thought: If there are 9 people in the "85 years and older" group,
should I drop them and also drop the 9 youngest ages?
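
On the idea of dropping the 9 oldest and the 9 youngest: removing the
same number of values from both ends of the sorted ages leaves the
median unchanged (as long as fewer than half of the values are removed
in total), while removing values from the top only pulls the median
down. A small R check with made-up ages, not census data:

```r
# Made-up ages for illustration; three people are in the top group (85).
ages <- c(5, 12, 18, 25, 33, 41, 47, 52, 60, 71, 85, 85, 85)

sorted <- sort(ages)
n <- length(sorted)
k <- sum(sorted == 85)          # size of the top group

# Median with everyone included.
median_all <- median(sorted)

# Drop only the k oldest: the median shifts downward.
median_drop_top <- median(sorted[seq_len(n - k)])

# Drop the k oldest and the k youngest: the median is unchanged.
median_trimmed <- median(sorted[(k + 1):(n - k)])

c(all = median_all, drop_top = median_drop_top, trimmed = median_trimmed)
```

Since symmetric trimming gives the same median as keeping everyone, it
seems simpler to keep the oldest group in the data rather than drop it.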

I look forward to reading your thoughts. Thank you for any advice and
guidance.

-Kevin

On Tue, 2023-08-08 at 12:00 +0200, r-sig-geo-request using r-project.org
wrote:
> 
> Message: 2
> Date: Mon, 7 Aug 2023 18:33:41 +0000
> From: Kevin Zembower <kevin using zembower.org>
> To: "r-sig-geo using r-project.org" <r-sig-geo using r-project.org>
> Subject: [R-sig-Geo] Calculating median age for a group of US census
>         blocks?
> 
> Hello, all,
> 
> I'd like to obtain the median age for a population in a specific group
> of US Decennial census blocks. Here's an example of the problem:
> 
> ## Example of calculating median age of population in census blocks.
> library(tidyverse)
> library(tidycensus)
> 
> ## Total population (table P1) for three blocks in tract 271101.
> ## In a 15-character block GEOID, characters 6-11 are the tract and
> ## characters 12-15 are the block.
> counts <- get_decennial(
>      geography = "block",
>      state = "MD",
>      county = "Baltimore city",
>      table = "P1",
>      year = 2020,
>      sumfile = "dhc") %>%
>      mutate(NAME = NULL) %>%
>      filter(substr(GEOID, 6, 11) == "271101" &
>             substr(GEOID, 12, 15) %in% c("3000", "3001", "3002")
>             )
> 
> ## Median age by sex (table P13) for the same three blocks.
> ages <- get_decennial(
>      geography = "block",
>      state = "MD",
>      county = "Baltimore city",
>      table = "P13",
>      year = 2020,
>      sumfile = "dhc") %>%
>      mutate(NAME = NULL) %>%
>      filter(substr(GEOID, 6, 11) == "271101" &
>             substr(GEOID, 12, 15) %in% c("3000", "3001", "3002")
>             )
> 
> I have two questions:
> 
> 1. Is it mathematically valid to multiply the population of a block by
> the median age of that block (in other words, assign the median age to
> each member of a block), then calculate the median of those numbers for
> a group of blocks?
> 
> 2. Is raw data on the ages of individuals available anywhere else in the
> census data? I can find tables such as P12, which break down the
> population by age ranges or bins, but can't find counts by single year
> of age.
> 
> Thanks for your advice and help.
> 
> -Kevin
> 
> 
> 
> 
> ------------------------------
> 
> Message: 3
> Date: Mon, 7 Aug 2023 14:38:16 -0400
> From: Josiah Parry <josiah.parry using gmail.com>
> To: Kevin Zembower <kevin using zembower.org>
> Cc: "r-sig-geo using r-project.org" <r-sig-geo using r-project.org>
> Subject: Re: [R-sig-Geo]  Calculating median age for a group of US
>         census blocks?
> 
> Hey Kevin, I don't think you're going to be able to get individual-level
> data from the US Census Bureau. The closest you may be able to get is the
> Current Population Survey (CPS), which I believe is also available via
> tidycensus. Regarding your first question, I'm not sure I follow what your
> objective is. I would use census block groups as the geography and take
> the median reported for each block group. Otherwise it is unclear how you
> are defining what a "group of blocks" is.
> 
> ------------------------------
> 
> Message: 4
> Date: Mon, 7 Aug 2023 19:00:38 +0000
> From: Kevin Zembower <kevin using zembower.org>
> To: Josiah Parry <josiah.parry using gmail.com>
> Cc: "r-sig-geo using r-project.org" <r-sig-geo using r-project.org>
> Subject: Re: [R-sig-Geo]  Calculating median age for a group of US
>         census blocks?
> 
> Josiah, thanks for your reply.
> 
> Regarding my objective, I'm trying to compile census statistics for the
> blocks that make up the neighborhood where I live. It consists of ten
> census blocks, of which I selected three for simplicity in my example.
> The census block group which contains these ten blocks also contains
> some blocks which are outside of my neighborhood and shouldn't be
> counted or included.
> 
> Since I won't be able to calculate the median age from the age and count
> data, and since the individual data doesn't seem to be available, is it
> your thought that I can't produce a valid median age for a group of
> census blocks?
> 
> Thanks so much for your advice.
> 
> -Kevin
> 
> ------------------------------
> 
> Message: 5
> Date: Mon, 7 Aug 2023 18:45:48 +0000
> From: Sean Trende <strende using realclearpolitics.com>
> To: Josiah Parry <josiah.parry using gmail.com>, Kevin Zembower
>         <kevin using zembower.org>
> Cc: "r-sig-geo using r-project.org" <r-sig-geo using r-project.org>
> Subject: Re: [R-sig-Geo]  Calculating median age for a group of US
>         census blocks?
> 
> This is correct on the second question, at least for more recent
> censuses.  On the first question, imagine a block where the ages of
> three individuals are 60, 50, and 40, and another one where the ages
> are 20, 20, and 20.  Using your approach you would have 50 * 3 = 150
> for the first block, and 20*3 = 60 for the second block.  The median
> of 60 and 150 is 105.  Even dividing that by three you get 35, which
> is not the correct median age (30).
> 
> ------------------------------
> 
> Message: 6
> Date: Mon, 7 Aug 2023 18:52:33 +0000
> From: Kevin Zembower <kevin using zembower.org>
> To: Sean Trende <strende using realclearpolitics.com>,  Josiah Parry
>         <josiah.parry using gmail.com>
> Cc: "r-sig-geo using r-project.org" <r-sig-geo using r-project.org>
> Subject: Re: [R-sig-Geo]  Calculating median age for a group of US
>         census blocks?
> 
> Yes, I see what you mean:
> 
>  > median(c(60, 50, 40, 20, 20, 20))
> [1] 30
>  > median(c(50, 50, 50, 20, 20, 20))
> [1] 35
>  >
> 
> Thanks so much for that clear example.
> 
> -Kevin
> 
> ------------------------------
> 
> Message: 7
> Date: Mon, 7 Aug 2023 18:53:05 +0000
> From: Jeff Boggs <jboggs using brocku.ca>
> To: "r-sig-geo using r-project.org" <r-sig-geo using r-project.org>, Kevin
>         Zembower <kevin using zembower.org>
> Subject: Re: [R-sig-Geo]  Calculating median age for a group of US
>         census blocks?
> 
> Responses to your questions:
> Q1: No. It is not mathematically valid, sadly.
> 
> Q2: I do not know, but your intuition that this is a possible
> solution is correct.
> 
> I don't use US Census data anymore, but suspect that the data exists.
> Whether they are publicly-available is a different question. I
> suspect, though, that block level age-sex cohort in five-year
> intervals is available, given this is the usual ingredient for a
> population pyramid. That data could be used to calculate a less exact
> median, if you make some simplifying assumptions.
> 
> Best regards,
> Jeff
> 
> ------------------------------
> 
> Message: 8
> Date: Mon, 7 Aug 2023 15:43:50 -0400
> From: Dexter Locke <dexter.locke using gmail.com>
> To: Kevin Zembower <kevin using zembower.org>
> Cc: Josiah Parry <josiah.parry using gmail.com>,  "r-sig-geo using r-project.org"
>         <r-sig-geo using r-project.org>
> Subject: Re: [R-sig-Geo]  Calculating median age for a group of US
>         census blocks?
> 
> Hi Kevin and all,
> 
> Given the binned data, you could count the number of people per age class
> for those 10 blocks. You can then express that in a number of different
> ways, like percent under 25 years old, or by calculating the dependency
> ratio
> <https://www.who.int/data/gho/indicator-metadata-registry/imr-details/1119>.
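
A rough R sketch of the two summaries suggested above, assuming the
dependency ratio is the number of people under 15 or aged 65 and over
per 100 people aged 15 to 64, and assuming the bins have already been
collapsed so that their edges line up with the cutoffs at 15, 25 and
65. The counts are made up and are not taken from P12.

```r
# Hypothetical collapsed bins whose edges line up with 15, 25 and 65.
bins <- data.frame(
  lower = c(0, 15, 25, 65),
  upper = c(14, 24, 64, Inf),
  count = c(90, 60, 300, 87)
)

total <- sum(bins$count)

# Percent of the population under 25 years old.
pct_under_25 <- 100 * sum(bins$count[bins$upper < 25]) / total

# Dependents: under 15, or 65 and over. Working age: 15 to 64.
dependents  <- sum(bins$count[bins$upper < 15 | bins$lower >= 65])
working_age <- sum(bins$count[bins$lower >= 15 & bins$upper < 65])

# Dependents per 100 people of working age.
dependency_ratio <- 100 * dependents / working_age
```

Both summaries need only the bin counts, so they avoid the problem of
not knowing individual ages within a bin.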
> 
> I do think it is feasible to calculate an estimated median from the
> counts within groups representing ranges. See, for example, here:
> https://stackoverflow.com/questions/18887382/how-to-calculate-the-median-on-grouped-dataset
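
A sketch in R of the usual textbook estimate for a median from grouped
counts, which may or may not match the approach at the link above. It
assumes ages are spread evenly within the bin that contains the median
and interpolates: L + (N/2 - F) / f * w, where L is the lower edge of
that bin, w its width, f its count, F the count below it, and N the
total. grouped_median() is a hypothetical helper name and the counts
are made up.

```r
# Interpolated median from binned counts, assuming ages are uniformly
# distributed inside the bin that contains the median.
grouped_median <- function(lower, width, count) {
  n <- sum(count)
  cum <- cumsum(count)
  i <- which(cum >= n / 2)[1]            # bin holding the median
  below <- if (i > 1) cum[i - 1] else 0  # people in the bins before it
  lower[i] + (n / 2 - below) / count[i] * width[i]
}

# Hypothetical bins in completed years; "40 to 49" covers exact ages
# from 40 up to (but not including) 50, so its width is 10.
lower <- c(0, 5, 10, 15, 20, 30, 40, 50, 60, 70)
width <- c(5, 5, 5, 5, 10, 10, 10, 10, 10, 15)
count <- c(30, 28, 25, 27, 70, 75, 80, 72, 68, 53)

est <- grouped_median(lower, width, count)
est
```

This gives a single estimate rather than a range. If the median falls
in an open-ended bin, the bin has no finite width and the estimate is
not usable without a further assumption about the top age.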
> 
> Since you are working in Baltimore, you may consider looking at The
> Baltimore Neighborhood Indicators Alliance:
> https://bniajfi.org/vital_signs/
> They provide useful data on a range of issues (transportation, crime,
> education, environment, etc.), including summaries from Census-derived
> demographics. What you are seeking may already exist. BNIA creates
> neighborhoods or "community statistical areas" (n = 55) based on
> aggregates of Census data.
> 
> Although not pertaining to age, Baltimore City Planning has paid the
> Census Bureau in the past to aggregate individual-level Census data to
> the more colloquially used neighborhood definitions of Baltimore shown
> here (n = 273):
> https://data.baltimorecity.gov/datasets/neighborhood-1/explore?location=39.284832%2C-76.620516%2C12.91
> 
> Best, Dexter
> https://dexterlocke.com/
> 