Calculation of Age heaping
> If you want to look at each digit, you should take a step back and
> think about what the Whipple index is actually doing. Basically, the
> model underlying the Whipple index is that
>   Pr(age = xy) = Pr(age = x*) Pr(age = *y)
> if there is no age heaping. Or rather, since the age is restricted to
> 23..62 (a whole number of decades), it is that
>   Pr(age - 23 = xy) = Pr(age - 23 = x*) Pr(age - 23 = *y)
> for 0 <= x <= 3, 0 <= y <= 9,
> and the "nothing to see here" case is Pr(age = *y) = 1/10.
> I wasted way too much time trying to find a free age data set where
> age *wasn't* already grouped into 5 year bands.
>
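
For what it's worth, here is a minimal base R sketch of that per-digit view. The names digit_table and digit_shares are made up for illustration, and age is assumed to be a vector of single-year reported ages. It tabulates the decade block of age - 23 against the last digit of the reported age, which is just a relabelling of the low digit of age - 23.

```r
# Cross-tabulate the decade block of (age - 23) against the last digit of age.
# `age` is assumed to be a numeric vector of single-year ages (hypothetical input).
digit_table <- function(age) {
  a <- age[!is.na(age) & age >= 23 & age <= 62]
  block <- factor((a - 23) %/% 10, levels = 0:3)  # high digit of age - 23
  digit <- factor(a %% 10, levels = 0:9)          # last digit of reported age
  table(block, digit)
}

# Share of each last digit; with no heaping every share should be near 1/10.
digit_shares <- function(age) {
  prop.table(colSums(digit_table(age)))
}

# One person at each age 23..62: every cell holds 1, every share is 0.1.
digit_shares(23:62)
```

With real data, shares for 0 and 5 well above 1/10 are the heaping the Whipple index summarises in a single number.
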
> So what's wrong with a chi-square test?
> I would certainly want to check whether the high and low digits of
> age - 23 were in fact independent.
>
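
A rough sketch of both checks with chisq.test from base R, assuming single-year ages are available. The ages below are simulated purely for illustration (uniform on 23..62, with a fifth of the people rounding to the nearest multiple of 5); they are not real data, and the heaping mechanism is my own assumption.

```r
set.seed(1)  # simulated ages, for illustration only
age <- sample(23:62, 2000, replace = TRUE)

# Assumed heaping mechanism: about 20% report the nearest multiple of 5.
heap <- runif(length(age)) < 0.2
age[heap] <- 5 * round(age[heap] / 5)

high <- factor((age - 23) %/% 10, levels = 0:3)  # high digit of age - 23
low  <- factor((age - 23) %% 10, levels = 0:9)   # low digit of age - 23

# Goodness of fit: is each low digit equally likely (probability 1/10)?
uniform_test <- chisq.test(table(low), p = rep(1 / 10, 10))

# Independence: are the high and low digits of age - 23 independent?
indep_test <- chisq.test(table(high, low))

uniform_test
indep_test
```

In this simulation the rounding never moves anyone out of their decade block of age - 23, so only the uniformity test has something to find. On real data the independence test is the one that tells you whether the model behind the index is reasonable.
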
> >
> > > https://en.wikipedia.org/wiki/Whipple%27s_index
> > > ----
> > >
> > > then, the article says the algorithm is
> > > ----
> > > The index score is obtained by summing the number of persons in the age
> > > range 23 and 62 inclusive, who report ages ending in 0 and 5, dividing
> > > that sum by the total population between ages 23 and 62 years
> > > inclusive,
> > > and multiplying the result by 5. Restated as a percentage, index scores
> > > range between 100 (no preference for ages ending in 0 and 5) and 500
> > > (all people reporting ages ending in 0 and 5).
> > > ----
> > >
> > > that seems fairly straightforward. if you are trying to learn R,
> > > and/or learn programming, i might suggest you *not* use a package, and
> > > rather work on coding up the calculation yourself. that would probably
> > > be a good, but not too hard, exercise, of some interest. enjoy!
> >
> >
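
In the spirit of coding it up by hand, the quoted description can be sketched in a few lines of base R. The function name whipple is made up, and age is assumed to be a vector of single-year reported ages; this is one reading of the Wikipedia text, not a reference implementation.

```r
# Whipple index following the quoted description:
# 5 * (people aged 23..62 reporting an age ending in 0 or 5)
#   / (all people aged 23..62), restated as a percentage.
# `age` is assumed to be a numeric vector of single-year ages (hypothetical input).
whipple <- function(age) {
  a <- age[!is.na(age) & age >= 23 & age <= 62]
  if (length(a) == 0) return(NA_real_)     # nobody in range: index undefined
  heaped <- sum((a %% 10) %in% c(0, 5))    # ages ending in 0 or 5
  100 * 5 * heaped / length(a)
}

# One person at each age 23..62: no preference for 0 or 5.
whipple(23:62)

# Everyone reports an age ending in 0 or 5.
whipple(c(25, 30, 35, 40, 45, 50, 55, 60))
```

The first call gives 100 (8 of the 40 ages end in 0 or 5, times 5) and the second gives 500, which matches the range stated in the quoted text.
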
