[R] density with weights missing values

Matthias Gondan matthias-gondan using gmx.de
Tue Jul 13 09:36:53 CEST 2021


Thanks Martin and the others. I will do so accordingly. 

I guess the 0.1% of the population who use density() with weights will write code like this:

x = c(1, 2, 3, NA)
weights = c(1, 1, 1, 1)
density(x[!is.na(x)], weights=weights[!is.na(x)])

These people won’t be affected. For the 0.01% of people with code like this,

density(x, weights=weights[!is.na(x)], na.rm=TRUE)

the corrected version would almost surely raise an error. Note that the error message could, in principle, check whether length(x[!is.na(x)]) == length(weights) and tell the programmer that this was the old behavior.
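
A minimal sketch of such a check, written as a wrapper around stats::density(). The name density_na is hypothetical and not part of R, and renormalising the remaining weights (by analogy with weighted.mean()) is an assumption, not the agreed fix:

```r
# Hypothetical helper (not part of base R): drops the weights that belong to
# missing x when na.rm = TRUE, and hints at the old calling pattern on error.
density_na <- function(x, weights = NULL, na.rm = FALSE, ...) {
  keep <- !is.na(x)
  if (!is.null(weights)) {
    if (length(weights) != length(x)) {
      msg <- "'x' and 'weights' have unequal length"
      # Code written for the old behavior passes weights that were already
      # filtered, so their length equals the number of non-missing x.
      if (na.rm && length(weights) == sum(keep))
        msg <- paste0(msg, "; 'weights' matches the number of non-missing ",
                      "'x', which code written for the old behavior relied on")
      stop(msg)
    }
    if (na.rm) {
      weights <- weights[keep]
      weights <- weights / sum(weights)  # assumption: renormalise after dropping
    }
  }
  if (na.rm) x <- x[keep]
  stats::density(x, weights = weights, ...)
}

x <- c(1, 2, 3, NA)
weights <- c(1, 1, 1, 1)
d <- density_na(x, weights = weights, na.rm = TRUE)   # weights follow x
```

With this sketch, density_na(x, weights=weights[!is.na(x)], na.rm=TRUE) stops with the extended message instead of a bare length error.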

Best wishes,

Matthias

PS. Sorry for the HTML email. I’ve given up trying to fix such behavior.


From: Martin Maechler
Sent: Tuesday, 13 July 2021 09:09
To: Matthias Gondan
Cc: r-help using r-project.org
Subject: Re: [R] density with weights missing values

>>>>> Matthias Gondan 
>>>>>     on Mon, 12 Jul 2021 15:09:38 +0200 writes:

    > Weighted mean behaves differently:
    > • weight is excluded for missing x
    > • no warning for sum(weights) != 1

    >> weighted.mean(c(1, 2, 3, 4), w=c(1, 1, 1, 1))
    > [1] 2.5
    >> weighted.mean(c(1, 2, 3, NA), w=c(1, 1, 1, 1))
    > [1] NA
    >> weighted.mean(c(1, 2, 3, NA), w=c(1, 1, 1, 1), na.rm=TRUE)
    > [1] 2


I'm sure the 'weights' argument in weighted.mean() has been used
much more often than the one in density().
Hence, it's quite "probable statistically" :-) that the
weighted.mean() behavior in the NA case has been more rational
and thought through.

So I agree with you, Matthias, that ideally density() should
behave differently here,  probably entirely analogously to weighted.mean().
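
To illustrate the behavior referred to here: weighted.mean() takes its weights through the argument w, and with na.rm=TRUE it drops the weight that belongs to each missing x before averaging. A small check with unequal weights (the values are chosen only for illustration):

```r
x <- c(1, 2, 3, NA)
w <- c(1, 2, 3, 4)   # illustrative weights, deliberately not summing to 1

# The weight paired with the NA is dropped together with the NA itself
wm <- weighted.mean(x, w = w, na.rm = TRUE)

# The same quantity computed by hand on the complete pairs:
# (1*1 + 2*2 + 3*3) / (1 + 2 + 3) = 14/6
ok <- !is.na(x)
by_hand <- sum(x[ok] * w[ok]) / sum(w[ok])

all.equal(wm, by_hand)
```

An analogous density() would subset the weights with the same index used to remove the missing x.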

Still, Bert and others are right that there is no bug formally,
but something that possibly should be changed; even though it
breaks back compatibility for those cases, such cases may be
very rare (I'm not sure I've ever used weights in density(), but
I know I've used density() very much all those 25 years ..).

https://www.r-project.org/bugs.html

contains good information about determining if something may be
a bug in R *and* tells you how to apply for an account on R's
bugzilla for reporting it formally.
I'm hereby encouraging you, Matthias, to do that and then in
your report mention both density() and weighted.mean(), i.e., a
cleaned up version of the union of your first 2 e-mails.

Thank you for thinking about this and concisely reporting it.
Martin


    > From: Richard O'Keefe
    > Sent: Monday, 12 July 2021 13:18
    > To: Matthias Gondan
    > Subject: Re: [R] density with weights missing values

    > Does your copy of R say that the weights must add up to 1?
    > ?density doesn't say that in mine.   But it does check.

Another small part that could be improved, indeed;
thank you, Richard.

--
Martin Maechler
ETH Zurich  and  R Core team
 
    > On Mon, 12 Jul 2021 at 22:42, Matthias Gondan <matthias-gondan using gmx.de> wrote:
    >> 
    >> Dear R users,
    >> 
    >> This works as expected:
    >> 
    >> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE))
    >> 
    >> This raises an error
    >> 
    >> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1)))
    >> • plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, NA)))
[..............]

