[R] Bootstrap CIs for weighted means of paired differences
David Winsemius
dwinsemius at comcast.net
Wed Nov 19 17:49:45 CET 2014
On Nov 19, 2014, at 6:08 AM, i.petzev wrote:
> Hi David,
>
> thanks a lot for the response. I see that this works. I am not sure, however, what the appropriate way to do this is. It also works if you do not define weights in the boot() function (weighted bootstrap) but rather in the vw_m_diff function (ordinary bootstrap), i.e.,
>
>
> vw_m_diff <- function(dataset,d) {
> differences <- dataset[d,1]-dataset[d,2]
> weights <- dataset[d, "weights"]
> return(weighted.mean(x=differences, w=weights))
> }
>
> with
>
> boot(dataset, statistic=vw_m_diff, R = 1000)
I don't undertand how that is in the slightest way different than what I suggested. You've only substituted 'd' for 'w' as a formal parameter in the function. That should not change the processed results at all.
--
David.
>
> I guess this is rather a statistical question and hence I will have to look further into it.
>
> In any case, thanks a lot for your help.
>
> Best
>
>
>
> On 15 Nov 2014, at 17:27, David Winsemius <dwinsemius at comcast.net> wrote:
>
>>
>> On Nov 14, 2014, at 3:18 PM, David Winsemius wrote:
>>
>>>
>>> On Nov 14, 2014, at 12:15 PM, ivan wrote:
>>>
>>>> Hi,
>>>>
>>>> I am trying to compute bootstrap confidence intervals for weighted means of
>>>> paired differences with the boot package. Unfortunately, the weighted mean
>>>> estimate lies out of the confidence bounds and hence I am obviously doing
>>>> something wrong.
>>>>
>>>> Appreciate any help. Thanks. Here is a reproducible example:
>>>>
>>>>
>>>> library(boot)
>>>> set.seed(1111)
>>>> x <- rnorm(50)
>>>> y <- rnorm(50)
>>>> weights <- runif(50)
>>>> weights <- weights / sum(weights)
>>>> dataset <- cbind(x,y,weights)
>>>> vw_m_diff <- function(dataset,w, d) {
>>>
>>> My understanding of the principle underlying the design of the bootstrapped function was that the data was the first argument and the index vector was the second. (I admit to not knowing what it would do with a third argument. So I would have guessed that you wanted:
>>>
>>> vw_m_diff <- function(dataset,w) {
>>> differences <- dataset[d,1]-dataset[d,2]
>>> weights <- dataset[w, "weights"]
>>> return(weighted.mean(x=differences, w=weights))
>>> }
>>
>> I'm sorry. That was the code I first editted. This is the code that produced that output:
>>
>> vw_m_diff <- function(dataset,w) {
>> differences <- dataset[w,1]-dataset[w,2]
>> weights <- dataset[w, "weights"]
>> return(weighted.mean(x=differences, w=weights))
>> }
>>
>>>
>>> I get what appears to me as a sensible set of estimates (since they seem centered on zero) although I further admit I do not know what the theoretic CI _should_ be for this problem:
>>>
>>>> res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000, w=dataset[,3])
>>>> boot.ci(res_boot)
>>> BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
>>> Based on 1000 bootstrap replicates
>>>
>>> CALL :
>>> boot.ci(boot.out = res_boot)
>>>
>>> Intervals :
>>> Level Normal Basic
>>> 95% (-0.5657, 0.4962 ) (-0.5713, 0.5062 )
>>>
>>> Level Percentile BCa
>>> 95% (-0.6527, 0.4249 ) (-0.5579, 0.5023 )
>>> Calculations and Intervals on Original Scale
>>>
>>>
>>>> differences <- dataset[d,1]-dataset[d,2]
>>>> weights <- w[d]
>>>> return(weighted.mean(x=differences, w=weights))
>>>> }
>>>> res_boot <- boot(dataset, statistic=vw_m_diff, R = 1000, w=dataset[,3])
>>>> boot.ci(res_boot)
>>>>
>>>> *BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS*
>>>> *Based on 1000 bootstrap replicates*
>>>>
>>>> *CALL : *
>>>> *boot.ci <http://boot.ci>(boot.out = res_boot)*
>>>>
>>>> *Intervals : *
>>>> *Level Normal Basic *
>>>> *95% (-0.8365, -0.3463 ) (-0.8311, -0.3441 ) *
>>>>
>>>> *Level Percentile BCa *
>>>> *95% (-0.3276, 0.1594 ) (-0.4781, -0.3477 ) *
>>>>
>>>> weighted.mean(x=dataset[,1]-dataset[,2], w=dataset[,3])
>>>>
>>>> *[1] -0.07321734*
>>>>
>>>> [[alternative HTML version deleted]]
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>> David Winsemius
>>> Alameda, CA, USA
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>> David Winsemius
>> Alameda, CA, USA
>
David Winsemius
Alameda, CA, USA
More information about the R-help
mailing list