[R] Bias-corrected percentile confidence intervals
Joe Ceradini
joeceradini at gmail.com
Wed Aug 16 21:30:46 CEST 2017
Hi folks,
I'm trying to estimate bias-corrected percentile (BCP) confidence
intervals on a vector from a simple for loop used for resampling. I am
attempting to follow steps in Manly, B. 1998. Randomization, bootstrap
and monte carlo methods in biology. 2nd edition., p. 48. PDF of the
approach/steps should be available here:
https://wyocoopunit.box.com/s/9vm4vgmbx5h7um809bvg6u7wr392v6i9
If this is too statsy for the R list, I can try stackexchange or
crossval, but the R list provides all kinds of great help so thought
I'd try it first. I aware of boot::boot but am hoping to avoid it for
the current analysis I am working on (the boot function became quite
challenging, for me, for a few reasons).
I cannot figure out where I'm going wrong but the estimates from my
attempt at the BCP CI are different enough from other methods that I
assume I'm doing something wrong.
require(boot)
data("mtcars")
# 1) Bootstrap 95% CI for R-Squared via boot::boot
# statmethods.net/advstats/bootstrapping.html
# Function for boot
rsq <- function(formula, data, indices) {
d <- data[indices,]
fit <- lm(formula, data=d)
return(summary(fit)$r.square)
}
# bootstrapping with 1000 replications
results <- boot(data=mtcars, statistic=rsq,
R=1000, formula = mpg ~ wt + disp)
# get 95% confidence interval
rsq.ci <- boot.ci(results, type="all")
# 2) Bootstrap via for loop and estimate bias-corrected pecentile CI's
# Fit LM with all the data
lm.obs <- lm(mpg ~ wt + disp, mtcars)
# Bootstrap with for loop
iterations <- 1000 # # of iterations
rsq.out <- vector() # to hold loop output
for(i in 1:iterations){
boot.df <- mtcars[sample(nrow(mtcars), nrow(mtcars), replace = TRUE), ]
boot.lm <- lm(mpg ~ wt + disp, boot.df)
boot.rsq <- summary(boot.lm)$r.squared
rsq.out <- c(rsq.out, boot.rsq)
}
hist(rsq.out)
# Bias-corrected percentile CI
# Manly, B. 1998. Randomization, bootstrap and monte carlo methods in
biology. 2nd edition.
# - p. 48
# Proportion of times that the boot estimate exceeds observed estimate
mtcar.boot.rsq <- data.frame(boot.rsq = rsq.out,
obs.rsq = summary(lm.obs)$r.squared)
head(mtcar.boot.rsq)
# If boot mean is > observed mean, code 1, otherwise 0
mtcar.boot.rsq$boot.high <- ifelse(mtcar.boot.rsq$boot.rsq >
mtcar.boot.rsq$obs.rsq, 1, 0)
# mean is the proportion of times boot mean > obs mean
mean(mtcar.boot.rsq$boot.high)
head(mtcar.boot.rsq)
# Use proportion to get Z score, then use that to calculate the new bias-correct
# Z score to look up the new proportion to use in quantile()
rsq.bc <- quantile(mtcar.boot.rsq$boot.rsq,
c(pnorm((2*pnorm(mean(mtcar.boot.rsq$boot.high))) - 1.96),
pnorm((2*pnorm(mean(mtcar.boot.rsq$boot.high))) + 1.96)))
# ESTIMATES
# My attempt at BCP CI (ehhhh)
rsq.bc
31.59952% -- 0.7747525
99.97103% -- 0.9034800
# Boot package
rsq.ci
Intervals :
Level Normal Basic
95% ( 0.6690, 0.8659 ) ( 0.6818, 0.8824 )
Level Percentile BCa
95% ( 0.6795, 0.8800 ) ( 0.6044, 0.8511 )
# Simple percentile CI
quantile(rsq.out, c(0.025, 0.975))
2.5% 97.5%
0.6871754 0.8809823
boot.ci(results, type="perc")
Level Percentile
95% ( 0.6795, 0.8800 )
I appreciate any advice!
Thanks.
More information about the R-help
mailing list