[R] Bootstrap P-Value

Fri Nov 6 19:01:24 CET 2020

Dear Greg:

H0: Mean 1 - Mean 2 = 0
Ha: Mean 1 - Mean 2 != 0
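
For this null hypothesis, the suggestion of measuring the bootstrap differences against the null value 0 could be sketched as follows. This is only one reading of that advice (inverting the percentile interval), not a definitive method, and the helper name boot_diff_pvalue and its arguments are hypothetical names, not part of the code quoted below.

```r
# Hypothetical helper (an assumption, not from the original code):
# two-sided bootstrap p-value for H0: mean1 - mean2 = null_value,
# obtained by counting bootstrap differences on either side of the null.
boot_diff_pvalue <- function(x1, x2, null_value = 0, iterations = 1000) {
  theta <- numeric(iterations)            # bootstrap differences in means
  for (i in seq_len(iterations)) {
    xx1 <- sample(x1, length(x1), replace = TRUE)
    xx2 <- sample(x2, length(x2), replace = TRUE)
    theta[i] <- mean(xx1) - mean(xx2)
  }
  centred <- theta - null_value           # subtract the null value, not the observed difference
  tail_count <- min(sum(centred <= 0), sum(centred >= 0))
  p_value <- min(1, 2 * (tail_count + 1) / (iterations + 1))
  list(ci = quantile(theta, probs = c(0.025, 0.975)), p.value = p_value)
}

# Same data-generating settings as Example 2 in the quoted message
set.seed(5)
x1 <- rnorm(29, 10.5, 0.15)
x2 <- rnorm(33, 1.5, 0.155)
res <- boot_diff_pvalue(x1, x2)
res$ci
res$p.value
```

With the Example 2 data every bootstrap difference lies far above 0, so the smaller tail count is 0 and the p-value is 2/1001, about 0.002, which agrees with a percentile interval that excludes 0.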

with many thanks
abou
______________________

*AbouEl-Makarim Aboueissa, PhD*

*Professor, Statistics and Data Science*
*Graduate Coordinator*

*Department of Mathematics and Statistics*
*University of Southern Maine*

On Fri, Nov 6, 2020 at 12:35 PM Greg Snow <538280 using gmail.com> wrote:

> A p-value is for testing a specific null hypothesis, but you do not
> state your null hypothesis anywhere.
>
> It is the null value that needs to be subtracted from the bootstrap
> differences, not the observed difference.  By subtracting the observed
> difference you are setting a situation where the p-value will always
> be about 0.5 or about 1 (depending on 1 tailed or 2 tailed).  If
> instead you subtract a null value (such as 0), then the p-values will
> be closer to what you are expecting.
>
> On Fri, Nov 6, 2020 at 9:44 AM AbouEl-Makarim Aboueissa
> <abouelmakarim1962 using gmail.com> wrote:
> >
> > *Dear All:*
> >
> > *I am trying to compute the p-value of the bootstrap test; please see
> > below.*
> >
> > *In example 1 the p-value agrees with the confidence interval.*
> > *BUT, in example 2  the p-value DOES NOT agree with the confidence
> > interval. In Example 2, the p-value should be zero or close to zero.*
> >
> > *I am not sure what went wrong, or not sure if I missed something.*
> >
> > *any help would be appreciated.*
> >
> >
> > *with many thanks*
> > *abou*
> >
> >
> >
> > #####  Two - Sample Bootstrap
> >
> > #####  Source:
> > http://www.ievbras.ru/ecostat/Kiril/R/Biblio_N/R_Eng/Chernick2011.pdf
> >
> > #####  Example 1:
> > #####  ----------
> >
> >
> >
> > set.seed(1)
> >
> > n1 <- 29
> > n1
> > x1 <- rnorm(n1, 1.143, 0.164) #some random normal variates: mean1 = 1.143
> > x1
> >
> > n2 <- 33
> > n2
> > x2 <- rnorm(n2, 1.175, 0.169) #2nd random sample: mean2 = 1.175
> > x2
> >
> > obs.diff.theta <- mean(x1) - mean(x2)
> > obs.diff.theta
> >
> > theta <- as.vector(NULL) #### vector to hold difference estimates
> >
> > iterations <- 1000
> >
> > for (i in 1:1000) {                        #bootstrap resamples
> >  xx1 <- sample(x1, n1, replace = TRUE)
> >  xx2 <- sample(x2, n2, replace = TRUE)
> >  theta[i] <- mean(xx1) - mean(xx2)
> >  }
> >
> >
> >
> > ##### Confidence Interval:
> > ##### --------------------
> >
> >
> > # Efron percentile CI on difference in means
> > quantile(theta, probs = c(.025,0.975))
> >
> > #####       2.5%     97.5%
> > ##### -0.1248539 0.0137601
> >
> >
> > ##### P-Value
> > ##### -------
> >
> > p.value <- (sum (abs(theta) >= obs.diff.theta) + 1)/ (iterations+1)
> >
> > #####  p.value <- (sum (theta >= obs.diff.theta) + 1)/ (iterations+1)
> >
> > p.value
> >
> >
> >
> > #### R OUTPUT
> >
> > #### > quantile(theta, probs = c(.025,0.975))
> > ####        2.5%       97.5%
> > #### -0.12647744  0.02099391
> >
> > #### > p.value <- (sum (abs(theta) >= obs.diff.theta) + 1)/ (iterations+1)
> > #### > p.value
> > #### [1] 1
> >
> > #####  Example 2:
> > #####  ----------
> >
> >
> > set.seed(5)
> >
> > n1 <- 29
> > ### n1
> > x1 <- rnorm(n1, 10.5, 0.15) ######   sample 1 with mean1 = 10.5
> > ### x1
> >
> > n2 <- 33
> > ### n2
> > x2 <- rnorm(n2, 1.5, 0.155) #####  Sample 2 with mean2 = 1.5
> > ### x2
> >
> > obs.diff.theta <- mean(x1) - mean(x2)
> > obs.diff.theta
> >
> > theta <- as.vector(NULL) #### vector to hold difference estimates
> >
> > iterations <- 1000
> >
> > #####   bootstrap resamples
> >
> > for (i in 1:1000) {
> >  xx1 <- sample(x1, n1, replace = TRUE)
> >  xx2 <- sample(x2, n2, replace = TRUE)
> >  theta[i] <- mean(xx1) - mean(xx2)
> >  }
> >
> >
> >
> > ##### Confidence Interval:
> > ##### --------------------
> >
> >
> > ######  CI on difference in means
> >
> > quantile(theta, probs = c(.025,0.975))
> >
> >
> >
> > ##### P-Value
> > ##### -------
> >
> > p.value <- (sum (abs(theta) >= obs.diff.theta) + 1)/ (iterations+1)
> >
> > ##### p.value <- (sum (theta >= obs.diff.theta) + 1)/ (iterations+1)
> >
> > p.value
> >
> > ##### R OUTPUT
> >
> > ####   > ######  CI on difference in means
> > ####   >
> > ####   > quantile(theta, probs = c(.025,0.975))
> > ####       2.5%    97.5%
> > ####   8.908398 9.060601
> >
> > ####   > ##### P-Value
> > ####   > p.value <- (sum (abs(theta) >= obs.diff.theta) + 1)/ (iterations+1)
> >
> > ####   > p.value
> > ####   [1] 0.4835165
> >
>
>
>
> --
> Gregory (Greg) L. Snow Ph.D.
> 538280 using gmail.com
>