BayesianPower

Fayette Klaassen

Introduction

BayesianPower can be used for sample size determination (using bayes_sampsize) and power calculation (using bayes_power) when Bayes factors are used to compare an inequality constrained hypothesis \(H_i\) to its complement \(H_c\), another inequality constrained hypothesis \(H_j\) or the unconstrained hypothesis \(H_u\). Power is defined as a combination of controlled error probabilities; either the unconditional or the conditional error probabilities can be controlled. Four approaches to control these probabilities are available in the methods of this package. Users are advised to read this vignette and the paper available at https://doi.org/10.17605/OSF.IO/D9EAJ, where the four available approaches are presented in detail (Klaassen, Hoijtink & Gu, unpublished).

Power calculation with bayes_power()

bayes_power(n, h1, h2, m1, m2, sd1=1, sd2=1, scale = 1000, bound1 = 1, bound2 = 1/bound1, datasets = 1000, nsamp = 1000, seed = 31)

Arguments

n A number. The sample size for which the error probabilities must be computed.

h1 A constraint matrix defining H1, see below for more details.

h2 A constraint matrix defining H2, or a character 'u' or 'c' for the unconstrained or complement hypothesis.

m1 A vector of expected population means under H1 (standardized), see below for more details.

m2 A vector of expected population means under H2 (standardized). m2 must be of the same length as m1.

sd1 A vector of standard deviations under H1. Must be a single number (equal standard deviation in all populations), or a vector of the same length as m1.

sd2 A vector of standard deviations under H2. Must be a single number (equal standard deviation in all populations), or a vector of the same length as m2.

scale A number or use the default 1000 to set the prior scale.

bound1 A number. The boundary above which BF12 favors H1, see below for more details.

bound2 A number. The boundary below which BF12 favors H2.

datasets A number. The number of datasets to simulate to compute the error probabilities.

nsamp A number. The number of prior or posterior samples to determine the complexity or fit.

seed A number. The random seed to be set.

Details

Specifying hypotheses

Hypotheses are defined by means of a constraint matrix \(R\) that specifies the order constraints between the means \(\boldsymbol\mu\), such that \(R \boldsymbol{\mu} > \bf{0}\). Here \(R\) is a matrix with \(J\) columns and \(K\) rows, where \(J\) is the number of group means and \(K\) is the number of constraints between the means, \(\boldsymbol\mu\) is a vector of \(J\) means and \(\bf{0}\) is a vector of \(K\) zeros. Each row of \(R\) encodes one linear inequality constraint.

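Consider the following constraint matrix \(R\) and vector of means \(\boldsymbol\mu\), followed by the product \(R \boldsymbol{\mu}\) and the check \(R \boldsymbol{\mu} > \bf{0}\):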

##      [,1] [,2] [,3]
## [1,]    1   -1    0
## [2,]    0    1   -1
## [1] 0.4 0.2 0.0
##      [,1]
## [1,]  0.2
## [2,]  0.2
##      [,1]
## [1,] TRUE
## [2,] TRUE

The matrix \(R\) specifies that the sum of \(1 \times \mu_1\) and \(-1 \times \mu_2\) and \(0 \times \mu_3\) is larger than \(0\), and the sum of \(0 \times \mu_1\) and \(1 \times \mu_2\) and \(-1 \times \mu_3\) is larger than \(0\). This can also be written as: \(\mu_1 > \mu_2 > \mu_3\). For more information about the specification of constraint matrices, see for example Hoijtink (2012).
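The output shown above can be reproduced with base R alone. The following is a minimal sketch; the object names R and mu are chosen here for illustration, and the values are the ones visible in the output.

```r
# Constraint matrix for mu1 > mu2 > mu3: one row per constraint,
# one column per group mean.
R <- matrix(c(1, -1,  0,
              0,  1, -1), nrow = 2, byrow = TRUE)

# Example vector of group means (values taken from the output above).
mu <- c(0.4, 0.2, 0.0)

R             # the constraint matrix
mu            # the vector of means
R %*% mu      # each row evaluates one constraint
R %*% mu > 0  # TRUE where the constraint is satisfied
```

If every entry of the last result is TRUE, the means are in agreement with the hypothesis encoded by R.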

The argument h1 has to be a constraint matrix as specified above. The argument h2 can be either a constraint matrix, or the character 'u' or 'c' if the goal is to compare \(H_1\) with \(H_u\), the unconstrained hypothesis, or \(H_c\) the complement hypothesis.

Specifying population means

Hypothesized population means have to be defined under both \(H_1\) and \(H_2\), even if \(H_u\) or \(H_c\) is used as \(H_2\). The group-specific standard deviations can be set using sd1 and sd2; by default, all group standard deviations are \(1\).

Prior scale

The prior scale can be set using scale. By default, a scale of 1000 is used. This implies that the prior covariance matrix is proportional to the standard errors of the sampled data, by a factor of 1000.

Setting bounds

bound1 and bound2 describe the boundaries used for interpreting a Bayes factor. If bound1 = 1, all \(BF_{12} > 1\) are considered to express evidence in favor of \(H_1\); if bound1 = 3, all \(BF_{12} > 3\) are considered to express evidence in favor of \(H_1\). Similarly, bound2 is the boundary below which \(BF_{12}\) is considered to express evidence in favor of \(H_2\).
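The decision rule implied by the two bounds can be sketched in a few lines of base R. The helper classify_bf() below is hypothetical and not part of the package; it only illustrates how a value of \(BF_{12}\) is read against bound1 and bound2, with values between the bounds treated as undecided.

```r
# Hypothetical helper (not part of BayesianPower): label each BF12
# according to the bounds described above.
classify_bf <- function(bf12, bound1 = 1, bound2 = 1 / bound1) {
  ifelse(bf12 > bound1, "H1",
         ifelse(bf12 < bound2, "H2", "undecided"))
}

# With bound1 = 3 (and therefore bound2 = 1/3), only BF12 > 3 counts as
# evidence for H1 and only BF12 < 1/3 counts as evidence for H2.
labels <- classify_bf(c(5, 2, 0.5, 0.1), bound1 = 3)
labels
```

With the default bound1 = 1, bound2 is also 1 and only a Bayes factor of exactly 1 is left undecided.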

Examples

Example 1. \(H_1\) vs \(H_c\)

An example where three group means are ordered in \(H_1: \mu_1 > \mu_2 > \mu_3\), which is compared to its complement. The power is determined for \(n = 40\).
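A call for this example could look like the sketch below. The population means m1 and m2 are assumptions chosen for illustration (m1 satisfies \(H_1\), m2 violates it and therefore lies in the complement); only the argument names are taken from the bayes_power() signature documented above. The call is guarded so that the block also runs when the package is not installed.

```r
# H1: mu1 > mu2 > mu3, compared with its complement.
h1 <- matrix(c(1, -1,  0,
               0,  1, -1), nrow = 2, byrow = TRUE)
h2 <- "c"

# Assumed (illustrative) standardized population means.
m1 <- c(0.4, 0.2, 0.0)  # in agreement with H1
m2 <- c(0.2, 0.0, 0.1)  # violates mu2 > mu3, so in agreement with Hc

# Sanity checks on the assumed means.
stopifnot(all(h1 %*% m1 > 0), !all(h1 %*% m2 > 0))

# Run the simulation only if the package is available.
if (requireNamespace("BayesianPower", quietly = TRUE)) {
  res1 <- BayesianPower::bayes_power(n = 40, h1 = h1, h2 = h2,
                                     m1 = m1, m2 = m2)
  print(res1)
}
```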

Example 2. \(H_1\) vs \(H_2\)

An example where four group means are ordered in \(H_1: \mu_1 > \mu_2 > \mu_3 > \mu_4\) and in \(H_2: \mu_3 > \mu_2 > \mu_4 > \mu_1\). Only Bayes factors larger than \(3\) are considered evidence in favor of \(H_1\) and only Bayes factors smaller than \(1/3\) are considered evidence in favor of \(H_2\).
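The two hypotheses of this example translate into constraint matrices as sketched below. The sample size n = 40 and the population means are assumptions made for illustration, since the text does not state them; the bounds follow the description above.

```r
# H1: mu1 > mu2 > mu3 > mu4
h1 <- matrix(c(1, -1,  0,  0,
               0,  1, -1,  0,
               0,  0,  1, -1), nrow = 3, byrow = TRUE)

# H2: mu3 > mu2 > mu4 > mu1
h2 <- matrix(c( 0, -1, 1,  0,   # mu3 - mu2 > 0
                0,  1, 0, -1,   # mu2 - mu4 > 0
               -1,  0, 0,  1),  # mu4 - mu1 > 0
             nrow = 3, byrow = TRUE)

# Assumed (illustrative) standardized population means.
m1 <- c(0.6, 0.4, 0.2, 0.0)  # ordered as in H1
m2 <- c(0.0, 0.4, 0.6, 0.2)  # ordered as in H2

# Each set of means should satisfy its own hypothesis.
stopifnot(all(h1 %*% m1 > 0), all(h2 %*% m2 > 0))

# Run the simulation only if the package is available; n = 40 is assumed.
if (requireNamespace("BayesianPower", quietly = TRUE)) {
  res2 <- BayesianPower::bayes_power(n = 40, h1 = h1, h2 = h2,
                                     m1 = m1, m2 = m2,
                                     bound1 = 3, bound2 = 1 / 3)
  print(res2)
}
```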

Sample size determination with bayes_sampsize()

bayes_sampsize(h1, h2, m1, m2, sd1 = 1, sd2 = 1, scale = 1000, type = 1, cutoff, bound1 = 1, bound2 = 1 / bound1, datasets = 1000, nsamp = 1000, minss = 2, maxss = 1000, seed = 31)

Arguments

The arguments are the same as for bayes_power() with the addition of:

type A character. The type of error to be controlled. The options are: "1", "2", "de", "aoi", "med.1", "med.2". See below for more details.

cutoff A number. The cutoff criterion for type. If type is "1", "2", "de", "aoi", cutoff must be between \(0\) and \(1\). If type is "med.1" or "med.2", cutoff must be larger than \(1\). See below for more details.

minss A number. The minimum sample size.

maxss A number. The maximum sample size.

Details

bayes_sampsize() iteratively uses bayes_power() to determine the error probabilities for a sample size, evaluates whether the chosen error is below the cutoff, and adjusts the sample size.
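The general idea of such an iterative search can be sketched as follows. This is not the package's implementation: the bisection strategy, the helper find_sampsize() and the stand-in function toy_error() are all assumptions made for illustration. In practice the error for a given n would come from a simulation such as bayes_power(), and the search would have to cope with simulation noise.

```r
# Hypothetical helper: smallest n in [minss, maxss] whose error is at or
# below the cutoff, assuming the error does not increase with n.
find_sampsize <- function(error_at_n, cutoff, minss = 2, maxss = 1000) {
  if (error_at_n(maxss) > cutoff) {
    return(NA_integer_)  # cutoff not attainable within the allowed range
  }
  lo <- minss
  hi <- maxss
  while (lo < hi) {
    mid <- (lo + hi) %/% 2
    if (error_at_n(mid) <= cutoff) {
      hi <- mid        # mid is large enough; try smaller sample sizes
    } else {
      lo <- mid + 1    # mid is too small; try larger sample sizes
    }
  }
  lo
}

# Toy stand-in for a simulated error probability that shrinks with n.
toy_error <- function(n) exp(-n / 50)

n_needed <- find_sampsize(toy_error, cutoff = 0.1)
n_needed
```

The arguments minss and maxss mirror the role of the bounds on the sample size in bayes_sampsize(): they limit the range in which a solution is sought.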

type

Klaassen, Hoijtink & Gu (unpublished) describe in detail the different types of error probabilities that can be controlled. Specifying "1" or "2" indicates that the Type 1 or Type 2 error probability has to be controlled: respectively, the probability of concluding that \(H_2\) is the best hypothesis when \(H_1\) is true, or of concluding that \(H_1\) is the best hypothesis when \(H_2\) is true. Note that whether \(H_1\) or \(H_2\) is considered the best hypothesis depends on the values chosen for bound1 and bound2. Specifying "de" or "aoi" indicates that the Decision error probability (the average of Type 1 and Type 2) or the probability of Indecision has to be controlled, respectively. Finally, specifying "med.1" or "med.2" sets the minimum desired median \(BF_{12}\) when \(H_1\) is true, or the minimum desired median \(BF_{21}\) when \(H_2\) is true.
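To make these quantities concrete, the sketch below computes them from two small, made-up vectors of Bayes factors, standing in for \(BF_{12}\) values simulated under \(H_1\) and under \(H_2\). The numbers are invented, and the indecision probability is computed here as the average proportion of Bayes factors falling between the bounds, which is an assumption about its exact definition; see the paper for the definitions used by the package.

```r
bound1 <- 3
bound2 <- 1 / 3

# Made-up BF12 values, as if obtained from datasets simulated under H1 / H2.
bf12_h1 <- c(8, 4, 2, 0.5, 0.2, 10, 6, 3.5)
bf12_h2 <- c(0.1, 0.25, 0.5, 4, 0.05, 0.2, 1.5, 0.3)

# Type 1: concluding H2 is best although H1 is true.
type1 <- mean(bf12_h1 < bound2)
# Type 2: concluding H1 is best although H2 is true.
type2 <- mean(bf12_h2 > bound1)
# Decision error: average of Type 1 and Type 2.
de <- (type1 + type2) / 2

# Indecision (assumed definition): BF12 between the bounds, averaged
# over both hypotheses.
undecided <- function(bf) mean(bf >= bound2 & bf <= bound1)
indecision <- (undecided(bf12_h1) + undecided(bf12_h2)) / 2

# Median BF12 when H1 is true, and median BF21 = 1 / BF12 when H2 is true.
med1 <- median(bf12_h1)
med2 <- median(1 / bf12_h2)

c(type1 = type1, type2 = type2, de = de, indecision = indecision,
  med1 = med1, med2 = med2)
```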

Examples
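A call to bayes_sampsize() could look like the following sketch. The hypotheses, the population means, the error type "de" and the cutoff of 0.125 are all assumptions chosen for illustration; only the argument names come from the signature documented above. The call is guarded so that the block also runs when the package is not installed.

```r
# H1: mu1 > mu2 > mu3, compared with its complement.
h1 <- matrix(c(1, -1,  0,
               0,  1, -1), nrow = 2, byrow = TRUE)
h2 <- "c"

# Assumed (illustrative) standardized population means.
m1 <- c(0.4, 0.2, 0.0)  # in agreement with H1
m2 <- c(0.2, 0.0, 0.1)  # in agreement with the complement

# Control the decision error probability at an assumed cutoff of 0.125.
if (requireNamespace("BayesianPower", quietly = TRUE)) {
  ss <- BayesianPower::bayes_sampsize(h1 = h1, h2 = h2, m1 = m1, m2 = m2,
                                      type = "de", cutoff = 0.125)
  print(ss)
}
```

For "med.1" or "med.2" the cutoff would instead be a Bayes factor larger than \(1\), as described under Arguments.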

References

Hoijtink, H. (2012). Informative hypotheses: Theory and practice for behavioral and social scientists. Boca Raton: Chapman & Hall/CRC.

Klaassen, F., Hoijtink, H., Gu, X. (unpublished). The power of informative hypotheses. Pre-print available at https://doi.org/10.17605/OSF.IO/D9EAJ