Functions in this package provide a solution to a classical problem in survey methodology: optimum sample allocation in stratified sampling schemes. In this context, the allocation is optimal in the classical Tschuprov-Neyman sense and satisfies additional lower or upper bound restrictions imposed on the sample sizes in strata. Several algorithms are available, and one of them is based on a popular sample allocation method that applies Neyman allocation to a recursively reduced set of strata.
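The recursive idea can be sketched in a few lines of base R. This is only a minimal illustration for the upper-bounds case, assuming the total sample size n is smaller than sum(M). The function name rna_upper_sketch is hypothetical and is not part of stratallo, whose own implementation may differ.

```r
# Hypothetical helper (not a stratallo function): Neyman allocation applied
# to a recursively reduced set of strata, under upper bounds M.
# Assumes a > 0, M > 0 and 0 < n < sum(M).
rna_upper_sketch <- function(n, a, M) {
  x <- numeric(length(a))
  active <- rep(TRUE, length(a)) # Strata still allocated proportionally to a.
  repeat {
    # Neyman allocation restricted to the active strata.
    x[active] <- n * a[active] / sum(a[active])
    over <- active & (x > M)
    if (!any(over)) break
    # Fix violating strata at their bounds and remove them from the problem.
    x[over] <- M[over]
    n <- n - sum(M[over])
    active <- active & !over
  }
  x
}

a <- c(3000, 4000, 5000, 2000) * c(48, 79, 76, 17)
M <- c(100, 90, 70, 80)
x <- rna_upper_sketch(n = 190, a = a, M = M)
x
```

With these inputs the plain Neyman allocation exceeds the bound in the third stratum only, so that stratum is fixed at 70 and the remaining 120 units are redistributed among the other three strata.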
A minor modification of the classical optimum sample allocation problem leads to the minimum sample size allocation. This problem consists in determining a vector of strata sample sizes that minimizes the total sample size, given a fixed level of the pi-estimator's variance. As in the case of the classical optimal allocation, the problem of minimum sample size allocation can be complemented by imposing upper bound constraints on the sample sizes in strata.
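For intuition, the minimum sample size problem without upper bounds has a closed-form solution, assuming the variance has the form sum(a^2 / x) - b. A Lagrange multiplier argument then gives x_w = a_w * sum(a) / (D + b), where D is the required variance. The sketch below checks this in base R. The name min_size_sketch is hypothetical and not part of stratallo, and the sketch does not handle upper bounds.

```r
# Hypothetical helper (not a stratallo function): minimum total sample size
# subject to sum(a^2 / x) - b == D, with no upper bounds on x.
# Assumes a > 0 and D + b > 0.
min_size_sketch <- function(D, a, b) {
  a * sum(a) / (D + b)
}

a <- c(3000, 4000, 5000, 2000)
b <- 70000
D <- 1e6 # Target variance.
x <- min_size_sketch(D, a, b)
sum(a^2 / x) - b # Equals D up to floating-point error.
sum(x)           # Total sample size, equal to sum(a)^2 / (D + b).
```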
The stratallo package provides two user functions, dopt and nopt, that solve the sample allocation problems briefly characterized above. In this context, it is assumed that the sampling designs in strata are chosen so that the variance of the pi-estimator of the population total is of the following generic form:
$$D^2_{st}(x_1, \ldots, x_H) = \sum_{w=1}^{H} \frac{a_w^2}{x_w} - b,$$
where $H$ denotes the total number of strata, $x_1, \ldots, x_H$ are the strata sample sizes, and the parameters $b$ and $a_w > 0$ do not depend on $x_w$, $w = 1, \ldots, H$.
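As a concrete instance, under simple random sampling without replacement in each stratum the textbook variance of the stratified estimator of the total, sum(N^2 * (1/x - 1/N) * S^2), can be rewritten as sum(a^2 / x) - b with a = N * S and b = sum(N * S^2). The sketch below checks this numerically in base R. It assumes the generic form is sum(a^2 / x) - b, and generic_variance is a hypothetical name, not a stratallo function.

```r
# Hypothetical helper (not a stratallo function).
# Assumed generic form of the variance: sum(a^2 / x) - b.
generic_variance <- function(x, a, b) sum(a^2 / x) - b

N <- c(3000, 4000, 5000, 2000) # Strata sizes.
S <- c(48, 79, 76, 17)         # Strata standard deviations.
x <- c(30, 70, 80, 10)         # Some allocation, chosen arbitrarily.

v_generic <- generic_variance(x, a = N * S, b = sum(N * S^2))
v_srswor <- sum(N^2 * (1 / x - 1 / N) * S^2) # Textbook SRSWOR formula.
all.equal(v_generic, v_srswor)
```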
Apart from dopt and nopt, stratallo provides the var_tst and var_tst_si functions, which compute the value of the variance $D^2_{st}$. The var_tst_si function is a simple wrapper of var_tst dedicated to the case of a simple random sampling without replacement design in each stratum. Furthermore, the package comes with two predefined, artificial populations with 507 and 969 strata. These are stored in the pop507 and pop969 objects, respectively.
See the package's vignette for more details.
You can install the released version of the stratallo package from CRAN with:
install.packages("stratallo")
These are basic examples that show how to use the dopt and nopt functions to solve optimal sample allocation problems for an example population with 4 strata.
library(stratallo)
dopt
# Define example population.
N <- c(3000, 4000, 5000, 2000) # Strata sizes.
S <- c(48, 79, 76, 17) # Standard deviations of a study variable in strata.
a <- N * S
n <- 190 # Total sample size.
opt <- dopt(n = n, a = a)
opt
sum(opt) == n
# Variance of the pi-estimator that corresponds to a given optimal allocation.
var_tst_si(opt, N, S)
M <- c(100, 90, 70, 80) # Upper bounds constraints imposed on the sample sizes in strata.
all(M <= N)
n < sum(M)
# Solution to Problem 1.
opt <- dopt(n = n, a = a, M = M)
opt
sum(opt) == n
all(opt <= M) # Does not violate upper bounds constraints.
# Variance of the pi-estimator that corresponds to a given optimal allocation.
var_tst_si(opt, N, S)
m <- c(50, 120, 1, 1) # Lower bounds constraints imposed on the sample sizes in strata.
n > sum(m)
# Solution to Problem 2.
opt <- dopt(n = n, a = a, m = m)
opt
sum(opt) == n
all(opt >= m) # Does not violate lower bounds constraints.
# Variance of the pi-estimator that corresponds to a given optimal allocation.
var_tst_si(opt, N, S)
m <- c(100, 90, 500, 50) # Lower bounds constraints imposed on sample sizes in strata.
M <- c(300, 400, 800, 90) # Upper bounds constraints imposed on sample sizes in strata.
n <- 1284
n > sum(m) && n < sum(M)
# Optimal allocation under box-constraints.
opt <- dopt(n = n, a = a, m = m, M = M)
opt
sum(opt) == n
all(opt >= m & opt <= M) # Does not violate any lower or upper bounds constraints.
# Variance of the pi-estimator that corresponds to a given optimal allocation.
var_tst_si(opt, N, S)
nopt
a <- c(3000, 4000, 5000, 2000)
b <- 70000
M <- c(100, 90, 70, 80)
D <- 1e6 # Variance constraint.
opt <- nopt(D, a, b, M)
opt
sum(opt)