[R] Struggling with svydesign()
Thomas Lumley
tlumley at u.washington.edu
Thu Apr 8 18:30:36 CEST 2010
On Thu, 8 Apr 2010, ONKELINX, Thierry wrote:
> Dear Thomas,
>
> Thank you for your informative answer. We used epi.stratasize() to
> estimate the required sample size per stratum. Notice in the example
> below that it can select a sample size smaller than 2 in the very small
> strata. Would you recommend to sample at least two items per stratum or
> rather to merge some strata a priori until the sample size is at least
> 2?
Merging the strata would be best.
> Or is there a better way to estimate the sample size per stratum?
> Note that the stratification only aims to get a good geographical
> coverage (the strata are geographical regions). We are not interested in
> estimates per stratum.
>
> library(epiR)
> N <- c(39, 270, 1060, 1336, 118, 26, 154, 10, 3)
> epi.stratasize(strata.n = N, strata.mean = 0.9, epsilon = 0.05, method =
> "proportion")
> $strata.sample
> [1] 2 15 57 72 6 1 8 1 0
>
> $total.sample
> [1] 162
>
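As a rough illustration of the merging suggested in the reply above, the sketch below collapses the smallest strata until every merged stratum has an allocation of at least 2. The merge map is purely hypothetical (real merges should join neighbouring regions), and the allocation is simply the one printed in the post, summed within merged strata; re-running epi.stratasize() on the merged sizes may give somewhat different numbers.

```r
# Stratum sizes and the allocation printed by epi.stratasize() in the post
N <- c(39, 270, 1060, 1336, 118, 26, 154, 10, 3)
n <- c(2, 15, 57, 72, 6, 1, 8, 1, 0)

# HYPOTHETICAL merge map: stratum 6 joins stratum 5, strata 8 and 9 join
# stratum 7. Which regions are really adjacent is not known from the post.
merged_id <- c(1, 2, 3, 4, 5, 5, 6, 6, 6)

# Sum population sizes and allocated sample sizes within each merged stratum
N_merged <- as.vector(tapply(N, merged_id, sum))
n_merged <- as.vector(tapply(n, merged_id, sum))

merged <- data.frame(stratum = seq_along(N_merged), N = N_merged, n = n_merged)
print(merged)

# Every merged stratum can now contribute to a variance estimate
stopifnot(all(n_merged >= 2))
```

The total sample size is unchanged by the merge; only the stratum boundaries used in the design differ.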
> The probability of sampling was proportional to the area (larger
> polygons are more likely to be selected than smaller ones). So we will
> use weights = I(1/Area), as you suggested.
If you are using probability proportional to size and you want to use finite-population corrections, you also need to specify the fpc= argument differently. The simplest version is an approximation that uses only the marginal sampling probabilities:

    svydesign(id=~1, fpc=~p, pps="brewer", strata=~strat)

where p is a variable holding the actual sampling probability (not just a quantity proportional to it).
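A minimal sketch of how such a p might be computed and passed to svydesign(), under stated assumptions: the frame, the areas, the per-stratum sample sizes n_h and the outcome y are all made up, and the sample is drawn with a randomised systematic PPS scheme that serves only as a stand-in for whatever without-replacement method was actually used. Within a stratum the inclusion probability is taken as n_h * Area / (total Area in the stratum), which is only valid while every value stays below 1.

```r
# HYPOTHETICAL frame: three strata of polygons with made-up areas.
frame <- data.frame(
  strat = rep(c("A", "B", "C"), times = c(40, 25, 15)),
  Area  = c(1:40, 1:25, 1:15),
  stringsAsFactors = FALSE
)
n_h <- c(A = 8, B = 5, C = 3)  # assumed sample size per stratum

# Actual inclusion probability: n_h * Area_i / (total Area in the stratum).
frame$p <- unname(n_h[frame$strat]) * frame$Area /
  ave(frame$Area, frame$strat, FUN = sum)
stopifnot(all(frame$p < 1))  # a value >= 1 would mean a certainty unit

# Stand-in sampler (an assumption, not the poster's method): randomised
# systematic PPS, which samples without replacement and respects p.
draw_pps <- function(pik) {
  ord  <- sample(length(pik))                       # random unit order
  hits <- runif(1) + seq_len(round(sum(pik))) - 1   # points spaced 1 apart
  pos  <- findInterval(hits, c(0, cumsum(pik[ord])))
  ord[pmin(pos, length(pik))]                       # guard against rounding
}

set.seed(2010)
rows <- unlist(lapply(split(seq_len(nrow(frame)), frame$strat),
                      function(idx) idx[draw_pps(frame$p[idx])]),
               use.names = FALSE)
samp <- frame[rows, ]
samp$y <- rbinom(nrow(samp), 1, 0.9)  # made-up 0/1 outcome

# The design call from the reply, applied to the sampled rows only.
if (requireNamespace("survey", quietly = TRUE)) {
  des <- survey::svydesign(id = ~1, strata = ~strat, fpc = ~p,
                           pps = "brewer", data = samp)
  print(survey::svymean(~y, des))
}
```

Note that weights are not given separately here: with fpc = ~p holding true inclusion probabilities, the sampling weights follow from p.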
Also, how did you do the sampling? It's quite hard to do unequal probability sampling without replacement (the R sample() function doesn't actually do it, though the sampling package does).
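To make the point about sample() concrete, here is a small sketch with a made-up population of four units and n = 2. Drawing without replacement via sample(..., prob = ) selects units one after another and renormalises the remaining probabilities, so the resulting inclusion probabilities are not proportional to size. For n = 2 they can be worked out exactly, and a simulation is included as a cross-check.

```r
size   <- c(1, 2, 3, 4)          # made-up unit sizes (e.g. areas)
n      <- 2
target <- n * size / sum(size)   # inclusion probabilities we want: 0.2 0.4 0.6 0.8

# Exact inclusion probabilities for sequential draws with n = 2:
# picked first, or picked second after some other unit j went first.
p <- size / sum(size)
actual <- sapply(seq_along(p), function(i) {
  p[i] + sum(p[-i] * p[i] / (1 - p[-i]))
})

# Simulation cross-check using sample() itself
set.seed(42)
reps  <- 20000
draws <- replicate(reps, sample(seq_along(size), n, prob = size))
simulated <- tabulate(as.vector(draws), nbins = length(size)) / reps

print(round(rbind(target, actual, simulated), 3))
```

The large units end up under-represented and the small ones over-represented relative to the target, so using 1/Area-type weights with a sample drawn this way would not match the design that was actually carried out.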
-thomas
Thomas Lumley                   Assoc. Professor, Biostatistics
tlumley at u.washington.edu     University of Washington, Seattle