[R] RES: Generating an autocorrelated binary variable
André Gabriel
andregfcc at gmail.com
Fri Sep 28 14:02:26 CEST 2012
I think the package BinarySimCLF can help.
See http://cran.r-project.org/web/packages/binarySimCLF/binarySimCLF.pdf.
André Gabriel.
-----Mensagem original-----
De: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] Em
nome de Rolf Turner
Enviada em: sexta-feira, 28 de setembro de 2012 00:02
Para: Simon Zehnder
Cc: r help
Assunto: Re: [R] Generating an autocorrelated binary variable
I have no idea what your code is doing, nor why you want correlated binary
variables. Correlation makes little or no sense in the context of binary
random variables --- or more generally in the context of discrete random
variables.
Be that as it may, it is an easy calculation to show that if X and Y are
binary random variables both with success probability of 0.5 then cor(X,Y) =
0.2 if and only if Pr(X=1 | Y = 1) = 0.6. So just generate X and Y using
that
fact:
set.seed(42)
X <- numeric(1000)
Y <- numeric(1000)
for(i in 1:1000) {
Y[i] <- rbinom(1,1,0.5)
X[i] <- if(Y[i]==1) rbinom(1,1,0.6) else rbinom(1,1,0.4) }
# Check:
cor(X,Y) # Get 0.2012336
Looks about right. Note that the sample proportions are 0.484 and
0.485 for X and Y respectively. These values do not differ significantly
from 0.5.
cheers,
Rolf Turner
On 28/09/12 08:26, Simon Zehnder wrote:
> Hi R-fellows,
>
> I am trying to simulate a multivariate correlated sample via the Gaussian
copula method. One variable is a binary variable, that should be
autocorrelated. The autocorrelation should be rho = 0.2. Furthermore, the
overall probability to get either outcome of the binary variable should be
0.5.
> Below you can see the R code (I use for simplicity a diagonal matrix in
rmvnorm even if it produces no correlated sample):
>
> "sampleCop" <- function(n = 1000, rho = 0.2) {
>
> require(splus2R)
> mvrs <- rmvnorm(n + 1, mean = rep(0, 3), cov = diag(3))
> pmvrs <- pnorm(mvrs, 0, 1)
> var1 <- matrix(0, nrow = n + 1, ncol = 1)
> var1[1] <- qbinom(pmvrs[1, 1], 1, 0.5)
> if(var1[1] == 0) var1[nrow(mvrs)] <- -1
> for(i in 1:(nrow(pmvrs) - 1)) {
> if(pmvrs[i + 1, 1] <= rho) var1[i + 1] <- var1[i]
> else var1[i + 1] <- var1[i] * (-1)
> }
> sample <- matrix(0, nrow = n, ncol = 4)
> sample[, 1] <- var1[1:nrow(var1) - 1]
> sample[, 2] <- var1[2:nrow(var1)]
> sample[, 3] <- qnorm(pmvrs[1:nrow(var1) - 1, 2], 0, 1, 1, 0)
> sample[, 4] <- qnorm(pmvrs[1:nrow(var1) - 1, 3], 0, 1, 1, 0)
>
> sample
>
> }
>
> Now, the code is fine, everything compiles. But when I compute the
autocorrelation of the binary variable, it is not 0.2, but 0.6. Does anyone
know why this happens?
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
More information about the R-help
mailing list