[R] RES: Generating an autocorrelated binary variable

Fri Sep 28 14:02:26 CEST 2012

I think the package BinarySimCLF can help.
See http://cran.r-project.org/web/packages/binarySimCLF/binarySimCLF.pdf.

André Gabriel.

-----Mensagem original-----
De: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] Em
nome de Rolf Turner
Enviada em: sexta-feira, 28 de setembro de 2012 00:02
Para: Simon Zehnder
Cc: r help
Assunto: Re: [R] Generating an autocorrelated binary variable

I have no idea what your code is doing, nor why you want correlated binary
variables.  Correlation makes little or no sense in the context of binary
random variables --- or more generally in the context of discrete random
variables.

Be that as it may, it is an easy calculation to show that if X and Y are
binary random variables both with success probability of 0.5 then cor(X,Y) =
0.2 if and only if Pr(X=1 | Y = 1) = 0.6.  So just generate X and Y using
that
fact:

set.seed(42)
X <- numeric(1000)
Y <- numeric(1000)
for(i in 1:1000) {
    Y[i] <- rbinom(1,1,0.5)
    X[i] <- if(Y[i]==1) rbinom(1,1,0.6) else rbinom(1,1,0.4) }

# Check:
cor(X,Y) # Get 0.2012336

Looks about right.  Note that the sample proportions are 0.484 and
0.485 for X and Y respectively.  These values do not differ significantly
from 0.5.

     cheers,

             Rolf Turner

On 28/09/12 08:26, Simon Zehnder wrote:
> Hi R-fellows,
>
> I am trying to simulate a multivariate correlated sample via the Gaussian
copula method. One variable is a binary variable, that should be
autocorrelated. The autocorrelation should be rho = 0.2. Furthermore, the
overall probability to get either outcome of the binary variable should be
0.5.
> Below you can see the R code (I use for simplicity a diagonal matrix in
rmvnorm even if it produces no correlated sample):
>
> "sampleCop" <- function(n = 1000, rho = 0.2) {
> 	
> 	require(splus2R)
> 	mvrs <- rmvnorm(n + 1, mean = rep(0, 3), cov = diag(3))
> 	pmvrs <- pnorm(mvrs, 0, 1)
> 	var1 <- matrix(0, nrow = n + 1, ncol = 1)
> 	var1[1] <- qbinom(pmvrs[1, 1], 1, 0.5)
> 	if(var1[1] == 0) var1[nrow(mvrs)] <- -1
> 	for(i in  1:(nrow(pmvrs) - 1)) {
> 		if(pmvrs[i + 1, 1] <= rho) var1[i + 1] <- var1[i]
> 		else var1[i + 1] <- var1[i] * (-1)
> 	}
> 	sample <- matrix(0, nrow = n, ncol = 4)
> 	sample[, 1] <- var1[1:nrow(var1) - 1]
> 	sample[, 2] <- var1[2:nrow(var1)]
> 	sample[, 3] <- qnorm(pmvrs[1:nrow(var1) - 1, 2], 0, 1, 1, 0)
> 	sample[, 4] <- qnorm(pmvrs[1:nrow(var1) - 1, 3], 0, 1, 1, 0)
> 	
> 	sample
> 	
> }
>
> Now, the code is fine, everything compiles. But when I compute the
autocorrelation of the binary variable, it is not 0.2, but 0.6. Does anyone
know why this happens?

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.