[R] simulation data with dichotomous variables

William Revelle lists at revelle.net
Sun Aug 10 23:59:35 CEST 2014


Dear Thanoon,
 You might look at the various item simulation functions in the psych package.

In particular, for your problem:

library(psych)   # sim.irt, tetrachoric, lowerMat and cor.plot come from psych

R1 <- sim.irt(10, 1000, a = 3, low = -2, high = 2)
R2 <- sim.irt(10, 1000, a = 3, low = -2, high = 2)
R12 <- data.frame(R1$items, R2$items)
# This gives you 20 items: high correlations within the first 10 and within
# the second 10, and no correlation between the first and second sets.
rho <- tetrachoric(R12)$rho    # find the tetrachoric correlations between the items
lowerMat(rho)                  # show the correlations
cor.plot(rho, numbers = TRUE)  # show a heat map of the correlations
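
If it helps to see the idea without psych, here is a minimal base R sketch of the same design. It assumes one normally distributed latent trait per set and a logistic response function with discrimination 3 and difficulties spread from -2 to 2; sim_set is a hypothetical helper written for this illustration, not a psych function, and it is not meant to reproduce sim.irt exactly.

```r
set.seed(42)  # arbitrary seed, only so the sketch is reproducible

# sim_set is a hypothetical helper (not part of psych): one latent trait
# per set, items dichotomized through a logistic response function.
sim_set <- function(n_items = 10, n_obs = 1000, a = 3, low = -2, high = 2) {
  theta <- rnorm(n_obs)                        # latent trait, one value per person
  d <- seq(low, high, length.out = n_items)    # item difficulties
  p <- plogis(a * outer(theta, d, "-"))        # P(item = 1), n_obs x n_items
  items <- matrix(rbinom(length(p), 1, p), nrow = n_obs)
  colnames(items) <- paste0("V", seq_len(n_items))
  as.data.frame(items)
}

R1 <- sim_set()
R2 <- sim_set()             # separate latent trait, so unrelated to R1
R12 <- data.frame(R1, R2)   # duplicated names become V1.1, V2.1, ...

phi <- cor(R12)             # Pearson on 0/1 items is the phi coefficient
within1 <- mean(phi[1:10, 1:10][lower.tri(diag(10))])
within2 <- mean(phi[11:20, 11:20][lower.tri(diag(10))])
between <- mean(phi[1:10, 11:20])
round(c(within1 = within1, within2 = within2, between = between), 2)
```

The within-set averages should come out clearly positive and the between-set average close to zero. Phi shrinks when two items differ a lot in difficulty, which is one reason to prefer tetrachoric() when you want the correlation of the underlying latent variables.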

Bill


On Aug 4, 2014, at 8:08 PM, thanoon younis <thanoon.younis80 at gmail.com> wrote:

> Dear R-users,
> I need your help with the code below. I want to simulate two different
> samples, R1 and R2, each with 10 variables and 1000 observations. The
> variables should be highly correlated within R1 and within R2, with no
> correlation between R1 and R2. I also have a problem with the correlation
> coefficient between two dichotomous variables: R supports only
> correlation coefficients such as Pearson, Spearman, and Kendall.
> 
> Thanks a lot in advance
> 
> Thanoon
> 
> 
> ords <- seq(0,1)
> p <- 10
> N <- 1000
> percent_change <- 0.9
> 
> R1 <- as.data.frame(replicate(p, sample(ords, N, replace = TRUE)))
> R2 <- as.data.frame(replicate(p, sample(ords, N, replace = TRUE)))
> # Pearson correlation on 0/1 variables gives the phi coefficient
> cor(R1, R2, method = "pearson")
> 
> 
> # subset variable to have a stronger correlation
> 
> 
> v1 <- R1[,1, drop = FALSE]
> v1 <- R2[,1, drop = FALSE]
> # randomly choose which rows to retain
> keep <- sample(as.numeric(rownames(v1)), size = percent_change*nrow(v1))
> change <- as.numeric(rownames(v1)[-keep])
> 
> # randomly choose new values for the rows being changed
> new.change <- sample(ords, length(change), replace = TRUE)
> 
> # replace values in copy of original column
> v1.samp <- v1
> v1.samp[change,] <- new.change
> 
> # closer correlation
> cor(v1, v1.samp, method = "pearson")
> 
> # set correlated column as one of your other columns
> R1[,2] <- v1.samp
> R2[,2] <- v1.samp
> R1
> R2
> 

William Revelle		           http://personality-project.org/revelle.html
Professor			           http://personality-project.org
Department of Psychology   http://www.wcas.northwestern.edu/psych/
Northwestern University	   http://www.northwestern.edu/
Use R for psychology             http://personality-project.org/r
It is 5 minutes to midnight	   http://www.thebulletin.org


