[R] simulate correlated binary, categorical and continuous variable
Burak Aydin
burak235813 at hotmail.com
Mon Apr 2 03:06:40 CEST 2012
Hello David Duffy-2,
I see that you showed that simulating with rmvnorm and then
dichotomizing/categorizing the variables should work. Thanks, but please
take a look at this link:
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/CatContinuous
and this article:
Analysis by Categorizing or Dichotomizing Continuous Variables Is
Inadvisable: An Example from the Natural History of Unruptured Aneurysms
by O. Naggara, J. Raymond, F. Guilbert, D. Roy, A. Weill and D.G.
Altman, 2011.
Also, here is my explanatory code:
library(mvtnorm)  # provides rmvnorm()
# target covariance matrix and mean vector
sigm <- matrix(c(0.12, 0.05, 0.02, 0.00,
                 0.05, 1.24, 0.38, 0.00,
                 0.02, 0.38, 2.38, 0.03,
                 0.00, 0.00, 0.03, 0.16),
               ncol = 4, byrow = TRUE)
mu <- rep(0, 4)
# simulated data
dat1 <- rmvnorm(1000, mean = mu, sigma = sigm)
# difference between covariance matrices before dichotomizing/categorizing
sigm - cov(dat1)
# difference between means before dichotomizing/categorizing
means1 <- colMeans(dat1)
mu - means1
# dichotomization and categorization
# let's dichotomize the third variable
# I want a balanced split, so I cut at the mean (0): values above 0 are
# coded 0, the rest 1, and the mean of the new variable is about 0.50
dat2 <- dat1
dat2[, 3] <- ifelse(dat1[, 3] > 0.0, 0, 1)
means2 <- colMeans(dat2)
mu - means2
# the split is balanced (mean about 0.50), but look at the difference in
# the covariance matrices
sigm - cov(dat2)
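
The size of that difference can be worked out in advance, which may make the
point clearer. The following is only a sketch under stated assumptions: the
variables are jointly normal with mean 0, the cut is at 0, and the coding is
the same as above (values above 0 become 0, the rest 1). Under those
assumptions the covariance of variable j with the dichotomized variable 3 is
-sigm[j,3] * dnorm(0) / sqrt(sigm[3,3]), and the variance of the dichotomized
variable is 0.25. The names expected and max_gap, the seed and the sample
size are my own choices for illustration. To keep the sketch free of extra
packages it draws the data with a Cholesky factor in base R instead of
rmvnorm.

```r
# Covariance matrix copied from the code above
sigm <- matrix(c(0.12, 0.05, 0.02, 0.00,
                 0.05, 1.24, 0.38, 0.00,
                 0.02, 0.38, 2.38, 0.03,
                 0.00, 0.00, 0.03, 0.16),
               ncol = 4, byrow = TRUE)

# Expected covariance matrix after replacing variable 3 by B = 1(X3 <= 0).
# For zero-mean jointly normal (Xj, X3):
#   Cov(Xj, 1(X3 > 0)) = sigm[j, 3] * dnorm(0) / sd(X3)
# and the reverse coding used above flips the sign.
sd3 <- sqrt(sigm[3, 3])
expected <- sigm
expected[3, ] <- -sigm[3, ] * dnorm(0) / sd3
expected[, 3] <- expected[3, ]
expected[3, 3] <- 0.25   # variance of a balanced 0/1 variable

# Check against a large simulated sample (seed and n are arbitrary choices).
set.seed(1)
n <- 100000
z <- matrix(rnorm(n * 4), nrow = n)
dat <- z %*% chol(sigm)                  # rows have covariance sigm, mean 0
dat[, 3] <- ifelse(dat[, 3] > 0, 0, 1)   # same coding as in the code above

# Largest absolute gap between simulated and expected covariances
max_gap <- max(abs(cov(dat) - expected))
print(round(expected, 3))
print(max_gap)
```

On the correlation scale the same calculation says that a split at the mean
multiplies each correlation with the dichotomized variable by sqrt(2/pi),
roughly 0.80 in absolute value, so the change in the covariance matrix is
systematic and not sampling noise.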
--
View this message in context: http://r.789695.n4.nabble.com/simulate-correlated-binary-categorical-and-continuous-variable-tp4516433p4524882.html
Sent from the R help mailing list archive at Nabble.com.