[R] Create correlated data with skew
Dimitris Rizopoulos
Dimitris.Rizopoulos at med.kuleuven.be
Tue Sep 18 20:49:27 CEST 2007
Quoting bbolker <bolker at ufl.edu>:
>
>
>
> Mike Lawrence wrote:
>>
>> Hi all,
>>
>> I understand that it is simple to create data with a specific
>> correlation (say, .5) using mvrnorm from the MASS library:
>>
>> > library(MASS)
>> > set.seed(1)
>> >
>> > a=mvrnorm(
>> + n=10
>> + ,mu=rep(0,2)
>> + ,Sigma=matrix(c(1,.5,.5,1),2,2)
>> + ,empirical=TRUE
>> + )
>> > a
>>             [,1]         [,2]
>>  [1,] -1.0008380 -1.233467875
>>  [2,] -0.1588633 -0.003410001
>>  [3,]  1.2054727 -0.620558768
>>  [4,]  1.9580971  2.389495155
>>  [5,] -0.9447473 -0.141852055
>>  [6,]  0.6236799 -0.826952659
>>  [7,]  0.1421782  0.452217611
>>  [8,] -0.9050954  0.330991444
>>  [9,] -0.7261632  0.217740460
>> [10,] -0.1937206 -0.564203311
>> > cor(a)
>>      [,1] [,2]
>> [1,]  1.0  0.5
>> [2,]  0.5  1.0
>>
>>
>> But I'm looking to create data where the variables are non-normally
>> distributed (i.e. somewhat skewed). Any suggestions?
>>
>> Mike
>>
>> --
>> Mike Lawrence
>> Graduate Student, Department of Psychology, Dalhousie University
>>
>> Website: http://memetic.ca
>>
>> Public calendar: http://icalx.com/public/informavore/Public
>>
>> "The road to wisdom? Well, it's plain and simple to express:
>> Err and err and err again, but less and less and less."
>> - Piet Hein
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>
> The simplest (?) solution is probably to exponentiate your MVN data,
> leading to a bivariate log-normal distribution. The hard part is
> specifying the parameters of the lognormal in terms of the desired
> variance-covariance matrix. Variances are not too bad, but correlation
> may not be solvable. (Of course, if you don't care much about the
> precise characteristics of the simulated data and/or are willing to
> use some trial and error to get the desired variance/correlation you
> don't have to deal with this.)
> See e.g.
>
> http://www.stuart.iit.edu/faculty/workingpapers/thomopoulos/SomeMeasuresontheStandardBivariateLognormalDistribution.doc
>
> for some of the relevant formulas.
>
> good luck
> Ben Bolker
>
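The lognormal route quoted above can be sketched roughly as follows. This is only an illustration under assumptions: the target correlation `r_target` and the log-scale standard deviations `s1` and `s2` are made-up values, and the inversion uses the standard result that for bivariate normal (X1, X2) with correlation rho, cor(exp(X1), exp(X2)) = (exp(rho*s1*s2) - 1) / sqrt((exp(s1^2) - 1) * (exp(s2^2) - 1)).

```r
library(MASS)

# Assumed inputs (not from the original post)
r_target <- 0.5      # desired Pearson correlation of the skewed variables
s1 <- 1; s2 <- 1     # SDs of the underlying normals; larger values give more skew

# Solve the lognormal correlation formula for the normal-scale correlation rho
rho <- log(1 + r_target * sqrt((exp(s1^2) - 1) * (exp(s2^2) - 1))) / (s1 * s2)

# Not every target is reachable; stop if the required rho is not a valid correlation
stopifnot(abs(rho) <= 1)

Sigma <- matrix(c(s1^2,          rho * s1 * s2,
                  rho * s1 * s2, s2^2), 2, 2)

set.seed(1)
z <- mvrnorm(n = 1e5, mu = c(0, 0), Sigma = Sigma)
y <- exp(z)          # bivariate lognormal sample, right-skewed margins

cor(y)               # should be near r_target, up to sampling error
```

The `stopifnot()` check reflects the point that the correlation may not be solvable: with `s1 = s2 = 1`, for example, strongly negative targets cannot be reached. Note also that `empirical = TRUE` would only fix the correlation of `z`, not of `exp(z)`, so the match here is approximate.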
Another possibility is to use copulas, e.g.,

library(copula)
cop <- claytonCopula(2)
x <- mvdc(cop, c("gamma", "gamma"),
          list(list(shape = 3, rate = 2), list(shape = 2, rate = 4)))
x.samp <- rmvdc(x, 1000)  # in recent versions of copula: rMvdc(1000, x)

For the Clayton copula with parameter theta = 2, the correlation in terms of
Kendall's tau is theta / (theta + 2) = 0.5, which you can check with:

cor(x.samp, method = "kendall")

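If you want a different strength of association, one way to go is to start from the desired Kendall's tau and back out the Clayton parameter. The sketch below assumes a recent version of the copula package (where the sampler is `rMvdc(n, mvdc)`); `tau_target` and the small `skew()` helper are illustrative names, not part of the package.

```r
library(copula)

# Assumed target (illustrative): desired Kendall's tau
tau_target <- 0.5

# For the Clayton copula tau = theta / (theta + 2), so theta = 2 * tau / (1 - tau)
theta <- 2 * tau_target / (1 - tau_target)

cop <- claytonCopula(theta)
x <- mvdc(cop, c("gamma", "gamma"),
          list(list(shape = 3, rate = 2), list(shape = 2, rate = 4)))

set.seed(1)
x.samp <- rMvdc(1000, x)

# Sample Kendall's tau; should be close to tau_target
tau_hat <- cor(x.samp, method = "kendall")[1, 2]
tau_hat

# Hypothetical helper: simple moment-based sample skewness
skew <- function(v) mean((v - mean(v))^3) / sd(v)^3

# Gamma margins are right-skewed; the theoretical skewness is 2 / sqrt(shape)
apply(x.samp, 2, skew)
```

The margins keep their gamma shape whatever copula parameter is chosen, so the skewness is controlled through `shape` and the dependence through `theta`, independently of each other.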
Best,
Dimitris
--
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven
Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/(0)16/336899
Fax: +32/(0)16/337015
Web: http://med.kuleuven.be/biostat/
http://www.student.kuleuven.be/~m0390867/dimitris.htm