[R] Simulation Questions
Andrew Robinson
A.Robinson at ms.unimelb.edu.au
Tue May 3 01:59:43 CEST 2011
Hi Shane,
it sounds to me as though you have a fairly well-defined problem. You
want to generate random numbers with a specific mean, variance, and
correlation with another random varaible. I would reverse-enginerr
the fuinctions for simple linear regression to get a result like
y = beta_0 + beta_1 * x + rnorm(n, 0, sigma^2)
and use that as the basis of generating random numbers.
Not sure how to interpret the second question ...
Cheers
Andrew
On Sun, May 01, 2011 at 12:33:41AM -0400, Shane Phillips wrote:
> I have the following script for generating a dataset. It works like a champ except for a couple of things.
>
> 1. I need the variables "itbs" and "map" to be negatively correlated with the binomial variable "lunch" (around -0.21 and -0.24, respectively). The binomial variable "lunch" needs to remain unchanged.
> 2. While my generated variables do come out with the desired means and correlations, the distribution is very narrow and only represents a small portion of the possible scores. Can I force it to encompass a wider range of scores, while maintaining my desired parameters and correlations?
>
> Please help...
>
> Shane
>
> Script follows...
>
>
>
> #Number the subjects
> subject=1:1000
> #Assign a treatment condition from a binomial distribution with a probability of 0.13
> treat=rbinom(1*1000,1,.13)
> #Assign a lunch status condition froma binomial distribution with a probability of 0.35
> lunch=rbinom(1*1000,1,.35)
> #Generate age in months from a random normal distribution with mean of 87 and sd of 2
> age=rnorm(1000,87,2)
> #invoke the MASS package
> require(MASS)
> #Establish the covariance matrix for MAP, ITBS and CogAT scores
> sigma <- matrix(c(1, 0.84, 0.59, 0.84, 1, 0.56, 0.59, 0.56, 1), ncol = 3)
> #Establish MAP as a random normal variable with mean of 200 and sd of 9
> map <- rnorm(1000, 200, 9)
> #Establish ITBS as a random normal variable with mean of 175 and sd of 15
> itbs <- rnorm(1000, 175, 15)
> #Establish CogAT as a random normal variable with mean of 100 and sd of 16
> cogat<-rnorm(1000,100,16)
> #Create a dataframe of MAP, ITBS, and CogAT
> data <- data.frame(map, itbs, cogat)
> #Draw from the multivariate distribution defined by MAP, ITBS, and CogAT means and the covariance matrix
> sim <- mvrnorm(1000, mu=mean(data), sigma, empirical=FALSE)
> #Set growth at 0
> growth=0
> #Combine elements into a single dataset
> simtest=data.frame (subject=subject, treat=treat,lunch, age=round(age,0),round(sim,0),growth)
> #Set mean growth by treatment condition with treatd subjects having a mean growth of 1.5 and non-treated having a mean growth of 0.1
> simtest<-transform(simtest, growth=rnorm(1000,m=ifelse(treat==0,0.1,1.5),s=1))
> simtest
> cor (simtest)
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Andrew Robinson
Program Manager, ACERA
Department of Mathematics and Statistics Tel: +61-3-8344-6410
University of Melbourne, VIC 3010 Australia (prefer email)
http://www.ms.unimelb.edu.au/~andrewpr Fax: +61-3-8344-4599
http://www.acera.unimelb.edu.au/
Forest Analytics with R (Springer, 2011)
http://www.ms.unimelb.edu.au/FAwR/
Introduction to Scientific Programming and Simulation using R (CRC, 2009):
http://www.ms.unimelb.edu.au/spuRs/
More information about the R-help
mailing list